Data upload fails silently on Enum and Embeddings

Hi, I’m trialing the free version of the tool while my team explores labeling options. I’ve integrated my account with AWS and successfully created a small dataset of images using the Python SDK.
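For reference, the client setup is roughly the following (API key redacted); the lb / lb_client names in the code further down all refer to this client:

from labelbox import Client

# Standard SDK client; every dataset and metadata call below goes through it.
lb = Client(api_key="<my-api-key>")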

I’m interested in your data explorer, specifically the embedding/similarity-search capabilities. I’ve embedded my images into a 128-dimensional space following https://github.com/Labelbox/labelbox-python/blob/develop/examples/basics/data_row_metadata.ipynb
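As a sanity check on the embeddings themselves, I load and validate each one roughly like this; the assumption that the reserved ‘embedding’ field wants a flat list of exactly 128 plain floats comes from that notebook:

import numpy as np

def load_embedding(path: str) -> list:
    # Load the precomputed vector and flatten it; the reserved "embedding"
    # metadata field appears to expect exactly 128 float values per data row.
    vec = np.load(path).astype(float).ravel()
    assert vec.shape == (128,), f"unexpected embedding shape: {vec.shape}"
    return vec.tolist()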

I’ve also created a custom enum metadata schema in my account named “label_id”.
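To confirm the enum schema and its options are what I think they are, I list the custom ontology before building any rows (this assumes custom_by_name maps each enum schema to a dict of its options, which is how I use it in create_metadata below):

metadata_ontology = lb.get_data_row_metadata_ontology()

# For enum schemas, custom_by_name maps the schema name to a dict of
# option name -> option schema; print them to check the names line up.
for name, schema in metadata_ontology.custom_by_name.items():
    if isinstance(schema, dict):
        print(name, "->", sorted(schema.keys()))
    else:
        print(name)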

I have a create_metadata function that builds the metadata for each row. When I omit the ‘label_id’ and ‘embedding’ metadata fields, the dataset uploads correctly and I can see it in the catalog view. When I add either field (together or separately), the code completes without raising, but the created dataset in the catalog is empty. Digging in, once wait_till_done() returns, the task’s status is ‘FAILED’, which comes back from the Labelbox server.
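For completeness, this is how I’m inspecting the task once wait_till_done() returns; I’m assuming task.errors is the right place to look for failure details, so please correct me if there’s a better way:

task = dataset.create_data_rows(rows)
task.wait_till_done()

# Surface whatever the server reports instead of failing silently.
print("status:", task.status)
if task.status == "FAILED":
    # 'errors' is my guess at where failure details would live.
    print("errors:", getattr(task, "errors", None))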

Can you please explain what I am doing wrong?

In addition, it would be nice if the server sent back a more descriptive error status.

Below I’ve put the relevant parts of my code.

Thank you!


from typing import List
from uuid import uuid4

import numpy as np
from labelbox.schema.data_row_metadata import DataRowMetadataField

def create_row(lb_client, t):
    # t is one row from df.itertuples(); it carries the S3 URL, label GUID, and embedding path.
    metadata_fields: List[DataRowMetadataField] = create_metadata(lb_client, t)
    row = {
        "row_data": t.s3_web_url,
        "external_id": str(uuid4()),
        "metadata_fields": metadata_fields,
    }
    return row

def create_metadata(lb_client, t) -> List[DataRowMetadataField]:
    # Fetch the metadata schema ontology. A Labelbox workspace has a single metadata ontology.
    metadata_ontology = lb_client.get_data_row_metadata_ontology()

    # metadata_ontology.fields lists every available field (useful for inspection).
    metadata_fields = []

    # Construct a metadata field for the reserved "split" enum, marking this row as "train"
    train_schema = metadata_ontology.reserved_by_name["split"]["train"]
    split_metadata_field = DataRowMetadataField(
        schema_id=train_schema.parent,  # schema id of the enum itself
        value=train_schema.uid,  # uid of the selected option
    )
    metadata_fields.append(split_metadata_field)

    # Look up the custom enum option matching this row's class label
    # (ClassLabels is our own helper that maps a label GUID to a class name).
    option_name = ClassLabels.from_guid(t.label).name.lower().replace(' ', '').replace('-', '')
    label_schema = metadata_ontology.custom_by_name["label"][option_name]
    label_metadata_field = DataRowMetadataField(
        schema_id=label_schema.parent,  # schema id of the enum itself
        value=label_schema.uid,  # uid of the selected option
    )
    metadata_fields.append(label_metadata_field)

    # Attach the precomputed 128-dimensional embedding via the reserved "embedding" field
    embedding: np.ndarray = np.load(t.embedding_path)
    embedding_metadata_field = DataRowMetadataField(
        schema_id=metadata_ontology.reserved_by_name["embedding"].uid,
        value=embedding.tolist(),  # convert from numpy to a plain list of floats
    )
    metadata_fields.append(embedding_metadata_field)
    return metadata_fields

# Only upload the first row while I debug the failure
rows = [create_row(lb, t) for t in list(df.itertuples())[:10]][:1]

dataset = lb.create_dataset(name="Test")
task = dataset.create_data_rows(rows)
task.wait_till_done()