Issues with uploading text from a .csv file

I’m trying to create a dataset from a .csv file of drug-drug interactions. It has three columns: the first is the drug of interest, the second is the interacting drug, and the third explains what the interaction is. The file contains thousands of these interactions. The script I have attempted to write should, in theory, iterate over the rows in my .csv file and upload each row as a separate data row.
When I run the code, it times out with the error "google.api_core.exceptions.RetryError: Deadline of 120.0s exceeded while calling target function, last exception: Internal server error('Internal server error', None)"

I can see the dataset created in the Labelbox UI, but none of the data rows are being uploaded. I cannot think of what else to change in my code to make this work. I should also mention that the .csv file is stored locally. Here is the code I’m using at the moment.

import labelbox as lb
import pandas as pd

# Credentials
api_key = 'MyAPIkey'
labelbox_client = lb.Client(api_key)
dataset = labelbox_client.create_dataset(name='Drug-Drug Interactions')

# CSV
df = pd.read_csv("C:/location/of/file.csv")

# Row iteration
for index, row in df.iterrows():
    DOI = row['drug_of_interest']
    interactor = row['interactor']
    interaction = row['interaction']

    # Creating the asset with the csv info
    asset = {
        "row_data": DOI,
        "external_id": str(row['drug_of_interest']) + '_' + str(row['interactor']),
        "global_key": f"asset-{index}",
        "metadata_fields": [{"schema_id": "my_schema_id", "value": "tag_string"}],
        "attachments": []
    }

    # Add the asset to the dataset
    task = dataset.create_data_row(asset)

    task.wait_till_done()
    print(task.errors)

    print(f"Asset '{DOI}' uploaded successfully.")

print("All assets uploaded to Labelbox.")

Does anyone have any ideas?

Hi @clark.thurston,

Thanks for your post.

I can see that you called create_data_row(), which returns a DataRow object, but then treated the result like the Task object returned by create_data_rows(). Was that intentional?

Best regards,

Paul N.
Labelbox Support

@PaulN I was finally able to get it working yesterday afternoon! After removing everything below this line, it worked with no issues:

task = dataset.create_data_row(asset)

Thank you so much!

Thanks for the update @clark.thurston, and I am glad you found your solution.

Just to summarize, you have 2 options here:

task = dataset.create_data_rows([asset])

task.wait_till_done()
print(task.errors)

or

data_row = dataset.create_data_row(asset)
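
Since the file has thousands of rows, the first option can also be given many assets per call instead of a one-item list. Below is a minimal sketch of that idea, not a definitive implementation. It assumes the column names from the original post and leaves out the placeholder metadata and attachments. The helpers build_assets, chunked and upload_in_batches are hypothetical names introduced here, the batch size of 500 is arbitrary, and the sample rows are made-up placeholders. The standard csv module is used so the sketch runs without pandas or an API key.

```python
import csv
import io


def build_assets(csv_file):
    """Turn each CSV row into a dict shaped like `asset` in the original post."""
    assets = []
    for index, row in enumerate(csv.DictReader(csv_file)):
        assets.append({
            # Same field mapping as the original script
            "row_data": row["drug_of_interest"],
            "external_id": f"{row['drug_of_interest']}_{row['interactor']}",
            "global_key": f"asset-{index}",
        })
    return assets


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def upload_in_batches(dataset, assets, batch_size=500):
    """Call create_data_rows() once per batch and collect any reported errors.

    `dataset` is expected to be a Labelbox Dataset; batch_size is an arbitrary choice.
    """
    errors = []
    for batch in chunked(assets, batch_size):
        task = dataset.create_data_rows(batch)  # returns a Task
        task.wait_till_done()
        if task.errors:
            errors.append(task.errors)
    return errors


# Placeholder rows standing in for the real file
sample = io.StringIO(
    "drug_of_interest,interactor,interaction\n"
    "drug_a,drug_b,placeholder text\n"
    "drug_a,drug_c,placeholder text\n"
    "drug_d,drug_e,placeholder text\n"
)
assets = build_assets(sample)
batches = list(chunked(assets, 2))
```

With a real dataset object, this would be called as upload_in_batches(dataset, assets) after opening the local file with open(path, newline="") and passing it to build_assets.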

Best regards,

Paul N.
Labelbox Support