Video upload example script is not working (same issue with our code)

What I am trying to do:

  • Upload a video in GCS to a dataset via dataset ID.
  • The dataset was newly created, and contains no videos.
  • I verified the dataset ID is correct.

Problem:

  • I am getting an error saying labelbox.exceptions.InvalidAttributeError: Field(s) ''global_key'' not valid on DB type 'DataRow'("Field(s) ''global_key'' not valid on DB type 'DataRow'", None).
  • If I remove global_key, the same error occurs for the other optional fields, as specified in the Labelbox documentation here.
  • I then decided to test with the sample code provided in the docs, but the same issue happens!

This is the code we are running:

from labelbox import Client
from uuid import uuid4  # to generate unique IDs
import datetime
import os

def main():
    API_KEY = os.environ["LABEL_BOX_API_KEY"]
    client = Client(api_key=API_KEY)
    dataset = client.get_dataset("clvdcc6uw002g0770q2cfuz9v")

    assets = [
      {
        "row_data": "https://storage.googleapis.com/labelbox-datasets/video-sample-data/sample-video-1.mp4",
        "global_key": "https://storage.googleapis.com/labelbox-datasets/video-sample-data/sample-video-1.mp44",
        "media_type": "VIDEO",
        "metadata_fields": [{"name": "<metadata_field_name>", "value": "tag_string"}],
        "attachments": [{"type": "VIDEO", "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/drone_video.mp4" }]
      },
      {
        "row_data": "https://storage.googleapis.com/labelbox-datasets/video-sample-data/sample-video-2.mp4",
        "global_key": "https://storage.googleapis.com/labelbox-datasets/video-sample-data/sample-video-2.mp4",
        "media_type": "VIDEO",
        "metadata_fields": [{"name": "<metadata_field_name>", "value": "tag_string"}],
        "attachments": [{"type": "TEXT_URL", "value": "https://storage.googleapis.com/labelbox-sample-datasets/Docs/text_attachment.txt"}]
      }
    ]

    task = dataset.create_data_rows(assets)
    task.wait_till_done()
    print(task.errors)

if __name__ == "__main__":
    main()

Full traceback from running the above script:

Traceback (most recent call last):
  File "debug_labelbox_video.py", line 34, in <module>
    main()
  File "debug_labelbox_video.py", line 29, in main
    task = dataset.create_data_rows(assets)
  File "/Users/xaviernogueira/miniconda3/envs/ml-data-curation/lib/python3.8/site-packages/labelbox/schema/dataset.py", line 157, in create_data_rows
    items = [convert_item(item) for item in items]
  File "/Users/xaviernogueira/miniconda3/envs/ml-data-curation/lib/python3.8/site-packages/labelbox/schema/dataset.py", line 157, in <listcomp>
    items = [convert_item(item) for item in items]
  File "/Users/xaviernogueira/miniconda3/envs/ml-data-curation/lib/python3.8/site-packages/labelbox/schema/dataset.py", line 136, in convert_item
    item = {
  File "/Users/xaviernogueira/miniconda3/envs/ml-data-curation/lib/python3.8/site-packages/labelbox/schema/dataset.py", line 137, in <dictcomp>
    key if isinstance(key, Field) else DataRow.field(key): value
  File "/Users/xaviernogueira/miniconda3/envs/ml-data-curation/lib/python3.8/site-packages/labelbox/orm/model.py", line 365, in field
    raise InvalidAttributeError(cls, field_name)
labelbox.exceptions.InvalidAttributeError: Field(s) ''global_key'' not valid on DB type 'DataRow'("Field(s) ''global_key'' not valid on DB type 'DataRow'", None)

Mind helping me out here? I would have expected the sample code to work.


Hey,
For the row data URL, you have to use the gsutil URI (gs://), not the HTTPS one. Here is a link to the Labelbox docs.

Thanks,
Gabe
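
For reference, the conversion described above can be sketched as follows. This is only an illustration, assuming a path-style GCS URL of the form https://storage.googleapis.com/BUCKET/OBJECT; the helper name https_to_gs is hypothetical and is not part of the Labelbox SDK.

```python
from urllib.parse import unquote, urlparse

# Hosts that serve GCS objects with a path-style /BUCKET/OBJECT layout.
GCS_HOSTS = ("storage.googleapis.com", "storage.cloud.google.com")


def https_to_gs(url: str) -> str:
    """Convert a path-style GCS HTTPS URL to a gs:// URI.

    Hypothetical helper for illustration; not part of the Labelbox SDK.
    """
    parsed = urlparse(url)
    if parsed.scheme == "gs":
        return url  # already a gs:// URI, nothing to do
    if parsed.netloc not in GCS_HOSTS:
        raise ValueError(f"Not a path-style GCS URL: {url}")
    # The first path segment is the bucket; the rest is the object name.
    bucket, _, obj = parsed.path.lstrip("/").partition("/")
    if not bucket or not obj:
        raise ValueError(f"Missing bucket or object name: {url}")
    return f"gs://{bucket}/{unquote(obj)}"


if __name__ == "__main__":
    print(https_to_gs(
        "https://storage.googleapis.com/labelbox-datasets/"
        "video-sample-data/sample-video-1.mp4"
    ))
```

Note that the bucket name comes right after gs://, so the storage.googleapis.com host is dropped rather than kept in the URI.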

Thanks for the quick response! @gunderwood

So I tried the following and neither solution worked:

  • row_data="gs://storage.googleapis.com/labelbox-datasets/video-sample-data/sample-video-1.mp4"
  • row_data="gs://labelbox-datasets/video-sample-data/sample-video-1.mp4"

Am I missing something?

Also, you might want to update the documentation then, as I copied the code verbatim.

Hey @xavier.nogueira ,

Good point on the documentation; this is a clear oversight.
I just used this to upload 2 videos from a GCP bucket with the latest SDK version:

assets = [
      {
        "row_data": "gs://test_upload_from_lb/airport-36510.mp4",
        "global_key": "https://storage.cloud.google.com/test_upload_from_lb/airport-36510.mp4",
        "media_type": "VIDEO"
      },
      {
        "row_data": "gs://test_upload_from_lb/dubbing.mp4",
        "global_key": "https://storage.cloud.google.com/test_upload_from_lb/dubbing.mp4",
        "media_type": "VIDEO"
      }
    ]

No issues. Are you still getting the global key error? What version of the SDK are you using?


Thanks @ptancre, it didn’t work for me at first, but once I updated labelbox to the latest version it worked.
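
For anyone hitting the same InvalidAttributeError, a quick way to see which SDK version is installed is sketched below. It uses only the standard library (importlib.metadata, available from Python 3.8); the helper name installed_version is hypothetical, and the thread does not state which release first accepted global_key.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed labelbox version, or a notice if it is missing.
    print("labelbox:", installed_version("labelbox") or "not installed")
```

If the reported version is old, upgrading with pip install --upgrade labelbox and rerunning the script is the fix that worked in this thread.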
