Dataset Timing Out & Images Unavailable

Hi Labelbox Community,

I’m having trouble importing my image dataset and keeping it in the catalog. Twice now, a subset of the images has imported properly and been visible; I set up the ontology and even began labeling a few. However, when I check a few hours later, the images have disappeared: the catalog says they cannot be previewed, and if I try to label them I get “Asset Didn’t Load Properly”.

I have set up the GCS integration and am confused, as everything seemed to import more or less properly and then disappeared later. Can someone help me troubleshoot this issue?

Hi @LSG ,

Looking at your workspace, it seems you are uploading pre-signed URLs. Those signed URLs are expiring (usually after 24 hours):

<Error>
<Code>ExpiredToken</Code>
<Message>Invalid argument.</Message>
<Details>The provided token has expired. Request signature expired at: 2023-10-11T22:33:06+00:00</Details>
</Error>
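
To check when a given pre-signed URL stops working, you can read the expiry straight from its query string. This is a minimal sketch, assuming the URLs are GCS V4 signed URLs (which carry `X-Goog-Date` and `X-Goog-Expires` parameters); the helper names and the sample URL are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def signed_url_expiry(url):
    """Return the UTC expiry of a GCS V4 signed URL, or None if it is not one."""
    params = parse_qs(urlparse(url).query)
    if "X-Goog-Date" not in params or "X-Goog-Expires" not in params:
        return None
    # X-Goog-Date is the signing time in UTC, formatted as YYYYMMDDTHHMMSSZ.
    signed_at = datetime.strptime(params["X-Goog-Date"][0], "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    # X-Goog-Expires is the lifetime of the signature in seconds.
    return signed_at + timedelta(seconds=int(params["X-Goog-Expires"][0]))


def is_expired(url, now=None):
    """Return True if the signed URL has expired; unsigned URLs never expire."""
    expiry = signed_url_expiry(url)
    if expiry is None:
        return False
    now = now or datetime.now(timezone.utc)
    return now >= expiry


# Hypothetical signed URL with a 24-hour (86400 s) lifetime.
sample = (
    "https://storage.googleapis.com/my-bucket/images/cat.jpg"
    "?X-Goog-Algorithm=GOOG4-RSA-SHA256"
    "&X-Goog-Date=20231010T223306Z"
    "&X-Goog-Expires=86400"
    "&X-Goog-Signature=abc"
)
print(signed_url_expiry(sample))
```

Any data row whose `row_data` is a URL like this will stop loading once that time passes, which matches the symptom of images disappearing a few hours after import.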

Since you mention the integration, I would advise you to take a look at (and set up!): Google Cloud Storage
A fully functional integration allows Labelbox to read your data and create tokens on your behalf, without any intervention from you.

Many thanks,
PT

Can you clarify? The documentation feels contradictory: it provides a schema for the URL query and then says those cannot be used and only gsutil URIs are supported.

If a dataset is signed by a GCP IAM integration, Labelbox will attempt to sign all data rows with this integration. The value of rowData for each Data Row will be updated as follows:
<https://storage.googleapis.com/${bucket}/${key}?{queryParams}>
The queryParams contain signing information.

Only gsutil URIs are supported

Please ensure that you are using gsutil URIs during data import (JSON file or Python SDK).

Example gsutil URI: gs://gcs-lb-demo-bucket/test.png

I do have the Labelbox GCS integration set up for that dataset but am unable to get it to pull in the assets. I defaulted to pre-signed URLs because that was the only way I could get the data to import without an access error (I did follow the documentation in that link, but steps 4-5 were not working properly).

Sure, use the gsutil URI and not the authenticated URL to ingest data into Labelbox.

Your row_data needs to look like this:

[{'row_data': 'gs://lb_vertex_pt/training/images/clbw9ybuf25mx07yl5o2p6gq9_500_500.jpg'}]

That way we will be able to sign it.
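
If you already have a list of pre-signed or authenticated `https://storage.googleapis.com/...` URLs, something like the following could turn them into gsutil URIs in the `row_data` shape shown above. It is only a sketch: the function names are hypothetical, and it assumes path-style URLs of the form `https://storage.googleapis.com/<bucket>/<key>?<queryParams>`.

```python
from urllib.parse import unquote, urlparse

GCS_HOST = "storage.googleapis.com"


def to_gs_uri(url):
    """Convert a path-style GCS HTTPS URL to a gs:// URI, dropping any query string."""
    parsed = urlparse(url)
    if parsed.scheme == "gs":
        return url  # already a gsutil URI
    if parsed.netloc != GCS_HOST:
        raise ValueError(f"not a {GCS_HOST} URL: {url}")
    # The path looks like /<bucket>/<key>; the key may itself contain slashes.
    bucket, _, key = parsed.path.lstrip("/").partition("/")
    if not bucket or not key:
        raise ValueError(f"cannot find bucket and key in: {url}")
    return f"gs://{bucket}/{unquote(key)}"


def build_data_rows(urls):
    """Build the list of {'row_data': 'gs://...'} dicts used for import."""
    return [{"row_data": to_gs_uri(u)} for u in urls]


rows = build_data_rows([
    "https://storage.googleapis.com/lb_vertex_pt/training/images/"
    "clbw9ybuf25mx07yl5o2p6gq9_500_500.jpg?X-Goog-Expires=86400",
])
print(rows)
```

The resulting list is what you would then import, via a JSON file or the Python SDK, into a dataset that has the GCS integration attached.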

Many thanks,
PT