Access Denied when creating data rows via SDK; rows successfully imported after reprocessing

I am consistently unable to import Document data rows via create_data_rows in the Python SDK. All of the rows show the following error:
Access Denied: The data row could not be fetched because the user does not have access to it.

When I manually reprocess the rows, the access error goes away and the documents are successfully imported.
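For readers following along, here is a minimal sketch of the kind of import described above. It is not the original poster's code. It assumes that `row_data` can be a plain URL string and that the dataset object behaves like a labelbox `Dataset`, whose `create_data_rows` returns a task exposing `wait_till_done()` and `errors`. The helper names `build_document_rows` and `import_rows` are hypothetical.

```python
import uuid


def build_document_rows(pdf_urls):
    """Build one data row payload per PDF URL, each with a fresh global key.

    Assumption: a plain URL string is an acceptable row_data value.
    """
    return [
        {"row_data": url, "global_key": str(uuid.uuid4())}
        for url in pdf_urls
    ]


def import_rows(dataset, rows):
    """Submit the rows and return the errors reported by the import task.

    `dataset` is expected to behave like a labelbox Dataset, for example
    one obtained with labelbox.Client(api_key=...).get_dataset(dataset_id).
    """
    task = dataset.create_data_rows(rows)
    task.wait_till_done()  # block until the import has finished
    return task.errors or []  # normalize "no errors" to an empty list
```

Checking the returned errors right after the import makes it possible to detect rows stuck in an error state from a script instead of from the UI.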

In my workspace settings, the associated bucket roles are indeed connected. I have double-checked the associated policies, and they are also correct and should allow access to all objects in that bucket.

I have also confirmed that the URL is correct and does not contain any spaces or invalid characters.
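A URL check of this kind can be automated with the standard library. The sketch below is only an illustration: the function name `url_problems` and the specific checks are assumptions, not something Labelbox prescribes.

```python
from urllib.parse import urlsplit


def url_problems(url):
    """Return a list of human-readable problems found in an asset URL."""
    problems = []
    # Check the raw string first, since URL parsers may silently strip whitespace.
    if any(ch.isspace() for ch in url):
        problems.append("contains whitespace")
    if not url.isascii():
        problems.append("contains non-ASCII characters")
    try:
        parts = urlsplit(url)
    except ValueError:
        problems.append("cannot be parsed")
        return problems
    if not parts.scheme or not parts.netloc:
        problems.append("missing scheme or host")
    return problems
```

An empty list means none of these basic problems were found. It does not prove that the object exists or that access is permitted.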

Any thoughts? This is having an impact on our ability to automate labeling on a large scale.

Hi @ts
Is the CORS policy set up correctly too? There is a retry mechanism in case of failure. Who is your cloud provider, and what type of asset are you using?

I have confirmed that the CORS policy is in place. I am using AWS and the assets are PDF documents. The PDFs are relatively small, so we’re not bumping up against size or page limits.

Can you also clarify what you mean by the retry mechanism? I can’t seem to find anything in the docs or the labelbox-python GitHub repository.

Every time you import data, Labelbox will try to reach it 3 times. This is a backend process.
If you could leave one data row in an error state and reply here with the data row id I can take a look.
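To illustrate the general idea of a bounded retry (this is not Labelbox's backend code, which is not shown in this thread), here is a minimal sketch. The function name `fetch_with_retries` and its parameters are hypothetical.

```python
import time


def fetch_with_retries(fetch, attempts=3, delay_seconds=0.0):
    """Call `fetch` up to `attempts` times and return its first successful result.

    If every attempt raises, the last exception is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as error:  # illustrative: retry on any failure
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)  # pause before the next attempt
    raise last_error
```

With this pattern, a transient failure is absorbed by a later attempt, while a failure that repeats on every attempt still surfaces as an error after the final try.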

Thanks for the prompt response! I’ve got a data row right here for you:

id=clzviso9k3cvf0769dcgjq7ix
global_key=b6bf5ae5-eaa3-4595-bdc6-e99d8c60fa76

Is there any other information that you need in order to identify it?

I found the error. We generate a layer so that PDFs can be annotated, and there seems to be an issue with that step. I’m taking this for internal review. We can probably stop generating the layer if that helps? Let me know.

Interesting, so it’s the text layer that’s causing trouble? How can I disable generation?

You can’t, but we can if you want.

If you disable it, will I still be able to provide my own custom text layer? I had been planning on doing that eventually once my internal OCR pipeline is working. If so, I’m fine with you disabling it for the purposes of unblocking me.

Yep, you can still provide your own layer! I have disabled the auto-generation.
Give it a try to see if this works better.
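As a rough sketch of what supplying a custom text layer could look like in a data row payload: the field names `pdf_url` and `text_layer_url` below are an assumption based on the Labelbox document import format and should be checked against the current docs. The helper name `document_row` is hypothetical.

```python
def document_row(pdf_url, global_key, text_layer_url=None):
    """Build a document data row payload, optionally with a custom text layer.

    Assumption: row_data for documents is a dict with "pdf_url" and an
    optional "text_layer_url" pointing at the OCR output.
    """
    row_data = {"pdf_url": pdf_url}
    if text_layer_url is not None:
        row_data["text_layer_url"] = text_layer_url
    return {"row_data": row_data, "global_key": global_key}
```

Leaving `text_layer_url` unset keeps the payload valid for rows imported before an OCR pipeline is available.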

Fantastic, it worked! Thank you very much!

Once the issue is resolved internally, would it be possible to re-enable text layer generation and notify me? Another question, actually: is this setting at the account level? My coworkers will be using the workflow I am developing, and I’d like them to avoid running into the same error state.

Yes and yes, we will look into it. This is set at the workspace level, so it covers everyone.
