Hi! I have set up an integration with S3 and I can create data rows that point to files stored in my bucket. However, I saw that this only works if I whitelist Labelbox’s IP addresses, and that my data is being sent to Labelbox servers during the data-row creation process (to extract e.g. image or video metadata, if I understood correctly). I don’t want my data being sent to an external server, not even if it is just for processing. The only time the data should be pulled from S3 is when an annotator annotates an image, in which case the image would be downloaded directly to the annotator’s browser.
So: How can I create data rows (while using an integration with S3) and avoid sending my data to Labelbox servers? E.g. can I provide the required metadata myself or can I just not use this metadata?
Hey @philipp.andermatt
Welcome to the Labelbox Community,
The way we ingest self-hosted data like yours requires an initial download for processing (metadata extraction, embedding extraction, and thumbnail generation), but we only keep the outputs mentioned here, not the data itself.
So rest assured that, aside from these initial processing steps, we do not keep your data.
Many thanks,
PT
Hi @ptancre
Thank you for the quick reply!
If I would still like to avoid sending data to Labelbox servers (e.g. to avoid sending data from the EU to servers in the US), is there a way to do this? Or is it always required that data gets processed during ingestion?
The data processing is required since it cascades to the editor. We have strong data protection policies.
If you want more information about them, see: Privacy & Security Program | Labelbox
There is an email address at the bottom of that page if you have further questions about our policies and certifications.