Efficiently Fetch Label Mask PNGs

Hello All,

I’m very new to Labelbox, so I apologize for the newbie question.

I’m still figuring out how this system is supposed to be used, and I’m struggling with how I should be fetching the label mask PNGs. For example, if I follow along with this tutorial: https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html

You’ll notice that there is a dataset class (unrelated to the Labelbox dataset class), and the all-important function in that class is “__getitem__”. The first time I used this learner, I was loading images from local storage, which I had generated with a different labeler. That was super quick. Now I’ve been trying to fetch the mask PNGs from the “instanceURI” associated with each label every time __getitem__ is called. Each fetch takes about 0.5 seconds, which is obviously too slow to do any real training with.

One solution I considered was pre-fetching all of the labels from the database into a Python array, but then we’re looking at storing gigabytes of images in RAM, which obviously isn’t going to work. So I figure that I need to download all of the masks from the server and store them locally? That seems a little counter to this cool service you guys have, which already stores all the labels… I just wanted to check that I’m using this tool correctly, because I feel like this is a solved problem.

Thanks for dealing with a n00b :slight_smile:

-Tomas

Hi @tcastrosantos, it is recommended that you pre-fetch/download all masks first and then convert the data to the format your training code expects. For example, check out this colab notebook. In this notebook, I download all images and masks at once using the APIs and then load them into the format accepted by PyTorch.
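
For anyone reading along, here is a rough sketch of the "download once, then read from disk" pattern described above. It is not the code from the notebook and it does not use the Labelbox SDK. It assumes you have already collected the mask URLs (for example the “instanceURI” values) into a plain list and that they can be fetched with a simple HTTP GET; if they need an API key or other headers, you would pass your own `fetch` function. The names `fetch_bytes`, `cache_masks`, and `CachedMaskDataset` are made up for this example.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_bytes(url, timeout=30):
    """Download one URL and return the raw response body (assumes no auth is needed)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def cache_masks(mask_urls, cache_dir, fetch=fetch_bytes):
    """Download each mask once; return local paths in the same order as mask_urls."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for url in mask_urls:
        # Name the file by a hash of the URL so later runs find the existing copy.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".png"
        path = cache_dir / name
        if not path.exists():
            tmp = path.with_suffix(".part")
            tmp.write_bytes(fetch(url))
            tmp.replace(path)  # rename only after the download has completed
        paths.append(path)
    return paths


class CachedMaskDataset:
    """Map-style dataset that reads masks from local disk, never from the network."""

    def __init__(self, mask_paths):
        self.mask_paths = list(mask_paths)

    def __len__(self):
        return len(self.mask_paths)

    def __getitem__(self, idx):
        # Returns raw PNG bytes; a real pipeline would decode them here.
        return self.mask_paths[idx].read_bytes()
```

You would call `cache_masks(urls, "masks/")` once before training and hand the returned paths to the dataset. In a real PyTorch setup the class would presumably subclass `torch.utils.data.Dataset` and decode the PNG into a tensor inside `__getitem__`, but the key point is the same: the slow network fetch happens once per mask rather than once per training step, and nothing has to be held in RAM.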