Workflows for Rejecting/Deleting Examples

We are seeing cases where reviewers find that an example image is ambiguous enough to justify exclusion from our dataset. As an example, consider a cats and dogs classification project with an example that the reviewer cannot unambiguously identify as a cat or a dog. We want to remove it from the dataset altogether.

We do not, however, want to grant data row deletion privileges to our reviewers, as this would be a significant escalation in privilege. Instead, we want to flag the data row as “to be deleted” and clean it up later (delete it from the Labelbox project and from our managed storage resource at the same time) via a service account. What is the happy path for achieving this?

I’ve looked into workflows, but the states of “done” or “rework” don’t seem to allow this. If we could use logic to add a tag, that would be sufficient, i.e., the reviewer sends the label to a special “to be deleted” logical step that appends a metadata tag to the data row, flagging it for deletion before moving it to “done”. We could then scan for data rows with that specific tag.

Hey @sstansell ,

One option is to have the labelers skip those data rows or, as you mentioned, to add a classification such as to_delete or ambiguous, and then customize your workflow so that you can filter them down via Catalog.

I left a review step in the workflow to make sure those data rows still get reviewed.

Once the labeling and reviewing are done, you should be able to filter the data rows that you don’t need in Catalog and delete them either via the UI or the SDK.
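
If you go the SDK route, here is a rough sketch of the filtering part. It does not call the SDK itself; it only walks rows you have already exported. The nested shape of each row (data_row, projects, labels, annotations, classifications) is my assumption about the export JSON, so check it against a real export from your project, and to_delete is just a hypothetical classification name.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical classification name that reviewers use to flag a data row.
FLAG_NAME = "to_delete"


def _has_flag(label: Dict[str, Any], flag_name: str) -> bool:
    """Return True if the label carries a classification named flag_name."""
    classifications = label.get("annotations", {}).get("classifications", [])
    return any(c.get("name") == flag_name for c in classifications)


def flagged_data_rows(
    export_rows: Iterable[Dict[str, Any]], flag_name: str = FLAG_NAME
) -> List[Dict[str, Any]]:
    """Collect the id and global_key of every exported row flagged for deletion.

    Assumed row shape (verify against your own export):
      {"data_row": {"id": ..., "global_key": ...},
       "projects": {<project_id>: {"labels": [{"annotations": {...}}]}}}
    """
    flagged = []
    for row in export_rows:
        data_row = row.get("data_row", {})
        for project in row.get("projects", {}).values():
            labels = project.get("labels", [])
            if any(_has_flag(label, flag_name) for label in labels):
                flagged.append(
                    {
                        "id": data_row.get("id"),
                        "global_key": data_row.get("global_key"),
                    }
                )
                break  # one flagged label is enough for this data row
    return flagged
```

The returned ids can then be passed to whatever deletion call your service account uses, so reviewers never need delete privileges themselves.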

To close the loop, it depends on what the data rows have in common with the source of the data. Using either external_id or global_key, you would need a mapping between your export and your storage, and potentially a script to delete those assets in your storage.
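
A minimal sketch of that cleanup script could look like the following. Everything here is illustrative: key_to_uri is a hypothetical mapping you would build yourself from global_key (or external_id) to the asset location, and delete_asset is a placeholder for whatever delete call your storage client provides. Keys without a mapping are reported rather than guessed at.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def delete_flagged_assets(
    global_keys: Iterable[str],
    key_to_uri: Dict[str, str],
    delete_asset: Callable[[str], None],
) -> Tuple[List[str], List[str]]:
    """Delete the storage asset behind each flagged global key.

    global_keys:  keys of the data rows flagged for deletion.
    key_to_uri:   hypothetical mapping from global key to storage location.
    delete_asset: placeholder for your storage client's delete function.

    Returns (deleted_uris, keys_without_mapping).
    """
    deleted: List[str] = []
    missing: List[str] = []
    for key in global_keys:
        uri = key_to_uri.get(key)
        if uri is None:
            # No known storage location; leave it for manual follow-up.
            missing.append(key)
            continue
        delete_asset(uri)
        deleted.append(uri)
    return deleted, missing
```

Passing the delete function in as an argument keeps the script independent of any particular storage provider and makes it easy to dry-run with a function that only logs the URIs.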

Hope that gets you started!