Hello Labelbox Community!
This tutorial walks you through importing annotations as prelabels, which can significantly simplify your labeling tasks.
Setting Up Your Environment
- Before diving into the specifics, ensure your development environment is ready. You’ll need the following Python libraries installed (an install command is sketched after the imports):
import uuid
from PIL import Image
import requests
import base64
import labelbox as lb
import labelbox.types as lb_types
from io import BytesIO
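- If these packages aren’t installed yet, the SDK can typically be added with pip; the optional [data] extra bundles the annotation types used below, and Pillow and requests cover the remaining imports:
# Run once in a terminal or notebook cell (shell command, not Python):
# pip install "labelbox[data]" Pillow requests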
- Next, initialize your Labelbox client with your API key:
api_key = "API_KEY"
client = lb.Client(api_key)
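- As a quick sanity check, you can confirm the key is valid before going further. A minimal sketch, assuming the SDK’s get_user() call for the currently authenticated account:
# Optional: verify the API key by fetching the authenticated user
current_user = client.get_user()
print(f"Connected to Labelbox as: {current_user.email}")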
Crafting Supported Annotations
Let’s start by defining the types of annotations you’ll be working with. We’ll cover radio classifications, bounding boxes, and polygons. For each type, the annotation name must exactly match the corresponding feature name in your ontology.
- Here’s how to create a radio button annotation, both in Python and in NDJSON format:
# Python annotation
radio_annotation = lb_types.ClassificationAnnotation(
name="Is it daytime or nighttime?",
value=lb_types.Radio(answer=lb_types.ClassificationAnswer(
name="Daytime")))
# NDJSON
radio_annotation_ndjson = {
"name": "Is it daytime or nighttime?",
"answer": {
"name": "Daytime"
}
}
- For bounding box annotations, remember to match the annotation name with your ontology feature’s name:
# Python annotation
bbox_annotation = lb_types.ObjectAnnotation(
name="Human", # must match your ontology feature's name
value=lb_types.Rectangle(
start=lb_types.Point(x=55.0, y=670.0),
end=lb_types.Point(x=151.0, y=956.0)
))
# NDJSON
bbox_annotation_ndjson = {
"name": "Human",
"bbox": {
"top": 670.0,
"left": 55.0,
"height": 286.0,
"width": 96.0
}
}
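- The NDJSON bbox fields are just a different encoding of the same two corner points: top/left come from the start point, and height/width are the coordinate differences. A hypothetical helper (not part of the SDK) makes the relationship explicit:
# Hypothetical helper: derive the NDJSON bbox fields from the two corner points
def rectangle_to_ndjson_bbox(start: lb_types.Point, end: lb_types.Point) -> dict:
    return {
        "top": start.y,             # 670.0
        "left": start.x,            # 55.0
        "height": end.y - start.y,  # 956.0 - 670.0 = 286.0
        "width": end.x - start.x,   # 151.0 - 55.0 = 96.0
    }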
- Polygon annotations require specifying multiple vertices:
# Python annotation
polygon_annotation = lb_types.ObjectAnnotation(
name="Shopping store", # must match your ontology feature's name
value=lb_types.Polygon( # Coordinates for the vertices of your polygon
points=[
lb_types.Point(x=1.2, y=1.2),
lb_types.Point(x=380.4, y=0.0),
lb_types.Point(x=568.8, y=426.0),
lb_types.Point(x=562.8, y=692.4),
lb_types.Point(x=0.0, y=889.2),
lb_types.Point(x=1.2, y=1.2)
]))
# NDJSON
polygon_annotation_ndjson = {
"name": "Shopping store",
"polygon": [
{"x": 1.2, "y": 1.2},
{"x": 380.4, "y": 0.0},
{"x": 568.8, "y": 426.0},
{"x": 562.8, "y": 692.4},
{"x": 0.0, "y": 889.2},
{"x": 1.2, "y": 1.2}
]
}
Importing Data Rows into the Catalog
- Now, let’s move on to importing data rows into your catalog. This involves creating a dataset, uploading an image, and handling potential errors:
# Send a sample image as a batch to the project
global_key = "stanford-test-image"
test_img_url = {
    "row_data": "https://labelbox-jannybucket.s3.us-west-2.amazonaws.com/stanford-shopping-center-06.jpg",
    "global_key": global_key
}
dataset = client.create_dataset(name="stanford-demo-dataset")
task = dataset.create_data_rows([test_img_url])
task.wait_till_done()
print(f"Failed data rows: {task.failed_data_rows}")
print(f"Errors: {task.errors}")
if task.errors:
for error in task.errors:
if 'Duplicate global key' in error['message'] and dataset.row_count == 0:
# If the global key already exists in the workspace the dataset will be created empty, so we can delete it.
print(f"Deleting empty dataset: {dataset}")
dataset.delete()
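- Once the task finishes, you can confirm the data row is reachable by its global key. A small sketch, assuming the SDK’s global-key lookup, which typically returns a dict with status and results fields:
# Optional check: resolve the global key back to a data row id
lookup = client.get_data_row_ids_for_global_keys([global_key])
print(f"Lookup status: {lookup['status']}, data row ids: {lookup['results']}")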
- Create or select an ontology. Ensure your project has the correct ontology set up, matching the tool names and classification instructions with your annotations:
ontology_builder = lb.OntologyBuilder(
classifications=[ # List of Classification objects
lb.Classification(class_type=lb.Classification.Type.RADIO,
name="Is it daytime or nighttime?",
options=[
lb.Option(value="Daytime"),
lb.Option(value="Nighttime")
]),
],
tools=[ # List of Tool objects
lb.Tool(tool=lb.Tool.Type.BBOX, name="Human"),
lb.Tool(tool=lb.Tool.Type.POLYGON, name="Shopping store"),
])
ontology = client.create_ontology("stanford-test-ontology",
ontology_builder.asdict(),
media_type=lb.MediaType.Image
)
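- Because MAL imports match annotations to ontology features by name, it can be worth asserting that the names line up before uploading. A rough sketch against the builder defined above (assuming its tools and classifications are exposed as lists with a name attribute):
# Rough sanity check: every annotation name should exist as an ontology feature name
expected_names = {"Is it daytime or nighttime?", "Human", "Shopping store"}
ontology_names = ({c.name for c in ontology_builder.classifications}
                  | {t.name for t in ontology_builder.tools})
assert expected_names <= ontology_names, f"Missing features: {expected_names - ontology_names}"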
- Create a labeling project.
# If not specified otherwise, the project defaults to batch mode with benchmark quality settings
project = client.create_project(name="stanford demo",
media_type=lb.MediaType.Image)
project.setup_editor(ontology)
- Send the batch of data rows to the project:
batch = project.create_batch(
"stanford-demo-batch", # each batch in a project must have a unique name
global_keys=[
global_key
], # paginated collection of data row objects, list of data row ids or global keys
priority=1 # priority between 1(highest) - 5(lowest)
)
print(f"Batch: {batch}")
Creating the Annotation Payload
Both Python and NDJSON formats are supported for annotations:
- Python annotations
label = []
annotations = [
radio_annotation,
bbox_annotation,
polygon_annotation,
]
label.append(
lb_types.Label(data={"global_key" : global_key},
annotations=annotations))
- NDJSON annotations
label_ndjson = []
annotations = [
radio_annotation_ndjson,
bbox_annotation_ndjson,
polygon_annotation_ndjson,
]
for annotation in annotations:
annotation.update({
"dataRow": {
"globalKey": global_key
}
})
label_ndjson.append(annotation)
Uploading Annotations as Prelabels
Finally, upload your annotations to the project:
# upload MAL labels for this data row in project
upload_job = lb.MALPredictionImport.create_from_objects(
client=client,
project_id=project.uid,
name="mal_job" + str(uuid.uuid4()),
predictions=label
)
upload_job.wait_until_done()
print(f"Errors: {upload_job.errors}")
print(f"Status of uploads: {upload_job.statuses}")
Good things to note:
- Pre-labels (aka model-assisted labeling, MAL) are for assets (data rows) that do not yet have a label (ground truth) assigned.
- Ground truth (GT) imports create a label directly on a given data row (a sketch of this import path follows below).
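For completeness, if you want to import the same payload as ground truth rather than as prelabels, the SDK offers a parallel import class. A minimal sketch, assuming lb.LabelImport with the same payload built above:
# Ground-truth import: creates labels directly instead of prelabels
gt_upload_job = lb.LabelImport.create_from_objects(
    client=client,
    project_id=project.uid,
    name="label_import_job" + str(uuid.uuid4()),
    labels=label
)
gt_upload_job.wait_until_done()
print(f"Errors: {gt_upload_job.errors}")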
With this guide, you’re well on your way to enhancing your labeling process with prelabeled annotations in Labelbox. Happy annotating!