[Model] Evaluation Metrics for Multi-Class Object Detection Use Cases

How does Labelbox handle multi-class evaluation metric calculations (e.g. precision and recall) for object detection use cases?

For example, if I have 100 images with 8 different classes (i.e. 8 different bounding box detections) in EACH image, how are precision, recall, etc. calculated? Is any sort of average or weighted-average scheme used across all the detections across all the images?

Hey @Kush

Great question. For multiple classes (which we refer to as schemas in Labelbox), we take an arithmetic mean of the averages across them to obtain each metric.

Bear in mind that, depending on the data imported (prediction + confidence score at minimum), we also provide precision and recall per object class.
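
To make the averaging concrete, here is a minimal sketch of an unweighted (macro) mean over per-class precision and recall. This is only an illustration, not Labelbox's actual implementation: the `ClassCounts` type, the class names, and the counts are all hypothetical, and it assumes predictions have already been matched to ground truth (e.g. by an IoU threshold) so that true positives, false positives, and false negatives are known per class.

```python
from dataclasses import dataclass


@dataclass
class ClassCounts:
    """Hypothetical per-class detection counts, summed over all images."""
    tp: int  # predictions matched to a ground-truth box of this class
    fp: int  # predictions with no matching ground-truth box
    fn: int  # ground-truth boxes with no matching prediction


def precision(c: ClassCounts) -> float:
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: ClassCounts) -> float:
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def macro_average(values) -> float:
    """Unweighted arithmetic mean: every class counts equally."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Made-up counts for two classes (schemas), for illustration only.
counts = {
    "car": ClassCounts(tp=80, fp=20, fn=20),
    "person": ClassCounts(tp=45, fp=5, fn=55),
}

per_class_precision = {name: precision(c) for name, c in counts.items()}
per_class_recall = {name: recall(c) for name, c in counts.items()}

macro_precision = macro_average(per_class_precision.values())
macro_recall = macro_average(per_class_recall.values())

print(per_class_precision, per_class_recall)
print(macro_precision, macro_recall)
```

Because the mean is unweighted, a class with few boxes influences the overall number as much as a class with many, which is why the per-class values are worth checking alongside the aggregate.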

Many thanks,