Export v2 Format - Video Keyframes Information Incomplete!

btalberg · November 14, 2023, 11:11pm

I’m unable to generate valid training data from labeled video data because of incomplete/missing keyframe information in the export v2 JSON.

Our ontology has an optional radio category, which we use to describe an activity that occurs in a classroom context. For example, when a student walks out of frame, we choose NOT to apply a label from that radio category.

The keyframe data in the export JSON does not explicitly describe when a radio option is toggled on or off; it only records that a radio option was clicked. I have no choice but to assume that each repeated keyframe with the same category value represents the user toggling the value on or off. That assumption holds in most cases, but not all.

I captured this Loom video to demonstrate how the issue arises: Loom | Free Screen & Video Recording Software | Loom

And here’s a snippet of the problematic JSON. There should be two time periods/segments in which radio classification “A” applies: frames 1-9 and frames 15-240. Because it does not include the state of the radio classification button, the export v2 JSON format lacks the information I need to reconstruct these segments:

"annotations": {
  "frames": {
      "1": {
          "objects": {},
          "classifications": [
              {
                  "feature_id": "cloywq54x00053b6n7lpmmdpx",
                  "feature_schema_id": "cloyvg6200e3u072qfo24esaj",
                  "name": "Keyframe Bug Demo Radio",
                  "radio_answer": {
                      "feature_id": "cloywq54x00063b6nrw2b1mi0",
                      "feature_schema_id": "cloyvg6200e3v072qhxyk2wpc",
                      "name": "A",
                      "classifications": []
                  }
              }
          ]
      },
      "9": {
          "objects": {},
          "classifications": [
              {
                  "feature_id": "cloywq54x00053b6n7lpmmdpx",
                  "feature_schema_id": "cloyvg6200e3u072qfo24esaj",
                  "name": "Keyframe Bug Demo Radio",
                  "radio_answer": {
                      "feature_id": "cloywq54x00063b6nrw2b1mi0",
                      "feature_schema_id": "cloyvg6200e3v072qhxyk2wpc",
                      "name": "A",
                      "classifications": []
                  }
              }
          ]
      },
      "15": {
          "objects": {},
          "classifications": [
              {
                  "feature_id": "cloywq54x00053b6n7lpmmdpx",
                  "feature_schema_id": "cloyvg6200e3u072qfo24esaj",
                  "name": "Keyframe Bug Demo Radio",
                  "radio_answer": {
                      "feature_id": "cloywq54x00063b6nrw2b1mi0",
                      "feature_schema_id": "cloyvg6200e3v072qhxyk2wpc",
                      "name": "A",
                      "classifications": []
                  }
              }
          ]
      },
      "20": {
          "objects": {},
          "classifications": [
              {
                  "feature_id": "cloywq54x00053b6n7lpmmdpx",
                  "feature_schema_id": "cloyvg6200e3u072qfo24esaj",
                  "name": "Keyframe Bug Demo Radio",
                  "radio_answer": {
                      "feature_id": "cloywq54x00063b6nrw2b1mi0",
                      "feature_schema_id": "cloyvg6200e3v072qhxyk2wpc",
                      "name": "A",
                      "classifications": []
                  }
              }
          ]
      },
      "240": {
          "objects": {},
          "classifications": [
              {
                  "feature_id": "cloywq54x00053b6n7lpmmdpx",
                  "feature_schema_id": "cloyvg6200e3u072qfo24esaj",
                  "name": "Keyframe Bug Demo Radio",
                  "radio_answer": {
                      "feature_id": "cloywq54x00063b6nrw2b1mi0",
                      "feature_schema_id": "cloyvg6200e3v072qhxyk2wpc",
                      "name": "A",
                      "classifications": []
                  }
              }
          ]
      }
  },
  "segments": {
      "cloywq54x00053b6n7lpmmdpx": [
          [
              1,
              1
          ],
          [
              9,
              9
          ],
          [
              15,
              15
          ],
          [
              20,
              20
          ],
          [
              240,
              240
          ]
      ]
  },
  "key_frame_feature_map": {
      "cloywq54x00053b6n7lpmmdpx": [
          1,
          9,
          15,
          20,
          240
      ]
  },
  "classifications": []
}
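
To make the ambiguity concrete, here is a minimal sketch of the "every repeated keyframe is a toggle" assumption applied to the `key_frame_feature_map` values above. The helper name `pair_keyframes` and the `last_frame` parameter are illustrative only and not part of any Labelbox API; `last_frame=240` is an assumption, since the video length is not part of the snippet.

```python
def pair_keyframes(keyframes, last_frame):
    """Treat keyframes as alternating on/off toggles and pair them into segments."""
    ordered = sorted(keyframes)
    segments = []
    for i in range(0, len(ordered), 2):
        start = ordered[i]
        # An unpaired final keyframe is assumed to stay "on" until last_frame.
        end = ordered[i + 1] if i + 1 < len(ordered) else last_frame
        segments.append((start, end))
    return segments


keyframes = [1, 9, 15, 20, 240]          # from key_frame_feature_map
expected = [(1, 9), (15, 240)]           # what was actually labeled

guessed = pair_keyframes(keyframes, last_frame=240)
print(guessed)               # [(1, 9), (15, 20), (240, 240)]
print(guessed == expected)   # False
```

The pairing heuristic splits the second segment at frame 20, so it cannot recover frames 15-240 from the keyframe list alone.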

The solution to this bug would be to include one of the following options in the export v2 JSON:

Include an attribute that describes the state of the radio category/button, i.e. whether a keyframe represents a category option being selected or deselected (toggled on or off)
Provide a keyframe grouping for keyframes that repeat the same category option and occur in sequence
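
As a rough illustration of the first option, the sketch below assumes each keyframe carried a hypothetical `state` value of `"on"` or `"off"`. No such attribute exists in the export v2 JSON today, and the states assigned to the sample keyframes are assumptions chosen to match the intended segments (frame 20 is treated as a keyframe that leaves the option selected).

```python
def segments_from_states(keyframes):
    """Build (start, end) segments from (frame, state) pairs.

    'state' is a hypothetical attribute: "on" selects the option,
    "off" deselects it. A repeated "on" while already selected is ignored.
    """
    segments = []
    start = None
    for frame, state in sorted(keyframes):
        if state == "on" and start is None:
            start = frame
        elif state == "off" and start is not None:
            segments.append((start, frame))
            start = None
    if start is not None:
        # Option is still selected at the last keyframe; the end is unknown.
        segments.append((start, None))
    return segments


# Hypothetical states for the keyframes in the snippet above.
tagged = [(1, "on"), (9, "off"), (15, "on"), (20, "on"), (240, "off")]
print(segments_from_states(tagged))  # [(1, 9), (15, 240)]
```

With an explicit state, reconstructing the two segments takes a single pass and no guessing.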

In the meantime, I feel I have no option but to run a custom GraphQL query that I can scrape from your frontend UI.

(Note: I struggled with the same problem in the “Checkbox JSON output format” topic.)

btalberg · November 15, 2023, 4:58pm

FYI: I realize I can use the Python SDK and its ORM to:

Load the project
Loop through all of the project’s batches
Loop through each batch’s data rows
Loop through every data row’s labels
Download each label’s “frames” jsonl
Loop through the frames JSON to build an accurate reflection of the labels our labelers applied

This will work, but it is more complex, and I imagine it will require many hundreds of API calls, as opposed to the export v2 approach, which requires only a couple of calls.

Here’s a bit of code I’ve come up with to get me started:

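import json
import os

import labelbox as lb
import requests

# Read the API key and project name from the environment
# (the environment variable names here are placeholders)
LABELBOX_API_KEY = os.environ["LABELBOX_API_KEY"]
LABELBOX_PROJECT_NAME = os.environ["LABELBOX_PROJECT_NAME"]

client = lb.Client(api_key=LABELBOX_API_KEY)
labelbox_project_name = LABELBOX_PROJECT_NAME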

lb_project = client.get_projects(
    where=lb.Project.name == labelbox_project_name
).get_one()

for lb_batch in lb_project.batches():
    for lb_data_row in lb_batch.export_data_rows(include_metadata=True):
        for lb_label in lb_data_row.labels():
            label = json.loads(lb_label.label)

            response = requests.get(
                url=label['frames'],
                headers=client.headers
            )

            frames = []
            for json_l_line in response.iter_lines(decode_unicode=True):
                if json_l_line:
                    frames.append(json.loads(json_l_line))
            
            # Work with labeled data, frame by frame
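
Once the per-frame records are loaded, the last step is to turn "frames where option A is present" back into segments. The sketch below covers only that grouping step. It assumes the frame numbers have already been pulled out of the `frames` records (their exact shape is not shown here, so that extraction is left out), and `group_consecutive` is an illustrative helper name.

```python
def group_consecutive(frame_numbers):
    """Collapse frame numbers into inclusive (start, end) runs of consecutive frames."""
    segments = []
    for n in sorted(set(frame_numbers)):
        if segments and n == segments[-1][1] + 1:
            # Extends the current run by one frame.
            segments[-1][1] = n
        else:
            # Gap found (or first frame): start a new run.
            segments.append([n, n])
    return [tuple(s) for s in segments]


# Frames on which radio option "A" applies in the example above.
frames_with_a = list(range(1, 10)) + list(range(15, 241))
print(group_consecutive(frames_with_a))  # [(1, 9), (15, 240)]
```

Because this works from every labeled frame rather than from keyframes, it does not depend on knowing whether a click turned the option on or off.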
