Labelbox_json2yolo.py

Hi there,

  1. For computer vision, people mostly use YOLOv8 these days, so I have exported my annotations from Labelbox and want to figure out how to convert them to YOLO format, because my goal is to train my own model with YOLOv8 in the KerasCV library.

So I was looking at your post "Accelerate your computer vision journey with YOLOv5 and Labelbox".

But the code provided there does not work: https://github.com/ultralytics/JSON2YOLO/blob/master/labelbox_json2yolo.py
Specifically, Google Colab has an issue with the utils import (I tried !pip install utils, but that did not help). This line of the script fails:
save_dir = make_dirs(file.stem)
with NameError: name 'make_dirs' is not defined.

  2. Do you plan to support converting Labelbox JSON to other model formats (something like Roboflow's Labelbox conversions)?

Thanks,
MoZen

For my exported NDJSON, I was able to modify the Ultralytics conversion code as follows, and it now converts to YOLO format, one record at a time, in Colab. I hope it works when I train it with KerasCV!

import json
import os
from pathlib import Path
import requests
import yaml
from PIL import Image
from tqdm import tqdm

def make_dirs(path):
    # Create the output directory (and any parents) if it does not already exist.
    path = Path(path)
    if not path.exists():
        path.mkdir(parents=True)
    return path


def convert(file, zip=True):
    names = []  # class names, in the order they are first seen
    file = Path(file)
    save_dir = make_dirs(file.stem)

    # The NDJSON export has one JSON object per line (one per data row).
    data = []
    with open(file, 'r') as f:
        for line in f:
            data.append(json.loads(line))

    for img in tqdm(data, desc=f'Converting {file}'):
        # row_data is either a hosted URL or a local path
        im_path = img['data_row']['row_data']
        external_id = img['data_row']['external_id']
        im = Image.open(requests.get(im_path, stream=True).raw if im_path.startswith('http') else im_path)
        width, height = im.size

        labels_dir = save_dir / 'labels'
        os.makedirs(labels_dir, exist_ok=True)
        label_path = labels_dir / Path(external_id).with_suffix('.txt').name

        images_dir = save_dir / 'images'
        os.makedirs(images_dir, exist_ok=True)
        image_path = images_dir / external_id
        im.save(image_path, quality=95, subsampling=0)

        for project in img['projects'].values():
            for label in project['labels']:
                for obj in label['annotations']['objects']:
                    # Labelbox boxes are top/left/height/width in pixels;
                    # YOLO wants normalized center-x, center-y, width, height.
                    top, left, h, w = obj['bounding_box'].values()
                    xywh = [(left + w / 2) / width, (top + h / 2) / height, w / width, h / height]
                    cls = obj['name']
                    if cls not in names:
                        names.append(cls)
                    line = names.index(cls), *xywh
                    with open(label_path, 'a') as f:
                        f.write(('%g ' * len(line)).rstrip() % line + '\n')

    # Write a dataset YAML that YOLO-style trainers can read.
    d = {'path': f"../datasets/{file.stem}", 'train': "images/train", 'val': "images/val", 'test': "", 'nc': len(names), 'names': names}
    with open(save_dir / file.with_suffix('.yaml').name, 'w') as f:
        yaml.dump(d, f, sort_keys=False)

    if zip:
        print(f'Zipping as {save_dir}.zip...')
        os.system(f'zip -qr {save_dir}.zip {save_dir}')
    print('Conversion completed successfully!')

if __name__ == '__main__':
    convert('export_result_small.ndjson')
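
If it helps anyone, a quick sanity check like the one below can confirm that every box ended up normalized to the 0-1 range before training (rough sketch only; it assumes the labels/ layout the script above creates, and check_labels is just a throwaway helper name):

from pathlib import Path

def check_labels(save_dir, limit=3):
    # Print the first few YOLO label files and verify the xywh values are normalized.
    for txt in sorted(Path(save_dir, 'labels').glob('*.txt'))[:limit]:
        print(f'--- {txt.name} ---')
        for row in txt.read_text().splitlines():
            cls, *xywh = row.split()
            assert all(0.0 <= float(v) <= 1.0 for v in xywh), f'out-of-range box in {txt.name}'
            print(cls, xywh)

check_labels('export_result_small')  # save_dir is the stem of the NDJSON file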

Hi,

Thanks for sharing your code. I am trying to get it to work on my end, but it seems to fail to read the image with PIL. (The images on Labelbox are .bmp, but that should normally not be an issue for PIL.)

I get the following error:
UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7fda5f781530>

Did you encounter something similar?
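
One workaround I'm considering is downloading the whole file into memory before handing it to PIL, in case the streamed .raw object is what PIL is choking on. Roughly something like this (untested sketch; open_image is just a hypothetical helper that would replace the Image.open(requests.get(...).raw ...) line in your loop):

import io
import requests
from PIL import Image

def open_image(im_path):
    # Fetch remote files completely before decoding; fall back to a local path otherwise.
    if im_path.startswith('http'):
        resp = requests.get(im_path)
        resp.raise_for_status()  # surface HTTP errors instead of feeding an error page to PIL
        return Image.open(io.BytesIO(resp.content))
    return Image.open(im_path)

If the request is actually returning an error page (e.g. an expired delivery URL), raise_for_status should at least make that obvious.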
