Request limit maxed by hanging export?

So I ran into this issue: export_v2 from a project no longer works, with the task stuck IN_PROGRESS indefinitely while running a batch of Project.export_v2() exports.

It started somewhat randomly too, after my code had worked many times with no tangible changes. I have not been able to try a workaround, however, because these never-ending tasks appear to still be running server side, and I receive the following error when I try the next export.

[WARNING][labelbox.client][04/03/2024 21:25:11] Unparsed errors on query execution: [{'message': 'Resource Limits Reached', 'errors': [{'error': 'You have reached your maximum number of concurrent export tasks limit in the project(s). Wait until some of the tasks in those project(s) are completed and try again.', 'maxExportTasksPerProject': 10, 'projectIds': ['cls8wukhr05di07z736nc0vir']}], 'locations': [{'line': 1, 'column': 78}], 'extensions': {'code': 'RESOURCE_LIMITS_REACHED', 'exception': {'message': 'Resource Limits Reached'}}, 'path': ['exportDataRowsInProject']}]
Error: Unknown error: [{'message': 'Resource Limits Reached', 'code': 'RESOURCE_LIMITS_REACHED'}]("Unknown error: [{'message': 'Resource Limits Reached', 'code': 'RESOURCE_LIMITS_REACHED'}]", None)

How can I terminate these tasks so I can move forward? Additionally, is there any idea why this issue happens in the first place?
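In the meantime, as a stopgap while the stuck tasks drain, I wrap each export call in a generic retry-with-backoff. This is just a sketch of my own (the with_backoff helper and the error-string check are mine, not Labelbox SDK API):

```python
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying with exponential backoff when the server
    reports the concurrent export-task limit (RESOURCE_LIMITS_REACHED)."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as exc:
            # Only retry the limit error, and only while retries remain;
            # anything else propagates immediately.
            if "RESOURCE_LIMITS_REACHED" not in str(exc) or attempt == max_retries - 1:
                raise
            # Give the server-side tasks time to drain before retrying.
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical usage:
# task = with_backoff(lambda: project.export_v2(params=export_params))
```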

Hello Xavier,

Welcome to Labelbox Community!

I see that all the export tasks for your project (cls8wukhr05di07z736nc0vir) are now completed (shown in green). Could you please confirm?

Thanks,
Sravani

Hey @xavier.nogueira
As mentioned in other posts, we would encourage you to use the streamable version of the export to avoid these issues going forward.

Here is a sample you can use/modify to your needs:

import labelbox as lb

API_KEY = None
PROJECT_ID = '<YOUR_PROJECT_ID>'
client = lb.Client(api_key=API_KEY, enable_experimental=True)
project = client.get_project(PROJECT_ID)

# Set the export params to include/exclude certain fields.
export_params = {
  "attachments": True,
  "metadata_fields": True,
  "data_row_details": True,
  "project_details": True,
  "label_details": True,
  "performance_details": True
}

# Note: Filters follow AND logic, so typically using one filter is sufficient.
filters = {
  "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
  "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
  # "global_keys": ["<global_key>", "<global_key>"],
  # "data_row_ids": ["<data_row_id>", "<data_row_id>"],
  # "batch_ids": ["<batch_id>", "<batch_id>"],
  # "workflow_status": "<workflow_status>"
}

export_task = project.export(params=export_params, filters=filters)

export_task.wait_till_done(timeout_seconds=120)

# Provide results with file converter

if export_task.has_errors():
    export_task.get_stream(
        converter=lb.FileConverter(file_path=f"./{project.name}_export_error.ndjson"),
        stream_type=lb.StreamType.ERRORS,
    ).start()

if export_task.has_result():
    export_task.get_stream(
        converter=lb.FileConverter(file_path=f"./{project.name}_export.ndjson"),
        stream_type=lb.StreamType.RESULT,
    ).start()
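Each output file above is NDJSON (one JSON object per line), so you can read it back incrementally rather than loading the whole export at once. A minimal sketch (iter_ndjson is just a helper name of mine, not part of the SDK):

```python
import json

def iter_ndjson(path):
    """Yield one parsed export record per line of an NDJSON file."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

# Hypothetical usage against the file written above:
# for record in iter_ndjson("./my_project_export.ndjson"):
#     print(record)
```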

Thank you! Here is the filled-out sample:

import labelbox

API_KEY = None
# labelbox.__version__ == '3.66.0'
PROJECT_ID = 'cls8wukhr05di07z736nc0vir'  # although it happens with other projects too
client = labelbox.Client(api_key=API_KEY, enable_experimental=True)
projects = client.get_projects()
project = ...  # we iterate and run the code below in a function for each project
               # whose name is in a given list; as of now we are testing with just
               # project.name == 'ttd_2023_sample'

# Set the export params to include/exclude certain fields.
export_params = {
    "attachments": True,
    "metadata_fields": True,
    "data_row_details": True,
    "project_details": True,
    "label_details": True,
    "performance_details": True,
    "interpolated_frames": False,
}

# Note: Filters follow AND logic, so typically using one filter is sufficient.
filters = {}

export_task = project.export_v2(params=export_params, filters=filters)

export_task.wait_till_done()

if export_task.errors:
    logger.warning(export_task.errors)

export_json = export_task.result
if isinstance(export_json, dict):
    return [export_json]
logger.debug(f"Exported {len(export_json)} labels for project {project.name}.")
return export_json

Also, for your reference, I tried running it again and it is still hanging indefinitely.

I can see you got a better result with the streamable method. Let me clean up the other tasks you created; keep using the provided method, and let me know if you run into any issues.

Yes, I am currently experimenting and the streamable version works. That said, I would rather not process the data in the streamable pattern, so getting the original version to work is still my priority, especially because it was working with no code changes just yesterday :thinking:

The tasks hang because we scale on demand, but sometimes there are spikes that can elongate the time it takes to export. That is why you may have no issues at some times, while at others the export appears to hang.

That makes sense. That said, the hanging I have been (and still am) experiencing is basically endless; I have yet to be able to wait it out, even after 30 minutes on an export that previously took a couple of seconds.

Out of curiosity, why does this not affect the streaming one? Would you recommend reworking my code to leverage the streaming pattern?

The distinction lies in the method of processing data. Rather than iterating through an entire paginated collection of data rows after exporting it, you process the data row by row during the initial retrieval process. This approach reduces the memory footprint and is more efficient when working with large exports.
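The contrast can be illustrated with plain Python, independent of the SDK: a paginated export materializes the full result list before you can touch it, while a streamed export yields one row at a time. (Both functions below are illustrative stand-ins, not Labelbox API.)

```python
def export_all(n):
    """Paginated style: the entire result list exists in memory at once."""
    return [{"row": i} for i in range(n)]

def export_stream(n):
    """Streaming style: rows are yielded one at a time, so only the
    current row is held in memory regardless of export size."""
    for i in range(n):
        yield {"row": i}

# Both produce the same records, but the generator never holds them all:
rows_seen = sum(1 for _ in export_stream(1_000_000))
```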

Going forward, project.export() will likely be the standard, given there is no such limitation.

That makes sense; I’ll likely rework towards the streaming model. Personally I would have preferred a single response, but there is no need to swim upstream if reliability would be uncertain.

That said, it’s not entirely clear to me why the non-streaming version completely stopped working. I still think that should be looked into and repaired; it does not seem like a volume issue alone, as it went from 3 seconds → indefinite at random.

Rest assured we are looking into it; the method shared with you is essentially a quick workaround so you do not have to wait to get your export.


Thank you! Keep up the good work, I’ll consider this solved for now, but please let me know once there are any updates to the SDK involving this issue.


@xavier.nogueira We pushed a new SDK version (3.67.0) so streamable becomes the default without you changing anything in your code.
This should solve the inconsistent state of the export.

labelbox · PyPI


Awesome, thank you! Of course, I just switched it over to my own stream handler, but I’ll switch it right back :sweat_smile:
