[Catalog] TIP - Using Slices with Creation Date Metadata Filter to Auto-Sync Data Rows For a Given Month

Kush · July 28, 2023, 9:20pm

ML teams often need to re-run or re-train models on newer data at a regular cadence (weekly, monthly, etc.) for a variety of reasons (e.g. distributional data drift).

A common way to organize data, therefore, is to bucket data rows by the day or month they were created in. Daily buckets suit telemetry data, for example, which can consist of a data row every minute or every five minutes.

By combining Catalog’s built-in slices feature with the ‘Data Row Created At’ filter, a practitioner can save a filter as a slice, so that any new records created in a given day, week, or month are automatically added to that slice.

The example below shows how to construct a slice such that new records created between June and July 2022 will be automatically added to the slice.
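
To make the idea concrete, here is a rough sketch in plain Python of what a saved ‘Data Row Created At’ filter does conceptually. It does not use the Labelbox SDK and is not how Catalog implements slices: `DataRow`, `CreatedAtSlice`, `matches`, and `members` are hypothetical names made up for illustration, and the date range assumes "between June and July 2022" means June 1 through July 31, 2022 (UTC). The point is that a slice stores the filter rather than a fixed list of rows, so membership is re-evaluated as new rows arrive.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DataRow:
    """Hypothetical stand-in for a data row in Catalog."""
    external_id: str
    created_at: datetime


@dataclass(frozen=True)
class CreatedAtSlice:
    """Hypothetical saved filter: start is inclusive, end is exclusive."""
    start: datetime
    end: datetime

    def matches(self, row: DataRow) -> bool:
        # A row belongs to the slice if it was created inside the window.
        return self.start <= row.created_at < self.end

    def members(self, rows):
        # Membership is computed from the filter each time, not stored.
        return [row for row in rows if self.matches(row)]


# Assumed window: all of June and July 2022, i.e. [2022-06-01, 2022-08-01).
june_july_2022 = CreatedAtSlice(
    start=datetime(2022, 6, 1, tzinfo=timezone.utc),
    end=datetime(2022, 8, 1, tzinfo=timezone.utc),
)

catalog = [
    DataRow("row-may", datetime(2022, 5, 31, 23, 59, tzinfo=timezone.utc)),
    DataRow("row-june", datetime(2022, 6, 15, 12, 0, tzinfo=timezone.utc)),
]
before = [row.external_id for row in june_july_2022.members(catalog)]

# Rows that arrive later are picked up without editing the slice itself.
catalog.append(DataRow("row-july", datetime(2022, 7, 20, 8, 30, tzinfo=timezone.utc)))
catalog.append(DataRow("row-august", datetime(2022, 8, 1, 0, 0, tzinfo=timezone.utc)))
after = [row.external_id for row in june_july_2022.members(catalog)]

print(before)
print(after)
```

Using a half-open window (inclusive start, exclusive end) is a design choice in this sketch: it lets consecutive monthly slices sit next to each other without any row landing in two of them.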
