Hi, I have a dataset with gold standard labels. I now want to have at least two annotators label the data, calculate the agreement 1) among the annotators and 2) between the annotators (where they agree) and the gold standard labels, and then export the data. (A sketch of how I plan to compute the agreement after export is at the end of this post.) A few questions:
- First of all, should I pick the “Benchmarks” or the “Consensus” project type for this? I think I could technically do it with either, but I’m wondering which one is the better choice for my scenario.
- If I pick the “Benchmarks” project type, how can I mark the gold standard labels in the project? I know how to do this in the UI, but with thousands of data rows we can’t go and click “add to benchmark” one by one. How can I do this programmatically with the Python SDK?
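  Here’s roughly what I was imagining, based on the `create_benchmark()` method I found on the `Label` object in the SDK reference. The API key, project ID, and data row IDs below are placeholders, and I’m not sure iterating over `project.labels()` like this is the intended approach:

  ```python
  import labelbox as lb

  client = lb.Client(api_key="YOUR_API_KEY")    # placeholder key
  project = client.get_project("PROJECT_ID")    # placeholder project ID

  # Data rows whose existing labels should become the gold standard
  gold_data_row_ids = {"drow_1", "drow_2"}      # placeholder IDs

  # Promote the existing label on each gold data row to a benchmark
  for label in project.labels():
      if label.data_row().uid in gold_data_row_ids:
          label.create_benchmark()
  ```

  Is looping like this the right way to do it at scale, or is there a bulk endpoint I’m missing?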
- Say I’ve already specified the gold standard data rows in the “Benchmarks” project (they have moved to the Done section). How can I now assign at least two annotators to these gold standard data rows for annotation? With “Consensus” there’s an option to choose the number of annotators at project setup, but I don’t see how to do the equivalent in a Benchmarks project after specifying the gold standard data rows.
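For context, here is a minimal sketch of how I plan to compute the agreement once the data is exported. It assumes I’ve already reshaped the export into plain dicts (the structure, annotator names, and IDs below are made up), and it uses Cohen’s kappa from scikit-learn for the two-annotator case:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical reshaped export: {data_row_id: {annotator: label}}
annotations = {
    "drow_1": {"ann_a": "cat", "ann_b": "cat"},
    "drow_2": {"ann_a": "dog", "ann_b": "cat"},
    "drow_3": {"ann_a": "dog", "ann_b": "dog"},
}
# Hypothetical gold standard: {data_row_id: label}
gold = {"drow_1": "cat", "drow_2": "dog", "drow_3": "dog"}

rows = sorted(annotations)
a = [annotations[r]["ann_a"] for r in rows]
b = [annotations[r]["ann_b"] for r in rows]

# 1) Agreement among the annotators (Cohen's kappa for two raters)
print("inter-annotator kappa:", cohen_kappa_score(a, b))

# 2) Agreement with gold, restricted to rows where the annotators agree
agreed = [r for r in rows if annotations[r]["ann_a"] == annotations[r]["ann_b"]]
consensus = [annotations[r]["ann_a"] for r in agreed]
gold_subset = [gold[r] for r in agreed]
print("consensus-vs-gold kappa:", cohen_kappa_score(consensus, gold_subset))
```

With more than two annotators I’d presumably switch to something like Fleiss’ kappa or Krippendorff’s alpha, but the project-type, benchmarking, and assignment questions above are the part I’m stuck on.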