Help with "Example Answers" for MAL

Hey,

I want to start using Model Assisted Labeling (MAL, fka Foundry) for some more complex annotation projects, but I am unfamiliar with adding Example Answers during this process. In the past, my annotation projects were pretty simple global classifications, so I got away with just using a good prompt.

Can anyone here make some time to Zoom with me? Here is a link to my calendar: Calendly - Brent Combs

If you’d prefer not to Zoom with a stranger (lol), I will also write out my scenario to the best of my ability.

I have text data that looks like this, which I extracted from a PDF file without using OCR (I am quite proud of that!).

I need to label various named entities, but to keep it simple while I am learning how to wield MAL, we can just talk about one of them. Once I understand better how to work with MAL inside Labelbox, I can apply what I learn to all my relevant named entities.

NER1: Insured Type: This describes who is insured, either an Individual or a Family (group of individuals). The examples in the screenshot are pretty straightforward for this text: each instance of “Family” and “Individual” is one of my “Insured Types”. There are a few dozen synonyms for each of the two Insured Types across my dataset, so it won’t always be this clear. “Family” is pretty consistent, but for “Individual” there can be a lot of variance, such as “Member”, “Employee”, “Single”, or “Person”, to name a few.
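
To make the synonym problem concrete, here is a rough Python sketch of how the synonyms could be normalized to the two Insured Types. It only covers the handful of synonyms I listed above, and the names INSURED_SYNONYMS and find_insured_types are made up for illustration; none of this is part of Labelbox or MAL.

```python
import re

# Hypothetical mapping from surface forms to the two Insured Types.
# Only the synonyms mentioned above are included; the real list is longer.
INSURED_SYNONYMS = {
    "family": "Family",
    "individual": "Individual",
    "member": "Individual",
    "employee": "Individual",
    "single": "Individual",
    "person": "Individual",
}

# One case-insensitive pattern that matches any synonym as a whole word.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, INSURED_SYNONYMS)) + r")\b",
    re.IGNORECASE,
)


def find_insured_types(text):
    """Return (start, end, matched_text, insured_type) for each match."""
    return [
        (m.start(), m.end(), m.group(0), INSURED_SYNONYMS[m.group(0).lower()])
        for m in _PATTERN.finditer(text)
    ]
```

Calling find_insured_types on a line of text gives the character offsets of each match plus the Insured Type it maps to, which is roughly the kind of span I would expect an NER prediction to mark.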

Thoughts on how to leverage the Examples (of answers) section inside the MAL process would be very helpful - thank you!

For the “Output” would I put the text from the data row, or the data row ID?

Also, how do I note sub-types for named entities in the example output?

When I use the following as an example, a few of the NER labels get placed accurately, but all of the labels for Amount, Frequency, and Important Questions are missing.

[
  {
    "input": "clvxvm8hm0tni0794ojb67xmu",
    "output": {
      "Important Question": "What is the overall deductible?",
      "Medical Service": "Prescription drugs",
      "Amount": "$3,000",
      "Insured": "Family",
      "Network": "in-network",
      "Frequency": "calendar year",
      "Medical Cost-sharing": "coinsurance"
    }
  }
]

Wonder why?

When I give ChatGPT4 the ontology, prompt, example JSON, and raw text directly, it properly predicts all the entities. Is it possible there is an issue inside Labelbox?
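
One thing I can check on my side is whether each value in my example output actually appears verbatim in the data row text. I don't know that this is the cause; it is only a guess that a value with different casing or spacing might be harder to place. Here is a rough Python sketch of that check. The helper name missing_values is made up, and the sample text is an invented placeholder, not my real data row.

```python
def missing_values(example_output, text, ignore_case=False):
    """Return the labels whose example value does not occur in the text."""
    haystack = text.lower() if ignore_case else text
    missing = []
    for label, value in example_output.items():
        needle = value.lower() if ignore_case else value
        if needle not in haystack:
            missing.append(label)
    return missing


# Invented placeholder text, NOT the real data row.
sample_text = "What is the overall deductible? $3,000 per Family per Calendar Year"
example_output = {
    "Important Question": "What is the overall deductible?",
    "Amount": "$3,000",
    "Insured": "Family",
    "Frequency": "calendar year",
}

# Exact match first, then a case-insensitive pass for comparison.
print(missing_values(example_output, sample_text))
print(missing_values(example_output, sample_text, ignore_case=True))
```

If a label shows up in the first list but not the second, the example value differs from the source text only by casing, which would at least narrow down where to look.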