Jacob Drori
"When Mia was at the park, she"
The names are sampled from the 10 most common names (5 male, 5 female) in the pre-training set, SimpleStories.
The task loss used for pruning is the cross-entropy (CE) of predicting the final token, "he" or "she".
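As a minimal sketch of this loss, the per-prompt CE can be computed from the model's final-position logits over the vocabulary. The function below is a hypothetical illustration, not the author's actual code; the logit values are made up.

```python
import math

def final_token_ce(logits, target_idx):
    """Cross-entropy of predicting the final token, given the
    final-position logits over the full vocabulary."""
    # Log-sum-exp with the max subtracted, for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    # CE = -log p(target) = log Z - logit(target)
    return log_z - logits[target_idx]
```

With two equal logits the model is at chance, so the loss is log 2 regardless of which token is correct.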
Task 2: Simplified IOI
I use a simplified version of the standard indirect object identification (IOI) task.
Prompts have the form "When {name1} {action}, {name2} {verb} {pronoun matching name1}."
For example:
"When Leah went to the shop, Mia urged her."
"When Rita was at the house, Alex hugged her."
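The prompt template above can be sketched as a small generator. The name, action, and verb lists below are hypothetical stand-ins; the actual task samples from the 10 most common SimpleStories names (5 male, 5 female).

```python
# Hypothetical word lists for illustration only.
FEMALE_NAMES = ["Mia", "Leah", "Rita"]
MALE_NAMES = ["Alex", "Tom"]

def make_example(name1, action, name2, verb):
    """Build one IOI-style (prompt, target) pair: the prompt ends at
    the verb, and the target is the pronoun matching name1's gender."""
    target = "her" if name1 in FEMALE_NAMES else "him"
    prompt = f"When {name1} {action}, {name2} {verb}"
    return prompt, target
```

Keeping the pronoun out of the prompt makes it the prediction target for the loss described next.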
The task loss used for pruning is the binary CE: we first compute the model's probability distribution over just "him" and "her" (soft-maxing only those two logits), then compute the CE using those probabilities.
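Restricting the softmax to two logits reduces it to a sigmoid of their difference, so the loss can be sketched as below. This is an illustrative implementation under that assumption, not the author's code.

```python
import math

def binary_ce(logit_him, logit_her, target_is_her):
    """Binary CE over just the 'him'/'her' logits.

    Softmax over two logits is sigmoid(logit_her - logit_him)
    for the probability of 'her'."""
    p_her = 1.0 / (1.0 + math.exp(logit_him - logit_her))
    p_correct = p_her if target_is_her else 1.0 - p_her
    return -math.log(p_correct)
```

When the two logits are equal, the restricted distribution is 50/50 and the loss is log 2.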
Task 3: Question Marks
The prompts are short sentences from the pre-training set that end in either a period or a question mark, filtered to keep only those where:
1. the dense model predicts the correct final token (period or question mark) with p greater than 0.3, and
2. when restricted to just the period and question mark, the probability that the dense model assigns to the correct token is greater than 0.8.
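The two filtering criteria can be sketched as a single predicate over the dense model's next-token probabilities. The function and thresholds below mirror the description above; the dictionary interface is a hypothetical simplification.

```python
def keep_prompt(p_correct, p_other):
    """Apply both filters to one prompt.

    p_correct: dense-model probability of the correct final token
               ('.' or '?') over the full vocabulary.
    p_other:   probability of the other of the two tokens.
    """
    # Criterion 1: full-vocabulary probability of the correct token.
    if p_correct <= 0.3:
        return False
    # Criterion 2: probability renormalized over just {'.', '?'}.
    restricted = p_correct / (p_correct + p_other)
    return restricted > 0.8
```

For instance, a prompt with p_correct = 0.4 and p_other = 0.05 passes (restricted probability about 0.89), while p_correct = 0.2 fails the first filter outright.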