Dwarkesh
You know, sometimes I like to make crazy YOLO AI bets.
And Mercury makes me feel super confident about exactly how much money I can take out of our business while still having enough left over to pay bills and to pay taxes.
Visit Mercury.com to join me and 200,000 other entrepreneurs who use Mercury.
Mercury is a financial technology company, not a bank.
Banking services provided through Choice Financial Group, Column N.A., and Evolve Bank & Trust, members FDIC.
Public benchmarks can be useful, but they don't always measure what you think they do.
Take AIME or HMMT, which both contain extremely challenging, competition-level math problems.
Now, Labelbox researchers found that some of the larger models were brute-forcing their answers through repeated trial and error, sidestepping the very kind of mathematical reasoning that these benchmarks were designed to test in the first place.
And on top of this, some of the questions had actually leaked into the model's training data.
So when Labelbox researchers swapped in fresh, equally challenging problems, the scores plummeted, in many cases by more than 40%.
So when a Labelbox customer wanted to improve their model's math capabilities, Labelbox brought in a team of previous Math Olympiad winners to develop brand new AIME- and HMMT-style problems that the models couldn't possibly have memorized, along with clever variations that made brute-force approaches computationally infeasible for the model.
Labelbox also mapped out the model's strengths and weaknesses.
For example, was it better at domains like combinatorics or algebra?
Then it weighted the final fine-tuning data towards the areas that needed the most work.
There's no shortcut here.
Labelbox just scrutinized benchmarks at the single data row level to ensure that their customers' models were learning and improving as effectively as possible.
You can learn more at labelbox.com.
Okay, first question.
I want to understand the role of ideology in the Sino-Soviet split.
So these are the two major communist countries in the world.