David Reich
Now, there's a bit of a statistical problem in figuring out how many there are because they're so densely packed that they're close to each other and they're interfering with each other.
But when you try to piece them out and say, let's count only one of them in each place in the DNA and blank out the others, we find at least about 479 positions that are all independently pushing in the same way.
For those positions, we're 99% confident that they're real.
By another criterion, of being more than 50% confident that they're real, we think that about 3,800 positions are all pushing in the same direction.
So this is like a crazy number of results, given that in our work previously and other people's work, there were at most a couple of dozen discoveries coming from a single scan.
So when we got this result, we were very surprised.
We thought it must be wrong, and we spent the next couple of years trying to make the results go away, but they just kept getting stronger.
What we were trying to do is to look for some kind of independent type of evidence to tell us whether these positions were real.
We stumbled on something really powerful for this purpose that had not been used in this way before.
It relied on the fact that we had very large numbers of discoveries, like many hundreds of discoveries or even thousands.
And so what we did is we took a completely independent data set, which was the corpus of genome-wide association studies.
So these are studies that people have carried out in hundreds of thousands of people looking for whether particular genetic mutations are more common in people with high blood pressure than with low blood pressure or something like this.
So we took the UK Biobank, which is about 500,000 people from Great Britain who have been measured for hundreds and hundreds of traits.
The whole genomes of all these people have been sequenced.
And for each of these traits, we could look at whether each of these 10 million positions is connected to this trait in some way, in a convincing way.
So of the 10 million positions, about 15%, or about 1.5 million positions in the DNA, are predictive of at least one of these several hundred traits.
So then we could ask a question: is our natural selection signal, our statistic, related to whether a mutation causes high blood pressure or some other trait?
So we slid our statistic for natural selection upward, to a value of one, a value of two, a value of three, a value of four, a value of five.