Brian O'Grady
Well, now we're talking about doing 1 million comparisons, right?
That will take forever, doing 1 million mathematical distance comparisons.
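As a rough illustration of why exhaustive search scales poorly, here is a minimal sketch of brute-force nearest-neighbor search. The function names and the tiny 2D vectors are made up for this example; real embeddings have hundreds or thousands of dimensions, and the point is only that the number of distance computations equals the number of stored vectors.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_nearest(query, vectors):
    """Compare the query against every stored vector.

    Returns (index of nearest vector, its distance, number of comparisons).
    """
    best_index, best_distance = -1, math.inf
    comparisons = 0
    for i, v in enumerate(vectors):
        d = euclidean(query, v)
        comparisons += 1
        if d < best_distance:
            best_index, best_distance = i, d
    return best_index, best_distance, comparisons

# Hypothetical toy data: four stored vectors and one query.
vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 1.0)]
index, distance, comparisons = brute_force_nearest((4.0, 4.5), vectors)
print(index, comparisons)
```

With 1 million stored vectors, the loop above would run 1 million times for every single query.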
So what was introduced in 2014 or 2016, something in that time range, was this concept of approximate nearest neighbors. Rather than having to do 1 million distance comparisons just to find the mathematically nearest vector to your input, you approximate it by leveraging the clustering inherent in your vector space.
So the idea is that, just as we were discussing, we think about arid and dry maybe forming a cluster. We could also think about burger and sandwich forming a cluster, and then brick and stone would form a separate cluster, and there might be a bunch of clusters everywhere, and they're kind of interconnected.
There's this idea of trying to take advantage of this inherent pattern in the vector space to get results that are closest to your input query in meaning, without having to do a very computationally intensive search of doing every single comparison against every item you have.
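One simple way to picture the clustering idea is a sketch that groups vectors by their nearest cluster center and, at query time, scans only the closest group. Everything here is an assumption for illustration: the 2D coordinates for the words, the hand-picked centroids, and the `n_probe` parameter are invented, and this is not a description of how any particular engine implements approximate search.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_index(vectors, centroids):
    """Group vector ids by the centroid they are closest to."""
    buckets = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: euclidean(v, centroids[c]))
        buckets[nearest].append(i)
    return buckets

def approximate_nearest(query, vectors, centroids, buckets, n_probe=1):
    """Scan only the n_probe clusters whose centroids are closest to the query.

    Returns (index of best vector found, number of distance comparisons).
    """
    ranked = sorted(range(len(centroids)),
                    key=lambda c: euclidean(query, centroids[c]))
    comparisons = len(centroids)  # one comparison per centroid
    best_index, best_distance = -1, math.inf
    for c in ranked[:n_probe]:
        for i in buckets[c]:
            d = euclidean(query, vectors[i])
            comparisons += 1
            if d < best_distance:
                best_index, best_distance = i, d
    return best_index, comparisons

# Hypothetical 2D "embeddings" for the words used in the discussion.
words = ["arid", "dry", "burger", "sandwich", "brick", "stone"]
vectors = [(0.9, 0.1), (1.0, 0.2), (5.0, 5.1), (5.2, 4.9), (9.0, 0.9), (9.1, 1.1)]
centroids = [(1.0, 0.15), (5.1, 5.0), (9.05, 1.0)]  # hand-picked, one per cluster

buckets = build_index(vectors, centroids)
best, comparisons = approximate_nearest((5.1, 4.7), vectors, centroids, buckets)
print(words[best], comparisons)
```

On six vectors the saving is negligible, but the same structure means a query touches only the centroids plus one cluster's members instead of every item. The trade-off is that the true nearest vector can be missed if it sits in a cluster that was not scanned, which is why the result is approximate.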
This concept of approximate nearest neighbors is still relatively computationally intensive because it still involves doing some distance comparisons.
So the idea is that when people try to bolt this on to Elasticsearch or OpenSearch, it often degrades performance of their existing text search system.
So what they do is they then say, okay, maybe I have a cluster on Elastic or OpenSearch for my text search.
I'm going to stand up another cluster on Elastic or OpenSearch for my vector search.
But then the question becomes: if I'm using OpenSearch or Elastic and I already have a dedicated cluster for vector search, should I really be using that tool?
Or should I be using a tool that was purpose-built for doing this exact type of operation?
And that's where Qdrant comes in.
And Qdrant's written in a systems language that has no garbage collection overhead.
And I know that some people out here are like, ah, Rust is kind of overhyped.
So for database engineering, it was a design choice, right?
So it needed a systems-level programming language that had memory safety.