Jeff Kao
Podcast Appearances
And this is the difference between a database and a search engine.
How do you essentially rank which of these documents is most relevant?
I mean, that is its own sort of problem; there's a lot of work around figuring out what is the most relevant.
I mean, it's different for every company and every use case.
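To make the ranking question concrete, here is a minimal sketch of one common relevance score, TF-IDF. This is an illustrative assumption and not the scoring the speaker's team uses; the function name `tf_idf_rank` and the sample addresses are hypothetical.

```python
import math
from collections import Counter


def tf_idf_rank(query, docs):
    """Return (doc_index, score) pairs, best match first."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for i, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term in tf:
                # Terms that appear in fewer documents get a higher weight.
                idf = math.log(1 + n_docs / df[term])
                score += (tf[term] / len(tokens)) * idf
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data.
docs = [
    "12 main street springfield",
    "main street cafe",
    "45 oak avenue springfield",
]
ranked = tf_idf_rank("main street springfield", docs)
```

A database would simply return every row that matches; a search engine also has to order the matches, and the scoring function is the part that tends to differ per company and use case.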
So Tantivy, you know, that's a library that came from, I think, these people from France.
And they were building, essentially, and it's funny because we've been talking a little bit about that, an Elasticsearch replacement called Quickwit.
And so there's just a lot of primitives that, you know, came along with this sort of Rust ecosystem.
And that was one of the libraries that really drew us to using Rust.
We're migrating to a different implementation that doesn't use any of these libraries now.
We use this thing called Roaring Bitmap, but it's the same concept.
And we're doing that largely because of the more structured nature of some of the geo-entities we have.
You know, we have addresses and regions, and people tend to type addresses in a certain way, so we can take more advantage of that and fine-tune how we do our search workloads.
But at the core of it really is this concept of an inverted index.
So that is, I would say, the core of geocoding.
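As a rough illustration of the inverted-index concept mentioned here, the sketch below maps each term to the set of document IDs that contain it and answers a query by intersecting those sets. It is a simplified assumption, not the speaker's implementation; the names `build_inverted_index` and `search` and the sample addresses are hypothetical. Bitmap structures such as Roaring bitmaps serve the same role as the plain Python sets used here, storing the ID sets in compressed form.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    # One posting set per term; unknown terms match nothing.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample data.
addresses = [
    "12 main street springfield",
    "main street cafe",
    "45 oak avenue springfield",
]
index = build_inverted_index(addresses)
main_street = search(index, "main street")
springfield = search(index, "Springfield")
no_match = search(index, "elm street")
```

The lookup cost depends on the size of the posting sets for the query terms rather than on the total number of documents, which is why this structure sits at the core of text search.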
And then we can talk a little bit about LightGBM and fastText.
And these things always move very quickly.
So we're actually considering moving off of these libraries.
But we're still doing something that would serve what these libraries do.
And so LightGBM is a gradient boosted tree.