Jeff Kao
So there are some boundary conditions to take into account.
Because somebody smarter than me came up with this thing called a Hilbert curve, you maintain locality with adjacent IDs, which means that things that are close to each other will, if you sort them, be next to each other.
Obviously, that's barring some boundary conditions, but it fits really well into a system that keeps things sorted, like a log-structured merge tree.
So you're able to make very efficient range and geo queries from something like that.
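The locality idea can be sketched in a few lines of Python. This is only an illustration, not the system described in the episode: `hilbert_index` is the standard textbook mapping from a grid cell to its position along a Hilbert curve, and `range_scan` is a hypothetical stand-in for a range read over sorted storage such as an LSM tree. How coordinates are quantized onto the grid is left out and would be an assumption in any case.

```python
from bisect import bisect_left, bisect_right


def hilbert_index(order, x, y):
    """Map cell (x, y) on a 2**order by 2**order grid to its distance
    along the Hilbert curve (standard iterative formulation)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def range_scan(sorted_ids, lo, hi):
    """Return every ID in [lo, hi] from an already sorted list.
    Hypothetical stand-in for a range read over sorted storage."""
    return sorted_ids[bisect_left(sorted_ids, lo):bisect_right(sorted_ids, hi)]


if __name__ == "__main__":
    order = 3  # 8 x 8 grid, chosen only for the demo
    cells = [(x, y) for x in range(8) for y in range(8)]
    ids = sorted(hilbert_index(order, x, y) for x, y in cells)
    # Cells with consecutive IDs are always grid neighbours, so a
    # contiguous ID range covers a compact region of the grid.
    print(range_scan(ids, 16, 31))
```

The boundary conditions mentioned above show up here too: two cells can be neighbours on the grid and still have IDs that are far apart, so a geo query generally has to scan several ID ranges rather than one.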
So this is more related to the forward geocoding side, which is essentially translating your text query into some sort of geo entity.
And so one of the requirements we had to deal with was essentially being able to handle a little bit of typo tolerance from our address validation service.
And that comes in many different forms.
There are so many failure cases for search, which is a little bit different from more typical web applications, where you click through a couple of things and you get what you expected.
Here, the different use cases are literally every single character that a user can type.
Those are all the potential use cases.
So the cardinality is extremely high.
And essentially the number of failure cases is almost unbounded in some sense.
There are just so many combinations, so many ways to type something in.
So we deal with fuzzy search in a couple of ways.
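The transcript does not say which techniques are used at this point, so the following is only a generic sketch of one common building block for typo tolerance: Levenshtein edit distance. The function names, the `max_edits` threshold, and the sample street names are all made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_match(query, candidates, max_edits=2):
    """Return the candidates within max_edits of the query, ignoring case.
    A linear scan like this is only practical for small candidate lists."""
    q = query.lower()
    return [c for c in candidates if edit_distance(q, c.lower()) <= max_edits]


if __name__ == "__main__":
    # "mian" is a transposition typo for "main" (two substitutions).
    print(fuzzy_match("mian street", ["Main Street", "Market Street"]))
```

A real search index would not compare the query against every stored string; this only shows what "tolerating a couple of typos" means as a distance threshold.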
And I remember there's an episode you had with Charlie Marsh from UV.