Jeff Kao
Maybe you pay some cost of serialization and deserialization from some of the storage systems, which will change soon because, you know, there are a lot of very nice zero-copy libraries from Rust.
The CPU-intensive queries tend to be search queries.
I would say it's actually both because, if you think about it, with this inverted index structure I mentioned, you have to do many key lookups, right?
You tokenize your query: 123 Fake Street, Brooklyn, New York.
And in fact, there are many ways to express it with synonyms.
So maybe you expand the query, and instead of 123 Fake St, it could be Street.
Or if you know the user's in Germany, it's Straße.
And we haven't indexed or normalized everything in a certain way.
And you can see how it can fan out to many keys being pulled back.
And once you have all these candidates, you essentially have to rank them.
And so there is a little bit of CPU intensiveness in that, but we're working on optimizing these things.
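The fan-out described here can be sketched in a few lines of Python. This is only an illustration under assumptions, not the system discussed in the episode: the synonym table, the sample addresses, and the function names (`tokenize`, `expand`, `build_index`, `search`) are all hypothetical, and the ranking is a simple count of matched query tokens.

```python
from collections import defaultdict

# Hypothetical synonym table; a real system would use a far larger one.
SYNONYMS = {
    "st": ["street", "straße"],
    "street": ["st", "straße"],
    "straße": ["st", "street"],
}

def tokenize(text):
    """Lowercase, drop commas, and split on whitespace."""
    return text.lower().replace(",", " ").split()

def expand(token):
    """Return the token plus its synonyms, so one token fans out to many keys."""
    return [token] + SYNONYMS.get(token, [])

def build_index(docs):
    """Inverted index: map each token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Look up every expanded key, then rank candidates by matched query tokens."""
    scores = defaultdict(int)
    lookups = 0
    for token in tokenize(query):
        matched = set()
        for key in expand(token):
            lookups += 1  # each expanded key is one lookup in the index
            matched |= index.get(key, set())
        for doc_id in matched:
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for determinism.
    ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ranked, lookups

# Made-up sample data for illustration only.
DOCS = {
    1: "123 Fake Street, Brooklyn, New York",
    2: "123 Fake St, Queens, New York",
    3: "45 Real Straße, Berlin",
}
INDEX = build_index(DOCS)
RANKED, LOOKUPS = search(INDEX, "123 fake st, Brooklyn, New York")
```

The six query tokens turn into eight key lookups because "st" expands to three keys, and every document that matches any key becomes a candidate that has to be scored. That scoring pass is where the CPU cost shows up as the synonym table and the index grow.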
And I will say that if you do decide to build this sort of analytics or search system, we're very big fans of this new approach with Roaring bitmaps, which is...
It's actually, you know, an implementation that you see across many different languages.
But we found, you know, at least for the candidate fetching, where we're essentially going into the inverted index and trying to pull out which candidates match a criterion, those tend to be microsecond operations for most of our geocoding queries.
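To show why bitmap-based candidate fetching is cheap, here is a minimal sketch that uses plain Python integers as uncompressed bitsets. It is a stand-in for the idea, not Roaring itself: Roaring bitmaps add compressed containers on top of the same set operations. The posting lists, document ids, and function names are hypothetical.

```python
def to_bitmap(doc_ids):
    """Set one bit per document id in a Python integer."""
    bitmap = 0
    for doc_id in doc_ids:
        bitmap |= 1 << doc_id
    return bitmap

def from_bitmap(bitmap):
    """Return the sorted document ids whose bits are set."""
    ids = []
    position = 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids

# Hypothetical posting lists: token -> bitmap of matching document ids.
POSTINGS = {
    "brooklyn": to_bitmap([1, 4, 7, 9]),
    "fake": to_bitmap([1, 2, 9]),
    "street": to_bitmap([1, 3, 9, 12]),
    "st": to_bitmap([2, 5]),
}

def candidates(postings, term_groups):
    """OR the alternatives inside each group, then AND across the groups."""
    result = None
    for group in term_groups:
        group_bits = 0
        for key in group:
            group_bits |= postings.get(key, 0)  # unknown keys match nothing
        result = group_bits if result is None else result & group_bits
    return from_bitmap(result or 0)

# "street" and "st" are alternatives; "brooklyn" and "fake" are both required.
MATCHES = candidates(POSTINGS, [["brooklyn"], ["fake"], ["street", "st"]])
```

Here `MATCHES` is `[1, 9]`. Each OR and AND is a bitwise operation over whole words of bits at once, which is why pulling matching candidates out of the index can be so fast.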
So that's why we're moving over to a more dedicated and specific system for our use case.