David Shu
I'm not actually the CTO anymore. Oh no, your LinkedIn is outdated. Oh, does it still say that? I thought I had updated it.
Let's do it.
I think my LinkedIn might be confusing because it still lists that I was the CTO. I stepped back from the CTO role last year. Okay. So what are you doing now? I am spending my time exploring sort of new product spaces, things that can be done, both inside and outside of Tailscale. Very cool.
Most of my work inside Tailscale is around helping on the sort of customer side, talking to users or potential users about how it can be useful. And then because I have such an interest in sort of the world of large language models, I've been exploring that. But that is not a particularly good fit for the Tailscale product.
I spent quite a long time looking for ways to use this technology inside Tailscale and it doesn't really fit. And I actually think that's a good thing. It's really nice to find clear lines like that when you find something where it's not particularly useful. And I wouldn't want to try and, you know, a lot of companies are attempting to make things work, even if they don't quite make sense.
And I think it's very sensible of Tailscale to not go in that direction.
So what would Tailscale do with LLMs? That is the question I was asking from a Tailscale perspective. I think Tailscale is extremely useful as a network backplane for running LLMs yourself, in particular because of the sort of surprising nature of the network traffic associated with LLMs on the inference side. So you can kind of think about working with models from both a training and an inference perspective.
These are sort of two sides of the same coin. Training is very, very data heavy and is usually done on extremely high-bandwidth, low-latency networks, InfiniBand-style setups on clusters of machines in a single room, or if they're spread beyond the room, the next room is literally in the building next door. The inference side looks very different.
There's very little network traffic involved in doing inference on models in terms of bandwidth, and the layout of the network is surprisingly messy. This is because finding GPUs is still tricky even today, despite the fact that this has been a thing for years now. Very tricky. Yeah.
I feel I should try and explain it, just because it's always worth trying to explain things, but I'm sure you all know this: say you're running a service on a cloud provider that you chose years ago for very good reasons. All the cloud providers are very good at fundamental services, but they each have some subset of GPUs, and they have them available in some places and not others.