Anita Zhang

146 total appearances

Podcast Appearances

And then we have Chef running to actually pick up the new packages and things and just updates depending on what's in those repositories. So the change to Stream didn't really change that model at all. We're still doing that, picking up new packages on like a two-week cadence.
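
The live-update model on the bare-metal side is essentially "apply whatever the repos currently hold." A minimal Python sketch of that idea (the dnf calls are standard, but this is a stand-in for their Chef run, not a transcription of it):

```python
#!/usr/bin/env python3
"""Illustrative host updater standing in for the Chef-driven update run
described above. The dnf invocation and exit codes are real; treating this
as equivalent to the actual Chef recipes is an assumption."""
import subprocess

def updates_available() -> bool:
    # dnf check-update exits 100 when updates are pending, 0 when current.
    return subprocess.run(["dnf", "check-update", "-q"]).returncode == 100

def apply_updates() -> None:
    # Install whatever the configured repos serve right now; the repos are
    # what move on the roughly two-week cadence.
    subprocess.run(["dnf", "-y", "upgrade"], check=True)

if __name__ == "__main__":
    if updates_available():
        apply_updates()
```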

Yeah, we kind of have to.

Yep. So containers, they don't get the live updates that the bare metal hosts get. So users can just define their jobs in a spec. And for the lifetime of the job, the packages and things that go into it don't change. I mean, there are certificates that also are used to identify the job. Those get renewed. But we have a big push to get every job updated at least every 90 days.
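
A toy version of that spec-plus-refresh-window idea; every field name here is invented for the sketch and is not Twine's real spec format:

```python
# Toy job spec plus the "refresh within 90 days" check described above.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_JOB_AGE = timedelta(days=90)  # "every job updated at least every 90 days"

@dataclass
class JobSpec:
    name: str
    image: str              # contents stay frozen for the job's lifetime
    command: list[str]
    started_at: datetime

def needs_refresh(job: JobSpec, now: datetime | None = None) -> bool:
    """True once the job has run on the same frozen image past the window."""
    now = now or datetime.now(timezone.utc)
    return now - job.started_at > MAX_JOB_AGE

job = JobSpec(
    name="web-tier",
    image="base-image:2024-05-01",
    command=["/usr/local/bin/service", "--port", "8080"],
    started_at=datetime(2024, 5, 2, tzinfo=timezone.utc),
)
print(needs_refresh(job))
```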

Most jobs update more frequently than that.

Yeah, they'll actually have to shut down their job and restart it on a fresh container and they'll pick up any new changes to the images or any changes to the packages that have happened in that time.
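
So the update path is replace-and-restart rather than mutate-in-place. Continuing the JobSpec sketch above, the scheduler calls here are purely hypothetical placeholders:

```python
# Hypothetical refresh flow: tear down the old container and schedule a new
# one, which starts from whatever the image and packages look like today.
# stop_job and schedule_job stand in for real scheduler RPCs.
def stop_job(name: str) -> None:
    print(f"stopping {name}")

def schedule_job(job: "JobSpec") -> None:
    print(f"scheduling {job.name} on image {job.image}")

def refresh(job: "JobSpec", latest_image: str) -> None:
    stop_job(job.name)            # the running container is never mutated
    job.image = latest_image      # fresh container picks up current packages
    schedule_job(job)
```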

So I used to work on the containers team, the part that's actually on the host. The whole Twine team consists of the scheduler and the resource allocation teams that figure out which hosts we can actually use and how to allocate them between the teams that need them.

And then on the actual container side, we have something called the agent that actually talks directly to the scheduler and translates the user specification into the actual code that needs to get run on the host. And that agent sets up a bunch of namespaces and starts systemd and basically just gets the job started.
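
systemd-nspawn bundles roughly that combination (create the namespaces, boot systemd inside them), so it works as a stand-in for what the agent does natively. The real agent is not nspawn, and the rootfs path and machine name below are made up:

```python
# Stand-in for the agent's "set up namespaces and start systemd" step, using
# systemd-nspawn, which provides the same primitives off the shelf.
import subprocess

def boot_container(rootfs: str, machine: str) -> subprocess.Popen:
    return subprocess.Popen([
        "systemd-nspawn",
        "--directory", rootfs,   # unpacked job image on the host
        "--machine", machine,    # name the container is registered under
        "--boot",                # run systemd as PID 1 inside the container
    ])

if __name__ == "__main__":
    boot_container("/var/lib/jobs/web-tier/rootfs", "web-tier").wait()
```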

Yeah. So the bulk of the work that is done in the agent, at least for the systemd setup, is it translates the spec into systemd units that get run in the container. So if there are commands that need to run before the main job, those get translated to different units. And then the main job is in its own unit as well.
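
A toy version of that translation: each pre-command becomes its own oneshot unit, and the main command becomes a service ordered after them. The spec format, unit naming, and output directory are all assumptions for the sketch, not the agent's real code:

```python
from pathlib import Path

spec = {
    "name": "web-tier",
    "pre": ["/usr/bin/setup-secrets", "/usr/bin/warm-cache"],
    "main": "/usr/local/bin/service --port 8080",
}

def write_units(spec: dict, unit_dir: Path) -> list[str]:
    """Translate a job spec into systemd unit files under unit_dir."""
    pre_units = []
    for i, cmd in enumerate(spec["pre"]):
        unit = f"{spec['name']}-pre{i}.service"
        (unit_dir / unit).write_text(
            "[Unit]\n"
            f"Description=Pre-start step {i} for {spec['name']}\n\n"
            "[Service]\n"
            "Type=oneshot\n"
            "RemainAfterExit=yes\n"
            f"ExecStart={cmd}\n"
        )
        pre_units.append(unit)

    deps = " ".join(pre_units)
    main_unit = f"{spec['name']}.service"
    (unit_dir / main_unit).write_text(
        "[Unit]\n"
        f"Description=Main job {spec['name']}\n"
        + (f"Requires={deps}\nAfter={deps}\n" if deps else "")
        + "\n[Service]\n"
        f"ExecStart={spec['main']}\n"
    )
    return pre_units + [main_unit]

# systemctl daemon-reload && systemctl start web-tier.service would then run it.
write_units(spec, Path("/run/systemd/system"))
```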

And then there's a bunch of different configuration to make sure the kill behavior for the container is the way we expect and things like that. There is a sidecar for the logs specifically. So logs are pretty important, as you'd imagine, to users being able to debug their jobs. There is a separate service that runs alongside the container to actually make sure that no logs get lost.
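
The kill-behavior piece maps onto systemd's own unit options. A hedged example of a drop-in an agent could write for the main unit; the directives are real systemd settings, but the chosen values and the unit name are illustrative, not Twine's actual configuration:

```python
# Illustrative kill-behavior drop-in for the main job unit.
from pathlib import Path

dropin_dir = Path("/run/systemd/system/web-tier.service.d")
dropin_dir.mkdir(parents=True, exist_ok=True)
(dropin_dir / "50-kill.conf").write_text(
    "[Service]\n"
    "KillMode=mixed\n"       # SIGTERM to the main process; final SIGKILL hits the whole cgroup
    "KillSignal=SIGTERM\n"
    "TimeoutStopSec=90\n"    # escalate to SIGKILL if shutdown stalls past 90s
)
```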

And so those logs get preserved on the host somewhere.
