Ave Gatton
When we talk about inference, we're really talking not so much about the state of the model that you took off the shelf, but about the capabilities and the prompt that the model is working with in the moment.
Can you talk to that LLM, talk to that agent framework, and cause it to take malicious actions or exfiltrate sensitive data?
So it's in the moment, in a sense.
Very tactical versus maybe what you would call strategic safety of the model in general.
We're all starting at this from the same starting point.
And it's definitely early innings in terms of security for LLMs at large and agents at large.
So we all probably come up with essentially the same ideas.
But if you think about an agentic system that you're going to put into production, it's going to scale.
And so if it's going to scale and service thousands of users, ultimately, you should expect that you will encounter a bad actor and that you will encounter someone who's going to try and jailbreak it.
And the principle of zero trust, extended to agents, would be: assume that the agent will be jailbroken. Despite your best attempts at keeping it safe and immune from prompt poisoning, prompt injection, or somebody jailbreaking the agent, assume that it will be jailbroken at some point.
And if it is, then what's the worst it could do?
How do you prevent it from, say, going and just starting to read all your sensitive data and sending it out to a hacker?
And the reason we need these protections is that the more you scale the system, the more certain you are to run into a bad actor or a jailbreak attempt.
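The assume-breach posture described above can be sketched in code. This is a minimal, illustrative example, not any particular framework's API: it assumes a hypothetical agent runtime where every tool call passes through a policy gate, so that even a jailbroken agent is limited to an allowlist of tools and cannot pass obvious secrets in its arguments.

```python
import re

# Hypothetical policy gate: every tool call a jailbroken agent might make
# is checked here before execution. Tool names and patterns are illustrative.

ALLOWED_TOOLS = {"search_docs", "summarize"}  # least privilege: deny by default
SENSITIVE = re.compile(r"\b(?:api[_-]?key|password|ssn)\b", re.IGNORECASE)

def gate_tool_call(tool_name: str, arguments: str) -> str:
    """Return 'allow' or 'deny' for a requested tool call."""
    if tool_name not in ALLOWED_TOOLS:
        return "deny"   # unknown tool (e.g. outbound email): blocked outright
    if SENSITIVE.search(arguments):
        return "deny"   # arguments reference secrets: blocked
    return "allow"

print(gate_tool_call("send_email", "quarterly report"))  # denied: not allowlisted
print(gate_tool_call("search_docs", "password reset"))   # denied: sensitive term
print(gate_tool_call("search_docs", "zero trust"))       # allowed
```

The design choice here is deny-by-default: rather than trying to enumerate bad behavior, the gate only permits a small, known-safe tool set, which bounds the worst case even when the prompt-level defenses fail.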
So all agents break in the same way, right?
So the types of attacks and the way in which we can protect agents are common to almost every industry.
The specifics of the information that's being exfiltrated, or the types of actions that the agent could take to disrupt a business, are unique, obviously, to each industry.