Kwasi Ankomah
You just select the model name.
And that model is available to you.
So it's there ready and you can use it in your workflow.
And again, we've had people do incredible things with this.
We've had people use a Qwen model for this at like 32 billion parameters, and have like a Llama.
And then they might have, again, GPT OSS.
And then you can have those all sitting on a chip.
Now, that makes a huge difference: if you take the cost profile of that application and compare it to the cost profile of running on, say, Claude or GPT-5, the difference will be outstanding.
You know, like all of the costs that you see here versus all the costs that you see there.
So this is the shift we're starting to see: people asking, A, do I need this thing?
And B, is there a better way to do this?
You know, using this kind of multi-model architecture as well.
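To make the idea concrete, here is a minimal sketch of that kind of multi-model routing, where different task types go to different small open-weight models rather than everything hitting one large frontier model. The `route_task` helper and the model names are purely illustrative assumptions, not any specific platform's API:

```python
# Hypothetical multi-model routing table: each task type is served by a
# small open-weight model. Model identifiers here are illustrative only.
MODEL_ROUTES = {
    "summarize": "qwen-32b",
    "extract": "llama-8b",
    "reason": "gpt-oss-20b",
}

def route_task(task_type: str, default: str = "gpt-oss-20b") -> str:
    """Return the model assigned to this task type, or a fallback model."""
    return MODEL_ROUTES.get(task_type, default)

# Example: an extraction request goes to the small Llama-class model,
# while an unrecognized task type falls back to the default.
print(route_task("extract"))   # llama-8b
print(route_task("translate")) # gpt-oss-20b
```

The cost argument in the conversation falls out of this structure: most traffic lands on cheap small models, and only the routing policy decides when anything larger is needed.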
Yeah, it is.
It's a big one, right?
I think you'll see lots of papers around.
I think there was a recent one NVIDIA did where they argued that, for agentic use cases, small language models with fine-tuning can be great.
And I think the GPT-OSS 20-billion-parameter one is a great example of that.
You know, there's been some models that have really stood out in this area.
One was the Llama 8B, right?
It was a fantastic model for fine tuning.