Kieran Kunhya
Here, we are degrading the signal, right?
And so we need to degrade both the audio and the video signal in the best way possible.
And we can do that, but it involves, first, a lot of theoretical knowledge about how the eye works, and then a lot of mathematical tricks, right?
For example, when you move from RGB to YUV, what we do very often is scale down the resolution of the color compared to the brightness.
And most of the time, just this, without any compression, divides the size by two.
But most people don't see it, right?
And so on and so on, right?
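The arithmetic behind "divides the size by two" can be sketched with the common 4:2:0 subsampling scheme, where each chroma plane is halved in both dimensions (the resolutions and sample counts here are illustrative assumptions, not from the speaker):

```python
# Sketch: why 4:2:0 chroma subsampling halves the raw size (assuming one
# sample per pixel per plane, before any entropy coding).
def raw_sample_count(width, height, subsample):
    # Luma (Y, the brightness) is kept at full resolution.
    luma = width * height
    if subsample:
        # Each chroma plane (U and V, the color) is halved both ways.
        chroma = 2 * (width // 2) * (height // 2)
    else:
        chroma = 2 * width * height
    return luma + chroma

full = raw_sample_count(1920, 1080, subsample=False)  # 4:4:4
sub = raw_sample_count(1920, 1080, subsample=True)    # 4:2:0
print(sub / full)  # 0.5 — half the data, and most viewers never notice
```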
And then you go to very complex mathematical transforms.
So of course, the Fourier transform, which de facto is not a Fourier transform, it's a
discrete cosine transform, but that's the same idea.
So, frequency domain: we split the video into blocks, right?
That's why when it's wrongly decoded or badly encoded, you see those blocks, and so on, to arrive at compression rates that are insanely high, right?
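A rough sketch of that block transform: a naive 2-D DCT-II over an 8x8 block, the shape of transform JPEG- and MPEG-style codecs use. For a smooth block, the energy piles up in a few low-frequency coefficients, which is what makes discarding the rest cheap (the gradient test block is my own illustrative choice):

```python
import math

def dct2_8x8(block):
    """Naive 2-D DCT-II on an 8x8 block of samples."""
    N = 8
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            cu = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
            cv = math.sqrt(1 / N) if v == 0 else math.sqrt(2 / N)
            out[u][v] = cu * cv * s
    return out

# A smooth gradient block: after the transform, almost all energy sits in the
# first row and column of coefficients; the rest are (near) zero and compress away.
block = [[x + y for y in range(8)] for x in range(8)]
coeffs = dct2_8x8(block)
```

When a decoder mangles these blocks, each 8x8 tile reconstructs independently and wrongly, which is exactly the blocky artifact pattern described above.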
And each generation of codec is like 30% less for the same quality, right?
And this requires amounts of computational power that are huge.
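The "30% per generation" figure compounds quickly. A tiny sketch (the codec names and the flat 30% figure are illustrative assumptions, not exact per-codec measurements):

```python
# Rough arithmetic behind "each codec generation is ~30% smaller at the
# same quality" — three generations compound to roughly a third of the size.
bitrate = 100.0  # arbitrary starting point, in percent of the original
for codec in ["H.264", "HEVC", "VVC"]:
    bitrate *= 0.7  # ~30% reduction per generation (assumed constant)
    print(f"{codec}: {bitrate:.0f}% of the original size")
```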
So what happens, globally, is that you have an address, right, which gives you, with the operating system, a stream of bytes, a stream of data, right?
And this is the first step.
And the second step arises with demuxing, where you're going to separate audio, video, and subtitles into different tracks.
And then on each of those tracks, you're going to decompress them, decode them: audio with an audio codec, video with a video codec, and subtitles with a subtitle codec.
And once you've decompressed those, you have raw images, and then you're going to talk to your graphics card and your screen and display that.
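The demux step described above can be sketched as routing packets by track type. The packet framing here is invented for illustration; real containers (MP4, Matroska, MPEG-TS) parse this structure out of the raw byte stream:

```python
# Toy demuxer: a stream of (track_type, payload) packets is separated into
# per-track lists, each of which would then feed its matching decoder.
stream = [
    ("video", b"\x00\x01"), ("audio", b"\xaa"),
    ("video", b"\x00\x02"), ("subtitle", b"hi"),
]

def demux(packets):
    tracks = {}
    for kind, payload in packets:
        tracks.setdefault(kind, []).append(payload)
    return tracks

tracks = demux(stream)
for kind, packets in tracks.items():
    # In a real player, each track goes to its codec (e.g. H.264, AAC, SRT).
    print(kind, len(packets), "packets")
```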