Dr. Aqib Rashid
different signals from the file in order to be able to arrive at a verdict or some kind of prediction as to whether that file is malicious.
Obviously, my expertise is at the intersection of cybersecurity and ML.
So I was quite well positioned to be working on this problem at Glasswall when I joined a couple of years ago.
As for the MVP at the time: before we actually landed on the specifics of this product, for example which file types we wanted to target and what the various functional and non-functional requirements were, we had to prove out the science of doing all this.
So taking different pieces of telemetry and using them to train a machine learning model, and that model should then have the ability to reliably distinguish between goodware and malware.
So the first real question we wanted to explore is what kind of signal would be genuinely discriminative?
You could say that was the MVP phase in the research portion of this project or this product.
We intended to prove out that you could use CDR telemetry, that is, the structural telemetry that you obtain as a result of cleaning files, analyzing files, and understanding what's in files, to build machine learning models.
So we wanted to first prove out that end-to-end process.
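As a rough illustration of that end-to-end process, the sketch below turns structural telemetry records into feature vectors and fits a small logistic regression in pure Python. The field names, the toy records, and the choice of logistic regression are all assumptions made for illustration; the talk does not specify the actual telemetry schema or the model that was used.

```python
import math

# Hypothetical structural-telemetry fields; the real CDR schema is not given in the talk.
FIELDS = ["macro_count", "embedded_object_count", "external_link_count"]

def to_vector(record):
    """Turn one telemetry record (a dict of counts) into a fixed-order feature vector."""
    return [float(record.get(name, 0)) for name in FIELDS]

def sigmoid(z):
    # Clamp the score so math.exp cannot overflow on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression weights with plain batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, record):
    """Return the model's estimated probability that a record is malware."""
    w, b = model
    x = to_vector(record)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Toy, made-up records: label 1 = malware, 0 = goodware.
records = [
    ({"macro_count": 3, "embedded_object_count": 2, "external_link_count": 4}, 1),
    ({"macro_count": 2, "embedded_object_count": 3, "external_link_count": 5}, 1),
    ({"macro_count": 4, "embedded_object_count": 1, "external_link_count": 3}, 1),
    ({"macro_count": 0, "embedded_object_count": 1, "external_link_count": 0}, 0),
    ({"macro_count": 0, "embedded_object_count": 0, "external_link_count": 1}, 0),
    ({"macro_count": 1, "embedded_object_count": 0, "external_link_count": 0}, 0),
]
X = [to_vector(r) for r, _ in records]
y = [label for _, label in records]
model = train_logistic(X, y)
```

Calling `predict_proba(model, record)` on a new record gives a score between 0 and 1 that can be thresholded into a verdict. A real pipeline would need far more data, held-out evaluation, and feature scaling; this only shows the shape of the telemetry-to-model path.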
Some researchers in the past had proved that it could work to some degree, but there was room for improvement in the performance there.
So effectively, the MVP there became, let's first understand which bits of telemetry, if any, can be used for this process.
If so, how good can we get this?
Can we validate the hypothesis that malicious files look structurally different from benign ones, given the data that CDR exposes?
So what we found at that point was that the answer is yes.
There is a statistical difference between the CDR data that represents malware versus the CDR data that represents goodware.
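One simple way to check for a statistical difference of this kind on a single feature is a permutation test on the difference in means. The sketch below is only an assumed example of such a check, using made-up values for one hypothetical feature; the talk does not say which statistical method was applied.

```python
import random
from statistics import mean

def permutation_test(malware_values, goodware_values, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in means of one feature."""
    rng = random.Random(seed)
    observed = abs(mean(malware_values) - mean(goodware_values))
    pooled = list(malware_values) + list(goodware_values)
    n_mal = len(malware_values)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the labels by shuffling the pooled values and re-splitting.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_mal]) - mean(pooled[n_mal:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Made-up values of one hypothetical feature (for example, a per-file macro count).
malware = [3, 2, 4, 5, 3, 4, 2, 5]
goodware = [0, 1, 0, 0, 1, 0, 2, 1]
p_value = permutation_test(malware, goodware)
```

A small `p_value` suggests the two groups differ on that feature by more than label shuffling alone would explain. It says nothing by itself about how well a classifier built on the feature would perform.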
So that is the deep structural telemetry that I referred to earlier.