What specific skills do you think are needed in R&D, reverse engineering, and vulnerability research?
I think the skills and mentality for “standard” software development (R&D) vs. reverse engineering / vulnerability research are vastly different, so I’ll answer in two parts.
For R&D, I believe it’s important to prioritise continuous learning and stay up-to-date with the latest technologies. Learning new programming languages and concepts exposes you to different problem-solving methods, helping you solve immediate problems faster. I’m also a big fan of a notion presented in Larry Wall’s “three great virtues of a programmer”: Laziness will make you write code that does the work for you (and will later benefit others), Impatience will make you write code that runs quickly, and Hubris will make you write good-looking, easy-to-modify code. Every great programmer I’ve met has a mix of these qualities.
For reverse engineering and vulnerability research, I think a completely different set of skills and qualities is required. First, you need to be a very curious person, which isn’t a necessity for R&D. If you’re not constantly asking yourself “How does this work?”, you’ll never be a great researcher. Secondly, you have to be insanely optimistic. In vulnerability research there is no guarantee that a vulnerability actually exists, so you have to believe that at least one will ALWAYS exist. I’ve personally researched pieces of code for a few months before finding a single vulnerability. You need to stay focused on your hypothesis and determined to keep going.
How has your extensive experience in security research influenced the development of JFrog’s security research division?
As security researchers, we try to solve the problems that we know to be ongoing pain points for those in our field. For example, the concept of contextual analysis came from our days as security researchers and engineers before joining JFrog, when we needed to “fix” (or triage) CVEs in our products even if they couldn’t be exploited by attackers. In our roles at that time, we longed to stop wasting time on futile and unscalable efforts. Every new feature we add to JFrog Security is based on research into real-world security problems facing companies today, and we use tangible data to solve them.
What are some best practices for securing MLOps platforms during deployment?
In our MLOps research we found that the biggest problems plaguing MLOps deployments today include:
- Automated ML Pipelines & Model Registries: Check whether your MLOps platform supports ML pipelines, model serving or a model registry. If you don’t need any of these features, disable them completely. Our research showed these features can enable both initial external infection and subsequent lateral movement within the organisation.
- Lack of Guardrails for Arbitrary Code Execution: Loading some types of untrusted ML models can lead to arbitrary code execution, so companies should implement policies preventing the use of model formats that run code on load, preferring safe, data-only formats such as Safetensors (see the sketch after this list). They should also educate developers, data engineers, and anyone else who loads ML models on the dangers of using untrusted ML models and datasets.
- Gaps in Authentication: Make sure authentication (over HTTPS) is enabled in your MLOps platform. We saw many MLOps platforms that don’t support it, either by default or at all.
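To make the guardrail point concrete, here is a minimal sketch of preferring a data-only format over a code-executing one. It assumes the `torch` and `safetensors` packages are installed; the file names and the extension policy are illustrative, not a definitive blocklist:

```python
# Minimal sketch: prefer a data-only model format (Safetensors) over
# formats that can execute code on load (e.g. raw Pickle).
# Assumes `torch` and `safetensors` are installed; file names are hypothetical.
import torch
from safetensors.torch import save_file, load_file

weights = {"layer1.weight": torch.randn(4, 4)}

# Safetensors stores pure tensor data -- loading it cannot run embedded code.
save_file(weights, "model.safetensors")
restored = load_file("model.safetensors")

# A simple policy gate: refuse model files in formats that may embed code.
# (Illustrative list only -- tune it to the formats your teams actually use.)
BLOCKED_EXTENSIONS = {".pkl", ".pickle", ".pt", ".pth", ".h5"}

def is_allowed(path: str) -> bool:
    return not any(path.endswith(ext) for ext in BLOCKED_EXTENSIONS)

assert is_allowed("model.safetensors")
assert not is_allowed("downloaded_model.pkl")
```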
How can loading untrusted models or datasets exploit inherent vulnerabilities in MLOps platforms?
Depending on the type of ML model or dataset (and the libraries used to load them), loading an untrusted model or dataset can immediately lead to arbitrary code execution. For example, loading an untrusted model of a “Pickle” or “Keras H5” type will automatically run code that’s embedded in the model, and that code can be malicious if the creator of the model is a threat actor. Once such code is running, it’s “game over” for whoever loaded the model, i.e. their machine is completely breached. Loading an untrusted model or dataset therefore exposes the user to remote code execution, and from that foothold a bad actor can go on to exploit implementation vulnerabilities in any MLOps platform used on the organisation’s internal network.
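As a concrete, benign illustration of why pickle-based model formats are dangerous (this is a generic demo, not JFrog’s research code), the sketch below shows how a pickle payload runs arbitrary code the moment it is deserialised:

```python
# Minimal, benign demonstration of why unpickling untrusted data is unsafe:
# pickle's __reduce__ hook lets an object specify a callable to run on load.
import pickle

class MaliciousPayload:
    def __reduce__(self):
        # A real attacker would return os.system / subprocess here;
        # we use print() to keep the demo harmless.
        return (print, ("code executed during model load!",))

blob = pickle.dumps(MaliciousPayload())

# The victim only has to *load* the "model" -- no explicit call is needed.
pickle.loads(blob)  # prints: code executed during model load!
```

The key takeaway: deserialisation itself is the attack surface, which is why “just loading” a model file can be equivalent to running it.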
Why is it important for organisations to be aware of the risks associated with specific ML models and datasets?
As mentioned in our research, specific types of ML models and datasets can lead to arbitrary code execution when loaded. This is equivalent to installing an untrusted package from npm or PyPI, a well-documented danger. Many developers and data scientists believe a model or dataset consists of pure data, but in reality many formats support a code layer that runs completely unsandboxed, which is as bad as running malware on your machine.
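For teams that must load PyTorch checkpoints (which are pickle-based under the hood), one partial mitigation is PyTorch’s `weights_only` loading mode, which restricts the unpickler to tensor and primitive data. A minimal sketch, assuming PyTorch 1.13 or later and a hypothetical file name:

```python
# Sketch of a partial mitigation for pickle-based PyTorch checkpoints:
# `weights_only=True` restricts the unpickler to tensor/primitive types,
# refusing the callables an attacker would need for code execution.
# Assumes PyTorch >= 1.13; the file name is hypothetical.
import torch

state_dict = torch.load("untrusted_checkpoint.pt", weights_only=True)
# Arbitrary objects in the pickle stream raise an UnpicklingError
# instead of silently executing code.
```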
What should organisations be aware of regarding implementation vulnerabilities in MLOps platforms?
Our research showed a staggering number of MLOps platforms do not support proper authentication, so it’s important to double-check that authentication is properly configured when deploying a new MLOps platform. Additionally, the immaturity of these platforms means there’s still a lot we don’t know about them, and a large number of new vulnerabilities are still being discovered. We therefore recommend installing the latest version of your chosen MLOps platform and continuously monitoring for emerging CVEs. You can follow the latest discoveries and technical updates from the JFrog Security Research team on our research website and on X @JFrogSecurity.
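One quick sanity check when deploying is to confirm that unauthenticated requests are actually rejected. Below is a minimal sketch using the `requests` library; the endpoint URL and path are hypothetical placeholders for your own platform’s API:

```python
# Quick deployment sanity check: confirm the MLOps platform rejects
# unauthenticated requests. The URL below is a hypothetical placeholder.
import requests

resp = requests.get("https://mlops.internal.example.com/api/models", timeout=10)

# A properly configured platform should demand credentials (401/403),
# not serve the model registry to anonymous callers.
assert resp.status_code in (401, 403), (
    f"Endpoint answered {resp.status_code} without authentication -- "
    "check the platform's auth configuration!"
)
```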
Shachar has more than 15 years of experience in security research and engineering, including low-level R&D, reverse engineering and vulnerability research. He currently leads the security research division at JFrog, specialising in automated vulnerability research techniques. Before joining Vdoo and JFrog, Shachar was responsible for building the low-level security of Magic Leap’s custom OS. Shachar holds a BSc in Electronics Engineering and Computer Science from Tel-Aviv University.