Frequent critical flaws expose MLflow users to imminent threats

MLflow has emerged as the most vulnerable open source machine learning framework, with four maximum-severity (CVSS 10) vulnerabilities reported within 50 days, according to a Protect AI report.

Protect AI’s AI/ML bug bounty program, huntr, discovered these vulnerabilities in the MLflow platform. They can allow remote code execution (RCE), arbitrary file overwrite, and local file inclusion, which could in turn lead to system takeover, loss of sensitive information, denial of service, and destruction of data, according to Protect AI.

“The report includes four critical flaws found in MLflow, the popular open-source platform used by practitioners to manage various stages of a machine learning project, including experimentation, reproducibility, deployment, and a central model registry,” Protect AI said.

While less widely adopted alternatives such as Amazon SageMaker, Neptune, Comet, and Kubeflow exist, MLflow is a hugely popular machine learning lifecycle platform, with more than 10 million monthly downloads and a rich user community including Facebook, Databricks, Microsoft, Accenture, and Booking.com.

huntr traced RCE-capable vulnerabilities

Tracked as CVE-2024-0520, the latest vulnerability revealed by huntr is a path traversal flaw in the code used to pull down data from remote storage. The flaw can be exploited for remote code execution (RCE) by tricking a user into using a malicious remote data source that can execute commands on the user’s behalf.

The affected code is native to the mlflow.data module, listed in the PyPI registry, which helps keep a record of model training and evaluation datasets. The bug, which was fixed in the latest release of MLflow, has no known active exploitation.
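
The underlying pattern in a path traversal bug of this kind is generic: if a name supplied by a remote source is joined to a local directory without normalization, `..` segments can escape that directory. A minimal sketch of the unsafe pattern and a hardened variant, using hypothetical function names rather than MLflow's actual code:

```python
import os

def unsafe_dest(download_dir: str, remote_name: str) -> str:
    # Naive join: a remote_name like "../../home/user/.bashrc"
    # escapes download_dir entirely once the path is resolved.
    return os.path.join(download_dir, remote_name)

def safe_dest(download_dir: str, remote_name: str) -> str:
    # Resolve the final path first, then verify it still lies
    # inside download_dir before using it.
    base = os.path.realpath(download_dir)
    dest = os.path.realpath(os.path.join(base, remote_name))
    if os.path.commonpath([base, dest]) != base:
        raise ValueError(f"path traversal attempt: {remote_name!r}")
    return dest
```

With `unsafe_dest("/tmp/data", "../../etc/cron.d/job")`, the returned path resolves outside `/tmp/data`; `safe_dest` rejects the same input because the resolved path no longer shares `/tmp/data` as a prefix.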

A vulnerability reported in December 2023, tracked as CVE-2023-6709, was also capable of enabling RCE attacks. The flaw stemmed from improper validation of special elements used in a template engine in MLflow, according to the CVE entry description. A template engine renders output by combining templates with runtime data; if attacker-controlled input reaches the engine unvalidated, template expressions can be evaluated as code, a class of bug known as server-side template injection.
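
MLflow's actual template code is not shown here, but the class of bug can be illustrated with a deliberately unsafe toy engine that evaluates anything between `{{ }}` delimiters:

```python
import re

def render(template: str, context: dict) -> str:
    # Toy template engine: evaluates anything inside {{ ... }}
    # with eval(). Fine when the template is trusted; if an
    # attacker controls the template string, {{ ... }} becomes
    # arbitrary expression evaluation on the server.
    return re.sub(
        r"\{\{(.*?)\}\}",
        lambda m: str(eval(m.group(1), {"__builtins__": {}}, context)),
        template,
    )

# Intended use: substitute known values.
render("run {{ name }}", {"name": "exp-1"})   # "run exp-1"

# Attacker-controlled template: expressions are evaluated as code.
render("{{ 7 * 7 }}", {})                     # "49"
```

The fix for this class of flaw is to validate or reject special elements in untrusted input before it ever reaches the engine, rather than evaluating whatever arrives.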

Other bugs allow possible system takeover

Another vulnerability discovered in December, tracked as CVE-2023-6831, allows a bypass of an MLflow function that validates file paths. An attacker can use the flaw to remotely overwrite files on the MLflow server.

“This arbitrary file overwrite flaw can also be combined with additional steps of overwriting the SSH keys on the system to perform an RCE attack,” Protect AI said. The bug affected MLflow versions before 2.9.2, with fixes available in the latest updates.
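
A common way such path validators fail, sketched generically here rather than as MLflow's actual check, is comparing raw strings instead of resolved paths, so a string-prefix test passes even when `..` segments carry the write outside the allowed directory:

```python
import os

def naive_is_allowed(base: str, requested: str) -> bool:
    # Broken check: compares unresolved strings, so ".."
    # sequences slip through the prefix test.
    return os.path.join(base, requested).startswith(base)

def robust_is_allowed(base: str, requested: str) -> bool:
    # Resolve both paths first, then compare.
    base = os.path.realpath(base)
    target = os.path.realpath(os.path.join(base, requested))
    return os.path.commonpath([base, target]) == base

# A path that resolves to the SSH directory passes the naive check:
evil = "../../root/.ssh/authorized_keys"
naive_is_allowed("/srv/mlflow", evil)    # True  -> overwrite outside base
robust_is_allowed("/srv/mlflow", evil)   # False
```

Overwriting a file such as `authorized_keys` is exactly the escalation Protect AI describes: an arbitrary file overwrite becomes RCE once the attacker can log in over SSH.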

The fourth critical flaw revealed by huntr, also reported in December, allows malicious actors to read sensitive files on the MLflow server.

“MLflow hosted on certain types of operating system could be tricked into displaying the file contents of sensitive files through a file path safety bypass,” Protect AI said. “There is potential for system takeover if SSH keys or cloud keys were stored on the server and MLflow was started with permissions to read them.”

With the release of large language models (LLMs), organizations are racing to build their own generative AI. Given critical weaknesses like these, open-source machine learning frameworks such as MLflow can expose sensitive training data to theft or poisoning.
