Severe Azure HDInsight flaws highlight dangers of cross-site scripting

Security researchers have found eight serious cross-site scripting (XSS) flaws in Azure HDInsight, a big data processing service powered by open-source technologies like Apache Hadoop, Spark, Hive and Kafka running on Azure. The flaws could have allowed attackers to inject and execute malicious scripts in visitors’ browsers.

“All XSS vulnerabilities posed significant security risks to data integrity and user privacy in the vulnerable Apache services, including session hijacking and delivering malicious payloads, putting any user of the Apache services at risk, including Apache Hadoop, Spark, and Oozie,” researchers from Orca Security said in their report.

The flaws were privately reported to Microsoft and were fixed last month. However, the presence of eight such basic web flaws in a service run by one of the largest tech companies highlights the need for organizations to be proactive in their defenses and not take the security of third-party services for granted.

Reflected and stored cross-site scripting

XSS is one of the most common and well-known types of web vulnerabilities. It is the result of poor sanitization of user input, usually in some sort of web form, that allows the input to contain JavaScript that is then served back to a visitor’s browser. Malicious JavaScript code that executes inside a browser in the context of a website is very dangerous because it has access to the user’s authenticated session. Such attacks can result either in the user’s browser performing actions on the site that the user didn’t intend (session piggybacking) or in the theft of the session cookie or tokens themselves.

There are two main types of XSS flaws: reflected and stored. Reflected XSS vulnerabilities are exploited by adding the malicious JavaScript payload as a parameter to a vulnerable URL. A victim would have to click on the specially crafted URL sent by the attacker to trigger the malicious payload execution inside their browser. If they navigate to the target website directly, they wouldn’t receive the payload. In other words, reflected XSS exploitation requires user interaction.
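To make the mechanism concrete, here is a minimal Python sketch of a reflected XSS flaw and its fix. It is not taken from HDInsight or any Apache component: the search page, the `q` parameter and the `example.test` URL are all hypothetical, and the functions only build HTML strings rather than run a real server.

```python
import html
from urllib.parse import urlparse, parse_qs


def get_query(url: str) -> str:
    """Extract the (hypothetical) 'q' parameter from a URL, percent-decoded."""
    return parse_qs(urlparse(url).query).get("q", [""])[0]


def render_search_unsafe(url: str) -> str:
    # Vulnerable: the parameter is echoed into the page exactly as received.
    return f"<p>Results for: {get_query(url)}</p>"


def render_search_safe(url: str) -> str:
    # Fixed: HTML metacharacters are encoded before the value is echoed.
    return f"<p>Results for: {html.escape(get_query(url))}</p>"


# A crafted link an attacker might send to a victim (illustrative only).
crafted = "https://example.test/search?q=%3Cscript%3Ealert(1)%3C%2Fscript%3E"

unsafe_page = render_search_unsafe(crafted)  # contains a live <script> tag
safe_page = render_search_safe(crafted)      # contains inert &lt;script&gt; text
```

The payload only reaches the page when the victim opens the crafted link, which is why this class of flaw depends on user interaction.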

Stored XSS issues are more dangerous because the attacker only needs to exploit a vulnerable field once to permanently inject the malicious code into the web page. This code would then trigger every time the page is visited later by other users, without any additional interaction required such as clicking on a specially crafted URL.
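The difference from the reflected case can be sketched in a few lines of Python. The `CommentBoard` class below is a hypothetical in-memory stand-in for any feature that saves user input and later shows it to other users; it is an illustration under that assumption, not code from the affected services.

```python
import html


class CommentBoard:
    """Hypothetical store of user-submitted text that is shown to every visitor."""

    def __init__(self):
        self._comments = []

    def submit(self, text: str) -> None:
        # The attacker needs to do this only once.
        self._comments.append(text)

    def render(self, escape: bool) -> str:
        # Every later visitor receives whatever was stored.
        items = [html.escape(c) if escape else c for c in self._comments]
        return "<ul>" + "".join(f"<li>{item}</li>" for item in items) + "</ul>"


board = CommentBoard()
board.submit("<img src=x onerror=alert(1)>")  # one-time injection

page_for_any_visitor = board.render(escape=False)  # payload is live markup
page_with_encoding = board.render(escape=True)     # payload is inert text
```

No crafted link is involved: simply viewing the page is enough to receive the stored payload unless the output is encoded.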

Six of the XSS flaws found by Orca in Azure HDInsight were stored and the other two were reflected. They were tracked as CVE-2023-36881 (four flaws), CVE-2023-35394, CVE-2023-38188, CVE-2023-35393, and CVE-2023-36877 and were flagged by Microsoft as Important. The four CVE-2023-36881 flaws are all located in different components of Apache Ambari, a web-based dashboard for managing Apache Hadoop clusters.

“Our initial encounter with XSS in Azure HDInsight was straightforward,” the researchers said. “We discovered that the Apache Ambari Background operations had multiple parameters that, by default, could be modified. After identifying this primary stored XSS vulnerability, we expanded our investigation. Using various techniques, we subsequently pinpointed seven more similar vulnerabilities.”

The investigation was not difficult. The researchers used Intruder, the fuzz testing tool from Burp Suite, a penetration testing toolkit for web applications that can deliver XSS payloads. The web dashboard had some XSS filtering for user input, but this was insufficient. “By careful inspection of HTTP responses and analyzing the Document Object Model (DOM), we were able to identify where the application was improperly escaping or sanitizing the user-supplied input,” the researchers said.
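The general idea of payload fuzzing against an insufficient filter can be sketched as follows. This is not Burp Suite, nor the researchers’ actual procedure: `naive_filter`, `render_field` and the payload list are hypothetical, and the “application” is a local function rather than a live endpoint.

```python
import html
import re

# A few well-known XSS test strings (illustrative, far from exhaustive).
PAYLOADS = [
    "<script>alert(1)</script>",
    "<img src=x onerror=alert(1)>",
    "<svg onload=alert(1)>",
]


def naive_filter(value: str) -> str:
    # Hypothetical blocklist filter: removes <script> tags and nothing else.
    return re.sub(r"</?script[^>]*>", "", value, flags=re.IGNORECASE)


def render_field(value: str) -> str:
    # Hypothetical dashboard field that relies on the blocklist filter.
    return f"<div class='op-name'>{naive_filter(value)}</div>"


def render_field_encoded(value: str) -> str:
    # Same field, but with output encoding instead of a blocklist.
    return f"<div class='op-name'>{html.escape(value)}</div>"


def find_unescaped(render, payloads):
    """Return the payloads that come back verbatim in the rendered response."""
    return [p for p in payloads if p in render(p)]


hits_blocklist = find_unescaped(render_field, PAYLOADS)
hits_encoded = find_unescaped(render_field_encoded, PAYLOADS)
```

In this sketch the blocklist stops the `<script>` payload but lets the event-handler payloads through, while the encoded version reflects none of them verbatim.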

After the first flaw was identified in Ambari Background operations, additional stored XSS issues were found in the Managed Notifications, YARN Queue Manager and YARN Configurations components. These four flaws were grouped under the CVE-2023-36881 identifier. Another stored XSS issue was found in Azure HDInsight’s Jupyter Notebook service, specifically in its Caja compiler. This vulnerability can lead to remote code execution because of the WebSocket communications capability of the service. The attacker can host a rogue JavaScript file on a remote server that establishes a WebSocket communication channel and sends a reverse shell as a code payload to the service.

The sixth stored XSS issue was found in Azure HDInsight’s Apache Oozie Web Console and can be exploited through custom filters. Apache Oozie is a workflow scheduling system for Hadoop jobs. The two reflected XSS issues were identified in Hadoop itself and Apache Hive and can be exploited via endpoint manipulation.

How to mitigate XSS vulnerabilities

Even though Microsoft fixed the Azure HDInsight vulnerabilities in its service, they serve as a reminder for organizations to implement XSS defenses in their own web applications. Orca’s recommendations include:

Validate user inputs against expected formats, data types, and ranges to mitigate the risk of script injection.
Use output encoding (HTML, JavaScript, and URL encoding) to ensure that user-generated data is properly sanitized before being displayed in web pages.
Implement Content Security Policy (CSP) to add a layer of security that can restrict the execution of scripts and minimize the potential impact of any XSS vulnerabilities.
Use modern web frameworks and libraries that incorporate security features by default, including mechanisms meant to prevent XSS vulnerabilities.
Apply the principle of least privilege by giving users and processes only the permissions required for their specific tasks, so if they are compromised, the attackers gain limited access.
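The first three recommendations can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not a complete defense: the queue-name rule, the helper names and the specific CSP directives are hypothetical choices, and a real application would typically rely on its framework’s templating and header configuration.

```python
import html
import json
import re
from urllib.parse import quote

# Input validation: allow only the expected format (assumed rule for a queue name).
QUEUE_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")


def validate_queue_name(name: str) -> str:
    if not QUEUE_NAME.fullmatch(name):
        raise ValueError("invalid queue name")
    return name


# Output encoding: the right encoder depends on where the value lands.
def encode_for_html(value: str) -> str:
    return html.escape(value, quote=True)


def encode_for_url(value: str) -> str:
    return quote(value, safe="")


def encode_for_js_string(value: str) -> str:
    # json.dumps yields a quoted string literal; "<" is additionally escaped
    # so that "</script>" cannot close an inline script block.
    return json.dumps(value).replace("<", "\\u003c")


# Content Security Policy: an example header that limits scripts to the site's
# own origin. The directive values are an assumption, not a universal policy.
CSP_HEADER = (
    "Content-Security-Policy",
    "default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'self'",
)
```

Validation rejects unexpected input early, encoding neutralizes whatever still gets through, and the CSP header limits the damage if both fail, which is why the measures are usually layered rather than used alone.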
