The Java programming language offers a seamless and elegant way to store and retrieve data. However, without proper input validation and safeguards in place, your application can be exposed to unsafe deserialisation vulnerabilities.
In a best-case scenario, deserialisation vulnerabilities may simply cause data corruption or application crashes, leading to a denial of service (DoS) condition. Often, however, remote attackers can trigger the same bugs to achieve arbitrary code execution on the vulnerable system.
What is serialisation in Java?
Serialisation refers to the process of saving an object's state as a sequence of bytes; conversely, deserialisation is the process of rebuilding an object from those bytes.
Say you just developed an application that reads and writes data locally, such as from files present on a system. Or you built an application that sends and receives data across a network. What’s the best way to do this while preserving the integrity of the data?
As far as storage is concerned, the choice between files and databases remains up to the developer. Either way, when it comes to transmitting data over a network, you’d have to pick an appropriate data format and encoding mechanism that “standardises” data and is preferably platform independent.
Many solutions exist, including manually encoding binary or text data as base64 ASCII and decoding it on the other side. But why reinvent the wheel to implement a data encoding and decoding mechanism? Java’s inbuilt serialisation does all this for you, for the very objects created by your application that are still in memory.
The Java Serialisation API provides a standard mechanism for developers to handle object serialisation.
For example, say you have a “Person” class in Java with fields holding an individual’s personal information, such as “name,” “email address,” “phone number,” and “address.” If you wanted to offer a “save” option to your users, you could iterate over the fields of the “Person” object, convert each one into an appropriate format, such as JSON or CSV, and output it to a file.
If you don’t care about the human-readable aspect of the resulting file and merely want to store this data for retrieval by your application later, serialisation can save you enormous time. With serialisation, you can simply dump the “Person” object or a list of multiple “Person” objects into a file with a single method call. The encoding of data is taken care of by Java’s inbuilt serialisation libraries.
What makes serialisation an appealing solution for developers is that storage, retrieval, and transmission of data become possible with a single method call and without worrying about the underlying logic or platform. Naturally, then, many applications and developers rely on serialisation to store data and the very state of objects as it is.
A simple example of a “Person” class that supports serialisation would be:
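A minimal sketch of such a class follows, together with a round trip through a byte array. The field names are taken from the description above; the helper class `PersonSerialisationDemo` and the sample values are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical "Person" class; implementing Serializable opts it in to serialisation.
class Person implements Serializable {
    // Explicit version ID so that recompiling the class does not invalidate stored data.
    private static final long serialVersionUID = 1L;

    final String name;
    final String emailAddress;
    final String phoneNumber;
    final String address;

    Person(String name, String emailAddress, String phoneNumber, String address) {
        this.name = name;
        this.emailAddress = emailAddress;
        this.phoneNumber = phoneNumber;
        this.address = address;
    }
}

// Hypothetical helper class, not part of any library.
class PersonSerialisationDemo {
    // Turn any Serializable object into a sequence of bytes.
    static byte[] serialise(Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    // Rebuild an object from bytes. Note that nothing here validates the input.
    static Object deserialise(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Person original = new Person(
                "Alice Example", "alice@example.com", "555-0100", "1 Example Street");
        byte[] data = serialise(original);
        Person copy = (Person) deserialise(data);
        // Prints: Alice Example <alice@example.com>
        System.out.println(copy.name + " <" + copy.emailAddress + ">");
    }
}
```

Writing to a file or a network socket instead would only change the stream that `ObjectOutputStream` wraps; the `writeObject` and `readObject` calls stay the same.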
How unsafe object deserialisation vulnerabilities occur
Say your Java application was deserialising data from a file or network stream and retrieving previously serialised “Person” objects from it. What happens if your application expects a “Person” object but instead receives an “Animal” object, either in error or deliberately due to malicious activity?
In most cases, the application throws an error and crashes, which results in a DoS condition triggered by corrupted data. In more advanced cases, depending on how the objects are being used, an attacker may be able to chain classes already available to the application to trigger remote code execution (RCE). This can happen, for example, when the application expects to receive “configuration” data or a payload containing serialised Java objects.
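To make the “Person” versus “Animal” scenario concrete, here is a small sketch under stated assumptions: both classes and the `TypeConfusionDemo` helper are hypothetical, and the static flag merely stands in for whatever side effect a real class might have. It shows that the unexpected object is fully rebuilt, including its own `readObject` logic, before the application's cast gets a chance to fail.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class the application expects to receive.
class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;

    Person(String name) {
        this.name = name;
    }
}

// Hypothetical class the application does NOT expect.
class Animal implements Serializable {
    private static final long serialVersionUID = 1L;
    // Stand-in for any side effect triggered during deserialisation.
    static boolean readObjectRan = false;
    final String species;

    Animal(String species) {
        this.species = species;
    }

    // ObjectInputStream calls this method automatically while rebuilding the object.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        readObjectRan = true;
    }
}

class TypeConfusionDemo {
    static byte[] serialise(Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    // Naive reader: the cast only happens after readObject() has already built the object.
    static Person readPerson(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Person) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialise(new Animal("cat"));
        try {
            readPerson(data);
        } catch (ClassCastException e) {
            // Prints: Cast failed, but Animal.readObject ran: true
            System.out.println("Cast failed, but Animal.readObject ran: " + Animal.readObjectRan);
        }
    }
}
```

The cast therefore offers no protection on its own: by the time it throws, code belonging to the unexpected class has already executed.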
For example, in July this year, a critical vulnerability (CVE-2021-35464) in ForgeRock’s OpenAM stemmed from unsafe Java deserialisation in the Jato framework used by the application. Through a simple GET request, an attacker could send a crafted serialised object to the server and execute their malicious code. A PoC exploit demonstrated by PortSwigger researcher Michael Stepankin explains this in detail.
More recently, Atlassian began emailing enterprise customers to patch a critical Jira Data Center vulnerability, CVE-2020-36239, that could let remote attackers execute arbitrary code on vulnerable servers. The cause of the vulnerability? Unsafe deserialisation and exposed ports. An attacker could send a crafted payload to the exposed Ehcache RMI network service on port 40001, and potentially 40011, and achieve code execution.
Deserialisation vulnerabilities don’t affect only Java apps
Java is not the only programming language affected by unsafe deserialisation vulnerabilities. Microsoft .NET languages also support serialisation, which means inadequately secured .NET applications that deserialise data are at risk too.
Not too long ago, a threat actor group called Praying Mantis (TG1021) was targeting IIS servers running vulnerable ASP.NET applications. A zero-day stemming from unsafe deserialisation in the ASP.NET application “Checkbox” let remote attackers execute arbitrary code.
With so many Java and .NET applications relying on serialisation for storing and exchanging information, threat actors have a larger attack surface to work with when applications lack basic input sanitisation or are hosted on insufficiently secured servers (for example, servers with exposed ports or improperly authenticated API endpoints).
How to protect against unsafe deserialisation?
An obvious approach is to perform basic input sanitisation when parsing objects from a serialised byte stream. Another essential ingredient in preventing unsafe deserialisation attacks is to allow only certain types (classes) of objects to be deserialised. This eliminates any ambiguity faced by your application and is an elegant way of dodging application crashes or the possibility of DoS attacks.
There are two ways of doing this: follow a blacklist approach, i.e., explicitly forbid objects of certain classes from being deserialised, or take a more restrictive whitelist approach. Although restrictive, the whitelist approach tends to be safer: only objects belonging to a pre-approved set of classes will be deserialised by the application, whereas a blacklist stops only the classes already known to be dangerous.
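For applications that use Java's native serialisation directly, one way to sketch the whitelist approach is with the JDK's own `java.io.ObjectInputFilter`, available since Java 9. The “Person” and “Animal” classes and the `AllowListDemo` helper below are hypothetical; only the filter API itself comes from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class on the allow-list.
class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;

    Person(String name) {
        this.name = name;
    }
}

// Hypothetical class that is NOT on the allow-list.
class Animal implements Serializable {
    private static final long serialVersionUID = 1L;
    final String species;

    Animal(String species) {
        this.species = species;
    }
}

class AllowListDemo {
    static byte[] serialise(Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    static Person readPerson(byte[] data) throws IOException, ClassNotFoundException {
        // Allow Person and String (the type of its field); "!*" rejects every other class.
        ObjectInputFilter allowList = ObjectInputFilter.Config.createFilter(
                Person.class.getName() + ";java.lang.String;!*");
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            // The filter must be installed before the first object is read.
            in.setObjectInputFilter(allowList);
            return (Person) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // An allowed class round-trips as usual.
        System.out.println(readPerson(serialise(new Person("Alice Example"))).name);
        try {
            readPerson(serialise(new Animal("cat")));
        } catch (InvalidClassException e) {
            // The stream is rejected before an Animal instance is created.
            System.out.println("Rejected by filter: " + e.getMessage());
        }
    }
}
```

The design choice here is to reject by default: a class that nobody thought about is refused, rather than accepted until someone adds it to a deny-list.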
Popular Java project Jackson Databind has previously implemented both types of fixes against deserialisation flaws. For the longest time the project went with a more permissive blacklist approach and would simply add forbidden gadgets/classes to the same list from time to time.
However, newer fixes follow a more selective whitelist approach by introducing a “PolymorphicTypeValidator” class. Only objects of classes belonging to the list will be deserialised.
A validator set up to allow “com.gypsyengineer.jackson”, for example, would allow only the “com.gypsyengineer.jackson” type of objects to be deserialised.
Whatever approach you choose, the basic tenet remains the same: never trust input, even when it appears to come from authoritative sources or an application (rather than a user). Performing basic sanitisation checks prior to processing an input can help prevent a major exploit.
For interested researchers and pen-testers, a GitHub repository called ysoserial contains a collection of utilities and property-oriented programming gadget chains discovered in common Java libraries. Under the right conditions, these gadget chains could aid in conducting unsafe deserialisation attacks, which makes them a reasonable way to check if your Java application could be exploited via insecure deserialisation by advanced threat actors.