Friday, January 4, 2013

Handling Untrusted JSON Safely

JSON (JavaScript Object Notation) is quickly becoming the de-facto way to transport structured text data over the Web, a job also performed by XML. JSON is a limited subset of the object literal notation inherent to JavaScript, so you can think of JSON as just a part of the JavaScript language. As a limited subset of JavaScript object notation, JSON objects can represent simple name-value pairs as well as lists of values.

BUT with JSON comes JavaScript and with JavaScript comes the potential for JavaScript Injection, the most critical type of Cross Site Scripting (XSS).

Just like XML, JSON data needs to be parsed to be utilized in software. The two major locations within a Web application architecture where JSON needs to be parsed are in the browser client-side and in application code on the server.

Parsing JSON can be a dangerous procedure if the JSON text contains untrusted data. For example, if you parse untrusted JSON in a browser using the JavaScript "eval" function, and the untrusted JSON text itself contains JavaScript code itself, the code will execute during parse time.

From http://www.json.org/js.html
"To convert a JSON text into an object, you can use the eval() function. eval() invokes the JavaScript compiler. Since JSON is a proper subset of JavaScript, the compiler will correctly parse the text and produce an object structure. The text must be wrapped in parens to avoid tripping on an ambiguity in JavaScript's syntax.
var myObject = eval('(' + myJSONtext + ')'); 
The eval function is very fast. However, it can compile and execute any JavaScript program, so there can be security issues."
So, the essential question is: How can programmers and applications parse untrusted JSON safely?

Parsing JSON safely, Client Side

The most common safe way to parse JSON safely in a modern browser is to utilize the JSON.parse method inherent to JavaScript. Here is a good reference that describes the state of JSON.parse browser support. And for legacy browsers that do not support native JSON parsing, there is always Douglas Crockfords JSON parsing library for legacy browsers.

Parsing JSON in the browser is often the result of an asynchronous request returning JSON to the browser. Another technique that is becoming more common is to embed JSON directly in a Web page server side, and then to parse and render that JSON in the browser. The mechanism of embedding JSON safely in a Web page is described here:

https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet#RULE_.233.1_-_HTML_escape_JSON_values_in_an_HTML_context_and_read_the_data_with_JSON.parse

Step 1 includes embedded JSON on a web page safely through HTML Entity Encoding:

<span style="display:none" id="init_data">
    <%= data.to_json %>  <-- data is HTML escaped -->
 </span>

Step 2 and 3 includes decoding the JSON data and then parsing it safely.

 <script>
    // unescapes the content of the span
    var jsonText = document.getElementById('init_data').innerHTML;
    // parse untrusted JSON safely
    var initData = JSON.parse(jsonText);
 </script>

Parsing JSON safely, Server Side

It's important to use a formal JSON parser when handling untrusted JSON on the server. For example, Java Programmers can utilize the OWASP JSON Sanitizer for Java. The OWASP JSON Sanitizer project aspires to accomplish the following goals.
"Given JSON-like content, converts it to valid JSON. 
This can be attached at either end of a data-pipeline to help satisfy Postel's principle: 
Be conservative in what you do, be liberal in what you accept from others            
Applied to JSON-like content from others, it will produce well-formed JSON that should satisfy any parser you use. 
Applied to your output before you send, it will coerce minor mistakes in encoding and make it easier to embed your JSON in HTML and XML."
The OWASP JSON Sanitizer project was created by and is maintained by Mike Samuel, an esteemed member of the Google Application Security Team. For more information on the OWASP JSON Sanitizer, please visit the OWASP JSON Sanitizer Google Code page.


I hope this article helps you on your way to safer parsing of JSON in your applications. Please drop me a line if you have any questions at jim@owasp.org.



Jim Manico is the VP of Security Architecture for WhiteHat Security, a web security firm. Jim is also a board member for the OWASP foundation where he manages and participates in several projects.




2 comments:

Unknown said...

Re: "Parsing JSON safely, Client Side"

Applying this strategy is a precursor to removing inline javascript from an application. All javascript can be moved to external files which means that you can apply a Content Security Policy response header that instructs the browser to prevent the execution of any javascript on the page (over simplified). This effectively eliminates stored and reflected XSS (but has no impact on DOM-based XSS afaict).

Great post!

Jim Manico said...

Thank you Neil. Double thank you for bringing the embedded JSON strategy to our attention! GREAT update to the XSS Prevention Cheat Sheet from YOU! https://www.owasp.org/index.php?title=XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet&action=historysubmit&diff=137511&oldid=136525