This week thousands of system administrators who use Google products will open their inboxes to find an email from Google explaining that its Website Optimizer product contains a cross-site scripting (XSS) flaw that allows attackers to inject scripts into their Google-optimized web pages.
A part of this email follows:
“you are using a control script that could allow an attacker to execute malicious code on your site. To fix the vulnerable section of code, you should immediately either replace the control scripts in your affected experiments or stop the affected experiments and start new experiments”
On receiving this notification I scrambled to my web sites to implement the fix recommended by Google. Later in the day I had time to dig deeper into the problem and analyse the security flaw in more detail. What I found is a multi-stage attack that relies on cookie injection, improper text parsing and DOM script injection.
I have documented my research in this article, and I hope that it will be of use to you. There is a lot to learn from other people’s mistakes, especially when those people are Google themselves.
The flaw exists in Google’s Website Optimizer, a set of scripts that web administrators use to gain insight into how online customers navigate their web sites.
Below is a segment of the flawed code.
<!-- Google Website Optimizer Control Script -->
<script>
function utmx_section(){}function utmx(){}
(function(){var k='XXXXXXXXXX',d=document,l=d.location,c=d.cookie;function f(n){
if(c){var i=c.indexOf(n+'=');if(i>-1){var j=c.indexOf(';',i);return c.substring(i+n.
length+1,j<0?c.length:j)}}}var x=f('__utmx'),xx=f('__utmxx'),h=l.hash;
d.write('<sc'+'ript src="'+
'http'+(l.protocol=='https:'?'s://ssl':'://www')+'.google-analytics.com'
+'/siteopt.js?v=1&utmxkey='+k+'&utmx='+(x?x:'')+'&utmxx='+(xx?xx:'')+'&utmxtime='
+new Date().valueOf()+(h?'&utmxhash='+escape(h.substr(1)):'')+
'" type="text/javascript" charset="utf-8"></sc'+'ript>')})();
</script><script>utmx("url",'A/B');</script>
<!-- End of Google Website Optimizer Control Script -->
This Website Optimizer Control Script is embedded within your web page to track it. It will be run on the user’s end, and under a successful attack it will extract a malicious script from their cookie and execute it in their browser.
The code above is standard JavaScript; however, it is not easy to read. There are two reasons for this. Firstly, like most Google client-side scripts, it is obfuscated, which makes it deliberately cryptic. Secondly, it was designed to run fast and efficiently, not to be easily understood.
I manually de-obfuscated this code, and whilst doing that, I re-factored it to make it easy to understand. The code below should be easy enough to read by anyone with JavaScript knowledge, yet it fulfills the same function as the cryptic code provided by Google.
01. function AB_Analysis(){
02. var k='YOURTRACKINGNUMBER';
03. var d=document;
04. var l=d.location;
05. var h=l.hash;
06. var injectionvector1 = ReadFromCookie('__utmx');
07. var injectionvector2 = ReadFromCookie('__utmxx');
08. d.write
09. ('<script src="http://www.google-analytics.com/siteopt.js?v=1&utmxkey='+k
10. +'&utmx=' + injectionvector1
11. +'&utmxx='+ injectionvector2
12. +'&utmxtime=' + new Date().valueOf()
13. +(h?'&utmxhash='+escape(h.substr(1)):'')
14. + '" type="text/javascript" charset="utf-8"></script>')
15. }
16.
17. function ReadFromCookie(field_name){
18. var c = document.cookie;
19. var start = c.indexOf(field_name+'=');
20. var end = c.indexOf(';',start);
21. return c.substring(start + field_name.length + 1, end);
22. }
23.
06. var injectionvector1 = ReadFromCookie('__utmx');
07. var injectionvector2 = ReadFromCookie('__utmxx');
Both of these lines call the function ReadFromCookie, which parses the browser’s cookie string (document.cookie) without sanitising what it reads. The lack of sanitisation is on line 21:
21. return c.substring(start + field_name.length + 1, end);
Here we can see a classic mistake: data is blindly read from an untrusted source. The substring function reads from the start of the field’s data up to the first semicolon. What it reads should be a tracking value, but in this case it is a deliberately planted ‘dormant’ script. It is dormant because it resides inside a cookie and not inside the HTML of the web page itself. Lines 10 and 11 are where the real trouble begins to show. The extracted and potentially dangerous script is injected into the user’s DOM:
08. d.write
09. ('<script src="http://www.google-analytics.com/siteopt.js?v=1&utmxkey='+k
10. +'&utmx=' + injectionvector1
11. +'&utmxx='+ injectionvector2
12. +'&utmxtime=' + new Date().valueOf()
13. +(h?'&utmxhash='+escape(h.substr(1)):'')
14. + '" type="text/javascript" charset="utf-8"></script>')
The code above is the one responsible for the fatal injection. There is some irony here. In the same statement of code there exists some protection against XSS, but it does not go far enough.
Look at line 13:
13. +(h?'&utmxhash='+escape(h.substr(1)):'')
This code correctly treats the URL hash (variable h) as untrusted, because it can be manipulated in much the same way as the cookie can. The lines before it, however, omit the call to escape(), which would have encoded the characters an attacker needs to break out of the src attribute. It’s a typical case of ‘so close, yet so far away’.
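One way to close that gap is to encode the cookie values with the same care as the hash. The sketch below is not Google’s published fix; it is a minimal illustration under my own assumptions. The function name buildSiteoptTag and the placeholder key are hypothetical, and encodeURIComponent is used in place of the older escape(). It returns the markup as a string instead of calling d.write, so the result can be inspected safely.

```javascript
// Hypothetical helper: builds the markup that the control script would
// pass to document.write, with every untrusted value URL-encoded.
function buildSiteoptTag(key, utmx, utmxx, hash) {
  // encodeURIComponent turns ", <, > and spaces into %XX sequences, so a
  // cookie value cannot close the src attribute or open a new tag.
  var enc = encodeURIComponent;
  return '<script src="http://www.google-analytics.com/siteopt.js?v=1' +
    '&utmxkey=' + enc(key) +
    '&utmx=' + enc(utmx || '') +
    '&utmxx=' + enc(utmxx || '') +
    '&utmxtime=' + new Date().valueOf() +
    (hash ? '&utmxhash=' + enc(hash.slice(1)) : '') +
    '" type="text/javascript" charset="utf-8"></scr' + 'ipt>';
}

// Illustrative planted value that tries to break out of the src attribute.
var planted = '"></scr' + 'ipt><script>alert(1)</scr' + 'ipt>';
var tag = buildSiteoptTag('XXXXXXXXXX', planted, '', '');
console.log(tag);
```

With the values encoded, the planted text stays inside the query string of a single script tag and is sent to the server as data rather than being parsed as markup.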
For those who find it hard to read JavaScript, I have included a flow chart showing the two functions, AB_Analysis and ReadFromCookie.
The diagram above is a flowchart for the AB_Analysis script. This script is embedded on pages by web developers who are making use of the Google Web Site Optimiser. The red processes are where data is read from the cookie and added to a script, which is in turn injected into the DOM.
Above is a flowchart for the ReadFromCookie function. There is no actual flaw here, except maybe that there is no limit to how much data is read out of the cookie. Also, the end of record detection is rather crude – simply looking for a semicolon in the data.
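The two weaknesses noted above, unlimited length and crude end-of-record detection, can be addressed in the reader itself. The following is only a sketch of a stricter reader: the name readCookieField, the length cap and the allowed character set are my own assumptions, and the sample cookie values are made up for illustration. It takes the cookie string as a parameter (document.cookie in a browser) so that it can be tried outside one.

```javascript
// Hypothetical stricter reader for a single cookie field.
function readCookieField(cookieString, fieldName, maxLength) {
  var pairs = cookieString.split(';');
  for (var i = 0; i < pairs.length; i++) {
    var pair = pairs[i].replace(/^\s+/, '');   // drop the space after each ';'
    var eq = pair.indexOf('=');
    if (eq < 0) continue;
    // Compare the whole name, so '__utmx' does not match the tail of
    // another cookie name such as 'my__utmx'.
    if (pair.substring(0, eq) !== fieldName) continue;
    var value = pair.substring(eq + 1);
    // Assumed policy: cap the length and allow only a conservative set
    // of characters; anything else is treated as if the field were empty.
    if (value.length > maxLength || !/^[A-Za-z0-9._:-]*$/.test(value)) {
      return '';
    }
    return value;
  }
  return '';   // field not present
}

// Made-up sample values, for illustration only.
var cookies = 'session=abc; __utmx=12345.678:1:0; __utmxx=12345.678:1234567:2592000';
console.log(readCookieField(cookies, '__utmx', 128));

// A value containing quotes or angle brackets is rejected.
var tampered = '__utmx="><script>alert(1)</scr' + 'ipt>';
console.log(readCookieField(tampered, '__utmx', 128) === '');
```

Rejecting unexpected characters at the point of reading is a second line of defence; it does not replace encoding the value where it is written into the page.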
Below is how a normal cookie might look. Cookies are not very sophisticated and are generally described as simple text files on the user’s computer. HTML5 supplements cookies with richer client-side storage, such as Web Storage, but it does not replace them.
Normal Cookie Example
BEGIN COOKIE
__utmx: some_value;
__utmxx: some_other_value;
END COOKIE
The compromised cookie below contains script inside the __utmx and __utmxx fields. While it sits in the cookie, this script is inactive and therefore harmless. However, when the AB_Analysis script is executed, the __utmx script is activated through this XSS attack.
Compromised Cookie Example
BEGIN COOKIE
__utmx: <<malicious script goes here>>;
__utmxx: <<malicious script goes here>>;
END COOKIE
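To make the activation step concrete, the sketch below reproduces the unescaped concatenation from lines 09 to 14 as a pure function and prints the markup that d.write would receive when __utmx holds a planted value. The name buildVulnerableTag and the payload are illustrative assumptions, not taken from a real attack, and the hash parameter is left out for brevity.

```javascript
// Mirrors the unescaped concatenation in the control script, but returns
// the string instead of calling document.write.
function buildVulnerableTag(key, utmx, utmxx) {
  return '<script src="http://www.google-analytics.com/siteopt.js?v=1&utmxkey=' + key +
    '&utmx=' + (utmx ? utmx : '') +
    '&utmxx=' + (utmxx ? utmxx : '') +
    '&utmxtime=' + new Date().valueOf() +
    '" type="text/javascript" charset="utf-8"></scr' + 'ipt>';
}

// Illustrative payload: closes the src attribute and the script element,
// then opens a script element of the attacker's own. It contains no
// semicolon, because a semicolon would end the cookie value.
var payload = '"></scr' + 'ipt><script>alert(document.domain)</scr' + 'ipt>';
var markup = buildVulnerableTag('XXXXXXXXXX', payload, '');
console.log(markup);
```

The printed markup contains two script elements: the intended one, cut short at the planted quote, and a second one carrying the attacker’s code, which the browser would run in the context of the page.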
The attack has two stages. First, the malicious script has to be injected into a cookie in the victim’s browser. After that, the user must visit a web page containing the Google AB_Analysis script. The attack can be summarised in the diagram below.
Google was fast to react and provide a fix; however, this fix needs to be deployed by every web site administrator who uses Google Website Optimizer. This applies to hundreds of thousands of web pages globally.
I hope that administrators are quick to fix this problem as it could easily result in an XSS attack against their site if targeted.