Exploiting XSS via Markdown

I recently came across a web application in which I was able to exploit a Cross-Site Scripting (XSS) vulnerability through a markdown editor and rendering package. It was the first time I had come across this type of vulnerability, and I found it particularly interesting because it allowed me to bypass multiple layers of XSS filtering that was implemented in the application. Here’s a short article on how I came across the vulnerability and set about crafting an exploit. Enjoy!

What is markdown?

Markdown is a simple language for writing and formatting content. By simple, I mean there is a small amount of syntax to learn which allows writers to write clean but aesthetically pleasing content. It’s used all over the place, from Gists and readme files on GitHub to the very article you’re reading right now.

A standardised syntax allows the same document to be displayed in different ways by different markdown processors. A heading is always a heading, but the processor can choose which font and weight to apply, where to place the heading, and how the heading may or may not be displayed in a table of contents.

Here’s an example:

A photo of a cute puppy.

Articles are better with visuals, especially puppers with collars. But Medium doesn’t store a web page of HTML and CSS; it stores a markdown file. Behind the scenes, this good boy looks something like this:

![The goodest boy](https://images.unsplash.com/the_good_boy.png)

Not so cute… But functional! What happens is Medium reads this line and knows, because of the markdown syntax, that this is an image that needs to be shared with the world. Medium processes the line and generates the HTML that makes up this article.

How can we get XSS in markdown?

The important part is in that last line. Medium reads the line of markdown, then generates HTML. What follows is that, if this is not done safely, we could include malicious JavaScript in the markdown so it is added to the page when processed by the markdown processor.

In the web application I tested, I knew that XSS was going to be a tough ask. It was an Angular application, and Angular sanitises untrusted values bound into the page by default. And, based on testing of the API, I knew that anything that looked like HTML or JavaScript would be stripped out before it was stored in the database.

But, I figured, the markdown may be an entry point if it’s not sanitised correctly on the web application or the API.

Let’s take another look at that markdown

Another example of some markdown is a link. Its syntax is identical to an image, but without the prefixed ‘!’.

[Click Me](https://www.example.com/)

When processed by Medium, it will look a little like this:

<a href="https://www.example.com/">Click Me</a>

If we can structure the markdown correctly, we should be able to alter the resulting HTML and include whatever nasties we can imagine.

The exploit!

The initial exploit, as it turns out, was quite simple. Working backwards from the anchor code snippet above, we can see we have a few options. We can either escape out of the href attribute and add some script that fires on a DOM event, or keep it simple and place the code in the href itself. Here’s what we’re shooting for:

<a href="javascript:alert('XSS')">Click Me</a>

We’ll keep our exploit simple for now and work to loftier goals later. Comparing this goal with the link HTML and markdown above, we can see the exploit should be simple. Put the payload in the parentheses and we should be good to go!

[Click Me](javascript:alert('Uh oh...'))

Et voilà! It worked! We now have a link that will pop up an alert, or run whatever else we choose, when clicked. This demonstrates that the front end and the back end either don’t consider markdown an XSS vector or don’t sanitise it correctly.
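To make the failure mode concrete, here is a minimal Python sketch of a renderer that copies the link target straight into the href without checking it. The regular expression and the name render_link_naively are assumptions made for illustration; this is not the processor used by the application I tested, just the simplest thing that behaves the same way for these two inputs.

```python
import re

# Hypothetical, deliberately naive pattern for [text](target) links.
LINK_PATTERN = re.compile(r"\[([^\]]*)\]\((.*)\)")


def render_link_naively(markdown: str) -> str:
    """Turn [text](target) into an anchor tag without validating the target."""
    return LINK_PATTERN.sub(
        lambda match: f'<a href="{match.group(2)}">{match.group(1)}</a>',
        markdown,
    )


# A normal link renders as expected.
print(render_link_naively("[Click Me](https://www.example.com/)"))

# The javascript: target is copied into href unchanged.
print(render_link_naively("[Click Me](javascript:alert('Uh oh...'))"))
```

Nothing in this sketch looks at the URL scheme, so the second call produces an anchor whose href runs script when clicked.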

Is that the best we can do?

Let’s face it, this is cool, but not the best exploit. Firstly, the user has to actually click the link before the JavaScript is executed. Ideally, we want it to execute by only visiting the page. Secondly, when clicked, the link won’t do anything visible for the user. If you create a malicious link that does nothing when clicked, it’s only a matter of time before a web dev comes along and opens the developer tools to see what’s going on, and the game is up.

We want an exploit that executes when the page loads and runs silently without the user knowing. This brings us back to images. If we can create an image and set the script to run when the image loads, then the page will look as expected with our exploit code running in the background.

Exploit, Round 2!

Going back to our image markdown, we can assume that we can execute the same attack since it behaves so similarly to a link. Here’s the markdown:

![The goodest boy](https://images.unsplash.com/the_good_boy.png)

And here’s the resulting HTML:

<img src="https://images.unsplash.com/the_good_boy.png" alt="The goodest boy">

Surely we can put the same payload in the parentheses and get XSS. For whatever reason, it didn’t work. Here’s an important bit:

How the markdown is rendered to HTML may differ between markdown processors.

This particular markdown didn’t result in a successful XSS attack in the web application. But this is where pentesting starts being fun — when you know something should be exploitable, and it’s up to you to figure out how.

If I’m being honest, there was a lot of trial and error. Try a payload, send it to the API, see if it executes. Inspect the result, adjust and repeat.

To cut out a couple of hours of messing around, here are the best ways I found to inject JavaScript into an image in markdown.

![Uh oh...]("onerror="alert('XSS'))

This is the first payload that I found that worked. I couldn’t seem to get the JavaScript to execute when placed directly in the src or alt attributes, but I could close the src attribute and add more attributes. This processes into:

<img src="" onerror="alert('XSS')" alt="Uh oh...">

Since the src value is empty, loading the image will result in an error, which will execute our code. Cool, huh? But we’re still noticing a difference on our page, since the image won’t load correctly. So, I came up with this:

![Uh oh...](https://www.example.com/image.png"onload="alert('XSS'))

As it turns out, we could keep a valid source URL and still inject an onload attribute, which fires once the image has loaded. Success!
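Why does a stray double quote do the trick? Here’s a minimal Python sketch, assuming the renderer pastes the text from the parentheses between the quotes of src without escaping it. The function render_image_naively is hypothetical and the exact spacing of the real processor’s output may differ, but the mechanism is the same: our quote closes src early, and the renderer’s own closing quote ends up terminating the attribute we injected.

```python
def render_image_naively(alt_text: str, source: str) -> str:
    """Build an <img> tag by pasting both values in unescaped (deliberately unsafe)."""
    return f'<img src="{source}" alt="{alt_text}">'


# Payload 1: empty src, so the image fails to load and onerror fires.
broken_image = render_image_naively("Uh oh...", "\"onerror=\"alert('XSS')")

# Payload 2: valid src, so the image loads normally and onload fires.
loaded_image = render_image_naively(
    "Uh oh...", "https://www.example.com/image.png\"onload=\"alert('XSS')"
)

print(broken_image)
print(loaded_image)
```

In both cases the value we supplied is no longer just a URL; it has become a src attribute followed by a complete event handler attribute.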

Why did this work?

Now, don’t go thinking you’re about to make bank collecting bug bounties from GitHub, Medium and every other site that renders markdown. The smart people who write these markdown renderers have thought way ahead, and their renderers come with support for sanitisation. As it turns out, the sanitiser was causing issues in this web application, so it was turned off. It was believed that the other protections put in place would prevent this type of attack, but in this case, they didn’t.

And this is why we have penetration testing. It’s not just to tick a compliance box or put the brakes on train wrecks of projects. It’s to take a fresh perspective and do some sanity checking. Get some people who think about security a lot to pick up the little mistakes that are made along the course of a development project and get them fixed before it’s released to the world.

Developers out there, keep in mind how somebody might abuse what you’re building. Turning off security features like sanitisation should ring alarm bells and should only be done with care and as a last resort. There’s more than one way to skin a cat, and you may have to get creative to meet functional requirements without impacting security.
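For a sense of what the missing checks look like, here is a minimal Python sketch that would have stopped both payloads in this article. It assumes only absolute http and https URLs are acceptable (an assumption of mine, not a rule from the application), and the function names are hypothetical. Treat it as an illustration of the two ideas, an allowlist for URL schemes and escaping of attribute values, not as a replacement for your markdown library’s own sanitiser.

```python
import html
from urllib.parse import urlsplit

# Assumption for this sketch: only absolute http(s) URLs are acceptable.
ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_url(url: str) -> bool:
    """Accept a URL only if its scheme is explicitly on the allowlist."""
    try:
        scheme = urlsplit(url).scheme
    except ValueError:
        return False
    return scheme in ALLOWED_SCHEMES


def render_link_safely(text: str, url: str) -> str:
    """Render an anchor, falling back to plain escaped text for rejected URLs."""
    if not is_allowed_url(url):
        return html.escape(text)
    return f'<a href="{html.escape(url)}">{html.escape(text)}</a>'


def render_image_safely(alt_text: str, source: str) -> str:
    """Render an image, escaping quotes so the value cannot leave its attribute."""
    if not is_allowed_url(source):
        return html.escape(alt_text)
    return f'<img src="{html.escape(source)}" alt="{html.escape(alt_text)}">'


print(render_link_safely("Click Me", "javascript:alert('Uh oh...')"))
print(render_image_safely(
    "Uh oh...", "https://www.example.com/image.png\"onload=\"alert('XSS')"
))
```

The javascript: link is reduced to its text, and the quotes in the image payload are escaped, so the whole payload stays inside the src value instead of becoming a new attribute.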

Pentesters out there, don’t forget markdown when you’re out there breaking things. Here’s a list of payloads to get you started.

Thanks for reading!
If you enjoyed this post, follow me on Twitter or Mastodon for more content. If you have any feedback or suggestions, leave them in the comments below and I’ll do my best to get back to you.
