Externalizing Syntax Highlighting

2024-11-24

Every good technical blog benefits from syntax-highlighted code snippets. I wanted syntax highlighting on my site too, but without shipping JavaScript to the browser to do it. My preferred tool for this particular problem is a library called Shiki. Instead of including a stylesheet for each language, it embeds inline styles directly on the elements of the snippet, and nothing else.

Let’s look at a small example. Given this Rust code:

let builder = WebViewBuilder::new(&window);

Shiki will produce HTML that looks like this:

<pre class="shiki nord" style="background-color:#2e3440ff;color:#d8dee9ff">
	<code>
		<span class="line">
		<span style="color:#81A1C1">let</span>
		<span style="color:#D8DEE9"> builder</span>
		<span style="color:#81A1C1"> =</span>
		<span style="color:#8FBCBB"> WebViewBuilder</span>
		<span style="color:#81A1C1">::</span>
		<span style="color:#88C0D0">new</span>
		<span style="color:#ECEFF4">(</span>
		<span style="color:#81A1C1">&amp;</span>
		<span style="color:#D8DEE9">window</span>
		<span style="color:#ECEFF4">);</span>
		</span>
	</code>
</pre>
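
To make the inline-style idea concrete, here is a minimal sketch of how a list of colored tokens could be turned into markup like the output above. This is not Shiki’s implementation; the `Token` type and the `escapeHtml` and `renderLine` helpers are hypothetical names made up for illustration.

```typescript
// Hypothetical token shape: a piece of source text plus the color a theme assigns to it.
type Token = { content: string; color: string };

// Escape the characters that would otherwise be parsed as HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Render one line of tokens as spans carrying inline styles,
// so the page needs no per-language stylesheet.
function renderLine(tokens: Token[]): string {
  const spans = tokens
    .map((t) => `<span style="color:${t.color}">${escapeHtml(t.content)}</span>`)
    .join("");
  return `<span class="line">${spans}</span>`;
}

const exampleLine = renderLine([
  { content: "let", color: "#81A1C1" },
  { content: " builder", color: "#D8DEE9" },
  { content: " =", color: "#81A1C1" },
]);
console.log(exampleLine);
```

Because every color lives on the element itself, the only styles that reach the page are the ones the snippet actually uses.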

Like everything, Shiki comes with tradeoffs. Instead of runtime dependencies, it brings a not-insignificant amount of build-time dependencies. Typically, this is a worthy tradeoff. Shiki’s full bundle is 6.4 MB minified. This includes the configuration files for every language and every theme it supports. It has a slimmed-down bundle that supports just web languages (HTML, CSS, JS, TS, JSON, Markdown, etc.). That’s about 3.8 MB. There’s also a fine-grained bundle that includes essentially no themes or languages by default, but allows you to add them later. These are large, but if Shiki is just a build dependency, it shouldn’t really matter.

The challenge with integrating Shiki into my site is that I actually render Markdown files at request time. That means if I include Shiki, it’s technically in the runtime path of my server code. My site is deployed via Cloudflare Pages, and there are constraints on how large the overall server code can be. Constraints aside, the more code that has to be loaded, the slower the worker’s startup times will be.

Instead of including Shiki in my server I wondered… could I externalize it as a separate service? If I cached each code snippet based on its content, theme, and lang, I could keep the common case fairly fast while reducing the overall size of my own server.
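
The caching idea can be sketched roughly as follows. This is an illustration under assumptions, not the code running on my site: the `Highlighter` type, `cacheKey`, and `withCache` are hypothetical names, and an in-memory `Map` stands in for whatever cache store is actually used.

```typescript
// Hypothetical signature for whatever performs the highlighting
// (for example, a call out to a remote service).
type Highlighter = (code: string, lang: string, theme: string) => Promise<string>;

// Build a cache key from all three inputs. Serializing an array keeps the
// fields unambiguous even when the code itself contains separator characters.
function cacheKey(code: string, lang: string, theme: string): string {
  return JSON.stringify([lang, theme, code]);
}

// Wrap a highlighter so repeated requests for the same snippet skip the slow path.
function withCache(
  highlight: Highlighter,
  cache: Map<string, string> = new Map(),
): Highlighter {
  return async (code, lang, theme) => {
    const key = cacheKey(code, lang, theme);
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const html = await highlight(code, lang, theme);
    cache.set(key, html);
    return html;
  };
}
```

In a deployed worker, an in-memory `Map` only lives as long as a single instance, so a shared store would likely take its place. The keying idea stays the same either way.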

Simple services with Val.town

Cloudflare feels heavy for small services. An up-and-comer in this space that I enjoy using is val.town. The gist of val.town is that you write simple services and libraries directly in the browser and they’re immediately deployed. Feels like a good fit for this syntax highlighting service.

You can see the entirety of the service below.

Using it is fairly simple. Make a POST request to the endpoint with the lang and theme query params.

curl -X POST "https://just_be-highlight.web.val.run?lang=js&theme=nord" --data "console.log('hello world')"
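
The same request can be sketched in TypeScript. The endpoint and query params come from the curl example above; `highlightUrl`, `highlightRemote`, and the `FetchLike` type are hypothetical helpers, and the sketch assumes the response body is the highlighted HTML as plain text.

```typescript
// Endpoint taken from the curl example above.
const ENDPOINT = "https://just_be-highlight.web.val.run";

// Minimal shape of the fetch function this sketch relies on.
type FetchLike = (
  url: string,
  init: { method: string; body: string },
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

// Build the request URL; searchParams takes care of encoding the values.
function highlightUrl(lang: string, theme: string): string {
  const url = new URL(ENDPOINT);
  url.searchParams.set("lang", lang);
  url.searchParams.set("theme", theme);
  return url.toString();
}

// POST the raw snippet as the body and return the response text.
// `fetchImpl` is injectable so the function can be exercised without a network call.
async function highlightRemote(
  code: string,
  lang: string,
  theme: string,
  fetchImpl: FetchLike = fetch,
): Promise<string> {
  const res = await fetchImpl(highlightUrl(lang, theme), { method: "POST", body: code });
  if (!res.ok) throw new Error(`highlight service responded with ${res.status}`);
  return res.text();
}
```

A call like `await highlightRemote("console.log('hello world')", "js", "nord")` should then mirror the curl command, with one caveat: curl’s `--data` sends a form-encoded content type by default, while `fetch` with a string body sends plain text. Whether the service cares about that difference is not something this sketch verifies.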

Wiring up the service

This service will be outside my typical rendering flow, so I want it to come in at the end. Essentially, after I’ve rendered my HTML, if there are code blocks that can be syntax highlighted, I’ll attempt to highlight them using Cloudflare’s HTMLRewriter. If that process fails for whatever reason, the result will just be served as is.
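
The “serve as is on failure” behavior can be sketched like this. It works on plain strings for simplicity, whereas the real integration goes through HTMLRewriter, which is not shown here. The `Enhance` type and `enhanceOrFallback` are hypothetical names.

```typescript
// Hypothetical step that rewrites rendered HTML,
// for example by highlighting its code blocks.
type Enhance = (html: string) => Promise<string>;

// Try the enhancement, but never let a failure break the page:
// on any error the original HTML is returned unchanged.
async function enhanceOrFallback(html: string, enhance: Enhance): Promise<string> {
  try {
    return await enhance(html);
  } catch (err) {
    console.warn("syntax highlighting skipped:", err);
    return html;
  }
}
```

The point of this design is that highlighting is treated as a progressive enhancement. If the external service is down or slow to respond, readers still get the unhighlighted code block.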

See the PR for the full changes to get it wired up.