Cleaning Up After Highlight.js: A Guide to Fixing Syntax Highlighting Issues
I initially replaced Highlight.js with manual formatting for code snippets after revamping my Blogger site, aiming for more control. However, the process proved too time-consuming and unsustainable, so I returned to using Highlight.js for efficiency. Since blogging is a side activity, it's important that it fits easily into my daily routine as a programmer.
In this post, I'll show you how I fix syntax highlighting issues that arise when using Highlight.js. I'll focus only on the browser version, since blogging platforms like Blogger often condense most of the theme's source code into a single file — making separation of concerns difficult. In this context, using a CDN to load and execute the script is the most practical approach.
1. Setup
When I come up with an idea for a new post, I open Obsidian and create a new Markdown file. In it, I may include plenty of code snippets to illustrate the methods I want to explain to readers. For example:
blog_idea.md
# 1. A PowerShell code snippet example
```powershell
#Requires -PSEdition Core
filter ConvertTo-MyHtml {
    $_ |
    ConvertFrom-Markdown |
    Select-Object -ExpandProperty Html |
    Out-File $args[0] -Encoding utf8 -NoNewline
}

function ConvertFrom-MyMarkdown {
    param($PathWithoutExtension)
    Get-Content "$PathWithoutExtension.md" -Raw |
    ConvertTo-MyHtml "$PathWithoutExtension.html"
}

ConvertFrom-MyMarkdown .\blog_idea
```
# 2. A WQL query input as SQL
```sql
ASSOCIATORS OF {Win32_LogicalDisk="C:"}
WHERE ResultClass=Win32_DiskPartition
```
To turn this into a web-ready article, I use an automation tool such as the PowerShell Core cmdlet ConvertFrom-Markdown to convert the file to an HTML fragment. The fragment does not include the full structure of a web page; it's essentially the inner HTML of the <body> element.
While not strictly necessary, I prefer to treat each article as a standalone HTML document. So I wrap the converted content in a complete HTML structure with html, head, and body tags so I can test the page locally using tools like Live Server. I also include styles and scripts, among which are the Highlight.js stylesheet and scripts. The final post source code looks like this:
blog_idea.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title></title>
<style>
@import url('https://fonts.googleapis.com/css2?family=JetBrains+Mono:ital,wght@0,100..800;1,100..800&family=Roboto:ital,wght@0,100..900;1,100..900&display=swap');
pre > code {
border-radius: 4px;
font-family: 'JetBrains Mono', 'Courier New', monospace;
}
</style>
<!-- Highlight.js theme stylesheet (required for code highlighting) -->
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/atom-one-dark.min.css">
</head>
<body>
<!-- #region: section generated from the Markdown to HTML conversion -->
<h1 id="a-powershell-code-snippet-example">1. A PowerShell code snippet example</h1>
<pre><code class="language-powershell">#Requires -PSEdition Core
filter ConvertTo-MyHtml {
    $_ |
    ConvertFrom-Markdown |
    Select-Object -ExpandProperty Html |
    Out-File $args[0] -Encoding utf8 -NoNewline
}

function ConvertFrom-MyMarkdown {
    param($PathWithoutExtension)
    Get-Content "$PathWithoutExtension.md" -Raw |
    ConvertTo-MyHtml "$PathWithoutExtension.html"
}

ConvertFrom-MyMarkdown .\blog_idea
</code></pre>
<h1 id="a-wql-query-input-as-sql">2. A WQL query input as SQL</h1>
<pre><code class="language-sql">ASSOCIATORS OF {Win32_LogicalDisk="C:"}
WHERE ResultClass=Win32_DiskPartition
</code></pre>
<!-- #endregion -->
<!-- #region: Highlight.js scripts (required for syntax highlighting) -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js"></script>
<!-- Language module for PowerShell (required for PowerShell highlighting) -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/powershell.min.js"></script>
<script>hljs.highlightAll();</script>
<!-- #endregion -->
</body>
</html>
Line 15: Highlight.js theme stylesheet.
Lines 42 to 45: Highlight.js scripts.
Lines 20 to 35: PowerShell code snippet.
Lines 37 to 39: WQL code snippet.
This is how it looks in the browser
And this is how it looks without the Highlight.js stylesheet and scripts
That's how I set up Highlight.js in the browser to keep my code snippets clean, unstyled, and readable in the source HTML file, while they appear fully colorized and formatted in the browser.
By contrast, when you save the HTML page via the right-click menu in the browser, the JavaScript has usually already executed. This means that any syntax highlighting performed by the scripts has already been applied. As a result, the rendered code blocks are transformed into verbose HTML structures. The code is broken into <span> elements that label and style individual tokens based on their roles: keywords, operators, and so on. This is what the WQL snippet source code ends up looking like:
<pre><code class="language-sql hljs" data-highlighted="yes">ASSOCIATORS <span class="hljs-keyword">OF</span> {Win32_LogicalDisk<span class="hljs-operator">=</span>"C:"}
<span class="hljs-keyword">WHERE</span> ResultClass<span class="hljs-operator">=</span>Win32_DiskPartition
</code></pre>
Ironically, this resembles the kind of markup I used to write manually when I was highlighting code by hand. The difference is that Highlight.js handles this complexity dynamically, in the browser — whereas maintaining such markup directly in the source file made updates tedious and error-prone.
2. Fixing Up Code Snippets
Highlight.js doesn't always return correctly highlighted output. Some code constructs may be miscolored, or not highlighted at all. When that happens, I often need to tweak things manually to ensure the result is visually accurate and semantically meaningful.
In the PowerShell snippet example, the filter keyword is correctly highlighted, but the custom filter name that follows isn't recognized as a function definition. This contrasts with the function keyword, which Highlight.js properly interprets, applying the correct styling to the function name. Since a filter is a specific type of function, it should be styled the same way as a function definition.
Similarly, in the WQL snippet, the expression ASSOCIATORS OF is split into two separate tokens, and only OF is highlighted as a keyword. However, in WQL, ASSOCIATORS OF is a single compound keyword.
As the demo video below shows, to fix these issues, I usually modify the rendered HTML from the browser's Developer Tools > Elements panel by adjusting the <span> elements manually as needed. I then copy the resulting HTML fragment, usually the <pre> elements, and use it to replace the improperly highlighted code sample in the source file.
You'll notice that each manually modified code sample includes a data-highlighted="yes" attribute on its parent <code> element. This tells the highlighter to skip re-processing that particular snippet.
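When the same corrections recur across posts, they can be scripted instead of made by hand in the Elements panel. The following is a minimal sketch in plain JavaScript, not part of Highlight.js: patchHighlightedHtml is a hypothetical helper name, and the class names it looks for (hljs-keyword, hljs-built_in, hljs-title) are assumptions based on the markup Highlight.js 11.9.0 produced for the two examples in this post, so they may differ with other versions or languages.

```javascript
// Hypothetical helper: patches markup that Highlight.js has already produced.
// The class names are assumptions taken from the output seen in this post.
function patchHighlightedHtml(html) {
  return html
    // Style the name that follows the `filter` keyword like a function title.
    .replace(
      /(<span class="hljs-keyword">filter<\/span>\s+<span class=")hljs-built_in/g,
      '$1hljs-title'
    )
    // Merge ASSOCIATORS and OF into a single keyword span.
    .replace(
      /\bASSOCIATORS <span class="hljs-keyword">OF/g,
      '<span class="hljs-keyword">ASSOCIATORS OF'
    );
}
```

The function works on the HTML as a string, so it can be applied to a fragment copied from the Elements panel or to a whole saved page. Markup it does not recognize is returned unchanged.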
3. Pure HTML+CSS Highlighting
Excluding only selected code samples from automatic highlighting implies that, on a larger static page with many more snippets, some code will still need no manual refinement and can be left to the Highlight.js scripts.
However, in cases like our example, where all code samples have been manually adjusted to match the exact formatting and details I want to present, there's effectively no need to retain the Highlight.js JavaScript in the final HTML source code. Since the Highlight.js library clearly separates styling from syntax classification, it's safe to remove the scripts once the highlighting has been applied: the styles will remain intact.
Earlier, we used the browser's context menu to save a snapshot of the live Document as shown in the Developer Tools > Elements tab. By the time we clicked "Save as...", Highlight.js had already executed, and the DOM was fully rendered, including all highlighted code.
At that point, the only task left is to manually adjust each <pre> block as needed, just as we demonstrated in the previous video, and then remove the Highlight.js <script> tags from the page. This cleanup step can also be done before saving the Document, directly from the browser's Developer Tools, as shown in the following video.
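If the page has already been saved, the script removal can also be done on the HTML text itself. Below is a minimal JavaScript sketch under stated assumptions: stripScripts is a hypothetical helper name, and it removes every <script> element, not only the Highlight.js ones. A regular expression is enough for a small, known page like the one in this post, but it is not a general-purpose HTML parser.

```javascript
// Hypothetical helper: removes every <script> element from an HTML string,
// together with the whitespace that follows it.
// Assumes simple, well-formed markup such as the page shown in this post.
function stripScripts(html) {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script>\s*/gi, '');
}
```

Code samples that mention script tags are not affected, because inside a <pre> block the angle brackets are stored as HTML entities. In the Developer Tools console, the DOM counterpart would be document.querySelectorAll('script').forEach(s => s.remove()).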
3.1 @import the Theme Stylesheet
As shown in the previous video, when saving a webpage via the browser's UI, external stylesheets and scripts linked with <link> or <script> tags are downloaded, and their URLs are rewritten to point to the local copies. This behavior is undesirable when we plan to embed the saved HTML in platforms like Blogger, where remote stylesheet URLs are preferred over inline styles.
To avoid this, specifically for stylesheets, we can use the @import rule within a <style> element instead of relying on a <link> tag:
<style>
/* Highlight.js theme stylesheet (required for code highlighting) */
@import url('https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/atom-one-dark.min.css');
</style>
I place the @import rule in its own <style> block to ensure that the Highlight.js theme can take precedence in the cascade, particularly when specificity or load order matters. If order isn't a concern, the @import rule can be included alongside other CSS rules in a shared <style> block. Remember that @import rules must come before all other style rules in the block; an @import placed after another rule is ignored.
In our case, the combined stylesheet would look like this:
<style>
@import url('https://fonts.googleapis.com/css2?family=JetBrains+Mono:ital,wght@0,100..800;1,100..800&family=Roboto:ital,wght@0,100..900;1,100..900&display=swap');
/* Highlight.js theme stylesheet (required for code highlighting) */
@import url('https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/atom-one-dark.min.css');
pre > code {
border-radius: 4px;
font-family: 'JetBrains Mono', 'Courier New', monospace;
}
</style>
This approach keeps the code lightweight, readable, and Blogger-friendly, without cluttering the template's <b:skin> section and inline stylesheets.
3.2 Headless Browsing Mode
An alternative to using the @import CSS rule is to load the page in headless mode, which behaves as if you're viewing the rendered DOM in the Elements panel of Developer Tools, but without a visible UI. Headless mode renders the page exactly as the browser would, with full JavaScript execution and DOM transformation. Since there's no “Save As…” step involved, the stylesheet CDN URL remains untouched, and the final output reflects the fully rendered HTML document.
It also enables full automation. For instance, the following PowerShell Core script takes a Markdown file, converts it to HTML, runs it through a headless browser, and captures the final HTML — with Highlight.js applied and script tags stripped out:
MakePost.ps1
#Requires -PSEdition Core
# Usage: .\MakePost.ps1 <path to the Markdown file>

$templatePath = "$PSScriptRoot\template.html"
$markdownPath = Get-Item $args[0] -ErrorAction Stop |
    Select-Object -ExpandProperty FullName
$htmlPath = [IO.Path]::ChangeExtension($markdownPath, '.html')

# Temporary files: the page with its scripts, and the DOM dumped by the browser.
$sourceWithScriptPath = "Temp:\$(
    New-Guid |
    Select-Object -ExpandProperty Guid
).html"
$sourceNoScriptPath = "Temp:\$(
    New-Guid |
    Select-Object -ExpandProperty Guid
).html"

# Insert the converted Markdown at the placeholder line of the template.
# \r?\n lets the split work with both LF and CRLF line endings.
"{0}{2}{1}" -f @(
    ((Get-Content $templatePath -Raw) -split '\r?\n<!-- __CONTENT__ -->\r?\n') +
    (
        Get-Content $markdownPath -Raw |
        ConvertFrom-Markdown |
        Select-Object -ExpandProperty Html
    )
) | Out-File $sourceWithScriptPath -Encoding utf8 -NoNewline

# Let headless Edge run the scripts and dump the rendered DOM to a file.
Start-Process 'C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe' -ArgumentList @(
    '--headless'
    '--disable-gpu'
    '--dump-dom'
    '--single-argument'
    (Get-Item $sourceWithScriptPath).FullName
) -RedirectStandardOutput $sourceNoScriptPath -Wait -NoNewWindow

# Drop the lines that hold <script> tags.
$lines = (Get-Content $sourceNoScriptPath) -notmatch '^\s*<script\b'
# Optional fix: style the filter name like a function name.
$lines = $lines -replace '\s*(<span class="hljs-keyword">filter</span>\s+<span class=")hljs-built_in', '$1hljs-title'
# Optional fix: treat ASSOCIATORS OF as a single keyword.
$lines = $lines -replace '\bASSOCIATORS <span class="hljs-keyword">OF', '<span class="hljs-keyword">ASSOCIATORS OF'

$lines |
    Out-String |
    Out-File $htmlPath -Encoding utf8 -NoNewline
The replacement operations at the end are optional and tailored to specific Highlight.js output adjustments discussed earlier. You can remove them if your only concern is to execute the JavaScript, preserve the CDN styles, and eliminate unnecessary scripts.
The template.html file used in the script wraps the Markdown-generated content into a full HTML document structure with optional styles and Highlight.js resources. The <!-- __CONTENT__ --> placeholder is where the converted HTML is inserted:
template.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title></title>
<style>
@import url('https://fonts.googleapis.com/css2?family=JetBrains+Mono:ital,wght@0,100..800;1,100..800&family=Roboto:ital,wght@0,100..900;1,100..900&display=swap');
pre > code {
border-radius: 4px;
font-family: 'JetBrains Mono', 'Courier New', monospace;
}
</style>
<!-- Highlight.js theme stylesheet (required for code highlighting) -->
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/atom-one-dark.min.css">
</head>
<body>
<!-- #region: section generated from the Markdown to HTML conversion -->
<!-- __CONTENT__ -->
<!-- #endregion -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/powershell.min.js"></script>
<script>hljs.highlightAll();</script>
</body>
</html>
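The placeholder substitution itself is a plain string operation. As an illustration only, here is a small JavaScript sketch of the same idea; fillTemplate is a hypothetical name, and the sketch assumes the placeholder sits alone on its own line, as in the template above. It accepts both LF and CRLF line endings around the placeholder and puts a line break before the inserted content.

```javascript
// Hypothetical helper: inserts converted HTML at the placeholder line.
// Assumes the template contains exactly one <!-- __CONTENT__ --> line.
function fillTemplate(template, contentHtml) {
  // Split on the placeholder line, tolerating LF and CRLF line endings.
  const parts = template.split(/\r?\n<!-- __CONTENT__ -->\r?\n/);
  if (parts.length !== 2) {
    throw new Error('Expected exactly one <!-- __CONTENT__ --> placeholder line');
  }
  const [before, after] = parts;
  // The split consumed the line breaks around the placeholder;
  // contentHtml is expected to end with its own line break.
  return before + '\n' + contentHtml + after;
}
```

Failing loudly when the placeholder is missing or duplicated avoids silently producing a page without the article body.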
Start the Headless Browsing Mode from the Command Line
In the second video demo, I used Microsoft Edge, but the exact same arguments work with Google Chrome. You can explore Peter Beverloo's list of Chromium command-line switches for more options.
Here's the simplest working command:
msedge --headless --dump-dom %url%
We can replace the command msedge with chrome. To avoid typing the full path each time, make sure the browser's executable directory is in your system's PATH. For Microsoft Edge, the default path is C:\.
You can add switches like --disable-gpu since there's no visible rendering. I also use --single-argument to ensure that file paths with spaces are passed correctly:
msedge --headless --disable-gpu --dump-dom --single-argument http://localhost/blog idea.html
To run the command in a blocking way, you can either use the start command or send the output to a pipe (|), as in the following Command Prompt examples:
start /wait /b msedge --headless --dump-dom http://localhost/blog_idea.html
msedge --headless --dump-dom http://localhost/blog_idea.html | findstr /vbrc:" *<script\>"
Redirecting standard output suppresses error messages by default, so no 2> nul is needed.
The PowerShell equivalent of the last one is:
msedge --headless --dump-dom 'http://localhost/files/blog_idea.html' |
    Where-Object { $_ -notmatch '^\s*<script\b' }
In the MakePost.ps1 script above, I use Start-Process with -Wait and -RedirectStandardOutput to capture the final DOM. The output is saved to a temporary file, which is then filtered to produce the fully rendered HTML without scripts. It is crucial that the input file passed to the browser has a .html extension; otherwise, the browser will not treat it as an HTML document, and syntax highlighting (or even DOM parsing) will fail. This subtle requirement ensures that the browser's rendering engine applies the correct content type, enabling JavaScript execution as if the page were loaded in a normal browser window.
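For readers who script in JavaScript rather than PowerShell, here is a rough Node.js sketch of the same blocking call. headlessArgs and dumpDom are hypothetical helper names, and the browser path is whatever you pass in; only the command-line switches come from this post. The sketch rejects inputs that do not end in .html, since the browser depends on that extension.

```javascript
const { execFileSync } = require('node:child_process');
const path = require('node:path');

// Hypothetical helper: builds the switches used in this post for one file.
function headlessArgs(htmlFile) {
  // The browser relies on the extension to treat the file as an HTML document.
  if (path.extname(htmlFile).toLowerCase() !== '.html') {
    throw new Error(`Expected a .html file, got: ${htmlFile}`);
  }
  return ['--headless', '--disable-gpu', '--dump-dom', path.resolve(htmlFile)];
}

// Hypothetical helper: runs the browser and returns the rendered DOM as text.
// execFileSync blocks until the browser exits and passes each array element
// as a separate argument, so a path with spaces needs no extra quoting.
function dumpDom(browserPath, htmlFile) {
  return execFileSync(browserPath, headlessArgs(htmlFile), { encoding: 'utf8' });
}
```

A call would look like dumpDom(browserPath, 'blog_idea.html'), where browserPath is the full path to msedge.exe or chrome.exe on your machine. The returned string can then be filtered and patched like the temporary file in MakePost.ps1.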
4. Conclusion
In this post, I explained how to set up the syntax highlighter Highlight.js in an HTML source file that's ready to be published on my Blogger site. One of the main advantages of this approach is the maintainability of the source file: the code remains clean, readable, and easy to edit.
However, Highlight.js isn't flawless — it occasionally misclassifies or incorrectly colorizes tokens, which can reduce the semantic clarity of the highlighted code. At that point, we may choose to trade off maintainability for precision by manually adjusting the snippets, even though it results in a more cluttered and less readable HTML fragment.
When all code samples have been manually refined, we can further optimize the final output by removing unnecessary Highlight.js scripts from the source. However, the method we choose for this cleanup matters: some approaches, such as manually saving a copy of the page with a <link> to a CDN stylesheet, may impact the portability of the resulting HTML file.