Adding Search to the Blog
How to add search to a static site.
There's something new on the site. Up in the header there is a search box. You can now search through the posts I've made. Go ahead and try it out, it works pretty well. Come back if you want an explanation as to how I did that.
Search isn't a difficult thing to implement, but getting it right is very hard. I've worked on many kinds of searches over the last few years, and I never (ever) want to do search by hand. Always use a dedicated search tool or service.
When you have a server powering the site, search tools are plentiful. I've used Solr, which is tricky to work with and get configured but extremely powerful. And recently I've converted a project at work from Solr to Azure Search. Azure Search is a bit more simplistic, but a lot easier to deal with since it's mostly managed by Azure.
But this site doesn't have a server powering it. It's all static. There are files on my computer, I run hexo generate, and there are files that can be put on any webserver and served up as-is. No per-request processing required. It makes the site super fast.
Someday I'll do a post about how I host this site and others on Netlify.
Then last week I saw a blog post about searching static sites and decided to try it for myself. After all, I've got quite a few posts spanning 5 years now.
Enter: Lunr
Lunr is a search tool, like Solr, but meant to be run client-side. In a browser. All in JavaScript. That seems like it'll be complex and slow. That's what I initially thought, but it performs very well.
All you need is some data to feed it to build an index, and then you search and get results. Sounds easy, right?
Getting JSON from Hexo
So we need data from the site. But Hexo publishes HTML, not JSON. Well, it can publish JSON, too, with a generator plugin!
The JSON generator has lots of options. After some trial and error, here's the config I ended up using:
jsonContent:
  file: posts.json
  dateFormat: YYYY-MM-DD
  meta: false
  pages: false
  posts:
    title: true
    date: true
    text: true
    description: true
    tags: true
    image: true
    path: true
    link: false
    raw: false
    content: false
    slug: false
    updated: false
    comments: false
    permalink: false
    categories: false
    author: false
The resulting JSON file looks something like this:
[
  {
    "title": "Post Title",
    "date": "2020-01-25",
    "text": "The full text of the post, without any HTML!",
    "description": "Post summary line",
    "path": "2020/01/post",
    "tags": [
      {
        "name": "Tag",
        "slug": "Tag",
        "permalink": "https://moscardino.net/tags/Tag/"
      }
    ]
  }
]
Wonderful! It has all the data I'd like to search and all the data I'd like to display.
You can see the entire generated file here. It's not small, around 220KB, but it compresses very well.
Building the Index
The next step is to load the JSON file and create a Lunr index from it. This part is actually quite easy:
// Load the posts from the json file
let response = await fetch('/posts.json');
let posts = await response.json();

// Create the lunr index
let index = lunr(function () {
    this.ref('id');
    this.field('title');
    this.field('text');
    this.field('description');
    this.field('keywords');

    posts.forEach(function (post, i) {
        post.id = i;
        post.keywords = post.tags.map(tag => tag.name);
        this.add(post);
    }, this);
});
There are two key things here:
- Each document in the index needs a .ref(). This must be unique per document. In this case, we are using the index of the post in the JSON file. All other searchable fields should be added with .field().
- Our tags array is too complex, so we extract just the name of each tag.
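The per-post preparation can also be pulled out into a small pure function, which makes it easier to check on its own. This is only a sketch: toSearchDocument is a hypothetical helper name, not part of Lunr or of the code above, and the fallback to an empty array for posts without tags is an assumption about what the JSON file might contain.

```javascript
// Hypothetical helper: builds the flat document that gets handed to this.add().
// It returns a new object instead of mutating the post loaded from posts.json.
function toSearchDocument(post, i) {
  return {
    id: i, // position in posts.json, used as the Lunr ref
    title: post.title,
    text: post.text,
    description: post.description,
    // Assumption: a post without tags may have no "tags" property at all.
    keywords: (post.tags || []).map(tag => tag.name),
  };
}

const doc = toSearchDocument(
  {
    title: 'Post Title',
    text: 'The full text of the post',
    description: 'Post summary line',
    tags: [{ name: 'Tag', slug: 'Tag' }],
  },
  0
);
console.log(doc.keywords);
```

Inside the lunr callback, this would be used as this.add(toSearchDocument(post, i)).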
Perform the Search
Now that we have an index, we can make our search:
let urlParams = new URLSearchParams(window.location.search);
let term = urlParams.get('term') || '';
let results = index.search(term);
Cool! What does the results array look like? What do we do with it? The most important thing to know is that the results do not contain the documents to display. You get a ref value which you can use to retrieve the document from the original array of posts. Each result also contains some more info about the score of the match and the details of the match, but I'm not using those.
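To make that concrete, here is a rough sketch of turning results back into posts. The sample result objects are made up for illustration; they mimic the ref, score, and matchData properties that Lunr puts on each result. As far as I can tell, ref comes back as a string even though a number was indexed, which still works as an array index. resolveResults is a hypothetical helper name.

```javascript
// Hypothetical helper: maps Lunr-style results back to the original posts.
function resolveResults(results, posts, limit = 10) {
  return results
    .slice(0, limit)                  // keep only the first `limit` results
    .map(result => posts[result.ref]) // ref is the array index stored as the id
    .filter(Boolean);                 // skip refs that do not match a post
}

// Made-up sample data in the shape described above.
const posts = [{ title: 'First' }, { title: 'Second' }, { title: 'Third' }];
const results = [
  { ref: '2', score: 1.3, matchData: {} },
  { ref: '0', score: 0.4, matchData: {} },
];

console.log(resolveResults(results, posts).map(post => post.title));
```

Lunr returns the results ordered by score, so taking the first few is enough to get the best matches.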
Displaying the Results
Here鈥檚 my code for showing the results:
if (results.length) {
    updateTitleBox(`Showing search results for <strong>${term}</strong>.`);

    // Display the results
    results
        .filter((_, i) => i < 10) // Top 10 results only
        .forEach(result => {
            let post = posts[result.ref];

            if (post) {
                let html = createPostItemHtml(post);

                document.querySelector('main').insertAdjacentHTML('beforeend', html);
            }
        });
}
else
    updateTitleBox(`No results found for <strong>${term}</strong>.`);
updateTitleBox is a helper method to update the text of the box at the top of the results page. createPostItemHtml uses JS template strings to generate the HTML for each result. I don't like generating HTML in JS, but loading a templating library for this is overkill.
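For a sense of what that looks like, here is a minimal sketch of such a function. It is not the actual createPostItemHtml from search.js: the markup, the class name, the leading slash on the link, and the escapeHtml helper are all assumptions made for illustration.

```javascript
// Hypothetical helper: escapes characters that are special in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Illustrative result item built with a template string; real markup will differ.
function createPostItemHtml(post) {
  return `
    <article class="post-item">
      <h2><a href="/${escapeHtml(post.path)}">${escapeHtml(post.title)}</a></h2>
      <p>${escapeHtml(post.description || '')}</p>
    </article>`;
}

console.log(createPostItemHtml({
  title: 'Tips & <Tricks>',
  description: 'Post summary line',
  path: '2020/01/post',
}));
```

If updateTitleBox sets its text as HTML, the same kind of escaping would be worth applying to the search term, since that value comes straight from the URL.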
There is some more stuff in the rest of search.js, but it's not very relevant to this post.
Wiring It Up
The last piece to this is to create a page that serves as a results page. This part is very specific to Hexo.
- Create a new layout template in the theme folder. I called mine search.ejs. This template needs to have some HTML for our results to be inserted into and it needs to load Lunr and our search.js file. I saved lunr.js from GitHub into my project so I don't have to rely on some 3rd-party host.
- Create a search page. Not a post, but a page. I put mine at public/search/index.md. The file is empty, except for 2 front-matter properties: layout: search to use the layout we made, and title: Search for the page title.
- Add a search form somewhere on the site. I put mine in the header. It's really simple, as the label and button are only shown to screen readers:
<form action="/search" method="GET" class="header-search__form">
    <label for="term" class="u-sr-only">
        Search Term
    </label>
    <input type="search" name="term" id="term" class="header-search__input" placeholder="Search" />
    <button type="submit" class="u-sr-only">
        Search
    </button>
</form>
A note about debugging: Using hexo server will not work for testing search as it strips query strings from URLs. I got around this by using http-server and hexo generate -f together. hexo generate -w will not work consistently because I don't think it watches the JS files.
Conclusion
Lunr is awesome. It's simple and fast. It's also easy to integrate with anything that can output JSON. Implementing search took very little time and I think I get a lot of benefits from it.
There can be improvements. As I add more posts, the JSON file will get larger and the index will take more time to build. Lunr does offer a way to pre-build the index and load that directly, but I would need to find a way to build that into the Hexo build pipeline. Maybe some day.
Update: I figured out pre-building. Read about it here.
Photo by João Silas on Unsplash.