Automating Astro frontmatter with Claude skills

You're looking at a bunch of overlapping, messy tags for some content you don't fully understand because you're not an expert in the relevant domain. Is "Net Zero Technology" really different to "Net Zero Solutions"? You ask yourself: "When did magic strings stop being magic?" You miss Ruby on Rails.

This was me. I was rebuilding a company website to better showcase their product offering and improve the general feel of their web presence. Part of this included porting over a large collection of case studies and blog posts. A lot of time and effort had gone into this content and I needed to treat it with care.

My suggested design included tags and descriptions for these pieces of content, which would allow us to show short blurbs rather than just previewing the first few lines of each post. I've never been a fan of those previews, as the opening of a post rarely captures its essence or main themes. They invariably end in an ellipsis as well, like a last-ditch attempt to pique your interest and get you to click "read more". I built the site using Astro, whose content collections come with Zod schema validation, which meant I could enforce structured frontmatter across all posts.

This metadata didn't exist in the current content, and while I've been learning more about the related domain I am by no means an expert. I tried my hand at writing some descriptions and even agonised over a few tags before realising I would most likely miss the mark. I ran a quick test using Claude to generate a relevant description and a short set of 3-4 tags instead, and it created a solid starting point.

I did this for one or two posts and realised it was a repeatable action: useful not just for this piece of work, but in future whenever we add new blog posts or case studies. This is skill territory.

I created a skill and defined both content types, case studies and blog posts, and outlined what I was looking for when creating descriptions for them. Case study descriptions are punchier when they frame the problem or the impact of the solution, while blog post descriptions might be more enticing if they frame the general theme or direction of the post. As for tags, we didn't have any yet so this could become a starting point.

After having an initial crack it became clear that while the description aspect was useful, the tags were going to create an unmaintainable mess of magic strings. Having three overlapping tags like "Net Zero", "Net Zero Technologies", and "Net Zero Funding" looks sloppy and makes users think the site is poorly maintained. It's just not a good look.

---
tags: ["Net Zero Finance", "Net Zero Technology", "Renewable Technology"]
---

Overlapping, repetitive tags that provide little value.

I tweaked the skill so that Claude looked at the tags it had already generated in other blog posts before suggesting any for the file I'd passed in. I also made sure we kept tags granular and composable, so we could avoid tags that were too similar or overlapping. But this approach posed issues:

More input tokens - potentially whole blog posts will be processed if the skill doesn't tell Claude to stop reading after it finds the frontmatter tags!
Slower process - with many files to read, and more each time we create new content, this was slow and would only get slower.
Potential reliability issues - with the context window full of blog posts and case studies, we are probably muddying the waters. Claude ends up weighing loads of fluff when it should only really be considering a small subset of that information.
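
To make the cost concrete, here is a minimal sketch of what "look at the existing tags" amounts to if done as a script: read each file, but only pay attention to the frontmatter. It assumes tags are written as a single-line inline array, as in the example above. `extractTags` and `collectExistingTags` are hypothetical helper names for illustration, not part of Astro or the skill.

```typescript
// Hypothetical helpers: pull `tags` out of a Markdown file's frontmatter
// without looking at the body. Assumes a single-line inline array such as
//   tags: ["Net Zero Finance", "Net Zero Technology"]

function extractTags(markdown: string): string[] {
  // Frontmatter is the block between the opening `---` line and the next one.
  const frontmatter = markdown.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!frontmatter) return [];

  // Find the `tags: [...]` line inside the frontmatter only.
  const tagsLine = frontmatter[1].match(/^tags:\s*\[(.*)\]\s*$/m);
  if (!tagsLine) return [];

  // Collect every double-quoted string in the array.
  const tags: string[] = [];
  const quoted = /"([^"]*)"/g;
  let match: RegExpExecArray | null;
  while ((match = quoted.exec(tagsLine[1])) !== null) {
    tags.push(match[1]);
  }
  return tags;
}

// Union of all tags across the given file contents, sorted for stable output.
function collectExistingTags(files: string[]): string[] {
  const seen = new Set<string>();
  for (const file of files) {
    for (const tag of extractTags(file)) seen.add(tag);
  }
  return Array.from(seen).sort();
}
```

Even in this trimmed-down form, every content file still has to be opened, and the amount of work grows with every post added.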

The fix for this was simple: another content collection for tags, this time backed by a single tags.json file.

[
  { "id": "decarbonisation", "name": "Decarbonisation" },
  { "id": "renewable-energy", "name": "Renewable Energy" },
  { "id": "net-zero", "name": "Net Zero" }
]

One object per tag, so we can scale to more properties if we decide to get fancy.

This mitigates our problems very simply:

Fewer input tokens - it's just one file with some strings in it. This scales far more manageably as each new piece of content can use existing tags that apply. The skill encourages Claude to create new tags if something appropriate doesn't exist, but in the worst case 3 or 4 new tag strings being added to the context window is better than a few hundred words from a new blog post.
Faster - it's just one file with some strings in it. It's not a growing number of files full of text, so there's less to do.
More focused context - it's just one file with some strings in it. We're not diluting the window with lots of text that's not needed once it's been read in, so Claude has less to worry about when making suggestions.
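
The "reuse a tag if it exists, otherwise add one" step can be sketched in a few lines. This is an illustration rather than the skill itself: `Tag` mirrors the objects in tags.json, while `toTagId` and `resolveTag` are hypothetical helpers showing how a suggested tag name could be checked against the registry before a new entry is created.

```typescript
// Shape of one entry in tags.json.
interface Tag {
  id: string;
  name: string;
}

// Hypothetical helper: derive a kebab-case id from a display name,
// e.g. "Renewable Energy" -> "renewable-energy".
function toTagId(name: string): string {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each non-alphanumeric run into one hyphen
    .replace(/^-+|-+$/g, ""); // drop leading and trailing hyphens
}

// Hypothetical helper: reuse an existing tag when its id is already registered,
// otherwise return a new registry array with the tag appended.
function resolveTag(
  registry: Tag[],
  name: string
): { tag: Tag; registry: Tag[]; created: boolean } {
  const id = toTagId(name);
  const existing = registry.find((t) => t.id === id);
  if (existing) return { tag: existing, registry, created: false };

  const tag: Tag = { id, name: name.trim() };
  return { tag, registry: [...registry, tag], created: true };
}
```

Matching on the derived id means "net zero" and "Net Zero" resolve to the same entry. It won't catch tags that are merely similar in meaning, such as "Net Zero Technology", so that part remains a judgement call for Claude and the reviewers.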

Now each blog post or case study references its tags by id in its frontmatter, and the build breaks if it references a tag that doesn't exist.

---
tags: ["net-zero"]
---

Nice non-magic tags, created magically.
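
For reference, a setup along these lines could look like the sketch below in Astro 5. The file paths, collection names, and schema fields are assumptions for illustration, not the exact config from this project. `reference()` is Astro's helper for pointing entries in one collection at entries in another, and `file()` loads one entry per object from a single JSON file.

```typescript
// src/content.config.ts (Astro 5 location; all paths and names here are assumptions)
import { defineCollection, reference, z } from "astro:content";
import { file, glob } from "astro/loaders";

// One entry per object in tags.json, keyed by its `id`.
const tags = defineCollection({
  loader: file("src/data/tags.json"),
  schema: z.object({
    id: z.string(),
    name: z.string(),
  }),
});

// Blog posts must carry a description and a list of tag ids.
const blog = defineCollection({
  loader: glob({ pattern: "**/*.md", base: "./src/content/blog" }),
  schema: z.object({
    description: z.string(),
    // Each string in the frontmatter `tags` array is treated as an id
    // in the `tags` collection rather than as free text.
    tags: z.array(reference("tags")),
  }),
});

export const collections = { tags, blog };
```

A case studies collection would follow the same pattern with its own `base` directory.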

With this in place I was able to create descriptions and tags for existing content and hand them over to domain experts to review and tweak. Now when we add new content we can focus our attention on the piece of writing itself, and create a quick and easy starting point for frontmatter. If (when) content is updated over time we can also use this as a tool to easily tweak the related frontmatter to make sure it stays fresh.

Magic strings were magic all along.