Hello, nanoc


I started this blog in 2007 on Blogger, and later moved to self-hosted WordPress. WordPress is a great platform for self-publishing, but it does not quite fit my workflow:

  • I would rather write posts in a markup language with my familiar editor, vim, than in some half-baked rich HTML editor.
  • A full-fledged version control system is always a blessing.
  • Edit anywhere, preview anywhere, even at 30,000 feet.

That is exactly where static site generators shine. There are many static site generators to suit hackers’ various tastes. I have played with several, including but not limited to jekyll, Octopress, hyde, blogofile, and pelican, and I settled on nanoc.

First impressions of nanoc

I cherish nanoc's design philosophy, the separation of mechanism and policy: nanoc does not force you to follow implicit conventions to pull off the show. Instead, it provides the rendering pipeline and the helper utilities, but ultimately it is up to you to set up the Rules file (much like a Makefile for make) to instruct nanoc how to render and lay out the content, then route the compiled page:

  • the DSL compiler parses the Rules file and saves a (pattern, Ruby block) pair for each stage
  • the runner then runs through the following stages in order: preprocess, compile, route
  • the preprocess stage is the very first stage after the site is loaded, which makes it a perfect place to populate the meta pages; see Rendering Tags for more details
  • in the compile stage, the runner follows the filter and layout calls in the DSL to render each page
  • in the route stage, the matching block returns an absolute path, which becomes the final destination of the item

In each stage, the runner iterates over the saved patterns and tries to match the item's identifier. When a match is found, the runner passes the context (@site, @item) to the corresponding Ruby block and executes it. The first matching pattern wins, which is why the order of patterns in Rules matters.
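
To make the matching order concrete, here is a toy model of first-match dispatch in plain Ruby. It is only a sketch and not nanoc's actual implementation: RuleSet, add, apply, and glob_to_regexp are hypothetical names made up for this illustration, and the glob handling is deliberately simplistic.

```ruby
# A toy model of first-match rule dispatch; this is NOT nanoc's real code.
# RuleSet, add, apply and glob_to_regexp are hypothetical names.
class RuleSet
  Rule = Struct.new(:pattern, :block)

  def initialize
    @rules = []
  end

  # Remember the (pattern, block) pair in declaration order.
  def add(pattern, &block)
    @rules << Rule.new(glob_to_regexp(pattern), block)
  end

  # Run the block of the first rule whose pattern matches the identifier.
  # Returns nil when no rule matches.
  def apply(identifier)
    rule = @rules.find { |r| r.pattern.match?(identifier) }
    rule && rule.block.call(identifier)
  end

  private

  # Translate a glob such as "/blog/*/" into an anchored regular expression.
  def glob_to_regexp(glob)
    Regexp.new('\A' + Regexp.escape(glob).gsub('\*', '.*') + '\z')
  end
end

routes = RuleSet.new
routes.add('/blog/*/') { |id| "#{id}index.html" }    # specific rule first
routes.add('*')        { |id| "#{id}fallback.html" } # catch-all last

puts routes.apply('/blog/hello/')
puts routes.apply('/about/')
```

If the catch-all rule were declared first, it would shadow the more specific blog rule, so specific patterns belong above general ones.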

Examples

Nothing is more convincing than some concrete examples:

Routing blog posts imported from jekyll

jekyll uses a rigid directory convention:

  • the posts are stored in _posts
  • all posts use the yyyy-mm-dd-slug naming format

Assuming we want a yyyy/mm/slug permalink structure, we can add a custom route rule in Rules:

route "/blog/_posts/*/" do
  # route to yyyy/mm/slug
  path, title = File.split(item.identifier)
  y, m, d, slug = /(\d+)-(\d+)-(\d+)-(.*)/.match(title).captures
  File.join(path.chomp('_posts'), y, m, slug,'index.html')
end
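
To check the path arithmetic outside of nanoc, the body of the rule can be lifted into a plain function. This is just a sketch: jekyll_route is a hypothetical helper name, and the sample identifier is made up for illustration.

```ruby
# Hypothetical helper mirroring the body of the route rule above.
def jekyll_route(identifier)
  # "/blog/_posts/2012-03-04-hello-nanoc/" -> ["/blog/_posts", "2012-03-04-hello-nanoc"]
  path, title = File.split(identifier)
  # Split the jekyll file name into year, month, day and slug (the day is unused).
  y, m, _d, slug = /(\d+)-(\d+)-(\d+)-(.*)/.match(title).captures
  # Drop the "_posts" segment and rebuild the path as yyyy/mm/slug/index.html.
  File.join(path.chomp('_posts'), y, m, slug, 'index.html')
end

puts jekyll_route('/blog/_posts/2012-03-04-hello-nanoc/')
# => /blog/2012/03/hello-nanoc/index.html
```

Note that an identifier that does not follow the yyyy-mm-dd-slug format makes match return nil, so the rule pattern should only cover the imported posts.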

Rendering Tags

Tags provide another perspective for organizing the posts, and their pages should be generated dynamically. In Rules, once the site is loaded, we can populate all the meta pages in the preprocess stage:

preprocess do
  generate_tags(@items)
end

The implementation is in lib/default.rb:

def generate_tags(items)
  # ... ...
  # iterate all items and group them by tag

  tags.each_pair do |title, entries|
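    # Item.new takes (content, attributes, identifier);
    # the content here is a one-line Haml template.
    items << ::Nanoc::Item.new(
      '= render "partials/blog/list"',
      {
        :archives => entries,
        :extension => 'haml',
        :title => "##{title}",
      },
      "/blog/tags/#{title}"
    )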
  end
end

In generate_tags, all posts are grouped by their tags (omitted for the sake of simplicity), then we create a page for each tag that renders the partials/blog/list layout. The secret sauce is to pass the grouped posts as the archives attribute, so that in partials/blog/list:

- @item[:archives].each do |item|
    %h2= link_to item[:title], item.path
    = item.compiled_content

we can reference them via @item[:archives] and render the post list. The same trick also supports pagination.
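
The grouping step is elided above, so the following is only one possible way to do it, not necessarily what this site uses. The sketch works on plain hashes that stand in for nanoc items; group_by_tag and paginate are hypothetical helper names, and the :tags attribute and the page paths are assumptions.

```ruby
# Stand-in posts: plain hashes that answer [:tags] the way an item would.
posts = [
  { :title => 'Hello, nanoc', :tags => ['nanoc', 'blog'] },
  { :title => 'Vim tips',     :tags => ['vim'] },
  { :title => 'Rules file',   :tags => ['nanoc'] },
  { :title => 'Untagged' },
]

# Hypothetical helper: map each tag to the posts that carry it.
# Posts without a :tags attribute are skipped.
def group_by_tag(posts)
  posts.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |post, tags|
    (post[:tags] || []).each { |tag| tags[tag] << post }
  end
end

# Hypothetical helper: split entries into pages of at most per_page entries.
def paginate(entries, per_page)
  entries.each_slice(per_page).to_a
end

group_by_tag(posts).each_pair do |title, entries|
  paginate(entries, 1).each_with_index do |page, index|
    titles = page.map { |post| post[:title] }.join(', ')
    puts "/blog/tags/#{title}/page/#{index + 1}: #{titles}"
  end
end
```

Each page produced by paginate could then be passed as the archives attribute of its own generated item, in the same way as the tag pages.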

Rendering figure

nanoc provides a rich and flexible architecture for content processing. For example, this page is rendered by the rdiscount filter, chained with pygments.rb for the code snippets. It is also quite straightforward to write your own filter. For example, the following filter wraps any img element that has a title attribute in a figure, and uses the title as its figcaption.

require 'nokogiri'

module Nanoc
  module Filters
    class Captioner < Nanoc::Filter
      identifier :captioner
      type :text
      def run(content, params = {})
        html = Nokogiri::HTML.fragment(content)
        html.css("img[title]").each do |element|
          # Insert the <figure> where the image was, then move the image into it.
          figure = Nokogiri::XML::Node.new("figure", element.document)
          element.add_previous_sibling(figure)
          element.parent = figure

          # Add a <figcaption>.
          figcaption = Nokogiri::XML::Node.new("figcaption", element.document)
          figcaption.content = element["title"]
          element.add_next_sibling(figcaption)
        end
        html.to_s
      end
    end
  end
end

The filter is fed the content and is expected to return the processed content. We use Nokogiri to load the content into a DOM tree, manipulate the HTML elements, and dump the DOM tree back to a string at the end.

Beyond nanoc

nanoc is extremely flexible and powerful for canonical static content, but it falls short for dynamic features, such as:

  • authentication and authorization
  • comment management
  • search

Any lightweight backend will suffice for authentication and authorization. I am considering OpenResty, from the nginx community, for its simplicity.

The comments on this site are managed by Disqus.

I am still debating between Google Custom Search and a dedicated Solr service. Both work very well, though Solr has more knobs than Google's search.