Parsing arguments with docopt


In 2008, I hacked together a group of ID3 tag utilities to work around MP3 files with incorrectly encoded strings. These scripts are less useful nowadays because they lack MP4 support. I had just released a Python application and wanted more practice at it, so I took this opportunity to rewrite the scripts from scratch as tagcli.

Thanks to mutagen, manipulating ID3 and MP4 tags is quite trivial, especially since I am using its easy interface. The real challenge, then, is ease of use. Since tagcli is pitched at power users, a nice command-line UI with comprehensive built-in help will make it more appealing, and this is where docopt shines.

Just like pip, tag uses the subcommand pattern to loosely aggregate all of its functionality. First, docopt extracts the subcommand; then the corresponding subcommand is invoked with the remaining arguments. Each subcommand is essentially a Python function with its help copy in its docstring. The function runs docopt again on its own arguments to get the semantics of the user input.

Here is the global help:

$ tag --help
A mutagen-based tag editor.

  tag  [...]
General Options:
  -h, --help    Show help.
  --version     Show version and exit.

 rename         Rename file using pattern with tags.
 update         Update the tags.
 dump           Dumps the tags.
 tags           Show generic tag names.

See 'tag help <command>' for more information on a specific command.

and here is the help for a subcommand:

$ tag help rename

usage: tag rename [options] <pattern> <file>...

Rename <file> with the naming <pattern> formatted by the tags.

  <pattern>           The file name pattern using python string format
                      syntax. See 'tag help tags' for supported tags.
  -p, --dry-run       Print the action the command will take without
                      actually changing any files.
  --verbose           Output extra information about the work being done.


  tag rename '{discnumber}-{tracknumber:02}.{album} - {title}' foo.mp3
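To show what "python string format syntax" means for the pattern in that example, here is a minimal sketch. The tags dictionary and the format_name helper are hypothetical, and the sketch assumes the track and disc numbers are already integers, which the :02 format spec needs in order to zero-pad. How tagcli itself converts the raw tag values is not shown here.

```python
# Hypothetical tag values; a real tagger would read these from the file.
tags = {
    "discnumber": 1,
    "tracknumber": 7,
    "album": "Demo",
    "title": "Intro",
}

pattern = "{discnumber}-{tracknumber:02}.{album} - {title}"


def format_name(pattern, tags):
    # str.format looks up each {name} in tags; ':02' pads the integer
    # track number with zeros to two digits.
    return pattern.format(**tags)


print(format_name(pattern, tags))  # prints: 1-07.Demo - Intro
```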

Unlike argparse or optparse, which build up the argument parser in a bottom-up fashion, docopt takes a top-down approach:

  • you write the exact help copy, using predefined placeholders;
  • docopt parses that string to learn the interface, matches the command line against it, and returns a key/value dictionary;
  • or it prints the usage and exits if the arguments are invalid.

This approach has its own pros and cons. The good is that developers have full control over the look of the help: it is simply WYSIWYG. The bad is that the docopt specification does not support reStructuredText, so you may have to repeat yourself in the documentation, and again in the man page if you want to go the extra mile. The ugly is that you cannot debug your help string, and fitting the help copy into your docstring can be quite tricky if you are religious about PEP 8.
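For comparison, this is roughly what the same rename interface might look like when built bottom-up with argparse from the standard library. It is a sketch with assumed argument names and help strings, not code from tagcli. Each argument is declared one call at a time and argparse generates the help layout, which is exactly the control that docopt hands back to the author.

```python
import argparse


def build_parser():
    # The parser is assembled piece by piece; the help text is generated.
    parser = argparse.ArgumentParser(
        prog="tag rename",
        description="Rename files using a pattern filled in from their tags.",
    )
    parser.add_argument(
        "pattern", help="file name pattern in Python string format syntax"
    )
    parser.add_argument("file", nargs="+", help="files to rename")
    parser.add_argument(
        "-p", "--dry-run", action="store_true",
        help="print the action without changing any files",
    )
    parser.add_argument(
        "--verbose", action="store_true",
        help="output extra information about the work being done",
    )
    return parser


args = build_parser().parse_args(["-p", "{title}", "song.mp3"])
print(args.dry_run, args.pattern, args.file)
```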