Web Key Directory Implementation

An Eccentric Anomaly: Ed Davies's Blog

In the last few days I've seen a couple of write-ups of implementations of Web Key Directory (for finding the public cryptographic key associated with an email address) so I thought it might be useful to describe mine.

Public-key cryptography relies on there being some way to obtain the public key of whoever you want to send a message to, or of whoever signed a message you've received. There are two stages to that: first, actually getting hold of the key and, second, verifying that it's the right one and not one inserted by somebody who wants to read your mail or impersonate your correspondent.

The traditional solutions are either to include the key with the message, signed by some authority trusted by both parties, as in TLS (including HTTPS), or to get the key from a public key server and then verify it using signatures by people you trust at some transitive level (the web of trust).

Both approaches are pretty clumsy for many forms of communication, which is part of the reason cryptography is not so widely used. Two proposed solutions are Autocrypt and Web Key Directory (WKD). They're not really competitors; if anything, they're complementary.

There have long been concerns about the centralised nature of the key servers (though they are distributed in their actual operation) and some of the policies applied to their operation. Recently there have been problems with keys being spammed into uselessness on the servers, which gives an additional nudge towards alternative solutions.

Autocrypt is a standard for including keys in the headers of email messages so that clients can build up a list of known keys and automatically do encryption when all the required keys are known.
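As a rough illustration of what such a header carries, here is a minimal sketch assuming the Autocrypt Level 1 layout: an `addr` attribute, an optional `prefer-encrypt` attribute and a final base64 `keydata` attribute. The function names and the sample address are made up for the example, and the key bytes are a placeholder rather than a real OpenPGP key.

```python
import base64


def build_autocrypt_value(addr, key_bytes, prefer_mutual=False):
    """Build the value of an Autocrypt header (hypothetical helper)."""
    parts = ['addr=' + addr]
    if prefer_mutual:
        parts.append('prefer-encrypt=mutual')
    # keydata is the binary public key, base64 encoded, and comes last.
    parts.append('keydata=' + base64.b64encode(key_bytes).decode('ascii'))
    return '; '.join(parts)


def parse_autocrypt_value(value):
    """Split an Autocrypt header value into (addr, key_bytes)."""
    attrs = {}
    for part in value.split(';'):
        name, sep, val = part.strip().partition('=')
        if sep:
            attrs[name.strip().lower()] = val
    # Folded headers may contain whitespace inside the base64 data.
    keydata = ''.join(attrs['keydata'].split())
    return attrs['addr'].strip().lower(), base64.b64decode(keydata)


known_keys = {}   # what a client might accumulate, keyed on address
value = build_autocrypt_value('alice@example.org', b'placeholder key bytes')
addr, key = parse_autocrypt_value(value)
known_keys[addr] = key
```

A real client does a good deal more than this (tracking timestamps and the prefer-encrypt state, for instance), so treat it only as a picture of the data involved.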

WKD is a standard for storing the keys at well-known locations on a web site to allow email clients to fetch them automatically. It requires the use of HTTPS which has the neat effect of leveraging that protocol's security to give a modicum of hope that the key fetched really belongs to the right person. It's possible that the hosting provider for a website could sneakily change the key so, in theory, the user should check their own key is served correctly once in a while.

Enigmail, which I use in Thunderbird for sending and receiving email, already does Autocrypt but, as it doesn't control your web site, it can't do WKD. Consequently, back in March I added WKD directories and files to my website (as noted in an aside on my contact page). In the last few days I've seen write-ups by dkg on the implementation for Debian developers and by Katharina Fey on the one for her personal email, so here's mine.

My site is built statically (as previously described). The ASCII-armoured versions of my public keys were manually exported from GPG and placed in the version-controlled static site-directory tree. The binary forms of the keys with the funny paths specified by WKD are extracted from those and added to the final published form of the site automatically by the build scripts.

This is the code fragment in my build-site.py file which invokes the operation:

    # Add the web key directory.
    verbosePrint('Creating web key directory')
    wkd.setupWkd(new, options.verbose, {
            'ed@edavies.me.uk':     'edavies-pub-key.asc'
        })

… where “new” is the path to the temporary directory where the site is being built.

The code to do all the work is in wkd.py:

#!/usr/bin/python3

"""
wkd.py

Code related to supporting Web Key Directory.

This is for the direct method documented in:

    draft-koch-openpgp-webkey-service-07.txt

"""

import string
import hashlib
import os
from os import path
import subprocess

def z_base_32(s):
    """ Return a string which is the z_base_32 encoding of a sequence of
        bytes. The bytes should be an exact multiple of 5 bits in length,
        i.e., 5 bytes/40 bits or multiples thereof.

        https://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
        RFC 6189, section 5.1.6.
    """
    assert (len(s) % 5) == 0
    
    def bits():
        """ Yield the bits in s. """
        for b in s:
            for i in range(8):
                yield (b >> 7) & 0x01
                b <<= 1

    def words():
        """ Yield the 5-bit words from s. """
        w = 0
        i = 0
        for b in bits():
            w = (w << 1) + b
            i += 1
            if i >= 5:
                yield w
                w = 0
                i = 0

    r = ['_'] * (len(s) * 8 // 5)

    i = 0
    for w in words():
        r[i] = 'ybndrfg8ejkmcpqxot1uwisza345h769'[w]
        i += 1
        
    return ''.join(r)
    
ascii_lower_table = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def ascii_lower(s):
    """ Convert ASCII upper-case characters in a string to lower case.
    
        Note: str.lower() also translates non-ASCII characters.
    """
    return s.translate(ascii_lower_table)
    
def wkdHash(s):
    """ Compute the WKD-style hash of a string, typically the local part of an
        email address.
    """
    h = hashlib.sha1()
    h.update(ascii_lower(s).encode())
    return z_base_32(h.digest())
    
def setupWkd(root, verbose, keys):
    """ Create the well-known directory for WKD key lookup.
    
        root    The root of the directory tree for the site being built.
        
        verbose True iff progress messages are to be displayed.
        
        keys    Dictionary keyed on email address of files in the tree 
                containing the keys for those addresses.
                Extension should be .asc for ASCII-armoured files and
                .gpg for binary files.
    """
    def verbosePrint(*s):
        if verbose:
            print(*s)

    def vcmd(c, **kwargs):
        """ Execute command, listing it if we're being chatty. """
        verbosePrint(' '.join(c))
        rc = subprocess.call(c, **kwargs)
        if rc != 0:
            raise ValueError(
                    'Command "' + 
                    ' '.join(c) + 
                    '" failed with return code: ' + 
                    str(rc)
                )

    wellknown = path.join(root, '.well-known', 'openpgpkey')
    hu = path.join(wellknown, 'hu')
    
    if not path.exists(hu):
        verbosePrint('Creating ' + hu)
        os.makedirs(hu)
    elif not path.isdir(hu):
        raise ValueError('openpgpkey/hu file-system object ' + hu + ' exists but is not a directory')
    
    for email, keyfile in keys.items():
        localpart, domain = email.split('@')
        h = wkdHash(localpart)
        target = path.join(hu, h)
        source = path.join(root, keyfile)
        
        if keyfile.lower().endswith('.asc'):
            binary = False
        elif keyfile.lower().endswith('.gpg'):
            binary = True
        else:
            raise ValueError("Keyfile %s doesn't have extension .asc or .gpg", source)
        
        if binary:
            vcmd(['cp', source, target])
        else:
            vcmd(['gpg', '--dearmor', '-o', target, source])
            
    htaccess = path.join(hu, '.htaccess')
    if not path.exists(htaccess):
        verbosePrint('Creating', htaccess)
        with open(htaccess, 'wt') as f:
            print('ForceType application/octet-stream', file=f)
            
    policy = path.join(wellknown, 'policy')
    if not path.exists(policy):
        verbosePrint('Creating', policy)
        with open(policy, 'wt') as f:
            print('# Dummy policy file', file=f)

if __name__ == '__main__':
    setupWkd(
            path.abspath(path.expanduser('~/web-work/new/')),
            True,
            {  'ed@edavies.me.uk':     'edavies-pub-key.asc' }
        )

It's hardly the fastest ever bit twiddling but, for a few bytes each time I build my site, that's not a problem. If anybody wants to lift any of this code directly then feel free (consider it public domain/MIT licence, I suppose) but, more realistically, I hope it might be a useful reference for other implementations. The best I can say for its validity is that it seems to work with Enigmail.
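For a check that doesn't depend on any particular mail client, something along the following lines could compare what the web server actually hands out with the binary key in the build tree. It's only a sketch: `fetch_bytes` and `served_key_matches` are names invented here, and the URL and file path are for the caller to supply.

```python
import urllib.request


def fetch_bytes(url):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def served_key_matches(url, local_key_path, fetch=fetch_bytes):
    """Return True iff the key served at url equals the local binary key.

    url and local_key_path are supplied by the caller; fetch is injectable
    so the comparison can be exercised without touching the network.
    """
    with open(local_key_path, 'rb') as f:
        expected = f.read()
    return fetch(url) == expected
```

Here `local_key_path` would be the file the build wrote under `.well-known/openpgpkey/hu/`. A byte-for-byte comparison is deliberately crude: it will also flag harmless differences, such as a key that has been re-exported since the site was last published.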