Decentralized Secret Management
Back in "Exploring how to secure application credentials" I said we would dig deeper into an encrypted credentials solution. If you haven't read that post, don't worry: this post is self-contained. It does assume you're set on managing secrets without centralization, though. If you're interested in the trade-offs between centralized, decentralized, and bad solutions to the problem of credential management, that post may be worth reading before or after this one.
While PGP is a large part of the implementation discussed, I will discuss emerging alternatives to PGP and their implications in the latter half of the post.
Problem Recap
You're writing part of a distributed system that uses secret credentials to authenticate itself to an external service. You chose to distribute the management of credentials. How does this work securely? Cryptography: with it, we can securely share secrets among trusted parties.
To take advantage of git's version control in the process, we can use a tool called git-crypt. An alternative project called git-secret works in almost the same way. The biggest difference between the two is that git-crypt operates as a transparent crypto codec while git-secret manages a pair of files, one encrypted in git and the other plain text ignored by git. Neither is really superior, just different workflows. Both have aspects I consider poorly designed. I chose git-crypt but you could easily use the other.
How Does git-crypt Work?
Understanding git-crypt is fairly easy because it doesn't do all that much. If you already understand asymmetric cryptography, it's that, applied between STDIN and STDOUT. It applies this crypto codec using git attributes. This results in your git repo only storing credentials encrypted, but you and your trusted contributors can change, diff, search, blame, and use the secret files as you would any other plain text file in the repo.
Git Attributes
Git allows you to assign attributes to files. The primitive form you probably already know of is the .gitignore file. In those files, you define globbing patterns, one per line, that match the files you want git to ignore. While the ignore system is not the same as the attributes system, it is closely enough related to serve as a familiar introduction. You could say that file applies the "ignore" attribute to matched files. When a file has that attribute, git subsystems like the index, rebase, and status all modify their behaviour. They check the list of attributes for each file, note if it's ignored and, for example, refuse to add the file to the index unless you've forced the operation (git add -f ignored.file).
Ignoring files and directories is a fairly common desire. It existed before attributes and thus got its own file. Later the ability to add many more attributes to a file was needed, and so the .gitattributes file was defined. In that file, just like in .gitignore, attributes are applied by defining globbing rules, one per line. The difference here, though, is that no attribute is assumed for matches. Instead, after the glob you need to list the attributes you want to apply to the file(s) it matches.
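To make the rule format concrete, here is a rough Python sketch of how such a file could be resolved into attributes for a path. It is a simplification under stated assumptions, not git's implementation: it only handles patterns without slashes (matched against the file name with fnmatch), and the function names are made up for illustration. It does follow two documented behaviours: a later matching line overrides an earlier one per attribute, and !attr resets an attribute to unspecified.

```python
from fnmatch import fnmatchcase
from posixpath import basename


def parse_attributes(text):
    """Parse .gitattributes-style text into a list of (pattern, attrs) rules."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *tokens = line.split()
        attrs = {}
        for token in tokens:
            if token.startswith("!"):
                attrs[token[1:]] = None      # back to "unspecified"
            elif token.startswith("-"):
                attrs[token[1:]] = False     # explicitly unset
            elif "=" in token:
                name, value = token.split("=", 1)
                attrs[name] = value          # set to a value, e.g. a driver name
            else:
                attrs[token] = True          # plain "set"
        rules.append((pattern, attrs))
    return rules


def attributes_for(path, rules):
    """Apply every matching rule in order; later lines win per attribute."""
    resolved = {}
    for pattern, attrs in rules:
        # Simplification: slash-free patterns are compared to the file name only.
        if fnmatchcase(basename(path), pattern):
            for name, value in attrs.items():
                if value is None:
                    resolved.pop(name, None)
                else:
                    resolved[name] = value
    return resolved


RULES = parse_attributes("""
# hypothetical rules for illustration
* filter=git-crypt diff=git-crypt
.git* !filter !diff
*.md text
""")
```

With these rules, attributes_for("secrets/credentials-dev.json", RULES) yields the filter and diff drivers, while a path ending in .gitattributes matches the second line as well and ends up with no attributes at all.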
While there are many attributes you can apply, the two that we're concerned with are the filter and diff attributes. These attributes are each set to the name of a driver. Drivers are just what git calls named configs that define external programs to handle operations on the files. For example, a diff driver is a program that you want to handle calculating and presenting a diff between two versions of a file while a filter driver is defined by both a clean and smudge command to translate a file between your working directory and the repository's index.
The clean command of a filter driver is a program that is passed the contents of the file as STDIN before the value of its STDOUT is added to the repo's index in place of the file's contents. Using a clean command a file can be modified transparently before it's added to the index (and subsequently committed). The smudge command works the same way but in reverse where the STDIN is passed the contents of the file in the index and the STDOUT is put into the file in the working directory. To manually add it you'd define this filter in the repo or global git config using an entry like:
[filter "git-crypt"]
	smudge = git crypt smudge
	clean = git crypt clean
	required = true
Instead of having to define this ourselves however, git-crypt will automatically add this to the
repo's config (.git/config) when we unlock the repo and remove it when we lock the
repo. It's not perfect, but it keeps the setup simpler for most people. For example, it opts to hard
code the path of the git-crypt binary, but that only becomes a problem if you routinely change the
install location of the git-crypt binary (like I do for reasons…).
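The clean/smudge contract is small enough to sketch in a few lines of Python. The following is a toy, not git-crypt's algorithm or file format: the MAGIC marker, the function names, and the XOR "cipher" are all assumptions made for illustration, and the scheme is not secure. What it does show is the shape of a filter driver: bytes in, bytes out, reversible, and deterministic. Determinism matters because git compares the clean output with what is in the index, so a clean command that produced different bytes on every run would make untouched files look modified.

```python
import hashlib
import io

MAGIC = b"\0TOYFILTER\0"  # hypothetical marker, not git-crypt's real header


def keystream(key, length):
    """Derive `length` bytes from `key` by hashing a counter (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_with_key(data, key):
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def clean(plain, key):
    """Working directory -> index: encode unless already encoded."""
    if plain.startswith(MAGIC):
        return plain
    return MAGIC + xor_with_key(plain, key)


def smudge(stored, key):
    """Index -> working directory: decode, passing unencoded data through."""
    if not stored.startswith(MAGIC):
        return stored
    return xor_with_key(stored[len(MAGIC):], key)


def run(mode, key, stdin, stdout):
    """Filter entry point: read everything from stdin, write the result to stdout."""
    data = stdin.read()
    stdout.write(clean(data, key) if mode == "clean" else smudge(data, key))


# Demonstration with in-memory streams standing in for git's pipes.
KEY = b"demo key"
index_side = io.BytesIO()
run("clean", KEY, io.BytesIO(b"password=hunter2\n"), index_side)
worktree_side = io.BytesIO()
run("smudge", KEY, io.BytesIO(index_side.getvalue()), worktree_side)
```

Calling run with sys.argv[1], sys.stdin.buffer, and sys.stdout.buffer from a script would turn this into something you could name in a filter's clean and smudge entries.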
Basic Workflow
Putting it together we can examine an example. Let's say we have a directory of secret files called
secrets/ in our repo. We can add a .gitattributes file to that directory
which contains the following to transparently encrypt any file we put inside.
/** filter=git-crypt diff=git-crypt
.git* !filter !diff
Now run git crypt init and add your files to the directory. You can double check they'll be encrypted when you add them to the index by running git crypt status secrets/. You should see something like:
not encrypted: secrets/.gitattributes
    encrypted: secrets/credentials-dev.json
    encrypted: secrets/credentials-prod.json
    encrypted: secrets/credentials-stage.json
Note that you should only type git crypt init if you're setting up git-crypt on that repo for the first time. Otherwise, stick to git crypt unlock.
By the way, how does git-crypt work when calling git crypt? Well, the git command was originally just a program for calling a large number of git-{subcommand} utilities. Since then, to improve performance, most of these tools have been merged into a single binary. Git, however, still maintains this calling convention, just as it maintains the .gitignore instead of replacing it with the .gitattributes. Backward compatibility is important. Just because you changed your mind doesn't mean you get to break things for everyone else. So if you call a subcommand git doesn't know about, it'll search your PATH for any executables that match git-{subcommand}.
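That lookup can be approximated in Python with shutil.which. This is only a sketch of the PATH search described above, assuming a POSIX system; the function name and the fake git-hello script are invented for the demonstration, and git's real resolution has more steps than this.

```python
import os
import shutil
import stat
import tempfile


def find_external_subcommand(name, search_path=None):
    """Return the path of a `git-<name>` executable, or None if there is none.

    search_path is an os.pathsep-separated string like $PATH; None means
    "use the PATH of the current environment".
    """
    return shutil.which(f"git-{name}", path=search_path)


# Demonstration: a throwaway directory holding a fake `git-hello` script.
with tempfile.TemporaryDirectory() as bindir:
    script = os.path.join(bindir, "git-hello")
    with open(script, "w") as fh:
        fh.write("#!/bin/sh\necho hello\n")
    # Mark the script as executable so the lookup accepts it.
    os.chmod(script, os.stat(script).st_mode | stat.S_IXUSR)
    found = find_external_subcommand("hello", search_path=bindir)
    missing = find_external_subcommand("goodbye", search_path=bindir)
```

Here found points at the fake script and missing is None, which is the same distinction git makes between running your executable and reporting an unknown command.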
Cryptography
You've now prevented the history from containing secrets in plain text, but you've still got a key management problem. How do you make sure trusted parties can decrypt the secrets? Well, when you ran git crypt init, this told git-crypt to generate a secret key for the repository. This key is stored locally in the .git/git-crypt/keys/default file.
This file isn't shared between git repos when you push somewhere, create a bundle, or when someone pulls from you, though. It's also only a symmetric key allowing you to encrypt and decrypt the files. We still have to use something like GnuPG to encrypt that key and share it between trusted parties. I'll cover setting up and using GnuPG in a later section because it's complicated enough to require a 200-page book. For now I'll just assume you have your own key pair generated and the public keys of everyone you want to entrust the secrets to.
To add others to the repo you use the command git crypt add-gpg-user USERID…. Here USERID is usually just part or all of the person's email address, just enough to be unique among all the public keys you have on your keyring. The ellipsis indicates that you can add multiple people with the same command. This is often preferred as each invocation creates a new commit. This command causes git-crypt to:
- Duplicate the secret key (.git/git-crypt/keys/default) for each person.
- Encrypt a duplicate individually for each public key so every duplicate can be decrypted by only one person.
- Commit those now individually encrypted keys into the repo in the .git-crypt/keys/default/0/ directory using the person's key identity as the filename.
Now when anyone clones a copy of the repo they don't get a copy of the .git/git-crypt/ directory, but they do get a copy of the .git-crypt/ directory containing each individually encrypted copy of the shared secret key. They can then "unlock" the repo by grabbing their copy of the secret key from the .git-crypt/ directory, decrypting it using their private key, and placing it at .git/git-crypt/keys/default. From then on they encrypt and decrypt transparently as though they were the ones to initialize the repo.
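The whole arrangement is a form of key wrapping: one symmetric repo key, one encrypted copy per recipient. Below is a minimal Python sketch of that bookkeeping. The asymmetric step is deliberately a stand-in: fake_encrypt and fake_decrypt are not encryption at all, they only mimic "a blob that one specific private key can open", where git-crypt hands the real work to GnuPG. The fingerprints, key strings, and the 32-byte key size are placeholders chosen for the example.

```python
import os

KEY_DIR = ".git-crypt/keys/default/0"  # layout as described in this post


def wrap_repo_key(repo_key, recipients, encrypt_for):
    """Return {repo path: wrapped key} with one entry per recipient.

    recipients maps a key fingerprint to that person's public key.
    encrypt_for(public_key, data) is a placeholder for asymmetric encryption.
    """
    return {
        f"{KEY_DIR}/{fingerprint}.gpg": encrypt_for(public_key, repo_key)
        for fingerprint, public_key in recipients.items()
    }


def unlock(committed_files, fingerprint, private_key, decrypt_with):
    """Recover the repo key from the copy wrapped for `fingerprint`."""
    blob = committed_files[f"{KEY_DIR}/{fingerprint}.gpg"]
    return decrypt_with(private_key, blob)


# Stand-ins for GnuPG. These provide NO secrecy; they only model who can open what.
def fake_encrypt(public_key, data):
    return ("sealed-for", public_key, data)


def fake_decrypt(private_key, blob):
    _, public_key, data = blob
    if private_key != "private-half-of:" + public_key:
        raise PermissionError("this private key cannot open that blob")
    return data


repo_key = os.urandom(32)  # arbitrary size for the demo
recipients = {"FPR_ALICE": "alice-pub", "FPR_BOB": "bob-pub"}
committed = wrap_repo_key(repo_key, recipients, fake_encrypt)
alice_copy = unlock(committed, "FPR_ALICE", "private-half-of:alice-pub", fake_decrypt)
```

Every recipient ends up with the same repo_key, which is why the files themselves only need to be encrypted once no matter how many people are added.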
The one trick here is to add yourself to the repo's recipients of the secret key. Unless you do this, you'll not have a full backup of the code if something goes wrong. For example, if you git crypt lock the repo, you'll be locked out. By encrypting a copy of the secret key using your own public key you can later unlock it again.
You can see an example repo below taken using the tree -a command. Note the difference
between the .git-crypt/ and .git/git-crypt/ directories. You can tell this
repo has been unlocked because the .git/git-crypt/ directory has a default
key file. Without looking in the .gitattributes file you can't actually tell what's
being encrypted, but it's likely the contents of the secrets/ directory.
.
├── .git
│   ├── HEAD
│   ├── config
│   ├── description
│   ├── git-crypt
│   │   └── keys
│   │       └── default
│   ├── index
│   └── objects
│       └── …
├── .git-crypt
│   ├── .gitattributes
│   └── keys
│       └── default
│           └── 0
│               ├── 181955EB0840490DA40FB1449E16EE490B769571.gpg
│               ├── 36FAF719DFE5AB5D9CDCFC7155A6BA8AAAE5748C.gpg
│               ├── 982B95F2E5A7E90BE6EE637B84CDC98C0274138B.gpg
│               ├── 9E432661B250A062A765CA20B8CFD816649F1AC2.gpg
│               ├── B5F800DA79611F95C451068C50CB33258CB577CD.gpg
│               ├── C16CF91C62EA8742EA72CBA37424F75F55549354.gpg
│               ├── CC6A21398BB05FE3AF9C582FE60B1F98FFF63E4D.gpg
│               ├── CCE6AB786E43117EDE78CC9ACDD69AD8B94C7598.gpg
│               ├── E710C381FC6D8167180E4734D3319873E77EB946.gpg
│               └── EF9A33175C92BF93D59A2D374F5DB36943E2AC3C.gpg
├── .gitignore
├── Makefile
├── README
├── secrets
│   ├── .gitattributes
│   └── credentials.txt
├── src
│   └── …
└── test
    └── …
Design Complexity
I mentioned earlier that I have design issues with git-crypt. I noted the hard-coded path and the issues it can cause (though not for most people). However, there are more things you may notice about the tool when you use it long enough with enough people. Broadly they fall into two major categories: GnuPG and not GnuPG.
GnuPG
GnuPG is by far one of the most overly complicated tools available to do what it does. Most of its fault is PGP's, but it has many of its own unique design sins that only make things worse. I won't cover what's the matter with PGP because others have written much more eloquently on the topic. What I can say is we're finally starting to see alternatives emerging for encrypting shared files at rest. I'll talk more on this later in the post, but for now let's cover how to get GnuPG working so that git-crypt works.
The short version (replacing EMAIL with the untrusted person's email):
- Untrusted person: gpg --generate-key
- Untrusted person: gpg --export --armor EMAIL > EMAIL.gpg
- Untrusted person: Get EMAIL.gpg to someone trusted
- Trusted person: gpg --import EMAIL.gpg
- Trusted person: gpg --sign-key EMAIL
- Trusted person: git crypt add-gpg-user EMAIL
- Trusted person: git push (or other repo sync method)
- Untrusted person: git pull (or other repo sync method)
- Untrusted person: git crypt unlock
- Untrusted person is now trusted
If you really want to learn GnuPG, there's the Arch Linux Wiki or the official GnuPG manual. In future I may cover how I use GnuPG because you can use it for password management, commit signing, SSH authentication, and secure communication. For this post, I'm trying to give just enough knowledge of GnuPG to start managing secret credentials with it.
From experience I can say this is one of the big reasons not to go with git-crypt. Others you work with will complain about the complexity of having to learn GnuPG before they can contribute. The nice thing is once you've gone through this and become a trusted member of a repo, you don't need to deal with GnuPG. It's only there to share the repo secret key.
Not GnuPG
With the pain of GnuPG out of the way, the only thing left to complain about is git-crypt itself. The first complaint is how complicated the directory it maintains is. It doesn't need to be .git-crypt/keys/default/0/. It's so deep and nested that it makes a lot of things feel complex and messy when reviewing the repo file tree. It seems to exist because at one point git-crypt was going to have all kinds of rotating multi-key complexity that it turns out almost nobody needs. Even storing all the keys individually bugs me. To simplify, it should be a single .git-crypt-key file, with the decrypted secret stored at .git/git-crypt-secret.
If we make that change, the biggest issue comes into focus. git-crypt has no way to remove users once you've added them. To remove someone you have to:
- ls .git-crypt/keys/default/0/ | cut -d. -f1 > key_ids.txt
- gpg --list-keys --keyid-format long and modify key_ids.txt to contain only the keys you want to have access going forward. If you don't have everyone's public key you've got to start asking around.
- git rm -r .git-crypt
- rm -r .git/git-crypt
- git crypt init
- xargs git crypt add-gpg-user < key_ids.txt
- git push (or other repo sync method)
You might think you can just remove their copy of the encrypted secret key from the repo in .git-crypt, but remember that the copy they already decrypted can still be used to decrypt the secrets going forward because you haven't changed the secret key (everyone has a copy of that same symmetric key). Even after rotating the repo key as described above, remember they already have a copy of the secrets, so you still need to rotate the secrets themselves (you'd need to even without git-crypt).
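The first two steps of that procedure, listing the current key IDs and dropping the ones you no longer trust, can be sketched in Python as well. This is only a convenience under assumptions: remaining_key_ids is a made-up helper, and the fingerprints in the demonstration are placeholders rather than real keys. It does not touch GnuPG or git-crypt; it just computes what would go into key_ids.txt.

```python
import os
import tempfile


def remaining_key_ids(key_dir, revoked):
    """List the key IDs stored under key_dir, minus the revoked ones.

    Each recipient is a file named <KEY_ID>.gpg, so the ID is the file
    name without its extension.
    """
    revoked = set(revoked)
    ids = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(key_dir)
        if name.endswith(".gpg")
    )
    return [key_id for key_id in ids if key_id not in revoked]


# Demonstration against a throwaway directory with placeholder fingerprints.
with tempfile.TemporaryDirectory() as key_dir:
    for placeholder in ("AAAA1111", "BBBB2222", "CCCC3333"):
        open(os.path.join(key_dir, placeholder + ".gpg"), "wb").close()
    keep = remaining_key_ids(key_dir, revoked=["BBBB2222"])
```

Writing "\n".join(keep) to key_ids.txt gives you the input for the xargs step, after which the old keys still have to be removed and the repo re-initialized as listed above.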
My last nitpick is that you can't run git crypt --help like you can for every other subcommand because git-crypt doesn't ship with a man page. Instead, it bundles its own help system based on git crypt help [COMMAND]. We could also argue about things like the UI, but that's more petty still, and it's not a good idea to change because it would break backward compatibility. To do better, it's probably best to build something different rather than another GnuPG-based tool.
Alternatives to PGP
I said at the start I'd cover alternatives to PGP and thus GnuPG. Since 1991 there's really only been PGP to cover the situation where you have files on persistent storage that you need to let multiple people have access to. Sure, the odd hobby project has existed, but nothing widely supported, simple, robust, and open source. That's not really the case anymore thanks to age.
It's called age, which might be an acronym for Actually Good Encryption, and it’s pronounced like the Japanese 上げ (with a hard g).
Written by Filippo Valsorda, the tool was designed from an implementation specification. At present there are two different implementations from two different developers (one in Go, another in Rust). The design is cryptographically sound and the tool's UI is simple:
Usage:
	age -r RECIPIENT [-a] [-o OUTPUT] [INPUT]
	age --decrypt [-i KEY] [-o OUTPUT] [INPUT]
What's powerful though is keys for age can be SSH keys. Many developers already have SSH keys. Not only that, you can use Daniel Bernstein's Ed25519 keys which end up being so short and simple you can paste them into a text message.
AAAAC3NzaC1lZDI1NTE5AAAAIApTqOlFanLv7tFAzrh7+FFJpaNx0RSbZCHAZUcXilB5
There's no key server infrastructure, no support for negotiating key protocols, no ability to put JPEGs in your key, no web of trust, no subkeys, no revocation certificates, no signing, no key ring management, no configuration. Everyone gets a key pair, you exchange public keys, you encrypt data for recipients.
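To see why an Ed25519 key like the one above is so compact, you can unpack the base64 blob yourself. SSH public keys are a sequence of length-prefixed fields (a 4-byte big-endian length followed by that many bytes), so a small Python sketch is enough to split them apart. The function name is my own; nothing here is part of age.

```python
import base64
import struct


def parse_ssh_public_blob(blob_b64):
    """Split the base64 part of an SSH public key into its raw fields."""
    raw = base64.b64decode(blob_b64)
    fields = []
    offset = 0
    while offset < len(raw):
        # Each field starts with its length as a 4-byte big-endian integer.
        (length,) = struct.unpack_from(">I", raw, offset)
        offset += 4
        fields.append(raw[offset:offset + length])
        offset += length
    return fields


algorithm, public_key = parse_ssh_public_blob(
    "AAAAC3NzaC1lZDI1NTE5AAAAIApTqOlFanLv7tFAzrh7+FFJpaNx0RSbZCHAZUcXilB5"
)
```

The blob holds just two fields: the algorithm name ssh-ed25519 and a 32-byte public key. Everything else in the string is length bookkeeping and base64 overhead.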
At one point before I'd discovered git-crypt I'd written my own version of it. The availability of the filter and diff attributes makes writing a driver application fairly simple. While I gave it up to use something else maintained and "off the shelf," I am from time to time curious how Suckless you could make decentralized credential management.