About Me

A bit about me

Hello, I'm Alexis Lowe, a principal engineer for a leading e-commerce business.

Formerly a cyber security expert, I now thrive in platform engineering. Outside of work, I indulge in my passions for travel, amateur photography, and gaming. Above all, I'm a devoted computer geek and enthusiast.

Join me on this blog as we explore the ever-evolving world of technology together.

AI and how I use it to help with my writing

Something about me that I don’t always share publicly is that I have dyslexia. It specifically affects my ability to write long texts such as emails, documentation, feedback, blogs, cover letters, etc. Bizarrely, writing code has never felt hard, so perhaps there is something else going on. Basically, when I have to write a lot I find it incredibly hard to get what’s in my mind into text. It’s either full of mistakes (spelling or grammar) or so involved that I lose interest and procrastinate.

Recently, though, I have discovered that I can leverage tools such as Notion’s AI or ChatGPT. What I end up doing is writing an outline of what I want to convey and then asking the AI to rewrite it. This has been amazing for helping me write large amounts of text. There are some downsides, though: the output of these tools usually sounds grandiose or pompous, so I spend quite a bit of time prompting the AI to either shorten the text or simplify the words used. I found that the more context you give it, for example “I am writing an email for x from myself, a x …”, the more appropriate the generated text is.

One more thing I’ve found is that the AI needs very little to generate the text, to the point that a few bullet points are enough to generate a page of feedback, for example. I wrote down all the points I wanted to mention in the feedback and the AI generated a nice feedback email with the correct tone and embellishments.

Asking the AI to write cover letters has also been eye-opening. The amount of information about a company that it is capable of picking up and introducing into the letter is crazy, to the point that even though I had researched the company before writing the cover letter, I was learning new things from the generated text.

To conclude, I was quite skeptical about Generative AI but I am now convinced that this is game changing. We just need slightly better tooling. Perhaps Microsoft should bring back Clippy with a GPT-4 backend.

Last updated 2023-07-04

Hashicorp Vault and docker-compose

Intro

Hello everyone,

This time I wanted to cover how I use Hashicorp's Vault to manage secrets used by docker-compose.

I've been using docker-compose to deploy the services I run on my home servers (I have 2 machines that host the services and Kubernetes was overkill) for a bit over 4 years now. The overall setup has served me well, being simple and straightforward when deploying new services or updating existing ones. All the compose files are stored in a git repo. The structure of the repo allows me to define "services", which are individual docker-compose.yml files that each define a set of containers which together give me a service I want to host at home.

I control variables that are shared between these services but change based on the machine hosting them (usually just the domain name changes) via {{ hostname }}.env files. This has been working for me, though one major downside is that the .env file can't be committed to git because it contains secrets such as API keys.

This is where I've been leveraging vault and specifically vault agent to template the .env file so I can push the .env template but not the secrets themselves.

Vault agent can template a file using Go template syntax, generating the output file with data from Vault.

To do this we need a few things. First, you need a running Vault instance. I would recommend following the great docs from Hashicorp, which you can find here.

Vault Setup

I have it set up as a service defined in docker-compose. A really simplistic example of the docker-compose.yml file:

version: '3.8'
services:
  vault:
    build: ./vault
    command:
      - server
    cap_add:
      - IPC_LOCK
    ports:
      - 8200:8200
    volumes:
      - /path/to/where/you/want/to/save/your/vault/data:/vault/data
    restart: always

With ./vault containing the following:

Dockerfile:

FROM hashicorp/vault:latest
COPY config.hcl /vault/config/config.hcl

config.hcl:

storage "file" {  
    path = "/vault/data"
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = "true"
}

api_addr = "http://localhost:8200"
ui = true

Using Vault for storing secrets

Now that we have Vault running we can create secrets. To do this we need the CLI tool (you can do it via the web UI, but I would recommend getting comfortable with the CLI tool).

Creating a secret:

vault kv put kv/services/example apikey="super_secret_api_key"
# I would recommend prefixing the command with a space;
# bash will not save it to your history if HISTCONTROL includes ignorespace or ignoreboth

Once we have a secret created we can use it with the Vault agent. We first need to create an agent-config.hcl in which we define the files we want to template:

auto_auth {
   method {
      type = "token_file"

      config { 
        #Make sure to update this to the path of your home directory
        token_file_path = "/home/username/.vault-token"
      }
   }
}

vault {
  #Update this with the address of your vault instance
  address = "http://localhost:8200"
  retry {
    num_retries = 5
  }
}

# Forces agent to close after generating the files
exit_after_auth = true

template {
  source = "example.env.ctmpl"
  destination = "example.env"
}

Next you need to define the template file example.env.ctmpl:

MY_NON_TEMPLATED_VAR=BLAH

{{ with secret "kv/services/example" }}
MY_SECRET_API_KEY={{ .Data.data.apikey }}
{{ end }}

This will fetch the services/example secret from the kv engine and write the value of the key apikey.

This generates a file that looks like this:

MY_NON_TEMPLATED_VAR=BLAH
MY_SECRET_API_KEY=super_secret_api_key

docker-compose can now refer to that file, making the secret available to the containers.
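Before pointing docker-compose at the generated file, it can be worth checking that the template really rendered. The following is a minimal sketch in Python, not part of my actual setup: the helper names parse_env and find_problems are hypothetical, and it assumes the simple KEY=VALUE format shown above (no quoting or multi-line values).

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {line!r}")
        values[key.strip()] = value.strip()
    return values


def find_problems(text):
    """Return a list of human-readable problems found in a rendered .env file."""
    problems = []
    # Leftover Go template markers mean the agent never rendered this file.
    if "{{" in text or "}}" in text:
        problems.append("unrendered template markers found")
    try:
        values = parse_env(text)
    except ValueError as exc:
        return problems + [str(exc)]
    # An empty value usually means the secret or key was missing in Vault.
    problems += [f"{key} is empty" for key, value in values.items() if not value]
    return problems


if __name__ == "__main__":
    rendered = "MY_NON_TEMPLATED_VAR=BLAH\nMY_SECRET_API_KEY=super_secret_api_key\n"
    print(find_problems(rendered))  # an empty list means nothing suspicious was found
```

The check only reports key names, never values, so its output is safe to paste into a terminal log.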

Conclusion

As you can see, with Hashicorp Vault it's possible to generate .env files which can be used by your apps or, in this case, by docker-compose.

Last updated 2023-07-05

Hashicorp Vault and direnv automating env secrets

In my last post I covered how I generate .env files using Vault agent. A few weeks later I discovered that you can leverage Hashicorp Vault and direnv to automatically fetch secrets and make them available in your shell's env when you move into a directory containing a .envrc. With this I can set up git repos for colleagues where they can run things like tests locally without having to manually fetch secrets from Vault or our organisation's password manager.

So to set it up you need to install direnv and have your Vault access set up. You can find the direnv instructions here and the Vault instructions here.

You can now create a secret in Vault, for example:

vault kv put -mount=secret test test-key=yourkey
# Don't forget you can put a space in front of this command so bash won't save it to your history (needs HISTCONTROL to include ignorespace or ignoreboth)
# You can also read from stdin or use the web console
cat secret | vault kv put -mount=secret test test-key=-

Now you can create a .envrc file in your project directory that exports a named variable populated by a vault kv get:

export MYSECRET=$(vault kv get -mount=secret -field=test-key test)

Now, if you have installed direnv and hooked it into your shell (for bash: eval "$(direnv hook bash)"), direnv will load that .envrc file into your shell's env when you move into that directory. N.B.: Make sure you have run direnv allow . in the directory, otherwise direnv will not load the .envrc file.
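Since the goal is for colleagues to run things like tests locally, the code that consumes the variable can fail early with a useful hint when direnv has not loaded it. Here is a minimal sketch in Python; require_env is a hypothetical helper, and MYSECRET is simply the variable name from the .envrc above. An empty value is treated as missing because a failed vault kv get inside the .envrc would still export an empty variable.

```python
import os


def require_env(name, environ=os.environ):
    """Return the value of an environment variable or fail with a helpful hint."""
    value = environ.get(name, "")
    if not value:
        raise RuntimeError(
            f"{name} is not set; run `direnv allow .` in the project directory "
            "and check that you are logged in to Vault"
        )
    return value


if __name__ == "__main__":
    # A plain dict stands in for the real environment so the sketch runs anywhere.
    fake_env = {"MYSECRET": "yourkey"}
    secret = require_env("MYSECRET", fake_env)
    print(f"MYSECRET loaded ({len(secret)} characters)")  # never print the secret itself
```

In a real test suite you would call require_env("MYSECRET") without the second argument so it reads the environment direnv prepared.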

Last updated 2023-07-13

BTRFS Metadata and No space left errors

Before we start I wanted to give a bit of context about my data storage strategy. My data falls into two categories: important data and reproducible data (which can be easily recreated or retrieved).

I follow these rules as best practice:

  1. Important data must have 3 copies:
    • Local Network accessible copy
    • Local copy on cold storage
    • Offsite copy on cold storage
  2. Data integrity for important data is crucial
    • Using squashfs and then creating parity data of the archives using par2 to mitigate bit-rot
  3. Full data integrity of reproducible data isn't important
    • I can accept bit-rot but not losing access to the files
    • Knowing a file is corrupt is important so that I can recreate or retrieve it.
  4. Local network data access must be fast and low latency
  5. Must not break the bank

Local data server setup

BTRFS was the filesystem that best fit the bill.

I created a storage pool consisting of two 4TB hard drives and two 8TB hard drives. The data is configured with the RAID0 profile and the metadata with RAID1C4. This lets data benefit from the bandwidth of all 4 drives and fill all the space on the drives (no storage loss). Keeping a copy of the metadata on every drive also protects it against corruption, making it reliable for detecting bit-rot in my data. In addition to this configuration I made sure that each disk has 100GiB of slack (a section of the disk that BTRFS will not use).
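To illustrate why RAID0 across drives of different sizes loses no space, here is a rough Python sketch of the allocation pattern. It is a simplified model, not how BTRFS is implemented: it assumes each round stripes across every device that still has free space, that at least 2 devices are needed for a stripe, and it ignores metadata. The function name raid0_capacity and the GiB figures (drive sizes rounded, minus the 100GiB of slack) are my own approximations.

```python
def raid0_capacity(usable_gib, min_devices=2):
    """Estimate RAID0 data capacity (GiB) for devices of unequal size.

    Simplified model: each round stripes across every device that still has
    free space, until the smallest of them is full.
    """
    free = sorted(size for size in usable_gib if size > 0)
    total = 0
    while len(free) >= min_devices:
        smallest = free[0]
        # Every remaining device contributes `smallest` GiB to this round.
        total += smallest * len(free)
        # Devices that are now full drop out; the rest keep their remainder.
        free = [size - smallest for size in free if size > smallest]
    return total


if __name__ == "__main__":
    # Approximate usable sizes: two 4TB drives and two 8TB drives, less slack.
    pool = [3627, 3627, 7355, 7355]
    print(f"{raid0_capacity(pool)} GiB usable for data")
```

The first round is a 4-wide stripe that fills the small drives, and the second is a 2-wide stripe over what is left on the large ones.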

This setup has worked for me for over 6 years. Technically I started with just the 4TB drives, so BTRFS allowed me to grow my storage pool without any hiccups.

Recently I ran into a problem: all of a sudden my system put the storage pool into read-only mode and claimed it had run out of space.

Diagnosis

When checking dmesg, BTRFS kindly printed out exactly what had happened. The data section still had free space and so did system; however, metadata had run out of space!

To confirm this I ran sudo btrfs device usage /pool, which shows the current disk usage per device in the pool:

$ btrfs device usage /pool
/dev/sdc, ID: 1
   Device size:             3.64TiB
   Device slack:          100.00GiB
   Data,RAID0/4:            3.52TiB
   Metadata,RAID1C4:       17.03GiB
   System,RAID1C4:         32.00MiB
   Unallocated:             1.02MiB

/dev/sda, ID: 2
   Device size:             3.64TiB
   Device slack:          100.00GiB
   Data,RAID0/4:            3.52TiB
   Metadata,RAID1C4:       17.03GiB
   System,RAID1C4:         32.00MiB
   Unallocated:             1.02MiB

/dev/sdb, ID: 3
   Device size:             7.28TiB
   Device slack:          100.00GiB
   Data,RAID0/4:            3.52TiB
   Data,RAID0/2:          195.00GiB
   Metadata,RAID1C4:       17.03GiB
   System,RAID1C4:         32.00MiB
   Unallocated:             3.45TiB

/dev/sdf, ID: 4
   Device size:             7.28TiB
   Device slack:          100.00GiB
   Data,RAID0/4:            3.52TiB
   Data,RAID0/2:          195.00GiB
   Metadata,RAID1C4:       17.03GiB
   System,RAID1C4:         32.00MiB
   Unallocated:             3.45TiB

As you can see, the disks sdc and sda are full, with only 1.02MiB left unallocated. However, sdf and sdb still have plenty of space, with 3.45TiB unallocated.

To help me better understand the state of the disks I drew a diagram.

[Diagram: raid1-c4]

The first thing that stood out was the behavior of the RAID1C4 profile for the metadata. It forces BTRFS to keep 4 copies of the metadata, one on each disk, so a new metadata chunk can only be allocated when all 4 disks have unallocated space. With sdc and sda full, that was no longer possible. So when I tried to write new data to the storage pool BTRFS failed and, in order to protect the data and the storage pool, set the pool to read-only.
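The same check can be sketched in a few lines of Python that read the output of btrfs device usage. This is an illustration under assumptions, not a tool I used: the function names are hypothetical, the 1GiB chunk size is an assumption based on the free space the fix needed, and the parser only understands the output layout shown above.

```python
import re

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
COPIES = {"raid1": 2, "raid1c3": 3, "raid1c4": 4}  # devices needed per metadata chunk

# Abridged from the `btrfs device usage /pool` output above.
SAMPLE = """\
/dev/sdc, ID: 1
   Unallocated:             1.02MiB

/dev/sda, ID: 2
   Unallocated:             1.02MiB

/dev/sdb, ID: 3
   Unallocated:             3.45TiB

/dev/sdf, ID: 4
   Unallocated:             3.45TiB
"""


def unallocated_bytes(usage_text):
    """Map each device path to its 'Unallocated' size in bytes."""
    result = {}
    device = None
    for line in usage_text.splitlines():
        header = re.match(r"\s*(/dev/\S+), ID:", line)
        if header:
            device = header.group(1)
            continue
        field = re.match(r"\s*Unallocated:\s+([\d.]+)([KMGT]iB|B)\s*$", line)
        if field and device is not None:
            result[device] = int(float(field.group(1)) * UNITS[field.group(2)])
    return result


def can_allocate_metadata(usage_text, profile, chunk_bytes=1024**3):
    """True if enough devices have room for one more metadata chunk."""
    free = unallocated_bytes(usage_text)
    roomy = [dev for dev, size in free.items() if size >= chunk_bytes]
    return len(roomy) >= COPIES[profile]


if __name__ == "__main__":
    for profile in ("raid1c4", "raid1"):
        print(profile, can_allocate_metadata(SAMPLE, profile))
```

With the sample above only two devices have room for a chunk, which is not enough for RAID1C4 but is enough for RAID1.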

The fix

Fixing the issue was quite straightforward, but it required at least 1GiB of free space on each disk in the pool.

I used 10GiB from the slack section (the extra 100GiB of unused disk space) to resize the disks sdc and sda (device IDs 1 and 2) using the following commands:

sudo btrfs filesystem resize 1:+10G /pool
sudo btrfs filesystem resize 2:+10G /pool

Caution: this requires the pool to be mounted read-write, so you might have to unmount the pool and remount it. See the gotcha in the Conclusion.

This provides some extra space that BTRFS can use to move chunks around when it converts the metadata profile from RAID1C4 to RAID1. RAID1 guarantees that the metadata is stored on 2 disks instead of 4, removing our deadlock.

The command to convert the metadata profile is the following:

sudo btrfs balance start -mconvert=raid1 /pool

This will kick off a balance operation.

Once it has finished, running sudo btrfs device usage /pool shows what changed:

$ btrfs device usage /pool
/dev/sdc, ID: 1
   Device size:             3.64TiB
   Device slack:           90.00GiB
   Data,RAID0/4:            3.52TiB
   Unallocated:            27.06GiB

/dev/sda, ID: 2
   Device size:             3.64TiB
   Device slack:           90.00GiB
   Data,RAID0/4:            3.52TiB
   Unallocated:            27.06GiB

/dev/sdb, ID: 3
   Device size:             7.28TiB
   Device slack:          100.00GiB
   Data,RAID0/4:            3.52TiB
   Data,RAID0/2:          195.00GiB
   Metadata,RAID1:         17.00GiB
   System,RAID1:           32.00MiB
   Unallocated:             3.45TiB

/dev/sdf, ID: 4
   Device size:             7.28TiB
   Device slack:          100.00GiB
   Data,RAID0/4:            3.52TiB
   Data,RAID0/2:          195.00GiB
   Metadata,RAID1:         17.00GiB
   System,RAID1:           32.00MiB
   Unallocated:             3.45TiB

Again, to better visualize, this diagram represents the state after the balance operation:

[Diagram: raid1]

As you can see BTRFS removed the redundant copies of the metadata from the smaller disks and preserved it on the larger ones.

To clean up and reclaim the slack I ran the following commands:

sudo btrfs filesystem resize 1:-10G /pool
sudo btrfs filesystem resize 2:-10G /pool

Conclusion

BTRFS is incredibly powerful and super configurable; so configurable that, like me, you can easily set up a foot gun. But it also has all the tools needed to diagnose and fix it.

The main thing that saved me was the slack space, as without it I would not have been able to run a balance. I recommend this as a best practice for anyone running BTRFS.

Another gotcha to be aware of: if the pool has less than 1GiB of free space and you try to run a balance, the balance will fill the remaining space in the pool and cause it to switch to read-only mode.

It is impossible to cancel a balance once the pool gets into read-only mode. The only way to stop it is to reboot and make sure not to mount the pool on boot. Once booted, you can mount it with the skip_balance option (sudo mount -o skip_balance /dev/sdc /pool) which will set the balance operation to paused. Use sudo btrfs balance cancel /pool to cancel it and proceed with resizing the pool.

Last updated 2023-11-02