Downloading Albums from YouTube

30 March 2023

So, for the geekier people out there, there’s a really neat tool to download YouTube videos and do a whole bunch of fancy formatting from the command line: youtube-dl. It does, however, tend to break every now and then and is slow to update. In comes yt-dlp, a fork that gets more frequent updates, does some fancy User-Agent stuff to avoid getting throttled, and probably has some other features over its predecessor.

One neat function these tools have is to wrangle downloaded videos into mp3s, even adding whatever metadata they can find.

This is nice and all, but it only really comes together when you realize one thing: YouTube Music is just a frontend for YouTube. You can see that easily by looking at the URL when you’re viewing an album on YouTube Music: it’s just a playlist, nothing more, and yt-dlp can download all videos in a playlist.

However, getting all the options right for yt-dlp to download an album’s worth of music and add the right metadata, including the cover art, is a bit of an ordeal, considering that the tool is meant to be much more general-purpose. So, I’ve spent a day or two getting it mostly right:

yt-dlp \
--replace-in-metadata uploader ' - Topic' '' \
--parse-metadata '%(playlist_index)s:%(meta_track)s' \
--parse-metadata '%(uploader)s:%(meta_album_artist)s' \
--embed-metadata \
--embed-thumbnail --ppa "EmbedThumbnail+ffmpeg_o:-c:v mjpeg -vf \
crop=\"'if(gt(ih,iw),iw,ih)':'if(gt(iw,ih),ih,iw)'\"" \
--yes-playlist --format 'bestaudio/best' --extract-audio \
--audio-format mp3 --audio-quality 0 \
--windows-filenames --force-overwrites -o \
'%(uploader)s/%(album)s/%(playlist_index)s - %(title)s.%(ext)s' \
--print '%(uploader)s - %(album)s - %(playlist_index)s %(title)s' \
--no-simulate "$@"

You can just stuff this overly long command into a shell script and give it the link to a playlist (read: YouTube Music album), and it’ll download, convert and tag everything for you. It’ll even store everything in a neat folder structure that coincidentally is perfect for media servers like Jellyfin: <album artist>/<album>/<track nr> - <title>.mp3
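As a rough sketch of what such a script could look like: the function name album_dl is made up for this example, and only a handful of the options are repeated to keep it short; the full list from the command above would go in their place.

```shell
#!/bin/sh
# album_dl is a hypothetical name; pick whatever you like.
album_dl() {
    # Refuse to run without at least one playlist URL.
    if [ "$#" -eq 0 ]; then
        echo "usage: album_dl PLAYLIST_URL..." >&2
        return 2
    fi
    # Shortened option list for illustration only; swap in the full
    # set of options from the command above.
    yt-dlp \
        --yes-playlist --extract-audio --audio-format mp3 \
        --embed-metadata \
        "$@"
}
```

With the function sourced into your shell (or with a final line album_dl "$@" added to the script file), downloading an album comes down to calling album_dl with the playlist URL.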

Cool, but what does that wall of options actually do?

--replace-in-metadata uploader ' - Topic' '' \

This first line of options uses a pretty neat function: with yt-dlp, you can basically run sed over any metadata it managed to extract. Since for some reason YouTube likes to add " - Topic" to the end of artists’ channel names, this line simply removes that.
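To illustrate the sed comparison, here is roughly the same substitution done with actual sed. It’s only an analogy: yt-dlp applies its own (Python-flavoured) regular expressions internally, and strip_topic plus the sample channel name are made up for this example.

```shell
# strip_topic is a hypothetical helper, not part of yt-dlp.
# It removes every occurrence of " - Topic" from its argument,
# much like --replace-in-metadata uploader ' - Topic' '' does
# for the uploader field.
strip_topic() {
    printf '%s\n' "$1" | sed 's/ - Topic//g'
}

strip_topic 'Some Artist - Topic'   # prints: Some Artist
```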

--parse-metadata '%(playlist_index)s:%(meta_track)s' \
--parse-metadata '%(uploader)s:%(meta_album_artist)s' \
--embed-metadata \

This one does some further metadata magic: it tells yt-dlp to use the playlist index as the track number and the uploader (which we previously “fixed”) as the album artist, since YouTube basically never sets that one. This does break sometimes; I’ve noticed it choke on DMC5’s and Metal Gear Rising’s soundtracks. With the last option, yt-dlp actually embeds the metadata into the finished files.

--embed-thumbnail --ppa "EmbedThumbnail+ffmpeg_o:-c:v mjpeg -vf \
crop=\"'if(gt(ih,iw),iw,ih)':'if(gt(iw,ih),ih,iw)'\"" \

This line is pure magic: the first option simply embeds the thumbnail, which unfortunately isn’t square, as cover art should be, but the second option fixes just that. I’m not nearly well-versed enough in ffmpeg’s cryptic options to understand what’s going on; I’m just glad it works.
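For what it’s worth, the filter expression can be read fairly literally: ffmpeg’s crop filter takes width:height, iw and ih are the input width and height, and both if(...) expressions come out as the smaller of the two. Since crop centres the cut-out by default, the result should be a centred square. Here’s a small sketch of the same arithmetic in shell; square_crop is a made-up helper, and 1280x720 is just an assumed thumbnail size.

```shell
# square_crop WIDTH HEIGHT  (hypothetical helper, for illustration)
# Prints the crop geometry as w:h:x:y for a centred square.
square_crop() {
    iw=$1
    ih=$2
    # Same idea as if(gt(ih,iw),iw,ih): pick the smaller dimension.
    if [ "$ih" -gt "$iw" ]; then
        side=$iw
    else
        side=$ih
    fi
    # crop centres the output when no x/y offsets are given.
    x=$(( (iw - side) / 2 ))
    y=$(( (ih - side) / 2 ))
    printf '%s:%s:%s:%s\n' "$side" "$side" "$x" "$y"
}

square_crop 1280 720   # prints: 720:720:280:0
```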

--yes-playlist --format 'bestaudio/best' --extract-audio \
--audio-format mp3 --audio-quality 0 \

Here’s just some basic format selection. If you give it a link to a video in a playlist, --yes-playlist makes it download the whole playlist instead of just the one video. It’s also told to get the best audio it can and to convert the result to an mp3 file at the best quality setting (--audio-quality 0).

--windows-filenames --force-overwrites -o \
'%(uploader)s/%(album)s/%(playlist_index)s - %(title)s.%(ext)s' \

This is where the fancy folder structure happens. I’ve chosen to force yt-dlp to use Windows-compatible filenames, just for a little extra compatibility, and --force-overwrites makes it replace files left over from an earlier run.
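To get a feel for what the output template produces, the same pattern can be spelled out with printf. album_path and the sample values are made up; the index is written as 03 on the assumption that yt-dlp zero-pads playlist_index to the playlist’s length, and .mp3 stands in for %(ext)s after conversion.

```shell
# album_path ARTIST ALBUM INDEX TITLE  (hypothetical helper)
# Mirrors '%(uploader)s/%(album)s/%(playlist_index)s - %(title)s.%(ext)s'
# for an already-converted mp3.
album_path() {
    printf '%s/%s/%s - %s.mp3\n' "$1" "$2" "$3" "$4"
}

album_path 'Some Artist' 'Some Album' '03' 'Some Title'
# prints: Some Artist/Some Album/03 - Some Title.mp3
```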

--print '%(uploader)s - %(album)s - %(playlist_index)s %(title)s' \
--no-simulate "$@"

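Now this line exists to fix the horribly verbose output of yt-dlp. --print silences the usual output, so you’ll only see errors, warnings, and one line per downloaded video, allowing you to preview the folder structure in case some metadata is set wrong. Since --print on its own also makes yt-dlp merely simulate the download, --no-simulate is needed to make it actually download.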