So, for the geekier people out there, there’s a really neat tool to
download YouTube videos and do a whole bunch of fancy formatting
from the command line: youtube-dl. This, however, tends to break
every now and then and is slow to update. In comes yt-dlp, a fork
that gets more frequent updates, does some fancy User-Agent stuff
to not get throttled, and probably has some other features over
its predecessor.
One neat function these tools have is to wrangle downloaded videos into mp3s, even adding whatever metadata they can find.
This is nice and all, but it only really comes together when you
realize one thing: YouTube Music is just a frontend for YouTube.
You can notice that easily by just looking at the URL when you’re
viewing an album on YouTube Music: it’s just a playlist, nothing
more, and yt-dlp can download all videos in a playlist.
However, getting all the options right for yt-dlp to download an
album’s worth of music and add in the right metadata, including the
cover art, is a bit of an ordeal, considering that the tool is meant to
be much more general purpose. So, I’ve spent a day or two getting
it mostly right:
yt-dlp \
--replace-in-metadata uploader ' - Topic' '' \
--parse-metadata '%(playlist_index)s:%(meta_track)s' \
--parse-metadata '%(uploader)s:%(meta_album_artist)s' \
--embed-metadata \
--embed-thumbnail --ppa "EmbedThumbnail+ffmpeg_o:-c:v mjpeg -vf \
crop=\"'if(gt(ih,iw),iw,ih)':'if(gt(iw,ih),ih,iw)'\"" \
--yes-playlist --format 'bestaudio/best' --extract-audio \
--audio-format mp3 --audio-quality 0 \
--windows-filenames --force-overwrites -o \
'%(uploader)s/%(album)s/%(playlist_index)s - %(title)s.%(ext)s' \
--print '%(uploader)s - %(album)s - %(playlist_index)s %(title)s' \
--no-simulate "$@"
You can just stuff this overly long command into a shell script and
give it the link to a playlist (read: YouTube Music album) and
it’ll download, convert and tag everything for you. It’ll even
store everything in a neat folder structure that coincidentally is
perfect for media servers like Jellyfin: <album artist>/<album>/<track nr> - <title>.mp3
Cool, but what does that wall of options actually do?
--replace-in-metadata uploader ' - Topic' '' \
This first line of options uses a pretty neat function: with
yt-dlp, you can basically run sed over any metadata yt-dlp
managed to extract. Since for some reason YouTube likes to add
" - Topic" to the end of artists’ channel names, this line simply
removes that.
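To make the sed comparison concrete, here is a minimal sketch of roughly the same substitution done with plain sed. The helper name strip_topic and the sample channel name are made up for illustration; yt-dlp does this internally with its own regex handling, so this is only an analogy.

```shell
#!/bin/sh
# Hypothetical helper (not part of yt-dlp): removes " - Topic"
# from a channel name, like the --replace-in-metadata line does
# for the uploader field.
strip_topic() {
    printf '%s\n' "$1" | sed 's/ - Topic//'
}

# Made-up channel name, just to show the effect.
strip_topic 'Some Artist - Topic'   # prints: Some Artist
```

Names without the suffix pass through unchanged, which is why the option is safe to leave on for every download.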
--parse-metadata '%(playlist_index)s:%(meta_track)s' \
--parse-metadata '%(uploader)s:%(meta_album_artist)s' \
--embed-metadata \
This one does some further metadata magic: it tells yt-dlp to use
the playlist index as the track number and the uploader (which we
previously “fixed”) as the album artist, since YouTube basically
never sets that one like ever. This does break sometimes: I’ve
noticed it choke on DMC5’s and Metal Gear Rising’s soundtracks. With
that last option, yt-dlp actually embeds the metadata into the
finished files.
--embed-thumbnail --ppa "EmbedThumbnail+ffmpeg_o:-c:v mjpeg -vf \
crop=\"'if(gt(ih,iw),iw,ih)':'if(gt(iw,ih),ih,iw)'\"" \
This line is pure magic: the first option simply embeds the
thumbnail, which unfortunately isn’t square, as cover art should
be, but the second option fixes just that. I’m not nearly
well-versed enough in ffmpeg’s cryptic options to understand
what’s going on; I’m just glad it works.
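Read piece by piece, the crop expression appears to boil down to "width = the smaller of iw and ih, height = the smaller of iw and ih", and since ffmpeg’s crop filter centers the cut-out when no offsets are given, the result should be a centered square. The sketch below redoes that arithmetic in plain shell; the function names and the sample sizes are made up for illustration, and it does not call ffmpeg at all.

```shell
#!/bin/sh
# Mirrors if(gt(ih,iw),iw,ih): pick the smaller of width and height.
crop_side() {
    width=$1
    height=$2
    if [ "$height" -gt "$width" ]; then
        echo "$width"
    else
        echo "$height"
    fi
}

# Prints the equivalent explicit geometry as w:h:x:y,
# with x and y chosen so the square sits in the middle.
crop_geometry() {
    side=$(crop_side "$1" "$2")
    x=$(( ($1 - side) / 2 ))
    y=$(( ($2 - side) / 2 ))
    echo "${side}:${side}:${x}:${y}"
}

# A 1280x720 thumbnail (assumed size) becomes a 720x720 square.
crop_geometry 1280 720   # prints: 720:720:280:0
```

So for a widescreen thumbnail, the left and right edges get trimmed equally and the full height is kept.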
--yes-playlist --format 'bestaudio/best' --extract-audio \
--audio-format mp3 --audio-quality 0 \
Here’s just some basic format selection. With --yes-playlist, if you give it a link to a video in a playlist, it’ll download the whole playlist instead of just the one video. The rest tells yt-dlp to get the best audio it can, extract it, and convert it to an mp3 at the best variable-bitrate quality (that’s the 0).
--windows-filenames --force-overwrites -o \
'%(uploader)s/%(album)s/%(playlist_index)s - %(title)s.%(ext)s' \
This is where the fancy folder structure happens. I’ve chosen to force yt-dlp to use Windows-compatible filenames, just for a little extra compatibility, and to overwrite files left over from an earlier run.
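To picture what the output template produces, here is a small sketch that fills in the same pattern with printf. The helper name track_path and all the sample values are made up; yt-dlp fills in the real fields itself and additionally sanitizes the filenames.

```shell
#!/bin/sh
# Hypothetical helper mimicking the -o template with plain printf.
# Arguments: uploader, album, playlist index, title, extension.
track_path() {
    printf '%s/%s/%s - %s.%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Made-up metadata for the third track of an album.
track_path 'Some Artist' 'Some Album' '03' 'Some Title' 'mp3'
# prints: Some Artist/Some Album/03 - Some Title.mp3
```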
--print '%(uploader)s - %(album)s - %(playlist_index)s %(title)s' \
--no-simulate "$@"
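Now this line exists to fix the horribly verbose output of yt-dlp. Since --print implies quiet mode, you’ll only see errors, warnings, and one line per downloaded video, allowing you to preview the folder structure in case some metadata is set wrong. On its own, --print would also turn the run into a simulation, so --no-simulate is there to make sure everything still gets downloaded.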