steveo's tildeblawg

a blawg with no heroes

Multiple-delimiter madness

May 02, 2024 — ~steveo

For my $JOB I do data engineering at a relatively well-known $DEPARTMENT_STORE. The tech stack is a dizzying mashup of ancient (people regularly mention “The Mainframe”), merely old (big iron DB servers; JSPs) and newish (K8s; The Cloud). Most everything except the $CLOUD parts is opaque to me, as I haven’t had to deal with them. But data from the deep, ancient layers often makes its way to me in my cloudy realm.

Recently I’ve been working with product pricing data that must be old, as it definitely gives off tape drive and FORTRAN vibes. There are not one, not two, but three different delimiters! For example, a single product might have a price like this:

10.00,1714694400,1715904000,BIGSAVE

Meaning the price is $10.00, is valid between May 3 and May 17, 2024, and has a coupon code BIGSAVE. But there might be a different price after May 17, so now it looks like

10.00,1714694400,1715904000,BIGSAVE:15.00,1715904001,1716249600,LESSBIGSAVE

Notice the :! But maybe after BIGSAVE expires there are actually two different coupon codes, and two prices to go with them

10.00,1714694400,1715904000,BIGSAVE:15.00_#_9.00,1715904001,1716249600,LESSBIGSAVE_#_MOREBIGSAVE

Now _#_ has joined the party. Three delimiters in one row of data!? Who can live at that speed?!?!?!
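
For the curious, here is a rough Python sketch of how a row like that could be unpicked. It is only a sketch under a few assumptions of mine: that : separates validity periods, that , separates the four fields inside a period, that _#_ separates alternatives within a field, and that the Nth price goes with the Nth coupon code (so 9.00 would belong to MOREBIGSAVE). The names Price and parse_prices are made up for illustration, and I am reading the timestamps as Unix seconds in UTC.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Price:
    # Hypothetical record type; field names are illustrative only.
    amount: str  # kept as text so no float rounding sneaks in
    valid_from: datetime
    valid_to: datetime
    coupon: str


def parse_prices(raw):
    """Split one raw pricing row into a flat list of Price records."""
    prices = []
    # Outer delimiter: ":" separates validity periods.
    for period in raw.split(":"):
        # Middle delimiter: "," separates the four fields of a period.
        amounts, start, end, coupons = period.split(",")
        valid_from = datetime.fromtimestamp(int(start), tz=timezone.utc)
        valid_to = datetime.fromtimestamp(int(end), tz=timezone.utc)
        # Inner delimiter: "_#_" separates alternatives within a field.
        amount_list = amounts.split("_#_")
        coupon_list = coupons.split("_#_")
        # Assumption: the Nth amount pairs with the Nth coupon code.
        if len(amount_list) != len(coupon_list):
            raise ValueError(f"amount/coupon count mismatch in {period!r}")
        for amount, coupon in zip(amount_list, coupon_list):
            prices.append(Price(amount, valid_from, valid_to, coupon))
    return prices


row = (
    "10.00,1714694400,1715904000,BIGSAVE:"
    "15.00_#_9.00,1715904001,1716249600,LESSBIGSAVE_#_MOREBIGSAVE"
)
for price in parse_prices(row):
    print(price.amount, price.valid_from.date(), price.valid_to.date(), price.coupon)
```

Splitting from the outside in (: first, then , and finally _#_) keeps each step a plain str.split, which only works as long as none of the delimiters ever shows up inside a coupon code. I have not seen that happen, but I would not bet against it either.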

tags: month-of-blogs, data-engineering, jobby-job