
One less-secretive way I've seen pregaps used is for live recordings.

The crowd noise betwixt songs can be contained in a pregap, so that it is only ever heard when listening to the album straight-through (instead of in shuffle or track-program mode).

---

Another fun feature of audio CDs is indexes.

A disc can have 99 tracks, and each track can have some pregap (including track 1, as the article discusses). And each of these 99 tracks can be further subdivided with 99 index markers.

This gives a CD the theoretical ability to have 9,801 selectable audio segments.
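As a quick check of that arithmetic, here is a minimal sketch. The constant names are mine, and index 00 (the pregap) is deliberately left out of the count:

```python
# Limits as described above: track numbers 01-99, and index numbers 01-99
# within each track. Index 00 (the pregap) is not counted here.
MAX_TRACKS = 99
MAX_INDEXES_PER_TRACK = 99

# Every (track, index) pair a player could in principle seek to.
positions = [
    (track, index)
    for track in range(1, MAX_TRACKS + 1)
    for index in range(1, MAX_INDEXES_PER_TRACK + 1)
]

print(len(positions))  # 9801
```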

Although realistically, I've only owned a couple of CD players that even displayed index numbers and exactly one CD player (a Carver TL-3300) that allowed a person to seek to a given index number within a track.

(And I've only known one CD to actually make use of indexes in any useful manner, which was a sound effects CD from the early 1980s that had a lot more than 99 sounds on it -- all organized by tracks, and sub-organized by index marks. I just can't think of the name right now.)



My personal CD ripping script is configured to leave all pregaps after track one at the end of the preceding track when splitting them out as individual files. It gets ripped in one DAO pass for guaranteed preservation of all samples when using gapless playback on live recordings. Track navigation then works just like a real CD without having to listen to an incongruous section of audio meant to link the previous track on sequential play or, even worse, missing it altogether.
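The split rule itself is simple: cut at every INDEX 01 and nowhere else. A minimal sketch of that rule (not the actual script; the `Track` class, `split_points`, and all frame numbers are invented for illustration):

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Track:
    number: int
    index00: Optional[int]  # start of pregap in CD frames (1/75 s), None if absent
    index01: int            # start of the track proper, in CD frames


def split_points(tracks: List[Track], disc_end: int) -> List[Tuple[int, int, int]]:
    """Return (track number, start, end) in frames, cutting at every INDEX 01.

    Because the cut is at INDEX 01, the pregap of track N+1 (its INDEX 00 to
    INDEX 01 region) ends up at the tail of track N's file. Audio before
    track 1's INDEX 01 is not covered and needs separate handling.
    """
    ranges = []
    for i, track in enumerate(tracks):
        end = tracks[i + 1].index01 if i + 1 < len(tracks) else disc_end
        ranges.append((track.number, track.index01, end))
    return ranges


# Hypothetical three-track disc; the numbers are made up.
tracks = [
    Track(1, index00=0, index01=150),
    Track(2, index00=15000, index01=15300),  # 4 s of pregap before track 2
    Track(3, index00=None, index01=30000),
]
for number, start, end in split_points(tracks, disc_end=45000):
    print(number, start, end)
```

The ranges are contiguous, so concatenating the files in order reproduces the original sample stream from track 1's INDEX 01 onward.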

I have a classical CD from the 80's with index marks for different movements within the individual compositions represented by a handful of tracks. My understanding is that DG was the only publisher routinely using them. That required some manual intervention to convert the indices to separate tracks. Sony was pretty good about providing index nav. on their full size stereo players. At least until their perpetually cruddy remotes eventually failed.


That's probably the best way to do it, given common toolsets and players. I also rip pregaps as lead-outs (rather than the lead-ins that the structure may appear to suggest).

It's things like this that make me wish that we'd landed on a good, popular way to store albums (with metadata!) instead of individual tracks -- or to at least reassemble individual tracks' files properly into whole albums without glitches and weirdness. (FLAC/cue can do some of this, but hardware player support is nearly nonexistent.)

I've been told that this is a stupid thing to want, and I want it anyway.

I'm old enough to remember listening to albums the whole way through by default since anything else would take extra steps, and perhaps fortunate-enough to have generally preferred listening to albums where that is a thing that is also worth doing intentionally.

(And yet, I am young enough to still be bitter about Lars killing Napster. My dissatisfaction is multifaceted.)


In addition to lossy compressed track files I also generate a FLAC with embedded cue as a master copy of the original. It's useful for recreating the whole recording for mass editing. I have a few discs mastered with preemphasis that needed correction. I too hope there will be a day when all FLAC players support track navigation. The reality is the music album has had its day in the sun and will largely be a forgotten curiosity like the typewriter or rotary phone.


You're not wrong. New music isn't frequently recorded with the intent for it to be heard in an album-oriented way.

But the albums I like to listen to as albums will remain cohesive albums for an eternity.

Lots of stuff from Roger Waters is cohesive in that way, which is perhaps something a person might expect me to say.

But also lots of stuff from Maynard James Keenan, Trent Reznor, and even Marilyn Manson is also this way, which is perhaps less expected.

(And sure, I can rip an album as an album and convert that to a singular MP3 that I can play as an album almost anywhere, and it needs to be a single file since MP3s can't be perfectly concatenated. But then, I can't easily skip around on that singular album when it behooves me to do so.

I could do both things when it was still in CD format.)
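The concatenation problem is easy to see with a little arithmetic. A rough sketch, assuming MPEG-1 Layer III's 1152 samples per frame and ignoring encoder delay (which adds further samples at the start); the function name and the track length are made-up examples:

```python
CD_SAMPLES_PER_SECTOR = 588   # 44,100 Hz / 75 sectors per second
MP3_SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III


def mp3_padding(track_sectors: int) -> int:
    """Samples of padding needed to fill the last MP3 frame of one track."""
    samples = track_sectors * CD_SAMPLES_PER_SECTOR
    remainder = samples % MP3_SAMPLES_PER_FRAME
    return (MP3_SAMPLES_PER_FRAME - remainder) % MP3_SAMPLES_PER_FRAME


# Hypothetical track of 3:25 plus 17 sectors.
sectors = (3 * 60 + 25) * 75 + 17
print(sectors, mp3_padding(sectors))
```

Unless a track happens to be a whole number of MP3 frames long, the encoder has to pad the final frame, and that padding is what shows up as a tiny gap between separately encoded files.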


Billie Eilish's latest is intended to be listened to as a complete album (but of course the fact that this is known as an exception proves the general rule...)


That.... that makes sense.

Her recordings are excellent. They generally sound simply fantastic. When turned up on the big stereo, they tickle every auditory input I have -- including the usually-strictly-tactile ones.

I've heard that her brother, who is probably (and perhaps obviously) her biggest fan, generally has a huge part in producing and mixing her music. It is apparent that they work well together.

Anyhow, thanks. That album is on the list for the next time the neighbors have left for the weekend.


Billie Eilish is not exceptional in this regard. 100s of 1000s of albums are. There's absolutely no need to single out an artist.


?


I'd love a new solution that wasn't "break the CD data into pieces."

I've never looked inside a CUE file, but it's just text and I don't think it supports meta data, right?

We need like a new CUE file to go with the FLAC, right?

p.s. https://news.ycombinator.com/item?id=40923646


Ideally, I think I'd want a singular container (of whatever sort) that has the album's audio, the music-related timing metadata (as applicable), and whatever other metadata may be appropriate (lyrics? liner note graphics? music videos? sure!).

The audio should be able to be FLAC. But it should also be able to be anything else, like Vorbis or MP3 or AAC or IDK. It needs to be able to be played continuously without aberration (which can't actually be done with a group of MP3 streams).

The audio needs to be able to be seekable, like a CD is also seekable. By track. By index. (With pregaps, where appropriate -- because CDs also have pregaps.)

Other potential metadata must be able to include whatever subcodes are involved in things like CD+G[0] and HDCD and CD Text, since all of those are supersets of the regular datastream and playback is compatible with any CD player.

And it needs to be a singular container file because...well, that's just easier to keep track of as the years go by and data migrates.

Only then, will we have the beginning of a valid archive format for audio CDs as they actually still exist on [some] store shelves today.

(Some stuff can be optional, just as lots of things are optional inside of an MKV container for a film.)

[0]: Almost nobody ever used this outside of the 1990s karaoke world, but Information Society's self-titled album includes an illustrated sequence, with lyrics, that is completely implemented in CD+G and that runs for the entire length of the album. And I should be able to render that locally here in 2024 from a container on my pocket supercomputer instead of watching a bad rip from a Sega Genesis: https://www.youtube.com/watch?v=b89sSa8QlLg


You mentioned MKV - Matroska (MKA for audio, MKV for video) could honestly work quite well for this situation with just a little extra standardization.

Audio codecs: use a single stream of whatever codec you'd like. FLAC/Vorbis/MP3/AAC/Opus/etc. can all go into Matroska.

Seekable: Use chapters for tracks, and nested chapters for indices. Matroska documentation even gives an example of using ChapterPhysicalEquiv 20 for CD tracks and ChapterPhysicalEquiv 10 for CD indices.

Other metadata can be muxed into the stream as well.

Lyrics can be included as text in metadata (lyric tag) or as a subtitle stream.

Liner note graphics (and basically anything else) can be included as embedded files.

Music videos can be video streams in the Matroska file.
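As a rough illustration of the chapter idea, here is a sketch that builds a nested chapter tree as XML. The element names follow the Matroska chapter elements; the disc layout and the helper names (`timestamp`, `atom`) are invented, and whether a given muxer or player honors nested atoms and ChapterPhysicalEquiv is an assumption you would need to verify:

```python
import xml.etree.ElementTree as ET

FRAMES_PER_SECOND = 75
PHYS_EQUIV_TRACK = 20  # CD track, per the Matroska docs mentioned above
PHYS_EQUIV_INDEX = 10  # CD index


def timestamp(frames: int) -> str:
    """Convert CD frames (1/75 s) to HH:MM:SS.mmm (truncated to milliseconds)."""
    total_ms = frames * 1000 // FRAMES_PER_SECOND
    seconds, ms = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def atom(parent, title, start_frames, phys_equiv):
    """Append one ChapterAtom under parent and return it."""
    node = ET.SubElement(parent, "ChapterAtom")
    ET.SubElement(node, "ChapterTimeStart").text = timestamp(start_frames)
    ET.SubElement(node, "ChapterPhysicalEquiv").text = str(phys_equiv)
    display = ET.SubElement(node, "ChapterDisplay")
    ET.SubElement(display, "ChapterString").text = title
    return node


# Hypothetical disc: {track number: [start frame of index 1, index 2, ...]}
disc = {1: [0, 4500], 2: [9000]}

chapters = ET.Element("Chapters")
edition = ET.SubElement(chapters, "EditionEntry")
for track_no, index_starts in disc.items():
    track = atom(edition, f"Track {track_no:02d}", index_starts[0], PHYS_EQUIV_TRACK)
    for index_no, start in enumerate(index_starts, start=1):
        atom(track, f"Index {index_no:02d}", start, PHYS_EQUIV_INDEX)

print(ET.tostring(chapters, encoding="unicode"))
```

Millisecond timestamps cannot represent 1/75 s exactly, so a real tool would want finer-grained timestamps to stay sample-accurate.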


I'm glad to see this mentioned. This was the first thought I had as I progressed through this thread. I'm surprised this isn't a popular, supported standard already.


Nested chapters can work for index markers, especially if a player supports them right.

I mean: As mentioned, these have almost never been usable with real CD players in the wild. Maybe not much is lost there. (But the format must still accept these things, and allow them to be usable! An archival format must respect all aspects of the item being archived, including those that are unpopular or disused. I am willing to die on this hill.)

What of things like CD+G? Here in 2024, they're very simple graphics using 35+-year-old tech, and they should be archived neatly, precisely, and without interpretation, to be rendered client-side at a later point. I think I've mentioned it, but we literally have pocket supercomputers in common use today. If we can make the complexities of MAME work for the past couple of decades, and do it with direct ROM dumps, we can do this for CD+G.

But the CD+G must be rendered synchronously with CD audio data on playback. This applies whether it is my Goldilocks example of an Information Society album, or whether it is a CD+G karaoke disk with Garbage's I'm only happy when it rains (and twelve other crowd pleasers from that month of 1995).
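The timing side of that is at least simple on paper. A minimal sketch, assuming the usual CD+G layout of four subcode packets per sector at 75 sectors per second (the function name is mine):

```python
SAMPLE_RATE = 44_100
SECTORS_PER_SECOND = 75
PACKETS_PER_SECTOR = 4  # assumed CD+G layout: four packets per sector

# 44,100 / (75 * 4) = 147 audio samples per graphics packet, exactly.
SAMPLES_PER_PACKET = SAMPLE_RATE // (SECTORS_PER_SECOND * PACKETS_PER_SECTOR)


def packets_due(sample_position: int) -> int:
    """Number of CD+G packets that should have been processed by this sample."""
    return sample_position // SAMPLES_PER_PACKET


print(SAMPLES_PER_PACKET)        # 147
print(packets_due(SAMPLE_RATE))  # 300
```

So a renderer only needs the raw packet stream and the current audio sample position to stay in step; the open question is where a container would store that stream.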

How will that work with MKA?

And how will pregaps work?

(Maybe MKA isn't an ideal container if it does not already include avenues that lead to this kind of functionality in ways that are compatible with the original article.)


Interesting point about CD+G. I think whatever format was used needs to take this into account.

There were also a ton of Audio CDs that were not CD+G but had a data track with the music video etc on them.

I worked on a horrible one for Sony, one of those ones with all the anti-rip protection on it, where I was tasked to build a binary blob for a web site that detected if the specific audio CD was in your drive and let you into the web site. What were those things called, ActiveX?


Sony had plenty of awful stuff at different times for audio CDs, despite being a co-developer of this wildly-successful and long-lasting format.

I think you're referring to ActiveX, yes. It's the only thing I can think of where "web" stuff and "hardware" stuff commingled back then in a semi-transparent way.

And famously, as you surely know yourself, Sony once published some rootkitted audio CDs: https://en.wikipedia.org/wiki/List_of_compact_discs_sold_wit...

Anyway, I'll just assume that you aren't the rootkit guy -- or even if you are, that your heart is in the right place.

---

And yes, CD+G is important. As are the mixed-mode releases with video. All things CD audio are important if we are to talk about an ideal archival (and playback!) format for audio CDs, and archiving an audio CD is not always quite as simple as ripping a folder of FLACs -- there's a ton of diversity here that FLAC (and cue) can't accurately embody.

We're fortunate that we still have so many CDs right now, and that they're still being sold today. This will change. (It must change. It can't not change.)

The good folks working on the Domesday Duplicator have a relatively uphill battle for the often-older (and often rotting) LaserDisc media that they're working on tools to properly preserve.

It would be good to get ahead of the curve and get something with a practical workflow working sooner instead of later.


You are almost describing the MAME CHD format. They have the same problem: the object (hard drive, CD, DVD, etc.) must be in one file. It has the ability to store differences (so it is writable in some cases), and it is compressed (as compressed hunks of data). They also need the subchannel data, since some systems do interesting things with it; some even hide their encryption format in the SBI fields. The CHD format is more like a container that acts like whatever media it was, depending on what system it is hooked up to. The downside is that there is no concept of 'metadata' for finding different things in a CHD. It is up to the system it is hooked up to to interpret what that data stream is.


This could be a good avenue. It might be possible the CHD format could be extended and be backward-compatible, or even just as simple as bodging all the extra data onto the end of the file and hope it is ignored by other readers. This is an avenue worth exploring, thank you.


There is a way to extend the format, as it has a version number. It does have some metadata fields (drive geometry, compression formats, the version of MAME it was created with, etc). The trick would be getting the MAME team to accept the changes. Just dumping it on the end of the file is something I would guess they would not be too happy with.

There are a number of requests out there to extend the chd format and fix a few things. They are currently tracking some of this info in XML files (called hash files). They would be down with more accurate information though. So you would need a proposal that adds more accurate information and gives them something to work with. More like getting all the info that something like the redump project tracks in there would make them very happy.

There is a separate project, libchdr, that some other emus use; it is a soft fork of MAME. I think they are trying to track closely to what the MAME group is doing but let other emus use it.


Can’t an MP4 container do most or all of that already? (Pregaps would probably need to become a full-fledged chapter in their own right, with the current spec.)


Cue is a bodge that should never have become a de facto standard. Joerg Schilling's cdrdao tool has its own TOC format that faithfully captures everything including index marks, various flags, and multilingual CD text but it was ignored by everything else in the heyday of the ripping era. Nowadays we'd be better off with a standard yaml/json format that duplicates what cdrdao provides.
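To make that concrete, here is a sketch of what such a file might carry. The schema is entirely hypothetical: the field names are invented for illustration and correspond neither to cdrdao's TOC syntax nor to any existing standard.

```python
import json

# Hypothetical schema, invented for illustration only.
disc = {
    "catalog": None,
    "cd_text": {"en": {"title": "Example Album", "performer": "Example Artist"}},
    "tracks": [
        {
            "number": 1,
            "isrc": None,
            "flags": {"pre_emphasis": False, "copy_permitted": False, "four_channel": False},
            # Index number -> MM:SS:FF offset; keys are strings so JSON round-trips.
            "indexes": {"0": "00:00:00", "1": "00:02:00", "2": "01:30:45"},
            "cd_text": {"en": {"title": "Example Track"}},
        }
    ],
}

text = json.dumps(disc, indent=2)
assert json.loads(text) == disc  # round-trips without loss
print(text)
```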


This is an issue for movie discs too. Some mkv rips will preserve chapter data (though player support is spotty), but in the end it's still a big linear file— menus, intros, trailers, optional features, etc are all gone once it's ripped unless you rip that stuff to separate video files.

Which I get on the one hand, but it's a bummer that in all these cases (CD, DVD, Blu-Ray) the metadata for the larger structure of the production got inextricably tied to the specific physical media implementing it, such that the only real way to preserve that data was to rip a full disc copy.


> At least until their perpetually cruddy remotes eventually failed.

For me, Sony remotes were made of the same stuff as early Nokia phones - indestructible! Surprised to hear someone thought they were cruddy.


They were physically robust but the carbon button contacts always became dodgy for me. I tried to avoid Sony products for this reason because I encountered it so often in other people's gear. I have a remote from the late 00's that saw virtually zero use and it conked out with age alone.


The Japanese call it the Sony Timer... some call it an urban legend, but this seems like yet another independently verified data point.


There is/was carbon spray that you could use to refresh the contacts.


I had a set of CDs that went with an intro to music theory textbook around 2009. It also made heavy use of indices in tracks to do exactly the same thing. My car stereo listed each index as a track.

I wish I could remember the name of the textbook because I really liked a lot of the baroque music on the CDs and can't remember who they were by or the titles of the songs...


Do you remember if the textbook was orange (possibly with a two-tone cover design)? I had a really good textbook in college that had a... 4? CD set (with the big jewel case) that had a bunch of tracks and like you I really enjoyed it.


It was a reddish (could be orange, could have been maroon) color lightly mottled in black with a picture of a violin (or cello, idk) set in the lower 2/3 of the cover. I'm somewhat certain it had 6 cds because it filled my disc changer in my stereo, although that detail is fuzzy too.


I mastered a CD in 2000 for a band that wanted a secret track at the end. I came up with a novel way to do it.

There were a dozen regular tracks. A bunch of empty ones. And the final track over about a dozen tracks of varying length with no gap. Used all 99 tracks.

I could only pull it off with this CD burning software that didn’t have a UI. It took a text file as input at the command line. But it could do everything from almost every color of spec (Red Book, Blue Book, etc) for CDs.


The Nine Inch Nails “Broken” EP had a couple of tracks at the end of 99, though the middle tracks were all 1-second blanks.


I've had visions of putting a CD together that was that way, but with pregaps and indexes utilized as well.

"WTF? The time counter keeps going forward, and then sometimes it goes backwards! And using the track seek buttons completely eliminates some parts that I can hear if I don't touch anything!

It's a whole different song entirely when you program tracks 39, 40, and 52 in a loop, and IDFK what it is with this Index number that only always showed "1" before.

Oh wait. Srsly? From tracks 71-93, it's using the index to count beats...and the track number to count measures? No, that can't be it. Except...."


I thought I'd really (ab)used the CD specs, but don't recall ever trying indexes. Curious how most CD players, which only had a two-digit track indicator, handled indexes. I would have used that if I had known about it.


I wasn't aware these existed either. I suspect the answer is incredibly boring: most CD players simply won't seek to an index; pressing skip track will just skip to the next track, ignoring any indexes present.


> Broken was re-released as one CD in October 1992, having the bonus songs heard on tracks 98 and 99 respectively, without any visual notice except for the credits, and tracks 7–97 each containing one second of silence.

https://en.wikipedia.org/wiki/Broken_(Nine_Inch_Nails_EP)#Pa...

Amarok (1990) by Mike Oldfield is a single hourlong track with 53 index marks.

https://en.wikipedia.org/wiki/Amarok_(Mike_Oldfield_album)#T...


Broken was absolutely perfect to put into a multi-disc player along with TMBG's Apollo 18, which contains "Fingertips", a suite of 21 very short songs. Set it to shuffle songs from everything in the player, and enjoy your sonic whiplash


Was this better with the crazily fast-loading chonka-chonk slam-slam nature of a Pioneer 6-disc cartridge changer, or with something slower and perhaps more-civilized like a period-correct Technics 5-disc changer, with its nearly-silent and relatively exquisite, seemingly-careful demeanor?

(Both have their merits, but I unfortunately have neither at hand. And I only have one of these 2 albums. And one of those albums is the original Broken, which only has 6+2 tracks across two discs instead of 99 tracks on one disc.

And how do the 91 silent tracks on a more-common release of Broken affect things compared to the 26 musical tracks that the original 6+2+18 track-count ensemble may entail, in terms of inter-song delay or any other such thing on a real multi-disc changer?

I know TMBG fairly well, and NIN very well, and I enjoy the fuck out of gear, but I have so many questions.)

(I vaguely jest above, but Spotify only shows me 18 tracks on Apollo 18. And only one of them is Fingertips. Am I looking at this wrong?)


Presumably Spotify has glommed all the Fingertips into one file. On the original CD release it was twenty-one separate tracks; there was a bit in the liner notes that explicitly encouraged you to put it on shuffle. https://tmbw.net/wiki/Fingertips

I could not tell you what brand the all-in-one turntable/radio/tape deck/cd player I had at the time was. There was a big tray with room for five CDs and I have absolutely no memory of how much noise it made when switching from one disc to another, and every physical object involved in this affair is long gone in a hurricane.

I suspect both CDs should be easy to find used copies of, if you have the appropriate hardware and want to experience the tension of not knowing if the next song you hear is going to be Trent bitching, a brief moment of silence, a Fingertip, or whatever else you put in the player. Given my tastes at the time this would have probably been Skinny Puppy, Ozric Tentacles, and Björk, but do whatever feels like the most interesting possible choice; I have a disc lying around now that’s nothing but forty iterations of Satie’s Vexations and that would have certainly been a prime choice for this little game.


Here it is on Spotify: https://open.spotify.com/track/1VdLQlGXMvaEUZIL9KuG35?si=29f...

In my location, it shows as a single track that is 4:33.


If you want sonic whiplash without so much effort, you can also listen to a Fantomas album in its original order.



There wasn't really much effort involved. Pick a few discs off the CD rack that I thought would clash interestingly, load up the CD tray of the cheap all-in-one turntable/tape deck/radio/CD unit I had in my room, hit the 'shuffle' button until it tells me it's gonna shuffle everything together, hit play.

Looking Fantomas up on Wikipedia makes it sound like they'd go pretty well with "twenty tracks that sound like the choruses of twenty different songs" and "ninety-something 1s blank tracks plus a few industrial songs", though.


I'm curious if you have a specific example of an album with the crowd noise between tracks like that? I collect and rip hundreds of CDs and am always on the look out for edge case discs to further hone my tools.

On your pregap + 99 indexes remark, the "pregap" is the space between index 00 and 01 which continues on up to index 99. Players seek to index 01 as the start point of the track. There is no separate pregap designation. I've paid special attention to this because it is a difficult problem to solve as many discs have space between tracks stored in index 00-01 but rarely is there anything audible in there after the first track. The only example I have of this is a specialty music sample disc, Rarefaction's A Poke In The Ear With A Sharp Stick, that has over 500 samples on the disc accessed by track + index positions.

As a sidebar based on the later comments in the thread, I've made it a habit to rip and store every audio CD as BIN/CUE+TOC using cdrdao. This allows me to go back and re-process discs I may have missed something on. But even that is imperfect, because it usually breaks Blue Book discs that use multiple sessions to store data, due to absolute LBA addressing. Also, the way different CD/DVD drives handle reading data between index 00-01 on track 1 is maddening. Some will read it, some will error, and the worst are those that output fake blank data.
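For anyone unfamiliar with the addressing, here is a small sketch of the usual conversion between absolute MM:SS:FF time and LBA, assuming the standard 150-frame (two second) offset. The function names are mine:

```python
FRAMES_PER_SECOND = 75
PREGAP_OFFSET = 150  # LBA 0 sits at absolute time 00:02:00 (2 s * 75 frames)


def msf_to_lba(minutes: int, seconds: int, frames: int) -> int:
    """Convert an absolute MM:SS:FF disc address to a logical block address."""
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames - PREGAP_OFFSET


def lba_to_msf(lba: int) -> tuple:
    """Convert a logical block address back to absolute (MM, SS, FF)."""
    total = lba + PREGAP_OFFSET
    minutes, rest = divmod(total, 60 * FRAMES_PER_SECOND)
    seconds, frames = divmod(rest, FRAMES_PER_SECOND)
    return minutes, seconds, frames


print(msf_to_lba(0, 2, 0))  # 0
print(lba_to_msf(0))        # (0, 2, 0)
```

A data track in a later session refers to sectors by their position on the whole disc, so once it is extracted into a file that starts counting from zero, those references no longer line up.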


>I'm curious if you have a specific example of an album with the crowd noise between tracks like that? I collect and rip hundreds of CDs and am always on the look out for edge case discs to further hone my tools.

E.g. the Japanese version of Flying Lotus' album "Until The Quiet Comes" has a pregap of 5 seconds before the 19th track, to separate it from the rest of the album, as it's a Japanese-exclusive bonus track.


Is it crowd noise or just a gap? I'm also interested to see one of these skippable crowd noise CDs.


Not sure why I didn't think to mention this before: One tool you might consider is redumper, it's designed in particular to handle multisession CDs, and it attempts to over-read into the disc lead-in and lead-out to catch data outside of the range covered in the TOC (particularly common in older CDs). It only outputs a final split bin+cue, but everything read, including scrambled data, toc, and subchannel/subcode, is saved for future processing. The bin+cue can be used with ISO Buster (and probably other tools) to access Enhanced CD filesystems. Feel free to reach out if you need some tips, this is what I use for my collection.

https://github.com/superg/redumper/

Caveat: It is mostly intended for use with the low-level features of Plextor drives, so CD support on other drives is relatively limited; in particular it doesn't have any overlapping read paranoia-style features. The recommendation is to dump twice to confirm; it's running straight through without seeking so that's usually still quite fast.


Seven minute pregap on disc 1 track 4 of https://vgmdb.net/album/5549 , it's a whole long discussion between songs, with some audience cheering. VGMdb follows the "append pregap to previous track" convention, that's why track 3 looks so long. There's similar but shorter gaps with banter on tracks 2 and 7.

Cuesheet looks like:

    TRACK 04 AUDIO
      INDEX 00 00:00:00
      INDEX 01 07:34:43

Edit: Probably covered by your sfx disc, but this one has 17 indexes on track 1: https://vgmdb.net/album/3091

That messed with a tool that only anticipated one index in track 1 for detecting hidden pregap audio, cuesheet is like:

    TRACK 01 AUDIO
      INDEX 00 00:00:00
      INDEX 01 00:00:37
      INDEX 02 00:11:40
      INDEX 03 00:37:33
      ...
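
For tools that need to cope with both cases, a minimal sketch of pregap detection that looks only at INDEX 00 and INDEX 01 and ignores however many further indexes follow. The regex and function names are my own, and it handles just the `INDEX` lines of a single `TRACK` block, not a full cuesheet:

```python
import re

FRAMES_PER_SECOND = 75
INDEX_RE = re.compile(r"^\s*INDEX\s+(\d+)\s+(\d+):(\d+):(\d+)\s*$")


def parse_indexes(cue_lines):
    """Map index number -> offset in CD frames for one TRACK block."""
    indexes = {}
    for line in cue_lines:
        match = INDEX_RE.match(line)
        if match:
            number, mm, ss, ff = (int(g) for g in match.groups())
            indexes[number] = (mm * 60 + ss) * FRAMES_PER_SECOND + ff
    return indexes


def pregap_frames(indexes):
    """Length of the INDEX 00..INDEX 01 region, or 0 if there is no INDEX 00."""
    if 0 not in indexes:
        return 0
    return indexes[1] - indexes[0]


track4 = ["TRACK 04 AUDIO", "  INDEX 00 00:00:00", "  INDEX 01 07:34:43"]
track1 = [
    "TRACK 01 AUDIO",
    "  INDEX 00 00:00:00",
    "  INDEX 01 00:00:37",
    "  INDEX 02 00:11:40",
    "  INDEX 03 00:37:33",
]

print(pregap_frames(parse_indexes(track4)))  # 34093 frames, roughly 7 min 34.6 s
print(pregap_frames(parse_indexes(track1)))  # 37 frames, roughly half a second
```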


I have an early CD (Bach's Goldberg variations, played by Glenn Gould) which is one track with 31 index markers.

My (early) Philips CD player dealt with it fine, but since then it's been a bit of a problem...



