Audio CD ripping – optical drive accuracy listing (pilabor.com)
208 points by sandreas on Nov 7, 2022 | 161 comments



This reminds me of “Why I Ripped The Same CD 300 Times” – https://john-millikin.com/%F0%9F%A4%94/why-i-ripped-the-same...

(HN discussion: https://news.ycombinator.com/item?id=17649374)


Thank you, this is a great resource for understanding how CD ripping works!

I added this to the article along with some other things I took from this discussion. Thank you, HN, for providing more details!


This is a great read, thanks, and the follow-up HN comment regarding "weak sectors" is amazing.


If you care about accurate rips on Linux, the best tool to use is whipper: https://github.com/whipper-team/whipper. It makes use of the AccurateRip database, which is used to calculate the statistics. I don't know of any other native Linux application that makes use of it. Other tools like cdparanoia, and all the other wrappers around it, just attempt to read the disc multiple times and can still get the wrong result, as the post shows.


The frustrating thing about AccurateRip is that several open source apps can pull down from it to compare your local rip, but IIRC the only app that is allowed to push rips back up to the AccurateRip DB (and hence make it more "accurate") is the proprietary, Windows-only EAC.


That’s a feature. The value of AccurateRip comes from the accuracy of data submitted to it.

Tightly controlling how data can be submitted allows that accuracy to be maintained.


As one of the other links explains, ripping the same CD on the same drive a hundred times might still not produce the correct rip. Something like AccurateRip works by having multiple copies of the CD scanned and then voting on which one is the correct version.

I forgot that CTDB (http://db.cuetools.net/) exists, which is an alternative to AccurateRip. CUETools is open-source Windows software to rip CDs. Instead of just providing a checksum of the track, it provides error correction information. So instead of just learning that you probably have a bad rip (and continuing to get a bad rip), it's possible to correct the rip. EAC has a CTDB plugin that's installed by default; whipper currently doesn't support it.

AccurateRip is not something from EAC, it's from dBpoweramp.


Or just run EAC in wine.


Isn't "just" attempting to read multiple times basically how AccurateRip is itself populated, as that is how EAC works?


Looking at the source code, Whipper seems to be a Python wrapper around common Linux command-line tools.


Great addition, thank you... I'll add this to the article soon


Not sure what the confidence intervals of these numbers are. For example, the BH14NS40 drive is listed with an accuracy of 99.4937, whereas the WH14NS40 is listed with 98.0869. These are the same model of drive with the black and the white front respectively.


Sometimes it's different firmware revisions, sometimes it's just statistical noise...

I wrote this article mainly for myself... I just converted the dBpoweramp stats into a somewhat searchable JavaScript table and added some useful notes I found out over the last few days. I did not expect to sit in the HN Top 10 for a whole day with this... :-)


I wish I could use my physical CD collection as a "pass" to download high quality rips so I can skip the part where I have to do the labor of ripping them myself.



Another one was Murfie (now defunct): https://en.wikipedia.org/wiki/Murfie


I think iTunes Match still offers this, too.


The problem is that you might get remastered editions of tracks, so you don't actually get what you have on CD.


For musicians who are on Bandcamp, you can buy a CD and get a FLAC download. (Of course there's no telling what happens to your access when Bandcamp folds, so that sort of defeats the point.)


You still have the FLAC. That's kinda the point of downloading it. Plus the CD, which holds the same audio uncompressed.


I see there are a lot of misconceptions here.

FLAC is 100% sonically identical to the original audio CD; it's a lossless format. Think of it as a ZIP file for WAV audio files.


I don't have any misconceptions about FLAC, lol.


You could expand the FLAC back to CD and slap a cover on it if required.


I use this[1] for making sure I have everything I've purchased from Bandcamp downloaded to my NAS. Even made a bash alias to wrap it so I just type `bandcamp` after I've purchased something and it downloads and sorts it immediately.

1. https://github.com/Ezwen/bandcamp-collection-downloader


Well, if you're afraid of Bandcamp folding, you'd better download the FLACs right now. That's the advantage compared to MoviesAnywhere, where there's no way to get a DRM-free downloadable version of the video material you bought...


Don't (just) be afraid of Bandcamp folding - be afraid of artists folding. Some years ago I purchased a few tracks by an artist on Bandcamp and downloaded them as Ogg Vorbis - and now, almost ten years later, I wanted to re-download them as FLAC for archival, but the artist has completely shut down their Bandcamp page. It's a bit of an extraordinary case but serves as a good reminder to download in the highest possible quality at the earliest possible point in time. It doesn't help that it was a really niche artist and there are no warez copies of his tracks, at least not on the open web.


> Don't (just) be afraid of Bandcamp folding - be afraid of artists folding.

I've recently come across that problem at least twice – I learned about a potentially interesting album done by somebody, but it had already disappeared from Bandcamp without so much as a trace. In one case it turned out I had missed the album by only two months, because the album page still appeared in Google's search index cache with a corresponding last-indexed date, but of course that's not much use as far as getting the actual music is concerned.


Did you try archive.org's Wayback Machine?


Is the Wayback Machine even able to save the preview streams? Trying it for some other artists that are still live on Bandcamp this doesn't seem to be the case, and in any case it certainly wasn't the case for the particular album I was looking for.

Ultimately, unlike the OP, I luckily found at least the one album I was most interested in "elsewhere", though it's still a bit of a shame that it's no longer regularly available and it absolutely would have merited paying for.


I have a similar story, about a band that apparently only ever appeared live five years ago, as an opening act for another pretty niche band called Texas (they had a few hits in the nineties), and has been inactive since - https://www.facebook.com/hightre. I liked them and fortunately bought a CD at the souvenir stand at the end of the concert - to my knowledge the music is not available anywhere else, in any form...


What was the album?


It was by a guy called Dombrovski. I started digging, trying to see if I could track him down some other way and ask him for a copy, but now I wish I hadn't, because I dug up quite a bit of dirt... (seems like he was sort of a con man; his website dombrovski.com now just redirects to a court case, which makes me think that his domain was seized and maybe also his other sources of income, such as his Bandcamp account)


Looks that way, yes - all his links here appear to be dead now.

https://web.archive.org/web/20130825015207/http://dombrovski...


In reality the IP owners wish you'd pay for each version separately and they've implemented that version of reality pretty much across the board. Once the last CD dies out, that's it.


That would be awesome. I know some indie bands do that (buy our CD, get free downloads) but it would be better to see an industry effort, the same way we have MoviesAnywhere for films on disc


I have my Dad's record collection. A digital jukebox that has all the record covers and plays the right version and track listing when you pick an album would be just fantastic.


  > plays the right version and track listing

Just try to get ahold of Rust in Peace today. The rerecorded album by Dave's Megadeth cover band sounds terrible, and the original is out of print and not available from any of the legal streaming platforms. I'm considering stocking up on original CDs and pressings, and storing them in different locations. I'm not even joking.

If you really want the right version and track listing, you'll have to preserve it yourself.


Columbia Records managed to mess up a number of live albums while uploading them for digital distribution by leaving out the transitions between the individual tracks.

In each instance where I've come across this problem (mainly on various Bob Dylan albums so far), it turned out that on the actual CDs the affected portion had been mastered as part of the pre-gap (i.e. the bit where a hardware CD player shows a negative timecode counting back up to 00:00). So, as silly as it sounds, it almost feels like they simply ripped the CDs themselves in order to upload them for digital downloads and streaming, and somehow managed to mess up the process and discard the pre-gap bits.

Though unlike your example, at least in that case the CDs are still in print, so for now you're not completely stuck there…


> so as silly as it sounds

Why do you feel that sounds silly? It sounds exactly like what I would expect them to have done. It was probably some intern tasked with the job, too. It's not like they were going to go back to some mastering format to have the streaming files created for the entire back catalog.


When I was in middle school, Tanqueray had a spokesman named “Tony Sinclair.” I loved every commercial he was in, they were so fun and the background music was amazing.

Anyhoo, the official site had all the songs (5) for download so I forged my age to access the mp3s and I still listen to them, to this day.

I just wish I knew more about music formats back then, I have no idea if there were FLAC versions available. My annual pleas to the Tanqueray and Music Beast Twitter accounts remain ignored.

Alas.


https://www.discogs.com/release/17502538-James-Dukas-Narrato...

I bought this album on Google Music, which got shut down and I didn't copy it off in time. Now I can't find it online.


I vaguely remember a project where somebody made mini album covers on card stock and attached an NFC sticker to each. They would pick an album, set it on a reader, and it would start playing.



I’ve digitized every record I own and have “cut” like 2/3 into WAV, FLAC, and mp3 “sides” to listen to. A digital jukebox as you described sounds like a fun long term project…


In a sense, laws around “format shifting” can make room for that to happen.

As long as someone is not "distributing" (i.e. uploading to any other user, so torrents are off the table), an argument can be made that the user is format shifting, using the internet as the format-shifting tool.

Even though the US (and now other countries) are a bit of a nightmare for fans of copyrighted works, I always thought we'd end up with more tools that "read" a physical item and then download a bit-perfect copy.


New vinyl records quite often come with single-use download codes.


iTunes Match?


>high quality rips

I cannot unhear the Flintstones now...


X Lossless Decoder is a good alternative for the Mac – and perhaps the only app that runs natively on PowerPC, Intel and ARM!

https://tmkk.undo.jp/xld/index_e.html


So many reasons to love this app :)


Sadly this omits the metric I'm most interested in, which is speed. I've noticed that, with the same rip settings, some drives are a lot faster than others while still producing bit-perfect rips. The biggest difference was between a slim, new consumer drive and an old, late-90s IDE drive. The latter was significantly faster, and had far better error correction too!

In general, error correction adds a lot of time. My sense is it will re-read the damaged sectors several times and take the "most common" reading. Then, it compares that with the external full-track hashes to establish correctness. So, it has to do a lot of seeking to read and re-read a damaged portion, which can take minutes for "secure" rips in EAC.
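For illustration, a minimal Python sketch of that re-read-and-vote idea (read_sector is a hypothetical stand-in for the drive's raw sector read):

  from collections import Counter

  def read_sector(lba: int) -> bytes:
      # Hypothetical raw read of one audio sector; in reality this is the drive/OS call.
      raise NotImplementedError

  def vote_sector(lba: int, attempts: int = 8) -> bytes:
      # Re-read the same sector several times and keep the most common result.
      reads = [read_sector(lba) for _ in range(attempts)]
      best, count = Counter(reads).most_common(1)[0]
      if count <= attempts // 2:
          raise IOError(f"no stable read for sector {lba}")
      return best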


Some drives (for example those using Pioneer PureRead) will actually vary parameters like laser power and angle between re-reads, so it's possible that a damaged sector gets read properly after some retries.


I usually rely on cdparanoia, which claims accuracy, but not speed.

https://xiph.org/paranoia/


Can anyone comment on why the author suggests to "DISABLE C2 error correction" beyond his assertion that most drives just don't do it well? Does it impede the process and/or actually produce inferior rips? Or is it just that in the vast majority of ripping outcomes, C2 error correction makes little to no difference (and might perhaps even cause issues)?


When ripping with EAC, you want EAC to be doing the error detection/correction, not C2. EAC tries to read the data multiple times and takes the best results. C2 will hide issues in the raw data.


C2 is lossless if the error correction (Reed-Solomon) can reconstruct the data. It interpolates when it can't (therefore lossy).

The gist is that you need to clean your CDs meticulously before ripping them!! Also, clean the CD-ROM player's lens using a cotton swab and lens fluid if you can (wrap the cotton swab with non-abrasive lens cleaning tissue).

If you do that, there will be no audible difference between the ripped version and the data on the original CD (provided you don't have some setting turned on in your ripping software that does some kind of processing of the data). You don't need special software to get a clean rip. I've been doing it this way for decades.


How do you know if it interpolated or not?


You don't. There's no way to tell from the data stream.

The signal processing ICs do set flags on their pins to indicate whether interpolation was done, but this isn't handed to the device driver AFAIK, and certainly not passed on to the client software reading the data.


Exactly, that's why we use special software.


ah this makes sense, thank you.


According to wikipedia, C2 error correction is lossy on VCD and CDDA, but not on data CDs.


Why bother? If you make a CD drive to go into a computer (and not a stereo cabinet), why bother to have a lossy error correction method, when you must have a lossless one sitting right beside it?


C2 is error detection, not correction. C1 is the error correction. I think what wikipedia is trying to say is that the C2 error detection just points out something is wrong, even after the C1 error correction, and so you can't fix it. But a data CD has additional error correction, so it can correct more errors.


> but not on data CDs

Except when intentional errors are used for copy protection.


It is possible to make a CD that has a DTS data stream on it instead of a PCM stream, and most home theatre receivers will process it as a 5.1 multichannel signal if you connect the SPDIF output of a CD player to the receiver.

https://en.wikipedia.org/wiki/5.1_Music_Disc

I have a 300 disc Sony changer that I am filling up with DTS discs.

The normal interpolation algorithm in CD players is not going to work right for a DTS stream, so the quality of the stream is excellent.

This guy dumped the digital output of two identical CDs into a digital recorder and found the output was bit-by-bit identical

https://www.youtube.com/watch?v=f-QxLAxwxkM

For the life of me I can’t figure out why Techmoan hasn’t demoed a DTS cd on his show.


Cool.


Could someone give a brief explanation of why this (bitwise accuracy) is such a problem when reading CDDA discs while it's not a problem at all for regular data discs?


The problem is that there's several layers of encoding and encapsulation, each with their own type of error correction and bits of extra information, and the system was designed to hide all the layers from the end user; in its normal mode of operation, the final output of the CD drive is audio (digital or analogue), not the contents of an “audio file” or anything like that. There's also the fact that audio CD drives aren't designed for truly random access (jumping to different tracks is approximate IIRC). So it's a problem to dump because there's many options for which layer to dump, and it depends on the drive which layer you can get to, and whether you can get it without error correction already applied.

For data discs (or more accurately, the data portion of a disc, because some discs have other types of content on them too), the drive essentially functions as a standard random-access block device, just like a hard drive or floppy disk. The error correction etc is still hidden from you, but as long as your entire disc is just data, there's an easy, obvious thing to dump: the bytes of that block device.


In addition to what the sibling says, CDDA discs have less error correction than CDROM discs. Each audio sector (frame) carries 2352 bytes of audio data, while a data sector carries just 2048 bytes of data; a different coding is used to reduce bit errors, at the expense of storage capacity.

On top of that, the audio extraction mode (in the drive) is usually optimized for constant-time delivery rather than accuracy; if an audio sector is damaged, most players will opt to deliver inaccurate data rather than stop the audio stream altogether.


An audio CD has 2352 audio bytes per sector. The sector also contains C1 error correction and C2 error detection.

On a data CD, those 2352 bytes are split in 2048 data bytes, plus an additional 4 error detection, 276 error correction, plus some other bytes including an address. So there is an extra layer of error correction.


If you get an error-corrected rip, why would that not be perfect? I guess I'm failing to understand how any of this could improve accuracy in any way.


Error-correction in audio CDs can correct errors up to a specific threshold of corruption.

Additionally, there is no way to know that you've passed that threshold -- the drive will just return bits, with no way to guarantee that they're correct.

A better description for this is "passive" error-correction. To try to get a 100% bit-perfect rip, it's preferable to start off with a drive that you know minimizes errors in the initial read process.


Audio CDs also have significantly weaker error correction than CD-ROM.


Ah, so audio CDs were engineered to provide best effort data correction instead of dead-sure data correction to serve the audio market instead of the data market.

That just goes to show the lack of design foresight. They were still thinking in terms of grooves carved in plastic or a waveform on tape.


There's no such thing as dead sure error correction, and there cannot be.


You could include a hash of the correct data, which would tell you in the (cosmically) vast majority of cases when your data has been corrupted.


Indeed.

That is, however, error detection and not error correction.

Once an error is detected, you can apply 'correction' to the nearest probably-correct value... which only 'works' for errors under a threshold.

Hence the assertion in the comment above yours.


How many hashes will you include? A single one for the whole CD? That tells you if you have an error, but not where.

A 32-byte hash for every 32 bytes? Then you double your storage requirements.

There are actually WAY better algorithms for this than a hash (a hash is meant to handle secrets and cryptography - it's the wrong tool for this purpose).

ECC codes can tell you if the data is bad - you can choose how many bytes each code covers - and they can also fix bad data; again, you can choose how many errors you want to be able to correct.

If you are curious: https://en.wikipedia.org/wiki/Error_correction_code
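To make the detect-vs-correct distinction concrete, here's a minimal Hamming(7,4) sketch in Python. It is not what CDs use (they use the much stronger cross-interleaved Reed-Solomon code), but it shows how redundancy both locates and fixes a single bad bit:

  def hamming74_encode(d):
      # d is a list of 4 data bits; returns 7 bits with parity at positions 1, 2, 4.
      p1 = d[0] ^ d[1] ^ d[3]
      p2 = d[0] ^ d[2] ^ d[3]
      p3 = d[1] ^ d[2] ^ d[3]
      return [p1, p2, d[0], p3, d[1], d[2], d[3]]

  def hamming74_decode(c):
      # Corrects any single flipped bit in the 7-bit codeword c.
      s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # parity check over positions 1,3,5,7
      s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # parity check over positions 2,3,6,7
      s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # parity check over positions 4,5,6,7
      syndrome = s1 + 2 * s2 + 4 * s3  # 0 = clean, otherwise 1-based error position
      if syndrome:
          c = c[:]
          c[syndrome - 1] ^= 1
      return [c[2], c[4], c[5], c[6]]  # recovered data bits

  code = hamming74_encode([1, 0, 1, 1])
  code[4] ^= 1                          # flip one bit in "transmission"
  print(hamming74_decode(code))         # [1, 0, 1, 1] - detected and corrected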


For ripping, would it matter? It's probably not a feature that was needed back when audio CDs were designed, but in retrospect, getting a notification like "a bit error was detected when ripping; not sure where, but now you know" would help preservers - they could then find another copy of the CD and get a 100% accurate digital copy of the song (within the limits of hashing algorithms, of course).


> How many hashes will you include? A single one for the whole CD? That tells you if you have an error, but not where.

Interesting! At what point, with today's high-capacity discs (DVD etc.), can we just store the data three times and take the value on which at least two copies agree?

If the only thing we really want is "high audio accuracy"?


> a hash is meant to handle secrets and cryptography - it's the wrong thing for this purpose

The rest of your comment is good, but I wanted to quibble with this: hashing is useful for a very wide range of things and is one of the main building blocks of modern algorithms (hash tables being the best known).



Oh, and another limitation in that paper: to get a positive zero-error rate, there have to be input symbols in the channel that can never map to the same output symbol, which is not the case for any binary channel in use that I have ever seen (my PhD is in error-correcting codes).

So if noise in the channel could turn a 0 to a 1, or a 1 to a 0, then there is never a positive zero-error rate.

This is explained in the paper, page 9, second column.

So this is an interesting math question for channels that don't occur in practice, and has resulted (as far as I can tell) in no working codes usable even in labs. It certainly does not apply to transmitting binary data over any noisy channels or media, which is where most if not all error correction codes are actually used.


The zero error capacity is defined for infinite length codes, none of which are usable in practice.

Another way to see it - care to list any error code used anywhere with zero error rate?

Read the paper. Shannon wrote about this in his 1956 paper and subsequent work analysed it further, but it's theoretical, and probably not practical for any real-world error codes.


I think GP means perfect or nothing - it errors out if it's not able to get a valid read.

Regular error correction + checksum is a pretty solid implementation of that.


You can however make it astronomically improbable.


Which is not what the OP asked for. You simply reintroduced error correction, and you cannot surpass the noisy channel coding theorem bounds.



Checksums don't provide error correction, nor are they without errors. And they cannot be - it's a mathematical fact, hence my post.


No, but it shows you whether your copy is 100% accurate or not. Which proves that a 100% identical copy is possible.


No, it most certainly does not.

An N-bit checksum can at most distinguish 2^N different bitstreams as "100% accurate", and that is assuming there are no transmission errors in transmitting the checksum itself. This follows trivially from the pigeonhole principle.

And for most media, the checksum is itself transmitted over a noisy channel like the rest of the data.

Read your link carefully.


If they had foresight, they wouldn’t have made CDs at all. They do not want people ripping them and getting “perfect” copies.


What do you suppose dead-sure data correction would look like?

My guess is that you couldn't fit a lot of information on it, what with the redundancy to make it perfect.


With coding, dead-sure correctness isn't usually much different in capacity -- you usually don't need to sacrifice bandwidth completely to get reliability. In fact, a simple checksum with a good number of bits (e.g. CRC32) will pretty much give dead-sure accuracy against random errors (for adversarial ones, you can use a cryptographic hash such as SHA-256). If errors were evenly distributed (independent and identically distributed bit flips), it's generally very easy and well known how to obtain dead-sure error correction, and the bandwidth you get is known as the channel capacity (C) of the medium (usually in bits/bit or bits/symbol) -- a "good code" operating at a rate R < C (under capacity) will give you this property; good codes, depending on the error rates, require moderately large block sizes (usually from 8 bits to 1-2 kbit in practice).

Real media like CDs usually have 'burst errors' like scratches that give a whole bunch of successive bit errors; however, there's an interleaving process to "spread" the errors across blocks, i.e. spread the information so it will resist a scratch. In an absolute way, indeed it's impossible to guarantee almost no errors (although you can guarantee almost no undetected errors) -- simply because your CD might be ruined in a way (which I guess isn't all too unlikely). By setting your rate R to a reasonable level above your drive error rates, you can get almost no errors (as few as you like) within those bounds (and fail above).

Here's a typical error rate curve for a moderately large (255b) code: https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_cor...


3 copies of each CD, stored in different locations, inserted into different players, interconnected via ...


> That just goes to show the lack of design foresight.

You have a skewed perspective from growing up with ubiquitous computing and cheap ICs. This was very much not the case in the late 70s when CDs were designed. Trying to simultaneously design for high-fidelity audio and a then completely theoretical demand for data storage would have made no sense and would have been unjustifiable scope creep. The spiral groove of a CD is objectively better for a low-cost linear playback device - even though it is obviously worse for random data access (though modern DSP mitigated this 15 years later). Engineering is about trade-offs.

There's no binary best-effort vs. dead-sure - it's a matter of degree. Contrary to the mistaken GP post, CDDA and CD-ROM both use RS error detection and correction; the former just has less of it. A CDDA (Red Book) disc can store about 15% more audio than would be possible with the equivalent WAVE files on a CD-ROM - at the expense of some redundancy - but it still has very robust error correction; otherwise every little tiny piece of dust would cause a skip. And for audio playback, trying to fill in a best guess is the right thing to do, rather than just making the thing spit the disc out with a failure.


> Additionally, there is no way to know that you've passed that threshold

This is not quite right. CDs use a forward error-correcting code, and of course that means you can detect errors (how can one correct errors if one can't detect them?). The CD drive most definitely can distinguish between an error-free signal and a marginal one - both at the analog level and the digital level.

Most CD drives' firmware is a bit inconsistent in its ability to correctly provide error correction information, but this is not a fundamental issue with the technology and far from "no way to know" - relying on the drive's error detection facilities is what the C2 error detection option in EAC does, for instance. There are some drives that do well with this.

https://wiki.hydrogenaud.io/index.php?title=EAC_Drive_Option...


Thank you for posting this! I still have hundreds of CDs that I'd like to rip, but I don't have any device to connect my ancient Plextor SCSI CD drive to. I didn't want to buy just any CD drive because I want to rip these CDs as accurately as possible. One of the top contestants (HP GUE1N) is available for cheap on eBay, so I went and bought one. So I guess I'll do a lot of disc changing during the long winter evenings to come, but then I'll finally be able to throw out all these CDs.


> I didn't want to buy just any CD drive because I want to rip these CDs as accurately as possible.

Ripping software like EAC or whipper verifies, using an online DB of rips, that your rip is 100% bit-perfect. So ripping "as accurately as possible" ain't a concern: your rip is either 100% correct, bit for bit, or it's not and you should start over.

If you're on Windows, use EAC; if you're on Linux, use whipper. I use whipper and it's nowadays stock on many Linux distros (Fedora, Debian, etc.).

> but then I'll finally be able to throw out all these CDs.

Well I think that legally you need to keep the CDs if you want to keep your rips (it may depend on the country but in Europe you're allowed to rip CDs you physically own).


I set up a system with two SCSI drives and one USB to mass rip a few hundred discs. Modern Linux can't handle media changes over SCSI without interfering with an existing transfer so I had to synchronize disc swaps on those two drives. I wouldn't attempt it with only one drive.


I just want to add that you don't need special ripping software.

Just clean your CDs carefully and meticulously before ripping. If possible, also clean the CD-ROM player's lens using a cotton swab wrapped in non-abrasive lens tissue (available at photographic stores) and dipped in lens cleaning fluid.

Check that your ripping software doesn't do any other processing. For example, the 'freac' ripping software has an "Enable Signal Processing" menu item which should be turned off.


The accuracy figures in the article are kinda low - 99.7 for the top one, averaging out at 98.5%.

Does anyone have some idea what this is measuring?

Bit accuracy (way too low)? A full disc read 100% error-free? The actual accuracy really depends on the quality of the disc - is this measured with a reference disc?


It's a shit way of expressing the data they collect. It means every single drive model had at least one owner try to rip a badly scratched disc.


How many of the ripped discs were 100% error-free. From what I understand, it compares against a database of checksums; the one that has the most 'votes' is considered to be correct.


This is based on submissions to AccurateRip. As I understand it, it's how many tracks submitted by users owning that drive match what AccurateRip considers the correct rip.


> As I understand it, it's how many tracks submitted by users owning that drive match what AccurateRip considers the correct rip.

Wait: I'm pretty sure that's not how it works at all. No matter the drive, there's only one correct rip. And two people won't, by coincidence, happen to read the same disc with the exact same error(s).

So when you rip and compare to the AccurateRip DB, you know you have the correct rip if it matches the one already in the DB.

I don't think AccurateRip merely "considers" rips correct: to me, it knows with a 100% guarantee. Which is the entire selling point of the AccurateRip DB, and why it's called that.


I don't get it. It's not analog data. So how can the quality of digital data differ unless you have a deteriorated disc and simply data loss? Is this about which hardware recovers data from defective discs better?


There is no "digital data" in physical reality.

There is "analog data that is within a threshold of the bit that we decided".

A hard drive or CD doesn't read "1" or "0"; it reads, say (after normalization), 0.823 V. Or 0.654 V. Or 0.983 V. The way the circuit is designed decides that above some value it's "1" and below that it's "0", but physical deterioration and material variance will change that value, and it will differ bit by bit too.

It's all good when the signal is strong and, say, all the 1s are > 0.8 and all the 0s are < 0.2, but when it starts to blur (again: bad media, degradation, etc.) you get bits that can be 1 on one read and 0 on another. And having a "tri-state" input that can say "OK, it's between 0.4 and 0.6, we can't tell whether that's 0 or 1" is generally rare.
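A toy Python sketch of that thresholding idea (made-up levels and margins; a real CD pickup decodes pit/land transitions via EFM, not one voltage per bit):

  THRESHOLD = 0.5
  MARGIN = 0.1  # hypothetical "can't really tell" zone around the threshold

  def to_bit(level: float) -> int:
      # Classify one normalized read level; marginal levels are the bits that
      # flip between reads of a deteriorating disc.
      if abs(level - THRESHOLD) < MARGIN:
          print(f"marginal read at {level:.3f} - could come back different next time")
      return 1 if level > THRESHOLD else 0

  print([to_bit(v) for v in (0.823, 0.654, 0.183, 0.512)])  # last one is marginal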


The data is digital but the reading is actually somewhat error-prone; even when the disc and drive are perfect, occasional correctable errors will appear. Also, in audio mode, the error correction is weaker than in data mode.


So it means some percentage of data can end up being incorrect depending on the quality of the drive?


Sort of. If you move the same disc between different drives, it's possible to get different readings on each of them. Red Book audio is sampled at a fairly high rate with two channels, so a few corrupted samples are likely not noticeable. Its error correction is therefore best-effort and interpolates between the previous and later samples. Two different drives, with different read mechanisms and implementations of the Red Book standard, might produce different samples when reading an error, depending on the nature of the error.
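A minimal Python sketch of that kind of concealment - linear interpolation across samples flagged as unrecoverable (the bad-sample indices here are hypothetical inputs; a real player gets them from the error-correction stage inside the drive):

  def conceal(samples, bad_indices):
      # Replace unrecoverable samples with a linear blend of their good neighbours.
      out = list(samples)
      for i in sorted(bad_indices):
          prev_i = next(j for j in range(i - 1, -1, -1) if j not in bad_indices)
          next_i = next(j for j in range(i + 1, len(out)) if j not in bad_indices)
          t = (i - prev_i) / (next_i - prev_i)
          out[i] = round(out[prev_i] * (1 - t) + out[next_i] * t)
      return out

  print(conceal([100, 120, 0, 0, 180, 200], bad_indices={2, 3}))
  # -> [100, 120, 140, 160, 180, 200]: plausible audio, but not the original bits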


If your disc is marginal, some data can end up being incorrect.

Depending upon the quality of the drive mechanism itself, and how it reports errors to the ripping software, you can end up with incorrect data more or less often.


Wow, I had no idea EAC still existed. Back when I used it, hard drive space was scarce and I thought I was being extravagant ripping my collection to 160 kbps MP3 files.

Does EAC today offer any advantage over cdparanoia?


My audio quality journey went from:

- 192 kbps CBR mp3 - this was supposed to be good enough (it is) and I had an iPod with only 20 GB so I went with it

- V0 VBR mp3 - I told myself I could hear the difference on my now slightly fancier headphones, and this would not kill my OiNK and later What.cd ratio as fast as lossless

- FLAC, if possible 192 kHz and 24-bit - now I got into really expensive headphones and I only downloaded the most verified of torrents (if possible vinyl rips) or I ripped using all the EAC/Paranoia guides out there. Played music via Foobar2000 with some plugins to minimize "interference" in the audio path. Had a full DAC + amp setup

- Got a job and kids

- 256 kbps VBR AAC, via Bluetooth to my AirPods Pro - I don’t have time anymore to fret about audio quality, streaming is so super convenient and let’s be honest, above a certain baseline no one hears the difference anyway…


Hearing begins to deteriorate in the late 20s / early 30s, so for better or worse, audiophilia is a young person's game.


That's the funny thing - young people can hear better, but most don't have enough money to buy the really fancy stuff until they get older and their hearing deteriorates.


Ha, I like the irony of being picky about the bits that came off the CD and then immediately mashing them into a medium-bitrate MP3. I'm not judging; I spent more time curating a WMA collection than I care to admit.


I keep 3 copies of every CD I own (4 if you count the CD itself). A FLAC copy ripped using EAC that lives on my NAS and gets streamed in my home. An MP3 copy at 192 kbps for use in our cars. And a 128 kbps Opus file that lives on my phone.

It's a pretty easy workflow. EAC to rip. FlacSquisher to convert to MP3 and OPUS. Then copy the files where they need to go and delete them from my laptop. CD then goes into a bin in the garage until I need it again or want to admire the artwork.
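For anyone without FlacSquisher, a rough Python sketch of the same convert step using ffmpeg instead (a swapped-in tool, not what the parent uses; the paths are placeholders):

  import subprocess
  from pathlib import Path

  def transcode(flac_dir: str, out_dir: str) -> None:
      # Mirror a folder of FLACs as 192 kbps MP3 and 128 kbps Opus copies.
      for flac in Path(flac_dir).rglob("*.flac"):
          for ext, args in (("mp3", ["-codec:a", "libmp3lame", "-b:a", "192k"]),
                            ("opus", ["-codec:a", "libopus", "-b:a", "128k"])):
              target = Path(out_dir) / ext / flac.with_suffix(f".{ext}").name
              if target.exists():
                  continue  # already converted on a previous run
              target.parent.mkdir(parents=True, exist_ok=True)
              subprocess.run(["ffmpeg", "-i", str(flac), *args, str(target)], check=True)

  # transcode("/nas/music/flac", "/nas/music/lossy")  # hypothetical paths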


Exact Audio Copy is a marvel of ripping technology. I would 100% use it if I cared about ripping CDs accurately.

Ages ago I made custom CD-ROM drivers that would purposefully randomize the data returned from reads (CD copy protection). EAC would rip right through it without skipping a beat and reconstruct the correct audio. The only way to defeat it was to never return all of the data, randomized or not.


Back when I was ripping my collection to FLAC, I used EAC because of AccurateRip, which is something that cdparanoia did not offer - at least at that time.


No mention of abcde?

I'd like to know how it's seen in the ripping community: https://wiki.ubuntuusers.de/abcde/


thanks, I'll add that later.


The much-vaunted Plextor drives aren't as good as I thought they were...


They're only listing new drives. Plextor hasn't made their own transports for a long time.


The source doesn't mention anything about that - it merely reflects popular submissions by a self-selected pool.


It's amazing what bad interpretation of statistical data can make you think.


That's something that surprised me, too.


I don't get it. A CD is digital - isn't zero always zero and one always one? Are there no checksums or other methods that ensure the data is read correctly?


I suggest you read adql's response below. Also copied here just in case:

"There is no "digital data" in physical reality.

There is "analog data that is within a threshold of the bit that we decided".

A hard drive or CD doesn't read "1" or "0"; it reads, say (after normalization), 0.823 V. Or 0.654 V. Or 0.983 V. The way the circuit is designed decides that above some value it's "1" and below that it's "0", but physical deterioration and material variance will change that value, and it will differ bit by bit too.

It's all good when the signal is strong and, say, all the 1s are > 0.8 and all the 0s are < 0.2, but when it starts to blur (again: bad media, degradation, etc.) you get bits that can be 1 on one read and 0 on another. And having a "tri-state" input that can say "OK, it's between 0.4 and 0.6, we can't tell whether that's 0 or 1" is generally rare."


I've been using cyanrip and it's been working nicely. It has excellent defaults: just run cyanrip from the CLI with a CD in and it rips to FLAC in the current dir with the most sensible/reasonable defaults (at least its opinions have worked for me).

https://github.com/cyanreg/cyanrip


thanks, will add this later


Is bit-perfect reading necessary? There are so many samples per second, 44,100 to be precise, that one bit error once in a while cannot be noticed.

And it's not like audio is actually high fidelity. Any DAC will introduce a lot of errors. And the original recordings are not perfect either. Microphones, loudspeakers, amplifiers - that's a lot more error than one wrong bit, too.


I imagine that a bit flip in the most significant bit could cause an unpleasant pop.


I'm not sure a loud speaker can produce a pop so quickly.



So I was wrong, it’s very loud.


A bit flip in the most significant bit is a one-cycle 22.05 kHz pulse - so yes, you almost certainly wouldn't hear it even if the speaker could reproduce it.


You would totally hear it, as a small click. I had a bug in some audio software I was working with that did almost exactly this, and it sounded terrible.


I mean, this is mostly nonsense, right?

The data is digitally encoded 16-bit PCM and already has built-in CIRC error correction; there is no "quality" or "accuracy". Your software/hardware either correctly ripped the data or it didn't, and this can be trivially verified with a CRC32 against CUETools or AccurateRip or whatever.
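A rough Python sketch of that kind of check - a CRC32 over the ripped track data, compared against a trusted reference (note the real AccurateRip/CTDB checksums are their own offset-aware algorithms, so this only illustrates the general idea):

  import zlib

  def track_crc32(path: str) -> int:
      # CRC32 over the raw track file, computed in chunks.
      crc = 0
      with open(path, "rb") as f:
          while chunk := f.read(1 << 20):
              crc = zlib.crc32(chunk, crc)
      return crc

  # reference = 0x1234ABCD  # hypothetical value from a rip log or database
  # print(track_crc32("01 - Track.wav") == reference)  # hypothetical file name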


That's generally how the flowchart works in the rip process: if you feel reasonably confident that you've correctly identified your release (specifically the correct pressing), you can burst-rip the disc and then compare the ripped tracks to other rips from the same pressing (see also AccurateRip). If you get the same hash as other people ripping the same pressing, then you're done.

If your pressing doesn't have a good sample size of hashes to compare to, then you use a more secure ripping method that generally makes use of the accurate stream feature (most drives manufactured in the last decade or so have it); it's a slower but more repeatable process of reading a Red Book CD, intended to suppress jitter and alignment errors. If you get through an accurate-stream rip and encounter no read/C2 errors you're probably OK, but if you really have no or few samples to compare against, you may be paranoid and re-rip the disc a few times to make sure you get repeatable results, or even rip the disc with another drive altogether.
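A loose Python sketch of that decision flow (burst_rip, secure_rip and accuraterip_matches are hypothetical callables standing in for whatever your tooling provides):

  def rip_track(burst_rip, secure_rip, accuraterip_matches, min_confidence: int = 2):
      # 1. Fast burst rip, then check against other people's rips of the same pressing.
      data = burst_rip()
      if accuraterip_matches(data) >= min_confidence:
          return data                     # enough independent agreement: done

      # 2. Otherwise fall back to a slower, repeatable secure rip and re-check.
      data = secure_rip()
      if accuraterip_matches(data) >= min_confidence:
          return data

      # 3. Still unsure: re-rip, ideally on a different drive, until results repeat.
      raise RuntimeError("no confident rip; retry or try another drive")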


I can respect the paranoia around ensuring bit-accurate rips of CDs with little if any existing metadata on the web.

I remain suspicious of the cargo cult of EAC as doing something substantially different from cdrtools or anything else that knows the ins and outs of the Red Book format.


A lot of the emphasis on EAC is that robust automatic logfile validation tools exist for it (from what I can tell?), so it requires less manual inspection for archival. But I agree that other tools are likely equally capable.


Completely understandable; EAC is certainly not the only software capable of using accurate-stream features on CD drives or doing AccurateRip DB comparisons. Making bit-perfect archival rips is more about the soundness of "the process" than the software you do it with; EAC is just one such tool.


I was actually wondering if there is a database on the web with the SHA-256 hashes of all the ripped CD tracks ever released.

That would be useful.


Mostly, but not completely. CIRC error correction in the Red Book standard is deliberately designed with a soft failure mode for unrecoverable data - specifically, it will "interpolate" audio data to fill in the gap.

If you have a damaged disc, a drive can be better or worse at resolving "marginal" parts of the disc, which will change the data. Whether or not this makes an audible difference is an exercise for the reader, of course.

CD-ROM doesn't have this problem because it introduced a different error correction mechanism with more overhead. So in that case it is "either it ripped correctly or it didn't"... at least until you start getting into mixed-mode discs, or games that stored their FMVs with the weaker error correction mode to save space, or literally anything with copy protection.


> CD-ROM doesn't have this problem because it introduced a different error correction mechanism with more overhead.

It's been a while since I've looked at the standards in detail, but I believe that involves another layer on top of the layer of error correction that CD-Audio has.


Evidently not, as seen by the linked post: https://forum.dbpoweramp.com/showthread.php?48320-CD-Drive-A...

The data seems to be based on telemetry collected by CD ripping software, which I'm slightly more inclined to believe is real than some mumbo jumbo about vinyl warmth or whatever.


So look at how the data is broken down:

> Drive: HL-DT-ST - BD-RE BH14NS48 (116 users): Submissions: 10260 accurate, 23 inaccurate, 99.7763 % accuracy

23 of these submissions come from a broken drive, or from a disc damaged to the point that a correct read cannot be made. It's not that the drive is producing 99.7% accurate data; there are 10,260 100% accurate submissions and 23 wrong ones.

It's a binary state: either the drive is functional and the CD is in decent condition, or the drive is broken / the CD is damaged. Some drives may have a higher failure rate than others, but every drive is either 100% accurate or busted; there is no 95% accurate drive. That's just a broken drive.


> It's a binary state, either the drive is functional and the CD in decent condition

No. How good the drive is at reading marginal discs varies a lot between drives.

The problem is, these populations are not identical discs, and the error rates are low enough that we're not really sure we're seeing a good metric of this.


It's definitely unclear what "inaccurate" means in this sense.

Is an entire track inaccurate if a single sample is inaccurate? In that case, then yeah, rather than measuring some level of audio purity, this is measuring your likelihood of getting a perfect rip before taking CD condition into account.


I'm inclined to believe this is all nonsense. Did anyone ever hear a difference, or compare hashes and sometimes get an error and sometimes not?


I did it with a simple program that just issued READ CD commands to the drive using SCSI passthrough and saved the first ten minutes of the disc to a raw PCM file. Running that program twice produced two files that sounded the same to me, but had different hashes (and it wasn't a "sample offset" issue either.) The disc had no visible scratches, and it was the same drive that I'd ripped half my music collection from with EAC, so that experiment was solid evidence for me of what the proponents of secure ripping software like EAC keep saying, which is that read errors when burst-ripping on consumer drives are very common.
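For anyone who wants to repeat that experiment at the file level, a small Python sketch comparing two raw dumps by hash (file names are hypothetical; producing the dumps still needs a ripper or a SCSI passthrough tool as described above):

  import hashlib

  def file_digest(path: str) -> str:
      h = hashlib.sha256()
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(1 << 20), b""):
              h.update(chunk)
      return h.hexdigest()

  a, b = file_digest("rip_pass1.pcm"), file_digest("rip_pass2.pcm")
  print("identical" if a == b else f"rips differ:\n  {a}\n  {b}")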


Author here, I did (personally).

  CD: The Weeknd - The Highlights
  Drive: Pioneer BDR-XD05TB (not listed in my article)

AccurateRip reported errors on 3 tracks. Ripping with the hp GUE1N worked flawlessly on the first try. Although I doubt the error would be audible, I have a better feeling when the hashes are correct.


Thanks. Exactly my thought, it’s data with a checksum at this point. You can confirm the copy.


Unfortunately, not all CD drives report C1/C2 errors correctly for audio data from a Red Book CD, which is why you have to do external comparisons (see also AccurateRip). Some discs have fake C2 data as a means of copy protection, or sometimes the C2 data is just unreliable.


The problem: some music isn't even released on CD anymore.


Shout out to OiNK, where I first learned about proper rips.


> NEVER normalize (!!)

Why not? Shouldn't it be non-destructive?


Suppose you have an unsigned 16-bit integer signal of [0, 1, 3, 2000]. You want to multiply every value by some scaling constant so that the maximum value is 65535.

So the multiplier is 65535/2000, and the intermediate result is [0, 32.7675, 98.3025, 65535]. But because the output must be integers, the decimals will get rounded off.

It is possible to restore the original numbers exactly if you know the scaling factor, but it's an extra complication that is not worth the hassle. It's better not to burn the normalization into the CD rip; it's better to do normalization on the fly at playback time.
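A quick Python check of that example, showing how the rounding gets baked into the stored samples:

  signal = [0, 1, 3, 2000]
  scale = 65535 / max(signal)                     # 32.7675
  exact = [x * scale for x in signal]             # [0.0, 32.7675, 98.3025, 65535.0]
  stored = [round(v) for v in exact]              # [0, 33, 98, 65535] - integers only
  error = [s - e for s, e in zip(stored, exact)]  # rounding error kept in the stored rip
  print(stored, error)
  # Undoing this later requires knowing the exact scale factor, and the rounding error
  # stays in the file either way - hence "normalize at playback time, not in the rip".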


No, it recomputes the samples and there's always some loss during this process.


I realized that I had normalization set to 98%. I think that's the only setting I changed after going through all of the menus. I'm not re-ripping hundreds of CDs at this point, so I guess I'll have to live with almost accurate. :)



