Major book publishers sue Amazon’s Audible over new speech-to-text feature (theverge.com)
256 points by bookofjoe on Aug 26, 2019 | 357 comments



"...devalue the market for cross-format products, and harm Publishers, authors, and the consumers who enjoy and rely on books."

Could someone explain the licensing like I'm 5, in a way that doesn't end up with "because publishers want more money"?

When I buy a book I am buying the story. The author's words on paper (well on digital ink). When I buy an audio book I am buying the author's words performed by a voice actor.

I get why the audiobook is more. I'm paying for the author's story + actor. But I never understood why once I have the audiobook I could not be allowed the text version for free.

How are they doing anything but trying to charge me twice for the same story?


> that doesn't end up with "because publishers want more money"

Well no, because publishers (and authors) do want more money.

> When I buy a book I am buying the story.

This is where your analysis goes wrong.

> But I never understood why once I have the audiobook I could not be allowed the text version for free.

For the same reason you're not entitled to a free (or even discounted) paperback if you buy the hardback, or why I can't return the paperback I dropped in the bathtub for a new copy. You're not licensing the story, you're buying a specific implementation of that story. Furthermore, different companies have bought the rights to different versions of that story in different mediums, and they all want to earn their money back.


> For the same reason you're not entitled to a free (or even discounted) paperback if you buy the hardback, or why I can't return the paperback I dropped in the bathtub for a new copy.

I think I need something better than that. You are talking about 2 physical items. I don't think anyone would argue that buying 1 product off the shelf entitles you to the rest of the shelf.

An audio book requires author, editor, publisher, actor, distributor

A digital book requires the same minus actor

Whereas a physical book requires the same + printer, binder, possibly artists, typesetter, shipper

In this case the audiobook and digital book have the same distributor, Amazon

Say I want to Whispersync The Way of Kings. I already own the audiobook but now I have to buy the Kindle version for full price, which is $9.99 (I double checked in a private tab to make sure Amazon wasn't giving any discounts)

> Furthermore different companies have bought the rights to different versions of that story in different mediums, and they all want to earn their money back.

How am I not paying for the same thing twice? If there is some other entity involved I wouldn't mind paying a small difference for their service to bring it to that medium.

Why is it people get paid twice for contributing nothing to bring it to another medium?


Your question is a reasonable one, but you're misunderstanding the bigger picture. The bigger picture is that you are offered something specific for a specified price. You can accept the deal, reject it, or make a counter offer.

What you can't do is declare that your payment means that you actually bought the rights to the text of the book or anything else you unilaterally and post facto include in the purchase for philosophical reasons, when those things weren't included in the offer that you accepted. What was offered, and you accepted, was a license to unlimited personal listening to an audio performance. If you want something else, make an offer or wait for them to make one.

It's not quite as simple as this because governments often want to get into the deal and declare that certain offers aren't what the offerer offers but what the lawmakers think he should have offered, but absent this intercession, you were offered an audio performance for a certain price, and if you accepted, that's what you got.


Except that the courts have said many times that a seller doesn't have carte blanche to add whatever restrictions they want to a purchase. The First Sale doctrine, for example, means you can't stop someone from reselling their purchased book. It is tricky for digital items, but depending on the exact details, the First Sale doctrine can sometimes apply.

Taking the legally purchased text and running it through a computerized reader might end up being legal... we need the courts to decide. It isn't enough that the person selling it doesn't want to allow it.


First Sale Doctrine applies to specific copies of creative works (i.e., a book or DVD).

It doesn't apply to licenses for creative works. When you buy an ebook, you are buying a license to create copies of the creative work as necessary to make use of the license purchased (i.e., a copy in your phone or on your kindle).

FSD doesn't let you sell either of those transient copies, and it doesn't apply to the license itself because the license isn't a copy within the meaning of the statute underlying the FSD. In the rare occasion where the license and the copy are one and the same for a digital good, the FSD would apply.


This is not a great distinction because there are government-imposed limits to the vendor-imposed limitations, even if it is a "license". FSD will hold up with physical DVDs in American court, even if the vendor says they only sold you a non-resellable license to view the DVD. The court isn't going to care that the vendor called it a license. The distinction is the jurisprudence regarding the medium, with a dash of vendor license agreement. Also, how you purchased it and what a consumer would "reasonably expect" will probably come into play: if you buy 1M old DVDs with a license not to resell that you and the vendor have physically signed, your FSD is less likely to hold up than if you bought one DVD at Walmart with an EULA.


> FSD will hold up with physical DVDs in American court, even if the vendor says they only sold you a non-resellable license to view the DVD.

Right, because that's literally how the FSD was written. The sale of a DVD of "X" necessarily includes a license to view "X". The license is subordinate to the DVD (because it is necessary for use of the purchased good), and because the FSD applies to the DVD, it also necessarily applies to the subordinate license.

> if you buy 1M old DVDs with a license not to resell that you and the vendor have physically signed, your FSD is less likely to hold up than if you bought one DVD at Walmart with an EULA.

No, the FSD still applies. You've just described Redbox, which actually fought and won this battle (about reselling used DVDs). It would be different if you bought a DVD that included a license for an online streaming version of the movie that couldn't be resold. Redbox fought and lost this battle against Disney. In a nutshell: Redbox can do whatever it wants with the physical copy of the movie, but the FSD doesn't apply to the streaming license that was bundled with the DVD and so they couldn't sell those separately from the DVDs themselves.


Yes, and I appreciate you referencing the specific cases, because my main point is that that is what matters. It is the jurisprudence that defines what limitations are allowed. Vendors will constantly claim "licenses" and "limitations" that are not supported by jurisprudence and would not likely hold up in court.


My point was just that it isn't as simple to say "the seller put this restriction on the transaction, take it or leave it"

There are limits to what they can restrict.


> There are limits to what they can restrict.

Yes, the limitations can't violate criminal or contract law (or other laws in general), and can't completely prevent the licensee from making use of the licensing rights they paid for. Otherwise, any restrictions are fair game.

If the putative licensee doesn't want to accept the restrictions in the license, their remedy is to not enter into a licensing agreement in the first place; they have no right to simply ignore the license restrictions.


Great. I accepted the audio performance and that's what I got. Can I run the audio performance through a speech to text engine?


Not a lawyer, but I imagine that you, as an end user, can do that, so long as you are not selling that text or distributing that text. But Amazon, who is licensed to provide you with an audio book but not a text book, cannot offer you a speech-to-text rendition of that audio book as part of your audiobook purchase. Nor do I imagine them being able to offer you the technology to do it on your own as a "feature", as that would be a rather hideous loophole.


I imagine that Amazon are going to argue in court that if the publishers are held to be correct, then computer monitor manufacturers are not licensed to display copies of the work and that they also should pay the publisher (again) as should people who make software that alters the performance, e.g. volume or equalizer functions.

Amazon is not fundamentally offering a textual copy of the work. If they were doing so they would be violating the license. Instead they are offering a technology to transform (the possibly flawed performance of) the work using a tool (that is itself possibly flawed). That is not the same thing as distributing a textual copy of the original work.

A computer is a device on which the work is licensed to be distributed, and it happens to have the capability to transform the work, just as a volume knob or a graphic equalizer would.

If Amazon can't offer this technology because they are licensed to distribute the work, what is to stop someone selling such technology who is not licensed to distribute the work? The effect is the same, precisely because it is a general purpose computing device with sophisticated functionality. And yet I don't see publishers getting far with a claim that all manufacturers of speech-to-text devices (think of deaf people) should pay them because the device could be used to transform their works.

Of course I know this is not how any publisher sees their work. E.g. you need a different license to perform a work than you need to modify or make a video production of a work (mechanical vs sync license, etc.)


> I imagine that Amazon are going to argue in court that if the publishers are held to be correct, then computer monitor manufacturers are not licensed to display copies of the work and that they also should pay the publisher (again) as should people who make software that alters the performance, e.g. volume or equalizer functions.

The well-established doctrine of patent/trademark/copyright exhaustion [1] prevents this undesirable result.

1: https://en.wikipedia.org/wiki/Exhaustion_doctrine_under_U.S....


To clarify, the exhaustion doctrine allows the owner of the licensed work (ebook, etc) to access/display/use the work except as disallowed by the license.

This includes viewing it on a computer monitor. It's irrelevant that the computer monitor maker does not have a license to the work, since the viewer does.

In the case of the audiobook speech-to-text argument: the licensee has a license to the audiobook but not the prose version of the work, which is a legally distinct creation under many decades of international and domestic IP law.

So Amazon or the customer needs to have the license to the prose work for speech-to-text to be defensible, but neither does.


I am not convinced by this. Amazon does not at any point convey a copy of the written work to the customer. At no point does such a copy exist, nor can it conceivably be constructed from the technology Amazon are supplying to their customers.

This is not (if I understand correctly) a PDF document they can download and read, it's a technology that allows the words that are currently being spoken to be seen in written text.

At no time does a legally significant quantity of the written work get conveyed to the customer. Amazon will just ask the publishers to show where this alleged copy of the textual work resides in their system. They won't be able to do so, as it doesn't exist there.

This does seem to be what they are in fact arguing. And I think they will prevail. I would not expect them to prevail if they offered a pdf of the book or if their software actually contained a copy of the original text that was being conveyed to the customer.


> A computer is a device on which the work is licensed to be distributed, and it happens to have the capability to transform the work, just as a volume knob or a graphic equalizer would.

That's not a fair comparison, and I'm sure you know that. But I appreciate someone playing the devil's advocate here; such exercises are always useful.

> If Amazon can't offer this technology because they are licensed to distribute the work, what is to stop someone selling such technology who is not licensed to distribute the work?

What's stopping them from selling such technology? Nothing, as far as I'm aware. It's similar to how BitTorrent clients are not themselves illegal, but users of the client can use them towards illegal ends.


If I utilize a copyrighted work in the creation of something new, depending on the level and quantity, my new work could likely be considered derivative of the original work.

I think Amazon will argue that speech-to-text isn’t “creative enough”, that it doesn’t involve humans, and that it is a purely mathematical function and thus not covered.

Most likely this will not end in Amazon’s favor.


Replying to a reply of this. What copyright exhaustion? They just keep changing the rules to make it longer without anyone's consent, and we have no say in it. The average person in the US thinks the copyright length is insanely long. Legally, if someone keeps changing the contract without the consent of all parties, shouldn't that contract be null and void?


I agree that the duration is out of control, but that's not what the exhaustion doctrine is about. It's about letting a person who has paid for a license to IP use it without having to secure licenses for all the devices/platforms they want to use it on. It doesn't solve the interminable extension problem you mentioned, but it does moot the point raised by parent's comment.


https://en.wikipedia.org/wiki/MAI_Systems_Corp._v._Peak_Comp.... sorted out most of that

and yes hilariously for a couple weeks it was illegal to run software.


> and yes hilariously for a couple weeks it was illegal to run software.

This simplification of the legal case is just as bad or worse than a movie saying they found America's launch code by hacking the dark web (in the movie's defense, they were focused on the fighting and explosions, IT accuracy was not on the top 100 items of their priority list.) It was never illegal for a licensee to run software they had licensed on their own computers.

The issue was whether an unlicensed repair/maintenance company, Peak, could run MAI's OS while repairing customer computers. The 9th Circuit said no. Congress subsequently changed the underlying statute so that repair companies could run computer software as part of their repair/maintenance duties.


Amazon probably cannot provide you the pre-transcribed text (like subtitles I guess) but I bet they could provide an Alexa-transcribed version just like Apple could provide a Siri-transcribed version. Closed captioning is permitted, so why not? :)


For your personal use? Yes as a practical matter. Probably as a legal one. The publisher/rights holder/distributor of the audio performance has no obligation to provide you with the means to do the conversion, though, and in this case, if they don't own the rights, they probably can't do so as a bonus for the initial purchase.


I wonder how many months will elapse until Barnes & Noble has a page of fine print on everything stating that you're not buying the book, you're buying a revocable, non-exclusive license to possess the book, and that this license is voided if you attempt to do anything with the book except scan the words using biological optical sensors you possessed at birth into short-term memory only, and violation of this license is subject to...


The distributor can’t distribute it as part of the offering, sure; but I don’t see why they can’t offer it as a freestanding software utility that happens to be easily accessible from their distribution platform. It’s not a “bonus for the initial purchase” in that case; it’s just something they coincidentally happen to offer, that you might happen to already have installed, or install later.


As a legal matter, maybe. You can start to get onto shaky ground when you provide something that's explicitly marketed for some infringing purpose.

I'm not sure how useful any of this is anyway though. I took another look at Amazon Transcribe a couple weeks ago and reached the conclusion that its transcriptions are still pretty bad compared to a human doing it. You get the general gist so long as it's a clean recording but there are lots of errors.

Standalone software is going to be even worse.

Imperfect captioning is better than nothing in some cases. But who wants to read a book--much less use it as a learning tool--when there are errors every other sentence?

ADDED: It really doesn't seem like much of a threat to publishers in its current form. Probably lots of errors and the text is only delivered a few lines at a time as the book is read aloud. However, I can see publishers looking down the road and imagining much improved ML transcription and larger blocks of text as something that's legally indistinguishable.


> I don’t see why they can’t offer it as a freestanding software utility that happens to be easily accessible from their distribution platform.

Really the only actual answer to that question is because a lawyer convinced a judge to say that they can't. If they can get a better lawyer and/or a different judge then maybe all of a sudden they can.


Not legally.

It's kind of like how just because you have a pair of scissors it doesn't mean you are entitled to cut people's hair (without a stylist license). Or how just because you own a scalpel you aren't allowed to practice medicine (without a medical license). The government has determined that these behaviors need to be regulated. Perhaps you agree about regulating the surgery but don't agree about regulating haircuts... that's a matter for the political process.

In a similar fashion, Congress has passed laws on copyright that specifically say that if you own a copy of a work you can do anything you want with that copy, but you cannot make a new copy in a different form. This made sense when "a copy" meant a physical book, but that law isn't necessarily worded in a way that makes sense for electronic documents. I am old enough to remember when people tried suing over web servers, because when the browser received the file it made a local "copy" of the document in memory (without a license). The courts eventually settled on an interpretation that roughly said "ephemeral copies made during the process of delivering the work don't count as 'making a copy'". But they also settled on an interpretation that said "changing the medium -- like going between text and audio -- does count as making a copy".

So text-to-speech (or speech-to-text) converters are illegal; or, more precisely, using them on any copyrighted work (which is just about every work) is illegal without the permission of the copyright owner. Maybe not the best law for today's modern world, but it's the law we have and entrenched power groups would resist any attempt to change it.


> Can I run the audio performance through a speech to text engine?

I strongly suspect that would hold up in court in the US. For the same reason that I can alter the book and story in my own personal context of ownership. I can take the story, the book, and rewrite it for personal consumption, while keeping all of the original content in my new version. What I can't do typically of course is sell that version, distribute it commercially.

I could use a note taking system that pulls pieces from a physical book as I read it, or snap photos of pages I find interesting. Or I could do that manually with the audio version and take notes of the audio book as it progresses. Both approaches - and numerous others like them - would be protected so long as you're not trying to make money from what you're doing.


Commercial nature of use is only part of the fair use analysis.

You could run your own text-to-speech conversion (or vice versa) for your own private use on fair use grounds because it's a textbook case of fair use. It's very different when you're doing it for someone else's use. There are plenty of situations in which even commercial conversion activity would qualify as fair use.


> not trying to make money

Generally agree with what you wrote. Money isn't the central question though. Something like fanfic distributed for free can still be infringing.


Let's say yes. Can Amazon invent a better compression scheme that uses phonemes & UTF-8 text as a cue? Say, something that compactly expresses Neil Gaiman's voice, then just shows the text of Anansi Boys?


> You can accept the deal, reject it, or make a counter offer.

Or just get it from TPB or whatever.


> Why is it people get paid twice for contributing nothing to bring it to another medium?

I'm a little confused, but who are the people who are getting "paid twice for contributing nothing to another medium"?


Audio books are performance and story. It’s not just the words. It is a substantially different work of art, even if it is the same story. I think that is where the GP is too reductionist.


Audiobooks add on top of the original book; but I would argue that you could train a human being with a set of deterministic rules that they can use to transform an audio-book into the exact text of the text-book that was published, with 100% fidelity, 100% of the time. Which is to say, an audiobook is a lossless embedding of the original text-book. If you’re buying the audiobook, you’re already buying the text-book; it’s in there!

Another way to look at it: if you own a music video, do you have a license for the song that the music video is a music video of? I would argue you do—you have to have that license, to be able to have a license for the MV, just like you have to own licenses to all the tracks on an album, to be able to claim to have a license for listening to the album.

And, as such, it really wouldn’t make sense to attempt to sell licenses to consumption of MVs, that don’t implicitly include licenses to consumption of the audio tracks within said MVs. (Has anyone ever argued that people don’t have a right to use youtube-dl to extract the audio track from MVs publicly-posted by their rights-owners? I assume, if they did, this argument went about as well as arguing you can’t record music off the radio...)

This is the equivalence being offered here: a text book is to an audiobook as the audio track of a music video is to the MV itself: a component. In the case of a text book, it’s not one that’s literally there in bits; but since it can be extracted with 100% fidelity, the information that component is made up of is clearly there, in the thing you bought. It’s no different than if an audio track was embedded in your video but was reversed or something.


> you could train a human being with a set of deterministic rules that they can use to transform an audio-book into the exact text of the text-book that was published, with 100% fidelity, 100% of the time

Totally disagree. Consider poetry, or poetic elements in literature, where the form of the text conveys meaning. That is lossily conveyed in an audiobook.


But if people create the text version of it by listening to the audiobook, it won’t have all that fancy formatting and extra work that you said goes into the print version. So, in this scenario, the user isn’t getting all that extra value produced by the work of a person doing the fancy formatting. Which makes it sound like they are not infringing on the copyright of that work, as they are not getting it by either using software or transcribing the audiobook by hand.


But they are infringing on the copyright of the work. They've created a copy of the work in a different medium, which is one of the textbook examples of copyright infringement (not getting into whether fair use defense applies).


> But they are infringing on the copyright of the work.

They are infringing in the same way as pen and paper companies are infringing when you hand-write the contents of an audiobook on paper for personal consumption. I believe that most people would consider this being an allowable thing to do.


No, they're not even remotely the same thing. It's like saying that English and JavaScript are both programming languages because they use the same alphabet.

Pen and paper companies sell you tools/materials to create works. They don't know or care what you create with those tools. For all they care, you could just eat the pen and paper. Importantly to this example: they don't transcribe the audio--you do. Your transcription is a derivative work and, if a fair use exception does not apply, would be copyright infringement if the copyright owner cared to enforce it in these circumstances.

Amazon's Kindle transcribes the audio of an audiobook using software created by its programmers for the sole purpose of transcribing audiobooks. You don't do anything. Amazon does all the work, at your request. And that makes all the difference, because what Amazon is doing has pecuniary value and thus fair use would not apply to their actions.

> I believe that most people would consider this being an allowable thing to do.

Most people also believe it's okay to speed. It's still against the law and it's not a defense to a speeding ticket when you get caught.


Or photographs/maps/etc. To say nothing of just the overall design and physical instantiation of the book.

Back when I listened to audio books while commuting, I learned to be careful when selecting non-fiction history because a lot of things could be hard to follow in the absence of, say, maps in the original printed text.


> Another way to look at it: if you own a music video, do you have a license for the song that the music video is a music video of?

You do not. You have a license for the music video, which is the consolidated song+video adaptation of the underlying song.

> I would argue you do—you have to have that license, to be able to have a license for the MV, just like you have to own licenses to all the tracks on an album, to be able to claim to have a license for listening to the album.

This would be wrong under US, EU, and even just international IP law generally. The lyrics, music arrangement, audio performance, and music video are all separate creative works. If you want to get technical, the choreography in the music video may itself be a separate copyrightable work.

> And, as such, it really wouldn’t make sense to attempt to sell licenses to consumption of MVs, that don’t implicitly include licenses to consumption of the audio tracks within said MVs.

Licensing music videos for money is actually a thing that has happened since the 1980s. These licenses are separate from the right to consume a standalone non-video audio performance.

> Has anyone ever argued that people don’t have a right to use youtube-dl to extract the audio track from MVs publicly-posted by their rights-owners? I assume, if they did, this argument went about as well as arguing you can’t record music off the radio...

Yes, and it was found to be fair use. Decades ago, the music labels tried to ban tape recorders on the grounds that they could be used to record radio broadcasts. The courts agreed, but said that such use would be fair use if the recordings were just for private use.


> Another way to look at it: if you own a music video, do you have a license for the song that the music video is a music video of? I would argue you do—you have to have that license, to be able to have a license for the MV, just like you have to own licenses to all the tracks on an album, to be able to claim to have a license for listening to the album.

This analogy isn't going to clarify things. First of all, I don't think it's necessary to bring opinion and speculation into this, e.g. "I would argue you do", if there is actual law here.

So with music videos and songs, I would be very surprised if owning a music video automatically and simply granted you a license to the song. Ignoring the fact that videos frequently feature tracks specifically cut and rearranged for video (Michael Jackson is an obvious example), I think the main issue is that songs and videos have multiple copyright holders, and it's too complicated to be a relevant analogy.

IANAL, but if I were to record a high-quality video of a live band performance, the video recording is my property. I can take it home, rewatch it, etc. But I think it's obvious that I don't in any way "own" the copyright to the song I filmed. I can't extract the audio, burn it to a CD, and sell it. And I'm pretty sure I would have very few commercial rights to distribute the video itself, if the band members never signed modeling contracts.

> I assume, if they did, this argument went about as well as arguing you can’t record music off the radio...

OK, I don't think this is terribly important to debate, but are you implying that settled law says people have the right to record off the radio? That's definitely not my (amateur) understanding.

> In the case of a text book, it’s not one that’s literally there in bits; but since it can be extracted with 100% fidelity, the information that component is made up of is clearly there, in the thing you bought.

This will sound like a quibble but I think it's fundamentally important to point out: text from an audiobook cannot be "extracted with 100% fidelity". Whether it's 100% fidelity or much worse isn't even the point. What you assert simply isn't so. But I'll be happy to admit I'm wrong if you find an audiobook from which you've extracted all the text content, perfectly, as well as supplementary text like footnotes, indices, headers, illustrations, and captions, and of course, page numbers. It goes without saying that visual data like layout and typography are also components of a book with value, but I think you get my point.


> This will sound like a quibble but

I think that is a minor quibble. You're arguing against a point which is not really the important point. That is, that owning a license for the audio book means you also own a license for the printed book including all the extra creative work that went into it, like formatting, pictures, etc.

A lot of other people in this thread are arguing against that point too but I don't think they need to because it's not relevant to the Amazon case where the generated text version isn't taken from the official text version, so it doesn't copy any of the formatting/etc.

If you buy a music video, you can obviously legally listen to it while facing backwards and not seeing the images. So you do have a license to the song in the video like the GP said (if I understand him correctly). Of course you don't necessarily have a license to some other version of that song that somebody else has put additional creative work into, however minor. That's surely obvious to everyone here and people seem to be talking past each other trying to point that out.


> I can't extract the audio, burn it to a CD, and sell it.

No, but what this is about is saying "am I allowed to listen to just the music without also watching the video? Can I play it on my phone with the screen off? Or do I have to pay extra to do that?"


Yes, you can play it on your phone with the screen off. You're still playing the video and processing the graphical data.

What would not be permissible is stripping the audio track from the video to play separately. That would generally be copyright infringement and there are limited circumstances under which a fair use defense would apply.


To be clear, if you turn off a device's display when it's playing a video, it usually stops decoding the video stream from the container, because it knows the destination video framebuffer isn't visible. It's doing the same thing you're talking about—"stripping the audio track from the video to play separately"—just without writing down the result.

You can get the same effect without turning off your screen if you tell e.g. VLC to stream only the audio from a network video stream. All VLC will do, in that case, is to request only the audio segments of the stream, and not request any of the video segments.
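A rough sketch of that idea, assuming a made-up `Packet` type and a toy in-memory "container" (this is an illustration only, not how VLC or any real player is implemented): a player that only wants audio can skip the video packets entirely, so the video payload is never decoded.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Packet:
    """Hypothetical container packet: which stream it belongs to, plus raw bytes."""
    stream: str      # "audio" or "video"
    payload: bytes


def audio_only(packets: Iterable[Packet]) -> Iterator[Packet]:
    """Yield only the audio packets; video packets are skipped, never decoded."""
    for packet in packets:
        if packet.stream == "audio":
            yield packet


# Toy interleaved container: video and audio packets alternate.
container = [
    Packet("video", b"v0"),
    Packet("audio", b"a0"),
    Packet("video", b"v1"),
    Packet("audio", b"a1"),
]

# "Playing" with the screen off: only the audio payloads are consumed.
played = [packet.payload for packet in audio_only(container)]
```

Nothing is written to disk here; the audio is consumed as it is played, which is the distinction being drawn.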

Consider this another way: is a web browser infringing a website's copyright when it makes a "derivative work" of the page by blocking the ads on it?

Or, hopefully even clearer: is a CD player infringing the CD's copyright when it applies an EQ preset to the audio as it plays it?

There's a very fine point I'm trying to gesture at here—the point where something goes from being "the creation of an infringing derivative work" to "the way that a device which reproduces the work for the purpose of consumption, chooses to do that reproduction."


To be clear, you're still processing the same file. You're just not processing all of it for the sake of efficiency. If you want to get into the very technical details, then it depends very heavily on what type of codec you're using since some codecs interleave video and audio data, while modern codecs generally don't (and in fact require separate audio and video codecs).

> Consider this another way: is a web browser infringing a website's copyright when it makes a "derivative work" of the page by blocking the ads on it?

Yes, under copyright law, since the article and the page are separate copyrightable works. The question is does a fair use exception apply? (For something like reader mode, the answer is likely yes, but it gets more complicated for an ad-blocker that just removes ads while retaining other elements of the page.)

> Or, hopefully even clearer: is a CD player infringing the CD's copyright when it applies an EQ preset to the audio as it plays it?

No, the CD player isn't a person and isn't infringing anything. The user of the CD player is creating a derivative work by applying an EQ. An analogous case (several actually) discusses this point with respect to running computer software. Courts ruled that the copy was necessary for a user to "enjoy" (make use of) their license. And with respect to the EQ, a fair use exception would apply. And even if it didn't, then a compulsory license would apply (in the US at least), and you would owe the song creator a fraction of a fraction of a penny for every performance.

> There's a very fine point I'm trying to gesture at here: the point where something goes from being "the creation of an infringing derivative work" to "the way that a device which reproduces the work for the purpose of consumption, chooses to do that reproduction."

No, it's actually a very bright line. Either you're working with the original, or you're working with a derivative work. The question is whether your use of the derivative work you've created is covered by the license, or by fair use.


> What would not be permissible is stripping the audio track from the video to play separately

Even for personal use, where you aren't sharing the extracted audio? I am pretty sure that would be allowed.


Arguably it could be fair use but strictly speaking it's copyright infringement.

It's generally not worth the copyright owner's resources to pursue something like that unless it's happening on a much larger scale.


Fair use is not infringement.


I agree that audiobooks require substantial work. But I'm confused about who is "getting paid twice for contributing nothing"? Is it the writer that's getting paid twice "for contributing nothing" to an audio book? That doesn't make sense to me.


All the people who would get paid, since the only people doing extra work to create the AI written-version of the audio are Audible software engineers, and they aren't getting paid any extra when you enable the feature.


People selling something in general are not getting paid for their "contribution", they are getting paid because (a) they have something you want, (b) they ask for money to give it. They are getting paid for the ownership.

If I sell a plot of land I inherited from N generations ago, zero "contribution" of mine is factored into the price (and perhaps zero contribution from my ancestors who first got it; such plots might have just been up for grabs at the time). The price can still be millions for said plot of land. It's all down to demand, and has nothing to do with compensation for "contribution".

In this case, the publishers (and owners of the story copyright) want to sell the 2 different versions separately.

It's as simple as that.

And you can complain about "useless middlemen, etc" but it could just as well have been the author that wanted to sell his stories as different deals for different mediums, and not e.g. allow automatic conversion from one format to another.

Usually, he who sells something sets the price for what you get (and determines what you get).

You might agree and buy it or not, but you don't get to force them otherwise, because you "really wanted X", or "you thought you got Y too", etc.


You're taking an example from the world of rivalrous goods and trying to apply it to the world of non-rivalrous goods (where supply is no longer a factor).

You then try to reframe copyright, from "Copyright is forcing consumers to overpay for some things" into (paraphrasing) "Anyone complaining about the current state of copyright law[1] is trying to force artists and publishers to live in poverty."

[1] Which is mostly the result of a century's worth of lobbying by IP holders with little to no effective counter-lobbying by consumers.


>You're taking an example from the world of rivalrous goods and trying to apply it to the world of non-rivalrous goods (where supply is no longer a factor).

Read demand as "people wanting the stuff" as opposed to "people wanting the stuff versus available stuff". I'm not talking about the law of supply and demand, but of, let's say, "willingness to pay to get the thing if no other way is available" (even if it's just for legal reasons that no other way is available).

>You then try to reframe copyright, from "Copyright is forcing consumers to overpay for some things" into (paraphrasing) "Anyone complaining about the current state of copyright law[1] is trying to force artists and publishers to live in poverty."

Nope. In fact I wouldn't care if some artists and publishers lived in poverty.

What I argue (and this is not about personal preference) is that the transaction is:

(a) publisher has something another person wants, sets price

(b) said person can buy it or not

We can imagine a setup that's just based on somebody owning the sole ability to sell a thing and its digital copies, and demanding whatever for it.

Scarcity, marginal cost of copy, etc, don't need to come into play.

Whoever is solely authorized to sell gets to set whatever terms they like, even if those terms are "I want people to pay twice if they want the text version and the audio version" or even crazier.

That's the reality (and, more or less, the law).

We could make it like "Everybody is authorized to make as many digital copies as they want, for free, redistribute them, even put their names on the work" if we wanted.

It's not something set in stone, it's only set in the interests of publishers and in law. And while we can argue that the interests of creators are not the same, most creators tend to just sell their rights to/go along with the publishers, so the point is moot from a legal perspective.

>Which is mostly the result of a century's worth of lobbying by IP holders with little to no effective counter-lobbying by consumers.

Well, in life people don't get what they want/need/deserve. They get what they've managed to negotiate.

If consumers don't counter-lobby/protest/break some heads they'll get the laws IP holders want.


> If consumers don't counter-lobby/protest/break some heads they'll get the laws ip-holders want.

If IP holders get too greedy, consumers will just pirate stuff. I mean, consider the situation for music vs films. Music is available at low cost through streaming services, and piracy is no longer a major issue. But piracy remains a serious issue for films, which are still priced way too high.

But I suppose that piracy is a form of protest, so hey.


Try starting from the author. The author invested his own time and wrote a thing, but for whatever reason didn't want to go out and sell it himself. So he sold a company the rights to print and sell text copies of the thing he'd written. He then sold the rights to sell audio books of his work to a different company (and perhaps the right to make a TV show of his book to a third company).

The only reason a company was willing to pay for the rights to the text copy of the work was because they were being guaranteed some sort of monopoly. If that monopoly is weakened then they have less incentive to pay the author.

Whether that is a good or bad outcome is left as an exercise to the reader.


Do you think that's how it works in practice? Any evidence?

I suspect the authors now are getting the same, or less, and the publishers are both charging more for digital (despite their costs being lower) and taking a higher proportion.


>the publishers are both charging more for digital (despite their costs being lower)

The costs associated with printing and distributing a physical book are a much smaller slice of the pie than a lot of people assume. (Back when I looked into this, it was about $2.50 per copy for a hardback as I recall. The author cost for a PoD trade paperback from Amazon is about $5.)

Also, the end-user price for books is generally significantly lower than it used to be. Go back a couple of decades and you mostly paid list price for books, and paperbacks usually weren't available for a year or so after a book came out in hardcover.


>I think I need something better than that. You are talking about 2 physical items, I don't think anyone would argue just because you buy 1 product off the shelf doesn't mean you are entitled to the rest of the shelf

The whole publisher intention behind copyright and license rights is that you shouldn't think of digital products any differently than you think of physical products on a shelf.

In other words, the idea is that just because a digital product costs $0 to copy doesn't mean it doesn't harm them business-wise when you make a copy and don't pay them. A physical book/CD doesn't cost as much to make as it's sold for either; it usually costs way, way less. So "cost of physical copy" shouldn't be thought of as the differentiator...


> The whole publisher intention behind copyright and license rights is that you shouldn't think of digital products any differently than you think of physical products on a shelf.

They don't claim that at all. Digital items are typically licensed, not owned outright (as is the case with a physical book).

That said, the way the market is currently configured, audio books are a separate product from digital text. That's fine and legal. But, we don't have to like it. I have a legally blind* family member and it always feels a bit like a scam when she has to repurchase books that somebody else in the family already purchased.

* She can see, barely. For complex texts (particularly textbooks), she prefers to have an audio copy as well as the printed/digital text.


>Digital items are typically licensed, not owned outright (as is the case with a physical book).

You own an MP3 in the same way you own a book. If you email an MP3 to others in exchange for payment, the FBI aren't going to bust down your friends' doors and securely erase the files from their drives, it's up to the licensor to sue you for damages. Likewise, if you copy books and start selling them at the flea market, nobody's going to round up and destroy the unlicensed copies, it's up to the publisher to destroy you in court.


> How am I not paying for the same thing twice?

In some ways you are. This is the same pain the music industry went through when they went digital. People had purchased records, 8-tracks, tapes, and finally CDs often of the same music and often multiple times because those physical items wore out. Now I can buy a song once and it is effectively mine forever. Book publishers see what happened to the music industry and are fighting every way they can. A lot of times those fights do not make sense, but the industry has decided to fight everything.


> Say I want to whispersync The Way of Kings. I already own the audiobook but now I have to buy the Kindle version for full price, which is $9.99 (I double checked in a private tab to make sure amazon wasnt giving any discounts)

Two different things. One is an audio performance, which you bought. It contains elements not in the print/ebook version.

The other is a print version of the same work. The purchase includes the formatting, editing, etc., that went into producing the print version which are not found in the audiobook version.

> Why is it people get paid twice for contributing nothing to bring it to another medium?

Why is it that programmers get paid six figures to create just another database application? Aren't they all the same thing? No!? Well, it turns out that creating an audiobook and creating an ebook are two very separate things that each take a fair amount of work and that each deserve to be paid for.


>I think I need something better than that. You are talking about 2 physical items, I don't think anyone would argue just because you buy 1 product off the shelf doesn't mean you are entitled to the rest of the shelf...

Let's say you are an iPhone user and you have bought a bunch of media from Apple. Maybe you really hate the next iPhone announcement and want to move to Android. Would you expect Google to give you a free copy of all the media you originally bought from Apple?

Also you have to get away from the idea that the price of the content is dictated by the costs to produce that content. The production cost is only the minimum price that would allow production to continue. Prices for digital goods are often just dictated by what the seller thinks demand is (since supply is practically infinite assuming there are no contractual restrictions).


> author, editor, publisher, actor, distributor

...and someone to buy the microphones and recording equipment, and someone to push the buttons, and someone to edit the audio.


Which I paid for by buying the audiobook.

I'm asking the reverse, who is getting left out of collecting money when I purchase the audiobook and then want a digital copy for whispersync?


Do you think the audio-book publisher is the same organisation as the printed-book publisher?


You can usually get a discount on the Audible version when you own the ebook, but not the reverse. It’s strange.


This doesn't make any sense. It's not zero-cost for a company that provides a book to also provide an audiobook. Both require their own supply and value chains. An audiobook will require recording studios, artists, a streaming audio product, an app, massive storage requirements, separate catalogues and search. The only aspect it shares with the book supply chain (which is also costly for the company) is the IP. It might make sense for Amazon to bundle the Kindle/Audible versions from a sales POV, but you have no moral right to own a story in all formats; there are real costs and people involved.


> A digital book requires the same minus actor

The digital book also requires formatting work. Depending on the type of book and the publisher's workflow, this can be simple (an export from InDesign) or very complex (a custom process for converting a non-technical author's Word documents for a cookbook with an inconsistent layout into an EPUB that renders nicely across devices).


For the book you mention, The Way of Kings, there are two audiobook publishers: Macmillan Audio for the single-voice work and Graphic Audio (a different company) for the multi-voice.


The physical book example is not a reasonable comparison because you could probably buy a readable, non-waterlogged copy (or alternative physical format) on Amazon for less than the retail price. That option is not available for ebooks or audio books, since they're licensed, not owned. And licenses for ebooks are unilaterally dictated because of the copyright monopoly.

You want a normal copy of the ebook? Pay $10.

Typeset in Arial? Another $10.

Typeset in Comic Sans? Another $10.

Voiced by a free TTS engine? $20 because our market research indicates people pay more for audio books.

The logical way to approach this is that if a product can be automatically converted from one format to another, getting both formats should cost the same as getting the source format, or at most the set should include a marginal fee (a share of the conversion technology's development costs + marginal compute costs).

But of course publishers aren't going to do that. That reality, however, is not a defense of the status quo; rather, it's an argument for copyright reform.

Copyright monopoly forces people to pay again and again for "multiple bites at the apple" for purely-digital products that have no secondary market. They're even more arbitrarily priced than normal IP goods like paper books and CDs and DVDs which at least had secondary markets.


>For the same reason you're not entitled to a free (or even discount) paperback if you buy the hardback

But I am. It would take some work, but I can undo the binding of the hardback and turn it into a softback. I can also scan the pages and print them into a new softback. I can even just retype the story and print it into a new softback.

Now, selling, trading, or lending those copies will be an issue, but I don't think anyone here is arguing that someone should be able to use speech to text and then sell the text generated, so the comparison still stands.

And I can't expect to be given an official softback book, just like I don't have to expect to be given an official text version. But being able to turn an audio book into text seems well within my rights.


> But being able to turn an audio book into text seems well within my rights.

Maybe it is. The law isn't the best place to glean any useful moral principles from. I think if you're willing to put in the effort, nobody's going to stop you from making your own text version from the audiobook. But once you're in a position where you're distributing that unauthorized version, or you're already a distributor of the audiobook version and add a feature that automatically creates a text version (without appropriate licensing from the copyright holder), it's something else entirely.


>The law isn't the best place to glean any useful moral principles from.

I should clarify I was talking about the existing legal rights owning property entitles me to. Moral rights as well, but that discussion isn't nearly as relevant or useful in this particular case. Sorry for not clarifying.

>and add a feature that automatically creates a text version

This is where I'm not so certain about. If they add a keyboard and notepad that lets me take audio myself, then where is the significant difference in adding software that makes it easier for me to do the transcription? Is there a certain level of ease that anything easier than that becomes illegal?


> This is where I'm not so certain about. If they add a keyboard and notepad that lets me take audio myself, then where is the significant difference in adding software that makes it easier for me to do the transcription? Is there a certain level of ease that anything easier than that becomes illegal?

I don't think there's any uncertainty here. They're creating a text version and distributing it without appropriate licensing. If they give you a notepad, and you happen to write a text version, it's not them creating or distributing it, but it's still unlicensed. If it makes its way onto their platform in some way, they're going to need permission, again.

Further, I don't understand why anybody here is giving Amazon the benefit of the doubt. If they want to do this, they can negotiate the required licensing from the publishers. The fact that they've done it without doing this or even discussing it with the publishers shows their clear intent to do what they want, regardless of how much it runs afoul of any laws, simply because they're big enough to fight any lawsuits for ages.


>They're creating a text version and distributing it without appropriate licensing.

If they create it on their machines and distribute it, then I agree this is an issue. It doesn't matter if it is an AI or if they paid someone to transcribe it; they are distributing a work they likely don't have permission to distribute. But if they provide you an application to create it on your machine, then it is you who does the work to make a copy, and distributing the application is not an issue (there is an underlying issue of distributing a neural network that was trained on private work, but I'm not certain that is the case here, and I think that is a separate issue that is irrelevant to the use of that neural network).

It is my understanding that the user's device does the translation, but if that isn't the case then my entire argument doesn't apply to this particular case.

>Further, I don't understand why anybody here is giving Amazon the benefit of the doubt.

Personally I don't care about Amazon. But I do care about the precedent set when it comes to distributing applications that make doing something legal easier (at near zero cost). I personally don't even prefer the argument they use and don't even think it works that great as an argument (because it doesn't work if you replace the AI with a person hired to do the same job).


> Personally I don't care about Amazon. But I do care about the precedent set when it comes to distributing applications that make doing something legal easier (at near zero cost). I personally don't even prefer the argument they use and don't even think it works that great as an argument (because it doesn't work if you replace the AI with a person hired to do the same job).

I'd say if this were an open source project created by some volunteers because they genuinely believed it provided value to people (and it would, in the same way that DeDRM does), there'd be no problem.

> It is my understanding that the user's device does the translation

It's Amazon doing this for financial gain, integrating it into the app with which people consume audiobooks bought through their platform, based on licenses they legally acquired that presumably don't allow this use. I don't see any context in which this is okay. We can't excuse anti-competitive behavior just because, with the right amount of twisting, it seems okay and they found a neat way to skirt contractual and financial obligations. I'd say the most significant issue is the scale. They're essentially automating the creation of unlicensed copies for everybody on their platform, which presumably will have a significant impact on the original copyright holders. That simply wouldn't be the case for an open source project (the number of people who use DeDRM is minuscule).


>If they add a keyboard and notepad that lets me take audio myself, then where is the significant difference in adding software that makes it easier for me to do the transcription? Is there a certain level of ease that anything easier than that becomes illegal?

Yes. At the level of ease where a technology impacts other businesses/versions of the material (e.g. the audio book industry here), it is made illegal.

If it's so unrealistic/inconvenient as to have no impact, it's ignored.


You're also free to listen to the audio book and write down the story yourself. Perfectly legal.


Nope. At least not according to [0][1].

I don't see why you would be. You aren't permitted to manually copy a book to notepaper, either. The whole point of copyright is, after all, to manage the rights to copy.

[0] https://answers.justia.com/question/2017/09/03/is-it-legal-t...

[1] https://www.quora.com/Are-audio-transcriptions-legally-allow...


Those don't exactly come across as authoritative sources. Show me some specific case law and I might start to believe it.

IANAL but, at least as a practical matter, anything along these lines that's done for personal/internal use seems likely fair game. Someone is seriously going to make a case that I can't transcribe a copy of a speech or other recording for my own use?

It's external distribution that gets stickier--especially as you get beyond short excerpts to a full and complete work.

ADDED: And, yes, anyone can sue anyone for anything. But arguments that something could technically be construed as a legal violation even though no one has ever been called on it in the history of the legal system aren't terribly convincing.


> anything along these lines that's done for personal/internal use seems likely fair game

How about students photocopying books?

Universities seem pretty convinced that's not permitted.


It has been some years but I don't remember mine ever having an issue with me copying books I own. They didn't even have an issue with copying books I borrowed, which seems far more suspect.


This article [0] says that, at least here in the UK, it's permitted to photocopy a chapter, or an article, but it's not OK to photocopy a whole library book, even as a student.

They note, however, that the concept is not defined in the legislation, so they seem to be going by feel really.

[0] https://www.soas.ac.uk/infocomp/copyright/library/photocopyi...


You're allowed to format shift and time shift content that you otherwise legally have access to.[1]

Audible on the other hand is not allowed to sell you a text version of an audio book that they licensed even if they are using a clever way of producing the text version of the book.

[1] https://en.wikipedia.org/wiki/Sony_Corp._of_America_v._Unive....


>You aren't permitted to manually copy a book to notepaper, either.

Do you have any case law dealing with personal use copies only? In my mind it is like copying an mp3 file. I can make as many copies all over my hard drives as I care to. It is only when I share those copies with others that it becomes an issue.

>The whole point of copyright is, after all, to manage the rights to copy.

While it is named copyright, generally they are applied to distributing copies, not the actual creation of copies. For example, anytime I make a backup of my computer I make copies of all data on it, including many I have no copyrights to. It only would be an issue when I started sharing/selling those copies.


> Do you have any case law dealing with personal use copies only?

I'm afraid not (IANAL), but university text-copying policies seem to assume that personal copies are very much forbidden, for what that's worth.

> While it is named copyright, generally they are applied to distributing copies, not the actual creation of copies

It also covers creation of copies. Unauthorised recording is copyright infringement, even though it's an act of creation, not of duplication.

> anytime I make a backup of my computer I make copies of all data on it, including many I have no copyrights to. It only would be an issue when I started sharing/selling those copies.

Indeed. That seems to count as fair use provided you don't distribute to others.


To extend your analogy, it would be like a bookstore selling (or renting out) a “debinder” machine that can strip a hard cover off a book and turn it into a paperback (and vice versa). Or, perhaps, a book compressor that can compress a book in exactly the right way so that it becomes a smaller, more portable version of it.

(More generally, there are already lots of products that “convert” a book to a similar but more useful item: bookmarks, stands that hold it open, etc.)

Publishers may not like that product, and may even embargo dealers that sell them, but it’s not clear that they would have any kind of legal right to stop them, even if they were only used for this one purpose.


This analogy doesn't work, because different laws apply to physical goods.

You're allowed to physically do whatever you want to the copy of a book you've purchased. You can fold it, you can tear it, you can burn it. Hell, you can even eat it if you really want.

> I can also scan the pages and print them into a new softback. I can even just retype the story and print it into a new softback.

(The quote is from the parent's comment, not the one I directly replied to but my reply fits here better.) This is not permissible under copyright law, and never has been. That would be creating a new copy, which is exactly what copyright law was created to prevent.


>(The quote is from the parent's comment, not the one I directly replied to but my reply fits here better.) This is not permissible under copyright law, and never has been. That would be creating a new copy, which is exactly what copyright law was created to prevent.

My understanding is that this is entirely legal as long as you don't sell the copy. Lending it or giving it away would likely be illegal but not worth prosecuting. But a copy solely as a personal backup is allowed. We do that with data all the time.


I am inclined to agree; however, Amazon basically offers you scanning and rebinding as a service, and as you said that is basically selling the rebound book.


I think it is the difference between offering the service of scanning and rebinding and offering the service of selling a scanning and rebinding machine. If I buy the machine, then I'm free to use it to make my own copies.

As I mentioned in another post, my understanding is that the AI is running on the user's machine (given neural networks are cheap to run and expensive to train, it seems plausible the phone is capable of running it). If I'm wrong then my comments don't apply and the issue is more convoluted (I would almost agree it isn't legal in such a case, but I would first have to see some case law on what I think of as equivalent services, such as online backups/storage/VCR to DVD services).


Sounds like you should play that audio book and start typing.



They're well within their rights in doing that, but the movie industry at least kinda understood that if you want to attract customers and minimize piracy, your best option is to minimize friction and let them watch it on the platform they want.

I bought many blu-rays that came with the DVD version and an UltraViolet code.

It made buying the blu-ray copy a much simpler choice at a time when I didn't have a blu-ray player yet: I could watch the DVD version now, and watch the improved HD copy later.

At least some book publishers offer you the ebook version when you purchase a hardcopy, but it's still not universal.

If they want to minimize piracy, their best option is to make their content available in as many ways as possible.

I'm not saying the movie industry got it nailed (they still have a long way to go), but they definitely are better than the book industry now.


>For the same reason you're not entitled to a free (or even discount) paperback if you buy the hardback, or why I can't return the paperback I dropped in the bathtub for new copy

And people wonder why pirates exist...


Analogy doesn't work, clinging to old ways doesn't help.

Paper is a way to make information into something you can sell like potatoes. But information is different, it's more like that 'single fish he fed a dozen people with'.

For decades already we've been thinking about what to do with that low cost of copying.


Wouldn't it be more accurate to say that you're licensing access to a specific rendition of the content?


My quandary: I have three hundred plus music CDs I ripped years and years ago, and I really do not need the physical media. Yet if I give up that media I am likely not legally allowed to keep the ripped versions.

Instead of ripping, it would be ever so nice to scan the barcode to gain access to digital versions, but I understand the issue: I paid for physical copies and for the production and distribution of those; I did not pay for someone to host and provide for me the means to download the copies.

Now some movies do come with a digital version which you can download separately and never use the physical media.

So I have a dresser drawer filled with CDs I have not used for five years, if not more. However I do have to say, I never will lose access to that music nor the physical books I own, which is more than I can say for stuff I purchase digitally through Amazon or Apple.


The funny thing is that Amazon actually did this several years ago. Many of the CDs I had purchased via Amazon were added to my Amazon Music collection as mp3s, including CDs I purchased as gifts and gave away.


IANAL but I can’t see why you really need to keep the physical CDs


When you buy a book you are buying an implementation of a story. And a company paid for the right to produce the implementations in that form, which is what you're covering in the purchase.

When you purchased a book it didn't give you ownership of all other implementations of the story.

You still must pay for a recording of a reading, a translated version, the story in a film, and so on.

The right to make implementations is agreed by contract... Licensed... And restricted to the form of an implementation and usually a legal/geographic domain.

If you purchased the right to make implementations in clay tablets, you would not have the inherent right to produce a shadow puppet version too. For that you would need another license from the author (typically via their publisher).


information is information. If i purchased a video on cassette tape, i ought to be allowed to produce a clay-tablet version (at my own expense, and only for my own use).


> i ought to be allowed to produce a clay-tablet version (at my own expense, and only for my own use).

What in the world does this have to do with Amazon providing a feature on their audiobook distribution platform, for which certain licenses were acquired to be able to distribute these audiobooks, that automatically creates unlicensed copies? If you want to create a clay-tablet version from your own copy for your own use, go for it. But the moment Amazon decides they're bored with audiobooks and starts selling clay-tablet versions on amazon.com without appropriate licensing, that's entirely different.


Your analogy would mean that you would have to transcribe the audiobook for your own personal use. That would probably be fair use but you probably wouldn't want to do that.


using a tool to transcribe (for example, a tool provided by amazon) is the same as doing it myself.


But is Amazon actually providing the tool, or just its output? Could I use it to transcribe any audiobook I want, even if it isn't available via Audible? Because otherwise it's not you doing it for your personal use, but Amazon redistributing copyrighted content.


if amazon is redistributing the text without providing a way to transcribe your own audio as well, then it is definitely a copyright infringement (assuming they didn't also purchase this right from the copyright holders).


But that doesn't make any sense. If I want to make a legal backup of a DVD I own but don't own the equipment to do so am I not allowed to take my DVD to a 3rd party and have them create a backup on my behalf?


In the USA this isn’t allowed. Neither is backing up DVDs in general though. https://en.m.wikipedia.org/wiki/RealNetworks,_Inc._v._DVD_Co....


Putting aside the publishers' financial motives, it doesn't sit quite right with me to think of and reduce the value and purpose of the book in text form as being completely subordinate and subsumed by its audiobook version, to the point that the text can be seen as nothing more than an inferior fallback mode not even worth charging for.

Of course I agree that distributing a book as text is basically nothing compared to producing and distributing audio. Making an audiobook (for the most part, I assume) requires that the book text exists in the first place, so the book text is a sunk cost in terms of production.

But it doesn't make economic sense, because a book in e-text format has actual value, and continues to have value and utility even if you end up buying the audio version too. As long as an e-text format has value, how could the market justify reducing its value to 0 when it's part of an audiobook purchase? Wouldn't the price for all audiobooks (that now come bundled with text for free) end up being increased across the board?

People still value the experience of text, enough to be satisfied with paying for text and text alone. Would the market be better if e-text prices stayed the same, and all audiobooks were bundled with text for "free", at a higher price? Sure, if you're the type to buy both audio and text. But not if you frequently buy just audiobooks and are now forced to pay a premium for the bundled text version.

Thinking of real-world factors, I'm assuming that the pay structure and system for audiobooks is different than it is for text. If a writer trades audio rights/royalties for better royalties from print/ebooks, giving away text with audio will hurt their bottom line.


> text form as being completely subordinate and subsumed by its audiobook version

Close to this, another angle is to see the text version as the source, and the audiobook as a representation.

The source having more information, or allowing for more representation has a real value. The same way raw files/original film in photography have a different value from a print.

To go on with the analogy, historically we pay for photographic prints and rarely for the raw file (or the raw file would cost more than individual prints). And there is the same debate, when a photographer shoots a wedding and the couple didn't negotiate the raw files, they can be surprised they don't get them by default, even when paying for the print.

I think in general the source/original material should be included in the sale of the representation, and the whole priced accordingly, but I get that it's not how the publishers see it, and that some customers would bargain to get only the cheapest part, with that behaviour becoming the lowest common denominator.


Of course I agree that distributing a book as text is basically nothing compared to producing and distributing audio. Making an audiobook (for the most part, I assume) requires that the book text exists in the first place, so the book text is a sunk cost in terms of production.

Once you write a piece of software, it is also a sunk cost. Therefore I guess you shouldn’t be able to charge for it based on the number of people who are using it?


But this is more like being charged a full license cost for a CD copy and a Digital Download.


Unfortunately, even five year olds realize that the reasoning they use doesn't make sense.

I gave a short talk on the economics of information and started with a very similar example, which was this: given that 3 ounces of ink and 200 pages of nice 20lb paper, sewn and combined with a cardboard cover into a book, cost roughly $3 in reasonable quantities, why does the book cost $20 when the ink is used to print J.K. Rowling's story of Harry Potter and the Deathly Hallows?

Clearly the same package with my scribbling about wizards and magic would likely not even be valued at enough to cover the $3 printing cost :-). So it is clear that the information has an intrinsic value, but what is less clear is how it develops that value and how much of the value can be captured.

The book publishers have gone down this road before[1]. 10 years ago the first 'text to speech' features came out in Kindle and Amazon backed down. Ten years later the Kindle is in a stronger place, so Amazon is perhaps inclined to not back down this time.

What I find particularly amusing is that publishers, when asked back in '09 if it was "Okay" for parents to read their books aloud to their children, agreed but felt reading aloud to a group of children (as a teacher might do), or to a group of people unrelated to you, was a violation of their commercial rights which unfortunately they couldn't do anything about. Not so with the Kindle, which is standing in here as the reader: that gives them someone to sue.

As technology gives people the ability to put information 'goods' in a real goods market framework (hard to reproduce, etc.), the owners of those goods exploit those capabilities and litigate anything that would damage their ability to do so.

Personally, I think it is a matter of time before the economics of information become more mainstream in their understanding much like the changes that allow people to reason about "service work" in discussions of GDP. But it is going to take a while. I look forward to reading papers that talk about economies working with goods, services, and information as their drivers of economic activity.

[1] https://www.techdirt.com/articles/20090227/1759173928.shtml


Different industries which have established themselves over decades.

It's like software licencing. You buy a licence for the text, or the audio.

It's why buying an MP3 doesn't entitle you to the Sheet Music.


For music, it can get really weird. One company can own a specific performance, a second the lyrics, and a third, the sheet music.


I would love a version of the Audible subscription that includes the Kindle version. Paying multiple times for the same information doesn't feel right to me. Sure, everyone in the chain should be fairly compensated.


One thing to keep in mind is that the rights that the publishers are exercising here are the same rights initially possessed by the writer, before they transfer them to the publisher. So it's about the publishers, yes, but before that it's about protecting the creators.

> I get why the audiobook is more. I'm paying for the author's story + actor. But I never understood why once I have the audiobook I could not be allowed the text version for free.

The audiobook isn't just "more", it's technically a separate work altogether. The audiobook producer will acquire a license to use the book in its own work. Presumably the consideration for the license is either straight up cash or some cut of the profits from the sales of the audiobook, or some combination of the two.

But copyright is not just one right, it's a bundle of rights. And most likely, when the publisher licenses use of the book for creation of the audiobook, they are (probably explicitly) not also licensing rights of distribution of the text of the book. So disseminating the text of the original work would be a violation of the publisher's rights.


I mean, let's put it in these terms and see how you feel.

Write a book. Your average book that reaches shelves takes 4 to 8 months to write. About 2 months for the first draft and roughly 4 to 10 rewrites depending on the author's quality.

Then it goes to a publisher that hires an editor to comb the book a few times over. Discussing aspects of the book and a decent amount of rewriting from here as well.

Then there's cover art.

Oh and typesetting/formatting for actual publication.

About a good year's worth of work along with many people for one book.

So, go ahead, do that all for free and don't feel salty when some other company takes your story, exactly word for word, makes easy money off of it and gives you nothing.

Just saying. Rent and food aren't paid with like buttons, reddit karma, hn upvotes and hugs/kisses.


But the author, the editor and the cover artist are still getting paid: people are buying audio books for money. The only people who don't get paid are those who did print formatting - but their work isn't being offered here, as Audible is only showing captions and not full pages.


Other responses have tried to summarize WHAT is happening, but not as much as the why (or at least, the direct why as opposed to the handwavey "encourage innovation" why).

For that, look to the past before copyright was a thing. Sure, there was the immediate result that when you put out a book others would copy the text (sometimes poorly). But another huge issue was _translations_. If you put out your book in, say, English, that meant you had a fairly small market. Those with the money for books and the background knowledge and interest in your topic were probably among the smaller well-off classes in several countries. Translations of a hot title were big money - and they'd be done, probably by others, and probably not as you (the author) would like.

Thus, while the first copyrights were "don't copy the text", derivative works like translations were soon added. And because other languages mean other countries, there was an incentive to create international agreements, and this led to modern copyright.

Arguably, all of the above are pretty decent reasons for the copyright protections we have. Consumers benefit because creators can expect to be paid for their work, and because creators are more likely to focus on quality than just getting it out first (particularly on translations).

In that light, the argument that a speech-to-text translation requires permission makes some sense, at least legally. One could even argue that the copyright-holders (publishers, in this case) have a reason to not want a poor speech-to-text translation to ruin the reception of their works.

Of course, this falls down when the publishers are preventing purchasers from making use of such services even when those purchasers are aware of the source of the translation, and any motive other than greed falls away when the publisher wants to be paid a second time. But the publishers aren't coming at this from nowhere, it's an existing, long-present concept.

Arguably laws should change. Alternatively, consumers could demand changes in behavior (I have low expectations of that, but that's the theory).


>But I never understood why once I have the audiobook I could not be allowed the text version for free.

The opposite is also true, that Amazon removed and no longer allows text-to-speech for Kindle books, except for certain devices that advertised the feature like one of the Kindles and the Fire Phone. I use my wife's old Fire Phone to do this, and it works rather well.


I believe the publishers sued Amazon to have them remove that feature


In software terms, by buying a book you are purchasing a (transferrable) license to read a story. An audiobook is a (generally nontransferrable) license to hear a story. As far as copyright is concerned those are different things, so since publishers do want more money, why not charge twice.


If you bought a movie (let's say a DVD), does that mean you should be able to get the book for free (even in ebook form)?

IMO both are completely separate products. It should be up to the publisher on whether or not they want to bundle the text version for free with their audio book (some do).


That I think is exactly the point that we should be 'fighting' for.

I bought $medium, and I want to consume it however I want. I didn't "request a license" to use it how they want me to.

For example, take a Stephen King book. If I bought a Stephen King hardcover book, I can use it as a paper-weight, bug-killer, timeless tome to read under the covers, or other. If I bought the DVD movie, I can play it for personal enjoyment wherever I want, however I want. I can also just listen to the audio output, or mute it and just have the video playing. I can consume it in any way that I want. If I am buying an e-book, the options and methods of consumption are different but should not be limited. If I ask my partner to read that hardcover book to me at night, the publisher doesn't get to charge me additional money for it.

There have been text-to-speech computer programs going back DECADES. If I had Sound-Blaster 'SAY.exe' [1] read to me the ebook that I bought NOBODY would be crying over "you are stealing our monies!"

The modern text-to-speech is nothing more than SAYv2019.

If I don't want text-to-speech, I'll buy the audiobook and hear it performed by a real person.

[1]https://ibm-pc.org/manuals/other/creative/SoundBlaster15.PDF - page 25


If you bought a book, does that mean you should be able to read it out loud to your children for FREE?!?!?


That's a tricky one. I'm not a copyright lawyer.

My spidey sense tells me it's probably in violation but it's one of those things that are so close to impossible to enforce that no one would ever get caught, plus if they somehow did get caught (their evil genius 5 year old planted a camera in the room, recorded it and sent it to the publisher) nothing bad would come of it because any publisher who sued a parent into ruins over reading a book to their kid would probably be outcast from society if the media got a hold of that story.

Now, if you recorded that reading alone and put it on Youtube, I imagine that would for sure be a clear cut violation even if you didn't monetize the video. The same goes for reading it out loud in a public place. The only real difference there is you're no longer in the privacy of your own home and there's concrete proof of the violation in a much different context than reading to your kid. Now you're distributing copyrighted material.

I had something similar come up with one of my video courses. Someone wanted to buy a single license of my course and then teach the course live to a bunch of high school students in Africa. They asked me if they could do this with only 1 license even though a bunch of students would be consuming the content. I said sure no problem and wished them the best and to ask me any questions if there were any issues. I guess that's one benefit of publishing things on your own. You can treat these things on a case by case basis.


Except a novelization of a movie is entirely different than the movie itself. Why is this any different than doing text-to-speech of an ebook? As long as this is done on-the-fly and the text isn't stored afterwards, what difference does it make in how the original piece of media is consumed? Amazon isn't selling either form of transcription. What if someone made a device for the deaf that would transcribe spoken word to text? Would it be fair for the manufacturer to be sued for possibly doing speech-to-text on some copyrighted recording?


That's a different argument. The right argument is whether you should get subtitles and/or the script for free. Similar would be whether YouTube can automatically add subtitles to videos without explicit permission from the video uploader.


> Similar would be whether YouTube can automatically add subtitles to videos without explicit permission from the video uploader.

That's a very good one.

My personal opinion is Youtube should not be able to add captions without your consent because suddenly this changes everything.

What if one day Youtube decides to add a transcript feature where they can turn a video into a decently formatted blog post. Now imagine if you had your own website where you used to make a blog post about your video manually and then uploaded your video to Youtube with a link back to your site.

A move like that could remove a lot of traffic to your site because chances are Youtube would outrank your personal site by a lot in search results and would also give people less of a reason to ever click back to your site.

But this is really similar to audio books vs text. They are both independent mediums and having access to 1 shouldn't guarantee free access to the other. It should be up to the discretion of the content creator (or publisher if they are not self published).


If you are deaf, you are really hoping that YouTube will add captions to everything, because that's the only way you know what is being said. It would "change everything" by giving deaf users access to youtube...


It's more complicated than that.

Auto-generated captions (something I deal with on a regular basis as I do create video courses) tend to be horrendously bad in quality even on major platforms. Especially with tech content (even if you talk crystal clear).

Auto-generated captions that come out badly both leave the reader wondering what "jean kids" means when you really said "jenkins" and devalue the content creator's brand, because now you think "how did such crappy captions make their way into this video?".

Now with that said, I'm all for providing human created / vetted captions that are accurate, but that should be up to the content creator to decide.


It should be up to the content creator to decide if they want their content to be accessible to deaf people?


It should be up to the content creator to decide which medium their content is released as.

The platform should never step in and create new content based off your copyrighted content in a format that you didn't explicitly release.


Granting permission is probably in the YouTube ToS, at least if the lawyers thought about it and considered it to be an issue.


"cross-format products" is basically their way of saying "talking book as an app".

You can find lots of these on the App Store; in the days of yore they were also distributed on CD-ROM (as a separate purchase from the corresponding book and cassette book).

One of my favorites is the app for "Moo Baa La La La" which lets you tap on individual animals in addition to the "read-along" mode.

https://apps.apple.com/us/app/moo-baa-la-la-la-sandra-boynto...

So, you could say that this ML product is putting certain software developers out of business.


Forget about the philosophical "because you didn't buy that" or the various decent counter arguments. What it comes down to is that publishers purchase very narrow licenses of an author's output. One may specifically buy print rights, and only print rights for a specific geographical region like English USA. Or just the audio rights for a region. The same publisher often does not own both print and audio. So the big publishers that deal mostly in print get very defensive when sales of a copy they had no part in might eat some of the sales they might otherwise get. That's what you see in this case.


I'd have to check but I'm not so sure you're buying, in the traditional sense, either of them. In modern legal terms, you are paying a fee for the rights to access that work in that format.

It used to be you'd purchase a print book, you'd own it, and you could do whatever you wanted with it (e.g., resell it, burn it). That's not the case anymore. You can't resell it, not because of the format, but because you don't own it in the first place.

I'm not a legal expert, and I know it might sound like I'm parsing words, but I feel there's some validity to my 50k feet assessment.


> How are they doing anything but trying to charge me twice for the same story?

That's essentially what is happening, but that's not the impetus for copyright law.

Standard IANAL caveat.

The law is set to protect and encourage content creators by clearly defining what is copyrighted work, and how that copyrighted work can be used.

> why once I have the audiobook I could not be allowed the text version for free.

Speculation, but it could be that publishers haven't figured out the simple economics of rolling the cost into the price of an audio book, or that publishing deals with authors prevent that type of bundling.


Related scenario: Imagine I sold a digital image (artwork). Imagine I sell two versions, each licensed separately (if you want both, you have to buy both): One is a full color image, and the other is a grayscale image.

If you display the color image on a black-and-white monitor, you will be seeing something virtually identical to the grayscale version.

By the logic I'm seeing here against Amazon, it would therefore be considered copyright infringement to sell black-and-white monitors?


I largely agree with this logic per my other comment [1], but I think your analogy leaves out an important element (as does mine!), which is that the converted version only lives within Amazon's platform; it's not like Amazon is licensing you a tool that you can apply to anything and happen to use on an Amazon-published work.

So that makes it more like a derivative work Amazon is distributing than a client-side format shift.

So then, let me tweak your analogy a bit. Let's say a studio sells copies of a movie to theaters. They sell the color version and the B/W version, and different rooms/times might have one or the other. The theater sells out all the tickets for the B/W version, but can't sell any tickets for the color version.

Thinking quick, the theater management decides to sell tickets to a color showing as a B/W showing, and when patrons come in, they put some layer over the screen so it looks B/W.

In that case, I could kind of see how that would violate the copyright/license agreement of the studio -- you're selling showings of the B/W version you didn't purchase. [2]

Your scenario would be more like a customer coming in with filtering glasses that make it look like the B/W version to that customer only, in which case, yes, it would be harder to make the case that it was copyright infringement.

[1] https://news.ycombinator.com/item?id=20800777

[2] And yes, in practice, the studio wouldn't turn down the money, but let's say they're stubborn about it for some reason.


> black-and-white monitors

Don't know. But is it similar to DeCSS technology and the MPAA?

B&W Monitors could be argued to have multiple purposes, a side-effect is showing copyrighted images.

I'm not fully informed as to what Amazon is doing, but if their technology is being applied specifically to create digital text versions of Audio Books, then it could be argued in a similar vein as DeCSS, which was argued to be necessary to provide "backups" of physical media.

It could also be argued that Amazon is using legal maneuvering for a future showdown in which they can argue the effort to create digital text books is minuscule and thus drive the market price down.

Again, IANAL.


I love this example. I can understand why the audio version isn't included for free with the text version: the voice artist had to work 40~50 hours to create it, akin to the coloring artist in your example. But the reverse doesn't make sense. When I buy an audio book, I'm already paying both for the contents and for the execution: surely the contents were already paid for.


Here's something even more crazy. You can buy an ebook but you can't use a text to speech engine on the actual text.


The publishers have been strong armed by Amazon for well over a decade. Apple was sued and forced to settle by the DOJ for giving publishers more control over what they could charge, and it strengthened the hand of Amazon.


And if the story was made into a movie at some point, they're screwing you by not throwing that in as well? What about other languages?

The product isn't the story by itself.


I'd imagine the real reason for the complaint is that the royalty structure is more lucrative for the publishers and/or authors for books vs audiobooks.


Hard to move too far forwards without acknowledging that copyright law's constructions are deeply divorced from the public's values.


You're not entitled to sheet music when going to a concert either.


Ok but explain why in a way that doesn't end up with "because publishers want more money"


Do they want more money, or just not to lose existing income?


those are the same things in essence. If their market is shrinking, they are losing "existing" income. If they use force to prevent this from happening, then it is the same as wanting more money.


That’s a reasonable perspective.


> When I buy a book I am buying the story.

This is 100% not what you’re buying. You’re buying a book or an audiobook. Could be 200 blank pages or 10 hours of white noise.


The whole idea of copyright is unsound, so arguing about its 'intricacies' is pretty pointless. Garbage in, garbage out.


In true ELI5:

Copyright = right to make a copy


When you buy a print book, you are buying a physical license token for a copyrighted work, bundled with a single copy of that work. The license is attached to the copy.

These are actually separable. If you remove the cover from a book, you are destroying the license token, without necessarily destroying the [now unlicensed] copy. This is something various entities in the book industry may do, to refund unsold copies from the distributor without incurring return shipping, warehousing, or handling charges. Instead of sending it back, the book is rendered unsellable by removing the cover with the ISBN on it, the paper is tossed in the trash, and nobody gets paid for the unsold copy. You can theoretically fish the pages out of the trash and read the book, but that's now technically piracy, because the copy is unlicensed.

If you rip the cover off of a book you bought yourself, no big deal. You still have the license, because you bought it, but then you can't resell your copy, because the license is no longer embodied in a physical token attached to the copy. You can't prove you have the license, and any potential buyer couldn't be certain that you transferred it to them.

For an audio-book on physical media, when you buy it, you buy the license implicitly attached to the copy. But since copyright is a mess, a recording of audio is a different copyright from the material used to generate the audio. An audio-book of Jeremy Irons reading a story has a different copyright from an audio-book of Bobcat Goldthwait reading the same story. You're not licensing the story itself, but a specific recording of the story. It's the same reason why a piece of music can be in the public domain, while a recording of it made by the FooBar Ensemble in 1991 can still be copyrighted. That theory of copyright is also how museums and collections try to claim copyright on digitizations of visual art that has long been public domain. There are a lot of tests about "sufficiently transformative" and "slavish copy" and whatnot. There's also a lot of money in ensuring that cross-media copyrights remain separated.

From the publisher perspective, someone has a license for an audio recording. They probably also sell licenses for print copies, and licenses for e-books. And these are all sold individually, not as a bundle, because they are allowed by law to treat all these as separate works.

From the consumer perspective, buying the audio recording is implicitly licensing both the specific recording and the work on which it was based, because otherwise, how could one possibly legally enjoy the content of that recording? So the consumer considers it perfectly acceptable to transform the format of what they believe they have licensed. I paid; this copy is mine now; I can do with it as I please. Parent post reinforces this. "I'm paying for the author's story + actor." But you're not. You licensed a copy of a recording, period. It's a derivative work, and you're piggybacking on the recording-owner's licensing of the original story, that gives them the right to make an audio-book.

This confusion is not the consumer's fault. It arises from the non-intuitive cross-media parts of copyright law. It arises from the publishers and distributors of copyrighted works exploiting the ambiguity between buying a license and buying a copy. They would dearly like to have their cake, and eat it too. When people buy cakes, and then eat them (as one does with cakes) they get upset, because they sincerely believe that selling a cake and selling the right to eat a cake are completely different things.

If they choose the non-transferable individual licensing model, they can no longer sell the same work to the same person multiple times. If I lose my file, I still have the license, and I can either demand a fresh download from the distributor or acquire one from another source without being considered a pirate. If they choose the license-attached-to-copy model, they can no longer capture the resale market. I can read or listen to my book and then resell it to someone else. What they want, but should not have, is a revocable non-transferable license attached to a single copy, so that if I buy a copy, and lose it, I still have to buy another copy, but if I then locate my original copy again, I can't resell it or get a refund for it. That model is worth more money to them, in their estimation.

In my opinion, the publisher argument is bullshit. But it is founded in law. The law happens to be an ass. So they will likely win, until a federal appellate judge decides that they want their public service career to be destroyed by Disney et al.


I don't know about the legality, but this seems like a very useful feature.

When I was a teen (well, younger teen), I learned to understand and comprehend western accents by watching a lot of videos by North American or European creators and using closed captions on YouTube. At this point, I don't need them anymore, but I certainly did when I first got started.

Clearly this feature on Audible can serve a similar purpose, and can definitely help students like myself in developing countries who are taught how to read and write English, but not exactly speak it - at least not in the same way as the western world.


Of course it is a useful feature, that's why Amazon wants to add it. They could pay additional licensing and include the original text as captions just as easily ...


Does it really help to learn English to have a poorly transcribed audio version because Amazon doesn’t want to pay the license fees?


I think transcription (even if not perfect) helps a lot when English is not your first language and you are still training your ear. Remember not everybody processes audio the same way, think ASD or ADD people.


The implication hidden in OP's comment is that Amazon, a multi-billion dollar company, is seeking to pay less money to people generating content for its platforms and instead provide users with an inferior product. The question was semi-rhetorical, with the understood context being that the best choice would be for Amazon to pay for all the licenses required to make their products better.


Maybe I'm misunderstanding, but it seems to me that it's the consumer, not Amazon, that's being asked to pay extra fees. The consumer bought an Audiobook, Amazon wants to provide read-along transcription for that audiobook, but the publisher wants the consumer to have to pay extra for that.


Do you think that Amazon couldn’t offer a sum of money to publishers that was lower than the retail cost to allow a real, accurate transcript?

The music publishers accepted an offer from Apple to allow any song from anywhere to be counted as a “purchase” for $25 a year through “iTunes Match”. There is some number between $0 and $full_wholesale_price that the publishers would accept.


Tangent: I enjoyed www.fluentu.com when I was studying French. They provide tooling around YouTube videos and subtitles in foreign languages.


I wonder if the publishers will also be mad about Android Q's upcoming feature which can transcribe any audio on your phone: https://www.theverge.com/2019/5/7/18528447/google-android-q-...


Probably - publishers are always mad.

Kindles actually used to have headphone jacks and a feature to automatically convert books that you owned into auto-generated speech.

It was a fantastic accessibility feature, but the publishers complained, and the feature and the hardware supporting it were both removed. Although like people are saying, I guess the headphone jack also probably wasn't a very popular feature.


Kindle Text to Speech is still a feature. It’s just optional for books to support it or not.

The headphone jack was probably removed to save money... I think the high-end Kindles support Bluetooth.


> It’s just optional for books to support it or not.

Translation: It's optional for the publishers to enable DRM restricting it or not.

If I buy a paperback book, there is nothing stopping me from building a machine I can insert the book into that scans the page, OCRs it to text, and then converts the text to audio.

DRM'd ebooks are a travesty, which is why I immediately remove all DRM as soon as I buy one so that I have full freedom to do whatever I want with it.


> If I buy a paperback book, there is nothing stopping me from building a machine I can insert the book into that scans the page, OCRs it to text, and then converts the text to audio.

Nothing except the extreme hassle of doing so.

Digital is different, because it’s so much easier and cheaper to perform such transformative operations.


The point is, there are no real restrictions to use of paperback books. I can whiteout words, tear out pages, cut out quotes and tape them on my wall, photocopy, etc. Basically you have the freedom to do anything you want with it because you own it. You would probably get into hot water if you tried to start selling/distributing unauthorized copies, but as far as private use goes, the sky is the limit.

eBooks are extremely restrictive by comparison. They are so locked down that you can't do any of the above things I mentioned. You only have one basic freedom: to read it with approved software on an approved device with the publisher in complete control.


Kindles actually used to have speakers on the side, not just a headphone jack.


This will be a wonderful accessibility feature, and if I were an Android developer and a publisher approached me to request that their app could opt-out, my reply would be "pound sand." I actually think it's pretty ridiculous how many folks are seemingly on the publishers' side here in the comments. Let's frame it a different way:

* Manufacturers of reading glasses don't require a license to transform works a user consumes into a format more accessible to them

* A built-in magnifier app does not require a license to transform works a user consumes into a format more accessible to them

* A built-in audio transcriber - and this seems to be the sticking point somehow? - should not require a license to transform works a user consumes into a format more accessible to them

A lot of people are getting caught up on this whole "whether it creates a derivative work" thing. And my point is: yes it does, of course it does, any method of consumption effectively does, and these should be explicitly allowed in the name of doing what's right for users and accessibility.

Whether it's built in to the app or built in to the OS, the same transformation is happening. And it's such an essential freedom to be able to consume media that you purchased in an accessible form that I could really give a rat's ass if the publisher thinks that you're only "licensed" to consume that media when it's emitted as sound waves from their approved audio source.


The big difference here is that Audible is an audio-book seller which has licensing agreements with publishers, and they are commercially offering the text transcription.

Those publishers probably don't have much leverage against third parties who make tools for personal use. Making copies for personal use is legal, as is manufacturing devices which can make copies. I figure the Android functionality is probably legal given this case:

https://en.wikipedia.org/wiki/Sony_Corp._of_America_v._Unive....


This was my first thought as well. Android will basically be able to do the same thing that Audible is doing but on a much larger scale. I wonder how many different licenses that is going to affect?


I feel like a number of AI applications are a bit like this. In this case, the training set could be the original book text, and an audiobook reading of the books (over thousands of books). When trained on that dataset the audio recordings are used to regenerate the original text.

So training set==test set and you don't care because the only purpose of the model was to circumvent certain potential copyright issues.

I feel this same issue exists in many other areas too. A model gets trained on a bunch of copyrighted data, and then can generate similar data... It seems like a kind of gray area. On one hand you might say the learning process is just "inspired" by the copyrighted material... someone else might say it's just derived from it (and should require licensing).

This often seems to get overlooked, particularly in image generation applications. Yea, of course you can generate interesting looking pictures. The question that often remains unanswered is how novel are these images... what is the closest image in the original dataset.
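A rough sketch of what that "closest image" check could look like, assuming images are flattened into plain lists of numbers and that Euclidean distance is a reasonable stand-in for similarity (in practice you would probably compare embeddings instead). The function name and the toy data are made up for illustration.

```python
import math

def nearest_training_example(generated, training_set):
    """Return (index, distance) of the training vector closest to `generated`."""
    best_index, best_distance = -1, math.inf
    for index, example in enumerate(training_set):
        distance = math.dist(generated, example)  # Euclidean distance
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance

# Toy 4-pixel "images"; real ones would be flattened pixel arrays or embeddings.
training_set = [
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
near_copy = [0.0, 0.1, 1.0, 0.9]  # almost identical to training example 0
novel = [1.0, 0.0, 1.0, 0.0]      # not close to any training example

print(nearest_training_example(near_copy, training_set))
print(nearest_training_example(novel, training_set))
```

A small distance to some training example suggests the "generated" image is mostly a copy; where exactly to put the threshold is the part nobody has a good answer for.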


It's an interesting question. Arguably, any model trained on copyrighted works could be said to be a derivative of those works, as the model is composed entirely of the copyrighted works run through a deterministic computational process.

But on the other hand, we certainly wouldn't apply that same standard to a human who "learned" from copyrighted works, so where do we draw the line?


A JPEG is also sorta kinda a derivative work in the abstract. It learns the common patterns in an image and makes a new image with those patterns. A neural network that is hugely overfitting will do the same. The legal (and in this case maybe moral?) line isn't drawn based on the algorithm. The line is drawn based on the output. Is the output sufficiently original and transformative?


> But on the other hand, we certainly wouldn't apply that same standard to a human who "learned" from copyrighted works, so where do we draw the line?

Are you actually having trouble figuring out the line between commercial ML product and human being?


No, I'm talking about the line between learning and memorization. Not the overall line between human intelligence and modern-day AI (where there are certainly obvious distinctions).

Let's say, for example, that a human who was not a native-born English speaker taught themselves to speak English solely by listening to hundreds of copyrighted English fiction audio books, and comparing them to translations of those same works of fiction in their own native language. (Maybe not the easiest way to learn English, but entirely possible.) Now suppose that human used their newfound mastery of the English language to teach someone else to speak English. Since their knowledge of English was solely derived from copyrighted works, would their knowledge of English be considered a derivative work and therefore subject to copyright? Obviously not.

Now suppose the same thing occurred with an AI. The AI learned to translate some other language into English by reading hundreds of thousands of translated books in both languages. Is the AI model a derivative work, and therefore subject to copyright?


There are a ton of things humans can legally do that computers/corporations can't. I can memorize a copyrighted work (make a mental copy) without triggering a copyright infringement. I can remember a friend's email address without triggering GDPR. Why is this suddenly a difficult question just because it's machine learning versus machine copying/storing?

That still leaves the question of whether ML training sets do/should count as copyright infringement, but that's completely distinct from the idea that a human can do it without causing a copyright infringement.


Not sure the copyright terms of an audio version of a book include other uses such as training models. Even if this is not mentioned, I would still assume the publisher can make a strong case that it is being used for something other than the intended purpose.


In at least some cases, it would be considered fair use. The Wikipedia page on fair use has this to say, regarding Google's book scanning for example:

As part of the ruling that found the book digitisation project was fair use, the judge stated "Google Books is also transformative in the sense that it has transformed book text into data for purposes of substantive research, including data mining and text mining in new areas" [1]

To me it feels like a continuum. On one end of the scale, the model has been informed by copyrighted data. The model has learned something about the structure of the data, which generalizes to new examples easily. On the other end of the spectrum it's just an efficient encoding of the corpus dataset (in this case mappings of sounds to text). It's not learned anything...

How do you distinguish these cases? Can the second type of "machine learning" be used to circumvent copyright in some cases, and is it?

[1] https://en.wikipedia.org/wiki/Fair_use#Fair_use_in_particula...
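One crude way to place a model's output on that continuum is to measure how much of it appears verbatim in the training corpus. This is only a sketch: the function names are mine, the strings are toy data, and a longest-common-substring ratio is an assumption about what "just an encoding" would look like, not a legal test.

```python
from difflib import SequenceMatcher

def longest_verbatim_run(output, corpus):
    """Return the longest substring of `output` that appears verbatim in `corpus`."""
    matcher = SequenceMatcher(None, output, corpus, autojunk=False)
    match = matcher.find_longest_match(0, len(output), 0, len(corpus))
    return output[match.a:match.a + match.size]

def copied_fraction(output, corpus):
    """Fraction of `output` covered by its single longest verbatim run."""
    if not output:
        return 0.0
    return len(longest_verbatim_run(output, corpus)) / len(output)

corpus = "it was the best of times, it was the worst of times"
memorized = "it was the best of times"            # straight out of the corpus
paraphrase = "the times were good and also bad"  # same idea, different words

print(copied_fraction(memorized, corpus))
print(copied_fraction(paraphrase, corpus))
```

A fraction near 1 looks like the "efficient encoding" end of the scale, a low one looks more like generalization. It says nothing about paraphrase-level copying, which is where the hard cases are.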


Indeed.

One particular argument in Authors Guild, Inc. v. Google, Inc. is about search functionality, which may be a little similar. If I understand correctly, Google Books lets you search the book. But I'm not sure whether they improved or modeled their search engine on copyrighted books.

Along the same lines, if I remember correctly, Google used classic and out-of-copyright material to train its translation models, which is more relevant to the current case.


I don't see a practical way to prove this though. The software is developed behind closed doors and unless someone explicitly leaks this information, it's just a caption feature working really well. Amazon could claim they trained it on a legit data set they had all the necessary rights for.

Or am I missing something and they're openly admitting they used it?


Just because it is difficult to prove does not make it less of a violation.


> Not sure the copyright terms of an audio version of a book include other uses such as training models.

It must. I am using the audio book to train my meat-brain implementation of a neural network. This is no different than training an electron-brain neural network. I have purchased the book and the information contained within must be usable in any form I desire, except for copying and redistributing the text.


Legally I believe this distinction is quite clear (IANAL) between these two cases.

You can memorize a whole book, and it's not copyright infringement. If you create a compressed zip of a book, that infringes if you don't own the copyright.

If a trained model really doesn't infringe, then that's it: copyright is dead. Because it's not difficult to overfit a model such that it can replicate the original work. And if that model is no longer under the original copyright, it can be shared without restriction.
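To show how little it takes, here is a toy stand-in for an extremely overfit model. It is a plain lookup table from a 12-character context to the next character, not a neural network, and the "book" is a made-up sentence, but the effect is the one I mean: the "model" is just the text in another format.

```python
def train(text, context_length):
    """Map every context of `context_length` characters to the character after it."""
    model = {}
    for i in range(len(text) - context_length):
        context = text[i:i + context_length]
        model[context] = text[i + context_length]
    return model

def generate(model, seed, max_length):
    """Extend `seed` one character at a time using the learned table."""
    output = seed
    context_length = len(seed)
    while len(output) < max_length:
        next_char = model.get(output[-context_length:])
        if next_char is None:  # context never seen in training
            break
        output += next_char
    return output

# Made-up stand-in for a copyrighted book.
book = "The quick brown fox jumps over the lazy dog, then naps in the warm sun."
model = train(book, context_length=12)

# Given only the first 12 characters, the "model" regenerates the whole text.
regenerated = generate(model, seed=book[:12], max_length=len(book))
print(regenerated == book)
```

Exact regeneration here relies on every 12-character context in the text being unique, which holds for this sentence. A real overfit network is messier, but the sharing problem is the same.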


> If you create a compressed zip of a book, that infringes if you don't own the copyright.

Of course you don't - a compressed zip, or any transformation of the text, is a derivative work. But the neural network you use the text to train is not a derivative work unless you intentionally only used this single book to train it for the express purpose of reproducing the original text. In which case, that is no different than doing a format transformation, and that is protected under copyright.

What I'm saying is that any book that is used as a source of knowledge cannot claim derivative work on the resulting culmination of knowledge.


Why do you believe it's not a derivative work? Let's say a neural network was trained on two books, and it ends up reproducing large fragments of text from those two books. Is that a derivative work?

Or 3. Or 4... And reproducing how much of the original work exactly, under what conditions? Does the size of the model factor into the analysis?

It doesn't seem as clear cut as you suggest to me.


Yeah, if you trained an AI to write Harry Potter fanfic (and remove all the trademarks by hand, no Harry Potter(TM) etc), is the result a derivative work? It seems like it could be, even if it's generic child wizards at a generic wizarding school. Fanfic is generally a derivative work.


IANAL, but I would guess you can't simply use everything for professional purposes; this depends on the licence/copyright of said stuff. In other words, if you use copyrighted/licensed stuff in any form or in any phase of your product without having permission to do that, you are breaking the law.


If you don't have a 1:1 match in the text, I see no reason why it would be considered an infringement. Isn't that what people also do? There is no parthenogenesis, so a system that "learns" and doesn't reproduce should be treated the same as a human who reads and then writes.


I don't think you can pick a book from your library, narrate it, and then start selling the audiobook without permission from the copyright holders.


But what if I bought a book, read it, and then recited it from memory as well as I could (which is obviously not 1:1, but a vague idea of the story), adding things from other books where I didn't really remember what was happening? I think I've seen thousands of movies/books promoted as original content, but made this way.


What you describe would be a derivative work, and you still won't be able to sell it without a license.


It all boils down to a very complex function of exactly how close to the original you are and how good your lawyers are. The better your lawyers, the closer to the original you can go.


Audible books are licensed though. The author is still compensated for an audio version of a book. If I bought a movie and had a smart TV that performed AI based closed captioning would that be some sort of copyright violation?


Yes. Captions have copyright.


They're not reselling it nor redistributing it. The captions are derived from the AI and presented to the user. It may need some court precedent for this sort of thing, but I believe this would fall under "Fair Use".

Fair Use Within copyright law there is a doctrine of law known as fair use. Although fair use does not actually give permission to make copies of a work or otherwise use a work without consent, the doctrine provides a defense to copyright infringement. The factors considered when applying the doctrine of fair use include: the purpose and character of the use, the nature of the copyrighted work, the amount of the work used, and the effect the use has upon the market for the copyrighted work.


Do you feel the same about Cambridge Analytica deleting their FB data but not deleting the ML models learned from it?


Not at all, the very act of learning from that data was (arguably) illegal. Reading a book is not illegal.


Honest question: If Apple integrated an AI-based speech-to-text feature in the next iPhone as an accessibility feature that works for all audio captured by microphone or generated by apps, would that be seen as copyright infringing? Would it have to be disabled for Audible (even at the request of a single publisher), Netflix (by request of a Netflix licensing partner) and other apps? If yes, then the publishers have a point. If not, shouldn't Audible be allowed to add such a feature to their app if the speech-to-text feature doesn't use the original manuscript but only the audio for real-time processing?


Actually Android Q is adding exactly that feature so we may get to see this argument play out: https://www.theverge.com/2019/5/7/18528447/google-android-q-...


Legally, I suspect both cases would be handled the same. Real time speech to text for private use should be legal.

However, Audible has to license content from copyright holders, which gives copyright holders a certain amount of power over Audible to force them to not offer such a feature, independent of the legality.


Apple has a history of playing nice with copyright holders and allowing them to opt out of things like screen capture and even playing audio in the background on web pages (see YouTube)


By this logic, when you buy a book, you'd need a special license to read it to your children. It's the exact inverse of speech-to-text - if one is forbidden, then so is the other.


The critical difference is that your reading it to your children is not distributed or for sale, therefore is in private fair use.


But allowing a user to convert between text and speech on their own device isn't selling the converted version.

And if this conversion is done not locally, but via SaaS, that's just an implementation detail. I believe there's precedent for this (which I cannot find at the moment), but if this was not the case, then you could not touch any document you did not have copyright on using e.g. Google Docs.

Edit: Found the precedent where time-shifting (i.e. video recording) was found to be non-infringing, regardless of if it's done locally or remotely [1].

[1] https://en.wikipedia.org/wiki/Cartoon_Network,_LP_v._CSC_Hol....


This isn’t the user doing the conversion, it’s an Amazon product doing it. You say it’s “just” an implementation detail, but the distinction of how it’s being generated matters.


If I add notes to a PDF using Adobe Acrobat, did I add the notes, or did an Adobe product do it? What if I do it with Google Docs, in the cloud? Then it's not even on my machine.


You are conflating agency of action with the means of action.


This is a distinction I don't understand.

If I run a local speech-to-text client on my computer for my own use, that's pretty established fair use.

If I run the same software on a Raspberry Pi and transmit the results over my LAN for my own use, I assume that's also fair use.

If I provision an AWS VM and run the same software there, and then transmit the results back to my device for my own use, is that fair use?

If I get my best friend to provision the AWS VM and run the same software there, and then transmit the results back to my device for my own use, is that fair use?

If Amazon provisions the AWS VM for me, and runs the same software there, and then transmits the results back to my device for my own use, is that fair use?

People are drawing a line here that I genuinely don't see. The whole thing feels incredibly arbitrary to me. The agency of action is that the user is asking a computer to perform a task. The means of action is a piece of software running on a computer, computing results on the fly.


>If Amazon provisions the AWS VM for me, and runs the same software there, and then transmits the results back to my device for my own use, is that fair use?

Did Amazon do it out of their own volition, or did you instruct them to do it? Also what you're considering fair use, may or may not be fair use. Fair use is a specific exemption made in a specific context. You cannot reapply that exemption in an SaaS context and call it equivalent. You'll have to first demonstrate the legal basis for your claim that the two contexts are equivalent.

Also I do recognize that all of us here are non-experts giving our two cents, so any disagreement we have is TBH inconsequential to what the law actually says and how it's actually interpreted! :)


> Did Amazon do it out of their own volition, or did you instruct them to do it?

By purchasing the work through Amazon's store, and clicking the "enable speech-to-text", he instructed Amazon to do it. I should mention this looks like an extremely vague distinction, and it's not even mutually exclusive. Amazon provides a service of its own volition, and he took advantage of that service.


EULAs and those kinds of agreements have their own legal problems with how they can be enforced, so it's best not to complicate things here!

>I should mention this looks like an extremely vague distinction, and it's not even mutually exclusive.

I don't agree with your opinion that it's vague.

> Amazon provides a service of its own volition, and he took advantage of that service.

If the action that Amazon took could be proven to be "just as easily" done by the end user, then perhaps Amazon's acting on behalf of the end user could be considered equivalent to the end user themselves performing that action.

E.g., if there is a fair use exemption for making a copy of a CD for a personal backup, then if I told my neighbor's kid to do it for me, I think it could be considered equivalent to me doing it myself. Copying a CD is, at present, ubiquitous enough that assisting someone is probably on safe legal ground.


> EULAs and those kinds of agreements

I'm not talking about any EULAs - I'm trying to say that clicking the "speech-to-text" button is no different than opening a file in notepad, changing a few lines, then clicking save - you're instructing a program to do something. Would you say that notepad.exe/Microsoft altered the file "of its own volition"?


I already addressed this in the previous comment. You might have missed it.

>> Amazon provides a service of its own volition, and he took advantage of that service.

>If the action that Amazon took could be proven to be "just as easily" done by the end user, then perhaps Amazon's acting on behalf of the end user could be considered equivalent to the end user themselves performing that action.

>E.g., if there is a fair use exemption for making a copy of a CD for a personal backup, then if I told my neighbor's kid to do it for me, I think it could be considered equivalent to me doing it myself. Copying a CD is, at present, ubiquitous enough that assisting someone is probably on safe legal ground.


I hope someone uses this line of reasoning on main TV channels: CHRONICLE, HACHETTE, HARPERCOLLINS, MACMILLAN, PENGUIN RANDOM HOUSE, SCHOLASTIC AND SIMON & SCHUSTER are trying to make it illegal to read their books to your children.

All with their logos on the screen.

Oh. Can I dream...


You would need a special license if you read it to your children and then sell that recording.


But the users aren't selling the generated text. And Amazon is only providing a service that lets the users do the conversion, much like Google Docs lets users edit documents.


If you livestreamed reading to your children you would have to pay -- you are performing the work in public.


Then I wonder why people who stream video games don't have to pay?


Because most video game publishers allow them to stream without paying. But not all: https://www.theverge.com/2018/11/28/18117172/nintendo-youtub...


Some license holders have legally restricted streaming of games in the past (think licensed music, but also some specific games). The reason they don't generally is because it's in their best interest to have people streaming their games online.


I would assume because it's viewed as free advertisement.


I think also because the key to the experience is interaction and playing. However, in more 'cinematic' games, one can avoid purchasing the game and just watch a speed run.


I like such features, but I can also see how, in the current legislation, the transcriptions are basically derivative works. Distributing them to the user without a license could very well be a copyright infringement.

I don't think Fair Use has a chance either: https://en.wikipedia.org/wiki/Fair_use#3._Amount_and_substan...

> In general, the less that is used in relation to the whole, the more likely the use will be considered fair.

If a whole book is transcribed, it's certainly not Fair use anymore.


And if the transcription is done by the user? What if the user gets the print version, and decides to change the font, or background color, or screen brightness. What if they decide to change the volume of the audio version, or use an equalizer to change pitch, or change the reading speed - is that not also a derivative work?

What if the user takes notes, associating them to time-stamps (audio) or written directly into the pages (print)?

In all those cases the user has altered the work, and made a derivative one, if you will. But the publishers want control over how you consume their works - they want control over what happens on your own device, so they can extract more money from you.

It's hard to express how abhorrent I find this.


> And if the transcription is done by the user?

If you can successfully argue that in court, chances are you'll win.

> In all those cases the user has altered the work, and made a derivative one, if you will.

It's a difference if you do it for your personal notes, or if you then distribute them at scale and make money from it.

Just for the record I don't like how it's currently handled, it's just my best understanding of what the law and the courts say.


> It's a difference if you do it for your personal notes, or if you then distribute them at scale and make money from it.

I think that's my main argument - this speech-to-text is the same as doing it for personal notes, and if this is made illegal, then a huge number of other traditionally allowed activities become illegal.

That it happens on someone else's server is irrelevant: 1) It's the same as hiring someone to take notes on a book you own. 2) There's precedent where someone making a recording on your behalf is legally the same as you doing it yourself. 3) If all else fails, the transcription can be done locally.


"I think that's my main argument - this speech-to-text is the same as doing it for personal notes,"

I don't know if this is actually true? Because this isn't suing individual audiobook holders, it's suing the company that is choosing to sell copyrighted-text-in-speech-to-text as a service.


Well, it's selling a speech-to-text service, that the user applies to speech they own a license to. Why should the legal interpretation differ, just because of where the conversion happens? It could be done locally, or remotely, or locally but charge the user per-word translated. Looks like an implementation detail to me (and again, I think there's precedent backing me up in this view, but I can't find it, sorry).


"Looks like an implementation detail to me (and again, I think there's precedent backing me up in this view, but I can't find it, sorry)."

Yes, implementation details tend to be important when it comes to copyright and fair use.


So now speech-to-text is illegal if the system allows itself to be fed copyrighted works as input?


No, the service appears to exist only on the platform with copyrighted works?


Agreed, on the surface (and given current legislation/IP legal climate).

One thing that may also come into play - what about accessibility and the Americans with Disabilities Act? In this age of "slippery slopes be damned," I wouldn't necessarily be surprised to see someone file suit alleging that these sorts of licenses - that limit scope only to audio - as a whole are infringing upon their rights under the ADA.

(Whether it would stand up or not is another question - anyone can file suit for anything, but there's no guarantee of a win. But it seems to me that Amazon might have a good defense here, just based on that idea.)


If you couldn’t hear, why would you buy an audiobook when there is a text version available? That’s like suing a store because you can’t walk up the stairs when they have a ramp available.


The disability doesn't have to be impaired hearing. Maybe you suffer from a reading disability and find the feature useful for practicing reading, or for some reason have trouble focusing on text or audio on its own, but can improve your comprehension by having both simultaneously.


If you have a reading disability, the last thing you want is an imperfectly transcribed, imperfectly punctuated text representation that would not be like what you would read normally.

Text to speech translation is a well known and well solved problem that computers have been able to do well since the mid 80s. Speech to text - not so much.


> why would you buy an audiobook when there is a text version available?

Someone gives it to you.

You build up a large collection and then go deaf.


The same reason someone who is deaf would buy a VHS tape (back in the day), when they could just see the film in the theater.

I'm not saying it makes sense - you and I agree on that front. But as I mentioned, this is the world we live in, and people have sued for (and won) huge amounts due to odder things.


- If someone else bought an audio book, I would like to be able to access it. I can't if it's audio-only.

- If I want to buy and listen to an audio book, I wouldn't be able to understand the speech without seeing the text. The text gives my brain additional information to process speech. This is much more relevant for music where lyrics help me hear the vocals in the music.


"I buy this audiobook but I want it to have raw text so I can use it with a screen reader"


If I buy an audible book and transcribe it for personal use, that should be fair use.

Running machine learning on my audiobook seems like just a way I’m consuming the book I purchased.

But I also think it was wrong that the text-to-speech feature was blocked, which basically stopped the reverse from working on purchased text books.


Just to clarify, the "fair" in "fair use" is not really in the sense of whether the use is fair and just. e.g. it's not legally valid to argue "I paid for this audiobook, and I did all the work to transcribe it, and I'm not profiting from it. It's fair use, i.e. not unfair to anyone else"

Fair use is primarily about transformative work, e.g. how much copyrighted material is it "fair" for someone to copy and reuse in order to create a transformative work (e.g. parody or critique), without having to compensate the copyright owner.

Your example involves reproducing the entirety of an intellectual work onto a different medium. That's not a transformative work in the context of fair use. And IANAL, but as long as the copyright owner is making money from selling separate text and audio copies, I don't see how creating your own print copy via transcription, ostensibly because you didn't want to buy it from the publisher, is technically legal if for some reason it became a legal case.

Of course, when it comes to the real world, your personal transcription is protected by the fact that no one knows or cares about it :)


I disagree a bit with your definition of fair use. Fair use is a bit of a hodgepodge, I think (IANAL), but #4 of the “four factors” [0] is the use’s limitation on creating value from the original work. It seems to me that creating captions from an audiobook in no way limits the copyright holder from selling more audiobooks.

There’s also a factor on profit/non-profit uses that doesn’t hold up with amazon providing it, but could if it’s not a sold feature or is provided through an OSS plugin.

The other factors are amount and nature. Since it’s the whole work, the amount factor seems to go against fair use, and the nature factor doesn’t seem to apply.

[0] https://en.wikipedia.org/wiki/Fair_use


I didn't mean to imply my comment gave good coverage of the topic, I just wanted to point out that "fair use" is focused on finding a balance between respecting copyright and protecting the freedom to make transformative works. It was definitely a nitpick comment on my part, with little real-world relevance to the original comment, because it's inconceivable his personal transcript would be legally challenged. And because it's so much work to manually transcribe, to the point where he's likely to make mistakes and deliberate alterations and additions of his own, it'd be even more work to argue that it wasn't transformative, never mind that it had any profit impact whatsoever.


In my opinion, this line is key: "One key difference, Audible says, is not being able to flip through pages, as users must wait for each line of text to be progressively generated as they’re listening."

Also, it will help in increased sales, and that will in the end benefit the publishers as well.

No consumer will pay extra for this feature, so it's also in the publishers' interest to accept and move on.


> No consumer will pay extra for this feature, so it's also in the publishers' interest to accept and move on.

Cool, then it's useless and Amazon can simply withdraw the feature.

But realistically, it gives Amazon a competitive advantage over other audiobook distributors and hence is something consumers will pay extra for, or at least change their purchasing habits for.


> No consumer will pay extra for this feature

While this might be true,

> so it's also in the publishers' interest to accept and move on.

.. this doesn't follow, because the publisher's argument is that this feature existing reduces other revenue.


But people don't buy books so that they can see 6-7 words at a time at about 100wpm in sync with an audio track. What market is being usurped?


I imagine they think it will destroy the existing audiobook market eventually, if it becomes good enough.


In the future, people will look at these cases and shake their heads in disbelief. I hope the judges do the right thing here.

The publishers seem too attached to their licensing schemes. Speech-to-text and text-to-speech are here to stay and this is not a surprise for anyone. Publishers should catch up.


Um... so... you do know authors make a good amount of their money nowadays from audiobooks, right? Ultimately this hurts authors more.

Edit: I'm getting downvoted because I think authors should get paid for their work? Wow. Okay, I've been noticing the nature of HN has been going towards entitlement bs, but this is ridiculous. Why don't you clock in to your job and offer not to get paid for that week, and still do the work. Oh no? I wonder why...


What's your logic on hurting the authors more? You can either buy the book or buy the audiobook. Here someone is buying the book and having a separate technology read the book they bought, and not every book has an audio version. So this opens up the possibility that authors without an audiobook can still sell their digital copies to people with text-2-speech, opening a new point of sale to people who buy audiobooks.

That logic is the same as if you bought the digital book and printed it, and the publisher sued you because it hurts their physical sales. If you buy a book it should be a license to print, translate, and text-2-speech, or whatever.


Um... so... you do know that blind people can't see the printed page, right? This technology opens up reading these books to them out loud. Requiring an additional license for people with a disability is pretty disgusting. Right now, I can share any books I download with my spouse, children, etc. But if one of them was blind and I wanted to share with them, suddenly I'd need a different license just for them? That doesn't seem right. The author's not going to get any additional money out of us, we'll just not read their book.


FYI the feature being discussed goes the opposite direction: it takes audiobooks and transcribes them to text.

But your overall point still stands.


Well, Amazon is making the audiobooks more useful. I think that's why you're being downvoted.



I would imagine one of Amazon's defenses is going to be that, actually, they are required to do this by the ADA?

I imagine various blind organizations will be filing friend of the court briefs.


You mean deaf organizations, right?

In any case, I agree with you. Some people would say "If you're deaf, just buy the regular book, it's cheaper", but there's a significant use case that's not covered by that: imagine if you bought a large collection of audiobooks and then due to some medical problem, you lose your hearing (and yes, I have heard of people suddenly losing most of their hearing in their 30s). Amazon could be sued if they didn't have an accessible option.

Edit: Here's another use case: Someone who's HOH and can understand most of the audiobook but needs assistance with a few passages here and there.


I'm not even what people usually describe as hard of hearing, and I need to go back and rewatch a couple lines of TV with subtitles to get it. It's useful for all kinds of people.


John Scalzi posted this article on twitter - and I made the same point that it was likely to comply with the ADA.

The whole thing seems like a tempest in a teapot given the relatively poor quality of speech to text. Text to speech is actually useful - but that has been around for years and publishers are not complaining.


Does the ADA cover publishing formats? Considering that most books in the US are published in regular print only, with no braille edition or audio edition or even a large print edition, if the ADA applied I would have expected lawsuits to have ended that long ago.


If the publishers prevail it probably would end non profits that "read newspapers and magazines" for the blind.

e.g.: http://gatewave.org/


Why would a blind [advocacy] organization be fighting for Speech-To-Text? Please explain.


« At the heart of the case will be a determination on the transformative nature of an AI-created audio transcription, and whether that constitutes a violation of the copyrights held on a written work. »

It seems to me that this shouldn't be what matters. Clearly Audible are distributing the books in a way protected by copyright, so they're relying on the permission they've paid the publishers for.

If the publishers have been giving out different licences for "audiobook" rights and "ebook" rights, that's not a distinction from copyright law, so deciding what's allowed should be a matter of interpreting the licence, not arguing about whether what they're doing is "transformative".

After all, the publishers could presumably write a licence that says Audible aren't allowed to distribute recordings of the work with background music added, and if a case about that went to court it shouldn't matter that the background music isn't itself any kind of copyright violation.


I wonder how useful this feature would actually be to the average listener.

I usually listen to Audible when I can't or don't want to read (e.g. walking or on a commute).


It makes me wonder what's really behind the suit.

Is it that the publishers are genuinely trying to protect their copyright and earnings - to ideally make consumers pay twice for different formats of the same content? (or similar content, given audiobooks also add some value via the narrator's interpretation). As a consumer, this kinda sucks - just as it did when --in theory-- one could be sued for downloading an MP3 of a song already purchased on CD, or even for ripping the audio off the CD oneself.

Or are they worried that this is Amazon's back-door route to pilot some wider usage of this technology, cleverly clothed (in this instance) within a pro-education, pro-child-literacy story?


> It makes me wonder what's really behind the suit.

I think there are a few things at play. The publishers were commoditized in the ebook switch. They are also facing more and more pressure as people question things like 'check outs' in an ebook world. Audiobooks are a growth area, and in very high demand. The publishers learned from the ebook shift, and now will go after anything that could possibly infringe.


I don't think this is about lost sales from the small number of people who buy both the audio-book and e-book. I think that publishers are just using this new feature to pressure Amazon into paying them more for the audio-book licensing.


This article by Ars Technica [1] includes some commentary on two prior copyright decisions that may shed some light on how this case might proceed: Sony v. Universal (that declared the VCR legal) and Cartoon Network v. Cablevision.

[1] https://arstechnica.com/tech-policy/2019/08/book-publishers-...


I worked in Amazon's Audiobook space for a while.

The thing with audiobooks is, Audible never bought a physical item from a publisher; they have a licensing agreement, which puts them into a very different space. That agreement could be with:

1. A major publisher's audiobook publishing group, for a recorded audiobook

2. An independent audiobook publisher, for a recorded audiobook

3. The author directly, for the rights to publish

4. The publisher directly, for the rights to publish

Further, there are geographic rights: publishers only have rights to publish in specific regions. A UK publisher could buy worldwide rights for ebook, print, and audio from the author and then license US ebook/print rights to one publisher and audio rights to someone else.

The ebook publication rights usually are with the author's publisher, but they can be owned by another publisher (see Open Road Media), or the author.

The problem here is that separate publishers hold the ebook rights and the audiobook rights. Take Harry Potter: Tim Ditlow from Listening Library bought the audiobook rights for $15k before the book was “big”. Ignore that his company was bought, to keep this understandable. If you buy the audiobook and use this feature, the ebook publisher gets nothing, which sets up some big conflicts.

Traditional publishers have been pushing for audiobook rights to be part of book deals for a while. The concerning thing I see here is that a response to this could kill the independent audiobook publishers, which would take a lot of checks out of authors' hands.


I co-manage an ebook to audiobook conversion product, https://auditus.cc. Having spent a little time in this space and being an avid audiobook listener myself, I somehow feel that both sides are right on some level. The publishers' sole goal is to place as many physical copies as they can in people's hands. That is why you often hear things like "The feel of paper is what makes the book better", "this digital reading takes the feel away", etc. Now if you are listening to a book on Audible and you feel "ah! this book sounds pretty interesting and has some very notable points; I should read this rather than just listening to it!", you might end up buying a copy of it. But now since Audible has the exact text of the book, you can read it right there while also listening to it, which is obviously better because two faculties of your brain are working on the same thing and thus you grasp it faster. But clearly that's not good for the publisher at all, thus the frowning. On the other hand, Audible's point that the A.I. generated text is not gonna be an exact replica of the hard copy might be the only helpful justification they can present.


This seems materially different than the reverse operation - taking a book and turning it into an audio experience. Doing so may diminish the book's experience because the AI voice / TTS may not convey the book correctly and could damage the user experience.

Captioning seems more like a facilitation of experience, using only the audio to add an additional layer on it. It's like saying the credits for a movie are copyrighted and Amazon can't use that content for X-Ray (I'm aware they use IMDb and Amazon owns that).

It's certainly a gray area, but I feel the publishers are simply afraid that buying an audio book that is captioned may cannibalize their text-based book business, which seems disingenuous. The accessibility of a book of text is very different than captioned audio.

What if people can't hear well, but like the emotive layer added to captions in a reader's voice/pace, etc?

I can see uses for this that could be argued however - for instance, using the captions as a reading facilitator. The customer is using it like a book, with audio captions, not an audio book with hearing-impaired captions. Using the transcribed text to translate to a different language may also infringe copyright, as it provides an unofficially translated version of the text.

An app on your phone that you could point at the Kindle reading your audio book out loud, transcribing the text, clearly wouldn't be copyright infringement, so the debate seems to be around Amazon providing this as an integrated service.

I think all that needs to be done is to allow the publisher to decide if that feature is enabled or not. Customers will obviously buy books that are captioned over ones that aren't, because you're getting more "content," and I can imagine publishers would charge more for this. Which is of course, what it ultimately comes down to - Money.

Publishers who didn't provide captions wouldn't get the business of those who want them, and those that don't care will make a decision on availability and price. Seems like Amazon could let their marketplace decide what people want, but they are in a legal position where they may consider it worth fighting. The trouble begins when you start asking the questions, "well, if Amazon wins, would publishers simply stop publishing their books on Amazon?" Then the monopoly question comes up.

I can see arguments both ways, and if the licensing terms to Amazon aren't clear, then Amazon should simply make them clearer and let the marketplace participants fall where they may.


> but I feel the publishers are simply afraid that buying an audio book that is captioned may cannibalize their text-based book business

It should. If I buy an audiobook version, the publisher should not expect me to buy the book in any other format. Thinking I will buy two copies is greedy.


Do you also feel entitled to the movie adaptation or graphic novelization?

If you bought the Rob Inglis reading of The Fellowship of the Ring, would you feel entitled to the BBC Full-Cast Dramatization?


I feel entitled to movie subtitles and will auto-generate them using ML if there are none provided by the creator. Do you think that should be forbidden? That is basically what the book publishers argue for in this case, and I believe it sets a horrible precedent.


I agree, but that’s not the position of the comment I’m replying to.


Transcoding and post-processing should really be easy and legal, but DRM and anti-circumvention say otherwise.

Not to mention that fast-forward should not be deactivated when I want to skip an advertisement on a movie disc I purchased.


Having a person read out a book, then using AI to convert that back into the original book feels like a very convoluted way of getting the text. Is this just a licensing thing? Is it to get the timing right of the captions?


How is having a book fed to you line by line (on a time you don't control) equivalent to "getting the text"? Also, this surely can be considered a usability feature for the hearing impaired, for example.


I think you have misunderstood. I am talking about Amazon getting the text.

The text for these books already exists, they're books that are being read out.


Maybe they are worried that this will lead to text leaking to the public the minute it appears on such a service... idk...


No, it's that since the text already exists (the written book), it seems overly complicated to transcribe what the voice actors are saying.


This is a tough one.

On the one hand since this is producing something directly from the audio that you have already paid for, not allowing this seems like it would be another example of the whole right to repair/modify and "you don't really own it" issues.

On the other hand, I think it's reasonable to say that the AI provides capabilities that current laws didn't anticipate. I'm also sympathetic to the argument that someone else in this thread made: that it's an example of Amazon profiting off of something that they didn't make, at the expense of those who did make it.


>On the other hand, I think it's reasonable to say that the AI provides capabilities that current laws didn't anticipate.

I don't see that being the case. It changes the cost, but I was always able to have someone transcribe text for me. If I paid someone to listen to audio of objects I own and produce a copy of the words within it just for my consumption, then it is legal.

Even more so, I was always able to transcribe it into words myself. Be it pen and paper or be it a keyboard and hard drive, I was always able to take audio I own and transcribe it into written text for my own use. The AI provides a great reduction in the difficulty of doing this, but does not provide any new functionality.


A company sells audio books and provides its consumers with an app that interprets the speech in the audio books on the fly back into text to show on the screen: publishers claim copyright infringement.

A company sells an app that, given speech in any form by a user, including any audio book format, interprets the speech on the fly back into text to show on the screen: what's the difference? what do publishers claim in that case?

Please, enlighten me.


I doubt audio is transcribed on each device rather than once in the cloud, but imagine for a moment... what if it was? Or if Amazon gave you access to a server to redo the transcription work in real time for each user? Would Amazon be distributing the capability to create a derivative work rather than (what is arguably) a derivative work?

Should Apple be prevented from distributing Logic Pro audio software because it could be theoretically used to create unauthorized derivative works from music downloaded from iTunes? Should Google/Alphabet be prevented from serving search results for "youtube-dl" because it is the distributor of copyrighted music videos? Should MEGA.nz be prevented from sending data whose copyright status it has no ability to determine by design? At what point does responsibility pass to the person clicking the "transcribe" or "remix" button?

(IANAL)


That approach (circumventing the specifics of the law through technical workarounds while still violating it in spirit) seems unlikely to succeed for the same reasons as Aereo: https://en.wikipedia.org/wiki/American_Broadcasting_Cos.,_In....


It is not law, it is license terms, and while license terms between companies can be ridiculously complicated, license terms for consumers in most jurisdictions generally can't be enforced beyond the very basics.

That is why the consumer technically doing the translation matters, they can't be bound by the inter-company license terms.


Unless, of course, the technology is painted as circumvention measures under DMCA, and all AI research into speech-to-text becomes illegal overnight.

*cries self to sleep that this is actually a possibility


I just wish more ebooks were available through libraries. It seems like 50% of books I want to read aren't. The more I have used ebooks, the more I prefer them (for reading for pleasure).


To turn it around:

Am I allowed to have someone (e.g. a parent or partner) read a book I own to me? Most definitely yes.

Am I allowed to hire someone (e.g. a nanny) to read the same to me? Still, probably yes.

Why wouldn't I be allowed to have a computer read it to me (regardless of whether I pay for having it read) as long as it's not a recording?

Now, turn it around from text-to-speech to speech-to-text and the same argument should apply.


Considering how much of a tantrum they're throwing over this I'm amazed they didn't sue over the ability for ebook readers to convert the text into spoken words.


They made a stink about that a decade ago and Amazon backed down, making the feature publisher/author controlled, ahead of actually getting sued.


If there wasn't such a clear PR nightmare for fighting a feature for the visually impaired, they would have.


Ironically, this reminds me of a Kindle-Mindstorms copy machine that also uses OCR to legally copy books "rented" from Amazon. https://techcrunch.com/2013/09/09/this-lego-robot-strips-kin...

Disclaimer: it's from my old supervisor


I don't understand why you would like an audiobook. The bandwidth of a speaker is 100 words/min while reading is around 300 words/min, a significant difference that will compound into years over a lifetime. I never have the feeling I have enough time. Anyway, if you can't innovate, litigate.


I listen on my commute. While doing yard work. While resting my eyes. In many situations where I can’t read.

I also read a lot so these aren’t exclusive activities. Sometimes I even read the book and listen to the book at different times. Audible syncs up pretty nicely.

I also listen to some material at 1.5x or 2x.


> I don't understand why you would like an audiobook. The bandwidth of a speaker is 100 words/min while reading is around 300 words/min

Since my words/min while I'm driving or at the gym is 0 for reading and 150 words/min (I listen at 1.25x or 1.5x), this makes complete sense to me.
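A rough sketch of that arithmetic, taking the 100 and 300 words/min figures quoted above at face value. The function names and the 90,000-word book length are my own assumptions for illustration, not numbers from any source:

```python
def effective_wpm(narration_wpm: float, playback_speed: float) -> float:
    """Words per minute actually heard at a given playback multiplier."""
    return narration_wpm * playback_speed


def hours_to_finish(word_count: int, wpm: float) -> float:
    """Hours needed to get through a book at a steady words-per-minute rate."""
    return word_count / wpm / 60


if __name__ == "__main__":
    book_words = 90_000  # hypothetical book length, assumed for illustration
    listening = effective_wpm(100, 1.5)  # the 100 wpm figure at 1.5x playback
    print("listening wpm:", listening)
    print("hours listening:", hours_to_finish(book_words, listening))
    print("hours reading at 300 wpm:", hours_to_finish(book_words, 300))
```

The point being that the comparison only matters for time in which reading is actually possible; while driving, the reading rate in this model is simply zero.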

> Anyway, if you can't innovate, litigate.

They are literally suing because Amazon is recreating the text versions of the audiobooks. Text was invented 5000 years ago. The only innovation is HOW Amazon is recreating text versions of the original books.


> Text was invented 5000 years ago. The only innovation is HOW Amazon is recreating text versions of the original books.

Sounds like an innovation to me. If Amazon starts doing text-to-speech on normal books, they are going to put even more pressure on the publishing business.


Inventing a new type of printing press doesn't give you the right to print copies of books without a copyright license. This is an innovative printing press.

But I'm pretty sure you understood my original point.


I understand your point of view but I have a different opinion.

It is ridiculous to expect people to pay the same price for a digital book as the printed version.

Not only are the distribution costs a fraction of what they were, but the main business risk in publishing was originally ending up with an obsolete stock of books.

Publishers are profiting as much as Amazon from these new printing presses.

Sounds like an industry with a business model begging for disruption.

Would be awesome if I was able to actually lend out ebooks to friends with a similar restriction as with a physical book.

That's innovation, adding value to the customers experience and securing business survival.


From what I understand from someone in the industry, the digital version does take not-insignificant effort to format and meta-tag appropriately for the ebook format. Furthermore, the risk for the publishing company still exists, because editing, marketing, formatting, and cover creation are still things that ebooks need and that still cost money to produce.

In other words, the idea that digital books should cost significantly less than a printed book doesn't account for the fact that digital books require similar amounts of effort and labor, plus additional formatting labor.

"Sounds like an industry with a business model begging for disruption."

This already exists in the form of wattpad and other self-publishing mediums. There are plenty of authors who make a living off of these mediums. But it's a different ball game and requires someone to manage their author career like a business, not as a traditional writer. (hiring an editor, a cover-producer, and doing one's own marketing) Thus, it's not really something that authors are generally enamored by.


I own over 300 Audible audio books, about the same number of eBooks purchased from Google, Apple, and Amazon over the same time period.

As an example of the value of audio books: several years ago I read James Joyce’s “Ulysses” and I did not totally follow the story. The language he used is interesting. A year later I bought the audio book, narrated by four or five excellent actors, and even with the strange language, the story really came to life for me.

I also like to listen to audio books while walking.

BTW, I try to buy books on different platforms so if, for example, I lost my Amazon account, then my entire library would not go away.


Really? It's easy to do other stuff while listening to an audio book. I consume most of my books driving this way. The alternative is to not read as I have better stuff to do.


Not even taking into account that you can listen while doing laundry or other chores, a good voice actor can really bring a story to life in a different (sometimes better) way.

Nuances in the dialog can really change how you perceive a character.


My biking commute is about 80 minutes per day, and 1984 is about a 10-hour audiobook, so for me it is just using those 80 minutes.
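For the curious, a minimal sketch of how many commuting days that works out to, using only the 80 minutes/day and 10 hours mentioned above. The helper name and the optional playback-speed parameter are assumptions added for illustration:

```python
import math


def commute_days(audiobook_hours: float,
                 commute_minutes_per_day: float,
                 playback_speed: float = 1.0) -> int:
    """Whole commuting days needed to finish an audiobook."""
    minutes_needed = audiobook_hours * 60 / playback_speed
    return math.ceil(minutes_needed / commute_minutes_per_day)


if __name__ == "__main__":
    # 10-hour audiobook, 80 minutes of commuting per day
    print(commute_days(10, 80))
    # same book at a hypothetical 1.5x playback speed
    print(commute_days(10, 80, playback_speed=1.5))
```

So roughly a week and a half of commutes per book at normal speed, from time that would otherwise go unused.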


Speed isn't everything, and not everything is a race.

Sex being a good example.

It's important to remember that just because you don't understand something doesn't mean it doesn't have merits, or other people won't understand it either.

That being said I prefer books too, and I'm a slow reader by design (I've discovered the slower I go the more I enjoy the journey)


One thing I really like about audiobooks is that it works really well when trying to get through dense texts. Thanks to audiobooks, I've "read" Capital in the Twenty-First Century, which I attempted in text form but got bored.

Also, they work great for a commute.


Well, for me (and I imagine, most) reading fiction isn't about trying to read as many words per minute as you can - it's about immersing yourself in a fantasy and enjoying the story.

Sometimes when I'm working offsite I have a long commute by hired car or train. Reading on a moving, bumping vehicle makes me feel really nauseous after about 20 minutes, so it's a no-no for me.

So, I find audio books great for commutes/land-based travel.

Back when I used to exercise, I enjoyed listening to audio books too.

Also, a good narrator can really bring books to life - it's a totally different experience than reading.


For me, since English is not my mother tongue, an audiobook is way better than reading. I also find it difficult to stay focused while reading a book, but have no problem when listening to an audiobook.


I can listen to audiobooks whilst driving at 100 words/min. My reading rate drops well below 300 words/min whilst driving.

Using otherwise dead time multiplies my 'reading' time.


Seriously? You read while you drive?


If I’m out running, I can’t be reading a book, but I can listen to one.


It makes otherwise tedious activities much more enjoyable.


That particular horse was never broken to begin with, much less in anyone's barn.

fbreader has a tts plugin, and Android has many utilities to convert between different formats. I sometimes use TextAloud (with Paul16, the best TTS voice imo) on my PC, to make audiobooks for myself.


So if I employ a secretary to take dictation and play the recording to him or her and read the text as it is typed (or written as shorthand if you will) would the same publishers claim that I have somehow violated their 'rules'?


If someone (e.g. that secretary) listens to an audiobook and transcribes it on a typewriter or in shorthand, then that's quite clearly creating a copy of the protected work on a durable medium, which is exactly the thing that copyright law makes the exclusive right of the copyright owner. It's not even an edge case, it's the core of that law - the result of a person transcribing the (audio)book is either a copy or a derivative work, both of which require the copyright owner's permission.

In certain cases fair use or other exceptions (depending on jurisdiction) might permit the transcribing itself for personal use, but distributing that transcript to others is definitely a violation.


So if I make a copy in my own hand for my wife to read that is also a violation?


Well, it's obviously very unlikely to be prosecuted, so the practical answer is that no one would care.

There are also some provisions for e.g. backups or accessibility (which vary between jurisdiction as these details aren't a part of the worldwide copyright conventions) that might apply in certain situations that stretch the scenario of "for my wife to read" a bit.

But yes, if you had a book and made a handwritten copy so that you could have your wife read it while (for example) you read the original, then the legal answer is that you're technically prohibited from doing that without explicit permission from the copyright owner; and if the copyright owner finds out and really cares, they could succeed in suing you for that action.


Let me get this straight: if Amazon hired a reader to sit in class and read the book, it is OK. So then Amazon should build a robot reader that sits there, turns pages, reads the text with its eyes, and speaks it.


> a distinction between a newly created piece of text composed using AI, based on an audio recording, and the potentially near-identical text version of the book the audiobook was created from

How is transcribing an audiobook "a newly created piece of text"? This is another instance of a corp trying to use authors' content without actually paying for it (the first being Google AMP) in the name of experience.

Amazon's justification seems to be:

> Amazon says its transcriptions may contain errors and are not intended to be complete recreations of the text version of a book.

Really, what does this have to do with anything? A thief claiming ownership because they didn't steal everything at once?


They don't actually claim ownership. Audible has another product, "Immersion reading", where you can buy both the audio version and the written version of a book, and Audible will sync between the two, highlighting the spoken words on the written version in real time. The user can go forward/backward in the written version and have the audio synced to that. Again, this works with a licensed copy of both the audio and the text of the book. Closed captioning, by contrast, is completely different. It is driven completely by the audio, does not have the accuracy of the real text, and is displayed line by line at a pace you don't control. No one would consider that equivalent to having the complete text of the book. And the intention, to me, is clear: Audible is already licensing text versions for the customers who want them, while CC is a hearing aid just for the audio version.


Let's take that "newly created piece of text", and use a text-to-speech NN trained on the narrator's voice to create alternative audiobooks. Maybe we can even include the original audiobook in the training set for better results; after all, we don't know that Amazon is not using the original book in theirs. I wonder if Amazon would consider it a "newly created piece of audio".


If I buy the audiobook, does it make sense that I am allowed, if I want, to write down on a piece of paper what I hear?

How about type it?

How about using some voice-to-text software that I have?


Yes, if you do it for yourself. That's fair use. Once you begin to sell it, that's an issue.

And Amazon being "surprised and disappointed by this action" does not seem to be the case. They are definitely trying to make themselves look charitable because of the "educational purpose".


So if they create a separate free app that lets you load books from Audible and transcribe them, you won't have any issue with it?


> My contract is crystal clear that the only rights conveyed to Audible are for voice recording and playback. The rights to reproduce text in any way are specifically withheld

The original creator has issues with it. And if I were the content creator, I probably would too.


The comment under which we are having this discussion gives it a very specific context: where will you draw the line in the range of tools used to transform audio for fair use?

Why would you consider pen+pencil and a free app that transcribes audio to text different in terms of copyright infringement? Does Audible charge you more to use this feature?

What I see here is an attempt to dictate the ways in which I can use a book I already purchased, i.e. to limit my rights.

What I also see: a bunch of companies trying to squeeze more juice from a market which they are not even willing to serve in the first place, because they, being experts in the book market, couldn't figure out that the market needs this feature, while a bunch of IT geeks did.


Will add just one last point on humane business practices: if a publisher sold me the book so well that I decided to buy one of the most expensive forms of its distribution, he should feel lucky and fuck off.


The funny bit is that it’s been illegal to read books to your kids under similar reasoning from publishers.

It just was never enforced because how would that be possible.

But now, with these ML/NN/AI approaches, products are coming out at scale. If I write an open source package and give it away for free and it takes text and produces audio performances from it, that’s fair use, I think.

Amazon commercializing it is a different thing, but unless the publishers are going to sue parents for using text-to-speech and speech-to-text utilities on stuff they bought, we’re going to get into an interesting legal area.


Just like an MP3 is not an identical copy of the music on the CD it was ripped from.

So, as long as the derived work is lower fidelity, everything is fine. /s


Let me get this straight...

1) Take a book

2) Record someone reading it out loud.

3) Use a robot to listen to the words and write them down as a book.

Nice example to use when teaching 'Lean'. Let me just draw the value stream...


But it's not being written down as a book. It's being transcribed on-the-fly. The user can only see the words as they are being read right now.

Audible argues that the critical competitive advantage of a book is to see lots of words per page, being able to flip to any page you want, and to read at the speed of your inner voice. Their feature does none of those things, so it can't usurp the market for books.


Hopefully Amazon wins


This is ridiculous and just publishers trying to money-grab.

It's about accessibility.

E-books should be able to be read aloud text-to-speech, and audiobooks should be able to show captions speech-to-text. Period.

And neither replaces the other. Text-to-speech has none of the nuance or author/celebrity/actor voices, speech-to-text only displays snippets (not pages), and audiobooks can sometimes have different or extra content compared with the printed book anyway.


How are people on Amazon/Audible's side here? They're trying to use AI to get around copyright law, to produce the text of the book without a license to the text of the book. Their argument is that because they're using AI to do it, it won't be an exact replica. So they're saying because it's an inferior rip-off, they don't have to pay the IP holders. What nonsense.


Is the title wrong? It should be text-to-speech, right?


The title is correct. The issue is about creating captions (text) from audio.


Slippery slope being a thing, a ruling here has the potential to affect YouTube as well, since they run automated captions on videos. I’d be surprised if Google didn’t step in and offer up an opinion in support of Amazon.


Hmm... has anyone ever been sued for adding captions? I don't see how they would be... as far as the end user is concerned, the captions are embedded into the video and are not a separate performance or product. Not exactly analogous to audiobook vs ebook.


Captions are exactly what's being offered though. Amazon is not providing an ebook, they're providing captions in sync with the audiobook.

The difference between an audiobook and a video is pretty small in a significant portion of YouTube content.


The difference is that audiobooks are a separate product category with its own pricing. Embedding text in a video is quite different IMHO...


If Amazon added an animation to the existing audiobook content - making it a video - and then added captions would that be OK?

As the GP said, there are quite a few YouTube videos with a completely static background where the content is the speaker's words.


>If Amazon added an animation to the existing audiobook content - making it a video - and then added captions would that be OK?

AFAIK, they would have to obtain prior permission from the copyright owner before re-distributing a modified version of the copyrighted work.


> The case happens to have a strong analog to a former Amazon publishing controversy a decade ago, when the company tried to launch a text-to-speech feature for its Kindle platform that would effectively do what Amazon Captions does today, but in reverse.

> Publishers at the time were enraged, accusing Amazon of trying to trample on the nascent audiobook market and the licensing rights that publishers believed would help it become a thriving business. Amazon eventually caved in that regard, allowing publishers to disable the Kindle text-to-speech feature after a massive outcry from the US Authors Guild.

Sooooo.... If this is illegal, is reading a physical book aloud to your kid (or a class) illegal? That feels like a very close parallel.

Or is it the act of automating the action that makes it illegal?



