Hacker News | woodson's comments

Unfortunately, most big mail providers won’t accept email from your self-hosted mail server, even with DKIM, SPF, etc. So, diversifying is as good as it gets.

Has this been tested recently? I had no problem sending mail to my own Gmail account from my own server, even without SPF (though I then got a bunch of spoofed-bounce spam and realized I'd forgotten to set up SPF).

I've been self-hosting E-mail for a long time (which itself probably helps with reputation), and I very rarely have deliverability problems.

I guess some Node.js-based tools included in Zed (or its language extensions), such as prettier, don't behave well in some environments (e.g., they constantly try to write files to /home/$USER even if that's not your home directory). Things like that create some backlash.

There are many streaming ASR models based on CTC or RNNT. Look for example at sherpa (https://github.com/k2-fsa/sherpa-onnx), which can run streaming ASR, VAD, diarization, and many more.


Except it was only released in September 2022 (not even 3 years ago).



No, because bureaucracy in Germany is rampant.


He didn't live in Germany.


He actually lived in Berlin for a few years before he died. But you're right, he spent most of his life in Prague. However, his native language was German.

Certainly an interesting man. I highly recommend checking out some of his work (e.g., The Metamorphosis).


Actually The Trial more closely resembles what the OP went through.


Check out this model based on the same architecture for Japanese: https://github.com/nu-dialogue/j-moshi


Sounds like the same farce as with the forced Pixel 4a update. Not sure if they learned anything from that, other than to ignore the issues and move ahead anyway.


I'll have you know I did successfully claim my $50 from that refund program they did for the 4a.

> "enter your sn in this form to see if you qualify"

> enter sn

> "great, you'll hear from us in 2 weeks"

> 2 weeks pass

> "you're eligible! you can either have $50 or $100 towards a new phone"

> just bought a new phone, give me the 50

> "great, you'll hear from us in 2 weeks"

> 2 weeks pass

> "we use a 3rd party to issue refunds, download their app"

> download app

> setup account

> verify account

> enter bank info

> "great, you'll hear from us in 2 weeks"

> 2 weeks pass

> "sorry, we couldn't process your bank info"

> dammit, forgot app password

> recover app account

> re-verify app account

> reenter banking info

> "great, you'll hear from us in 2 weeks"

> 2 weeks pass

> "your bank info is verified. You'll get your refund within 2 weeks"

> 2 weeks pass

> money deposited

barely even worth it.


Alternatively I went with:

> "enter your sn in this form to see if you qualify"

> "Your pixel 4a does not qualify"

But why? Does it mean my battery is unaffected or their lawyers managed to weasel out of all the phones qualifying somehow?

Who knows? Well, Google certainly knows, but they ain't telling me.


from https://arstechnica.com/gadgets/2025/07/a-mess-of-its-own-ma...

  Pixel 4a units contained one of two different batteries, and only the one manufactured by a company called Lishen was downgraded. For the Pixel 6a, Google has decreed that the battery limits will be imposed when the cells hit 400 charge cycles. Beyond that, the risk of fire becomes too great—there have been reports of Pixel 6a phones bursting into flames.

Perhaps your Pixel 4a had a non-Lishen battery? But if Google degraded your battery performance as well, then I have no idea.


Same with my Pixel 6a. I wish they would provide more details about why it doesn't qualify.


I know that this is the outcome they're hoping for, but my usual behaviour is:

> Do nothing

It makes me a little bit mad but having $50 'stolen' from me just feels like par for the course these days.


Yeah, that killed my battery... at least it got me a free battery replacement, even though actually getting one locally was a bit of a pain. You couldn't reserve one or make an appointment ahead of time; you just had to call uBreakIFix and come in if they had one in stock. And Google pushing the release that broke the batteries after they had recalled all the spare batteries from the repair shops was really poorly timed.

Still using my 4a, though have been thinking of a switch to the 9XL.


The Google Pixel 6a was released on July 21, 2022. A perfectly fine phone artificially obsoleted in 3 years? Time to switch to Apple?


As someone who has designed neural network architectures, I disagree.


The speech data collected by this project has been used for more than a decade to build automatic speech recognition and text-to-speech synthesis systems (see LibriSpeech, LibriTTS, LJSpeech). It definitely has been a benefit to AI.


I think they are talking more about the impact of AI on Librivox, as in people running an ebook through an AI TTS tool and uploading it.

On one hand, a well-curated/edited AI recording might be great, but a lot of people will try (I don't know their policies) to upload AI slop: no proof-listening, no checking, just laziness.


I think that, for the purposes of creating high-quality Free audiobooks, the issues are essentially the same with human-generated recordings as with AI generated ones. The recording quality and faithfulness to the original text (both in terms of “content” and the appropriate reading in terms of tone, expression of emotion, etc.) have to be verified. The problem is scale. There will be many more TTS-generated recordings uploaded than human-generated ones. Some automated filters (e.g., ASR WER, audio quality metrics) would be a great first step to discard bad-quality slop right away (though it might unfairly penalize real human accented speech).
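As a sketch of the WER-based filter suggested above (the strings here are invented, and a real pipeline would first run an ASR model over the uploaded audio and normalize both texts), word error rate is just a word-level edit distance normalized by the reference length:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat", "the bat sat"))  # one substitution out of three words
```

A moderation pipeline could then flag any chapter whose WER against the book text exceeds some chosen threshold for manual review, though, as noted, the threshold would need care to avoid penalizing accented human speech.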

Importantly, the recording should indicate whether it was human or AI generated.


> Importantly, the recording should indicate whether it was human or AI generated.

This is all that's necessary. Sometimes I'm fine with mediocre TTS; sometimes I want an actual professional. LibriVox is somewhere in between, but it should clearly specify whether I will be getting an amateur human or a robot.


I disagree, for the reasons stated by the person you replied to.

Historically, being told that a voice recording was AI generated was enough to tell you to expect a basic robotic TTS voice, but with advances in AI voice generation we're approaching the point where AI can sound as good as real humans. It's not yet easy to generate an audiobook as good as one read by a professional, but that point will come in the not-too-distant future.

And equally on the other side, something being recorded by a human doesn't automatically mean it has the quality of a professionally-read audiobook. This is something LibriVox has always had to deal with, by gatekeeping which volunteer recordings to either give feedback requesting improvements to or to not use at all.

In some (but not all) cases, an amateur human reader can already be as good as a professional, and that will soon be true for AI as well. For both AI and humans, some efforts will remain not as good, but the dividing line for quality isn't going to be whether or not the reader is AI. Though I do agree that AI or not should also be labelled.


Certainly TTS has improved a lot thanks to modern AI, but it simply doesn't have the information to improve beyond sounding like a human reading words fluently. A professional audiobook reader modulates his tone to reflect narrative mood, chooses voices for the characters consistent with their natures, etc., and transformer models can't do those things.

For an example of a professional audiobook, check out Rob Inglis' version of The Lord of the Rings.


I agree with you about the current stage of things, which is why I said that we're approaching the point but not yet there for AI to be able to match professional readers.

But I disagree with you when you write "it simply doesn't have the information to improve beyond sounding like a human reading words fluently" - it has the same information when reading it as a human does, meaning that the best implementation would have to not only adapt tone to explicit instructions like "... she shouted", but also read between the lines / make subjective choices to suit the different characters.

AI is already capable of doing sentiment analysis on text, and text-to-speech models are getting better at simulating moods and emotions rather than speaking flatly. I don't think we're many years away, if that, from those two sides being paired together in a way that produces the sort of quality output we're talking about, for the first time without human involvement. Add to that the fact that AI can train on the many good examples of humans reading things: models may come to emulate not just the core voice but also how the delivery should adapt to the meaning of the text, arriving at a great result without even needing an explicit step of analysing what the text means in order to decide how to modify the generated voice.


You're more optimistic about this stuff than I am, but I think I get your perspective. We have decent sentiment analysis, fluent text generation, and real-sounding TTS, so combining them will yield a pretty good reading. I agree that you're probably right when it comes to newspaper columns and magazine articles, but that's not on the level of a good audiobook.

To take an example, here's an iconic line from the Fellowship of the Ring:

> The wizard swayed on the bridge, stepped back a pace, and then again stood still. ‘You cannot pass!’ he said.

If you think that is a command, you should shout it like Ian McKellen in the movie. If you think it's a statement based on superior knowledge (see https://acoup.blog/2025/04/25/collections-how-gandalf-proved...), you should probably state it with certainty and fatigue. And if you're making a movie with a ton of crazy special effects and swelling music, you should probably make whatever choice goes best in that context.

Even if a model could make some consistent choice there, I wouldn't be all that interested, because the reader conveying their interpretation of the character to the listener is what matters. Sure, it might get enough Spotify plays to make some money, but it's not art.


I get very annoyed at the AI voiceovers in YouTube shorts and videos (which are showing up more and more lately).

I just close the tab when I realize it is AI. Not sure how long I can do this.


Hopefully long enough that the fraction of AI voices you recognize as AI drops below 10%, so you don't get frustrated that often.


I imagine there are various disabilities for which audio readings greatly simplify people's lives. Those listeners are probably appreciative of anything accurate, regardless of whether it's a human talking or not.


How does this differ from SuperBPE, which seems to pursue a similar goal? https://arxiv.org/abs/2503.13423

Looks like parallel invention. (I’m not associated with the paper or its authors.)


In SuperBPE, a fixed number of tokens is learned first; then the constraints of pretokenization are removed entirely, and the remainder of the target vocab size is learned.

In Boundless BPE, no schedule must be chosen, because there is not any point at which the constraints of pretokenization are removed entirely. Instead, at any point in the learning process, merges between adjacent pretokens are permitted if the pretokens are each represented by a single token. There are some additional details about how the authors incorporate Picky BPE, which I will not try to repeat because I would probably get them wrong.
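For intuition, here's a minimal sketch of a standard BPE merge step (the toy corpus is invented for illustration; this is neither paper's implementation). The key point is that pair counting stops at pretoken boundaries, which is exactly the constraint that SuperBPE and BoundlessBPE relax in their different ways:

```python
from collections import Counter

def count_pairs(pretokens):
    """Count adjacent symbol pairs within each pretoken.
    Standard BPE never counts pairs across pretoken boundaries;
    that is the constraint the superword methods relax."""
    pairs = Counter()
    for symbols, freq in pretokens.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pretokens, pair):
    """Apply one learned merge everywhere it occurs."""
    merged = {}
    for symbols, freq in pretokens.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

# Toy corpus: pretokens (words) with frequencies, split into characters.
corpus = {tuple("low"): 5, tuple("lower"): 2, tuple("lowest"): 3}
pairs = count_pairs(corpus)
best = max(pairs, key=pairs.get)  # most frequent adjacent pair
corpus = merge_pair(corpus, best)
```

Under standard BPE this loop can never produce a token spanning two words, no matter how often the word pair co-occurs; superword approaches lift that restriction, either by dropping pretokenization partway through training (SuperBPE) or by allowing merges between fully-merged adjacent pretokens (BoundlessBPE).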


Yes, they were concurrent work (co-author of BoundlessBPE here). A sibling comment describes the main differences. Our paper motivates why superwords can lead to such a big improvement by overcoming a limit that pre-tokenization imposes on current tokenization methods. The SuperBPE paper has a wonderful set of downstream evaluation runs. So if you're interested in either, they are quite complementary papers.

