Nuenki looks pretty cool! Thanks for sharing this. I will definitely check out Claude 3.5 for scoring.
Quick question: Your site mentions using Claude 3.5-Sonnet for translating sentences. On average, how much inference cost does each user typically incur?
I built an app that teaches French by generating "cognateful" sentences: French text that English speakers can partially understand through cognate words, with difficulty automatically calibrated using LLMs to score sentence comprehensibility.
The project shows how AI can automate the creation of comprehensible input for language learning, replacing the traditional approach where native speakers manually craft thousands of carefully sequenced sentences.
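For anyone curious what "calibrating difficulty by comprehensibility score" could look like, here is a rough sketch of the idea, not the app's actual code. The real scoring is done by an LLM; here a toy word-list scorer stands in for it, and `TOY_COGNATES`, `toy_score`, and `order_by_difficulty` are all hypothetical names made up for illustration.

```python
from typing import Callable, Iterable, List

# Hypothetical hand-picked cognate list. This is an assumption for the demo;
# a real system would ask an LLM to rate comprehensibility instead.
TOY_COGNATES = {"restaurant", "important", "animal", "table", "président"}


def toy_score(sentence: str) -> float:
    """Stand-in scorer: fraction of words that appear in TOY_COGNATES (0.0 to 1.0)."""
    words = [w.strip(".,!?;:").lower() for w in sentence.split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    if not words:
        return 0.0
    return sum(w in TOY_COGNATES for w in words) / len(words)


def order_by_difficulty(
    sentences: Iterable[str],
    score: Callable[[str], float] = toy_score,
) -> List[str]:
    """Sort sentences from most to least comprehensible according to `score`."""
    return sorted(sentences, key=score, reverse=True)


if __name__ == "__main__":
    samples = ["Je ne sais pas.", "Le président important."]
    for s in order_by_difficulty(samples):
        print(round(toy_score(s), 2), s)
```

Passing the scorer in as a function means the toy version can be swapped for an LLM-backed one without touching the ordering logic.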
I attended a speaker event on ed-tech startups hosted by Y Combinator at UC Berkeley. The speaker had some interesting things to say about startups, so I thought I'd share my thoughts on Hacker News.
I couldn't think of a good digraph for the "OO" in "book". The most readily available option is "OH", but that might mislead readers into thinking "OH" stands for the "o" in "over" and not the "oo" in "book".
"Ø", then, seems like the best choice. (Also, it's the coolest-looking diacritic form of "O".)
One reason why IPA isn't a good choice is that etymological spellings are very important, and IPA doesn't really preserve those. For example, French "biographie" (BYOO-graffey) sounds very different from English "biography". But because these archaic spellings are preserved, it becomes easier for English speakers to learn French and vice versa.
IPA is also a little trickier to adjust to because it has letters not found in English. Someone who's never seen IPA before would have a hard time guessing how to read dʒ or tʃ. (Granted, the point of IPA is scholarly accuracy and not ease of learning, so this is an understandable choice by its creators.)
Another thing to consider is that adding new letters creates a lot of overhead for new learners, even if it reduces the total number of characters in the alphabet. Ideally, someone who has never seen VJScript can scan over a sentence written in it and guess or understand most of it.
Both “Grahyam” and “Gray’m” are writable in VJScript. Even if the script has some ambiguity because of regional variations in pronunciation, it is still a substantial improvement over ordinary English spelling.
Also, thanks to you and all the other commenters for the detailed feedback and for engaging with the post.
Thanks for pointing out these mistakes:
- I added the schwa back into the vowel inventory (I forgot to include it in the original post)
- I added a clarification on how to write the “ir” in “bird” or “firm”