I will always remember when the "Takedown" movie came out. I loved the original "Hackers" and couldn't wait for "Hackers 2", which was "Takedown".
I had learned about Mitnick a few years prior to the movie and was fascinated by his life story and what he had done up to that point (including his "takedown" by the FBI). It's an understatement to say that his work, his character and his kind of positive social manipulation had a great influence on my upbringing and, later, my professional career. Back then I enjoyed playing pranks on my friends and "hacking" them with all sorts of trojans and ejecting their CD-ROM drives :)
I've been using Threads since late yesterday and it's been very slow... sometimes it wouldn't even load content when you browse specific profiles (especially the active ones with tens of thousands of followers). So I am not surprised the backend is in Python :)
I remember when Evernote launched and everyone was super hyped about it -- even some of the biggest VC names promoted them. Not to mention the funding they raised ($290M). It was literally iOS Notes on steroids. I even used it for a while, but somehow it didn't stick.
They hired a bunch of great people and had some good backend tech -- sad to see this happen to them.
Evernote preceded the iPhone. The product originally came from a small, non-cloud business that I think was sold or spun off into a venture-capital-funded company.
Evernote is older than the iPhone.
I just checked and I found a review of Evernote 1.0 from 2007, but I think the product predated that review. I found a web site saying the first Evernote Beta was released in 2004. I remember using Evernote in grad school in 2008.
Short story about Lotus Notes: Back in high school, a friend of mine started writing "code" for Lotus Notes; can't remember what the "code" was called, though - templates, plugins, something like that. Anyway, he started selling his stuff; first to accountants and small businesses (his dad was an accountant), then he started selling to local banks (in Modesto). His dad used to do all the sales, and my friend would go along to meetings and just sit there. Then, he sold the "code" through magazine ads. Finally, he somehow got connected to corporations in San Francisco and started making a ton of money (for a high school kid). Last I heard, just after we all graduated high school, he moved to San Francisco to start a full-time company. I have no idea what happened after that. Steve, I hope you did well. ... His situation taught me two things: (1) you could use computers for more than games. And (2) a "little guy" could start a company and actually make money with software.
Great job. I must say that the speech synthesis sounds pretty realistic. I talked with Jobs, Musk and Obama and liked how they sounded and more importantly how they handled the questions. Do you mind sharing the entire stack you used to build this? Very well done!
Thanks, much appreciated! It was a mixture of some of the latest TTS models, Azure speech to text, GPT ofc, and some other tools for handling conversational stuff (like interruptions).
Nicely done. Does Azure Speech to Text also handle speech synthesis and provide out-of-the-box voices for different characters, or did you have to build your own model to do this? It's impressive if their service can do it all: speech recognition, speech to text and text to speech, and in near real-time. I should take a closer look at the Azure ML stack :)
I've been using the Azure Cognitive Services speech recognition and text-to-speech for my own locally run 'speech-to-speech' GPT assistant application.
I found the Azure speech recognition to be fantastic, almost never making mistakes. The latency is also at a level that only the big cloud providers can reach. A locally run alternative I use is Vosk [0], but it is nowhere near as polished as Azure speech recognition and limits conversation to simple topics. (Running whisper.cpp locally is not an option for me: it's too heavy and slow on my machine for a proper conversation.)
The default Azure models available for text-to-speech are great too. There are around 500 models in a wide variety of languages. Using SSML [1] can also really improve the quality of interactions. A subset of these voices have certain capabilities (like responding with emotions, see 'Speaking styles and roles').
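To make the SSML point concrete, here is a minimal sketch of how a styled request could be put together. The helper name `build_ssml` is made up for illustration, and the voice name, style name and `mstts` namespace URI are written from memory of the Azure docs, so double-check them there before relying on this.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="en-US-JennyNeural", style="cheerful", lang="en-US"):
    """Wrap plain text in SSML that asks a voice for a speaking style.

    Assumptions: the default voice/style values and the mstts namespace
    URI are illustrative; not every voice supports every style.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        f"<mstts:express-as style={quoteattr(style)}>"
        f"{escape(text)}"  # escape &, <, > so user text cannot break the markup
        "</mstts:express-as></voice></speak>"
    )


if __name__ == "__main__":
    print(build_ssml("Great to meet you, Tom & Jerry!"))
```

The resulting string is what you would hand to the synthesizer instead of plain text; the escaping matters because the text usually comes straight from the GPT response.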
Though in my opinion the default Azure voice models have nothing on what OP is providing. The Scarlett Johansson voice is really really good, especially combined with the personality they have given it. I would love to be able to run this model locally on my machine if OP is willing to share some information about it!
Maybe OP could improve the latency of Banterai by dynamically setting the Azure region for speech recognition based on the incoming IP. I see that 'eastus' is used even though I'm in West Europe.
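A rough sketch of what I mean by picking the region dynamically, assuming the server already has a latitude/longitude for the client from some IP geolocation lookup (not shown here). The region names are real Azure region identifiers, but the coordinates are approximate and the function names are hypothetical. Geographic distance is only a proxy for latency, so treat this as a starting point.

```python
import math

# Approximate (lat, lon) per region; illustrative values, not official ones.
REGIONS = {
    "eastus": (37.4, -79.8),
    "westeurope": (52.4, 4.9),
    "southeastasia": (1.3, 103.8),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_region(client_latlon, regions=REGIONS, default="eastus"):
    """Return the region closest to the client, or the default if unknown."""
    if client_latlon is None:  # geolocation failed or was skipped
        return default
    return min(regions, key=lambda name: haversine_km(client_latlon, regions[name]))


if __name__ == "__main__":
    print(nearest_region((52.37, 4.90)))  # a client near Amsterdam
```

The chosen name would then be passed as the region when creating the speech client, instead of hard-coding 'eastus'.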
But other than that I think this is the best 'speech-to-speech AI' demo I've seen so far. Fantastic job!
Glad to see this is still going. I was sad to see CodeJam being shut down recently. We need more programs like GSoC and CJ as they encourage students to take part in something great and contribute to the open source community.
My GSoC year was 2010 and it was definitely an amazing experience -- not just getting to meet and work alongside an amazing community, but also to sharpen my software engineering skills, improve communication and have fun along the way.
If you're a student, please find something interesting you'd like to work on and apply! Find where the folks hang out and reach out to them! They'll be happy to help you get started! Back in the day we used irc.freenode.net as our communication hub for pretty much all OSS talk, but I am sure there are Slack or Discord servers now available for most projects.
This is brilliant. A long time ago when I worked on Windows video apps, I used to use Graph Studio[1] to visualize the video graph comprised of countless DirectShow filters. It occurred to me multiple times that such a tool would be super useful for ffmpeg as well.
It really helps visualize your filter graphs, especially when building complex video processing pipelines. Too bad this is not open source... I'd be more than happy to contribute.
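For simple cases you can get part of the way with a few lines of scripting. Below is a minimal sketch that turns an ffmpeg-style filtergraph string into Graphviz DOT. It is not how the tool above works, and `filtergraph_to_dot` is a name I made up. It assumes plain syntax only: chains separated by `;`, filters separated by `,`, link labels in `[brackets]`, and no quoted or escaped commas or semicolons inside filter arguments.

```python
import re

# Leading [labels], then the filter spec, then trailing [labels].
PART = re.compile(r"((?:\[[^\]]+\]\s*)*)([^\[\]]+?)((?:\s*\[[^\]]+\])*)")
LABEL = re.compile(r"\[([^\]]+)\]")


def quote(s):
    """Quote a string for use as a DOT identifier or label."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def filtergraph_to_dot(graph):
    """Convert a simple filtergraph description into a DOT digraph."""
    out = ["digraph filtergraph {", "  rankdir=LR;", "  node [shape=box];"]
    for ci, chain in enumerate(graph.split(";")):
        prev = None  # previous filter in this chain, linked implicitly
        for fi, part in enumerate(chain.split(",")):
            part = part.strip()
            if not part:
                continue
            m = PART.fullmatch(part)
            if m is None:
                raise ValueError(f"cannot parse filter: {part!r}")
            ins, spec, outs = m.groups()
            node = f"f{ci}_{fi}"
            out.append(f"  {node} [label={quote(spec.strip())}];")
            if prev is not None:
                out.append(f"  {prev} -> {node};")
            for name in LABEL.findall(ins):  # labelled inputs feed the filter
                out.append(f"  {quote(name)} [shape=ellipse];")
                out.append(f"  {quote(name)} -> {node};")
            for name in LABEL.findall(outs):  # filter feeds labelled outputs
                out.append(f"  {quote(name)} [shape=ellipse];")
                out.append(f"  {node} -> {quote(name)};")
            prev = node
    out.append("}")
    return "\n".join(out)


if __name__ == "__main__":
    print(filtergraph_to_dot("[0:v]scale=640:360[a];[a][1:v]overlay=10:10[out]"))
```

Pipe the output into `dot -Tpng` to get a picture. Filters become boxes and link labels become ellipses, so a label written by one chain and read by another shows up as the connection between them.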
I've been following Kagi's development for some time now and the idea looks promising. Not sure if the team plans to develop its own crawling engine and index that won't depend entirely on sources such as Google/Bing/Wikipedia, etc. Right now, it seems like the results are 90% Google results without the ads (which is still a big plus). However, I'd like to see if they can pull off indexing (maybe a smaller part?) of the web on their own -- that way they can get completely decoupled from Google and not put their fate in the hands of much bigger players.
Anyway, exciting stuff and I wish the team the best of luck!
I agree with this. From my experience, most of the data scientists I have worked with never left the world of Jupyter notebooks. For them, code management, CI/CD, dev/stage/prod separation, etc. are a world of their own that they are not very comfortable with. Heck, they even used SageMaker to create a git repo for their Jupyter notebooks.
It doesn't mean that there aren't data scientists who have some engineering experience as well, but this seems to be rare. For that reason, getting those ML models that they painstakingly build to where they'll generate some real value is super hard. They just don't know where to start.
Working across multiple teams and multiple functions is very challenging and it often creates friction. Therefore, creating tools and systems that will enable those data scientists to see the actual value of their labor is paramount.
That's why we're seeing a huge resurgence of so-called MLOps tools and platforms that aim to solve all or some of the problems of the entire stack. We are very, very early in this journey, but I believe the 2020s will be for ML and AI what the 2010s were for the cloud and data, i.e. new Snowflakes and Databricks but for the actual ML apps. It's exciting.
My advice is to read "The Bogleheads' Guide to Investing". It's an amazing read written by people who strongly believe in the vision of Jack Bogle (founder of Vanguard) to democratize the investment world and let average investors reap the benefits of the slow, compounding effects of regular investments over long periods of time.
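If you want to see the compounding effect in numbers, here is a tiny sketch using the standard future-value formula for a fixed contribution made at the end of each month. The function name and the figures in the example (100 a month, 6% a year, 30 years) are made up for illustration. They are not from the book and not a forecast of returns.

```python
def future_value(monthly, annual_return, years):
    """Value of fixed end-of-month contributions with monthly compounding.

    Assumption: a constant annual_return, which real markets do not offer.
    """
    r = annual_return / 12  # rate per month
    n = years * 12          # number of contributions
    if r == 0:
        return monthly * n  # no growth: just the sum of contributions
    return monthly * ((1 + r) ** n - 1) / r


if __name__ == "__main__":
    contributed = 100 * 12 * 30
    grown = future_value(100, 0.06, 30)
    print(f"contributed {contributed}, ended with about {grown:,.0f}")
```

With those made-up inputs the end value comes out at a bit under three times what was paid in, and most of that difference shows up in the later years, which is the whole "slow and long-term" argument.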
I second this advice. Best book, and it has netted me a lot of money since. It’s a lifelong process, but Bogle is a great start. Also it’s a fairly passive strategy that ignores media hype.
Interesting concept, but I think BigQuery ML [1] has been providing similar features for years now. Curious to learn what the differences are, other than offering this as a Postgres plugin.
I had learned about Mitnick a few years prior to the movie and was fascinated by his life story and what he had done up to that point (including his "takedown" by the FBI). It's an understatement to say that his work, his character and his kind of positive social manipulation had a great influence on my upbringing and, later, my professional career. Back then I enjoyed playing pranks on my friends and "hacking" them with all sorts of trojans and ejecting their CD-ROM drives :)
I am very sad to hear that he's gone. RIP Legend.