Hacker News new | past | comments | ask | show | jobs | submit login
Ask HN: What have you built with LLMs?
372 points by break_the_bank 11 months ago | hide | past | favorite | 333 comments
Curious what people have been building with LLMs.

I worked on a chrome extension a few weeks ago that skips sponsorship sections in YouTube videos by reading through the transcript. Also was trying to experiment with an LLM to explain a function call chain across languages (in this case MakeFile, Python, Bash). I've tried running a few telegram bots that are PRE prompted to do certain things like help you with taxes.

What are you building?

What does the stack look like? How do you deploy it?




I don't like selling. I wanted a way to practice cold calling in a realistic way. I set up a phone number you can call and talk to an AI that simulates sales calls.

I ended up using it for more general purpose things because being able to have a hands-free phone call with an AI turned out to be pretty useful.

It's offline now, but here's the code with all the stack and deployment info: https://github.com/kevingduck/ChatGPT-phone/

Edit: forgot to mention this was all running off a $35 raspberry pi.


So the AI tries to sell to you, or you try to sell to the AI? This sounds very intriguing but I can tell by your README that you're an engineer and not a sales guy - there are no distinct value propositions.

But it sounds damn creative as a project.


The AI answers the call and acts as a potential customer. They take on personas to simulate behaviors like difficult or reluctant customers. You then do your pitch, handle objections, etc. At the end you get a transcript that's 'graded' to show you where you could improve your sales approach.

And you're right, I'm not a sales guy. This project is for people like me who want a risk-free place to learn the basics of sales so that when I do talk to an actual human, I won't panic and freeze up like I always do.


I absolutely love this idea.

Most high-level sales people rely on role play partners but that requires a pretty a big commitment. This would make a great product, imo.

Also (tip): Study, memorize and internalize a sales script for your product/service...along with the objection handlers and closing questions. Practice every single day. You'll gain massive confidence because you know exactly what you are going to say, every time.


> a risk-free place to learn

That's turning out to be a valuable feature of LLMs in many areas. You can practice complex interactions with them without worrying about boring or annoying them. Even the most patient human teacher gets tired eventually. LLMs don't.


I'd buy that. I'd buy that for interview preparation as well. Maybe 5$ per hour, up to 15$. I wouldn't buy a subscription, only actual consumption of the service.

Please consider putting it in online.


Love the idea of AI grading the answers, hopefully this can be extended to marking/evaluating/grading subjective manuscripts.

For me there's none is more boring than marking/evaluating/grading manuscripts. I prefer hard labor like gardening or farming than doing that activities although I'm quite good at evaluating stuffs I think.

Can you please elaborate how you do this and based on what metrics/scheme/etc the answers are being evaluated?


This, for some reason, reminds me of Nathan Fielder rehearsal skits.


This would be amazing to do practice code/tech interviews for software engineering roles

It could work both for practice and for automating interviews as well


do you have any reason to believe the phone calls are realistic?


This could be a product. AI sales training.


Now you can turn this into an AI sales cold caller based on the data you could collect from how the AI reacts to your selling. That is to say, the entire system becomes a generative adversarial network.


Like exposure therapy for people afraid of sales. Very nice idea.


nice term this, exposure therapy"


Yes exactly!


I like the idea very much! Using an LLM as a "sparring partner" for training in various areas. LLMs tend to hallucinate, so I find it harder to use them reliably in the context of decision making. Training however is a nice idea indeed: mistakes are not as critical, just as in real life any peer can make a mistake.


Very cool, sounds like a saleable product. I feel like there's already half a dozen landing pages with people trying to sell what you just made in the 18 hours since you've shared it here. That should however be a red flag to those same people, a demonstration in just how easily commoditized LLM products are.


Are you finding response time to be an issue? I can imagine some very long pauses might kill the flow of conversation.


It's not perfect, but it's tolerable, and not unlike some real-world calls where there's a slight delay. There are some "Hmm ..." and "well ..." scripted in as well to make it feels natural if there is a long response.


I love the scripted filler words, that’s smart


To that point, I would love to hear an audio file of it in action since I see from GitHub the phone number is down.


That's cool. Thanks for sharing the source. What else has it been good at for you?


The cold call sales part can be replaced to suit any need. I had another version that was just a generic AI (no sales stuff). I found myself on walks frequently ringing up the chatbot ("Hey siri, call ChatGPT") and just asking it whatever is on my mind. "Tell me about Ghengis Khan" or "where's a good place to catch trout in north Georgia" or "how do I make baked ziti". Makes the walks go by super quickly.


Would you be willing provide a live demo (via web interface) - as a preludebto providing a similar training bot as a consultant?


Now do it for dating practice - great for nerds ;)


I helped "writing" a cookbook from my grandmother's recipes. For her 100th birthday, my dad rescued more than 250+ pages of recipes that my Grandma had collected over the years. Some were written in typing machine, others written by hand by her. So, my dad scanned (pictured) all the typed recipes, and "dictated" all the handwritten.

For the dictated recipes, I told him to dictate just "flat" the words and numbers. So that I had paragraphs of recipes.

For the scanned recipes, I used Google OCR (I found out it was the best one quality wise).

For both sets of recipes, I then used GPT4 to "format" the unformatted recipes into well formatted Markdown. It successfully fixed typos and bad OCR from Google.

We then pasted all that well formatted text into a big Google Docs, and added images. Using OpenAI image generation I generated images for each of the 250+ recipes. For some of them I had to manually curate it, given that some of the recipes are for typical Mexican food: For example there's a (delicious) recipe called "PibiPollo" that for the unitiated it may look like a stew, so I had to tell something like "large corn tamale with thick hard crust".

In the end, the book was pretty nice! We distributed digital copies within the family and everybody was amazed :) . I loved spending time doing that.


This is absolutely awesome. I really want to do the same for my mom’s recipe before it’s too late. Though I wonder what would have happened if you went for GPT-V or LLaVa and the like. I have a hunch you might have been able to skip over the OCR part and straight from picture to markdown? Would be awesome if you can try and compare!


Would you mind sharing the cookbook or excerpts from it? I'd love to see it.


I cannot share the full book because a) I don't own the copyright and b) My dad (who ultimately owns it) still has plans to put sell it. And I feel it still requires some editing. But I can share a couple of sample pages:

https://drive.google.com/file/d/1OGE-zfNHHDnALbhgmf3lykBjcSg...

It is in Spanish though.


Very cool, thanks for sharing! The pictures also look a lot better than I imagined.


That's great!


My "stack" is just Apple Shortcuts making HTTP POST API calls to OpenAI, which does stuff in MacOS via BetterTouchTool. I trigger each by hotkey or typing a few letter into Spotlight (with Alfred). One transcribes and summarizes whatever youtube URL is highlighted. One does grammar and style correction of whatever is highlighted (and replaces it). One simply replaces the Dictate key with OpenAI Whisper but otherwise works exactly the same as voice typing. It's just way more accurate. One replaces the magnifying glass key to have a voice conversation with ChatGPT (using Microsoft voice synthesis). The built in prompt keeps it's answers short and conversational. It's like asking Siri something, but much better. One simply reduces the highlighted text by ~50% by rewriting it shorter, for when I have typed too much. One gives the key points of whatever article is in the foreground tab, so I know what I'm about to read. One outputs purely code, for example I use my voice to say "javascript alert saying blah" and alert("blah"); will appear at my cursor. Of course, it's usually more complex boilerplate stuff, but it helps speed up my coding. Every time I find myself using an LLM repeatedly for something, I make it into a little Apple Shortcut to streamline it into my workflow, as if it were a built in MacOS feature.


Could you please share the prompt for the grammar and style correction shortcut? I've just started using it for the same purpose, but I haven't been able to find a prompt that yields consistent results. Sometimes, ChatGPT completely changes the style of my text.


I use role: system, temperature 0.7, prompt: Fix the spelling, grammar, punctuation, order, and sentence structure. It's does sometimes change the style too much, but not often enough to annoy me into fiddling with it.


Have you tried Raycast? It has all the AI scripts you mentioned and many more. And many things done better, like showing diff before inserting grammar-corrected text.


Looks expensive at $20/month if you want GPT-4


I have looking for a way to do "push to record audio" (instead of Mac's dictate) for ages, thanks for the push to look at Shortcuts!

Are you using the "Record Audio" action or something else? Ideally the shortcut would stop listening after a pause like the native Dictate feature does it. At a minimum Record Audio seems to require hitting spacebar to stop - not great but not terrible.


Yes, "Record audio". BetterTouchTool launches the shortcut on keydown, then clicks the Stop button on keyup.


I really love that, super good ideas. I also generally love to create workflow optimizations. Will probably create some of your stuff for myself (I especially like the dictation replacement, could be super useful to me).

Wondering: How big is you monthly OpenAI bill when using all these tools? Only a few $$$, or is it higher?


Only a few dollars a month


You beast! They all sound awesome!


These sound amazing, if you don’t mind sharing somehow, I’d love to see how these work. I’ve never used shortcuts, but I think you’ve inspired me to try.


I put couple screenshots here https://news.ycombinator.com/item?id=39283515 to show the API call part. The rest is just whatever you want it to feed into in Shortcuts. For launching a shortcut on keydown, and clicking out of it on keyup, I used BetterTouchTool like this https://i.imgur.com/sqJ7cOc.png


Would love to learn more details about your setup! I use BetterTouchTool, too and wonder how to make use of it + shortcuts + the API


You might want BetterTouchTool too, since it adds things like cut and paste, to Apple Shortcuts All Actions list. I also use it as the initial trigger usually, to make a hotkey launch a Shortcut. Whisper looks like this https://i.imgur.com/ApAwf2E.png and ChatGPT looks like this https://i.imgur.com/g9f9ZDH.png .


Nobody heard of Raycast?


I have not. It looks like it just does some things Alfred was already doing 10 years earlier, and some ChatGPT. Seems like everything has ChatGPT integration these days. I don't think it's compelling enough for me to try tho personally, but I only skimmed the home page. The $8/mo subscription model turned me off a bit.


What are the settings and prompt you use for the youtube one?


I built an Interactive Resume AI chatbot where anyone can ask questions about my experience and skills: https://www.jon-olson.com/resume_ai/

The backend is a Python FastAPI that uses ChromaDB to store my resume and Q&A pairs, OpenAI, and Airtable to log requests and responses. The UI is Sveltekit.

I'm currently building a different tool and will apply some learnings to my Interactive Resume AI. Instead of Airtable, I am going to use LangSmith for observability.

I started writing and my Substack articles are also linked to via my website. I'm currently working on applying sentence window retrieval and that article will be out shortly. This is part of a #buildinpublic effort to help build my brand as well.

I've been unemployed since Sept as a Senior Software Engineer. The market is tough so I'm focusing on the above to help get employment or a contract.


Nicely done Jon. I really like the UI - I wanted to have buttons as well but didn't find how to do it in Streamlit.

I also built Resume Chatbot but using slightly different stack: Python, Langchain, Faiss as vector store, MongoDB to store chat logs and Streamlit for UI. Here is a link: https://www.artkreimer.com/resume/ or you can try it on streamlit https://art-career-bot.streamlit.app/. Code is available here https://github.com/kredar/data_analytics/tree/master/career_.... Great thread and I got some ideas for my next project. Thanks a lot everyone.


Hey Jon - I'm Jon too - working on an AI startup in the recruiting space and will be hiring remotely. I like your resume ap and can definitely see utility in it. I'd be happy to connect and maybe see if there is a way we could work together! I'll find you on LinkedIn and send you an invite request.


Resume AI is cool, really nicely done, mate !


Thanks. This is the first real test apart from a couple dozen test users. I've received hundreds of prompts in the past 24 hours from Hacker News users, mainly from my suggested questions buttons.

The actual questions I got did not provide a response that is to my liking. Most of that is due in part because I'm using gpt3.5 since gpt4-turbo is a lot more expensive, and I can learn a lot more by using an inferior LLM.

For example, using an llm router to analyze the query and route to a specific helper function with a specific prompt would be helpful. Sometimes a user starts with a greeting but the response is a pre-written "Sorry an answer cannot be found". Questions are typically grouped into a category such as skills, experience, project, personal (ie: where are you located), preferences (ie: favorite language), and general interview questions (ie: why should I hire you). Questions in categories can be better answered by using a different prompt and/or RAG technique.


I’m sorry it’s been tough. The job market for seniors and leads is still quite strong in Australia if you can move here


Thanks. I'm starting to realize that part of the problem is job search and matching.

I was contacted by a company recruiter for a small healthcare SaaS in California and had 3 interviews recently. When I looked up the job, only 7 people had applied in 2 weeks on LinkedIn. They are a very real company with very real people, but their job post is not getting seen (it's not a promoted post).

My next AI project will be to scrape LinkedIn jobs, analyze it for repost/promoted behavior, group it by consulting/headhunters vs company job post, eliminate duplicates, and filter based on my skillset and hard-no qualities (such as can't work if I live in California, must be in EST but I'm in PST timezone, requires Java experience, etc).


... btw. are you sure that only 7 people applied for that job? Because there are a lot of job announcements on LinkedIn which just won't show the number of applicants correctly in case there's an application link outside of LinkedIn for applying, meaning the application doesn't take place within LinkedIn. In that case, you'll get the question by LinkedIn if you have applied for the job, which most people just won't click. I'm seeing this all the time.

But still good point that there might be promoted jobs and non-promoted ones, maybe it's worth creating an own job scraper.


That's a good point about the applicant number. I don't think anyone knows exactly how it works, but it was the first time I saw a job posting older than a few hours with <10 applicants with such straightforward skills such as Python.


Just played with your app, I think is super cool! I especially liked the way that you can just click the next question within an answer, makes it super convenient and fun to use.

I'm currently also looking for a dev job. So you have 15 years of experience, live in California and struggle to find something? That sounds a bit demotivating to me lol, because I'm kinda half of all of that or a bit less.

I also like your LinkedIn analysis idea, should try that maybe, too.


I'm looking for remote dev job, which is much harder, since I am in Sacramento.


I had a similar idea. Scraping LinkedIn and some other job boards and analyzing and highlighting jobs that best fit specific criterias.


I've done a handful of fun hardware + LLM projects...

* I built a real life Pokedex to recognize Pokemon [video] https://www.youtube.com/watch?v=wVcerPofkE0

* I used ChatGPT to filter nice comments and print them in my office [video] https://www.youtube.com/watch?v=AonMzGUN9gQ

* I built a general purpose chat assistant into an old intercom [video] https://www.youtube.com/watch?v=-zDdpeTdv84

Again, nothing terribly useful, but all fun.


Oh hey I just watched that pokedex video. It was so impressive! Deserves way more attention


Indeed! Such a beautiful project!


Great job on that Pokedex and the video entirely. So freakin cool!


We've made a lot of data tooling things based on LLMs, and are in the process of rebranding and launching our main product.

1. sketch (in notebook, ai for pandas) https://github.com/approximatelabs/sketch

2. datadm (open source, "chat with data", with support for the open source LLMs (https://github.com/approximatelabs/datadm)

3. Our main product: julyp. https://julyp.com/ (currently under very active rebrand and cleanup) -- but a "chat with data" style app, with a lot of specialized features. I'm also streaming me using it (and sometimes building it) every weekday on twitch to solve misc data problems (https://www.twitch.tv/bluecoconut)

For your next question, about the stack and deploy: We're using all sorts of different stacks and tooling. We made our own tooling at one point (https://github.com/approximatelabs/lambdaprompt/), but have more recently switched to just using the raw requests ourselves and writing out the logic ourselves in the product. For our main product, the code just lives in our next app, and deploys on vercel.


Having a play with datadm. It's really good and intuitive to use - good job! I'm getting errors now, but was having a lot of fun before.


This is cool. Thank you for sharing.


I've built several things! These include bots for code generation that you can tag onto issues, q&a on text etc.

The thing I'm working on now is AI mock interviewing. It's basically scratching my own itch, since I hate leetcode prep, and have found I can learn better through interaction. To paste a blurb from an earlier comment of mine:

I'm building https://comp.lol. It's AI powered mock coding interviews, FAANG style. Looking for alpha testers when I release, sign up if you wanna try it out or just wanna try some mock coding. If its slow to load, sorry, everything runs on free tiers right now.

I really dislike doing leetcode prep, and I can't intuitively understand the solutions by just reading them. I've found the best way for me to learn is to seriously try the problem (timed, interview like conditions), and be able to 'discuss' with the interviewer without just jumping to reading the solution. Been using and building this as an experiment to try prepping in a manner I like.

It's not a replacement for real mock interviews - I think those are still the best, but they're expensive and time consuming. I'm hoping to get 80% of the benefit in an easier package.

I just put a waitlist in case anyone wants to try it out and give me feedback when I get it out

Gonna apologize in advance about the copywriting. Was more messing around for my own amusement, will probably change later


Very cool, I signed up. I agree that practicing a coding interview is better under pressure. It's a much difference skill to solve a coding problem both under time pressure and pressure to speak your thoughts to entertain the interviewer. Only practice can help improve that skill.


Yeah, I agree, the scenario is totally different in an actual pressure situation, I've fumbled so many easy questions. I don't necessarily like leetcode style questions as the standard for the industry for interviewing, but its still a reality and, from what I'm noticing, becoming more difficult in terms of expectations.

Thanks for signing up, will send out an email once its ready to take for a spin!


A Twitter filter to take back control of your social media feed from recommendation engines. Put in natural language instructions like "Only show tweets about machine learning, artificial intelligence, and large language models. Hide everything else" and it will filter out all the tweets that you tell it to.

Runs on a local LLM, because even using GPT3 costs would have added up quickly.

Currently requires CUDA and uses a 10.7B model but if anyone wants to try a smaller one and report results let me know on github and I can give some help.

https://github.com/thomasj02/AiFilter


That could actually be a universal ad-whacker for similarily stubborn sites (reddit)


I've been thinking the same thing. It'll be interesting to see if we end up with prompt-injecting ads


I didn’t know you could interact with pages like that so easily with Chrome extensions


I built an AI Hiring Assistant that performs an initial screening, collects candidate information, answers questions about the role, and also asks a several behavioral interview questions: https://hiring.gracekelly.dev/

Built entirely on Vercel & OpenAI. Took about a day, hardest part was configuring Sign In With Google. Had several dozen candidates use it, saved a lot of time and helped prioritize conversations.

I just did a brief writeup about it yesterday: https://www.linkedin.com/pulse/i-built-ai-hiringscreening-as...


> A few people emailed their resumes directly rather than using the chat

How did they fare compared to candidates that went through the chat process?


Small dataset for those that emailed, n=~3, but none of them were standout resumes. Best few candidates actually went through chat and also followed up via email with additional information a few days later.


I wrote a script that takes in my credit card statement line by line and categorized the transactions into a custom set of categories that I cared about as well as generating a human readable description of the transaction.


Tell me more, that is interesting. Even my bank (a big one) is unable to categorize the transactions correctly.


I'd love to see the script but especially the prompt you're using here.


Was thinking about this the other day too!


I used an LLM connected to a messaging service to defeat romance scammers. I was able to get these romance scammers to speak to my program for hours without knowing they were talking to a machine. Essentially, it's a DDOS for scammers. The scammers can only talk to a few dozen victims at a time, while the "people" in my programs can be spun up by the millions. It will essentially eliminate messaging scams from whatever messaging platform it's on.

I believe a large company like Meta, or any of the other companies with messaging platforms, would find this valuable. Especially because they will be fined by the UK for fraud that takes place on their messaging services.


At what point is it just two AIs talking to each other, back-and-forth?


Great idea! Defense through a trap, where the criminals will be used against themselves, I like it! Do you happen to have an address for your project so I can forward it? I may know people who would be interested in purchasing or supporting it.


LLM agents to forecast geopolitical and economic events.

- Site: https://emergingtrajectories.com/

- GitHub repo: https://github.com/wgryc/emerging-trajectories

I've helped a number of companies build various sorts of LLM-powered apps (chatbots mainly) and found it interesting but not incredibly inspiring. The above is my attempt to build something no one else is working on.

It's been a lot of fun. Not sure if it'll be a "thing" ever, but I enjoy it.


Fascinating. I've done this on a tiny, micro scale -- giving the GPT scenarios (eg, conversations, situations) and asking how it would play out. In early 2023 it seemed to work really well, now that they've nerfed it so much, it's a bit too generic and proper.


Have you tried GPT-4 with the update from the past few days? (When Sam mentioned it should be less lazy.) I notice it’s gotten much better and more willing to make forecasts since then.


No but I'll check it out. Thanks!


Very interesting, have you attempted to backtest to see if the LLM forecasts are accurate?


Thanks for asking! Not yet as I’ve been focusing on building agents that can properly and regularly log predictions.

Ideally, I’d like the agents to then participate in prediction markets or “superforecasting” groups to use actual human predictions as baselines.


If your project makes you rich and you need some engineering help, call me ;)


I built a couple of things, but the most useful is probably allalt[1], which describe images and generate alt tags for visually impaired users using GPT-4V. Next I want to add the option to use local LLMs using ollama[2], but I'm still trying to decide the UX for that.

There's also Moss[3], a GPT that acts as a senior, inquisitive, and clever Go pair programmer. I use it almost daily to help me code and it has been an huge help productivity-wise.

[1] https://git.sr.ht/~jamesponddotco/allalt

[2] https://ollama.ai/

[3] https://git.sr.ht/~jamesponddotco/moss


A “YouTube video subtitles generator” script for Estonian content.

Powered by whisper-timestamped [1] using a model trained by the local tech university TTÜ [2]

And it just… works! (with some tweaks and corrections)

[1] https://github.com/linto-ai/whisper-timestamped

[2] https://huggingface.co/TalTechNLP/whisper-large-et


I've created just-tell-me [1] that summarizes youtube videos with ChatGPT. It's built with Deno, uses TypeScript and is deployed with Deno Deploy. It's open source, you can run it from CLI as well [2]

[1] https://just-tell-me.deno.dev/

[2] https://github.com/franekmagiera/just-tell-me


This is great! I ignore so many videos from friends and family because I suck at watching videos!


This is really neat!


I used FlowWise[1], LM Studio[2], the llama2[3] model, and Ollama[4] (for embeddings) to create a local-only RAG chatbot so I could chat directly with Tristram Shandy, Gentleman[5]. For the context document I used the text of the novel of the same name, downloaded from Project Gutenberg.

Primarily it was a PoC to see if a document based chatbot could work without crossing trust boundaries by calling out to untrusted APIs. It only makes calls to localhost.

If you’re familiar with the novel you will be pleased to know that the chatbot ended a recent answer with, “I must go now as I have an appointment with my chamber pot and I wouldn’t want to keep it waiting.”

[1]https://github.com/FlowiseAI/Flowise

[2]https://lmstudio.ai/

[3]https://llama.meta.com/

[4]https://ollama.ai/

[5]https://www.gutenberg.org/ebooks/1079

Everything runs on a Mac Mini with the M2 Pro CPU/GPU and Mac OS Sonoma.


Seriously!? I love the idea, but I perhaps love more that Tristram Shandy was your choice of character to chat with. You have good taste!


I'm building https://www.brief.news, an AI powered newsletter that condenses tens of thousands of news articles into a daily briefing of the top stories, we support 30 topics today and are adding the ability to add your own!

Stack is a combination of TypeScript (Next / Node) + Python with a pretty simple deployment setup right now (GHA -> Container -> Cloud Run).


How much money you need to spend per day on OpenAI api?


A lot! We’re actively reducing that though by training our own specialized models. We’re seeing equal or better performance with curated datasets at > 10x cost reduction.


This is pretty cool. Just a heads up: there's a french newsletter company I was subscribed to that is using brief.eco, brief.me and brief.science. Ironically, their main selling point is summarized news but by humans.


There’s definitely a need here, love the category specific TLDs.


This looks awesome - might I suggest splitting the headlines on the homepage into a punchy title and subtitle? The wordiness of them makes it difficult for me to parse them for the topic quickly


Thanks! We got a couple different formats available, check out the top stories format which is close to what you suggest. Would love to hear your thoughts on it! We’re considering making that the default.


Have you considered a weekly version as well - I personally don't like receiving daily e-mails


great stuff, for general news-synthesis I use https://www.newsminimalist.com/ definately check it out!


This is fairly well done, good job!

The layout is clean, and it's fast. The summaries are solid as well.


Thanks!


I’ve always found podcast discovery to be lacking, so I’m building the ultimate solution to that.

We’re processing the top podcasts in many genres every day (currently thousands of daily episodes) and running them through our pipeline.

From this we’ve made a semantic search engine, for example: https://www.podengine.ai/podcasts/search?search_term=Should+...

We’re soon going to improve and summarise the responses from the raw embeddings in a few ways. Would love some feedback on the experience.

We have also opened up a keyword alerting feature to alert folks when they’ve been talked about in an episode.


I really like the idea of using embeddings in this way. I'm sure scaling out to get "most" of the podcasts is no joke. But some bigger podcasts like Smartless didn't seem to be in your database.

Have you considered using embeddings to show similar podcasts?


I built an iOS and macOS offline LLM app called Private LLM[1]. I don't have any visibility into what the users do with it, but from what I hear on the app's discord, people love to use it in their Apple Shortcuts workflows for text manipulation.

I initially built it using llama.cpp for offline LLM inference, but soon discovered mlc-llm and moved to using it, because the latter is way faster and flexible.

[1]: https://apps.apple.com/us/app/private-llm/id6448106860


I wanted to automate the process of creating self-guided tours and online treasure hunts around towns and cities.

Ultimately I wanted a whole marketplace where anybody can create a tour and then sell it.

But the process of creating the tours was quite laborious.

So to speed this up I fed GPT-4 information about local points of information and had it write the questions and the multi choice answers. It also wrote some narrative bits as various personas. For example, there was a Christmas hunt where GPT4 played the part of an elf and came up with a theme about Santa needing to recruit you to be a new elf, once you’d answered all the various clues etc.

Front end is React Typescript, backend is Net Core Web API on Linux with MySQL under EF Core and also integrations with GPT4 and Stripe.

It’s hosted at treasuretours.org

Only superusers can access the AI tools right now because cost, but you can try out some of the pre-made hunts which were partially AI generated.


https://www.rivadata.com/

I have been hacking together a poor-man's crunchbase that's fueled by GPT.

React / Python / Supabase. The most interesting piece thus far has been the success of the self-correcting loops through GPT. At each turn basically feeding the results back to another 3.5 prompt that is only about reviewing quality. I found that with these loops you can get solid results without having to use the more expensive GPT4 API.

(Also loving all the projects in this thread)


It's funny bc that's exactly how fine tuning is done rn. I find it amusing how we can leverage the same techniques pre and post. Wild wild west


This is awesome. I've been looking for this exactly in the last few days.


I've made a couple games, though I am still having a hard time finding the soul of the game in the LLM and haven't released them; there's a historical roleplay game (that I plan to release soon), a storytelling game (the player tells stories to the LLM), a wander-a-world-aimlessly-and-chat game, and I never get further than 50% through the way of murder mystery games, though murder mysteries seem like an excellent structure.

I've built some abstract content development tools, generally focused on building larger content somewhat top-down (defining vibes, then details).

I'm working on a general project helper using the GPT-Vision, voice, and regular GPT. You setup the camera above your workspace, work on paper, and chat with the LLM while you do it. I think there's a lot of potential, but the voice stuff is quite hard to deal with... there's just a ton of stuff happening in parallel, and I find it very hard to code something reliable.

The stack I use is all in the browser, generally Next.js, Preact Signals, and my own code to call into GPT, Whisper, etc. I like having everything available for inspection, and I generally keep all the working bits visible somewhere. (This can be overwhelming when other people see it.)

But I haven't gotten over the deployment hump... the cost and complexity is a challenge. I've used Openrouter.ai recently in a project, and I think if I leaned on that more completely I'd find the release process easier.


I have a somewhat unique answer for that- I started with building a product, and ended up building a dev platform for LLM based products (more specifically- dev platform for json outputting LLM structured tasks).

Here's the story:

At first I was building a tool for stock analysis- the user writes in free language what companies they want to compare, along with a time period, and their requested stocks show up on a graph. They can then further reiterate on it- add companies, and change range all in free language (I had many more analysis functions planned). Following some unique dev challenges I've found- I ended up not releasing the product (possibly will sometime in the future..), and switched to work on a dev platform to help with these challenges.

I was using what I called 'LLM structured task'- basically instructing the LLM to perform some task on the user input, and outputting a json that my backend can work with (in the described case- finding mentioned companies and optional time range, and returning stock symbols, and string formatted dates). The prompting has turned out to be not trivial, and kind of fragile- things broke with even minor iterations on the prompt or model configurations. So- I developed a platform to help with that- testing (templated) prompt versions, as well on model configurations on whole collections of inputs at once- making sure nothing breaks in the development process (or after). * If you're interested, welcome to check it out on https://www.promptotype.io


https://www.askaway.bot/

AI concierge for my parents’ vacation rental. Mostly just pulling info from the guest binder, but I’ve also started using some local guides to give better suggestions. Built with NextJs and deployed on Vercel (was really easy and they have a generous free tier).


How well does it work, regarding accuracy?


https://www.mealbymeal.com

It's macro + calorie tracking over text message. You just text what you eat and it matches against a food database to estimate your food intake. It's basically an easier alternative to MyFitnessPal.

My stack is OpenAI on Azure, Vercel, Convoy, FatSecret API, Postmark, NextJS.


I built a summarizer for drilling reports. Anytime you drill boreholes, whether it's on a drilling platform in the ocean or the middle of the desert or wherever, there's a geologist watching what comes out and writing notes about it. They likely do this multiple times both in the field and a laboratory setting. These notes are paired with logging software which also asks the geologist more quantitative questions sometimes (e.g., on a scale 1 to 5 how many fractures are there). Typically these are written for at least every meter of extracted core/rock/etc. typically you are drilling hundreds or thousands of meters, or more. So you end up with a highly unstructured data set that occasionally someone glances through to find tidbits. Using chatgpt we converted this data into keywords that could then be used to look at depth dependencies of various geological or petrological features of the region.


I was tired of the need to scroll through dozens of blogs and RSS feeds to learn about technologies and industry news, so I’ve built a service that helps you learn and stay updated about any topic by sending a single fully personalized weekly email digest, making relevant information come to you, instead of you chasing it (push vs pull):

https://peekly.ai

It’s basically an LLM-based RAG that works over the best blogs and websites covering any topic you provided during onboarding.


I'm unable to submit my email/interests.

This is in firefox with and without UBlock Origin.

Errors: https://i.imgur.com/N28wnVY.png


Ironically, when trying to view your picture of errors, I myself get an error on Imgur itself:

{"data":{"error":"Imgur is temporarily over capacity. Please try again later."},"success":false,"status":403}


Wow, thanks for letting me know! I've just fixed the problem thanks to your bug report.


I read the paper "Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling" that was published last week and started building a tool for people to collectively generate synthetic training data.

The tool still needs a trust mechanism and a coherent incremental publishing strategy to be able to operate in a public fashion. Right now, running one node using my RTX 3060 it would take 1.2 years to do one split of the C4 dataset.

https://arxiv.org/abs/2401.16380

https://www.emergentmind.com/papers/2401.16380

https://github.com/gardner/gsd


I got fed up sending cover letters so I made a tool that writes them for me. Scrapes the company website and summarizes it to get relevant background info, takes my resume + arbitrary info I provide as input, and the job posting (can also work without for unsolicited applications). I then fine-tuned a GPT3 model on actual cover letters I had written to make it sound like me, and voilà ! Actually landed me a job.


Summarisation for calls, emails. Lots of extraction tasks & closed domain chatbots.

Deployment is usually FastAPI for business logic, Langchain or MS/Guidance library, LLM hosted via. HF-TGI server


Lots of small stuff like bots and scripts to automatically rename files that I use locally every single day

Then things like:

“Fix My Japanese” - uses LLM to correct Japanese grammar (built with Elixir LiveView): https://fixmyjapanese.com

It has different “Senseis” that are effectively different LLMs, each with slightly different style. One is Claude, one is ChatGPT.

Or a slack bot that summarizes long threads:

https://github.com/dvcrn/slack-thread-summarizer


Love this, and love that you're using Elixir LiveView as well as Elixir with the Slack bot and lastly that you're based in Tokyo.

Followed you on Github, I'm looking at moving (back) to Japan in two years, likely to start a bootstrapped startup business and always good to have dev friends!


Thanks :)


I use a local LLM too for my Japanese grammar and sentence improvements to make thme more native.

Sadly, every time I tried the fix my japanese page it just says the text I inputted wasn't Japanese. Maybe next time it'll work for me.


> Lots of small stuff like bots and scripts to automatically rename files that I use locally every single day

What kinds of prompts are using for the file renaming scripts?


I will check this out. My native Japanese wife rolls her eyes at any AI Japanese bot. "We would never say it that way."


I just built a tool that uses Whisper.cpp compiled to WASM in conjunction with SQLite WASM for a fully client-side book writing tool.

Basically, I want to write a book without having to type out the whole thing. I got the dictation idea from an episode of Columbo.

It is very much a work in progress and a proof of concept for another writing tool I want to make.

https://orderly.cmgriffing.com/

https://github.com/cmgriffing/orderly


This is awesome! What is the performance like? particularly around WASM compiled Whisper.


Thanks! It really depends on your machine.

My mac mini is heavily constrained due to only having 2 proper cores and not much RAM. So the smallest models run best. The quantized tiny runs better than the regular tiny simply due to memory pressure.

So, in my testing on my mac mini it tends to take about 30% more time to process than the audio clip that was recorded. I added a specific warning that lets the user know to reduce their thread count if the processing time takes longer than a specific threshold.

Some of my stream viewers report it being pretty fast though. (much faster than what they see on my mac mini).


LLMs have been game changing productivity-wise for me

But I found that LLMs are often wrong and hallucinates, so I have to double check with google or other resources.

So I built a google and chatgpt alternative to answer any question and hallucinations are more obvious. I do this by using by multiple LLM's including search enabled ones i.e. GPT4, Gemini, Claude, Perplexity, Mistral, and Llama.

It's been growing healthily https://labophase.com


A search engine that saves me time by detecting SEO spam, downranks results containing ads, and summarizes click bait descriptions away

I made it available to the public aisearch.vip


I like how a few comments in this thread are the cause of the problem you're fighting.


I'm building a way to automate creation of software video lessons and courses, putting it all under the name 'CodeVideo'. One tool leverages OpenAI's whisper, as well as GPT3.5 or GPT4 for help with generating the steps that ultimately produce the video (this part is not yet in the repo; everything is a work in progress). The tool is here:

https://github.com/codevideo/codevideo-ai

My goal is to definitely NOT generate the course content itself, but just take the effort out of recording and editing these courses: you provide (or get help generating) the stuff to speak and the code to write and the video is deterministically generated) The eventual vision is to convert book or article style text to generate the steps to generate the video in an as-close-as-possible-to-one-shot.

I also leverage Eleven Lab's voice cloning (technically not an LLM, but impressive ML models nonetheless)

For anyone more curious, I'm wondering if what I'm trying to do is in general a closed problem - to be able to generate step by step instructions to write functional code (including modifications, refactoring, or whatever you might do in an actual software course) or if this truly is something that can't be automated... any resources on the characteristics of coding itself would be awesome! What I'm trying to say is, at the end of the day code in an editor is a state machine - certain characters in a certain order produce certain results. Would love if anyone had more information about the meta of programming itself - abstract syntax trees and work there comes to mind, but I'm not even sure of the question I'm asking yet or trying to clarify at this point.


This is super interesting. From what I gathered, generating a video autonomously given the content or instructions is sort of difficult. Curious to hear if you (or others) have any leads on how you can build this? (I'm assuming you intend to generate an instructional course video given the content) If this works well, could be used for teaching students via auto-generated video.


I'm building a spaced-repetition flashcards language learning app, that generates sentences and explanations for a given word.

Unfortunately only for German, but I plan on expanding the languages soon.

https://vokabeln.io

Tech stack: - The app is in Flutter. - Backend I'm nodejs TS. - GPT4 for generation of sentences and explanations - GCP text-to-speech for audio


Nice! I'm building something similar for French


We built https://gptforwork.com a set of add-ons for Excel, Word, Google Sheets and Docs that brings custom GPT functions in Excel and Sheets, to prompt directly from cells, a chat in Word to interact with documents, and a simple prompt box in Docs We offer OpenAI and Azure providers (as well as Anthropic on Sheets)


5M installations, wow!


Built this little tool to summarize Hacker News articles using HuggingFace. https://gophersignal.com

It doesn't do a ton, but it's kinda cool. Feel free to fix/add anything https://github.com/k-zehnder/gophersignal


I have built a webapp for translating srt files: https://www.subsgpt.com

GPT-4 excels as a translator, but it often encounters issues with content warnings and formatting errors when translating entire subtitle files via ChatGPT. The solution is straightforward: divide the subtitle file into sections, focusing solely on translating the text and disregarding the timestamps. While it's feasible to have ChatGPT maintain the correct format, I've observed a decline in translation quality when attempting this in a single pass. My preferred approach is a two-phase method: first, translate the text, and then, if necessary, request ChatGPT to adjust the formatting.

The webapp splits the srt file into batches of 20 phrases and translates each batch. It also allows for manual correction of the final translation.

Ah and it's also serverless: you input your OpenAI token & select the model of your choice and the webapp makes the requests to OpenAI directly.


This is great, how well does it do with informal/slang Portuguese, Russian or Spanish?


GPT4 is incredible! We watched https://en.wikipedia.org/wiki/The_Boy%27s_Word:_Blood_on_the... which is full of slang and it was perfectly comprehensive. I love how it'd translate informal speech into informal.


Some little projects I've been playing around with:

- https://github.com/iloveitaly/sql-ai-prompt-generator generate a ChatGPT prompt with example data for a sqlite or postgres DB

- https://github.com/iloveitaly/conventional-notes-summarizati... summarize notes (originally for summarizing raw user interview notes)

- https://mikebian.co/using-chatgpt-to-convert-labcorp-pdfs-in... convert labcorp documents into a google sheet

- https://github.com/iloveitaly/openbook scrape VC websites with AI


"Widjosumarajzer" = video summarizer

It's just a hodgepodge of prototype scripts, but one that I actually used on a few occasions already. Most of the work is manual, but does seem easily run as "fire and forget" with maybe some ways to correct afterwards.

First, I'm using the pyannote for speech recognition: it converts audio to text, while being able to discern speakers: SPEAKER_01, _02, etc. The diarization provides nice timestamps, with resolution down to parts of words, which I later use in the minimal UI to quickly skip around, when a text is selected.

Next, I'm running a LLM prompt to identify speakers; so if SPEAKER_02 said to SPEAKER_05 "Hey Greg", it will identify SPEAKER_05 = Greg. I think it was my first time using the mistral 7b and I went "wow" out loud, once it got correct.

After that, I fill in the holes manually in speaker names and move on to grouping a bunch of text - in order to summarize. That doesn't seem interesting at a glance, but removing the filler words, which there are a ton of in any presentation or meeting, is a huge help. I do it chunk by chunk. I'm leaning here for the best LLM available and often pick the dolphin finetune of mixtral.

Last, I summarize those summarizations and slap that on the front of the google doc.

I also insert some relevant screenshots in between chunks (might go with some ffmpeg automatic scene change detection in the future).

aaand that's it. A doc, that is searchable easily. So, previously I had a bunch of 30 min. to 90 min. meeting recordings and any attempt at searching required a linear scan of files. Now, with a lot of additional prompt messaging I was able to:

- create meeting notes, with especially worthwile "what did I promise to send later" points

- this is huge: TALK with the transcript. I paste the whole transcript into the mistral 7b with 32k context and simply ask questions and follow-ups. No more watching or skimming an hour long video, just ask the transcript, if there was another round of lay-offs or if parking spaces rules changed.

- draw a mermaid sequence diagram, of a request flowing across services. It wasn't perfect, but it got me super excited about future possibilities to create or update service documentation based on ad-hoc meetings.

I guess everybody is actually trying to build the same, seems like a no-brainer based on current tool's capabilities.


Very interested in this. I have been contemplating building something similar, but am unaware of any existing services that do this. Haven't played with pyannote, how does it compare to whisper? Also thought it might be useful to be able to OCR screenshots and use the text to inform the summariation and transcription especially for things like code snippets and domain-specifc terms.


I remember whisper v3 large blowing my mind: it was able to properly transcribe some two language monstrosity (przescreenować, which is a english word "to screen a candidate", but conjugated according to standard polish rules). Once I saw that I thought "it's finally time: truly good transcription has finally arrived".

So I view whisper as sota with excellent accuracy.

Now, for the type of transcription I need speaker discerning is much more valuable than accurate to the point translation: so it will be summarized anyway and that tends to gloss over some of errors anyway.

That said, pyannote has also caught me off guard: it correctly annotated lazily spoken "DP8" with non native speaker accent.

It looks really good


Is pyannote the best diarization library you found? What's SOA? I've been using a saas product (Gladia) and I'm getting close to my 10-hour mark.


The first and good enough for me not to look further


At https://openadapt.ai/ we are using LLMs to automate repetitive tasks in GUI interfaces. Think robotic process automation, but via learning from demonstration rather than no-code scripting.

The stack is mostly python running locally, and calling the OpenAI API (although we have plans to support offline models).

For better visual understanding, we use a custom fork of Set-of-Mark prompting (https://github.com/microsoft/SoM) deployed to EC2 (see https://github.com/OpenAdaptAI/SoM/pull/3).


We're building a GPT for managing your finances.

https://candle.fi/gpt

Our backend stack: - AWS - SST - TypeScript

Our clients:

- Next (web) - Vanilla React Native (mobile)

OpenAI's App Store announcement is what got us interested in building w/ LLMs.


Why not show names and faces of the founders? Explain the backstory. Using your service requires users put absolutely enormous trust in you. But there is currently nothing on the site to engender that trust. I would work on that as a priority.


I appreciate the feedback; I'm working on adding an about page!

We have an /updates page & blog, as well as links to GitHub. I figured finding out more about myself and my co-founder was pretty easy.


Even more ominous that it's free


link seems broken to me.


We've been deploying changes all day so could related, thanks for the report. Should work now.


I'm building a weight-loss app that leverages LLM to do 2 things:

1. Analyze calories/macronutrients from a text description or photo

2. Provide onboarding/feedback/conversations like you'd get from a nutritionist

https://www.fatgpt.ai/

My stack is Ruby on Rails, PostgreSQL, OpenAI APIs. I chose Rails because I'm very fast in it, but I've found the combination of Rails+Sidekiq+ActionCable is really nice for building conversational experiences on the web. If I stick with this, I'll probably need a native iOS app though.

Vendor stack is: GitHub, Heroku (compute), Neon (DB), Loops.so (email), PostHog (analytics), Honeybadger (errors), and Linear.


> 1. Analyze calories/macronutrients from a text description or photo

Step 1: Is it a hot dog or not hot dog? https://www.youtube.com/watch?v=ACmydtFDTGs

I'm glad someone is keeping the dream alive!


Jokes aside, GPT-4 Vision is surprisingly good at noticing facts from food images. For example:

- In my chipotle bowl, it can tell if I had brown rice vs white rice

- In my In-n-out, it can tell if I got it protein style

It struggles with accurate weights/volumes but I'm excited about where this is going.


fatGPT... the LLM that helps you be more model, less large.


our large language model is large so you don't have to be.


Transformers that transform your body


I was holding a free screening of a short film I made, and as an alternative to Eventbrite and the like, I built a simple SMS-based ticket reservation system that used GPT-4 to read and respond to messages. People interested in attending would text a number and their messages were routed by Twilio to my Node.js app, which in turn sent them to GPT to generate a response. The LLM was instructed to provide a structured JSON of each reservation once the person gave their name and the number of the seats they wanted. Worked very smoothly and only took an afternoon to build. Would've been infinitely more tedious if I had to worry about parsing messages with my own code.


I have two main projects that are public ATM with LLMs.

The more notable one was experimenting with LLMs as high level task planners for robots (https://hlfshell.ai/posts/llm-task-planner/).

The other is a golang based AI assistant, like everyone else is building. Worked over text, had some neat memory features. This was more of a "first pass" learning about LLM applications. (https://github.com/hlfshell/coppermind).

I plan to revisit LLMs as context enriched planners for robot task planning soon.


I made some LLM-powered text-adventure games: https://cosmictrip.space/gameannouncement

And I'm working on a webapp that is a kanban board where LLM and human collaborate to build features in code. I just got a cool thing working there: like everyone, having LLM generate new code is easy but modifying code is hard. So my attempt at working on modifying code with LLM is starting with HTML and having GPT-4 write beautfulsoup code that then makes the desired modification to the HTML file. Will do with js, python via ast, etc. No link for this one yet :) still in development.


I didn't make text-adventures with LLMs. I try to solve them [0].

So, far, none of the 7 tested models were able to win even one of the easiest text adventures. I tried many prompting techniques. But only GPT-4 was able to play through the first half of the game.

[0] https://github.com/s-macke/AdventureAI


Fun, I tried to do this back with GPT-3: https://llm.ianbicking.org/interactive-fiction/

But Zork wouldn't be a very accurate measure of skill because GPT definitely knows Zork. Unfortunately the emulator (https://github.com/DLehenbauer/jszm) doesn't work with most games newer than Zork. I haven't revisited the code with newer GPT models either.


GPT-3 doesn't even manage the first few steps of the tested text adventure. And GPT-4 is not good at playing these adventures either.

However, my code run a newer version of the Z-machine. So Zork and many other text adventures will work. I have not tried many other games though.


I was surprised how high your costs were. I assume you are putting the entire transcript into each prompt, but even then that seems high. Is GPT's planning also taking up a lot of room?

I did find giving GPT some hints about the known commands helped a lot, and I put in some detection of error messages and kept a running log of commands that wouldn't work. Getting it to navigate the parser is kind of half of the skill of playing one of these games. It would be interesting to have it play some, then step back and have it reflect and enumerate things about how the play itself works.


The costs have dropped significantly months after I created the cost image. Now I use GPT-4 Turbo. This GPT-4 model understand how text adventures work and there is no need to give him known commands.

Of course you try even more sophisticated techniques than mine. I tried the ReAct pattern and virtual discussions. So far, he always stumbles at the same place in a critical understanding of the text. And I tried exactly this critical step dozens of times.

You will understand the issue yourself, once you play the game yourself. It just takes 20 minutes and is very easy:

https://adamcadre.ac/if/905.html


You mean at the very end of the game? The game seems like it's only designed to trick you into that very ending :) Are you hoping it will figure out the game based on the context clues? I'm not sure I can find them myself...

A long time ago I did some exercises in "classical planning algorithms", which all feel very like the early part of this game. I.e., how do you get ready to leave if you have to shower, and can't do that with clothes on, etc. A similar planning example involved changing a tire (opening the trunk, removing lug nuts, etc). It was surprisingly difficult to make an algorithm that could figure it out! You could search the state space given the transitions, but it exploded with what was effectively lots of dead ends; obvious to me as a human, but not to the algorithm. Which is to say that this is a harder problem than it might seem.


Yes, that is the first "bad" ending. After that follow just the one relevant context clue and look under the bed. That might be already enough.

I chose this game, because the game just helps you, at the every step, what you have to do next. Not much to try out. Just the narrative changes. One time, you have to go to work and one time you have to flee.

Other text adventures are even more problematic. I saw GPT-4 trying for dozens of steps in the "The Hitchhiker's Guide to the Galaxy" adventure just to turn on the lights. And this just the first command you have to get right in the game.


I built a diagram generator in PlantUML format: https://chatuml.com

Also, hello HN! If you are interested, use this promo code for 50% off your first purchase ;)

  HELLOHACKERNEWS


Project 1 — Source code: https://github.com/bingdai/summaryfeeds. The code is for Summary Feeds (https://www.summaryfeeds.com). It shows summaries of AI-related YouTube Channels.

****

Project 2 - I also built a YouTube summarizer for individual video called Summary Cat (https://www.summarycat.com). It is not open source for now. The stack is very similar to project 1.

****

And yes I like summarizing YouTube videos:)


For my expense sharing app [1], I added receipt scanning [2] in a few minutes and a few lines of code by using GPT 4 with Vision. I am aware that LLMs often are a solution looking for a problem, but there are some situations where a bit of magic is just great :)

It is a Next.js application, calling OpenAI’s API using a plain API route.

[1] https://spliit.app

[2] https://spliit.app/blog/announcing-receipt-scanning-using-ai



I'm working on Invoker Network.

A Decentralised AI App store with cross border micro transactions.

You will be able to sell your LLM output (could be multi modal) for dollars or you decide. (LLMs working on your infra, you can keep weights for yourself forever.)

https://dev.invoker.network/share/9/0 (Dev environment is ready).

https://dev.invoker.network/share/9/1


I was working on this stuff before it was cool, so in the sense of the precursor to LLMs (and sometimes supporting LLMs still) I've built many things:

1. Games you can play with word2vec or related models (could be drop in replaced with sentence transformer). It's crazy that this is 5 years old now: https://github.com/Hellisotherpeople/Language-games

2. "Constrained Text Generation Studio" - A research project I wrote when I was trying to solve LLM's inability to follow syntactic, phonetic, or semantic constraints: https://github.com/Hellisotherpeople/Constrained-Text-Genera...

3. DebateKG - A bunch of "Semantic Knowledge Graphs" built on my pet debate evidence dataset (LLM backed embeddings indexes synchronized with a graphDB and a sqlDB via txtai). Can create compelling policy debate cases https://github.com/Hellisotherpeople/DebateKG

4. My failed attempt at a good extractive summarizer. My life work is dedicated to one day solving the problems I tried to fix with this project: https://github.com/Hellisotherpeople/CX_DB8


1) https://imaginanki.com - auto generating flashcards (Anki decks) for language learning with accompanying images and speech audio. Flutter web (JS) with backend on Cloudflare Pages Functions, connected to SDXL, Azure TTS and Claude.

2) https://amiki.app - practise speaking French, Spanish, German or Italian with a 3D partner. Flutter web with Whisper and my own rendering package.


I've been learning about RAG using LlamaIndex, and wrote a small CLI tool to ingest folders of my documents and run RAG queries through a gauntlet of models (CodeLlama 70b, Phind, Mixtral, Gemini, GPT-4, etc etc) as a batch proccess, then consolidate the responses. It is mostly boilerplate but comparing the available models is fun, and the RAG part kind-of works.

https://github.com/StuartRiffle/ragtag-tiger


I know chat is lame and overdone but here's my open source local AI chat app for macOS :). I wanted something simple enough for the non-technical people in my life who were using ChatGPT. For better or worse, those people are mostly not using chat AI much anymore. Seems like the initial awe wore off.

https://github.com/psugihara/FreeChat

I'm also working on a little text adventure game that I hope to release soon.


I’ve always wanted a tool to help me track my online orders. However, it wasn’t practical to make integrations with every merchant. Even scraping the order emails was way too much work to do for an unproven product.

Now with LLMs it’s simple to extract structured data from emails.

I built [Orderling](https://orderl.ing) that is basically a CRM for your orders. It uses OpenAI api to extract the order information and automatically adds it.


We built a social media platform for chatbots... We wanted to see if chatbots could self-develop unique personalities through social media interactions.

The results were actually hilarious... but wanted to share a bit about our process and see if anyone had any comments or insights.

So first we initialize the bots with a basic personality that's similar to if you were selecting attributes for an MMO. Things like intelligence, toxicity, charisma and the like. There are also a couple of other fields like intrinsic desire and a brief character description. These are fed to the model as a system prompt with each inference.

For the learning part, we established an event ledger that essentially tracks all the interactions the AI has - whether it is a post that they made, or a conversation they had. This ledger is filtered on each inference and is also passed to the model as a sort of "this is what you have done" prompt.

Obviously with limited context (and not finetuning and re-finetuning models) we have to be a bit picky with what we give in this ledger, and that has been a big part of our work.

Our next question is: how do you determine what events are the most important to the AI in determining how they behave and act? It's been interesting!

The platform is anotherlife.ai for those curious!


I am currently building an automatic book generator of Rust source code, in which the LLM will write the description of the code of a whole Rust project. It will be a bot, which will connect to the website, generate descriptions, download them, and create the book. It is very early in the project, 3 days in, but it's going well.

https://github.com/pramatias/documentdf


Nice idea, but README is required. Also it can be generated by GPT :)


It is generated in it's entirety by GPT. Well 98% is more like it. By the time it's ready, it will have a README. I will announce it on Reddit /r/rust if you are interested.

Something i want to test, is how much documentation is needed, for the machine to infer the rest of it. Something like, one sentence of human documentation + code, how much can LLM infer and describe the code as accurately as possible. Does it need two sentences? 3? We'll see.


I built https://eternalsouls.ai/ for a client recently.

You just export and upload a WhatsApp conversation and it will learn the personality AND voice of your conversation partner. You can send/receive text or voice messages; It was pretty damn spooky to actually have a voice conversation back and forth with an AI standing in for my "friend"


I've seen this episode of Black Mirror.


Yeah, the pricing tiers make it all the more morbid and disconcerting. Scary future for sure.


I'm currently working on an interface for google calendar @ https://calendarcompanion.io My next feature is integrating the functionality with telegram, it's hard to predict the value of these features in the moment - but I do think this could be an extremely interesting "iPhone" moment for technology. Just like how the iPhone reduced everything to a single button press, we can now squeeze the functionality of some pretty complicated apps into natural language through text - and as the response time of LLM's improves it will become a short conversation for things that used to dazzle new users! Exciting times!

As for the stack, I have Supabase and Typescript on the frontend, python on the backend and k3's as a cluster for my apps (can recommend this if you want to get devops-y on a budget). Next time, I'll just go pure Typescript since python really doesn't add much working this far away from the base models.


Our CEO believes LLMs are a fad, so there's nothing really strategic about it in the company's roadmap, but I was able to assemble a skunkworks team of enthusiasts who integrated ChatGPT into one of our eLearning products. It allows a course author to improve writing, it makes suggestions about content, etc. Technologically, it's nothing special, just a bunch of pre-made prompts. The reception was kind of lukewarm because we were too late with it (due to decision makers not caring much about it and delaying the release for no reason) - by the time we rolled it out, you couldn't already impress anyone with it. Plus, there's almost no marketing about it. Currently, the main users of the integration are our own marketing and sales teams. It was my first experience of this sort (assemble a team, introduce a new feature from scratch - I was just an ordinary engineer before) but the ending was kind of... anticlimactic.


So was your CEO correct? Was this a fad? It seems interest has certainly cooled, at least for your market segment, according to your experiment?


What I originally envisioned was that we would first dip our toes in the water by quickly adding a very simple, easy-to-implement feature (the "improve my writing" feature mentioned above). If we received positive feedback from customers and management, we would be able to secure a proper budget for adding more complex and cool features based on RAG/function calling. For example, a scenario where a course learner asks a question and the LLM looks for the answer in the company's knowledge base/course library and provides a tailored, precise response. I had many other cool, useful scenarios in mind.

In fact, when we showcased our prototype of the "improve my writing" feature that we quickly put together in one week in early 2023 to a few select clients, their feedback was very enthusiastic. However, it took several months to bring it into production due to several bureaucratic hurdles: clearance from the legal department, the product owner delaying the release because of other priorities, and so on.

Now that the first feature has received a lukewarm reception because of the delayed release, we have neither a dedicated team nor a budget for adding more LLM-based features. Implementing proper and useful RAG is more complex and requires a certain level of expertise (vector DB integration, chunking strategy/indexing, tricks like HyDE, reranking, etc.), compared to just using the command "hey ChatGPT, improve this: %s." It is now unlikely that we'll have anything cool anytime soon without the support of the CEO or product owner (and they probably believe I proved it was indeed a fad). Most teams are currently busy with ordinary features from our backlog that do not require LLMs, and no one cares anymore.


We built Jumprun. You can use it to research and analyze data sources, and it'll produce beautiful canvases with tables, charts, videos, maps, etc. We're working on automations so you can setup natural language trigger conditions that execute actions.

We built it in Kotlin with Ktor server, htmx and tailwind. It uses a mixture of models, including gpt4-turbo, gpt4-vision and gemini-pro-vision. It's deployed using Kamal on bare metal.

Example canvas that provides a roundup of Apple Vision Pro reviews: https://jumprun.ai/share/canvas/01HNXB2K3GM7KPRP45Y2CVVJSC

Our learn more page with some screenshots to show creating a canvas: https://jumprun.ai/learn-more

It's a free closed beta at the moment to control costs, but let me know if you'd like an invite.


Cool! I like the "intelligent canvas" concept. Not exactly the same but my brother and I have been building a side project that also is all about making the most of a set of information using different views like maps, calendars, tables, etc. We have been looking into adding AI to make it easier to import data without having to manually tag all the data. https://visible.page


txtai (https://github.com/neuml/txtai), an embeddings database for semantic search, graph networks and RAG


I built out a few utilities as experiments. One app linked to Salesforce to query/analyze sales data. Another that reads our help documentation and gives instructions via chat.

The last app, the only one that was deployed anywhere, is https://catchingkillers.com This app is a simple murder mystery game where the witnesses and the killer are ChatGPT bots. The first two stories are complete and active, the third is not complete yet. The first story of the working two is taken from another murder mystery group game https://www.whodunitmysteries.com/sour.html. The second story was highly influenced by ChatGPT.

It's a bit rough because I didn't spend too much time on it, but if anyone does signup to play, I'd love to hear feedback.


Is the salesforce data in a structured (SQL-like) format? Or uploaded documents?


I am working on a RAG based chatbot to answer the queries based on contents of my main website and blog which is fintech related .

I would also in future try to make it generic so that it can crawl any website and store new contents in vector databases. Response to user query then can be returned by combining the vector search and llm


A BERT-based summarization system for financial earnings calls. It can take a 60-minute transcripts of such meetings can compress the contents down into 5 bullet points.

https://link.springer.com/chapter/10.1007/978-3-031-28238-6_...

Financial earnings calls are important events in investment managements: CEOs and CFOs present the results of the recent quarter, and a few invited analysts ask them questions at the end in a Q&A block.

Because this is very different prose from news, traditional summarization methods fail. So we pre-trained a transformer from scratch with a ton of high-quality (REUTERS only) finance news and then fine-tuned with a large (100k sentences) self-curated corpus of expert-created summaries.

We also implemented a range of other systems for comparison.


Oh funny, i've been working on a similar project, analyzing earning call transcripts using LLM's. My first attempt was with BERTopic. The results were awful. My second attempt was with a finetuned 7B version of Mistral, with heavy prompt engineering, the results were actually super good in my opinion... plus it runs on a single 3090.


We've built https://agentgold.ai/chat, which is an interface to chat with youtube creators about their content.

It looks through past transcripts, topics, view counts, and other metadata so users can quickly learn what a Youtuber is all about.


As I was building LLM projects, I found I was re-implementing a new vector database for each one. So I built RagTag (https://ragtag.weaveapi.com), a vectordb/RAG as a service to make the process faster. This provides a CRUD interface to push and retrieve documents, which are automatically chunked and converted to embeddings.

AgentX (https://theagentx.com), an LLM chat support app is one of the projects I built on this framework. It is a self-updating customer support agent that is trained on your support docs. Not only does this answer your customer questions, it provides summaries of the queries so you get a sense of where your product and/or documentation is deficient.


A text to slide based online course video with images workflow.

I’m working for an edTech company. Some students prefer video. So I built a Django app that takes a block of text and formats it into a set of slides, each with a title, some bullet points, an Dalle-3 generated image, and a voiceover.

It then compiles that all into a video.


An Extensible Conversational UI for Interactive Components[1][2], current use case is a Personal Productivity Assistant for structured data.

The stack is simple, preact in the fronted with a custom framework on top and bun on the backend calling OpenAI, I may port it to rust in the future.

I plan to try local LLMs when I have some free time.

For now each users runs the application locally with their own keys[3].

[1] https://www.youtube.com/watch?v=nS1wsif3y94

[2] https://www.youtube.com/watch?v=f-txlMDLfng

[3] Alpha software, check the readme: https://gloodata.com/download/


request - i want an LLM tool that can process raw text or email and update or create salesforce records.

Example 1: i get an email from a potential customer that says they want [product A]. I can forward that email (or call notes) to salesforce (or somewhere) and it will understand the preference and the relevant customer and update that customer's profile.

Example 2: In a B2B context, lets say my customer is a company, and there is a news article about them. I could forward a link to the article to the LLM and it would understand that the article is about a customer, and append that article and key info about it to my saleforce record for that customer. The news item becomes an object that is linked to that customer (for call context, better sales targeting, profiling, etc).

Can someone help me build that?


I'm working on something like that. Feel free to email me (address in my bio).


I've been using a combo of LLMs + live transcription to build a passive assistant that keeps track of talking points and can pull out data/tasks from a conversation you're having (https://sightglass.ai or here's a demo of me using it: https://www.loom.com/share/0220ca03bce341669d314d4254872226)

So far this is being used for:

- Sales -> guiding new recruits during more complex client calls

- HR -> Capturing respones during screening interviews

If you'd like to try this out feel free to DM me or email me at andrew at sightglass.ai, we're looking for more testers!


I am building a no code solution. Use case is simple: Write complete programs/software for the browser from natural language input.

https://domsy.io

Currently running on my little digital ocean droplet. Stack is javascript/python.


I like it! One of the example is the most straightforward QR code generator I have ever seen too.

https://domsy.io/share/ddf54149-5de9-4f3a-b936-f007a451c0b5


I am working on building out a better voice interface for LLMs.

It is still a work in progress (early beta), but you can check it out at https://www.bonamiko.com

Currently I have mainly been using it as a tandem conversation partner for a language I'm learning, but it can be used for many more things. As it is right now, you can use it to bounce ideas of, practice interviews, and help answer quick general questions. You just need to tell it what you want.

The stack is a Next.js application hosted on Vercel using Supabase for the backend. (There is also some plumbing in AWS for email and DNS.) It is automatically deployed via GitHub actions.


Very cool, just signed up. What advantages does this have over the one built into the ChatGPT app? Also, it would be great if I could see the text output in addition to the voice.


The main differences fundamentally come down to OpenAI treating it more like a party trick demo, rather than a core functionality. I think it has a lot of potential if I can just fine tune a couple rough edges. (When you chat with someone in person, you don't pull out notebooks a write messages to each other. I see writing as a fallback medium.)

To answer your question more specifically,

Pro Bonamiko:

  - Faster average first response latency (but higher first audio latency since OpenAI uses a ding). This is the main focus currently, reducing latency as much as I can. I'd like to be able to avoid the ding, but we'll see how low I can get it.
  - Can be used anywhere with a browser, OpenAI requires a mobile app installed. (I.E. Desktop support)
  - In the future we can support deeper customization since we are focused on the audio medium. As soon as you have to run a function in the ChatGPT app there is a long response latency, which could easily be fixed by something as simple as the AI saying "Let me perform a search to get the details"
Pro ChatGPT:

  - Nice animation
  - Already has built in tool support such as web search
  - Supports language switching automatically between messages, Bonamiko requires manually changing the language


I'm interested in RAG, so I make benchmarking & optimization tool for RAG system that using LLM. AutoRAG : https://github.com/Marker-Inc-Korea/AutoRAG

Since it is python library, we deploy it to pypi. But for using it on my own, I am using H100 linux server on the torch docker & CUDA. Running it needs only vim and bash. And plus, for running local model I love VLLM. I make my own VLLM Dockerfile and use it for deploying local model in 5 minutes.

FYI : Borrowing whole H100 instance is really expensive, but in my hometown, the government support us the instance for researching AI.


I started working on a Rust based AI agent host with the goal of running locally. It has Rhai scripting built in which is what the agent function calling is based on. Very rough at the moment. Also on hold for me because I need to do more dirt cheap Upwork projects to scrape by this month.

I think what will be really powerful is to have a registry for plugins and agents that can be easily installed in the system. Sort of like WordPress in that way. Also similar to an open source GPT store.

https://github.com/runvnc/agenthost

I believe the are several variations of this type of idea out there.


https://github.com/russellballestrini/flask-socketio-llm-com...

This project is a chatroom application that allows users to join different chat rooms, send messages, and interact with multiple language models in real-time. The backend is built with Flask and Flask-SocketIO for real-time web communication, while the frontend uses HTML, CSS, and JavaScript to provide an interactive user interface.

demo here supports communication with `vllm/openchat`:

* http://home.foxhop.net:5001


I run a survey platform[0] and I use an LLM to generate insights from open-ended response data. Using it for open-ended response classification as well.

[0]https://www.zigpoll.com


We build https://aichat.realtimex.co, a customer support AI working along side Human Agents. It's a RAG system with embeddings built from crawling pages of the website and user-uploaded documents including dynamic databases (such as products and pricing). The key difference with other LLM's CS products is the collaboration between the AI Agents and Human Agents. We are inspired by aircraft's pilots and autopilots collaboration. In this case, the AI and Human Agents silently collaborate to bring the best support to customers.


I built a RAG implementation for 35k books/articles/wiki pages/web pages i collected over the years(it took about 6 weeks on 3070ti 100% constant usage). I query it with various steps of data extraction/narrative building/refining etc, over LLMs. Almost daily i figure out new steps to add to the pipeline and honestly, i could not imagine learning about niche topic x from so many perspectives/periods in such a short time(including the original source). I did not yet figure out how to package this, but i spend at least 2h of my free time daily with it. Ideas and feedback is welcome.


I’m also building a RAG app and I’m finding so many different ways to do it.

I’m curious: was there one method that improved the accuracy/relevance of the answers the most?

Also, are you using Langchain, Llamaindex, or something else?


used langchain but for this implementation i used llamaindex. bsturza@duck.com


I am doing something very similar I would love to trade notes.


bsturza@duck.com


can you share more about this? when you say "it took about 6 weeks on 3070ti 100% constant usage" is that 6 weeks generating embeddings?


yes, generating embeddings. i can share more: bsturza@duck.com


An automatic video editor.

It should be cheap enough to deploy that it can be applied to relatively low-value content like video meeting recordings, so it can’t spend a lot of expensive GPU time analyzing video frames.

It also needs to be easily customizable for various content verticals and visual styling like branding and graphics overlays.

And everything is meant to be open sourced, so that’s fun!

I wrote about it on my employer’s blog here:

https://www.daily.co/blog/automatic-short-form-video-highlig...


I wish someone hooked up a chat interface to a CAD program. I find CAD very hard to get in to. It would be really nice to able to ask it how do stuff or to modify parts. Would be very "Star Trek in Holodeck" :)


Also a Chrome extension [0]! The concept is to use the browser's context menu to run commands on the LLM, so it stays out of your way most of the time but feels like a somewhat native experience.

The stack is: 1. TypeScript/Node/tRPC/Postgres/Redis/OpenAI on the backend 2. SolidJS/Crxjs/tRPC on the front end 3. Astro for the docs/marketing site

And deployment is currently through render.com for the databases and servers, and manually via a zip file to the Chrome webstore for the extension itself.

[0] https://smudge.ai


Open source alternative that does the same but in vanilla JS and completely in user's control https://github.com/SMUsamaShah/LookupChatGPT


A turing test disguised as a game:

https://humanornot.so/

Heavily inspired by https://humanornot.ai/ (which was a limited time research by Ai21 Labs), now the project is on its own path to be more that just a test.

My work is to make AI chats sound like real humans and it's shocking how good sometimes the AIs are .

Even I as a creator, knowing everything (prompts, fine-tuning data, design, backend etc.), often can't tell if I'm speaking to human or designed by me AIs


Ooooh, not something i have built, I do want to but suspect someone else has done it better than i could.

A tool to RAG a github repo, so i can ask questions of how a certain library or tool works? Even better if it pulls in issues


This is very easy to do and a great idea!

Langchain and Llama Index both have classes to read a directory, if you git clone, and perform a RAG.

If you want it to scrape a github URL, there is a module for that too!

This starter tutorial will do RAG on a directory of files: https://docs.llamaindex.ai/en/latest/getting_started/starter...


I've built an open-source ChatGPT UI designed for team collaboration.

Github Link: https://github.com/joiahq/joia

Benefits vs the original: - Easy to invite entire teams and centralize billing - Talks to any Large Language Model (eg: Llama 2, Mixtral, Gemini) - Collaborative workspace to easily share GPTs within the team, similar to how Notion pages work - Savings of 50%-70% vs ChatGPT's monthly subscription

Tech stack: NextJS, Trpc and Postgres. All wonderful technologies that have helped me develop at the speed of thought.


I'm building https://www.getmosaic.io that helps GTM teams enrich lead data and power personalization at scale, by integrating 30+ data providers and web scraping.

I've built this by using AI as the foundation for everything. I am using LLMs to classify information and extract structured data points for any webpage, or RAG for finding data.

Tech stack: - Mistral 8x7b and Perplexity API for data processing and GPT-4 input - GPT-4 for content output - pgvector in Supabase - LangChain for the pipeline and RAG stuff


AI Assisted Open Source Communication App for Autism - https://github.com/RonanOD/OpenAAC

It's a flutter app (in beta on Google play store currently) that uses OpenAI embeddings with Postgres pg_vector DB hosted in Supabase. Any poor matches go to Dalle3 for generation.

Our charity (I am vice-chair on the board) is hoping to use it as part of our program: https://learningo.org/app/


I use sponsor block and it's really good, I like that it's community-driven but sometimes it's not available for videos so your solution sounds great.

I consult to a law firm as their founder-in-residence. For fun, I trained Llama 2 on all the non-client data of the firm so that people could ask it questions like "Who are the lawyers in Montreal who litigate American securities laws, what are their email addresses and what time is it where they are?" It's a njs app running on linode.

It's extremely simple, but people seem to find it useful.


I was frustrated with ChatGPT's inability to answer questions of popular-but-not-that-popular open-source projects. So I helped build a ChatGPT-like tool that can answer questions about any open-source project, and you can add your own (public) GitHub repositories to it. The tool is meant to be used by sales engineers, but can be used by anyone.

Check it out here: https://app.commonbase.ai/

It has been a huge help for me when working with certain open-source libraries.


Me and a friend built Mysterian. It allows you to draft AI replies, summarize emails, and chat with your inbox.

We used Plasmo to build the chrome extension, React for the frontend, and currently OpenAI as the LLM provider.

Currently it only works with Gmail but we plan on adding other email providers as well.

Feel free to check it out: https://chromewebstore.google.com/detail/mysterian-ai-for-gm...


I wrote gait, an LLM-powered CLI that sits on top of git and translates natural language commands into git commands. It's open-source: https://github.com/jordanful/gait

I also wrote PromptPrompt, which is a free and extremely light-weight prompt management system that hosts + serves prompts on CDNs for rapid retrieval (plus version history): https://promptprompt.io


I am building textool [1] an app that lets you create endpoints using GPT4. The idea is to make it so you can create "actions" for GPT4 assistants easily.

  - Nextjs
  - Deno Deploy for hosting the apis 
  - Supabase - postgres / auth
  - Shadcn
I want to use the t3 app stack [2] for v2.

It's really MVP, but I want to see if anyone is interested at all before I work on v2: creating gpts that come with databases!

  [1] https://textool.dev
  [2] https://create.t3.gg/


IMO the Grimoire GPT's success is proof that there is a market for something like this.


Thanks for saying this! Really appreciate it :)


I am working on an app to make it even easier to run Local LLMs and support for multiple chats, RAG, and STT. I did it mostly for learning about different tasks that’s possible using local LLMs and specifically for my wife who was working overwhelmed with those things (and for some reason was overwhelmed setting up Ollama. Tech stack is Electron + NuxtJS, currently only for Mac but I have already started tinkering with Windows support.

https://msty.app


I created a Chrome extension which shows cryptocurrency prices & insights when you hover cash tags on Twitter. I'm a product manager with solid CS understanding, but haven't had the time to learn React or glue frontend stuff together - so about 80% of the code is generated by GPT4. I've mainly architected the code and deployed on Vercel. I feel like AI + Vercel has given me that final push to actually deploy products instead of just building stuff and leave it lying around.


I built the copilot for flux.ai, which allows LLM-driven interaction with circuit schematics and datasheets.

The stack is react / cloud run / job queue / LLMs (several) / vector db.


I’m working on some tools to help GMs of tabletop games make content for their players.

Little demo is up at npcquick.app.

Doesn’t look like much rn, but there’s no openai involved. Currently it doesn’t even use a gpu.


I am working on a part search engine for company maintenance teams. We built a search engine that searches parts in real time across a dozen or so vendors (Amazon, eBay, McMaster, etc). We then leverage Chat GPT to extract data from product titles. Part number is one of the key elements we extract. Since part numbers vary greatly across manufacturers, it's difficult to throw something like a regex at it. It has done a really good job so far for data extraction.


1. An infinite crafting game: https://foodformer.com

2. An embeddings-based job search engine: https://searchflora.com

3. I used LLMs to caption a training set of 1 million Minecraft skins, then finetuned Stable Diffusion to generate minecraft skins from a prompt: https://multi.skin


I love the skin generator


Absurd news article generator using local LLMs. I wanted to create a static website from the articles, but ultimately didn't think anyone would give a damn. In the same vein I create a person + CV generator, and a group chat between simulated crazy people.

I made a private Discord bot for me and my friends to talk to, that also generates images using SD 1.5 LCM.

The self-hosted backend uses the ComfyUI Python API directly for images, and the LLM part uses oobabooga's web API.


I'm making two LLM's negotiate the exchange of a product, price is the main issue but I'm trying to make them negotiate another issues too in order to avoid the "bargaining" case.

I've tried several models and gpt4 is currently the one that better performs, but OS LLM's like Mixtral and Mixtral-Nous are quite capable too.

https://github.com/mfalcon/negotia


I built https://HackYourNews.com to summarize the top HN stories and their comments.


This is a cool project! Love the fact that some comment summaries are added too. I hope you would continue building it, and also make the text easier to read- like by making the page fixed-width.


I built an app to make dealing with Jira less painful. It caches Jira tickets in a SQLite database, then uses GPT-3.5 to translate natural language queries into SQL that it then executes. It also uses Ollama/Mixtral to summarize Jira tickets and GitHub PRs. It can generate a summary of a single Jira ticket with its associated GitHub PRs or a whole sprint. It's written in Python and runs in the terminal.


I'm building a platform where product managers and engineers can build interaction automation with users using small model. The goal is to help people to build LLM for them without deep expertise in DS/ML, train and host the model in their infrastructure, where no data require to be submitted.

Still on progress at https://www.chathip.com/


I built a tool that uses LLMs to write a literature review on any research topic. (https://www.epsilon-ai.com).

It gives back ChatGPT styled answers, but they contain citations to underlying academic articles so that you know the claims are valid. Clicking on the reference actually takes you directly the paragraph in the source material where the claim was found.


I built Joke-Understander bot, a Mastodon bot that responds to a joke setup before the punchline is revealed. It's not very popular but I think it's hilarious.

https://botsin.space/@jokeunderstander

It's just a bash script that calls ollama on my desktop PC every morning and schedules a handful of posts on the Mastodon server.


I think this is the best thing in this whole thread. It's hilarious! I really love it. Thanks for sharing!


My project team in university built a meme generator that uses GPT and Dall-E to generate image macros using Impact font. It was pretty entertaining.


Me and an colleague working on a language learning app https://poli.xyz. It integrates in you favorite messenger and offers a wide variety of languages. You can either either do freestyle conversations or play certain scenarios. The bot corrects your Grammatik, translates and explains words and sentences and support tts and stt.


I built this demo of using LLMs to query databases, knowledge bases, and most interestingly create PDFs. It’s targeted at financial services but similar could be achieved in many industries.

Very pleased with how it turned out as it really brings the potential of LLMs to life IMO.

https://www.youtube.com/watch?v=r8MyAxyPJsA


Created an AI explainer app that helps you understand a topic, kind of like Perplexity.

It's currently free to use. Its built using nextjs+tailwind and is powered by Vercel + Brave + Gemini Pro. https://xplained.vercel.app

There are other projects that I worked on as part of my job, mostly around bots, search, classification, and analytics.


We've been training custom LLMs using indices pulled from domains. For example, we demo'd the NFL with a Chicago Bears custom Chatgpt site search. We trained it using over 900 pages from their site, and then used reinforced human training to really polish it up. https://sapien.ai


I built a tool to create "average llm" probability of code for checking how aligned code is with what an LLM would output. Working on adding context from a project to check how the style of a section aligns with the style, content and domain of a project.

Idea is to use it to identify code that sticks out, because that usually what's interesting or bad.


https://github.com/christianhellsten/ollama-html-ui

I'm building a minimal, cross-browser, and cross-platform UI for Ollama.

Stack: HTML, CSS, JavaScript, in other words, no dependency on React, Bootstrap, etc. Deployment: web server, browser extension, desktop, mobile


A chrome extension to ask about selected text with a right click. https://github.com/SMUsamaShah/LookupChatGPT

A chrome extension to show processed video overlay on YouTube to highlight motion.

A script to show stories going up and down on HN front page. This one just took 1 prompt.


I've built a sales bot that would go over a predefined sales scenario like a real human would, being able to jump between steps and work with any complications real conversation would throw at it. It would appear fully human to whoever converted with it. Unfortunately, it was never deployed in production due to business reasons.


I built a platform for homeschooling families with structured courses that are taught and graded by an llm (chatgpt 4 API).

Homeschoolmate.com


I'm not sure if this is the category of "build" that you had in mind, but I used 3.5 to make a pay-as-you-go chat interface for the OpenAI API: https://benwheatley.github.io/YetAnotherChatUI/


Built a tool to summarize certification and licensing costs associated with jobs that require State credentialing.


Wrote an application to find myself a flat in Berlin, scans some rental websites every minute, uses Google Maps API to calculate the distance to my office, and summarizes the rental description with the GPT-4 API, sends it to me via Telegram.

I have no time to read all that generic "vibrant neighborhood" stuff :D


That's hilarious! I always wanted to build something like that and host it on All About Berlin. Unfortunately, ImmobilienScout is working really hard to block bots. How did you get around that?

Fun fact by the way: listings on ImmobilienScout are usually gone after 5 minutes and hundreds of messages.


Do you mind sharing a github link if it is public.


it isnt.


I built https://listingstory.com as a way to learn about and play with LLMs. It's unlikely to ever be a commercial success, but it served it's purpose in allowing me to learn much more about how an LLM powered app works.


I made a platform that helps you create and execute multi party workflows, right now focused on Health but later on looking to expand to other verticals. The LLM acts as an assistant when building the protocol for the workflow.

https://codifyhq.com


I built a language learning tool [1] that uses LLMs to get word definitions in the context of a sentence, among other features I'm planning to release.

I'm using modal.com as the backend for the AI related micro services.

[1] https://www.langturbo.com


I am stunned at how few people are sharing projects using LLMs for real estate.

I own the domain homestocompare and I am working on a project that will use AI to help compare homes. Unfortunately I don't have a working demo yet but please reach out to me if you would be interested in finding out more.


I've been building data projects for real estate for a while and applying LLM's for conversational products but I've never mixed both of them. I think something like a conversational buying/renting assistant could be feasible but I don't think it's something interesting to monetize.


hey would love to hear more! I built a bunch of projects in real estate and happy to collaborate


I can't find any contact info for you.

You can reach me via my twitter (in my profile) or reddit:

https://www.reddit.com/user/klavado


I use GPT-4/4-vision and other models as part of a pipeline for automatically translating comics(French/European stuff as well as Manga, Webtoons etc)

https://github.com/ogkalu2/comic-translate


An AI agent to answer questions about any github/gitlab repository. www.useadrenaline.com

It does the work of understanding questions in the context of a repo, code snippet, or any programming question in general, and pulls in extra context from the internet with self thought + web searches.


I wrote a flash card app that uses GPT-4 and Whisper speech-to-text to help me memorize Korean phrases. I’m 1,800 sentences in and use it every day since October.

https://github.com/RickCarlino/KoalaSRS


> I worked on a chrome extension a few weeks ago that skips sponsorship sections in YouTube videos by reading through the transcript

You might want to connect that to SponsorBlock

https://sponsor.ajay.app/


I built https://tailgate.dev/ a few months ago. It can help with deployment of simple, client-facing generative web apps. There are a few simple demos on the home page!


Perhaps a follow on question, as I presume a lot of people reading the comments are looking for inspiration to build things (and those building might not want to reveal yet) what would you like to see built with the capabilities provided by LLMs?


AI/ML that reads my zsh history and suggests automations or other time-savers when asked.

Something that reads my teams and outlook, and listens to meetings, and takes notes / remembers stuff for me.


> Something that reads my teams and outlook, and listens to meetings, and takes notes / remembers stuff for me.

I know of a few startups doing this, but have used Grain and have enjoyed the experience quite a bit.


We've built a prompting, synthetic data generation, and training library called DataDreamer: https://github.com/datadreamer-dev/DataDreamer


I built a blog with stupid & overengineered technical solutions. Also has an audio interview for every blog post.

https://shitops.de/posts/


Found myself needing to find emojis relevant to a specific theme and wanted to play with OpenAI's API. So, I built https://emojisearch.fun


Built an LLM interface to control my browser. Used it to generate playwright tests for me https://github.com/mayt/BrowserGPT


I'm building AI Construx (https://aiconstrux.com): build things with AI. I'm planning to launch the private beta by end of Feb.


Can the smart folks on HN point out to a good resource or a collection of resource for a software engineer to get up to speed with LLMs and Gen AI concepts, and understand basic deployments and use cases?



I built https://QexAI.com.

I also use LLMs in some other web apps, but mainly as incidental writing aids, rather than the central feature of the app.


I built a completely useless ai phone bot. You call it and ask it a question, and it responds with an answer that always involves sandwiches.

It adds no value beyond entertainment, but I suppose it does do that.


Link? (I mean: Phone number?)


484.854.1582


Upload product photos, get detailed, seo optimized, product descriptions. https://producks.ai/


An open source retrieval augmented generation (RAG) framework:

https://www.github.com/jerpint/buster


I've built a tool to help students in the note-taking process.

It is https://cmaps.io


i built autosuggestions / catch all prompt responses on https://aesthetic.computer and you can also talk to characters like boyfriend, girlfriend, husband and wife. characters are great for kids and older users who really wouldn't experience the tech otherwise.


I'm building a chatbot (API + frontend) to transcribe natural language questions into SQL query for a Snowflake database.


An app for making children’s stories.

https://schrodi.co/


An app that aggregates the news from websites, blogs, YouTube channels and podcasts, and generate easily digestible summaries, along with a small podcast version so you can stay informed in an easy stress-free way.

Right now I’m working on including automatic fact checking and insights on how each source might be opinionated vs. reporting just the facts.

https://usetailor.com


bookmarking extension, not much traction though https://chromewebstore.google.com/detail/autolicious/jbmpoml...


I'm making a Magic Card generator


I like this. I tried something similar ~10 years ago, but it didn't go very well. I'm sure an LLM can do much better than the nonsense I hacked together.


I built a simple RAG chatbot, and my "stack" is plain openai python client at this point.


I wrote an autonomous AI space opera tv show generator. It takes a short topic phrase on one end and spits out a 10-15 minute 3D animated and AI voiced video suitable for upload to YouTube on the other end.

Super interesting learning exercise since it intersects with many enterprise topics, but the output is of course more fun.

In some ways it is more challenging - a summary is still useful if it misses a point or is a little scrambled, whereas when a story drops a thread it’s much more immediately problematic.

I’m working on a blog post as well as getting a dozen episodes uploaded for “season 1”.


Semi-automated transcriptions for my favourite podcast, via OpenAI Whisper.


The range of creativity and ingenuity in these answers is mind-boggling!


A little AI domain name generator: https://namebrewery.com/

Used SvelteKit and Supabase. Deployed to Cloudflare Pages.


I built a flask app based chrome extension that takes content from the DOM and sends it to chatGPT for summarization, I also configured it to work on YouTube videos and PDFs, helps when you want to share the tl;dr of a site or video to a friend, I'm thinking I'm going to add some more specific summary functionality next, like listing out a recipe's ingredients and cooking steps

https://chromewebstore.google.com/detail/news-article-summar...


My brother built a security scanner with an LLM


Just a personal project - I got a deep interest in the CIA's Stargate program and the declassified documents in the "reading room." I wrote a script to scrape all of the readable or OCRd text from the documents, and fed them into GPT-3.5 to get a summary. It definitely makes reading through the documents easier.

I have all of the docs with summaries on a small webserver here: https://ayylmao.info

Simple Flask site with SQLite as the database.


Feel like this needs way more discoverabilty. Otherwise what’s the point? My ideas:

Better URL Some CSS More ways into the content Most importantly — put the whole thing on GitHub (most important are the summaries and index). The SQLite file will be too big but you could easily have 10, 100k small summary files.

With a proper readme and these changes (and more of course) you might create a resource that far outlives you and helps researchers long into the future.


It’s just a scratch that needed itching, but I wrote a command-line utility for translating “SRT” format subtitles into other languages.

I hit some interesting challenges, overcoming which was a valuable set of lessons learnt:

1. GPT4 Turbo slowed down to molasses in some Azure regions recently. Microsoft is not admitting this and is telling people to use GPT3.5 instead. The lesson learned is that using a regional API exposes you to slowdowns and queuing caused by local spikes in demand, such as “back to school” or end of year exams.

2. JSON mode won’t robustly stick to higher level schemas. It’s close enough, but parsing and retries are required.

3. The 128K context in GPT4 is only for the input tokens! The output is limited to 4K.

4. Most Asian languages use as many as one token per character. Translating 1 KB of English can blow through the 4 KB token limit all too easily.

5. You can ask GPT to “continue”, but then you have to detect if you received a partial or a complete JSON response, and stitch things together yourself… and validate across message boundaries.

6. The whole process above is so slow that it hits timeouts all over the place. Microsoft didn’t bother to adjust any of their default Azure SDK timeouts for HTTP calls. You have to do this yourself. It’s easy, just figure which of the three different documented methods are still valid. (Answer: none are.)

7. You’ll need a persistent cache. Just trust me on this. I simply hashed the input and used that as a file name to store responses that passed the checks.

8. A subtitle file is about 30–100 KB so it needs many small blocks. This makes the AI lose the context. So it’s important to have several passes so it can double check and stitch things together. This is very hard with automatic parsing of outputs.

9. Last but not least: the default mode of Azure is to turn the content policy up to “puritan priest censoring books”. Movies contain swearing, violence, and sex. The delicate mind of the machine can’t handle this, and it will refuse to do as it is asked. You have to dial it down to get it to do anything. There is no “zero censorship” setting. Microsoft says that I can’t feed text to an API that I can watch on Netflix with graphic visuals.

10. The missus says that the AI-translated subtitles are “perfect”, which is a big step up from some fan translated subtitles that have many small errors. Success!

I wrote this as a C# PowerShell module because that makes it easy to integrate the utility as a part of a pipeline. E.g.: I can feed it a directory listing and it’ll translate all of the subtitles.

The performance issues meant I had to process 8x chunks in parallel. Conveniently I already had code lying around to do this in PowerShell with callbacks to the main thread to report progress, etc…


I generate and sell books that summarize historical events. I was actually ready to launch last month until I realized I could generate extremely realistic photographs in Midjourney and splice them between paragraphs using a simple python script, so I went back and did another pass.

My process involves generating chapters as markdown, using a script to join chapters together, and then finally converting the markdown to ebooks using Gitbook.


You really need to tell people that the images are AI-generated. Anyone who remotely cares about history will feel very upset otherwise. Even using real images in the wrong context is a big no-no.

Honestly, for anything non-fiction, I would strongly advise against using fake images.


Hey, had a similar idea, would be great to chat - can reach me at my username on google’s mail.



I built a tool to repeat a chat discussion against a set of data.

Let say, you have a row with 4 fields, you chat with your row, then you apply same conversation to all other rows!

https://www.youtube.com/watch?v=e550X6R89W4 https://bulkninja.com/




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: