> Why not have static analysis tools on the other side of those generations that constrain how the LLM can write the code?
We do have such tools: we call them programmers, and without them you don't get much useful output at all. Beyond that, static analysis tools aren't powerful enough to detect the kinds of problems and issues these language models create.
I'd be interested to know the answer to this as well. Considering the wealth of AI IDE integrations, it's eyebrow-raising that there seem to be zero instances of this. It seems like fairly low-hanging fruit to rule out tokens that are clearly syntactically or semantically invalid.
I’d like to constrain the output of the LLM by accessing the probabilities for the next token, picking the candidate with the highest probability that is also valid in the type system, and using that. Originally OpenAI did expose next-token probabilities, but apparently that made it easy to steal the weights, so they turned the feature off.
This can be done: I gave mine a justfile and early in the project very attentively steered it towards building out quality checks. CLAUDE.md also contains instructions to run those after each iteration.
What I'd like to see is the CLI's interaction with VSCode etc extending to understand things which the IDE has given us for free for years.
Previously at Sprout Social where I built their ML inference platform - reduced deployment time from 6 months to 6 hours and cut AWS costs by $500K/yr.
Looking for interesting problems in AI infrastructure, performance optimization, or building products from scratch.
Location: Wisconsin
Remote: Yes
Willing to relocate: Yes
Technologies: Python, PyTorch, Kubernetes, Docker, AWS, FastAPI, ONNX, MLOps
Resume: https://drive.google.com/file/d/1qO8XdisNTFq_wmrQGDKnu6eWDi2g33Me/view
GitHub: https://github.com/Mockapapella
Email: In profile or on resume
AI/ML Engineer specializing in high-performance deployments. Built distributed systems handling 30K QPS, developed a neural network for Rocket League gameplay, and created platforms that cut model deployment time from 6 months to 6 hours. Saved $500K/yr in infrastructure costs through optimization at previous role. Former technical founder with experience in humanoid robotics and AI writing assistance. I write about my projects and musings on my blog: https://thelisowe.substack.com/
Seeking roles focusing on ML infrastructure, model optimization, post-training, or full-stack AI engineering.
This is a good article on the "fog of war" for GPU inference. Modal has been doing a great job of aggregating and disseminating info on how to think about high quality AI inference. Learned some fun stuff -- thanks for posting it.
> the majority of organizations achieve less than 70% GPU Allocation Utilization when running at peak demand — to say nothing of aggregate utilization. This is true even of sophisticated players, like the former Banana serverless GPU platform, which operated at an aggregate utilization of around 20%.
Saw this sort of thing at my last job. Was very frustrating pointing this out to people only for them to respond with ¯\_(ツ)_/¯. I posted a much less tactful article (read: rant) than the one by Modal, but I think it still touches on a lot of the little things you need to consider when deploying AI models: https://thelisowe.substack.com/p/you-suck-at-deploying-ai-mo...
Honestly I thought you guys had launched already (and didn't know you were a part of YC), been aware of you guys for years now it seems. Congrats on the launch! Hope the twitter issues aren't causing you guys too many problems.
Normally I'd send this as a DM or email, but I think it could be useful for others to learn about how to use your service/the limitations of it. A couple weeks ago I made a search for:
In early 2023, Andrej Karpathy said something like "large training runs are a good test of the overall health of the network." Something something resilience as well I think. I need you to find it.
Unfortunately it wasn't able to find it, but it was either in a tweet or a really long presentation, neither of which are good targets for search. It was around the same time that this (https://www.youtube.com/watch?v=c3b-JASoPi0) video was posted, like within a couple weeks before or after. How could I have improved my query? Does exa work over videos?
All of this is just stuff I kind of made up and wanted in the song, but it meaningfully improved the output over just tags. I think "steering/nudging the generation space" is a decent idea for how I feel like this affects the output.
I also often use them to mark song structure with tags like [intro], [break], and [chorus], and sometimes get more descriptive with them, describing things or moments I'd like to happen. Again, adherence is not perfect, but they seem to help steer things.
One of my favorite tags I've seen is [Suck the entire song through vacuum] and well... I choose to believe, check out 1:29 https://suno.com/s/xdIDhlKQUed0Dp1I
Worth playing around with a bunch, especially if you're not quite getting something interesting or in the direction you want.
Brackets such as [Verse] help provide waveform separation in the edit view so that you can easily edit that section without manually dragging the slider.
Others such as [Interrupt] will produce a DJ-like fade-out / announcement ("that was <Artist name>, next up...") / fade-in, providing an opportunity to break the AI out of the repetitive loops it obsesses over.
I've used [Bridge] successfully, and [Instrumental] [No vocals] work reliably as well (there are also instrumental options, but I still use brackets out of habit I guess).