Show HN: Axilla – Open-source TypeScript framework for LLM apps

mlejva · on Aug 7, 2023

Hey Nick and Ben, congrats to launch! I really like that you're going in the TS way by default. I personally think there will me more AI Engineers (devs building LLM apps/agents) working in TS than in Python.

I wanted to ask if you accept PRs for integrations?

I'm a co-founder of E2B [0]. We give private sandboxed cloud envs to any agent. We're building two things:

- [1] Agent Protocol - it's an open protocol that defines how to communicate with an agent. The current goal is to make benchmarking agents simple (it's used for example by folks at AutoGPT and other popular agents)

- [2] SDK that gives your agent a cloud environment (currently in early access)

Would love to figure out how to integrate these to into Axilla if it makes sense to you. What would be the best way to connect?

[0] https://e2b.dev/

[1] https://github.com/e2b-dev/agent-protocol

[2] https://github.com/e2b-dev/rest-api We built for example our ChatGPT plugin with it https://github.com/e2b-dev/chatgpt-plugin

nichochar · on Aug 7, 2023

We're very open to contributions, I am interested in what the integration would look like.

Do you want to email me at nicholas@axilla.io? we can get into the details.

mlejva · on Aug 7, 2023

Thanks! Just sent you an email (vasek@e2b.dev)

williamzeng0 · on Aug 7, 2023

I'm really excited about E2B. Axilla looks great too! :)

mlejva · on Aug 8, 2023

Thank you Will :)

jumploops · on Aug 7, 2023

We use GPT-4 pretty heavily in a Typescript project, but have noticed lag from the TS versions of popular libraries (OpenAI’s npm lib, Langchain TS, etc.).

This framework is exciting to see. Even though Python is the “language of AI” most foundational models just sit behind an HTTP endpoint, making the web (and thus JS/TS) a perfect fit, as you’ve called out.

It’d be neat to see a caching layer (maybe similar API to evals?) that can be a drop-in for production workflows where the responses are somewhat deterministic.

nichochar · on Aug 7, 2023

Glad to hear this, indeed we think there's opportunity for some more cutting edge tooling in the TS ecosystem.

We absolutely want to add a caching layer. Actually, we think middleware is where a lot of the value of the framework will come: it enables a whole bunch of features, e.g. sending errors to datasets for labeling, caching, user throttling, analytics, A/B tests, ...

We're likely going to build the serving module next which will cover this.

jumploops · on Aug 7, 2023

Very cool. To be completely candid, we just hit OpenAI directly, no 3rd party libs involved at the moment (just fetch).

We're open to trying more TS-focused libraries, but definitely more hesitant after our initial experiences with other libs. The less magic the better (no hidden prompts, etc.).

savy91 · on Aug 7, 2023

Amazing! I am working on projexts where we use LLMs and Typescript and and besides langchain-js which can be only described as bloatware, I can never find anything and find myself reinvent the wheel most of the time.

zarazas · on Aug 7, 2023

Why do you think langchainjs is bloatware?

fswd · on Aug 7, 2023

I've come to the conclusion that anything that "abstracts" the openai complete/chat complete API call is just bad practice and to stay away from the entire framework, with the exception of microsoft guidance. Just because you can, doesn't mean you should. And if you do abstract the completion API, then it must either reduce friction or increase capabilities over just calling openai with http fetch/axios. Which microsoft guidance does this.

benjreinhart · on Aug 7, 2023

Yes, we largely agree with you on that. The APIs are high-level enough that wrapping really doesn't add much value in many circumstances.

We chose to do this for our first module to take a stab at integrating RAG pipelines in a coherent manner, but we don't plan on following this pattern in all modules within our framework. There is possibly one exception here, which is that an interface that allows composable middleware for things like logging, error handling, or redirecting of requests may justify wrapping in some places.

The next steps for us involve lower-level functionality. One need we see again and again is more robust data extraction and processing. Most people we talk to who use other community projects (e.g., langchain or llama) find that data loading and chunking are among the most valuable parts of those libraries. We agree, but would like more robust functionality for these tasks, so this is one thing we're working towards next.

Beyond that, we're working on infrastructure. Easy model serving from Node (for OSS or proprietary models), monitoring, and pipelines for fine-tuning based on production inference results.

jawerty · on Aug 7, 2023

Yea learning how to use the core API directly should be the focus for any engineer. Lots of frameworks built on top of LLMs are being made very quickly each with their own philosophy. It's a good time to stay with the fundamentals as much as possible until the dust settles. Learning langchain will take you less than a day if you have fundamentals don't worry about not staying up to date.

Now's the time to learn how LLMs work from the ground up not being a framework chaser. (watch Karpathy's GPT from scratch video and read through huggingface's LLM documentation from RLHF to PEFT fine-tuning)

victor106 · on Aug 8, 2023

> with the exception of microsoft guidance.

why?

rzmmm · on Aug 7, 2023

Hi, I checked out the demo and it looks very promising. As someone who is not very familiar with AI development, I feel a bit puzzled looking at the code examples. If I use the lib and it sends textual prompts based on some templates, can I be certain that the AI outputs will be well structured and contain the right information? Would it be possible to build an AI model with a lower level, programmable interface...?

BoorishBears · on Aug 7, 2023

My two cents which you are free to ignore (and I almost implore you to ignore): I'm sure this is useful but as someone who works at an AV company and is building with these new generative AI tools... your psuedo-YC story intro kind of puts me off even wanting to look at the library, because it sets you up as grifters.

The only overlap with what you were doing at Cruise and the problems people building off a REST API wrapper for an LLM are running into are things that all software being pushed into a production environment runs into. High level things like "let's not introduce a regression".

I think if you're talking to investors who don't know better go for it. But if you're posting for technical folks, some of them will be completely put off the moment you try to imply working in MLOps at an AV company makes you any more suited to implement RAG than any suitably experienced engineer who's messed around with embeddings for a month.

nichochar · on Aug 8, 2023

Thanks for the feedback!

The lesson that we learned at Cruise is that the tough thing when shipping AI software is closing the data loop and integrating all of the steps of the ML lifecycle together, so if you only look at the RAG workflow, I actually agree with you.

The vision for Axilla is that all of the modules interoperate with each other naturally. This means that your production data gets logged such that the datasets can be sent for data processing, labeling, or added to regression test suites. This way, production and development workflows are tied together.

In terms of RAG: how do you test your RAG workflow in an automated way? A lot of people are building these workflows today, but from our customer conversations nearly none of them are testing them or monitoring their performance automatically in production, because most evaluation frameworks don't integrate naturally with document retrieval.

We have a way to go before the framework delivers on its full potential, but we still feel that it's in a useful enough shape for people to use it and contribute today, which is why we open-sourced it while we keep building it.

BoorishBears · on Aug 8, 2023

I just don't see many people shipping AI software right now in the way AI applies to AVs.

But I'm also giving you my two cents as someone who's not only building in the space, but sitting next to * a lot of the people who will use something like this, and there's some real fatigue building around the flood of tooling for "LLMOps".

* figuratively and literally: checking my past events after the sibling mentioned your YC connection and even you and I have been to at least one mutual AI event

—

At the end of the day I get that as a startup you need to weave stories sometimes: If this was a Launch HN I wouldn't have bothered with my comment and that's kind of what my "ignore this" intro is getting at.

But we went from chatbots to selling shovels in a gold rush as the default AI play in the last couple of months. Most builders will take any excuse to assume any given tool is just another rushed shovel. So you don't want to invite the mental friction of "ML vs AI": at most I'd mention you're former coworkers at Cruise and let the people amenable to that connection make it themselves. For the rest of us even knowing two former coworkers are working on something is enough to build some confidence in its staying power.

iamnafets · on Aug 8, 2023

They are a YC company.

fullstackchris · on Aug 7, 2023

I'm gonna be that guy who will probably show up sooner or later anyway, but... I can't imagine performance can compete with other languages? What were your findings or experience with that?

Still, I'm a huge fan of TypeScript and will give it a try anyway :)

benjreinhart · on Aug 7, 2023

Hey Chris , can you further qualify performance?

Before I share some thoughts on this, let me just say that our primary motivators for Axilla have much more to do with bringing better AI tooling to an otherwise flourishing ecosystem rather than shaving milliseconds off an arbitrary task or request. Given that, I'm not sure how fruitful a performance discussion will be.

If by performance you meant maturity of third party packages for AI-related functionality, then yes JS/TS is lacking. This is what is motivating us :). We want better tooling for AI applications in TS.

If you're referring to performance for CPU-bound tasks, then yes JS would not be as good as lower-level languages like Rust or Go. If you're referring to JS compared to Python, then I don't know how true that is. Python doesn't have a great concurrency story either (at least not today). JS may be single threaded for the most part, but with web workers and WASM (+ WebGPU!), we now have tools at our disposal for dramatically speeding up CPU-bound tasks while not blocking the main thread. Assuming we get the interfaces right, we can swap out a subset of the implementation with a WASM-based implementation later if justified.

There is nothing about Python the language that makes it especially well-suited for AI/ML-related functionality. It is just the language whose ecosystem has the most maturity when it comes to that functionality. We hope to chip away at that over time.

fullstackchris · on Aug 7, 2023

I'm no expert in actual ML implementations but I was under the impression that Python (i.e Tensorflow) is actually C/C++ based under the hood. I just meant I can't imagine the V8 engine can be as performant for all that matrix math in those models.

But now that I'm looking at the actual code samples, I'm not even sure JavaScript is doing any of the actual heavy lifting? (I see you use OpenAI's embedding) so this tool is more of the glue connecting all the parts? Again, I'm out of my wheelhouse here.

benjreinhart · on Aug 7, 2023

Ahh yes, right now we're operating at a higher-level of the stack.

That said, we are investigating serving from Node and possibly on edge devices with WebGPU. For serving from Node, it would be similar to what you describe with Tensorflow compiling down to C/C++. There are various backends for frameworks like Tensorflow, Pytorch, etc. and those backends are often C/C++. We would bridge this lower-level code to Node through e.g. Node API (https://nodejs.org/api/n-api.html) or use frameworks like ONNX / ONNX Runtime.

kwanbix · on Aug 7, 2023

What a horrible name. Why? Just why? You are putting so much love into something to give it a name that is so bad?

dietr1ch · on Aug 8, 2023

Yeah, I'd reconsider this too.

If this sounds too opinionated so far, I'll give you a fact. As a Spanish speaker, this is also a little too close to the word for armpit, and I'd not name my project Armpitt.

cafard · on Aug 8, 2023

It is the Latin for armpit, and related to the (obsolete) English "oxter" for the same.

mattigames · on Aug 8, 2023

I guess that you are not a fan of "put.io" neither.

fosterfriends · on Aug 8, 2023

Congrats on building the library! I’ve recently been playing around with the js implementation of langchain, and I’m excited for there to be more high quality typescript support here.

capitanazo77 · on Aug 8, 2023

Feedback: Axila in Spanish means armpit

astrodude · on Aug 7, 2023

Hi Nick and Ben, Looked at the demo. Great job! I'll be trying it soon!

SwiftyBug · on Aug 7, 2023

"Axilla" in Portuguese means "armpit" (it's spelt "axila"). I like the name more because of this. Congrats on the launch! As a developer who's been working a lot with TypeScript and LLMs, I'll definitely take a look.

whatrocks · on Aug 7, 2023

It’s also the name of a great Phish song!

jw1224 · on Aug 7, 2023

Two great Phish songs!

wbkang · on Aug 7, 2023

Axillary in English also means "of armpit"

giovannylira · on Aug 7, 2023

Same in spanish.

whstl · on Aug 7, 2023

I think that's also the more "technical" term for armpit in english, so probably not an accident: https://en.wikipedia.org/wiki/Axilla

mattsan · on Aug 7, 2023

Yep, comes from Latin! [1]

[1]: https://www.merriam-webster.com/dictionary/axilla#:~:text=Th....

anthk · on Aug 7, 2023

In Spanish too. The usual term it's "sobaco" which came from sub-brachium, under the arm.

SwiftyBug · on Aug 7, 2023

Didn't know that! In Portuguese, we use the word "Suvaco"

carbocation · on Aug 7, 2023

Same in English. (At least, in medicine.)