Show HN: Auto-generate an OpenAPI spec by listening to localhost (github.com/adawg4)
178 points by adawg4 11 months ago | 73 comments
Hey HN! We've developed OpenAPI AutoSpec, a tool for automatically generating OpenAPI specifications from localhost network traffic. It's designed to simplify the creation of API documentation: you just use your website or service as normal, which is especially useful when you're pressed for time.

Documenting endpoints one by one sucks. This project originated from needing it at our past jobs while building third-party integrations.

It acts as a local server proxy that listens to your application’s HTTP traffic and automatically translates this into OpenAPI 3.0 specs, documenting endpoints, requests, and responses without much effort.

Installation is straightforward with NPM, and starting the server only requires a few command-line arguments to specify how and where you want your documentation generated, e.g.:

    npx autospec --portTo PORT --portFrom PORT --filePath openapi.json

It's designed to work with any local website or application without extensive configuration or interference with your existing code, making it flexible across frameworks. We tried capturing network traffic with a Chrome extension, but it didn't catch the full picture of backend and frontend interactions.

In future updates we aim to introduce HTTPS and OpenAPI 3.1 specification support.

For more details and to get started, visit our GitHub page (https://github.com/Adawg4/openapi-autospec). We also have a Discord community (https://discord.com/invite/CRnxg7uduH) for support and discussions around using OpenAPI AutoSpec effectively.

We're excited to hear what you all think!




When you build an API, please start with the OpenAPI specification, before you write any code for your API. It can be iterative, but for every part, just start with the OpenAPI document and think about what you want from the API: what you want to send, and what you want to receive.

It is like the TDD approach: design before build.

Writing or generating tests after you build the code is the same kind of guessing about what it should do. The OpenAPI specification and the tests should tell you what it should do, not the code.

If you have the specification, everyone (and also AI) can write the code for you to make it work. But the specification is about what you think it should do. Those are the questions and requirements that you have about the system.
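To make that concrete, here is a minimal spec-first sketch in Python using the connexion library, which routes requests straight from a handwritten OpenAPI file (the file name and operationId wiring are illustrative assumptions, and the exact API differs across connexion versions):

    # app.py -- serve an API directly from a handwritten spec
    import connexion

    # connexion builds the routing table from the spec, not from decorators
    app = connexion.App(__name__, specification_dir=".")
    app.add_api("openapi.yaml")  # each operationId maps to a Python function

    # e.g. in openapi.yaml:  operationId: api.get_person
    # then api.py provides:  def get_person(person_id): ...

    if __name__ == "__main__":
        app.run(port=8080)

If the spec is the source of truth, the code has nowhere to drift to.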


I get the feeling you may not have gone 0-1 on an API before. In general, you have 1 consumer when you're starting off, and if you're lucky your API gathers more consumers over time.

In that initial implementation period, it's more time-consuming to have to update a spec nobody uses. Maintaining specs separately from your actual code is also a great way to get into situations where your map != your territory.

I'd instead ask: support and use API frameworks that allow you to automatically generate OpenAPI specs, or make a lot of noise to get frameworks that don't support generating specs to support that feature. Don't try to maintain OpenAPI specs without automation :)


How is adding 10-20 lines, depending on how many structures you're creating, and then re-running a generation tool (or simply running a build command again, depending on your build configuration) time-consuming? I've written OpenAPI-first services both at Big Tech, for services handling crazy amounts of RPS, and at tiny seed startups where we release the API and literally nobody uses it but our app. Sure, I've run up against the occasional sharp edge/incompatibility between some form of nested structure and the generator we used, but it was usually a minor diversion and represented 20-30 min of wasted time for the occasional badly behaving endpoint.

I'm even writing a side project now where I'm defining the API using OpenAPI and then running a generator for the echo Go framework to generate the actual API endpoints. It takes just a few minutes to create a new API.


They are advocating for exactly that: "I'd instead ask: support and use API frameworks that allow you to automatically generate OpenAPI specs"


I don’t agree with you. Write a spec, use generators to generate your servers and clients, and use those generated objects yourself.

The point is twofold: you test your API immediately AND you get a ton of boilerplate generated.

So many products out there just feel like a bunch of separate things with a spec slapped on top. Sometimes the spec doesn’t make sense. For example, the same property across different endpoints having a different type.

Save yourself time and do it right from the get-go.

> Maintaining specs separately from your actual code is also a great way to get into situations where your map != your territory.

So yeah, write your spec once and generate all servers and clients from it…


I agree with this as well!

OpenAPI spec seems intended to be consumed, not written. It's a great way to convey what your API does, but is pretty awful to write from scratch.

I do wish there were a simpler language to write it in... something JSON-based as well, that would allow this approach of writing the spec first. But alas, there is not, and I have looked a loooot. If anyone has suggestions for other spec languages I'd love to learn!



oh thanks a lot for sharing - I was looking for something just like this! Something like this + hurl is the perfect combination to sketch out APIs imo


Simpler than YAML?


OpenAPI specs can save weeks even on small projects when you need to autogenerate multiple clients in different languages after the API is ready, btw.


I don't want to write OpenAPI. YAML is a terrible programming language, and keeping it in sync with actual code is always a nightmare.

I've been using a tool to generate OpenAPI from code, and am pretty happy with that workflow. Even if writing the API before the logic, I'd much rather write the types and endpoints in a real programming language, and just have a `todo` in the body.

You can still write API-driven code without literally writing OpenAPI first.


You can write an OpenAPI spec in JSON. You can use Jsonnet to generate your spec from whatever input you need.


JSON is a different kind of yuck to have to author by hand, especially in the volume an API spec tends to run to.


I don't like writing structured formats by hand either - at some point, you either need to, as they say in France, split or get off your seat

Either don't write it by hand, i.e. use a generator for the structured format, as the comments advocate for and the article is about.

Or, just say you'll never have a spec.


Yes, so use Jsonnet or generate it from some intermediate representation using some alternative method. What’s the problem?


Why though when I can just generate it from my actual code and not have to maintain two copies of my api spec?


Jsonnet introduces another (flawed) language?


You are correct about YAML, but OpenAPI is not YAML -- it just commonly uses it for the textual representation. As others mentioned, JSON is an alternative, although it doesn't make it much easier to write the spec directly.

Sadly, there is a distinct lack of tools to make spec-first development easier. At the moment, Stoplight [0] is the only game in town as a high-quality schema editor, but it requires payment for any more significant usage.

[0] https://stoplight.io/


Absolutely, and yes YAML is trash.


Which tool?


I'm thinking that using this tool and having your test suite run through it might work?

At least for people comfortable with test-driven development.

Write your requirements for your API-driven code as tests first, then document those APIs by running the tests through this tool.
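Something like this, perhaps (a hedged sketch: the ports and endpoint are made up, and check the README for which autospec flag is the listen side vs. the forward side):

    # run the app on :8000 and the AutoSpec proxy in front of it, e.g.
    #   npx autospec --portFrom 3001 --portTo 8000 --filePath openapi.json
    # then point the test suite at the proxy so every request gets recorded
    import requests

    BASE = "http://localhost:3001"  # the proxy, not the app itself

    def test_get_person():
        resp = requests.get(f"{BASE}/people/42")
        assert resp.status_code == 200
        assert "name" in resp.json()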


It's going to be very language/framework dependent.

I'm using aide for a Rust/Axum server: https://github.com/tamasfe/aide


100% agree with you... taking the time to go design-first greatly improves the quality of the final API...

But as some comments below point out, an OpenAPI spec is a pain to create manually, which is why TypeSpec from Microsoft is such a great tool. It lets you focus on the important bits of creating a solid API (model, consistency, best practices) in an easy-to-use DSL that spits out a fully documented OpenAPI spec to build against! See https://typespec.io/


What’s wrong with designing an API by writing its code? Code itself is a design tool (and usually any decent programming language is a better design tool than YAML)


As someone who documents APIs: it's easy to tell which APIs were designed with intention and which ones were designed on the fly. In part because it's much, much easier to document the former :)


Unfortunately OpenAPI specs suck to write manually.

Generating OpenAPI spec from the server code has always felt significantly better for me.


I completely agree as a general design principle, but I still think there’s a place for the above tool.

Example: I used to work at a place that had a massive PHP monolith, developed by hundreds of devs over the course of a decade, and it was the worst pile of hacky spaghetti code I’ve ever seen. Unsurprisingly, it had no API spec. We were later doing tonnes of work to clean it up, which included plans to add an API spec, and switch to a spec-first design process (which we were already doing in services split from the monolith), but there was a massive existing surface area to spec out first. A tool like this would’ve been useful to get a first draft of the API spec up and running quickly for this huge legacy backend.


The API library I wrote for my last couple projects required the developer to fill in the OpenAPI spec specifics, and said spec was part of the API itself, making it difficult to add something to the API that wasn't also in the spec.

Incoming request params became validated and cast object properties. Outgoing response params were validated and cast according to spec.

In the end I think it worked really well, and loved not needing to maintain the spec separately. The annoying bit was adjusting the library when the spec changed.

And some gnarly bits of the spec that weren't easy to implement logically.

At any rate, it also made for a similar experience of considering the client experience while writing/maintaining the API.


I prefer going the other direction in practice, autogenerating the spec from the code e.g. with drf-spectacular for Django.


Waste of time imo if you use a framework like fastapi which generates the spec for you


Exactly this. I've been a Python guy, which is apparently not the main language used by most API developers, or what? Is there nothing like FastAPI in JS land? I do start my APIs by writing the OpenAPI spec, only it's written in Pydantic inside FastAPI, and it turns out this also creates the actual API lol.
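For anyone who hasn't seen it, roughly what that looks like (a minimal sketch; the model and route are invented for illustration):

    from fastapi import FastAPI
    from pydantic import BaseModel

    class Person(BaseModel):  # the Pydantic model doubles as the spec's schema
        id: int
        name: str

    app = FastAPI()

    @app.get("/people/{person_id}", response_model=Person)
    def get_person(person_id: int) -> Person:
        return Person(id=person_id, name="Ada")  # stand-in for a real lookup

    # FastAPI serves the generated spec at /openapi.json and docs at /docs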


As a curiosity, how do you feel about languages/frameworks where APIs can be pretty self-documenting? For example, Java/JAX-RS creates pretty self-documenting APIs:

    @Path("/people")
    public class PeopleApi {
        @Path("{personId}")
        @GET
        public Person getPerson(@PathParam("personId") int personId) {
            return db.getPerson(personId);
        }
    }
It's easy to generate a spec for a JAX-RS class because it has the paths, parameters, types, etc. right there. There's a GET at /people/{personId} which returns a Person and takes a path parameter personId which is an integer.

If we're talking about a Go handler which doesn't have that information easily accessible, I understand wanting to start with a spec:

    func GetPerson(w http.ResponseWriter, r *http.Request) {
        // nothing in the signature says what this takes or returns
        personId, _ := strconv.Atoi(strings.TrimPrefix(r.URL.Path, "/people/"))
        person := db.GetPerson(personId)
        body, _ := json.Marshal(person)
        w.Write(body)
    }

    func GetPerson(c echo.Context) error { // or with something like Echo/Gin
        id, _ := strconv.Atoi(c.Param("id"))
        person := db.GetPerson(id)
        return c.JSON(http.StatusOK, person)
    }
In Go's case, there's nothing which can tell me what the method takes as input without being able to reason about the whole method. With JAX-RS, it's easy to reflect on the method signature and see what it takes as input and what it gives back, but that's not available with Go (with the Go tools that most people are using).

This isn't meant as a Go/Java debate, but more a question of whether some languages/frameworks basically already give you the spec you need to the point where you can easily generate an OpenAPI spec from the method definition. Part of that is that the language has types and part of it is the way JAX-RS does things such that things you're grabbing from the request become method parameters before the method is called rather than the method just taking a request object.

JAX-RS makes you define what you want to send and what you want to receive in the method signature. I totally agree that people should start with thinking about what they want from an API, what to send, and what to receive. But is starting with OpenAPI something that would be making up for languages/frameworks that don't do development in that way naturally?

----------

Just to show I'm not picking on Go, I'm pretty sure one could create a Go framework more like this, I just haven't seen it:

    type GetPersonRequest struct {
        Request  `path:"/people/{personId}"`
        PersonId int `param:"path"`
    }
    func GetPerson(personRequest GetPersonRequest) Person {
        return db.GetPerson(personRequest.PersonId)
    }
I think you'd have to have the request object because Go can annotate struct fields (with struct tags), but can't annotate method parameters or functions (but I could be wrong). The point is that most languages/frameworks don't have the spec information in the code in an easy way to reflect on like JAX-RS, ASP.NET APIs, and some others do.


I absolutely hate the approach of scattering routing instructions everywhere via annotations. Nothing beats a router.go file with all the endpoints declared in the same place. Routing annotations are a bad idea that caught on just because it looks clever.

Looking for the handler for `GET /foo/{fooID}/bar` is terrible in a codebase using annotations.


At work they force me to use NestJS. Want to make a new GET endpoint? Find the controller class, add a method, add a get decorator, add an authentication decorator, add a param decorator, add openapi decorators, and if you are feeling helpful, add openapi decorators to every property of every object you take in or return.

I hate decorators so much, just let me use regular data as code.


For the happy path, the Java code works great, but a good OpenAPI spec also includes the following:

- examples, they are a pain to write in Java annotations.

- multiple responses, ok, invalid id, not found, etc.

- good descriptions, you can write descriptions in annotations (particularly post Java 14) but they are overly verbose.

- validations, you can use bean validation, but if you implement the logic in code it's not easy to add that to the generated spec.

See for example this from springfox https://github.com/springfox/springfox/blob/master/springfox...

It's overly verbose, and the generated OpenAPI spec is not very good.


You don't need annotations for descriptions, they get picked up from javadoc-style comments which you should have anyway. Same with asp.net.


You are right, for Spring Boot, the relatively new springdoc supports javadoc[1] as descriptions, which is better than the annotation.

[1] https://springdoc.org/#javadoc-support


Your example doesn't look any worse than an OpenAPI YAML spec, given how easily/frequently you can reach 10+ indentation levels for a trivial spec.

You might be able to add descriptions easily, but expressing types in YAML is much more verbose than in a decently typed language.


swaggest allows you to define your inputs and outputs, and generate docs from them.


I’ve only glanced at the code on mobile, but am I reading this right? It seems like this does pretty much… nothing apart from writing everything to a .har file, and then calls out to a separate library called “har-to-openapi” to do the actual work? https://github.com/jonluca/har-to-openapi


1. What else should it do?

2. I think I like this blunt elevator pitch much better than OP's multiple paragraphs of text...


A nice tool for research, or for documenting third-party APIs. Let's not forget, though, that one of the goals of OpenAPI is to serve as a design and documentation artifact in design-first API development; generating OpenAPI from code or, as in this case, from network traffic, is an interesting complement and something you can use to test the implementation against the design.


It makes sense, and we love API-first companies. How are frameworks prioritizing this? We've seen DRF and Litestar, but this seemed needed to help at places we worked, and in API market reports about companies that hadn't put those standards in yet.


I think pairing this tool with something that recursively clicks through an app would be insanely helpful (the latter is what I have trouble finding).


One slightly related thing you can do is to test the API with schemathesis[0]

[0] https://github.com/schemathesis/schemathesis
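Its pytest integration is roughly this (a sketch: the entry point varies between schemathesis versions, and the URL is hypothetical):

    import schemathesis

    # load the spec the running service exposes, then fuzz every operation
    schema = schemathesis.from_uri("http://localhost:8000/openapi.json")

    @schema.parametrize()  # one generated test per operation in the spec
    def test_api(case):
        case.call_and_validate()  # send the request, check it against the spec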


This seems pretty simple to do. Search the HTML of the main page for anchor tags. Add the links in those tags to an array as your exploration frontier. Once done parsing that HTML, load the next link. Add deduplication to avoid loops and just run a depth-first search. What am I missing?
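As a sketch (static anchors only; requests/BeautifulSoup and the start URL are assumptions, and the replies below explain what this misses):

    # naive DFS crawler over <a href> links, as described above
    import requests
    from urllib.parse import urljoin, urldefrag
    from bs4 import BeautifulSoup

    def crawl(start: str) -> set[str]:
        seen: set[str] = set()
        frontier = [start]          # exploration frontier
        while frontier:
            url = frontier.pop()    # pop from the end = depth-first
            if url in seen:
                continue            # dedup to avoid loops
            seen.add(url)
            soup = BeautifulSoup(requests.get(url).text, "html.parser")
            for a in soup.find_all("a", href=True):
                link = urldefrag(urljoin(url, a["href"])).url
                if link.startswith(start):
                    frontier.append(link)
        return seen

    print(crawl("http://localhost:3000/"))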


For brochure / static content sites this is definitely the beginnings of a web crawler but it can be a lot trickier for web apps.

For example, clicking a link which loads some data, then clicking edit (which isn't even an anchor), typing in & clicking stuff, then clicking the save button (don't click the cancel button!) would not be an interaction that would get picked up with your suggestion. Detecting loops becomes much more ambiguous and backtracking to get all the permutations of interactions becomes a whole other problem to solve.


In many web apps there are going to be buttons and links that are not represented as <a>. You would realistically have to enumerate everything that has any kind of event handler attached since it could potentially trigger an API call.

You would also have to fill and submit forms with valid and invalid data. You would have to toggle checkboxes, change radio buttons, click buttons, (e.g. "Apply filters" after changing values in a product filter section), and generally go through many combinations of inputs to find all valid parameters and possible responses.


Open to ideas! We're thinking of adding agent/crawler suggestions to the GitHub repo if there's a package that clicks around in that fashion.


Forgive the naive questions, but to pair with the GP, some thoughts:

1. Wouldn't this also be helpful in understanding the exact nature of all traffic/calls against a particular page, or a user workflow moving through your site, from a UX perspective?

2. Could one make a proxy from this on a local home egress, such that you could see the nature of outbound network traffic to sites you visit (more importantly, traffic heading to 3rd-party trackers/cookies' APIs via your site visits)?

3. Could it be used to nefariously map open API endpoints against a system one is (white-hat) pen testing?


This is fantastic! I think I'll try to use this to generate the spec for the openlibrary.org APIs. We have a few basic ones now, but it's a huge pain to write. Someone looked into generating them from the Python code but it didn't pan out.


That would be awesome!


I have a similar need but for the FHIR[1] spec, which has its own way of describing RESTful http endpoints that serve FHIR data.

I was looking into how this works for inspiration, and it seems like the work of inferring the OpenAPI definition from recorded requests/responses is handled by the har-to-openapi nodejs library [2]. Is this by the same team? If not, kudos for packaging this up in a proxy -- seems like a useful interface into that library.

1. https://www.hl7.org/fhir/

2. https://github.com/jonluca/har-to-openapi


I thought I recalled something similar to this posted in the past.

https://news.ycombinator.com/item?id=38012032

(5 months ago)


Love what Andrew is doing! We built this for localhost testing, vs. the client side with a Chrome extension.


Another good option for this kind of thing is Optic: https://www.useoptic.com/docs/generate-openapi-from-tests


I wrote a tool for work that does the same thing based on request logs. It would parse each line into a structure, then merge the structures for the same call point down to one spec. It was helpful for seeing the API, but in the end it was not that helpful for backfilling the OpenAPI spec.

Things to consider:

- Junk data from users will show up. Unless your downstream service rejects extra params, users will mess with you.

- It documents each endpoint, but it's harder to say whether this endpoint's "user" data is the same as another endpoint's "user".

- It is hard to tell if users are hitting all endpoint inputs/outputs without manual review.
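The merge step was basically this shape (a toy sketch from memory, not the actual tool; a real version also infers types and required-ness):

    from collections import defaultdict

    # one observed field set per call point (method + path template)
    endpoint_fields: dict[str, set[str]] = defaultdict(set)

    def record(method: str, path: str, params: dict) -> None:
        endpoint_fields[f"{method} {path}"] |= set(params)

    record("GET", "/people/{id}", {"expand": "1"})
    record("GET", "/people/{id}", {"expand": "1", "junk_from_user": "x"})

    # junk params merge in right next to the real ones, as warned above
    print(endpoint_fields)  # {'GET /people/{id}': {'expand', 'junk_from_user'}}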


Doesn't look like this tool is intended to be deployed into production where "junk data from users" would be encountered. My impression is it's a localhost proxy which only ever sees deliberate test traffic from the developer who's running it on their own machine.

(Although I'd be curious to see something very similar to this running in prod and generating WAF rules and/or alerting on suspicious requests. Kinda like Dynatrace or Splunk, but much more aware of the API documentation and expectations.)


Is this comparable to Akita? https://www.akitasoftware.com/

> By watching API traffic live and automatically inferring endpoint structure, Akita bypasses the complexity of implementing and maintaining monitoring and observability.

> […]

> - Export your API description as an OpenAPI spec.

(Not affiliated, nor am I a user of either of these.)


I could see this (or similar) being useful for generating a spec for an old, undocumented, legacy "service"


This looks interesting - it seems like it wouldn't be a huge lift to turn this into something that compares against an existing spec too. Create an AutoSpec, take your defined spec, and spot the difference.
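Even a crude path-level diff would catch a lot (a sketch; the file names are hypothetical, and a real version would also compare methods, parameters, and schemas):

    import json

    # paths your traffic actually exercised vs. paths your spec declares
    observed = set(json.load(open("openapi.autospec.json"))["paths"])
    declared = set(json.load(open("openapi.json"))["paths"])

    print("undocumented:", observed - declared)
    print("unexercised: ", declared - observed)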


Feature request: a browser extension that intercepts HTTP requests for any given website (instead of localhost) and documents their API :)



Very cool. Thank you!


Could this work with previously saved traffic dumps?


Just replay into nc?


Although this feels like an ok-ish approach to just serving API docs to users, you are missing the whole point of specs if you use it. I have started to extensively use a spec-first approach with OpenAPI. You wouldn't believe how many hours it has saved me despite the initial cost of time to get things started. For example: no need to tediously write DTOs ever, just have them generated from the API spec. Need an SDK for API consumers? Just generate one from the spec for almost all the popular languages. While using OpenAPI as just API documentation is fine, it is a waste of the potential that OpenAPI provides.


Reminds me of https://github.com/alufers/mitmproxy2swagger which I discovered from this thread https://news.ycombinator.com/item?id=31354130

I generated some specs from that!

I ran into trouble keeping them up to date.


Super curious - How did you try keeping them up to date?


Boring things like running the proxy while I did manual QA / ran automated tests.

I quickly realized that if I wanted an up-to-date spec then I should do it properly in the application.


This is probably cool and useful, but there's no way to know how much of the API you're covering, right?

I find that the real shortage of tools exists going the other way: from OpenAPI to code. The ecosystem has proven to be a huge disappointment there, comprising a janky collection of defective tools that still (years later) don't support version 3.1 (a critical update).


I think going from code to OpenAPI makes a lot more sense, at least for strongly typed languages. And even if the spec is not directly translated from code, it should at least live closer to the actual code, in annotations or something. Generating the spec from code removes a step: you simply update the code, rather than updating the spec and then updating the code.


Completely agree. Keeping the OpenAPI spec tightly coupled to the types in your code gives you a single source of truth from development through deployment.



