Doctree

emidoots · on April 30, 2022

Whoa, wasn't expecting to see this on HN! I've only done a week of work on this, super early stages - here are our plans for it:

* 100% OSS tool, run locally on your machine (static Go binary) or use https://doctree.org (not online yet, plug in a repository name, get docs) - really want this to be a proper, useful FOSS tool.

* Work with any language, based on tree-sitter.

* Provide symbol-level search functionality.

* Surface real-world usage examples automatically, probably based on some statistical analysis of how functions are commonly used in open source code via Sourcegraph API, similar to what https://codestat.dev is doing.

Tech details (again, just a week in):

* Go for backend, Elm for frontend

* Indexers will be written in Go, use tree-sitter queries to produce a standard index schema which then gets served to Elm frontend for rendering. https://github.com/sourcegraph/doctree/blob/main/doctree/sch...

Probably not worth trying out right now, but if you're interested in it we set up a Discord server for collaboration, etc.

https://discord.com/invite/vqsBW8m5Y8

Happy to answer any questions!

bradrn · on May 1, 2022

After a week of work, how much of this have you managed to accomplish so far? (Very little, I assume, but I could be wrong.)

emidoots · on May 1, 2022

Quite a lot, actually! It's definitely not ready for prime-time usage or anything, but I feel good about the week of time on it:

* There are some Docker commands you can use to try it out on some Go code right now[0] (still working on getting binary releases for each OS so Docker is not necessary)

* There's a not-too-bad frontend (written in Elm), screenshots in the README are real & it all functions!

* There's an indexer implemented for the Go language[1] that runs tree-sitter queries & emits a basic schema[2], the idea is each language would emit to a common schema like this and then the frontend can serve it, we can index it for search, etc.
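
To make the "each language emits to a common schema" idea concrete, here is a minimal sketch in Go. It is not doctree's actual indexer or schema: it assumes the standard library's `go/parser` as a stand-in for tree-sitter queries, and the `Symbol` and `Index` types and the `indexGo` function are hypothetical names invented for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// Symbol is a hypothetical, language-agnostic record for one documented item.
type Symbol struct {
	Language string `json:"language"`
	Name     string `json:"name"`
	Kind     string `json:"kind"`
	Docs     string `json:"docs"`
}

// Index is a hypothetical container that every language indexer would emit.
type Index struct {
	Symbols []Symbol `json:"symbols"`
}

// indexGo parses Go source and records each function or method declaration
// together with its doc comment (empty if there is none).
func indexGo(filename, src string) (Index, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, parser.ParseComments)
	if err != nil {
		return Index{}, err
	}
	var idx Index
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue
		}
		kind := "function"
		if fn.Recv != nil {
			kind = "method"
		}
		idx.Symbols = append(idx.Symbols, Symbol{
			Language: "go",
			Name:     fn.Name.Name,
			Kind:     kind,
			Docs:     strings.TrimSpace(fn.Doc.Text()), // Text is safe on a nil comment group
		})
	}
	return idx, nil
}

// sample is a made-up source file used only for this demonstration.
const sample = `package demo

// Fetch retrieves the resource at url.
func Fetch(url string) string { return "" }

func helper() {}
`

func main() {
	idx, err := indexGo("demo.go", sample)
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(idx, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

An indexer for another language would fill in the same `Symbol` fields, so a frontend or search layer would only ever need to understand the one JSON shape.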

So, I mean, yeah - just a week into it, but like - you can already view documentation for Go functions in it so moving quickly!

[0] https://github.com/sourcegraph/doctree#try-it-out-extremely-...

[1] https://github.com/sourcegraph/doctree/tree/main/doctree/ind...

[2] https://github.com/sourcegraph/doctree/blob/main/doctree/sch...

bradrn · on May 1, 2022

Impressive, thanks for clarifying!

learndeeply · on April 30, 2022

Why is this open source? Sourcegraph could make this a successful paid, enterprise product.

emidoots · on May 1, 2022

Fair question! Really, the goal is just to find a way to make an actually useful library documentation tool like this that people enjoy using. Beyang (CTO), Quinn (CEO), and I have all wanted something like this for a while and think there's potential for something cool here.

It'll require _a lot_ of iteration to make this work really well, though, and make it something that everyone feels good about using *for their projects*. Don't want there to be barriers to using it; if it were enterprise/paid, it'd be tough to do that.

There will be features/functionality doctree _can_ gain if you connect it to a Sourcegraph instance, but largely because they'd be impossible to do otherwise:

* Usage examples - we need statistical analysis of a large corpus of open source code to find good real-world usage examples, so we'll leverage Sourcegraph for that (it already has that data.)

* Respecting repository permissions, OAuth integration, etc.; very important in enterprise environments, super complex/annoying to do. Sourcegraph already has all this data about your GitHub/GitLab/Bitbucket repos, user accounts, etc. and so maybe you can one-click connect doctree to a Sourcegraph instance to gain this functionality if you're some large enterprise that needs it.

I think there are some great synergistic ways doctree will work with Sourcegraph if you use that (or are OK with it contacting Sourcegraph.com for public code, but very important to make that respectful / opt in.)

I want to be clear, though, doctree is 100% open source, it'll be a proper OSS project - just want to make a useful tool for everyone first and foremost.

learndeeply · on May 1, 2022

Can someone explain why I was downvoted?

emidoots · on May 1, 2022

I interpreted your question as "How does it make sense for Sourcegraph to build this as an open source thing, and not as a paid product?" which I thought was a super reasonable question.

I think "Why is this open source?", though, is what got you down-voted because it implies it should be closed source, when folks obviously prefer open source.

Hope that helps!

jitl · on April 30, 2022

I'm ecstatic that someone is building this.

Figuring out sourcecode-to-apidocs for one language is annoying, and figuring it out in the context of multi-language monorepos is exhausting. Then on top of that, I want to fail the build if someone adds public APIs to a library and doesn't document them! Now I have to go back to all my doc generators and get some kind of metadata out of them??????? And what if I want to make my docs pretty, and link to each other across languages?????? SFLSJHDFKJSDHKJF

So, this is great. A small dream, coming true. Best of luck to y'all!
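
The "fail the build if someone adds public APIs and doesn't document them" check described above can be sketched for a single language. This is only an illustration under stated assumptions: it handles Go functions only, uses the standard library's `go/parser`, and `undocumentedExports` and the sample source are hypothetical names invented here, not part of doctree.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// undocumentedExports returns the names of exported functions and methods
// in src that have no doc comment attached.
func undocumentedExports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	var missing []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || !fn.Name.IsExported() {
			continue // skip non-functions and unexported (private) functions
		}
		if fn.Doc == nil {
			missing = append(missing, fn.Name.Name)
		}
	}
	return missing, nil
}

// librarySource is a made-up library file used only for this demonstration.
const librarySource = `package lib

// Documented is part of the public API.
func Documented() {}

func Undocumented() {}

func internal() {}
`

func main() {
	missing, err := undocumentedExports("lib.go", librarySource)
	if err != nil {
		panic(err)
	}
	for _, name := range missing {
		fmt.Println("undocumented public function:", name)
	}
	// A real CI check would exit with a non-zero status here when
	// len(missing) > 0, which is what would actually fail the build.
}
```

With a shared index schema across languages, the same check could in principle run once over the index instead of once per doc generator.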

brainbag · on April 30, 2022

I've been wanting this to exist for years, ever since using Elixir and the amazing first class doc support. I hope it is super successful!

mountainriver · on April 30, 2022

I’m getting “server not found” when I click on the web page

beyang · on April 30, 2022

I’m helping with the doctree effort at Sourcegraph. Apologies, the site isn’t actually up yet. This project is still very early stages and we wrote up the README to serve as a sort of launch spec that we could update in response to feedback we receive. We made the repo public so we could build in public but didn’t expect it to receive this much attention this early!

So sorry that the site isn’t up yet. We’ll update the README soon to reflect that. If folks are interested in trying out a super early version, there’s the Docker run command and if you’d like to help us realize this vision, please join our Discord! https://discord.gg/vqsBW8m5Y8

bradleyjkemp · on April 30, 2022

And it's not just DNS issues. The domain doesn't even seem to be registered: it's available for purchase...

edit: No longer! Hopefully someone benevolent picked it up

mdaniel · on April 30, 2022

Sweet, a phishing opportunity!

    | Welcome to the doctree.dev demo site!
    |---
    | Enter your github oauth2 token to see it work

emidoots · on April 30, 2022

To be clear, we do own the domain. It was registered about 7 days ago. It's just not deployed yet.

bradleyjkemp · on April 30, 2022

May want to double check that: domain data shows it was only registered today about an hour ago: https://client.rdap.org/?type=domain&object=doctree.dev

It was definitely available to purchase when I commented

emidoots · on April 30, 2022

..huh, yep, you're right. That's.. super embarrassing and a huge screw-up on my part, ugh. I was 100% positive I submitted the order through Namecheap before pushing the repo up to GitHub for this exact reason, and that it went through.. but yeah, looks like I didn't and we don't own it. :(

Good news is we've got doctree.org, so will be using that instead. I've removed all references to the other domain.

If it was a good samaritan, shoot me an email -> stephen@sourcegraph.com

__ryan__ · on April 30, 2022

I love Namecheap, but numerous times I have walked away thinking I’d completed an order, only to find that there was a second order confirmation screen.

emidoots · on April 30, 2022

This was probably it.. luckily, it seems like it may have been a good samaritan from HN, they reached out to me just now. People here rock :)

devass · on April 30, 2022

Probably some developer misusing the .dev domain for their own personal test projects, yet again.

HWR_14 · on April 30, 2022

You mean "some dev deciding to keep using the .dev for personal test projects, as was the standard before ICANN and Google took a standard community resource and privatized it, yet again"

chrisseaton · on April 30, 2022

> as was the standard

Which standard was that? I thought only example.com was special-cased?

HWR_14 · on April 30, 2022

Before ICANN started auctioning off TLDs it was common practice to use .dev and .test (probably others that escape me).

It wasn't formalized, but that doesn't really matter. It was well known and commonly done.

In fact, it couldn't have been formalized, because the TLDs were limited and by definition any non standard TLD was for internal use only. It would make no sense to have a defined standard for an impossible situation.

detaro · on April 30, 2022

> In fact, it couldn't have been formalized, because the TLDs were limited and by definition any non standard TLD was for internal use only.

No, there never was any guarantee that the existing TLDs were all that would exist ever, so non-standard TLDs were just that: non-standard, undefined what happens to them. And you even provided a counter-example: .test is explicitly reserved by an internet standard to never be in public DNS and thus safe to use for testing purposes.

chrismorgan · on May 1, 2022

Using .dev was always contrary to spec. Dunno how common it actually was—I personally never encountered it. Clearly ICANN decided it wasn’t such a hazard as .home and .corp, which are both indefinitely delayed (https://icannwiki.org/Name_Collision) due to their popularity (despite being contrary to spec). You should instead have used something like .localhost (reserved in RFC 2606) if it’s on your local machine, or .test (reserved in RFC 2606) in a local network, or some domain that you control (even if it’s not publicly routable).
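
For reference, the reserved names mentioned above are easy to check for in code. A minimal sketch, assuming only the four top-level domains that RFC 2606 reserves (`.test`, `.example`, `.invalid`, `.localhost`); the `usesReservedTLD` helper and the host names are hypothetical, made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// reservedTLDs lists the top-level domains reserved by RFC 2606.
var reservedTLDs = map[string]bool{
	"test":      true,
	"example":   true,
	"invalid":   true,
	"localhost": true,
}

// usesReservedTLD reports whether host ends in one of the RFC 2606
// reserved TLDs. Matching is case-insensitive and ignores a trailing dot.
func usesReservedTLD(host string) bool {
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	labels := strings.Split(host, ".")
	return reservedTLDs[labels[len(labels)-1]]
}

func main() {
	for _, h := range []string{"myapp.test", "myapp.localhost", "myapp.dev"} {
		fmt.Printf("%-16s reserved=%v\n", h, usesReservedTLD(h))
	}
}
```

Note that `.dev` is not in the list, which is the point of the comment above: nothing ever set it aside for local use.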

my69thaccount · on April 30, 2022

https://github.com/basecamp/pow

Pow used .dev and .test in the 2000s-2010s

chrisseaton · on April 30, 2022

That’s not a standard; that’s just someone using them.

rvieira · on April 30, 2022

Incidentally, I've been getting a lot of timeouts from .dev TLDs today.

mdaniel · on April 30, 2022

Unless "tosh" is one of the project authors, I'd guess this was submitted to HN before an official launch. That is, I bet there are a lot of broken links from the website field in GitHub projects, and the only reason you expected this one to work was because it made the front page of HN and was in Sourcegraph's GitHub org.

tosh · on April 30, 2022

found it via the author’s tweet:

https://twitter.com/slimsag/status/1519910761400741888

mhh__ · on April 30, 2022

https://mir-algorithm.dpldocs.info/mir.ndslice.html is a D-specific approach to something like this. It doesn't use the same approach as the built-in doc generator you get with the D compiler; e.g., everything appears in the docs, and some things are just noted as being undocumented.

patrick451 · on May 1, 2022

This seems like a really slick idea.

Something I've always wanted is better multi-language documentation support. Like, suppose I have a C++ project that is integrated with Python via pybind11. The Python bindings may be the high-level interface, but Sphinx doesn't make it easy (as far as I know) to integrate Python documentation generated from docstrings with C++ Doxygen-style briefs, especially in a way that lets you navigate seamlessly between the two.

I wonder if you have considered a use-case like this?

emidoots · on May 1, 2022

Definitely want to enable this use case better. I think it will be quite easy to have a "you're viewing your project, you search for a function named 'curl fetch' and it turns up both the Python and C++ method named that"-type experience.

Linking between the two may be tougher (showing you with confidence they're related), but maybe possible for us to do something there, not sure yet.
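
A rough sketch of how the "search 'curl fetch' and get both the Python and C++ method" experience could work: split identifiers and queries into lowercase words, then match on those words. Everything here is an assumption made for illustration (the `Symbol` type, the `tokens` and `search` functions, and the sample names); it is not how doctree implements search.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Symbol is a hypothetical index entry: a name plus the language it came from.
type Symbol struct {
	Language string
	Name     string
}

// tokens splits an identifier or query into lowercase words, breaking on
// non-alphanumeric characters and on lower-to-upper camelCase boundaries.
func tokens(s string) []string {
	var words []string
	var cur []rune
	flush := func() {
		if len(cur) > 0 {
			words = append(words, strings.ToLower(string(cur)))
			cur = cur[:0]
		}
	}
	var prev rune
	for _, r := range s {
		switch {
		case !unicode.IsLetter(r) && !unicode.IsDigit(r):
			flush() // separators such as '_', ':', or ' ' end the current word
		case unicode.IsUpper(r) && unicode.IsLower(prev):
			flush() // camelCase boundary starts a new word
			cur = append(cur, r)
		default:
			cur = append(cur, r)
		}
		prev = r
	}
	flush()
	return words
}

// search returns the symbols whose name contains every word of the query.
func search(index []Symbol, query string) []Symbol {
	want := tokens(query)
	var hits []Symbol
	for _, sym := range index {
		have := make(map[string]bool)
		for _, t := range tokens(sym.Name) {
			have[t] = true
		}
		matched := len(want) > 0 // an empty query matches nothing
		for _, t := range want {
			if !have[t] {
				matched = false
				break
			}
		}
		if matched {
			hits = append(hits, sym)
		}
	}
	return hits
}

func main() {
	index := []Symbol{
		{Language: "python", Name: "curl_fetch"},
		{Language: "cpp", Name: "curlFetch"},
		{Language: "cpp", Name: "curlInit"},
	}
	for _, hit := range search(index, "curl fetch") {
		fmt.Println(hit.Language, hit.Name)
	}
}
```

Name matching like this only finds symbols that look alike; it says nothing about whether the Python function actually wraps the C++ one, which is the harder linking problem mentioned above.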

pizza · on April 30, 2022

Interesting - wonder if there's a way to combine this with fzf + some kind of html or e.g. markdown viewer to search and read docs straight in the terminal

emidoots · on April 30, 2022

Since `doctree` will be a static Go binary you can run locally, the idea is to have a "get docs straight in the terminal" experience via the CLI in the future. But we're probably a little ways away from that since we're just a week into it!

qbasic_forever · on April 30, 2022

If the docs are generated from comments in the code a simple ripgrep search of the source (with fzf or other integration) will find things just as well too. Hound is a nice little web UI that does this: https://github.com/hound-search/hound