Bind broker

montecarl · on July 11, 2017

I think this is a response to this: http://adamierymenko.com/privileged-ports-are-causing-climat...

See discussion here: https://news.ycombinator.com/item?id=14712576

akkartik · on July 11, 2017

Also, OP's original comment: https://lobste.rs/s/kzb2xx/privileged_ports_cause_climate_ch...

raarts · on July 11, 2017

Thanks. Another one of Adam's great blog posts that I haven't read yet. His startup is really interesting too.

profwick · on July 11, 2017

Rather than proxying data, why wouldn't you just bind the socket, and then transfer the file descriptor over the UNIX domain socket (using sendmsg/recvmsg)?

Or accept() incoming connections, and then pass each connection's file descriptor.
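For anyone who hasn't seen descriptor passing before, here is a minimal sketch of the idea in Python (3.9+, UNIX only). It is not the author's code: the helper names `send_listener`, `recv_listener`, and `demo` are made up for illustration, and a `socketpair()` inside one process stands in for the UNIX domain socket between the broker and the unprivileged app. `socket.send_fds` and `socket.recv_fds` wrap sendmsg/recvmsg with an SCM_RIGHTS ancillary message.

```python
import socket


def send_listener(channel: socket.socket, listener: socket.socket) -> None:
    """Broker side: pass the bound socket's descriptor over the UNIX socket."""
    # send_fds() needs at least one byte of ordinary data alongside the fds.
    socket.send_fds(channel, [b"fd"], [listener.fileno()])


def recv_listener(channel: socket.socket) -> socket.socket:
    """App side: receive one descriptor and wrap it in a socket object."""
    _data, fds, _flags, _addr = socket.recv_fds(channel, 16, 1)
    if not fds:
        raise RuntimeError("no file descriptor received")
    # Family and type are detected from the descriptor itself.
    return socket.socket(fileno=fds[0])


def demo() -> bool:
    # Stand-in for the broker <-> app channel (hypothetical setup).
    broker_end, app_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    # The broker binds and listens; port 0 is used so the demo needs no privileges.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(8)
    port = listener.getsockname()[1]

    send_listener(broker_end, listener)
    listener.close()  # the receiver gets its own descriptor for the same socket

    received = recv_listener(app_end)
    received.settimeout(5)
    same_port = received.getsockname()[1] == port

    # The app now accepts connections directly; no proxying is involved.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    conn, _peer = received.accept()
    client.sendall(b"ping")
    data = conn.recv(4)

    for s in (client, conn, received, broker_end, app_end):
        s.close()
    return same_port and data == b"ping"


if __name__ == "__main__":
    print(demo())
```

Once the descriptor is handed over, the app sees the real peer address, errno values, and socket options, and the broker is out of the data path entirely.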

tyingq · on July 11, 2017

User space proxying for protocols other than http, though, is a bit tricky. If you aren't exposing things like source ip, listen queue depth, buffer sizes, errno from failures, etc...you are potentially limiting how well it works. Plus the read/write overhead. I'm not convinced this is any better than other approaches. Brokered iptables (or pf, etc.) port rewrites seem cleaner, though they have issues as well.

sigil · on July 11, 2017

> If you aren't exposing things like source ip, listen queue depth, buffer sizes, errno from failures, etc...you are potentially limiting how well it works.

This is a good point.

> Plus the read/write overhead.

The author's code uses OpenBSD's `SO_SPLICE` for zero-copy socket data transfers.

http://man.openbsd.org/OpenBSD-5.4/setsockopt.2

Edit: apparently `SO_SPLICE` can't splice IP sockets onto unix domain sockets yet, so the presented code is kind of wishful thinking. It sounds like this is in the works though.

elFarto · on July 11, 2017

It sounds like you should probably just go the whole hog and give each user their own network namespace and bridge them to the main network. Then they can run DHCP and get their own address and do with it what they like.

That wouldn't really work for Internet-accessible IPv4 addresses, but IPv6 would be fine.

nhaehnle · on July 11, 2017

This is neat. There's a minor race condition at startup because scanhostdir calls scanportdir before watchdir. Reversing the order of calls would close the gap and shouldn't have any adverse effects.
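To make the ordering argument concrete, here is a toy simulation in Python. Nothing in it comes from the author's code: `FakeDir` and `start` are hypothetical stand-ins for a watched directory and the startup routine, and the "racing" entry is created deliberately in the gap between the two calls.

```python
from typing import Callable, List, Set


class FakeDir:
    """In-memory stand-in for a directory that supports scan and watch."""

    def __init__(self) -> None:
        self.entries: Set[str] = set()
        self.watchers: List[Callable[[str], None]] = []

    def create(self, name: str) -> None:
        self.entries.add(name)
        for callback in self.watchers:
            callback(name)  # only watchers registered so far are notified

    def watch(self, callback: Callable[[str], None]) -> None:
        self.watchers.append(callback)

    def scan(self) -> Set[str]:
        return set(self.entries)


def start(directory: FakeDir, scan_first: bool, racing_create: Callable[[], None]) -> Set[str]:
    """Return the entries the startup routine ends up knowing about."""
    seen: Set[str] = set()
    if scan_first:
        seen |= directory.scan()
        racing_create()            # lands after the scan, before the watch: missed
        directory.watch(seen.add)
    else:
        directory.watch(seen.add)
        racing_create()            # lands after the watch: reported by the callback
        seen |= directory.scan()   # duplicates are harmless because seen is a set
    return seen


if __name__ == "__main__":
    for scan_first in (True, False):
        d = FakeDir()
        d.create("80")
        print(scan_first, sorted(start(d, scan_first, lambda: d.create("8080"))))
```

With watch-then-scan an entry can be reported twice (once by the watcher, once by the scan), so the handler has to tolerate duplicates; that is the only cost of swapping the order.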

SwellJoe · on July 12, 2017

For the most common case of lots of users needing ports (web apps), there's the option in nginx and newer Apache versions to proxy to a UNIX socket, so no port is needed for the user app. It's not gonna work for everything but many web app servers are fine with using a socket instead of a port.
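As a rough illustration of the app side of that setup, here is a tiny Python server that listens on a UNIX socket path instead of a TCP port. The handler, the `serve` and `fetch` helpers, and the socket path are all assumptions made for the example; a front-end proxy would be configured to forward to whatever path the app uses, and `fetch` just plays that role here.

```python
import os
import socket
import socketserver
import tempfile
import threading


class Hello(socketserver.StreamRequestHandler):
    """Answers every request with a fixed HTTP response."""

    def handle(self) -> None:
        # Consume the request line and headers up to the blank line.
        while True:
            line = self.rfile.readline()
            if line in (b"\r\n", b"\n", b""):
                break
        body = b"hello from a UNIX socket\n"
        head = b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\nConnection: close\r\n\r\n" % len(body)
        self.wfile.write(head + body)


def serve(path: str) -> socketserver.UnixStreamServer:
    """Bind to a filesystem path; no port number is involved."""
    server = socketserver.UnixStreamServer(path, Hello)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(path: str) -> bytes:
    """Stand-in for the front-end proxy: send one request over the socket."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        s.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        chunks = []
        while chunk := s.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    sock_path = os.path.join(tempfile.mkdtemp(), "app.sock")
    srv = serve(sock_path)
    print(fetch(sock_path).decode())
    srv.shutdown()
    srv.server_close()
```

Access control then comes down to ordinary file permissions on the socket path, which is a large part of the appeal for multi-user hosts.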

There's also http://cyberelk.net/tim/software/portreserve/ for when ports are really needed (not as elegant as the solution under discussion, though, as it just holds a port by binding to it until the rightful owner/service comes along).

zokier · on July 11, 2017

I think these days I would approach the problem by creating per-user network namespaces and hacking the privileged-port limitation out of the kernel (is there a sysctl for that? If not, why not?)

tyingq · on July 11, 2017

For Linux, there's:

- CAP_NET_BIND_SERVICE - Assigned per executable, so it doesn't work for scripts, etc.

- Workarounds like authbind (https://en.wikipedia.org/wiki/Authbind)

- Then, as of kernel version 4.11, you can set where the non-privileged ports start, e.g. "sysctl net.ipv4.ip_unprivileged_port_start=0". That's somewhat helpful in that you could have them start above things like sshd (22) but below port 80. Still not great for multi-tenant, etc., though.
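If you want to check what a given box is set to, the sysctl above is exposed under /proc on Linux. A small Python sketch follows; the function names are made up for illustration, and the 1024 fallback is an assumption covering kernels older than 4.11 or non-Linux systems where the file doesn't exist.

```python
from pathlib import Path
from typing import Optional

# procfs view of net.ipv4.ip_unprivileged_port_start (Linux 4.11+).
SYSCTL_PATH = Path("/proc/sys/net/ipv4/ip_unprivileged_port_start")


def unprivileged_port_start(path: Path = SYSCTL_PATH) -> Optional[int]:
    """Return the first port that needs no privilege, or None if unknown."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def below_threshold(port: int, start: Optional[int]) -> bool:
    """True if binding this port falls under the privileged-port check."""
    # Assumption: fall back to the traditional 1024 boundary when unknown.
    return port < (1024 if start is None else start)


if __name__ == "__main__":
    start = unprivileged_port_start()
    for port in (22, 80, 8080):
        print(port, below_threshold(port, start))
```

For the sshd example, a threshold of 23 keeps port 22 restricted while opening up 80 and 443 to every user on the machine, which is exactly why it doesn't solve the multi-tenant case.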