Was just casually (well ok maybe it’s more compulsive than that) browsing HN and was pleasantly surprised to find tus on the front page. I’m one of the core contributors and happy to answer questions. It’s late here though, so replies may take a few hours while I’m asleep :)
People ask that more often, yes; on the surface they have a lot in common. Both can be used to transmit huge files, both can chunk files up and only transmit the remaining parts, pick up and resume at a later point in time, and (in the case of tus, optionally with the Concat extension) send these chunks simultaneously.
Tus however works as a thin layer on top of HTTP, so it’s easy to drop into existing web sites/load balancers/auth proxies/firewalls. BitTorrent ports are often closed off at airports/hotels/on corporate networks. But websites work. And if you can access a website, you will be able to upload files to it with tus.
Another difference is that tus assumes classic client/server roles. The client uploads to the server. Downloading is done via your regular HTTP stack and is not facilitated by tus. BitTorrent facilitates both uploading and downloading in a single client. It is more peer-to-peer and decentralized in nature, whereas tus clients typically upload to a central point (like: many video producers uploading to Vimeo. Not a very contrived example, as Vimeo adopted tus).
There are more differences (Discoverability, trackers, pull vs push, pulling from many peers at once) but the comment is getting very long so I hope this already helps a bit :)
Yes, that is very helpful. Our S3 storage backend for tusd uses it, and our https://uppy.io file uploader does too, directly from the browser (so you can also choose not to use tus at all with it). S3 resumable uploads do come with a few limitations that make some people still choose tus though:
* chunks need to be >5MB, which can be problematic on flaky/poor connections (rural areas, tunnels, clubs/basements, people on the move switching connections all the time); see the sketch below for how tus lets you pick smaller chunks
* your S3 bucket needs to allow writes by the world, or you need to deploy signature authentication
* there’s an S3 vendor lock-in some might worry about
* it’s not an open protocol, so there’s no chance of advancing it with the community
That said, that still leaves a large audience for direct S3 resumable uploads, and I’m thankful AWS offers it!
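To illustrate the chunk size point above: a minimal sketch using tus-js-client, roughly how I remember its options (the endpoint URL and the 512 KB chunk size are made-up example values, not recommendations):

    import * as tus from "tus-js-client";

    // Grab a file from a plain <input type="file"> element.
    const input = document.querySelector("input[type=file]") as HTMLInputElement;
    const file = input.files![0];

    // Resumable upload with chunks well below S3's 5 MB multipart minimum.
    const upload = new tus.Upload(file, {
      endpoint: "https://tusd.example.com/files/",  // placeholder tusd endpoint
      chunkSize: 512 * 1024,                        // 512 KB per PATCH request
      retryDelays: [0, 1000, 3000, 5000],           // resume after connection drops
      metadata: { filename: file.name },
      onError: (err) => console.error("upload failed", err),
      onSuccess: () => console.log("upload finished at", upload.url),
    });
    upload.start();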
That’s a fair point. And I guess with e.g. Minio you could self-host too.
S3 is great and in fact, at Transloadit, we deploy a content ingestion network (a reverse CDN) of many regional tusd servers, close to our customers’ end users, but they all ultimately save to S3 using multipart uploads. We’re happy S3 customers.
So why the extra layer? Because it lets us offer resumability below 5MB and lower regional latencies, roll our own auth, and switch to a different cloud provider without introducing breaking changes on the customer-facing side (in case the new bucket provider does not offer an S3-compatible interface, or only a slightly incompatible one).
Ultimately you’re still locked in with AWS protocol-wise, and there’s no community platform for advancing it, so addressing any of these issues is going to be hard.
If I read the spec correctly, the PATCH method is actually used more as an APPEND, no?
It would seem logical and practical to allow PATCH to modify any part of a resource that is already present on the server and/or to extend it by appending. This would also make the whole thing useful beyond resuming interrupted uploads, e.g. to allow for rsync-style updating of existing files.
Yes, though APPEND is not an official HTTP method. Allowing parts to be modified at any location makes things a little more complex and comes with some overhead. If you do need to upload multiple chunks simultaneously, you can opt into our Concat extension, which does exactly that. Our latest blog post has some images to illustrate this.
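To make the append semantics concrete, here is a rough sketch of the core flow (HEAD to learn the offset, then PATCH to append from there), written against the tus 1.0 headers with plain fetch; the upload URL is assumed to have been returned by an earlier creation request:

    // Resume an interrupted upload per the tus 1.0 core protocol.
    async function resumeUpload(uploadUrl: string, data: Blob): Promise<void> {
      // 1. Ask the server how many bytes it already has.
      const head = await fetch(uploadUrl, {
        method: "HEAD",
        headers: { "Tus-Resumable": "1.0.0" },
      });
      const offset = Number(head.headers.get("Upload-Offset"));

      // 2. PATCH appends the remaining bytes at exactly that offset. Bytes
      //    before the offset are never rewritten, which is why it behaves
      //    like an APPEND rather than a general-purpose PATCH.
      await fetch(uploadUrl, {
        method: "PATCH",
        headers: {
          "Tus-Resumable": "1.0.0",
          "Upload-Offset": String(offset),
          "Content-Type": "application/offset+octet-stream",
        },
        body: data.slice(offset),
      });
    }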
My point is that you appear to be pushing for adoption of an extension that handles one specific use case for PATCH, when a more general extension is trivially possible with little to no extra effort.
(I hope I understand your proposal correctly; I fear I might not, so please clarify if needed, but) more chunks come at the expense of more requests. After a connection drop, each separate chunk needs to be renegotiated and transmitted. For some use cases that trade-off is well worth it, like when latency is low but TCP settings or QoS policies won’t let you saturate a single connection, so tus does offer sending multiple chunks in parallel, as an opt-in, via the Concat extension.
If your question is why not make Concat the default mode of operation, the additional roundtrips are the reason. For fragile connections these are often very costly, and we want tus to really shine in those situations, by default. If your users are all operating on big tubes, you’ll likely want to deploy Concat, but that’s not an assumption we want to make.
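For the curious, roughly what Concat looks like on the wire: each part is created as a ‘partial’ upload and uploaded independently (in parallel if you like), and a final creation request then stitches them together. The endpoint and helper names below are mine, not from the spec:

    // Create one partial upload; its bytes are sent with normal PATCH requests.
    async function createPartial(endpoint: string, length: number): Promise<string> {
      const res = await fetch(endpoint, {
        method: "POST",
        headers: {
          "Tus-Resumable": "1.0.0",
          "Upload-Concat": "partial",
          "Upload-Length": String(length),
        },
      });
      return res.headers.get("Location")!; // URL of this partial upload
    }

    // Once all partial uploads are complete, ask the server to concatenate
    // them, in order, into the final upload.
    async function concatenate(endpoint: string, partialUrls: string[]): Promise<string> {
      const res = await fetch(endpoint, {
        method: "POST",
        headers: {
          "Tus-Resumable": "1.0.0",
          "Upload-Concat": `final;${partialUrls.join(" ")}`,
        },
      });
      return res.headers.get("Location")!; // URL of the concatenated upload
    }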
The HTML5 File API has been around for a few years now, yet a lot of sites don't support resumable uploads. I know it adds a bunch of complexity server-side, as you have to restitch those pieces together, but it makes for a good user experience.
I hope that with a client like https://uppy.io and a server like tusd, it’s much more manageable these days. Less boilerplate to write and more battle-tested components, for sure.
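A minimal sketch of what that client side can look like with Uppy’s tus plugin (the endpoint is a placeholder, and this skips the CSS and any restrictions you’d normally configure):

    import Uppy from "@uppy/core";
    import Dashboard from "@uppy/dashboard";
    import Tus from "@uppy/tus";

    // Browser uploader with resumable uploads via the tus protocol.
    const uppy = new Uppy()
      .use(Dashboard, { target: "#uploader", inline: true })
      .use(Tus, { endpoint: "https://tusd.example.com/files/" });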
Slightly off-topic: why, after so many years, do Chrome & Firefox have such poor support for resuming interrupted file downloads? In the case of Firefox I am almost sure it was better in the past. I have to use 'wget -c' or https://www.freedownloadmanager.org/ for bigger files.
As I suspect you may already know, this is dependent on the server 1) indicating support for byte range requests and 2) correctly implementing it.
I don't think I have noticed Firefox getting worse at this over time, but I'm not downloading large files every day. Would you be willing to share where you're noticing this?
It depends on the server, which has to implement HTTP range requests [0]. Servers like nginx and Apache 2 should support it. I'm not certain about all the Node.js and Go backends out there. I don't think the support in Firefox has changed.
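For reference, resuming on the client side is just a Range header; a rough sketch with fetch, where the `partial` blob standing in for the bytes already downloaded is an assumption for illustration:

    // Resume a download from where it left off using an HTTP Range request.
    async function resumeDownload(url: string, partial: Blob): Promise<Blob> {
      const res = await fetch(url, {
        headers: { Range: `bytes=${partial.size}-` },
      });
      if (res.status !== 206) {
        // Server ignored the Range header (no byte-range support),
        // so it sent the whole file again; just use that.
        return res.blob();
      }
      // 206 Partial Content: append the missing tail to what we had.
      return new Blob([partial, await res.blob()]);
    }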
Love the work that Janko is doing in our ecosystem! There are implementations for most major languages. So a tus server could even just be some PHP code that you install with Composer and add to your existing Apache setup.
Zawinski's Law needs some revision. Not only do WWW apps expand until users can chat asynchronously, but WWW protocols expand until they incorporate ZMODEM. (-:
We are discussing this very topic here https://github.com/tus/tus-resumable-upload-protocol/issues/... — it has stalled a bit so I would be very happy to see you or other interested/concerned HN readers weigh in. People sharing concerns on GitHub is the main way the protocol has progressed.
> An origin server that allows PUT on a given target resource MUST send
> a 400 (Bad Request) response to a PUT request that contains a
> Content-Range header field (Section 4.2 of [RFC7233]),
Responding with 400 Bad Request is actually something that was added after some servers allowed Content-Range on PUT and others didn't.
It was never standard, but the end-result was that some clients assumed PUT + Content-Range would work, which meant that some servers would apply the change while others would ignore the header and overwrite the entire resource with the chunk.
There's no sane way to add support for this header and make older servers behave correctly, so now we have better facilities for this.
The standard way is to use PATCH + a mimetype that describes the update + perhaps using Accept-Patch to find out what formats are available. It's extremely doubtful that Content-Range for PUT will ever be standard. If there's going to be a future standard, it's likely PATCH based.
It could be possible with PUT and a new 'Expect' header, but not sure if that gives any advantages now over PATCH.
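To illustrate what a PATCH-based flow could look like: Accept-Patch (RFC 5789) is real, but the application/example-byte-range media type below is made up for the sake of the sketch; nothing like it is standardized:

    // Discover whether the server accepts a (hypothetical) byte-range patch
    // format, then send one chunk at a given offset.
    async function patchRange(url: string, offset: number, total: number, chunk: Blob) {
      const opts = await fetch(url, { method: "OPTIONS" });
      const formats = opts.headers.get("Accept-Patch") ?? "";
      if (!formats.includes("application/example-byte-range")) {
        throw new Error("server does not advertise a byte-range patch format");
      }
      await fetch(url, {
        method: "PATCH",
        headers: {
          "Content-Type": "application/example-byte-range",
          // Hypothetical: where in the resource the body should be written.
          "Content-Range": `bytes ${offset}-${offset + chunk.size - 1}/${total}`,
        },
        body: chunk,
      });
    }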
> There's no sane way to add support for this header and make older servers behave correctly, so now we have better facilities for this
KISS. Endow the "400 Bad Request" server response with a special header that acts like a cookie or nonce, with the semantics "this server does support Content-Range uploads and won't corrupt your resource". If the client resends the PUT + Content-Range request with the correct cookie/nonce added to it, it has acknowledged this semantics in turn, and the upload can now go through. This adds a roundtrip, but it's still trivial compared to what's being proposed here, and keeps the semantics of PATCH open for more complicated cases.
Or do a HEAD on the resource you want to resume uploading (this is recommended anyway to find out how many bytes have actually gone through), and if the response contains an "Accept-Ranges: bytes" header then the client can resume the upload.
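A sketch of the handshake this describes, with fetch; the X-Resume-Nonce header name is invented purely for illustration and nothing like it exists today:

    // Proposed two-step PUT: the first attempt learns whether the server
    // really understands Content-Range, the second acknowledges it.
    async function putWithResume(url: string, offset: number, total: number, chunk: Blob) {
      const headers: Record<string, string> = {
        "Content-Range": `bytes ${offset}-${offset + chunk.size - 1}/${total}`,
      };
      let res = await fetch(url, { method: "PUT", headers, body: chunk });
      const nonce = res.headers.get("X-Resume-Nonce");
      if (res.status === 400 && nonce) {
        // Server signalled "I support Content-Range on PUT and won't corrupt
        // your resource"; resend with the nonce to acknowledge in turn.
        res = await fetch(url, {
          method: "PUT",
          headers: { ...headers, "X-Resume-Nonce": nonce },
          body: chunk,
        });
      }
      return res;
    }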
You are right about those things, and some of the proposed solutions address them (that one and others).
It looks to me like PATCH is actually better; perhaps one of the patch formats could be a partial patch, for example if the Content-Type of the PATCH request is application/partial-content-patch then the first line of the body is the contents of the Content-Range header. In my opinion, this looks better than the other replies to the message that this message is in reply to (although I admit anything I write may be mistaken; I am not perfect).
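A tiny sketch of how a client could build that body; application/partial-content-patch is the hypothetical media type from the comment above, not anything registered:

    // First line of the body carries what would otherwise be the
    // Content-Range header, followed by the raw bytes for that range.
    function buildPartialContentPatch(offset: number, chunk: Uint8Array, total: number): Blob {
      const rangeLine = `bytes ${offset}-${offset + chunk.length - 1}/${total}\n`;
      return new Blob([rangeLine, chunk]);
    }

    // The request itself would then be:
    //   PATCH /the/resource
    //   Content-Type: application/partial-content-patch
    //   <body from buildPartialContentPatch>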
Yes, for browsers it’s cheaper to build upon HTTP, and it lets you move through airport/hotel/corporate firewalls without problems.
Tus is also used in datacenters for high-throughput & reliable transmissions. Probably in most cases rsync is a sensible choice, but sometimes maybe you already have tus, HTTP-based auth, load balancing, etc. in place that you want to leverage, or maybe you want to avoid exchanging SSH secrets.