
> The site host doesn't have to participate in IPFS for this to work.

Really?

An IPFS user will scrape HTTP content, republish it over IPFS and update a directory of where non-IPFS content can be found over IPFS?

I must have missed that part.



Yeah - there's a ton of immutable URLs on the web. All of CDNJS/JSDelivr/Google Hosted Libraries, most of raw.github.com, all Imgur & Instagram images, all YouTube video streams (excluding annotations & subtitles), and all torrent file caching sites (like itorrents.org). There are probably some large ones I'm forgetting about, but just mapping immutable URLs to IPFS could probably cover 1/3rd of Internet traffic.

Check out https://github.com/ipfs/archives to learn more.
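
As a rough illustration of that URL-to-CID mapping idea (this is just a sketch, not the ipfs/archives tooling itself): mirroring an immutable URL into IPFS is basically a fetch plus an `ipfs add`. The URL list and the `mirror` helper below are made up for the example, and it assumes a local go-ipfs daemon with the `ipfs` CLI on the path.

    # Sketch: fetch an immutable URL, add the bytes to a local IPFS node via
    # the `ipfs` CLI, and keep a URL -> hash map that could later be published
    # somewhere discoverable. Example URL only; assumes `ipfs` is installed
    # and a daemon is running.
    import json
    import subprocess
    import tempfile
    import urllib.request

    IMMUTABLE_URLS = [
        "https://cdnjs.cloudflare.com/ajax/libs/jquery/2.1.4/jquery.min.js",
    ]

    def mirror(url):
        """Download one immutable URL and return its IPFS hash."""
        data = urllib.request.urlopen(url).read()
        with tempfile.NamedTemporaryFile() as tmp:
            tmp.write(data)
            tmp.flush()
            # -q prints only the resulting hash
            cid = subprocess.check_output(
                ["ipfs", "add", "-q", tmp.name]
            ).decode().strip()
        return cid

    if __name__ == "__main__":
        mapping = {url: mirror(url) for url in IMMUTABLE_URLS}
        print(json.dumps(mapping, indent=2))

The easy part is the add; the hard part is publishing and discovering that URL-to-hash directory, which is the piece the grandparent was asking about.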


IPFS archives look like something entirely different from what the grandparent was talking about.

Sure, you can manually publish archives over IPFS, but that doesn't automatically create an IPFS-cached copy of whatever you happen to be browsing.


IPFS Archives is the effort currently underway to archive sites onto IPFS. Eventually there will be a system for automatically scraping & re-publishing content on IPFS.


Fair enough. Somebody will eventually have to do a lot of work to get all that done, then.


Right now storage space and inefficiencies in the reference IPFS implementation are the biggest problems I've hit. Downloading sites is easy enough with grab-site, but my 24TB storage server is getting pretty full :( ... Gotta get more disks.
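
For reference, getting a finished crawl into IPFS is the easy bit; something like the following is enough (sketch only: the crawl directory path is made up, and it assumes a running go-ipfs daemon).

    # Sketch: recursively add a grab-site output directory to IPFS and keep
    # the root hash so the snapshot can be pinned/announced elsewhere.
    import subprocess
    import sys

    def publish_crawl(crawl_dir):
        """Add a crawl directory to IPFS and return the root directory hash."""
        # With -r -q, `ipfs add` prints one hash per file; the last line is
        # the hash of the top-level directory.
        out = subprocess.check_output(["ipfs", "add", "-r", "-q", crawl_dir])
        return out.decode().splitlines()[-1]

    if __name__ == "__main__":
        root = publish_crawl(sys.argv[1])  # e.g. ./my-crawl (hypothetical path)
        print("snapshot root:", root)

The disks filling up is the part that doesn't have a one-liner.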


Say you grab a site. How do you announce that fact, verify that it is an unmodified copy, sync/merge/update copies and deduplicate assets between different snapshots?



