I have a truly naive question about the distributed web: what makes the supporters of it think it will be any different from the original web? I mean, isn't it likely that at some point there will be a need for a centralized search engine for it? Isn't it unavoidable that big companies like Facebook will run their own non-distributed subnetworks, so that they can deliver standard functionality to all their users? The original web IS distributed already, isn't it? It's just that, organically, the way people use it has become a lot more centralized, no? Or am I missing the main argument for a distributed architecture?
The difference is that IPFS (or something similar) would distribute content. Currently communication is distributed, but content is not. IPFS is like HTTPS, but where every page is a bittorrent. There are challenges for security in distributing services and monetization. However, the key is that content would be distributed. YouTube probably would not switch to IPFS, but DNS (which was targeted recently) would make sense. A DNS host list should be more distributed. Even with DNS there is a dynamic content challenge: when a DNS entry changes, it still needs to propagate.
That brings it back to a distributed communication problem. If you could distribute content more consistently, so that 100,000 computers could step in as DNS providers with authentic data rather than concentrating DNS at Dyn or Google, it would help. That would still require protocol updates so that the failover uses local content. Really, IPFS and distributed content could be one component to help with distribution. It definitely isn't a complete solution on its own (yet -- there is a push to make it the solution).
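To make the "authentic data from anyone" part concrete, here is a toy Python sketch of content addressing. It is not how IPFS or DNS actually encode records; the record format and peer names are made up for illustration:

    import hashlib, random

    def record_id(record: bytes) -> str:
        # Content address: a hash of the record itself. Real IPFS uses
        # multihash CIDs over a Merkle DAG; a flat SHA-256 digest shows the idea.
        return hashlib.sha256(record).hexdigest()

    # A host-list style record, cached by a large number of independent peers.
    record = b"example.com. A 93.184.216.34"
    rid = record_id(record)
    peers = {f"peer{i}": {rid: record} for i in range(100_000)}

    # Resolution no longer cares *who* answers, only that the answer matches the hash.
    answer = random.choice(list(peers.values())).get(rid)
    assert record_id(answer) == rid  # a tampered answer would fail this check

The unsolved part, as noted above, is propagation: when the record itself changes, you still need some way to learn the new hash.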
With the web going more mobile, and with less storage space, why would anyone want to be a node in such a network, where you have to cache content for others?
Well, apart from battery: if I were an ISP, I'd like it if a YouTube movie would flow from one person's mobile to the hotspot of the train said person is sitting on, and on towards another person on the train. This scenario would require significantly less bandwidth overall. The same goes for neighbors accessing the same information. And this can be extrapolated to many situations.
The IPFS powered internet would perhaps lead to more uploads but it would scale significantly better and will be cheaper to maintain. Hence cost can also go down for end users/consumers.
... and that's pretty much the deal killer for this on mobile, even ignoring everything else.
> If I were an ISP, I'd like it if a YouTube movie
... except you are not allowed to download videos from YouTube or most (if not all) the popular video content services.
> a YouTube movie would flow from one person's mobile to the hotspot of the train said person is sitting on, and on towards another person on the train.
This would only work as long as person A is running IPFS, is connected to the hotspot, and has cached content somebody else is concurrently interested in. The hotspot is very unlikely to run IPFS or to have any storage, so cache hit ratios would not only be low, they would also be dependent on person A's transit schedule.
> This scenario would require significantly less bandwidth overall.
No, the same amount of bandwidth would be consumed, but perhaps over a cheaper radio bearer.
> The IPFS powered internet would perhaps lead to more uploads but it would scale significantly better and will be cheaper to maintain.
Somehow I doubt that. Total costs would most likely go up, but they would perhaps be more spread out.
> Hence cost can also go down for end users/consumers.
Consumers don't spend anything for accessing content on the Internet, so I really doubt there are any cost savings to be had.
> ... and that's pretty much the deal killer for this on mobile, even ignoring everything else.
It is, for now, indeed.
> Except you are not allowed to download videos from YouTube
I think youtube would be interested in some load balancing if they could keep the income from commercials. Wait, what, there wouldn't be a need for YouTube. We'd only need a way to pay content creators built into the system (Ethereum coupling somehow?? I'm not sure, can views be tracked in IPFS?).
> This scenario would require significantly less bandwidth overall.
The same number of bits are pumped around, but they have to cover significantly less physical distance and pass through fewer hubs. This reduces bandwidth.
> Somehow I doubt that. Total costs would most likely go up, but they would perhaps be more spread out.
Me pumping bits from my neighbor's house to mine, instead of both of us pulling them over the backbone from some server in a central location, requires less (expensive) infrastructure between me and that central location.
> Consumers don't spend anything for accessing content on the Internet
indirectly they (we) pay for the copper and the fiber.
> I think youtube would be interested in some load balancing if they could keep the income from commercials.
Firstly, it's not up to Google. Secondly, how would that even work?
> Wait, what, there wouldn't be a need for YouTube.
Yes, there would. YouTube isn't the solution to a technical problem.
> We'd only need a way to pay content creators built into the system
Why?
That's also a pretty big "only".
> The same number of bits are pumped around, but they have to cover significantly less physical distance and pass through fewer hubs. This reduces bandwidth
No, it doesn't. What it reduces is bitmiles. Which may or may not be significant. Mostly not, but it may have some significance if we can lower the usage of some scarce and expensive radio bearer.
> Me pumping bits from my neighbor's house to mine, instead of both of us pulling them over the backbone from some server in a central location, requires less (expensive) infrastructure between me and that central location.
Not really. You are still going to need that infrastructure, so no cost savings.
> indirectly they (we) pay for the copper and the fiber.
So we do, but using IPFS isn't going to result in us getting a check in the mail. Costs are still zero and savings likewise.
> except you are not allowed to download videos from YouTube or most (if not all) the popular video content services.
If you're not allowed to download videos, then how come they show up on my screen? That information wasn't on my device before I pressed the play button.
Streaming _is_ downloading. It's just downloading _continuously_ as the content is presented. Whether or not you seed is an entirely separate issue and has nothing to do with streaming vs downloading.
No, it is not. Not from a legal perspective, which is all that matters in this context. With seeding you are making things worse for yourself, as you are not merely downloading but also distributing. Seeding is also relevant with regards to downloading, as IPFS automatically starts seeding what you download.
Depends. I'm wondering if it could be cheaper energy-wise if everybody were tapping into each other's phones nearby instead of reaching for the cell tower.
By allowing your phone to be used as a server, other people will let you use their phones as a server, so you can download faster. If this sharing doesn't emerge naturally, then it can be directly incentivized with ratio tracking (like private torrent trackers), or even with money like the Karma WiFi service.
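A toy sketch of what such ratio tracking could look like (the thresholds and names here are made up for illustration, not any real tracker's or IPFS's scheme):

    # Hypothetical per-peer accounting, in the spirit of private-tracker ratios.
    class PeerLedger:
        def __init__(self):
            self.uploaded = 0    # bytes served to other peers
            self.downloaded = 0  # bytes fetched from other peers

        def ratio(self) -> float:
            return self.uploaded / max(self.downloaded, 1)

        def may_download(self, min_ratio: float = 0.5) -> bool:
            # Grace allowance for new peers, then require that they give back.
            return self.downloaded < 1_000_000_000 or self.ratio() >= min_ratio

    ledger = PeerLedger()
    ledger.downloaded = 5_000_000_000
    ledger.uploaded = 1_000_000_000
    print(ledger.ratio(), ledger.may_download())  # 0.2 False -> must seed to keep downloading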
Also, I have WiFi on always and my battery lasts all day, so I doubt this will be a problem.
Having wifi on all day is one thing, actually using it is totally different. Just activate your wifi hotspot on your phone and start using it from your computer. See how fast the battery depletes.
No matter what incentive scheme you devise, it's not going to help you with your battery.
> No matter what incentive scheme you devise, it's not going to help you with your battery.
Obviously with certain incentives, battery would become irrelevant. If this hypothetical "IPFS ratio" were as valuable as (for example) what.cd ratio, I'd have no problem carrying around a car battery to continue seeding. Thankfully, I don't think it would be possible for my phone to consume that much power in a day.
> Obviously with certain incentives, battery would become irrelevant.
Sure, cold hard cash would suffice, but imaginary Internet points won't do. But even with cash there comes a point where you would turn off IPFS, because the alternative would be to have a dead phone for the rest of the day.
> Thankfully, I don't think it would be possible for my phone to consume that much power in a day.
Perhaps not that much battery, but it is quite easy to deplete your battery in much less than a day by having your hotspot on constantly, which is what IPFS would require.
Let’s not forget the context here. We’re talking over-WiFi (not cellular) opportunistic sharing between two phones running an IPFS node.
If you have anything to share at this point, that is probably a good indication that both clients have some content in common. So first of all, transfers over WiFi while the phone is in use anyway are not that expensive, and more importantly there’s an opportunity to save a lot of battery, and data, which results in a substantial net positive gain.
To illustrate, imagine the likely scenario of two people using some form of YouTube with a distributed IPFS cache while on a plane. If those two people have anything in common (and certainly there are things, like news, that are always in common), those things can be shared instantly over the WiFi.
So for an insignificant battery investment, you can potentially save a lot of data and modem power in the long run.
Plus a plane, bus, or train could easily have 50 people within range. Assuming everyone has 8GB of cache space on their phones, you could access about 0.4TB of content directly. With a mesh network that supports packet forwarding, you could connect with everyone on the plane, meaning 100-700 people, which would give you 0.8-5.6TB of content.
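Back-of-the-envelope, under the same assumptions (8 GB of cache per phone; the numbers are illustrative, not measurements):

    cache_per_phone_gb = 8
    for phones in (50, 100, 700):
        total_tb = phones * cache_per_phone_gb / 1000  # GB -> TB
        print(f"{phones} phones in range -> ~{total_tb:.1f} TB of combined cache")
    # 50 -> ~0.4 TB, 100 -> ~0.8 TB, 700 -> ~5.6 TB (before deduplication)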
Obviously content would be highly duplicated between peers, but you'd have access to a significant chunk of the "popular" internet, and would be able to avoid in-flight WiFi.
For such a mesh network, you don't even necessarily need to connect over WiFi - lower energy Bluetooth could be used too.
> Assuming everyone has 8GB of cache space on their phones
That feels optimistic. I doubt people have that much free space on their phones, especially those that don't have flash cards. Heck, a basic iPhone won't have. Even a 16GB model has something like only 12 GB free when it's empty.
> For such a mesh network, you don't even necessarily need to connect over WiFi - lower energy Bluetooth could be used too.
IPFS doesn't (currently) work over Bluetooth.
However, my main point is that I don't think most people would be altruistic enough to participate unless they could plug in their phones on the plane.
> So for an insignificant battery investment, you can potentially save a lot of data and modem power in the long run.
Having run many a phone into the ground while hotspotting, I do not consider the battery investment insignificant. It's annoying enough when I drain my own battery when using the hotspot, I would never allow complete strangers to do that to me.
Isn't that the direct antithesis of IPFS? Might as well just Airdrop whatever you want to share with your buddy if you are going to require a rendezvous mechanism.
Opportunistic as in piggyback on active WiFi connection and non-idle state.
P2P networks like Bittorrent pride themselves on being available despite high churn, so this is right up their alley.
The network works even if people only seed during a download. Obviously, that’s less than ideal. For mobile, though, this is perfectly fine.
The point of IPFS is you can do an ‘airdrop’ without having to coordinate it. You look up content by hash: if it’s available locally, great! If not, just get it from remote and add it to the cache. Simple as that.
> Opportunistic as in piggyback on active WiFi connection and non-idle state.
That's a reasonable approach, but due to the synchronous requirement I guess the offload factor would be minimal. When I want to download something I might as well enable wifi and see if it's available locally. After I'm done there's really no incentive for me to keep seeding or even have wifi turned on.
> P2P networks like Bittorrent pride themselves on being available despite high churn, so this is right up their alley.
This really isn't applicable to opportunistic mobile use of IPFS, as you are unlikely to have a large enough local swarm to guarantee uptime or availability.
> This really isn't applicable to opportunistic mobile use of IPFS, as you are unlikely to have a large enough local swarm to guarantee uptime or availability.
IPFS content is chunked, your swarm is lan+wan.
> That's a reasonable approach, but due to the synchronous requirement I guess the offload factor would be minimal.
A room full of people using their phones is what I have in mind. Remember, a node’s cache is not exclusive to just the content that’s being currently consumed. Anyway, the point was that whatever the offload factor is, it would still be a net positive.
Also one can imagine if IPFS gets popular, just as ISPs collocate CDN caches in the present model, WiFi routers could come with a local node of their own. The “Web Accelerator.” And now we’re talking!
There’s opportunity for caching at many network levels, but there’s no incentive to do so with HTTP.
Yes, but for offload, all that matters is your local lan swarm.
> A room full of people using their phones is what I have in mind.
That might work to some extent, but how often do you (i) spend your time in rooms full of people and (ii) have no wifi?
> Anyway, the point was that whatever the offload factor is, it would still be a net positive.
Not quite. If the offload factor is sufficiently small there is no rational reason to burn battery on it.
> Also one can imagine if IPFS gets popular, just as ISPs collocate CDN caches in the present model, WiFi routers could come with a local node of their own. The “Web Accelerator.” And now we’re talking!
Possible, but unlikely. Web cache hit ratios are abysmal; normally it's much cheaper to just up the bandwidth. A lot of business models would also have to change for the large (and interesting) content to become third-party cacheable.
> There’s opportunity for caching at many network levels, but there’s no incentive to do so with HTTP.
There are also many business models where there is no incentive to cache either.
> If the offload factor is sufficiently small there is no rational reason to burn battery on it.
Sure, but there’s also no cost to it, if you go with the piggyback/opportunistic approach. That’s the point. It’s either net positive or just neutral. Whether the local offload is effective or not, that’s a different argument.
Anyway, the driving reason for my interest in IPFS is re-decentralization of the Web. If it can, theoretically, save some data on mobile, so much the better.
What it needs, though, is simply to work on mobile at least as well as HTTP. There’s no doubt IPFS has no fundamental architectural problems in that regard.
Though, full disclosure, it does have some major implementation issues at the moment.
The idea is that companies or people backing the content could easily provision more nodes to meet the demand. Creating an IPFS "host" is super simple: initialize a node (ipfs init), add the content you want to serve (ipfs add), and run the daemon (ipfs daemon).
The internet may run on content (storage) and communication, but both of those things run on computation. Until computation can be safely distributed, real change can't happen.
> In which way is IPFS immune to content takedowns?
In the way that, if one node is forced to take down some information, that information may still be accessible from other nodes under the same address as before. It's not quite immunity: they can still harass individuals. Call it "resilience", maybe.
> IPFS provides no anonymity, so it won't take long before DMCA notices start arriving.
Because mirroring content on IPFS is trivial and transparent to the consumer. All it takes is a single user (or an arbitrary number) outside of US jurisdiction and the content is more or less DMCA-immune.
The equivalent is not practical in traditional HTTP, where content at risk of being taken down is scraped from the server and that snapshot-in-time is hosted as a mirror, generally at a different domain.
> Because mirroring content on IPFS is trivial and transparent to the consumer.
Which is also why a regular consumer should never use IPFS and why prosumer should immediately disable caching and seeding upon install.
> All it takes is a single user (or an arbitrary number) outside of US jurisdiction and the content is more or less DMCA-immune.
Perhaps so, but if your plan for resiliency is based on the kindness of strangers in countries where neither the DMCA nor censorship applies, it's not much of a plan.
Plus you'll still have to seed the original which opens you up to liability, as IPFS does not provide any kind of anonymity.
So, while better than traditional HTTP, IPFS isn't really immune to takedowns nor very resilient.
>Which is also why a regular consumer should never use IPFS and why prosumer should immediately disable caching and seeding upon install.
Unless they changed policy, seeding is strictly a manual, opt-in process.
>Perhaps so, but if your plan for resiliency is based on the kindness of strangers in countries where neither the DMCA nor censorship applies, it's not much of a plan.
I disagree, IPFS is a pretty reasonable plan for resiliency in the case of static web content, as previously discussed.
If instead of "kindness of strangers" you frame it as a "market with incentives" to maintain information availability it is both less condescending and more accurate to how situations are likely to play out with things like DMCA'd content.
>So, while better than traditional HTTP
(which is all it needs to be)
>IPFS isn't really immune to takedowns nor very resilient.
That wasn't the goalpost we originally set, nor one of the project's long-term objectives, so I'm not sure of the relevance.
> Unless they changed policy, seeding is strictly a manual, opt-in process.
Pinning may be manual, but is not content automatically cached and seeded (until purged from the cache) once any content is retrieved?
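As I understand it, the distinction is roughly the following (a toy model for illustration, not real go-ipfs code):

    cache, pins = {}, set()

    def retrieve(h, fetch):
        if h not in cache:
            cache[h] = fetch(h)  # retrieved content is cached and re-served automatically
        return cache[h]

    def pin(h):
        pins.add(h)              # pinning is the explicit, opt-in step

    def gc():
        for h in list(cache):
            if h not in pins:    # unpinned blocks can be evicted at any time
                del cache[h]

So pinning controls what survives garbage collection, but anything you merely view sits in the cache and is served to peers until it is purged.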
> I disagree, IPFS is a pretty reasonable plan for resiliency in the case of static web content, as previously discussed.
How is it a reasonable plan to exploit the naive and the uninformed, or to depend only on those that are outside your jurisdiction and unassailable by their own?
> If instead of "kindness of strangers" you frame it as a "market with incentives" to maintain information availability it is both less condescending and more accurate to how situations are likely to play out with things like DMCA'd content.
What incentives exactly would those be?
I also find it interesting that you object to kindness from strangers.
>>So, while better than traditional HTTP
> (which is all it needs to be)
That might not be sufficient for IPFS to get enough adoption, tho.
>>IPFS isn't really immune to takedowns nor very resilient.
> That wasn't the goalpost we originally set, nor one of the project's long-term objectives, so I'm not sure of the relevance.
Fair enough. It is however relevant in the sense that it both removes a use case and acts as a disincentive for users to participate, as there is no liability shielding.
> Perhaps so, but if your plan for resiliency is based on the kindness of strangers in countries where neither the DMCA nor censorship applies, it's not much of a plan.
Torrents work for content that is heavily seeded, but you also have a lot of content that has ~0-5 seeders with very bad bandwidth.
I don't see how IPFS tackles the problems of guaranteed availability and enough redundancy while relying on volunteers to mirror and serve content.
Also, if you want to DDoS content, it sounds like all you need to do is find the IPs of the nodes that host the content you want to target (which might not be many, and which I guess become visible to you once you access/download that content) and knock them out.
It's not like the grandparent is advocating DDoSing, he is just pointing out that doing so is trivial for less popular content. This is something that IPFS doesn't address, but that their marketing fluff implies IPFS would take care of.
Good luck adding all DDOS bot IPs to your null route.
I've come to the conclusion that compute needs to be separated from content and users, using a new, non-financially based business model. That's the only way to ensure game-theory-based models don't fuck up the infra on which everything depends.
I am a novice, but I'll do my best to answer. IPFS isn't a replacement for the existing web like many of its predecessors; for the purposes of your question, it really works more like a drop-in shared caching system. The site host doesn't have to participate in IPFS for this to work.
As an example: You host a blog. As an IPFS user, I surf to the blog and store your content in my cache. When another IPFS user attempts to access your blog, they may pull directly from my cached version, or from the original host (depending on which is fastest). Merkle DAGs are used to hash content for quick locating, to ensure content is up-to-date, and to build a linked line of content over time.
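A minimal sketch of that read path (the helper names are hypothetical, not the real IPFS interfaces):

    local_cache = {}  # this node's block store: content hash -> bytes

    def get(content_hash: str, fetch_from_network) -> bytes:
        # Serve from the local cache when possible, otherwise fetch and keep a copy.
        if content_hash in local_cache:
            return local_cache[content_hash]     # another reader already paid the cost
        data = fetch_from_network(content_hash)  # the original host, or any peer that has it
        local_cache[content_hash] = data         # this node can now serve it to others
        return data

Because the lookup key is a hash of the content, it does not matter whether the bytes come from my cache or from the original host; the result is verifiably the same.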
This gets more interesting if there's a widespread service outage. IPFS nodes will continue to serve the most up-to-date version of the web even if the web is fragmented. As new information becomes available it is integrated into the existing cache and then propagated to the rest of the fragments.
I still struggle to understand how this works with database-backed content, but I do believe IPFS addresses it.
Yeah - there's a ton of immutable URLs on the web. All of CDNJS/JSDelivr/Google Hosted Libraries, most of raw.github.com, all Imgur & Instagram images, all YouTube video streams (excluding annotations & subtitles), and all torrent file caching sites (like itorrents.org). There are probably some large ones I'm forgetting about, but just mapping immutable URLs to IPFS could probably cover 1/3rd of Internet traffic.
IPFS archives is the effort that's going on right now to archive sites. Eventually there will be a system for automatically scraping & re-publishing content on IPFS.
Right now storage space and inefficiencies in the reference IPFS implementation are the biggest problems I've hit. Downloading sites is easy enough with grab-site, but my 24TB storage server is getting pretty full :( ... Gotta get more disks.
Say you grab a site. How do you announce that fact, verify that it is an unmodified copy, sync/merge/update copies and deduplicate assets between different snapshots?
Say I have a site that sells Awesome Products. L337Hacker mirrors my site, does a DDOS on the original and lets IPFS take over, redirecting the shopping cart to his own site.
Is this a potential scenario? If so, is there any way to prevent it?
If I'm reading this correctly, there are a few ways IPFS could be used in support of distributed, fraud-resistant, commerce.
First: publishing the catalog isn't the same as processing the shopping request. Online commerce is largely an update to catalog + mail-order shopping as it existed from ~1880 - 1990. If someone else wants to print and deliver your (PKI-authenticated, tamper-resistant) catalog, that's fine.
Second: The catalog isn't the transaction interface, it's the communications about product availability. The present e-commerce world is hugely hamstrung on numerous points, but one of these is the idea of separating the catalog and product presentation itself from ordering. So long as you're controlling the order-request interface, you're good. A payment processing system which authorised payments via the bank rather than from the vendor would be helpful. Also a move away from account-info sharing.
The key is in knowing who the valid merchant is, and in establishing that the fraudulent merchant has misrepresented themselves as the valid merchant. Perhaps authentication within the payment system would help.
Taking the shopping cart's payment mechanism out of the shopping cart would help.
All IPFS URLs contain the hash of the content, so you can't change it. There's a mechanism to allow for URLs which can point to varying bits of content, but I'm not aware of a paper which shows its security properties.
Upon further reading, it appears that it may be impossible to verify the security of an IPFS cached page, simply because the hash is calculated post-fetch on the client. That allows any sort of shenanigans to be performed on the original content before it's stored.
If content is created specifically for IPFS-caching (similar to Freenet or Onion), then it may be possible to be authoritative, but content cached from the web should never be considered so.
> Upon further reading, it appears that it may be impossible to verify the security of an IPFS cached page
Not at all, rather the opposite, it's very easy to verify a page since the hash is based on the content.
You have file "ABC" that you want to download. So you fetch it and once you have it locally, you hash it yourself and you compare the hashes. If they are the same, you know you have the right thing. If they are different, someone is sending you bad content.
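In code, the check is roughly this (a simplification: real IPFS identifiers are multihash CIDs over chunked DAGs, not a flat file hash):

    import hashlib

    def verify(expected_hash: str, data: bytes) -> bool:
        # The address you asked for doubles as the checksum of the answer.
        return hashlib.sha256(data).hexdigest() == expected_hash

    addr = hashlib.sha256(b"ABC").hexdigest()   # the "ABC" file you want
    assert verify(addr, b"ABC")                 # honest peer: content matches the address
    assert not verify(addr, b"ABC-tampered")    # altered content: the mismatch exposes it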
Re-read the original question. If someone is preventing access to the original page and the alternate is being served through IPFS, there is no way to compare against the original. The IPFS-cached page becomes the authoritative page and could contain altered content, which the hash takes into account.
If the original page can perform the hash and embed it, that would somewhat alleviate the issue during the fetch, but do nothing to prove that the IPFS-served page was trustworthy or not, unless some third-party knows the original hash, as well.
If the page was served to the IPFS network, to be cached, by a neutral, trusted third-party, that would somewhat alleviate the problem, although there arises the problem of trust again.
The only way to minimize the trust issue is if the page originates from inside the IPFS network and is not a cached version of page originally served outside the network.
You're right. I misread your parent's comment, which suggested that web pages could magically be cached securely into IPFS from the public web without any involvement from the website's owner, which is nonsense.
1) The web as a whole is decentralized, but any given web site is centralized by default. A webmaster can "push" a modification or deletion from a central server and have it trickle down through caches/proxies unless someone like archive.org makes a deliberate decision to archive it. In systems like IPFS, content deletion and modification follows a model more like garbage collection, where pointers to old data can hang around and continue working even if an "authoritative" pointer is pointing to something else. Basically, nobody has the authority to break a link (law/policy-mandated filters notwithstanding).
2) The web doesn't really have any built-in mechanisms to promote redundancy for durability, availability, or performance purposes. By default, there's one daemon serving one copy. Everything else (CDNs, forward/reverse proxies, failover) is some kind of optimization tacked on after-the-fact.
The idea is that we run that centralized search engine via Smart Contracts.
In other words, using a decentralized currency such as Bitcoin and a decentralized execution environment like Ethereum, we create software which pays people for providing CPU time and disk space (facilitated by IPFS).
A few pieces of this puzzle are still missing, such as homomorphic encryption implemented at the Smart Contract level, which would allow that CPU service.
Once that's done, we can have the next level of the distributed web.
A search engine via Ethereum is not going to scale, is not easily upgraded, and will fall far short of any properly centralized search engine. If you want a competitive decentralized search engine, you're going to need a different technology.
Homomorphic encryption as best I'm aware is more than a decade away from practical use. I believe the best libraries today have blowups of 100,000,000x in resource usage, meaning it's not even practical for basic crypto operations.
Multiparty computation makes some security sacrifices (breaks if M out of N people cheat, breaks undetectably iirc), but I think the blowup is less than 10x for some operations and less than 1000x for the others.
Still not pretty, but at least makes certain things possible. But not competitive search engines.
Does the issue of digital-currency volatility apply?
If you're hosting content at a certain rate, then the rate drops so that it's less profitable or no longer profitable, wouldn't the proper strategy be to drop that content and then find something worth hosting?
>what makes the supporters of it think it will be any different from the original web.
Disclaimer: I have worked on building the Internet since before the web, and have a BOFH kind of point of view of things, 2 or 3 times over now ..
I support IPFS - and things like it - as a technologist/hacker/punk-ass because, in fact, we have built an utter monstrosity of a beast of a spaghetti god of an Internet, and we can always do better.
Software - and by extension, inter-networking - is like music. There will always be better ways to do it, beyond the horizon of what is current and now. Just because what we've got "basically works, even though its all broken", doesn't mean we can't 'fix whats broken'.
To those of us in the group of your postulate, what's broken is the fact that it's all so damned hierarchical, and requires canonical/authority issue and agency in multiple forms before the bits can flow. We can't just set up an IP address - we also have to have DNS, serve content on it on some port, etc. And, on top of this, we end up having to sort it out all the way down the OSI stack, at the hardware layer, by keeping as many hosts as possible up and running as are required to service the customers, here and now, who are using our service.
All of that is what is broken about the current Internet, but of course the irony is that it is all that works about the Internet at the same time.
So, IPFS/distributed/etc. means this: we can un-bork all of the above, and just have all the things route down to the real, hard, user of the system, i.e. the original content provider. IPFS, and its ilk, is all about putting the original content provider in control, while also having the network - as a function of its operating capability - contribute to helping those content providers who become, eventually, popular.
So, we don't have a lot of rigid hierarchy - we have instead multi-variate spread spectra of responsibility/duty to serve up the bits of it all - packets at a time, as part of participation in the whole - to those who want something, and those who know how to find it, from those who have something, and know how to name it.
In short, "it's not about the network any more, it's the people..."