Web browser with integrated IPFS node/support for browser cache?

I thought of the following:

When you open your browser, it spins up an IPFS node. Whenever the browser caches a file, that file is shared on the network. So whenever the browser needs a cacheable asset, it first checks whether a node on IPFS already has it and downloads it from there; if not, the asset is pulled from the website and cached in the traditional manner. Your node shuts down when you close the browser.
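
A minimal sketch of that lookup order, in Python for brevity. The names `fetch_asset`, `lookup_peer_cache` and `fetch_from_origin` are hypothetical placeholders, not part of any IPFS API, and keying the peer lookup by URL is an assumption made only to keep the example short:

```python
from typing import Callable, Dict, Optional


def fetch_asset(
    url: str,
    lookup_peer_cache: Callable[[str], Optional[bytes]],
    fetch_from_origin: Callable[[str], bytes],
    local_cache: Dict[str, bytes],
) -> bytes:
    """Return the asset for `url`, preferring caches over the origin site."""
    # 1. The browser's own cache.
    if url in local_cache:
        return local_cache[url]
    # 2. Hypothetical lookup on other nodes; returns None when nobody has it.
    data = lookup_peer_cache(url)
    # 3. Fall back to the website itself, as a browser does today.
    if data is None:
        data = fetch_from_origin(url)
    # Whatever ends up in the local cache is what this node could share.
    local_cache[url] = data
    return data
```

The origin is only contacted when both the local cache and the peers miss, which is the behaviour the idea relies on.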

The idea is that you would share, and benefit from, the cache of everyone else who is browsing at the same time you are. Is this theoretically possible?


Yes, I like your idea. It would be fantastic if any browser could work as an IPFS node while surfing the internet, fetching content from and contributing content to the IPFS network.
The fundamental problem is that in HTTP the content behind a URL is mutable, whereas in IPFS the content itself is the only identifier. That means you cannot know the IPFS address of a file unless you already have its content. Given a file linked over HTTP, you cannot convert its URL into an ipfs:// address with confidence.
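
To illustrate why the address cannot be known in advance, here is a simplified sketch. It uses a plain SHA-256 digest as a stand-in for an IPFS address; real IPFS addresses are built from multihashes over chunked data, so `content_id` is a hypothetical helper and not how IPFS actually computes them:

```python
import hashlib


def content_id(data: bytes) -> str:
    """Simplified stand-in for a content address: depends only on the bytes."""
    return hashlib.sha256(data).hexdigest()


# Two responses served from the same URL at different times (made-up content).
monday = b"console.log('v1');"
tuesday = b"console.log('v2');"
# The same bytes served from a different site.
mirror = b"console.log('v1');"

# Identical bytes always map to the same identifier, wherever they are hosted.
print(content_id(monday) == content_id(mirror))   # True
# Changed bytes get a different identifier even though the URL is unchanged.
print(content_id(monday) == content_id(tuesday))  # False
```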

But we can’t ignore that many files are duplicated all over the internet, e.g. widely used JS libraries, common style sheets, analytics scripts… In short, any file that is already cached somewhere (such as on a CDN) should be suitable for IPFS. That is where IPFS is meant to take the place of HTTP.

I’ve actually come up with some ideas that may work in limited situations.

[1]. HTTP 300:
The HTTP standard leaves room for this in its 3xx (Redirection) status codes. A redirect target is not limited to HTTP but can be any URI (ipfs:// included). HTTP 300 is especially relevant:

The HTTP 300 Multiple Choices redirect status response code indicates that the request has more than one possible responses. The user-agent or the user should choose one of them. As there is no standardized way of choosing one of the responses, this response code is very rarely used.

When it receives an HTTP GET request for a file, the server can, instead of returning the data directly, respond with an HTTP 300 listing two URIs: a direct HTTP link and an IPFS link. A client that supports IPFS can then choose to fetch the file from the IPFS network.
Risk: since there is no standardized way of choosing one of the responses, I’m not sure whether normal browsers without IPFS support would handle this properly.
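
A rough sketch of the client-side choice. How the alternative URIs are carried in a 300 response is not standardized, so this assumes the client has already extracted them into a list; `choose_alternative` and the example URIs are hypothetical:

```python
from urllib.parse import urlparse


def choose_alternative(uris, supported_schemes):
    """Pick one URI from a 300 response, preferring ipfs when the client supports it."""
    by_scheme = {}
    for uri in uris:
        # Keep the first URI seen for each scheme.
        by_scheme.setdefault(urlparse(uri).scheme, uri)
    for scheme in ("ipfs", "https", "http"):
        if scheme in supported_schemes and scheme in by_scheme:
            return by_scheme[scheme]
    return None  # nothing usable was offered


uris = ["https://example.com/jquery.js", "ipfs://QmExampleHash/jquery.js"]
print(choose_alternative(uris, {"ipfs", "https", "http"}))  # the ipfs:// URI
print(choose_alternative(uris, {"https", "http"}))          # the https:// URI
```

A browser without IPFS support would need the second branch to work on its own, which is exactly the compatibility risk noted above.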

[2]. HTTP HEAD before GET:
Before it GETs a resource, the browser can send a HEAD request to get metadata about the file. The response headers typically include the file’s Content-Length, Content-Type, etc.; in rare cases the file’s checksum is included as well. For IPFS support, the server could also return the IPFS hash, so that the browser can use it to fetch the file from IPFS instead of performing a GET request.
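
A small sketch of the decision the browser would make after the HEAD response arrives. The header name `X-IPFS-Hash` is an assumption invented for this example (no such standard header is implied), and `plan_fetch` is a hypothetical helper:

```python
def plan_fetch(url, head_headers, ipfs_available):
    """Decide where to fetch the body from, based on HEAD response headers."""
    # HTTP header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value for name, value in head_headers.items()}
    ipfs_hash = headers.get("x-ipfs-hash")  # hypothetical header
    if ipfs_hash and ipfs_available:
        return ("ipfs", ipfs_hash)
    # No hash advertised, or no IPFS node running: do a normal GET.
    return ("http", url)


head = {"Content-Type": "application/javascript", "X-IPFS-Hash": "QmExampleHash"}
print(plan_fetch("https://example.com/lib.js", head, ipfs_available=True))
print(plan_fetch("https://example.com/lib.js", head, ipfs_available=False))
```

The cost is one extra round trip per resource, so this only pays off when the file is large or likely to be held by nearby peers.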


Okay, so let me see if I understand this correctly:

The main hurdle to my idea is that some known piece of info about the desired content (like a hash or known address) needs to be provided to the client so that it can check the IPFS network for that specific file. I really like your second idea. I guess the question is, how would the server find out the IPFS addresses?

I guess one solution is for the server to run IPFS and add certain folders to it. I feel like that’s potentially too tedious though, and you wouldn’t want people to shoot themselves in the foot by accidentally hosting the wrong files from their server; they only want the cacheable ones included.


I really like this idea, and it has some similarity and overlap with my transparent proxy cache (e.g. Squid on IPNS) and IPNS-CDN ideas. The main difference is that this would be implemented directly in the browser rather than on a LAN edge gateway, but otherwise you should be able to map URLs to IPNS paths the same way in both.

One security caveat here is that you would want to exclude from the shared cache any data that is personal to the browser user, for example bank statement images sent via HTTPS.
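
One possible filter for this, sketched under assumptions: it treats a response as shareable only when it is explicitly marked `public` in Cache-Control and nothing about the exchange looks user-specific. `is_shareable` is a hypothetical helper and the policy is deliberately conservative, not a complete privacy check:

```python
def is_shareable(request_headers, response_headers):
    """Conservative check for whether a cached response may be shared with peers."""
    req = {name.lower(): value for name, value in request_headers.items()}
    resp = {name.lower(): value for name, value in response_headers.items()}
    # Requests that carry credentials are treated as personal.
    if "authorization" in req or "cookie" in req:
        return False
    # Responses that set cookies are treated as personal too.
    if "set-cookie" in resp:
        return False
    directives = {
        part.strip().split("=")[0].lower()
        for part in resp.get("cache-control", "").split(",")
        if part.strip()
    }
    if directives & {"private", "no-store"}:
        return False
    # Assumption: only share what the server explicitly marked as public.
    return "public" in directives


print(is_shareable({}, {"Cache-Control": "public, max-age=31536000"}))     # True
print(is_shareable({"Cookie": "session=abc"}, {"Cache-Control": "public"}))  # False
print(is_shareable({}, {"Cache-Control": "private, max-age=60"}))          # False
```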

Apart from that, this approach would not need the HTTP 300 response to map URLs to IPFS multihashes (IPNS does all of that mapping work), but HTTP 300 would still be nice to implement on IPFS-backed IPNS-CDN gateways.

how would the server find out the ipfs addresses?

Yes, as I mentioned before, the IPFS multihash should be, and can only be, provided by the server (since the content may be dynamic). So the server is responsible for selecting which content is static. This can be implemented in web frameworks: when responding to a HEAD request, the server can calculate the IPFS multihash just in time.
Moreover, the server does not actually need to run an IPFS node, nor serve the content over IPFS itself. If the content is popular, like the jQuery library, the client can always fetch it elsewhere. And if the content is original, earlier visitors can GET it through HTTP and provide it to later visitors before they go offline.
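
A sketch of the server side of this, assuming a framework hook that builds HEAD response headers. Everything here is illustrative: `STATIC_TYPES` is a made-up allow-list, `X-IPFS-Hash` is a hypothetical header name, and a SHA-256 hex digest stands in for a real IPFS multihash, which is computed differently:

```python
import hashlib

# Assumption: the site owner decides which content types count as static.
STATIC_TYPES = {"text/css", "application/javascript", "image/png"}


def head_response_headers(body: bytes, content_type: str) -> dict:
    """Build HEAD headers, advertising a content hash only for static content."""
    headers = {
        "Content-Type": content_type,
        "Content-Length": str(len(body)),
    }
    if content_type in STATIC_TYPES:
        # Computed just in time from the bytes; no IPFS node is involved.
        headers["X-IPFS-Hash"] = hashlib.sha256(body).hexdigest()
    return headers


print(head_response_headers(b"body { margin: 0 }", "text/css"))
print(head_response_headers(b"<html>Hello, Alice</html>", "text/html"))
```

Dynamic pages never get a hash, so clients fall back to a plain GET for them.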

Having lived in the CDN space for some time, I am all too familiar with some of the limitations of browser caches and HTTP 1.1 (so IPFS + browser caches could be awesome!).

The HTTP working group did think about people possibly wanting to use different ports and protocols for some pieces of content, and they came up with HTTP Alternative Services for HTTP2.

This could be a more elegant way to handle this than a redirect-based solution; however, you still need demarcation of the shared cache as indicated above (there is no reason an HTTP header cannot indicate whether the IPFS node will or won’t seed content). Also, this functionality hasn’t been very widely implemented or adopted; from what I understand, it was a future-proofing mechanism. HTTP2 implementations in browsers also require TLS.
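
For reference, a simplified sketch of reading an `Alt-Svc` header value. It handles only the common `protocol="authority"; ma=seconds` form and the special value `clear`, not the full grammar, and `parse_alt_svc` is a hypothetical helper. The example uses `h2` because this sketch does not assume that any IPFS protocol identifier exists for Alt-Svc:

```python
def parse_alt_svc(value):
    """Parse a simple Alt-Svc header value into a list of alternative services."""
    if value.strip() == "clear":
        return []  # the origin asks clients to forget known alternatives
    services = []
    for entry in value.split(","):
        parts = [part.strip() for part in entry.split(";")]
        protocol, _, authority = parts[0].partition("=")
        params = dict(part.split("=", 1) for part in parts[1:] if "=" in part)
        services.append({
            "protocol": protocol.strip(),
            "authority": authority.strip().strip('"'),
            # Without "ma", an alternative is considered fresh for 24 hours.
            "max_age": int(params.get("ma", 86400)),
        })
    return services


print(parse_alt_svc('h2="alt.example.com:443"; ma=3600, h2=":8443"'))
```

Note that an alternative service still serves the same origin's resources by URL; it does not by itself provide the content hash that the earlier posts identify as the missing piece.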

There is a philosophical issue here though… most end users are very used to being clients, not content providers. Hopefully, we can change hearts and minds about that!

The beta version of the IPFS Companion browser extension does something close to this

Small clarification: we are experimenting with running an embedded js-ipfs node within the browser extension, but only as a means of exposing its API as window.ipfs on every page.
The things discussed in this topic are not implemented, but we are tracking related ideas in a dedicated issue:
