Make your website resilient without a gatekeeping centralized CDN.
 
 
Michał 'rysiek' Woźniak 96de033dde working on gun-ipfs plugin and test to make the test pass 2021-08-25 22:33:01 +00:00

LibResilient

A browser-based decentralized content delivery network, implemented as a JavaScript library to be deployed easily on any website. LibResilient uses ServiceWorkers and a suite of non-standard in-browser delivery mechanisms, with a strong focus on decentralized tools like IPFS.

Ideally, users should not need to install any special software or change any settings: once they have managed to access a LibResilient-enabled site a single time, they should be able to keep accessing it even when it is overloaded.

Current status

LibResilient is currently considered alpha: the code works, but major rewrites and API changes are coming. It has been tested on Firefox, Chromium and Chrome on desktop, as well as Firefox for mobile on Android, but it should work in any browser implementing the ServiceWorker API.

Feel free to test it, but be aware that it might not work as expected. If you'd like to get in touch, please email us at rysiek+libresilient[at]hackerspace.pl or create an issue.

Rationale

While a number of content delivery technologies exist, these typically require enormous centralized services. This creates opportunities for gate-keeping, and causes any disruption at these centralized providers to become a major problem for thousands of websites.

This project explores the possibility of solving this in a way that would not require website visitors to install any special software or change any settings; the only things that are needed are a modern Web browser and the ability to visit a website once, so that the JavaScript ServiceWorker kicks in.

You can read a more in-depth overview of LibResilient here. And here is a document describing the philosophy influencing project goals and relevant technical decisions.

Architecture

A ServiceWorker is used as a way to persist the library after the initial visit to the participating website.

After the ServiceWorker is downloaded and activated, it handles all fetch() events by first trying to use the regular HTTPS request to the original website. If that fails for whatever reason (be it a timeout or a 4xx/5xx error), the plugins kick in, attempting to fetch the content via any means available.
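
The request flow described above can be sketched roughly as follows. This is a minimal illustration and not the actual LibResilient code: `handleRequest`, `withTimeout`, the `fallback` callback, and the 10-second default timeout are all assumptions made for the example.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Try the origin first; on a timeout, a network error, or a 4xx/5xx response,
// hand the URL over to `fallback` (hypothetical: whatever runs the plugins).
async function handleRequest(url, { originFetch, fallback, timeoutMs = 10000 }) {
  try {
    const response = await withTimeout(originFetch(url), timeoutMs);
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response;
  } catch (originError) {
    return fallback(url, originError);
  }
}

// Wiring inside a ServiceWorker (skipped in any other environment).
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', (event) => {
    event.respondWith(handleRequest(event.request.url, {
      originFetch: (url) => fetch(url),
      // Placeholder: a real fallback would call the configured plugins.
      fallback: (url, originError) => Promise.reject(originError),
    }));
  });
}
```

For brevity the sketch passes only the URL along, so the request method and headers are dropped; a fuller version would likely need to forward the whole request.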

A more complete overview of the architecture and technicalities of LibResilient is available here.

Draft API

The plan is to have an API to enable the use of different strategies for getting content. There are two basic functions a plugin needs to perform:

  • resolution
    where a given piece of content (image, stylesheet, script, HTML file, etc.) is to be found
  • delivery
    how to get it

These need to be closely integrated. For example, if using Gun and IPFS, resolution is performed using Gun, and delivery is performed using IPFS. However, Gun needs to resolve content to something that is usable with IPFS. If, alternatively, we're also using Gun to resolve content available on BitTorrent, that will have to be a separate namespace in the Gun graph, since it will have to resolve to magnet links.

Therefore, it doesn't seem to make sense to separate resolution and delivery. Thus, a LibResilient plugin would need to implement the whole pipeline, and work by receiving a URL and returning a Promise that resolves to a valid Response object containing the content.
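
To make the "whole pipeline" idea concrete, here is one possible shape for such a plugin. Since the API is still a draft, everything here is an assumption: `makePlugin`, the `resolve` and `deliver` callbacks, and the example address are hypothetical names used only for illustration.

```javascript
// Hypothetical plugin factory, not the LibResilient API.
// `resolve(url)` maps a URL to a transport-specific address (or null);
// `deliver(address)` returns a Promise of a Response for that address.
function makePlugin(name, resolve, deliver) {
  return {
    name,
    // Receives a URL and returns a Promise that resolves to a Response.
    async fetch(url) {
      const address = await resolve(url);
      if (address == null) {
        throw new Error(`${name}: no address known for ${url}`);
      }
      return deliver(address);
    },
  };
}

// Example with stand-in functions; a real plugin might resolve via Gun
// and deliver via IPFS instead.
const addresses = new Map([
  ['https://example.org/index.html', 'example-address-1'],
]);
const examplePlugin = makePlugin(
  'example',
  async (url) => addresses.get(url) ?? null,
  async (address) => new Response(`content stored at ${address}`),
);
```

Keeping both steps inside one object means the resolver can return whatever address format its delivery mechanism expects, without the rest of the library needing to understand it.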

It should be possible to chain the plugins (try the first one, in case of error try the next, and so on), or run them in parallel (fire requests using all available plugins and return the first complete successful response). Running in parallel might offer a better user experience, but will also be more resource-intensive.
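
The two strategies could look roughly like this, assuming each plugin is an object with a `fetch(url)` method that returns a Promise (an assumption based on the draft above; `fetchChained` and `fetchParallel` are hypothetical names).

```javascript
// Chained: try plugins one at a time, moving on only when one fails.
async function fetchChained(url, plugins) {
  const errors = [];
  for (const plugin of plugins) {
    try {
      return await plugin.fetch(url);
    } catch (err) {
      errors.push(err);
    }
  }
  throw new AggregateError(errors, `all plugins failed for ${url}`);
}

// Parallel: start every plugin at once and take the first success.
// Promise.any() rejects only if every plugin rejects.
function fetchParallel(url, plugins) {
  return Promise.any(plugins.map(async (plugin) => plugin.fetch(url)));
}
```

Note that in the parallel variant the slower plugins keep running after the first one succeeds, which is where the extra resource use comes from; cancelling them would need additional plumbing (for example an AbortSignal passed to each plugin).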

An additional part of the API is going to deal with reporting the status of the plugins, their versions, and how a given piece of content was fetched (using which plugin). This will require modifying actual content from the ServiceWorker to pass that data to the DOM.

Content versioning

Implementing content versioning might be necessary. Some delivery mechanisms (IPFS, BitTorrent) might be slow to pick up newly published content, and while information about this might be available, it might be faster to fetch and display older content that has already propagated across multiple peers or network nodes, with a message informing the reader that new content is available and that they might want to retry fetching it.

An important consideration related to content versioning is that it needs to be consistent across a full set of published pieces of content.

For example, consider a simple site that consists of an index.html, style.css, and script.js. Non-trivial changes in index.html will render older versions of style.css and script.js broken. A particular version of the whole published site needs to be fetched, otherwise things will not work as expected.

This will probably need to be fleshed out later on, but the initial API needs to be designed in a way where content versioning can be introduced without breaking backwards compatibility with plugins.
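
One way to think about the consistency requirement is a manifest that lists, per published version, the address of every file in that version. The sketch below is purely hypothetical (the manifest format, `pickConsistentVersion`, and `isAvailable` are assumptions, not a planned design): it picks the newest version whose files are all retrievable and reports whether a newer one exists.

```javascript
// `versions` is ordered newest first; each entry maps file paths to addresses.
// `isAvailable(address)` says whether that address can currently be fetched.
function pickConsistentVersion(versions, isAvailable) {
  for (let i = 0; i < versions.length; i++) {
    const complete = Object.values(versions[i].files)
      .every((address) => isAvailable(address));
    if (complete) {
      // newerExists lets the page tell the reader that newer content was published.
      return { version: versions[i], newerExists: i > 0 };
    }
  }
  return null; // no version is fully available
}

const versions = [
  { id: 2, files: { 'index.html': 'addr-i2', 'style.css': 'addr-s2', 'script.js': 'addr-j2' } },
  { id: 1, files: { 'index.html': 'addr-i1', 'style.css': 'addr-s1', 'script.js': 'addr-j1' } },
];
// Only index.html of version 2 has propagated so far.
const propagated = new Set(['addr-i2', 'addr-i1', 'addr-s1', 'addr-j1']);
const choice = pickConsistentVersion(versions, (address) => propagated.has(address));
// choice.version.id is 1 and choice.newerExists is true
```

Here version 1 is served as a whole, instead of mixing the new index.html with the old style.css and script.js.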

Status information

Status information should be available to users, informing them that the content is being retrieved using non-standard means that might take longer.

LibResilient information is kept per-request in the ServiceWorker, meaning it is transient and does not survive ServiceWorker restarts, which might happen multiple times over the lifetime of an open tab. However, each update is communicated to the browser window context that is relevant for a given request via client.postMessage calls. This is also how information on ServiceWorker commit SHAs and available plugins is made available to the browser window context.

The data provided (per each requested URL handled by the ServiceWorker) is:

  • clientId – the Client ID for the request (that is, the Client ID of this browser window)
  • url – the URL of the request
  • serviceWorker – the commit SHA of the ServiceWorker that handled the request
  • fetchError – null if the request completed successfully via regular HTTPS; otherwise the error message
  • method – the method by which the request was completed: "fetch" is regular HTTPS fetch(), "gun-ipfs" means Gun and IPFS were used, etc.
  • state – the state of the request (running, error, success)
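
On the ServiceWorker side, sending such an update could look roughly like this. The field names come from the list above; `buildStatus` and `reportStatus` are hypothetical helpers, and `clients` stands for the ServiceWorker's `self.clients` object.

```javascript
// Assemble one status record with the fields listed above.
function buildStatus({ clientId, url, serviceWorker, method, state, fetchError = null }) {
  return { clientId, url, serviceWorker, fetchError, method, state };
}

// Look up the window that issued the request and post the record to it.
async function reportStatus(clients, status) {
  const client = await clients.get(status.clientId);
  if (!client) return false; // the window may have been closed in the meantime
  client.postMessage(status);
  return true;
}

// Inside a fetch handler this might be called as (commit SHA is a placeholder):
//   reportStatus(self.clients, buildStatus({
//     clientId: event.clientId,
//     url: event.request.url,
//     serviceWorker: 'COMMIT_SHA',
//     method: 'fetch',
//     state: 'running',
//   }));
```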

The code in the browser window context is responsible for keeping a more permanent record of the URLs requested, the methods used, and the status of each, if needed.
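
A page could keep such a record with a small in-memory log fed by the ServiceWorker's messages. This is only a sketch: `createRequestLog` and its methods are hypothetical, and a real page might prefer persistent storage.

```javascript
// Keep the history of status updates per requested URL.
function createRequestLog() {
  const byUrl = new Map();
  return {
    record(status) {
      const history = byUrl.get(status.url) ?? [];
      history.push({
        method: status.method,
        state: status.state,
        fetchError: status.fetchError,
      });
      byUrl.set(status.url, history);
    },
    // Most recent update for a URL, or undefined if none was recorded.
    latest(url) {
      const history = byUrl.get(url);
      return history ? history[history.length - 1] : undefined;
    },
    urls() {
      return [...byUrl.keys()];
    },
  };
}

const requestLog = createRequestLog();

// In a browser window, feed the log from ServiceWorker messages.
if (typeof navigator !== 'undefined' && 'serviceWorker' in navigator) {
  navigator.serviceWorker.addEventListener('message', (event) => {
    if (event.data && event.data.url) requestLog.record(event.data);
  });
}
```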

Review of possible resolution/delivery methods

  • Gun
    Better suited for resolution than for delivery, although it could handle both. A fairly new project under active development. No global network of public peers is currently available. Content is cryptographically signed.

  • IPNS
    Only suitable for resolution. Experimental, not fully functional in the browser yet. Fits like a hand in a glove with IPFS.

  • DNSLink
    Only suitable for resolution. Deployed, stable, and well-documented. Fits like a hand in a glove with IPFS. The downside is that it requires publishing DNS records to work (every time any new content is published), which means it might be difficult for website admins to implement.

  • IPFS
    Only suitable for delivery, since it is content-addressed. Resolution of a content URI to an IPFS address needs to be handled by some other technology (like Gun or IPNS, or using gateways). Deployed and well-documented, with a large community of developers. Redeploying a new content package with certain files unchanged does not change the addresses of the unchanged files, meaning that small changes in content do not lead to the whole content tree needing to be re-seeded.

  • WebTorrent
    Only suitable for content delivery. It seems possible to fetch a particular file from a given torrent, so as not to have to download a torrent of the whole website just to display a single page with some CSS and JS. Requires a resolver to point to the newest torrent since torrents are immutable. Even small changes (for example, only a few files changed in the whole website tree) require creating a new torrent and re-seeding, which is obviously less than ideal.

  • Plain files via HTTPS
    This delivery method is obvious if we're talking simply about the originating site and it serving the files, but this can also mean non-standard strategies like pushing static HTML+CSS+JS to CloudFront or Wasabi, and having a minimal resolver kick in if the originating site is unavailable, to fetch content seamlessly from alternative locations (effectively implementing domain fronting and collateral freedom in the browser). However, this will require some thought being put into somehow signing content deployed to third-party locations – perhaps the resolver (like Gun) could be responsible for keeping SHA sums of known good content, or perhaps we should just address it using the hashes, effectively imitating IPFS.
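
The "SHA sums of known good content" idea for plain HTTPS delivery could be sketched as below. It assumes the expected SHA-256 hash of a file is already known from some trusted resolver; `fetchVerified`, its parameters, and the list of alternative locations are hypothetical and not part of LibResilient. The default digest uses the Web Crypto API available in browsers and ServiceWorkers.

```javascript
// Hex-encode an ArrayBuffer.
function toHex(buffer) {
  return Array.from(new Uint8Array(buffer), (b) => b.toString(16).padStart(2, '0')).join('');
}

// SHA-256 via Web Crypto; resolves to an ArrayBuffer with the digest.
const sha256 = (buffer) => crypto.subtle.digest('SHA-256', buffer);

// Try each alternative location in turn and return the first body whose
// SHA-256 matches `expectedHex`. `path` is resolved relative to each base URL.
async function fetchVerified(path, locations, expectedHex, fetchFn = fetch, digest = sha256) {
  for (const base of locations) {
    try {
      const response = await fetchFn(new URL(path, base).href);
      if (!response.ok) continue;
      const body = await response.arrayBuffer();
      if (toHex(await digest(body)) === expectedHex.toLowerCase()) {
        return body; // verified bytes; wrap in a Response if needed
      }
    } catch (err) {
      // Network error at this location: try the next one.
    }
  }
  throw new Error(`no location served verified content for ${path}`);
}
```

Content that fails the check is simply skipped, so a tampered or stale copy at one third-party location does not prevent a good copy elsewhere from being used.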

Limitations

There are certain limitations to what can be done with LibResilient:

Service worker cannot be updated if origin is down

The ServiceWorker script apparently cannot be delivered using any of the transport plugins, since:

when you call navigator.serviceWorker.register('service-worker.js') the request for service-worker.js isn't intercepted by any service worker's fetch event handler.

So, the ServiceWorker script will be un-updateable via LibResilient in case the origin site is down, unless we find a way to hack around it with caches etc.

JS implementations of decentralized protocols are still bootstrapped using servers

Gun and IPFS (and probably other potential LibResilient strategies) still use bootstrapping servers (STUN/TURN, and other kinds of public nodes), so technically all of these could also be overwhelmed by traffic, rendering LibResilient ineffective. This is a limitation of browsers and is related to IPv4 and NATs.

One way to deal with this is to have a large list of such public nodes and send only 2-3 each time LibResilient calls home (including via already working decentralized means), so that the traffic is spread more evenly.
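
Handing out a small random subset of a larger list could be done along these lines. The function name, the default of three nodes, and the example host names are assumptions for illustration only.

```javascript
// Return `count` distinct entries chosen uniformly at random from `nodes`,
// using a partial Fisher-Yates shuffle on a copy (the input is not modified).
function pickBootstrapNodes(nodes, count = 3, random = Math.random) {
  const pool = [...nodes];
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Hypothetical list of public nodes; each caller gets a different subset.
const publicNodes = Array.from({ length: 10 }, (_, i) => `node-${i}.example.org`);
const subset = pickBootstrapNodes(publicNodes);
```

Because every client receives a different subset, load is spread across the whole list instead of concentrating on the first few entries.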

Plus, the ever-increasing adoption of IPv6 will also partially fix this.