# LibResilient
A browser-based decentralized content delivery network, implemented as a JavaScript library to be deployed easily on any website. LibResilient uses [ServiceWorkers](https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API/Using_Service_Workers) and a suite of unconventional in-browser delivery mechanisms, with a strong focus on decentralized tools like [IPFS](https://ipfs.io/).
Ideally, users should not need to install any special software nor change any settings to continue being able to access an overloaded LibResilient-enabled site as soon as they are able to access it *once*.
## Current status
LibResilient is currently considered *beta*: the code works, and the API is mostly stable, but it has not been deployed in production and would benefit from real-world testing. During development it has been tested on Firefox, Chromium and Chrome on desktop, as well as Firefox for mobile on Android, but it should work in any browser implementing the Service Worker API.
Feel free to test it, but be aware that it might not work as expected. If you'd like to get in touch, please email us at `rysiek+libresilient[at]hackerspace.pl` or create an [issue](https://gitlab.com/rysiekpl/libresilient/-/issues/new).
## Rationale
While a number of content delivery technologies exist, these typically require enormous centralized services. This creates opportunities for gate-keeping, and [causes any disruption at these centralized providers to become a major problem for thousands of websites](https://blog.cloudflare.com/cloudflare-outage-on-july-17-2020/).
This project explores the possibility of solving this in a way that would not require website visitors to install any special software or change any settings; the only things that are needed are a modern Web browser and the ability to visit a website once, so that the JavaScript ServiceWorker kicks in.
You can read a more in-depth overview of LibResilient [here](./docs/ARCHITECTURE.md). The philosophy that shapes the project's goals and technical decisions is described [here](./docs/PHILOSOPHY.md).
## Architecture
A [ServiceWorker](https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API) is used as a way to persist the library after the initial visit to the participating website.
After the ServiceWorker is downloaded and activated, it handles all `fetch()` events by first trying to use the regular HTTPS request to the original website. If that fails for whatever reason (be it a timeout or a `4xx`/`5xx` error), the plugins kick in, attempting to fetch the content via any means available.
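The flow described above can be sketched roughly as follows. This is a minimal illustration, not LibResilient's actual code: the names `fetchFromOrigin` and `fetchWithFallback`, the `fallback` callback that stands in for the plugins, and the 10-second timeout are all assumptions made for the example.

```javascript
// Hypothetical sketch: `fetchWithFallback`, `fallback` and the 10-second
// timeout are illustrative assumptions, not LibResilient's actual API.

// Regular HTTPS request; a timeout or a 4xx/5xx status counts as a failure.
async function fetchFromOrigin(url, fetchImpl, timeoutMs) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    if (response.status >= 400) {
      throw new Error(`origin returned HTTP ${response.status}`);
    }
    return response;
  } finally {
    clearTimeout(timer);
  }
}

// Try the origin first; on any failure hand the URL to `fallback`, a function
// that stands in for the plugins and returns a Promise of a Response.
async function fetchWithFallback(url, fallback, fetchImpl = fetch, timeoutMs = 10000) {
  try {
    return await fetchFromOrigin(url, fetchImpl, timeoutMs);
  } catch (originError) {
    return fallback(url, originError);
  }
}

// Wiring inside a ServiceWorker (skipped when the code runs anywhere else).
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  // Placeholder fallback: with no plugins configured, re-throw the origin error.
  const noPlugins = async (url, error) => { throw error; };
  self.addEventListener('fetch', (event) => {
    event.respondWith(fetchWithFallback(event.request.url, noPlugins));
  });
}
```

Passing `fetchImpl` and `fallback` as parameters keeps the decision logic testable outside a browser; in a real ServiceWorker the defaults would normally be used.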
A more complete overview of the architecture and technicalities of LibResilient is available [here](./docs/ARCHITECTURE.md).
## Draft API
The plan is to have an API to enable the use of different strategies for getting content. There are two basic functions a plugin needs to perform:
- **resolution**
*where* a given piece of content (image, stylesheet, script, HTML file, etc.) is to be found
- **delivery**
*how* to get it
These need to be closely integrated. For example, if using Gun and IPFS, resolution is performed using Gun, and delivery is performed using IPFS. However, Gun needs to resolve content to something that is usable with IPFS. If, alternatively, we're also using Gun to resolve content available on BitTorrent, that will have to be a separate namespace in the Gun graph, since it will have to resolve to magnet links.
Therefore, it doesn't seem to make sense to separate resolution and delivery. Thus, a LibResilient plugin would need to implement the whole pipeline, and work by receiving a URL and returning a Promise that resolves to a valid Response object containing the content.
It should be possible to chain the plugins (try the first one, in case of error try the next, and so on), or run them in parallel (fire requests using all available plugins and return the first complete successful response). Running in parallel might offer a better user experience, but will also be more resource-intensive.
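To make the two strategies concrete, here is a rough sketch of both. The plugin shape used here (an object with a `name` and a `fetch(url)` method returning a Promise) and the function names `fetchChained` and `fetchParallel` are assumptions for illustration only; the real plugin API may differ.

```javascript
// Hypothetical plugin shape (an assumption for illustration): an object with
// a `name` and a `fetch(url)` method that returns a Promise of a Response.

// Chained: try each plugin in turn, moving on only when the previous one fails.
async function fetchChained(url, plugins) {
  const errors = [];
  for (const plugin of plugins) {
    try {
      return await plugin.fetch(url);
    } catch (error) {
      errors.push(`${plugin.name}: ${error.message}`);
    }
  }
  throw new Error(`all plugins failed: ${errors.join('; ')}`);
}

// Parallel: start every plugin at once; the first one to succeed wins.
// Promise.any() rejects only when every plugin has rejected.
function fetchParallel(url, plugins) {
  return Promise.any(plugins.map(async (plugin) => plugin.fetch(url)));
}
```

Note that this parallel version does not cancel the requests that lose the race, which is one source of the extra resource use mentioned above.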
An additional part of the API is going to deal with reporting the status of the plugins, their versions, and how a given piece of content was fetched (using which plugin). This will require modifying actual content from the ServiceWorker to pass that data to the DOM.
### Content versioning
Implementing content versioning might be necessary. Some delivery mechanisms (IPFS, BitTorrent) might be slow to pick up newly published content, and while information about this might be available, it might be faster to fetch and display older content that has already propagated across multiple peers or network nodes, with a message informing the reader that new content is available and that they might want to retry fetching it.
An important consideration related to content versioning is that it needs to be consistent across a full set of published pieces of content.
For example, consider a simple site that consists of an `index.html`, `style.css`, and `script.js`. Non-trivial changes in `index.html` will render older versions of `style.css` and `script.js` broken. A particular version of the whole published site needs to be fetched, otherwise things will not work as expected.
This will probably need to be fleshed out later on, but the initial API needs to be designed in a way where content versioning can be introduced without breaking backwards compatibility with plugins.
### Status information
Status information should be available to users, informing them that the content is being retrieved using non-standard means that might take longer.
LibResilient status information is kept per-request in the ServiceWorker, meaning it is transient and does not survive ServiceWorker restarts, which might happen multiple times over the lifetime of an open tab. However, each update is communicated to the browser window context that is relevant for a given request via [`client.postMessage`](https://developer.mozilla.org/en-US/docs/Web/API/Client/postMessage) calls. This is also how information on the ServiceWorker commit SHA and the available plugins is made available to the browser window context.
The data provided (per each requested URL handled by the ServiceWorker) is:
- `clientId` – the [Client ID](https://developer.mozilla.org/en-US/docs/Web/API/FetchEvent/clientId) for the request (that is, the Client ID of this browser window)
- `url` – the URL of the request
- `serviceWorker` – the commit SHA of the ServiceWorker that handled the request
- `fetchError` – `null` if the request completed successfully via regular HTTPS; otherwise the error message
- `method` – the method by which the request was completed: "`fetch`" is regular HTTPS `fetch()`, `gun-ipfs` means Gun and IPFS were used, etc.
- `state` – the state of the request (`running`, `error`, `success`)
The code in the browser window context is responsible for keeping a more permanent record of the URLs requested, the methods used, and the status of each, if needed.
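As an illustration of both sides of this exchange, the sketch below sends an update from the ServiceWorker and keeps a per-URL record in the window. It assumes that each message is a plain object carrying the fields listed above; the names `reportStatus`, `recordStatus` and `requestLog` are hypothetical and not part of LibResilient.

```javascript
// Assumption: each message is a plain object with the fields listed above
// (clientId, url, serviceWorker, fetchError, method, state).

// ServiceWorker side: send an update to the window that made the request.
// `clientsApi` stands in for the ServiceWorker's global `clients` object.
async function reportStatus(clientsApi, info) {
  const client = await clientsApi.get(info.clientId);
  if (!client) {
    return false; // the window may already have been closed
  }
  client.postMessage(info);
  return true;
}

// Window side: merge each update into a longer-lived, per-URL record.
function recordStatus(log, update) {
  const merged = { ...(log.get(update.url) || {}), ...update };
  log.set(update.url, merged);
  return merged;
}

// Hypothetical in-page store; it survives ServiceWorker restarts because it
// lives in the window context, not in the ServiceWorker.
const requestLog = new Map();

if (typeof navigator !== 'undefined' && navigator.serviceWorker) {
  navigator.serviceWorker.addEventListener('message', (event) => {
    if (event.data && event.data.url) {
      recordStatus(requestLog, event.data);
    }
  });
}
```

Merging updates keeps fields from earlier messages, so a later message that only carries a new `state` does not erase the `method` or `serviceWorker` values recorded before it.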
## Review of possible resolution/delivery methods
- **[Gun](https://gun.eco/)**
Better suited for resolution than for delivery, although it could handle both. Pretty new project, dynamically developed. No global network of public peers available currently. Content is cryptographically signed.
- **[IPNS](https://docs.ipfs.io/guides/concepts/ipns/)**
Only suitable for resolution. Experimental, not fully functional in the browser yet. Fits like a hand in a glove with IPFS.
- **[DNSLink](https://docs.ipfs.io/guides/concepts/dnslink/)**
Only suitable for resolution. Deployed, stable, and well-documented. Fits like a hand in a glove with IPFS. The downside is that it requires publishing of DNS records to work (every time any new content is published), which means it might be difficult to implement by website admins.
- **[IPFS](https://ipfs.io/)**
Only suitable for delivery, since it is content-addressed. Resolution of a content URI to an IPFS address needs to be handled by some other technology (like Gun or IPNS, or using [gateways](https://ipfs.github.io/public-gateway-checker/)). Deployed and well-documented, with a large community of developers. Redeploying a new content package with certain files unchanged does not change the addresses of the unchanged files, meaning that small changes in content do not lead to the whole content tree needing to be re-seeded.
- **[WebTorrent](https://github.com/webtorrent/webtorrent)**
Only suitable for content delivery. It seems possible to fetch a particular file from a given torrent, so as not to have to download a torrent of the whole website just to display a single page with some CSS and JS. Requires a resolver to point to the newest torrent since torrents are immutable. Even small changes (for example, only a few files changed in the whole website tree) require creating a new torrent and re-seeding, which is obviously less than ideal.
- **Plain files via HTTPS**
This delivery method is obvious if we're talking simply about the originating site and it serving the files, but this can also mean non-standard strategies like pushing static HTML+CSS+JS to CloudFront or Wasabi, and having a minimal resolver kick in if the originating site is unavailable, to fetch content seamlessly from alternative locations (effectively implementing domain fronting and collateral freedom in the browser). However, this will require some thought being put into somehow signing content deployed to third-party locations – perhaps the resolver (like Gun) could be responsible for keeping SHA sums of known good content, or perhaps we should just address it using the hashes, effectively imitating IPFS.
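
One way the hash-based idea for plain HTTPS mirrors could look is sketched below. It is only an illustration under stated assumptions: the list of alternative URLs, the source of the expected SHA-256 value (for example a resolver such as Gun), and the names `sha256Hex` and `fetchVerified` are all hypothetical.

```javascript
// Hypothetical sketch: the alternative URLs and the source of `expectedHex`
// (for example a resolver such as Gun) are assumptions for illustration.

// Hex-encoded SHA-256 of a buffer, computed with the Web Crypto API.
async function sha256Hex(buffer, subtle = crypto.subtle) {
  const digest = await subtle.digest('SHA-256', buffer);
  return Array.from(new Uint8Array(digest))
    .map((byte) => byte.toString(16).padStart(2, '0'))
    .join('');
}

// Try each alternative location in order and accept only content whose
// hash matches the known-good value.
async function fetchVerified(urls, expectedHex, fetchImpl = fetch, subtle = crypto.subtle) {
  for (const url of urls) {
    try {
      const response = await fetchImpl(url);
      // Read a clone so the returned response body is still unread.
      const body = await response.clone().arrayBuffer();
      if ((await sha256Hex(body, subtle)) === expectedHex) {
        return response;
      }
    } catch (error) {
      // Location unreachable or unreadable; try the next one.
    }
  }
  throw new Error('no location served content matching the expected hash');
}
```

A third-party location that serves tampered content is treated exactly like one that is unreachable: it is skipped and the next location is tried.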
## Limitations
There are certain limitations to what can be done with LibResilient:
### Service worker cannot be updated if origin is down
The ServiceWorker script itself apparently cannot be delivered using any of the transport plugins, [since](https://gist.github.com/Rich-Harris/fd6c3c73e6e707e312d7c5d7d0f3b2f9#the-new-service-worker-isnt-fetched-by-the-old-one):
> when you call `navigator.serviceWorker.register('service-worker.js')` the request for service-worker.js isn't intercepted by any service worker's fetch event handler.
So, the ServiceWorker script will be un-updateable via LibResilient in case the origin site is down, unless we find a way to hack around it with caches etc.
### JS implementations of decentralized protocols are still bootstrapped using servers
Gun and IPFS (and probably other potential LibResilient strategies) still use bootstrapping servers (STUN/TURN, and other kinds of public nodes), so technically it would be possible for all of these to be overwhelmed by traffic also, rendering LibResilient ineffective. This is a limitation of browsers and is related to IPv4 and NATs.
One way to deal with this is to have a large list of such public nodes and send only 2-3 each time LibResilient calls home (including via already working decentralized means), so that the traffic is spread more evenly.
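A minimal sketch of that selection step is shown below, assuming the full node list is already known on the serving side. The function name `sampleNodes` and its parameters are hypothetical.

```javascript
// Hypothetical helper: pick `count` distinct entries at random from a larger
// list of public nodes, without modifying the original list.
function sampleNodes(nodes, count = 3, random = Math.random) {
  const pool = nodes.slice();
  const picked = [];
  while (picked.length < count && pool.length > 0) {
    const index = Math.floor(random() * pool.length);
    picked.push(pool.splice(index, 1)[0]);
  }
  return picked;
}
```

Because each response carries a different random subset, no single public node ends up listed for every visitor.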
Plus, the ever-increasing adoption of IPv6 will also partially fix this.
## Related developments
- https://ipfs.io/ipfs/QmNhFJjGcMPqpuYfxL62VVB9528NXqDNMFXiqN5bgFYiZ1/its-time-for-the-permanent-web.html
- https://blog.archive.org/2015/02/11/locking-the-web-open-a-call-for-a-distributed-web/
## Development
To test the service worker locally you will need a minimal web server. Probably the simplest way is to start one directly in the project directory, either using Python:
```bash
python3 -m http.server
```
...or Docker:
```bash
docker run --publish 8000:80 --volume="$PWD:/usr/share/nginx/html" nginx
```
In both cases you will end up with a very basic webserver running locally on port `8000/tcp`. You will be able to access it under: `http://localhost:8000/`.
### Running the test suite
Tests are written in [Jest](https://jestjs.io/). You can run them using this command in the project directory:
```bash
npx jest ./__tests__/
```
If you don't have Node.js installed on your machine, you can run the tests using Docker instead (also from the project directory):
```bash
docker run -ti --rm --volume "${PWD}:/code" node:17.0 /bin/bash -c 'cd /code && npx jest ./__tests__/'
```