High-fidelity capture of Twitter threads as sealed PDFs - archive.social.
 
 
 
 
Latest commit: 381fe26e60, "Version bump", by Matteo Cargnelutti on 2023-11-29 10:33:42 -05:00.

thread-keeper 📚

High-fidelity capture of Twitter threads as sealed PDFs @ social.perma.cc.

An experiment of the Harvard Library Innovation Lab.

Perma Tools


Summary


Dependencies

Runtimes

Browsers

  • Google Chrome (npx playwright install --force chrome may be used).

Python dependencies

  • ⚠️ For now: Python dependencies are installed at machine level, as a post-install step of npm install.

Known Ubuntu packages

curl bash gcc g++ python3 python3-pip python3-dev zlib1g zlib1g-dev libjpeg-dev libssl-dev libffi-dev ghostscript poppler-utils
  • ⚠️ On Linux, this project is currently only compatible with Ubuntu, because it uses Playwright + Chrome.
  • Node may be sourced from Nodesource.

For development on Mac OS

A brewfile is available. Run brew bundle to install the machine-level dependencies that Homebrew can provide.

☝️ Back to summary


Local development

Getting started

Run the following commands to initialize the project and start the development server.

brew bundle # (Mac OS only) - See Linux dependencies above.
npm install # To install npm packages
npx playwright install chrome # To ensure Playwright has a version of Chrome to talk to
npm run generate-dev-cert # Will generate a certificate for self-signing PDFs. For testing purposes only.
npm run dev # Starts the development server on port 3000

Certificates history

The "Signatures Verification Page" lists the certificates that were used to sign PDFs with the app. You may provide that history by creating two files under /data:

  • signing-certs-history.json
  • timestamping-certs-history.json

Expected format:

[
  {
    "from": "2022-11-18 13:07:56 UTC",
    "to": "present",
    "domain": "domain.ext",
    "info": "https://...",
    "cert": "https://..."
  },
  ...
]
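The expected format above can be checked before the app reads it. The following is a minimal sketch, not part of thread-keeper: the function name findHistoryErrors is hypothetical, and it assumes that all five fields are required and must be strings, which the format above suggests but does not state.

```javascript
// Hypothetical helper, not part of thread-keeper.
// Assumption: every entry needs all five fields, each as a string.
const REQUIRED_FIELDS = ["from", "to", "domain", "info", "cert"];

// Returns a list of human-readable problems; an empty list means the
// parsed history matches the format shown above.
function findHistoryErrors(history) {
  if (!Array.isArray(history)) {
    return ["History must be a JSON array."];
  }

  const errors = [];

  history.forEach((entry, index) => {
    const isObject = entry !== null && typeof entry === "object";

    for (const field of REQUIRED_FIELDS) {
      if (!isObject || typeof entry[field] !== "string") {
        errors.push(`Entry ${index}: "${field}" is missing or not a string.`);
      }
    }
  });

  return errors;
}

// Same placeholder values as the expected format above.
const sampleHistory = JSON.parse(`[
  {
    "from": "2022-11-18 13:07:56 UTC",
    "to": "present",
    "domain": "domain.ext",
    "info": "https://...",
    "cert": "https://..."
  }
]`);

console.log(findHistoryErrors(sampleHistory)); // → []
```

The same check could be run against both signing-certs-history.json and timestamping-certs-history.json once their contents have been parsed with JSON.parse.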

☝️ Back to summary


Dev CLI

start

npm run start

Starts the app's server on port 3000 with warning-level logs.

dev

npm run dev

Starts the app's server on port 3000 with info-level logs. Watches for file changes.

generate-dev-cert

npm run generate-dev-cert

Generates certs/cert.pem and certs/key.pem for local development purposes.

docgen

npm run docgen

Generates JSDoc-based code documentation under /docs.

test

npm run test

Runs the test suite. Requires test fixtures (see fixtures folder).

⚠️ At the moment, this codebase only features a very limited set of high-level integration tests.

☝️ Back to summary


Environment variables

| Name | Required? | Description |
| --- | --- | --- |
| CERTS_PATH | No | If set, used as the path to the .pem files used for signing PDF files. |
| DATA_PATH | No | If set, used as the path to the folder used for storing app data. |
| TEMPLATES_PATH | No | If set, used as the templates path. Can be used to replace the website's UI with a custom one. |
| REQUIRE_ACCESS_KEY | No | If set to "1", an access key is required to make captures. |
| MAX_PARALLEL_CAPTURES_TOTAL | No | If set to an integer, determines the maximum number of captures that the server can run in parallel. |
| MAX_PARALLEL_CAPTURES_PER_IP | No | If set to an integer, determines the maximum number of captures that a single client can run in parallel. |
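As an illustration of the "if set" rules above, here is a minimal sketch of how the access key flag and the two capture limits could be read from the environment. It is not thread-keeper's actual code: readCaptureSettings and FALLBACKS are hypothetical names, the fallback numbers are placeholders rather than the app's real defaults, and treating only positive whole numbers as valid limits is an assumption.

```javascript
// Hypothetical sketch, not thread-keeper's implementation.
// Assumption: these fallback numbers are placeholders, not the app's defaults.
const FALLBACKS = { total: 50, perIp: 2 };

// Returns `value` as a number if it is a string of digits greater than zero,
// otherwise returns `fallback`.
function toPositiveInt(value, fallback) {
  if (typeof value !== "string" || !/^\d+$/.test(value.trim())) {
    return fallback;
  }
  const parsed = Number(value.trim());
  return parsed > 0 ? parsed : fallback;
}

// `env` is any object shaped like process.env.
function readCaptureSettings(env) {
  return {
    // Only the exact string "1" turns the access key requirement on.
    requireAccessKey: env.REQUIRE_ACCESS_KEY === "1",
    maxParallelCapturesTotal: toPositiveInt(env.MAX_PARALLEL_CAPTURES_TOTAL, FALLBACKS.total),
    maxParallelCapturesPerIp: toPositiveInt(env.MAX_PARALLEL_CAPTURES_PER_IP, FALLBACKS.perIp),
  };
}

console.log(readCaptureSettings(process.env));
```

Passing the environment in as an argument, rather than reading process.env inside the function, keeps the sketch easy to exercise with hand-written values.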

☝️ Back to summary


Access keys system

If the REQUIRE_ACCESS_KEY environment variable is set to "1", users will be required to use an access key to make captures.

Keys can be stored in a file named access-keys.json under the "data" folder.

Example: app/data/access-keys.json:

{
  "BB67BBC4-1F4B-4353-8E6D-9927A10F4509": true
}

Create an access key to test with:

$ uuidgen
BB67BBC4-1F4B-4353-8E6D-9927A10F4509
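Once a key is listed in the file, checking a submitted key against it could look like the sketch below. This is an illustration only: isActiveAccessKey is a hypothetical name, and the rule that a key counts as active only when its value is exactly true is an assumption based on the example file above.

```javascript
// Hypothetical helper, not part of thread-keeper.
// Assumption: a key is active only when it is an own property of the
// parsed access keys object and its value is exactly `true`.
function isActiveAccessKey(accessKeys, candidate) {
  return (
    typeof candidate === "string" &&
    Object.prototype.hasOwnProperty.call(accessKeys, candidate) &&
    accessKeys[candidate] === true
  );
}

// Same content as the example access keys file above.
const accessKeys = JSON.parse(`{
  "BB67BBC4-1F4B-4353-8E6D-9927A10F4509": true
}`);

console.log(isActiveAccessKey(accessKeys, "BB67BBC4-1F4B-4353-8E6D-9927A10F4509")); // → true
console.log(isActiveAccessKey(accessKeys, "not-a-key")); // → false
```

Using hasOwnProperty avoids treating inherited property names such as "toString" as keys.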

☝️ Back to summary