TwitterCapture

thread-keeper

Author: The Harvard Library Innovation Lab
License: MIT

TwitterCapture.TwitterCapture

Generates a "sealed" PDF out of a twitter.com url using Playwright.

Usage:

```javascript
const tweet = new TwitterCapture(url);
const pdf = await tweet.capture();
fs.writeFileSync("tweet.pdf", pdf);
```

Kind: static class of TwitterCapture

new TwitterCapture(url, options)

| Param | Type | Description |
| --- | --- | --- |
| url | string | twitter.com url to capture. Works best on statuses and threads. |
| options | object | See TwitterCapture.defaults for detailed options. Will use defaults unless overridden. |

twitterCapture.defaults

Defaults for options that can be passed to TwitterCapture.

Kind: instance property of TwitterCapture
Properties

| Name | Type | Description |
| --- | --- | --- |
| appVersion | string | |
| privateKeyPath | string | Path to a .pem file containing a private key. |
| certPath | string | Path to a .pem file containing a certificate. |
| tmpFolderPath | string | Path to a folder in which temporary files can be written. |
| ytDlpPath | string | Path to the yt-dlp executable. |
| templatesFolderPath | string | Path to the templates folder (t.co resolver summary feature). |
| timestampServerUrl | string | Timestamping server. |
| networkidleTimeout | number | Time to wait for "networkidle" state. |
| runBrowserBehaviors | boolean | If true, will try to auto-scroll and open more responses. Set to false automatically when trying to capture a profile url. |
| browserBehaviorsTimeout | number | Maximum browser behaviors execution time. |
| videoCaptureTimeout | number | Maximum yt-dlp execution time. |
| renderTimeout | number | Time to wait for re-renders. |

twitterCapture.options : object

Kind: instance property of TwitterCapture

twitterCapture.url : string

Kind: instance property of TwitterCapture

twitterCapture.urlType : string

Kind: instance property of TwitterCapture

twitterCapture.playwright : Object

Kind: instance property of TwitterCapture

twitterCapture.interceptedJPEGs : object.<string, Buffer>

Kind: instance property of TwitterCapture

twitterCapture.capture ⇒ Promise.<Buffer>

Captures the current Twitter.com url and makes it a signed PDF.

Kind: instance property of TwitterCapture
Returns: Promise.<Buffer> - Signed PDF.

twitterCapture.setup ⇒ Promise.<void>

Sets up the browser used for capture as well as network interception for images capture. Populates this.playwright.

Kind: instance property of TwitterCapture

twitterCapture.teardown

Closes browser used for capture.

Kind: instance property of TwitterCapture

twitterCapture.adjustUIForCapture ⇒ Promise.<void>

Adjusts the current page's DOM so the resulting PDF is not affected by UI artifacts. Playwright needs to be ready.

Kind: instance property of TwitterCapture

twitterCapture.runBrowserBehaviors ⇒ Promise.<void>

Runs browser behaviors:

  • Tries to scroll through the page.
  • Tries to click on the next available "Show replies" button.

Playwright needs to be ready.

Kind: instance property of TwitterCapture
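The following is a minimal sketch of how time-bounded behaviors like the ones above could be structured. `runWithTimeBudget` and its `step` callback are hypothetical names introduced for illustration, not part of TwitterCapture; the time budget plays the role that `browserBehaviorsTimeout` is described as playing in the options.

```javascript
// Hypothetical helper, not the library's implementation.
// Repeats `step` until it reports there is nothing left to do,
// or until the time budget (in milliseconds) runs out.
async function runWithTimeBudget(step, timeoutMs) {
  const deadline = Date.now() + timeoutMs;
  let iterations = 0;

  while (Date.now() < deadline) {
    const keepGoing = await step();
    iterations += 1;
    if (!keepGoing) break;
  }

  return iterations;
}

// With a Playwright page, a step could for example scroll down a bit:
//   await page.mouse.wheel(0, 1000);
// and return false once the bottom of the page has been reached.
```

Checking the deadline between steps, rather than racing against a timer, lets each step finish cleanly before the loop stops.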

twitterCapture.resizeViewportToFitDocument ⇒ Promise.<void>

Stretches the viewport to match the document's dimensions.

Kind: instance property of TwitterCapture

twitterCapture.getDocumentDimensions ⇒ Promise.<{width: number, height: number}>

Returns the current dimensions of the document. Playwright needs to be ready.

Kind: instance property of TwitterCapture
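As an illustration of measuring the document and stretching the viewport to match, here is a sketch written as standalone functions. It assumes `page` is a Playwright Page; measuring via `scrollWidth` / `scrollHeight` is an assumption, and the actual methods may compute dimensions differently.

```javascript
// Sketch only, not the library's implementation.
// `page` is expected to be a Playwright Page (evaluate, setViewportSize).
async function getDocumentDimensions(page) {
  // The callback runs in the browser context.
  // Assumption: scrollWidth / scrollHeight describe the full document.
  return page.evaluate(() => ({
    width: document.documentElement.scrollWidth,
    height: document.documentElement.scrollHeight,
  }));
}

async function resizeViewportToFitDocument(page) {
  const { width, height } = await getDocumentDimensions(page);
  await page.setViewportSize({ width, height });
}
```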

twitterCapture.interceptJpegs ⇒ Promise.<void>

Uses Playwright's network interception to capture images and add them to this.interceptedJPEGs. Called whenever Playwright processes an HTTP response.

Kind: instance property of TwitterCapture

| Param | Type |
| --- | --- |
| response | playwright.Response |
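A response handler of this kind might look like the sketch below. It is written as a standalone function with a module-level `interceptedJPEGs` object for illustration; how entries are keyed (here, a filename derived from the last path segment of the URL) is an assumption, not something this document specifies.

```javascript
// Sketch only, not the library's implementation.
// `response` is expected to look like a Playwright Response:
// url(), headers() and an async body() that resolves to a Buffer.
const interceptedJPEGs = {};

async function interceptJpegs(response) {
  const contentType = response.headers()["content-type"] ?? "";
  if (!contentType.startsWith("image/jpeg")) {
    return; // Not a JPEG: ignore this response.
  }

  // Assumption: entries are keyed by a filename built from the URL's last path segment.
  const { pathname } = new URL(response.url());
  const name = pathname.split("/").filter(Boolean).pop() ?? "image";
  const filename = name.endsWith(".jpg") ? name : `${name}.jpg`;

  interceptedJPEGs[filename] = await response.body();
}

// Wiring it up with Playwright would typically be:
//   page.on("response", interceptJpegs);
```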

twitterCapture.generateRawPDF ⇒ Promise.<Buffer>

Generates a PDF of the current page using Chrome DevTools. Playwright needs to be ready.

Populates this.pdf.

Kind: instance property of TwitterCapture
Returns: Promise.<Buffer> - PDF Bytes

twitterCapture.addInterceptedJPEGsToPDF ⇒ Promise.<void>

Adds entries from this.interceptedJPEGs to the PDF.

Kind: instance property of TwitterCapture

| Type | Description |
| --- | --- |
| PDFDocument | Editable PDF object from pdf-lib. |

twitterCapture.captureAndAddUrlMapToPDF ⇒ Promise.<void>

Tries to list and resolve all the t.co urls on the page, and add the resulting map as an attachment.

Attachment filename: url-map.csv. Playwright needs to be ready.

Kind: instance property of TwitterCapture

| Type | Description |
| --- | --- |
| PDFDocument | Editable PDF object from pdf-lib. |
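To give an idea of what a url-map.csv attachment could contain, the sketch below serializes a map of t.co urls to their resolved targets. The helper name `urlMapToCsv`, the column names, and the quoting rules are all assumptions made for illustration; the actual file layout is not described here.

```javascript
// Hypothetical helper, not the library's implementation.
// Serializes { "<t.co url>": "<resolved url>" } as CSV text.
function urlMapToCsv(urlMap) {
  // Quote every field and double any embedded quotes.
  const escape = (value) => `"${String(value).replace(/"/g, '""')}"`;

  // Assumed header row; the real column names may differ.
  const rows = [["shortened", "resolved"], ...Object.entries(urlMap)];

  return rows.map((row) => row.map(escape).join(",")).join("\n");
}

console.log(urlMapToCsv({ "https://t.co/abc": "https://example.com/page" }));
```

Quoting every field keeps resolved urls that contain commas from breaking the row structure.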

twitterCapture.captureAndAddVideoToPDF ⇒ Promise.<void>

Tries to capture video(s) from the current Twitter url and add them as attachments to the PDF.

Kind: instance property of TwitterCapture

| Type | Description |
| --- | --- |
| PDFDocument | Editable PDF object from pdf-lib. |

twitterCapture.cropMarginsOnPDF

Tries to remove some of the white space at the bottom of the PDF. [!] TODO: This is a "let's ship it" hack. We will need to find a better solution.

Kind: instance property of TwitterCapture

| Param | Type |
| --- | --- |
| editablePDF | PDFDocument |

twitterCapture.signPDF ⇒ Buffer

Kind: instance property of TwitterCapture
Returns: Buffer - PDF Bytes

| Param | Type | Description |
| --- | --- | --- |
| editedPDF | Buffer | PDF Bytes |

twitterCapture.filterOptions

Applies some basic filtering to new option objects and fills gaps with defaults. Replaces this.options after filtering.

Kind: instance property of TwitterCapture

| Param | Type |
| --- | --- |
| newOptions | Promise.<object> |
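One way to filter options and fill gaps with defaults is sketched below as a standalone function. The `defaults` object lists only a few of the documented option names, its values are placeholders rather than the library's actual defaults, and the type-matching rule is an assumption about what "basic filtering" could mean.

```javascript
// Placeholder defaults: option names come from the documentation,
// but the values are made up for this example.
const defaults = {
  networkidleTimeout: 5000,
  runBrowserBehaviors: true,
  tmpFolderPath: "./tmp/",
};

// Sketch only, not the library's implementation.
// Keeps known keys whose type matches the default's type,
// and falls back to the default for everything else.
function filterOptions(newOptions = {}, base = defaults) {
  const options = {};

  for (const [key, fallback] of Object.entries(base)) {
    const candidate = newOptions[key];
    options[key] = typeof candidate === typeof fallback ? candidate : fallback;
  }

  return options;
}

// `unknownKey` is dropped; missing keys are filled from the defaults.
console.log(filterOptions({ networkidleTimeout: 1000, unknownKey: "ignored" }));
```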

twitterCapture.filterUrl ⇒ bool

Filters a given URL to ensure it is a twitter.com one. Also asserts that it is of a supported type: "status", "search", or "profile".

Automatically populates this.url and this.urlType.

Kind: instance property of TwitterCapture

| Param | Type |
| --- | --- |
| url | string |
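The kind of check described here can be sketched with the standard `URL` class. `classifyTwitterUrl` is a hypothetical helper, and both the accepted hostnames and the path shapes used to tell the three types apart are assumptions, not the library's actual rules.

```javascript
// Hypothetical helper, not the library's implementation.
// Returns "status", "search" or "profile", or throws if the url is not supported.
function classifyTwitterUrl(url) {
  const parsed = new URL(url); // Throws a TypeError on malformed input.

  // Assumption: only https://twitter.com and https://www.twitter.com are accepted.
  if (parsed.protocol !== "https:" || !/^(www\.)?twitter\.com$/.test(parsed.hostname)) {
    throw new Error(`${url} is not a twitter.com url.`);
  }

  const segments = parsed.pathname.split("/").filter(Boolean);

  // Assumed path shapes: /search?q=..., /<user>/status/<id>, /<user>
  if (segments[0] === "search") return "search";
  if (segments.length >= 3 && segments[1] === "status") return "status";
  if (segments.length === 1) return "profile";

  throw new Error(`${url} is not a supported type of twitter.com url.`);
}

console.log(classifyTwitterUrl("https://twitter.com/example/status/1234567890")); // "status"
```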