mirror of https://github.com/bellingcat/auto-archiver
Tidy ups to manifests for docs
parent
4c119b4db8
commit
5b481f72ab
43
README.md
@@ -118,45 +118,4 @@ auto-archiver --config secrets/orchestration.yaml
 auto-archiver --config secrets/orchestration.yaml --gsheet_feeder.sheet="use it on another sheets doc" --gsheet_feeder.header=2 --gsheet_feeder.columns='{"url": "link"}'
 # all the configurations come from orchestration.yaml and specifies that s3 files should be private
 auto-archiver --config secrets/orchestration.yaml --s3_storage.private=1
 ```
-
-### Extra notes on configuration
-#### Google Drive
-To use Google Drive storage you need the id of the shared folder in the `config.yaml` file which must be shared with the service account eg `autoarchiverservice@auto-archiver-111111.iam.gserviceaccount.com` and then you can use `--storage=gd`
-
-#### Telethon + Instagram with telegram bot
-The first time you run, you will be prompted to do a authentication with the phone number associated, alternatively you can put your `anon.session` in the root.
-
-#### Atlos
-When integrating with [Atlos](https://atlos.org), you will need to provide an API token in your configuration. You can learn more about Atlos and how to get an API token [here](https://docs.atlos.org/technical/api). You will have to provide this token to the `atlos_feeder`, `atlos_storage`, and `atlos_db` steps in your orchestration file. If you use a custom or self-hosted Atlos instance, you can also specify the `atlos_url` option to point to your custom instance's URL. For example:
-
-```yaml
-# orchestration.yaml content
-steps:
-  feeder: atlos_feeder
-  archivers: # order matters
-    - youtubedl_archiver
-  enrichers:
-    - thumbnail_enricher
-    - hash_enricher
-  formatter: html_formatter
-  storages:
-    - atlos_storage
-  databases:
-    - console_db
-    - atlos_db
-
-configurations:
-  atlos_feeder:
-    atlos_url: "https://platform.atlos.org" # optional
-    api_token: "...your API token..."
-  atlos_db:
-    atlos_url: "https://platform.atlos.org" # optional
-    api_token: "...your API token..."
-  atlos_storage:
-    atlos_url: "https://platform.atlos.org" # optional
-    api_token: "...your API token..."
-  hash_enricher:
-    algorithm: "SHA-256"
-```
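
The context lines of this hunk pass dotted options such as `--s3_storage.private=1` on top of `orchestration.yaml`. As a rough illustration of that idea, the sketch below merges a `module.option` override into a nested `configurations` mapping. The helper name `apply_override` and the JSON-first value parsing are assumptions made for this example, not auto-archiver's actual implementation.

```python
import json


def apply_override(config: dict, dotted_key: str, raw_value: str) -> dict:
    """Hypothetical helper: set configurations[module][option] from 'module.option'."""
    module, option = dotted_key.split(".", 1)
    try:
        # Numbers and JSON objects (e.g. '{"url": "link"}') are decoded.
        value = json.loads(raw_value)
    except json.JSONDecodeError:
        # Anything that is not valid JSON is kept as a plain string.
        value = raw_value
    config.setdefault("configurations", {}).setdefault(module, {})[option] = value
    return config


# Stand-in for the parsed orchestration.yaml (assumed shape).
config = {"configurations": {"s3_storage": {"private": 0}}}

apply_override(config, "s3_storage.private", "1")
apply_override(config, "gsheet_feeder.columns", '{"url": "link"}')
apply_override(config, "gsheet_feeder.sheet", "use it on another sheets doc")

print(config["configurations"])
```

Values given on the command line replace whatever the YAML file set for the same option, which matches the intent described in the `# all the configurations come from orchestration.yaml` comment.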
@@ -11,6 +11,8 @@
 "api_token": {
     "default": None,
     "help": "An Atlos API token. For more information, see https://docs.atlos.org/technical/api/",
+    "required": True,
+    "type": "str",
 },
 "atlos_url": {
     "default": "https://platform.atlos.org",
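
This hunk marks `api_token` as `"required": True` with `"type": "str"`. A minimal sketch of how a loader could act on those two keys follows; `resolve_options`, the `TYPES` table, and the abbreviated `CONFIGS` dictionary are hypothetical names for illustration, and the project's real validation may work differently.

```python
# Abbreviated stand-in for the manifest entries shown in the hunk (assumed shape).
CONFIGS = {
    "api_token": {"default": None, "required": True, "type": "str"},
    "atlos_url": {"default": "https://platform.atlos.org"},
}

# Map manifest type names to Python types (assumption: only simple types).
TYPES = {"str": str, "int": int, "bool": bool}


def resolve_options(configs: dict, user_values: dict) -> dict:
    """Fill in defaults, then enforce 'required' and 'type' where declared."""
    resolved = {}
    for name, spec in configs.items():
        value = user_values.get(name, spec.get("default"))
        if value is None:
            if spec.get("required"):
                raise ValueError(f"missing required option: {name}")
        elif "type" in spec and not isinstance(value, TYPES[spec["type"]]):
            raise TypeError(f"option {name} must be of type {spec['type']}")
        resolved[name] = value
    return resolved


print(resolve_options(CONFIGS, {"api_token": "...your API token..."}))
```

With a `None` default and `required` set, omitting the token fails early instead of surfacing later as an authentication error.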
@@ -32,7 +32,6 @@
 
 GDriveStorage: A storage module for saving archived content to Google Drive.
 
-Author: Dave Mateer, (And maintained by: )
 Source Documentation: https://davemateer.com/2022/04/28/google-drive-with-python
 
 ### Features
@@ -20,5 +20,6 @@
 - Processes HTML content of messages to retrieve embedded media.
 - Sets structured metadata, including timestamps, content, and media details.
 - Does not require user authentication for Telegram.
+
 """,
 }
@@ -40,5 +40,9 @@ To use the `TelethonExtractor`, you must configure the following:
 - **Bot Token**: Optional, allows access to additional content (e.g., large videos) but limits private channel archiving.
 - **Channel Invites**: Optional, specify a JSON string of invite links to join channels during setup.
 
+### First Time Login
+The first time you run, you will be prompted to do a authentication with the phone number associated, alternatively you can put your `anon.session` in the root.
+
+
 """
 }