kopia lustrzana https://github.com/bellingcat/auto-archiver
Tidy ups to manifests for docs
rodzic
4c119b4db8
commit
5b481f72ab
43
README.md
43
README.md
|
@ -118,45 +118,4 @@ auto-archiver --config secrets/orchestration.yaml
|
|||
auto-archiver --config secrets/orchestration.yaml --gsheet_feeder.sheet="use it on another sheets doc" --gsheet_feeder.header=2 --gsheet_feeder.columns='{"url": "link"}'
|
||||
# all the configurations come from orchestration.yaml and specifies that s3 files should be private
|
||||
auto-archiver --config secrets/orchestration.yaml --s3_storage.private=1
|
||||
```
|
||||
|
||||
### Extra notes on configuration
|
||||
#### Google Drive
|
||||
To use Google Drive storage you need the id of the shared folder in the `config.yaml` file which must be shared with the service account eg `autoarchiverservice@auto-archiver-111111.iam.gserviceaccount.com` and then you can use `--storage=gd`
|
||||
|
||||
#### Telethon + Instagram with telegram bot
|
||||
The first time you run, you will be prompted to do a authentication with the phone number associated, alternatively you can put your `anon.session` in the root.
|
||||
|
||||
#### Atlos
|
||||
When integrating with [Atlos](https://atlos.org), you will need to provide an API token in your configuration. You can learn more about Atlos and how to get an API token [here](https://docs.atlos.org/technical/api). You will have to provide this token to the `atlos_feeder`, `atlos_storage`, and `atlos_db` steps in your orchestration file. If you use a custom or self-hosted Atlos instance, you can also specify the `atlos_url` option to point to your custom instance's URL. For example:
|
||||
|
||||
```yaml
|
||||
# orchestration.yaml content
|
||||
steps:
|
||||
feeder: atlos_feeder
|
||||
archivers: # order matters
|
||||
- youtubedl_archiver
|
||||
enrichers:
|
||||
- thumbnail_enricher
|
||||
- hash_enricher
|
||||
formatter: html_formatter
|
||||
storages:
|
||||
- atlos_storage
|
||||
databases:
|
||||
- console_db
|
||||
- atlos_db
|
||||
|
||||
configurations:
|
||||
atlos_feeder:
|
||||
atlos_url: "https://platform.atlos.org" # optional
|
||||
api_token: "...your API token..."
|
||||
atlos_db:
|
||||
atlos_url: "https://platform.atlos.org" # optional
|
||||
api_token: "...your API token..."
|
||||
atlos_storage:
|
||||
atlos_url: "https://platform.atlos.org" # optional
|
||||
api_token: "...your API token..."
|
||||
hash_enricher:
|
||||
algorithm: "SHA-256"
|
||||
```
|
||||
|
||||
```
|
|
@ -11,6 +11,8 @@
|
|||
"api_token": {
|
||||
"default": None,
|
||||
"help": "An Atlos API token. For more information, see https://docs.atlos.org/technical/api/",
|
||||
"required": True,
|
||||
"type": "str",
|
||||
},
|
||||
"atlos_url": {
|
||||
"default": "https://platform.atlos.org",
|
||||
|
|
|
@ -32,7 +32,6 @@
|
|||
|
||||
GDriveStorage: A storage module for saving archived content to Google Drive.
|
||||
|
||||
Author: Dave Mateer, (And maintained by: )
|
||||
Source Documentation: https://davemateer.com/2022/04/28/google-drive-with-python
|
||||
|
||||
### Features
|
||||
|
|
|
@ -20,5 +20,6 @@
|
|||
- Processes HTML content of messages to retrieve embedded media.
|
||||
- Sets structured metadata, including timestamps, content, and media details.
|
||||
- Does not require user authentication for Telegram.
|
||||
|
||||
""",
|
||||
}
|
||||
|
|
|
@ -40,5 +40,9 @@ To use the `TelethonExtractor`, you must configure the following:
|
|||
- **Bot Token**: Optional, allows access to additional content (e.g., large videos) but limits private channel archiving.
|
||||
- **Channel Invites**: Optional, specify a JSON string of invite links to join channels during setup.
|
||||
|
||||
### First Time Login
|
||||
The first time you run, you will be prompted to do a authentication with the phone number associated, alternatively you can put your `anon.session` in the root.
|
||||
|
||||
|
||||
"""
|
||||
}
|
Ładowanie…
Reference in New Issue