diff --git a/docs/source/how_to/run_instagrapi_server.md b/docs/source/how_to/run_instagrapi_server.md
new file mode 100644
index 0000000..ace1550
--- /dev/null
+++ b/docs/source/how_to/run_instagrapi_server.md
@@ -0,0 +1,169 @@
+# InstagrAPI Server
+
+The Instagram API Extractor requires access to a running instance of the InstagrAPI server.
+We have a lightweight script with the endpoints required for our Instagram API Extractor module which you can run locally, or via Docker.
+
+
+
+⚠️ Warning: Remember that it's best not to use your own personal account for archiving. [Here's why](../installation/authentication.md#recommendations-for-authentication).
+## Quick Start: Using Docker
+
+We've provided a convenient shell script (`run_instagrapi_server.sh`) that simplifies the process of setting up and running the Instagrapi server in Docker. This script handles building the Docker image, setting up credentials, and starting the container.
+
+### 🔧 Running the script:
+
+Run this script either from the repository root or from within the `scripts/instagrapi_server` directory:
+
+```bash
+./scripts/instagrapi_server/run_instagrapi_server.sh
+```
+
+This script will:
+- Prompt for your Instagram username and password.
+- Create the necessary `.env` file.
+- Build the Docker image.
+- Start the Docker container and authenticate with Instagram, creating a session automatically.
+
+### ⏱ To run the server again later:
+```bash
+docker start ig-instasrv
+```
+
+### 🐛 Debugging:
+View logs:
+```bash
+docker logs ig-instasrv
+```
+
+
+### Overview: How the Setup Works
+
+1. You enter your Instagram credentials in a local `.env` file
+2. You run the server **once locally** to generate a session file
+3. After that, you can choose to run the server again locally or inside Docker without needing to log in again
+
+---
+
+## Optional: Manual / Local Setup
+
+If you'd prefer to run the server manually (without Docker), you can follow these steps:
+
+
+1. **Navigate to the server folder (and stay there for the rest of this guide)**:
+   ```bash
+   cd scripts/instagrapi_server
+   ```
+
+2. **Create a `secrets/` folder** (if it doesn't already exist in `scripts/instagrapi_server`):
+   ```bash
+   mkdir -p secrets
+   ```
+
+3. **Create a `.env` file** inside `secrets/` with your Instagram credentials:
+   ```dotenv
+   INSTAGRAM_USERNAME="your_username"
+   INSTAGRAM_PASSWORD="your_password"
+   ```
+
+4. **Install dependencies** using the `pyproject.toml` file:
+
+   ```bash
+   poetry install --no-root
+   ```
+
+5. **Run the server locally**:
+   ```bash
+   poetry run uvicorn src.instaserver:app --port 8000
+   ```
+
+6. **Watch for the message**:
+   ```
+   Login successful, session saved.
+   ```
+
+✅ Your session is now saved to `secrets/instagrapi_session.json`.
+
+### To run it again locally:
+```bash
+poetry run uvicorn src.instaserver:app --port 8000
+```
+
+---
+
+## Adding the API Endpoint to Auto Archiver
+
+The server should now be running with that session and accessible at http://127.0.0.1:8000
+
+You can set this in the Auto Archiver `orchestration.yaml` file like this:
+```yaml
+instagram_api_extractor:
+  api_endpoint: http://127.0.0.1:8000
+```
+
+
+---
+
+## 2. Running the Server Again
+
+Once the session file is created, you should be able to run the server without logging in again.
+
+### To run it locally (from scripts/instagrapi_server):
+```bash
+poetry run uvicorn src.instaserver:app --port 8000
+```
+
+---
+
+## 3. Running via Docker (After Setup is Complete, either locally or via the script)
+
+Once the `instagrapi_session.json` and `.env` files are set up, you can pass them to Docker and it should authenticate successfully.
+
+### 🔨 Build the Docker image manually:
+```bash
+docker build -t instagrapi-server .
+```
+
+### ▶️ Run the container:
+```bash
+docker run -d \
+  --env-file secrets/.env \
+  -v "$(pwd)/secrets:/app/secrets" \
+  -p 8000:8000 \
+  --name ig-instasrv \
+  instagrapi-server
+```
+
+This mounts the `secrets/` directory into the container and passes the environment variables from the `.env` file to Docker.
+
+
+
+---
+
+## 4. Optional Cleanup
+
+- **Stop the Docker container**:
+  ```bash
+  docker stop ig-instasrv
+  ```
+
+- **Remove the container**:
+  ```bash
+  docker rm ig-instasrv
+  ```
+
+- **Remove the Docker image**:
+  ```bash
+  docker rmi instagrapi-server
+  ```
+
+### ⏱ To run again later:
+```bash
+docker start ig-instasrv
+```
+
+---
+
+## Notes
+
+- Never share your `.env` or `instagrapi_session.json`; these contain sensitive login data.
+- If you want to reset your session, simply delete the `secrets/instagrapi_session.json` file and re-run the local server.
diff --git a/scripts/instagrapi_server/.gitignore b/scripts/instagrapi_server/.gitignore
new file mode 100644
index 0000000..33bcfed
--- /dev/null
+++ b/scripts/instagrapi_server/.gitignore
@@ -0,0 +1,2 @@
+secrets*
+*instagrapi_session.json
diff --git a/scripts/instagrapi_server/Dockerfile b/scripts/instagrapi_server/Dockerfile
new file mode 100644
index 0000000..0fa9a63
--- /dev/null
+++ b/scripts/instagrapi_server/Dockerfile
@@ -0,0 +1,19 @@
+FROM python:3.12-slim
+WORKDIR /app
+
+# Install Poetry
+RUN pip install --upgrade pip
+RUN pip install poetry
+
+# Copy all source code
+COPY . .
+
+# Prevent Poetry from creating a virtual environment
+RUN poetry config virtualenvs.create false
+
+# Install dependencies
+RUN poetry install --no-root
+
+
+# Use uvicorn to run the FastAPI app
+CMD ["poetry", "run", "uvicorn", "src.instaserver:app", "--host", "0.0.0.0", "--port", "8000"]
diff --git a/scripts/instagrapi_server/pyproject.toml b/scripts/instagrapi_server/pyproject.toml
new file mode 100644
index 0000000..3c0177c
--- /dev/null
+++ b/scripts/instagrapi_server/pyproject.toml
@@ -0,0 +1,18 @@
+[project]
+name = "instaserver"
+version = "0.1.0"
+description = "A FastAPI InstagrAPI server"
+package-mode = false
+requires-python = ">=3.10"
+dependencies = [
+    "fastapi (>=0.115.12,<0.116.0)",
+    "instagrapi (>=2.1.3,<3.0.0)",
+    "uvicorn (>=0.34.0,<0.35.0)",
+    "pillow (>=11.1.0,<12.0.0)",
+    "python-dotenv (>=1.1.0,<2.0.0)"
+]
+
+
+[build-system]
+requires = ["poetry-core>=2.0.0,<3.0.0"]
+build-backend = "poetry.core.masonry.api"
diff --git a/scripts/instagrapi_server/run_instagrapi_server.sh b/scripts/instagrapi_server/run_instagrapi_server.sh
new file mode 100755
index 0000000..752c743
--- /dev/null
+++ b/scripts/instagrapi_server/run_instagrapi_server.sh
@@ -0,0 +1,48 @@
+#!/usr/bin/env bash
+#
+# run_instagrapi_server.sh
+# Usage:
+#   From repo root:      ./scripts/instagrapi_server/run_instagrapi_server.sh
+#   Or from script dir:  ./run_instagrapi_server.sh
+#
+
+set -e
+
+# Step 1: cd to the script's directory (contains Dockerfile and secrets/)
+cd "$(dirname "$0")" || exit 1
+
+# Create secrets/ if it doesn't exist
+if [[ ! -d "secrets" ]]; then
+  echo "Creating secrets/ directory..."
+  mkdir secrets
+fi
+
+echo "Enter your Instagram credentials to store in secrets/.env"
+read -rp "Instagram Username: " IGUSER
+read -rsp "Instagram Password: " IGPASS
+echo ""
+
+cat <<EOF > secrets/.env
+INSTAGRAM_USERNAME=$IGUSER
+INSTAGRAM_PASSWORD=$IGPASS
+EOF
+echo "Created secrets/.env with your credentials."
+
+# Build Docker image
+IMAGE_NAME="instagrapi-server"
+echo "Building Docker image '$IMAGE_NAME'..."
+docker build -t "$IMAGE_NAME" .
+
+# Run container
+CONTAINER_NAME="ig-instasrv"
+echo "Running container '$CONTAINER_NAME'..."
+docker run -d \
+  --env-file secrets/.env \
+  -v "$(pwd)/secrets:/app/secrets" \
+  -p 8000:8000 \
+  --name "$CONTAINER_NAME" \
+  "$IMAGE_NAME"
+
+echo "Done! Instagrapi server is running on port 8000."
+echo "Use 'docker logs $CONTAINER_NAME' to view logs."
+echo "Use 'docker stop $CONTAINER_NAME' and 'docker rm $CONTAINER_NAME' to stop/remove the container."
diff --git a/scripts/instagrapi_server/src/instaserver.py b/scripts/instagrapi_server/src/instaserver.py
new file mode 100644
index 0000000..8d5c57b
--- /dev/null
+++ b/scripts/instagrapi_server/src/instaserver.py
@@ -0,0 +1,157 @@
+"""https://subzeroid.github.io/instagrapi/
+
+Run using the following command:
+    uvicorn src.instaserver:app --host 0.0.0.0 --port 8000 --reload
+"""
+
+import logging
+import os
+import sys
+from dotenv import load_dotenv
+
+from fastapi import FastAPI, HTTPException
+from instagrapi import Client
+from instagrapi.exceptions import LoginRequired, BadCredentials
+
+load_dotenv(dotenv_path="secrets/.env")
+logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
+
+INSTAGRAM_USERNAME = os.getenv("INSTAGRAM_USERNAME")
+INSTAGRAM_PASSWORD = os.getenv("INSTAGRAM_PASSWORD")
+SESSION_FILE = "secrets/instagrapi_session.json"
+
+app = FastAPI()
+cl = Client()
+
+
+@app.on_event("startup")
+def startup_event():
+    """Login automatically when server starts"""
+    try:
+        login_instagram()
+    except RuntimeError as e:
+        logging.error(f"API failed to start: {e}")
+        sys.exit(1)
+
+
+def login_instagram():
+    """Ensures Instagrapi is logged in and session is persistent"""
+    if not INSTAGRAM_USERNAME or not INSTAGRAM_PASSWORD:
+        raise RuntimeError("Instagram credentials are missing.")
+
+    if os.path.exists(SESSION_FILE):
+        try:
+            cl.load_settings(SESSION_FILE)
+            cl.get_timeline_feed()
+            logging.info("Using saved session.")
+            return
+        except LoginRequired:
+            logging.info("Session expired. Logging in again...")
+
+    try:
+        cl.login(INSTAGRAM_USERNAME, INSTAGRAM_PASSWORD)
+        cl.dump_settings(SESSION_FILE)
+        logging.info("Login successful, session saved.")
+    except BadCredentials as bc:
+        raise RuntimeError("Incorrect Instagram username or password.") from bc
+    except Exception as e:
+        raise RuntimeError(f"Login failed: {e}") from e
+
+
+@app.get("/v1/media/by/id")
+def get_media_by_id(id: str):
+    """Fetch post details by media ID"""
+    logging.info(f"Fetching media by ID: {id}")
+    try:
+        media = cl.media_info(id)
+        return media.model_dump()
+    except Exception as e:
+        logging.warning(f"Media not found for ID {id}: {e}")
+        raise HTTPException(status_code=404, detail="Post not found") from e
+
+
+@app.get("/v1/media/by/code")
+def get_media_by_code(code: str):
+    """Fetch post details by shortcode"""
+    logging.info(f"Fetching media by shortcode: {code}")
+    try:
+        media_id = cl.media_pk_from_code(code)
+        media = cl.media_info(media_id)
+        return media.model_dump()
+    except Exception as e:
+        logging.warning(f"Media not found for code {code}: {e}")
+        raise HTTPException(status_code=404, detail="Post not found") from e
+
+
+@app.get("/v2/user/tag/medias")
+def get_user_tagged_medias(user_id: str, page_id: str = None):
+    logging.info(f"Fetching tagged medias for user_id={user_id} page_id={page_id}")
+    try:
+        # Placeholder for now
+        items, next_page_id = [], None
+        return {"response": {"items": items}, "next_page_id": next_page_id}
+    except Exception as e:
+        logging.warning(f"Tagged media not found for {user_id}: {e}")
+        raise HTTPException(status_code=404, detail="Tagged media not found") from e
+
+
+@app.get("/v1/user/highlights")
+def get_user_highlights(user_id: str):
+    logging.info(f"Fetching highlights list for user_id={user_id}")
+    try:
+        highlights = cl.user_highlights(user_id)
+        return [h.model_dump() for h in highlights]
+    except Exception as e:
+        logging.warning(f"Highlights not found for {user_id}: {e}")
+        raise HTTPException(status_code=404, detail="No highlights found") from e
+
+
+@app.get("/v2/highlight/by/id")
+def get_highlight_by_id(id: str):
+    logging.info(f"Fetching highlight details for id={id}")
+    try:
+        highlight = cl.highlight_info(id)
+        return {"response": {"reels": {f"highlight:{id}": highlight.model_dump()}}}
+    except Exception as e:
+        logging.warning(f"Highlight not found for id {id}: {e}")
+        raise HTTPException(status_code=404, detail="Highlight not found") from e
+
+
+@app.get("/v1/user/stories/by/username")
+def get_stories(username: str):
+    logging.info(f"Fetching stories for username={username}")
+    try:
+        user_id = cl.user_id_from_username(username)
+        stories = cl.user_stories(user_id)
+        return [story.model_dump() for story in stories]
+    except Exception as e:
+        logging.warning(f"Stories not found for {username}: {e}")
+        raise HTTPException(status_code=404, detail="Stories not found") from e
+
+
+@app.get("/v2/user/by/username")
+def get_user_by_username(username: str):
+    logging.info(f"Fetching user profile for username={username}")
+    try:
+        user = cl.user_info_by_username(username)
+        return {"user": user.model_dump()}
+    except Exception as e:
+        logging.warning(f"User not found: {username}: {e}")
+        raise HTTPException(status_code=404, detail="User not found") from e
+
+
+@app.get("/v1/user/medias/chunk")
+def get_user_medias(user_id: str, end_cursor: str = None):
+    logging.info(f"Fetching paginated medias for user_id={user_id}, end_cursor={end_cursor}")
+    try:
+        posts, next_cursor = cl.user_medias_paginated(user_id, end_cursor=end_cursor)
+        return [[post.model_dump() for post in posts], next_cursor]
+    except Exception as e:
+        logging.warning(f"No posts found for user_id={user_id}: {e}")
+        raise HTTPException(status_code=404, detail="No posts found") from e
+
+
+if __name__ == "__main__":
+    import uvicorn
+
+    uvicorn.run(app, host="0.0.0.0", port=8000)
diff --git a/src/auto_archiver/modules/instagram_api_extractor/__manifest__.py b/src/auto_archiver/modules/instagram_api_extractor/__manifest__.py
index e10bd1e..c40a5d8 100644
--- a/src/auto_archiver/modules/instagram_api_extractor/__manifest__.py
+++ b/src/auto_archiver/modules/instagram_api_extractor/__manifest__.py
@@ -31,9 +31,11 @@
         },
     },
     "description": """
-Archives various types of Instagram content using the Instagrapi API.
+Archives Instagram content using a deployment of the [Instagrapi API](https://subzeroid.github.io/instagrapi/).
 
-Requires setting up an Instagrapi API deployment and providing an access token and API endpoint.
+Requires an access token from a hosted [(paid) service](https://api.instagrapi.com/docs), which you set in the configuration file.
+Alternatively, you can run your own server. We provide a basic script for this, which can be run locally or using Docker.
+For more information, read the [how-to guide](https://auto-archiver.readthedocs.io/en/latest/how_to/run_instagrapi_server.html).
 
 ### Features
 - Connects to an Instagrapi API deployment to fetch Instagram profiles, posts, stories, highlights, reels, and tagged content.