mirror of https://git.sr.ht/~edwardloveall/scribe
Add a bunch of well-known, LLM scrapers to robots.txt
Unknown if this will actually stop them, but at least I can show my intent. User agents sourced from https://darkvisitors.com/main
parent
41b391e22c
commit
6ccea391ed
@@ -1,5 +1,6 @@
 Unreleased
 
+* Add a bunch of well-known, LLM scrapers to robots.txt
 * Add command to tag releases
 * Modernize nix config
 * Added scribe.manasiwibi.com instance
@@ -1,4 +1,55 @@
-# Learn more about robots.txt: https://www.robotstxt.org/robotstxt.html
-User-agent: *
-# 'Disallow' with an empty value allows all paths to be crawled
-Disallow:
+# ChatGPT-User
+User-agent: ChatGPT-User
+Disallow: /
+
+# cohere-ai
+User-agent: cohere-ai
+Disallow: /
+
+# anthropic-ai
+User-agent: anthropic-ai
+Disallow: /
+
+# Bytespider
+User-agent: Bytespider
+Disallow: /
+
+# CCBot
+User-agent: CCBot
+Disallow: /
+
+# FacebookBot
+User-agent: FacebookBot
+Disallow: /
+
+# Google-Extended
+User-agent: Google-Extended
+Disallow: /
+
+# GPTBot
+User-agent: GPTBot
+Disallow: /
+
+# omgili
+User-agent: omgili
+Disallow: /
+
+# Amazonbot
+User-agent: Amazonbot
+Disallow: /
+
+# Applebot
+User-agent: Applebot
+Disallow: /
+
+# PerplexityBot
+User-agent: PerplexityBot
+Disallow: /
+
+# PerplexityBot
+User-agent: PerplexityBot
+Disallow: /
+
+# YouBot
+User-agent: YouBot
+Disallow: /
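The new rules can be sanity-checked locally. What follows is a minimal sketch, not part of the commit, assuming Python's standard-library `urllib.robotparser` and using only a three-group excerpt of the file. The helper `is_allowed` and the agent name `ExampleBrowser` are hypothetical, introduced for illustration. It shows how a compliant parser reads the file: listed agents are denied every path, while an agent with no matching group is allowed, since the old `User-agent: *` group was removed and nothing replaces it. As the commit message says, whether the scrapers actually honor these rules is unknown.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of the robots.txt added in this commit (three of the groups).
ROBOTS_TXT = """\
# GPTBot
User-agent: GPTBot
Disallow: /

# CCBot
User-agent: CCBot
Disallow: /

# PerplexityBot
User-agent: PerplexityBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Return True if a compliant crawler with this user agent may fetch path.

    Hypothetical helper: parses the given robots.txt text in memory
    instead of fetching it from a server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # "ExampleBrowser" is a made-up agent that matches no group.
    for agent in ("GPTBot", "CCBot", "PerplexityBot", "ExampleBrowser"):
        print(agent, is_allowed(ROBOTS_TXT, agent))
```

Running the same check against the full file would only require loading its text in place of the excerpt.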