mirror of https://github.com/thinkst/zippy
Add action badge to README
parent
ae5f458b5f
commit
0d1a1053f2
@@ -1,5 +1,7 @@
# ai-detect: Fast methods to classify text as AI or human-generated

[![Classification accuracy testing](https://github.com/Tail-Pipe/ai-detect/actions/workflows/pytest.yml/badge.svg)](https://github.com/Tail-Pipe/ai-detect/actions/workflows/pytest.yml)

This is a research repo for fast AI detection methods as we experiment with different techniques.
While there are a number of existing LLM detection systems, they all use a large model trained on either an LLM or
its training data to calculate the probability of each word given the preceding words, then calculate a score where
@@ -16,4 +18,4 @@ are all dynamic (though influenced by the 'preset' of 0-9--with 0 being the fast
is to 'seed' an LZMA compression stream with a corpus of AI-generated text (`ai-generated.txt`) and then compare the compression ratio of
just the seed data with that of the seed with the sample appended. Samples that follow the seed more closely in word choice, structure, etc. will achieve a higher
compression ratio due to the prevalence of similar tokens in the dictionary; novel words, structures, etc. will appear anomalous to the seeded
dictionary, resulting in a worse compression ratio.
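
The comparison described above can be sketched in a few lines of Python using the standard `lzma` module. This is a minimal illustration, not the repo's actual code: the function names `compression_ratio` and `seeded_delta`, the default preset of 6, and the definition of ratio as raw size over compressed size are all assumptions made for the example.

```python
import lzma


def compression_ratio(data: bytes, preset: int = 6) -> float:
    """Raw size divided by LZMA-compressed size (higher means more compressible)."""
    return len(data) / len(lzma.compress(data, preset=preset))


def seeded_delta(seed: str, sample: str, preset: int = 6) -> float:
    """Change in compression ratio when `sample` is appended to `seed`.

    Hypothetical helper for illustration: a larger delta means the sample
    compressed well against the seed, i.e. it resembles the seed text.
    """
    seed_bytes = seed.encode("utf-8")
    combined = (seed + "\n" + sample).encode("utf-8")
    # Compare the ratio of seed + sample with the ratio of the seed alone.
    return compression_ratio(combined, preset) - compression_ratio(seed_bytes, preset)
```

With the contents of `ai-generated.txt` passed as `seed`, a sample that reuses the seed's vocabulary and phrasing should yield a larger delta than a sample full of material the seed has never seen. Where to put a decision threshold is left open here; it would need to be tuned on labeled data.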