Wolverine

About

Give your Python scripts regenerative healing abilities!

Run your scripts with Wolverine, and when they crash, GPT-4 edits them and explains what went wrong. Even if you have many bugs, it will rerun the script repeatedly until everything is fixed.
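Conceptually, the loop looks like the sketch below. This is an illustration only, not wolverine.py's actual code; send_error_to_gpt and apply_edits are hypothetical stand-ins for its helpers:

import subprocess
import sys

def send_error_to_gpt(script_path, traceback_text):
    # Placeholder: in the real tool this asks the model for structured edits.
    raise NotImplementedError

def apply_edits(script_path, edits):
    # Placeholder: in the real tool this rewrites the file in place.
    raise NotImplementedError

def run_with_healing(script, *args):
    # Rerun the script until it exits cleanly, patching after each crash.
    while True:
        result = subprocess.run(
            [sys.executable, script, *args],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            print(result.stdout)
            break  # clean exit: nothing left to heal
        # Feed the traceback to the model and apply its suggested edits.
        edits = send_error_to_gpt(script, result.stderr)
        apply_edits(script, edits)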

For a quick demonstration, see my demo video on Twitter.

Setup

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.sample .env

Add your OpenAI API key to .env.
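For example (the variable name below assumes the key name used in .env.sample; the value is a placeholder):

OPENAI_API_KEY=your-api-key-here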

Warning! By default this uses GPT-4 and may make many repeated calls to the API.

Example Usage

To run with gpt-4 (the default, tested option):

python wolverine.py buggy_script.py "subtract" 20 3
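The arguments after the script name are passed through to the script itself. buggy_script.py ships with the repo; as a stand-in, a script of the following shape would work with the command above (an illustrative example, not the repo's actual file):

import sys

def subtract(a, b):
    # Deliberate bug: sys.argv values are strings, so this raises a
    # TypeError at runtime, which gives Wolverine something to fix.
    return a - b

if __name__ == "__main__":
    func, x, y = sys.argv[1], sys.argv[2], sys.argv[3]
    print(globals()[func](x, y))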

You can also run with other models, but be warned they may not adhere to the edit format as well:

python wolverine.py --model=gpt-3.5-turbo buggy_script.py "subtract" 20 3

If you want to use GPT-3.5 by default instead of GPT-4, uncomment the DEFAULT_MODEL line in .env:

DEFAULT_MODEL=gpt-3.5-turbo
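Since python-dotenv is in requirements.txt, wolverine.py presumably picks this setting up at startup. The standard pattern looks like this (a sketch, assuming the variable names above and a gpt-4 fallback):

import os
from dotenv import load_dotenv

load_dotenv()  # read key/value pairs from .env into the process environment
DEFAULT_MODEL = os.environ.get("DEFAULT_MODEL", "gpt-4")  # fall back to GPT-4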

Future Plans

This is just a quick prototype I threw together in a few hours. There are many possible extensions, and contributions are welcome:

  • add flags to customize usage, such as asking for user confirmation before running changed code
  • further iterations on the edit format that GPT responds in. Currently it struggles a bit with indentation, but I'm sure that can be improved (see the sketch after this list for one plausible shape)
  • a suite of example buggy files that we can test prompts on to ensure reliability and measure improvement
  • multiple files / codebases: send GPT everything that appears in the stacktrace
  • graceful handling of large files - should we just send GPT relevant classes / functions?
  • extension to languages other than python
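For illustration, here is one plausible shape for such an edit format, along with code that applies it. This is hypothetical: the real format is defined in prompt.txt and may differ.

import json

# Hypothetical edit format: the model returns a JSON list of line-level
# operations. Field names here are invented for illustration.
example_response = json.dumps([
    {"operation": "Replace", "line": 4, "content": "    return a - b"},
    {"operation": "Delete", "line": 9, "content": ""},
])

def apply_edits(path, response):
    with open(path) as f:
        lines = f.read().splitlines()
    # Apply edits bottom-up so earlier line numbers stay valid.
    for edit in sorted(json.loads(response), key=lambda e: e["line"], reverse=True):
        i = edit["line"] - 1
        if edit["operation"] == "Replace":
            lines[i] = edit["content"]
        elif edit["operation"] == "Delete":
            del lines[i]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")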

Star History

[Star History Chart]