This post was inspired by Geoffrey Huntley’s blog post and the minimal agent tutorial. I redid the code from scratch (using Codex, of course) and adjusted the content for computational scientists and engineers.
Play with the Jupyter notebook on Google Colab
AI coding agents are surprisingly simple to make. The ingredients you need are:
A way to send and receive messages from a good LLM, like GPT-5.4 or Opus 4.6.
A system prompt that establishes a convention of how the LLM gives you back bash commands.
A way to extract bash commands from the LLM response.
A way to run a bash command and read its result.
A way to ask a human for more input.
Then you put these ingredients together as follows:
messages = [system_prompt, user_task]
while True:
response = call_openai(messages)
text = response_text(response)
messages.append(text)
command = extract_bash_command(text)
if command is not None:
result = run_bash(command)
messages.append(format_bash_result(command, result))
else:
user_input = ask_human_for_input()
if user_input is empty:
break
messages.append(user_input)
Let’s look at these ingredients in detail.
The system prompt
The system prompt is persistently staying in the LLM’s context. It cannot be forgotten. Its role is to establish a convention about how the LLM can communicate bash commands to us. Read it.
SYSTEM_PROMPT = """You are a tiny coding agent.
When you need to run a shell command, reply with exactly one fenced
bash block and nothing else.
Example:
```bash
ls
```
When you want the human to respond, reply with plain text and do not
include a bash block.
Keep commands small and relevant to the task.
"""
The other ingredients
All the other ingredients are pretty much trivial. You can find the details in the accompanying Jupyter notebook. For those curious:
- Use a standard API to connect to the LLM of your choice.
- Use regular expressions to extract the bash command (if there is any).
- Use the subprocess module to run the bash command.
This is it, really.
Does it work?
Yes, it does. It’s amazing that writing bash commands and getting feedback on them is sufficient for the LLMs to produce quality code. You will have to go and play with the Jupyter notebook to appreciate the power and the simplicity of this agentic loop.

