I made a minimal AI coding agent in a Jupyter notebook

This post was inspired by Geoffrey Huntley’s blog post and the minimal agent tutorial. I redid the code from scratch (using Codex, of course) and adjusted the content for computational scientists and engineers.

Play with the Jupyter notebook on Google Colab

AI coding agents are surprisingly simple to make. The ingredients you need are:

  • A way to send messages to and receive responses from a good LLM, like GPT-5.4 or Opus 4.6.

  • A system prompt that establishes a convention for how the LLM gives you back bash commands.

  • A way to extract bash commands from the LLM response.

  • A way to run a bash command and read its result.

  • A way to ask a human for more input.

Then you put these ingredients together as follows:

				
```python
messages = [system_prompt, user_task]

while True:
    response = call_openai(messages)
    text = response_text(response)
    messages.append(text)

    command = extract_bash_command(text)

    if command is not None:
        result = run_bash(command)
        messages.append(format_bash_result(command, result))
    else:
        user_input = ask_human_for_input()

        if not user_input:
            break

        messages.append(user_input)
```

Let’s look at these ingredients in detail.

The system prompt

The system prompt stays in the LLM's context for the entire conversation, so it cannot be forgotten. Its role is to establish the convention by which the LLM communicates bash commands to us. Read it:

				
					SYSTEM_PROMPT = """You are a tiny coding agent.

When you need to run a shell command, reply with exactly one fenced
bash block and nothing else.

Example:
```bash
ls
```

When you want the human to respond, reply with plain text and do not
include a bash block.
Keep commands small and relevant to the task.
"""
				
			
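Given this convention, pulling a command out of a response is a small regular-expression exercise. Here is a minimal sketch (the function name `extract_bash_command` matches the loop above; the pattern itself is one reasonable choice, not necessarily the notebook's exact code):

```python
import re

# Matches the first fenced bash block, as required by the system prompt.
BASH_BLOCK = re.compile(r"```bash\n(.*?)\n```", re.DOTALL)

def extract_bash_command(text):
    """Return the command inside the first ```bash fence, or None."""
    match = BASH_BLOCK.search(text)
    return match.group(1).strip() if match else None
```

If the response has no bash block, the function returns `None`, which is exactly the signal the loop uses to hand control back to the human.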

The other ingredients

All the other ingredients are pretty much trivial. You can find the details in the accompanying Jupyter notebook. For those curious:

  • Use a standard API to connect to the LLM of your choice.
  • Use regular expressions to extract the bash command (if there is any).
  • Use the subprocess module to run the bash command.
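As a sketch of the last bullet (the names `run_bash` and `format_bash_result` match the loop above; the timeout and the output format are my own assumptions, not the notebook's exact code):

```python
import subprocess

def run_bash(command, timeout=60):
    """Run a bash command and capture its output and exit code."""
    return subprocess.run(
        ["bash", "-c", command],
        capture_output=True, text=True, timeout=timeout,
    )

def format_bash_result(command, result):
    """Summarize the command's outcome so the LLM can read it."""
    return (
        f"$ {command}\n"
        f"exit code: {result.returncode}\n"
        f"stdout:\n{result.stdout}\n"
        f"stderr:\n{result.stderr}"
    )
```

Feeding the exit code and both output streams back to the LLM is what closes the loop: the model sees exactly what a human would see in a terminal.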

This is it, really.

Does it work?

Yes, it does. It's amazing that issuing bash commands and getting feedback on them is enough for an LLM to produce quality code. You will have to go play with the Jupyter notebook to appreciate the power and simplicity of this agentic loop.

Ilias Bilionis

Purdue professor. I build AI agents that bring rigorous scientific reasoning to every scientist. Inverse problems, uncertainty, Bayesian inference, experimental design. 3 books and 3 courses on SciML.