Janik von Rotz



Working with LLM agents

In the last few days I tinkered with Cortecs and OpenCode. It seems LLM agents are here to stay and I need to deal with them. I admit that I am very skeptical of the AI hype. However, I see a lot of benefits when working with LLM agents in a controlled environment.

People have built great tools and practices. There are many providers that host LLMs or route between multiple providers. The ecosystem has matured, and this led me to explore further and put in more effort to understand what is going on.

The goal of this exploration was to find a new coding workflow that includes an LLM agent and fits well into my existing workflow.

As the good engineer that I am, I first wrote down some requirements for the workflow.

Requirements

No unnecessary features

The workflow should not encourage me to build more stuff, but to fix issues and improve features.

Code is technical debt and the theory of code is cognitive debt. Building a lot of features and not using them produces waste and noise.

No digressions

In my experience, if the agent does not satisfy the requirements, you start explaining the issue over and over again. You often end up in an argument, and the LLM always promises to do better.

This kind of arguing and guessing can be a huge waste of time and defeats the very reason for using an agent. Instead of having an ongoing conversation and a bloated context, it is often easier to start over.

And thus the workflow should encourage me to abort and refine my initial prompt.

Spec-driven prompts

From experience, it is worth writing an elaborate prompt. The initial prompt is a specification of the new feature or the bug to resolve.

Before starting an agent session, a well-defined prompt must already exist.

Quick bootstrap

I have many software projects, each with a different layout and framework. Before letting an agent go rogue in the repo, I want to provide clear guidelines.

The workflow should make it easy to provide the necessary context for very different projects.

Task file integration

This is the most important requirement. I manage all my software projects with taskfile.build. It is a standard for a bash task script and also a library of commands. It allows me to manage and bootstrap very different software projects.

Whatever workflow I am going to build, it must be callable from a task file.
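
For readers who have not seen taskfile.build: the core idea is a single bash script, conventionally named `task`, in which every command is a function and the first argument selects which function to run. A minimal sketch of that pattern (my reconstruction, not the official template) looks roughly like this:

```bash
#!/usr/bin/env bash
# Minimal sketch of the function-per-command task file pattern.
# My reconstruction; the official template at taskfile.build differs.
set -euo pipefail

init-project() { # bootstrap the files the agent workflow needs
    mkdir -p prompts
    touch AGENTS.md
}

help() { # list all available commands
    compgen -A function | sed 's/^/task /'
}

"${@:-help}" # dispatch: run the function named by the first argument
```

Because every workflow step is just another function in this script, anything built on top of it is automatically callable as `task <command>`.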

Workflow

Now let me show you what the coding workflow looks like.

After iterating on many ideas, practices, scripts, and commands, I ended up with a three-step workflow:

  1. Setup: Prepare the project to run an LLM agent
  2. Run: Create and run a prompt
  3. Finish: Summarize and commit the changes

The commands I use in my workflow are imported from https://taskfile.build/library/#llm.

Setup

First, I enter a new or existing project and run the `init-project` command:

```bash
task init-project
```

This command creates several files and folders, which I update according to the context of the project.

Run

I create a new prompt file with this command:

```bash
task create-prompt "Add prompt commands"
```

This creates the file `prompts/08_add-prompt-commands.md` and opens it in the default editor. Then I update the task section with the specification.

Here is an example of what the prompt file might look like before being executed:

````markdown
---
title: Add prompt commands
---

# Run 08

Replace the `==` marked instructions in this file while you work on the task.

## Context

Read the `AGENTS.md` and `README.md` to get an understanding of the project.

## Task

Add two new commands `create-prompt` and `list-prompt` to the `bin` folder.

The first command asks for a title. The user is expected to enter something like `Title of the prompt`.
Then the command creates a file `prompts/NN_title-of-the-prompt.md` from a template `prompt.md.template`. It writes the title of the prompt into the frontmatter.
Once the file is created it uses the `$EDITOR` to open the file.
The number sequence starts at 01 and continues upwards.

Similar to `list-dotenv`, the `list-prompt` command lists all prompts. It shows a simple table:

```markdown
| ID  | title               |
| --- | ------------------- |
| 08  | Add prompt commands |
```

The title is retrieved from the frontmatter. The table width should be dynamic.

Add the new commands to the `library.md` in the LLM section. Update the `task.template` with `task update-template`.

## Worklog

==Fill this in as you work on the task==

## Summary

==Fill this in once you have completed the task==
````
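
The spec above is concrete enough to sketch what the agent might come up with. The following is my own reconstruction, not the code the agent actually produced; in particular, the `{{title}}` placeholder in `prompt.md.template` and the plain `title:` frontmatter line are assumptions:

```bash
# Hedged sketch of the two commands, not the agent's actual output.
create-prompt() {
    local title="${1:-}"
    [ -n "$title" ] || read -rp "Title: " title
    # next two-digit sequence number, derived from the existing prompt files
    local id
    id=$(printf '%02d' $(($(find prompts -name '[0-9][0-9]_*.md' | wc -l) + 1)))
    # slugify the title: lowercase, runs of non-alphanumerics become hyphens
    local slug
    slug=$(printf '%s' "$title" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | sed 's/-$//')
    local file="prompts/${id}_${slug}.md"
    # assumed template placeholder; the real template may look different
    sed "s/{{title}}/$title/" prompt.md.template > "$file"
    "${EDITOR:-vi}" "$file"
}

list-prompt() {
    local file title width=5
    # first pass: find the longest title so the table width is dynamic
    for file in prompts/[0-9][0-9]_*.md; do
        title=$(sed -n 's/^title: *//p' "$file" | head -n1)
        [ "${#title}" -gt "$width" ] && width=${#title}
    done
    printf '| %-3s | %-*s |\n' 'ID' "$width" 'title'
    printf '| --- | %s |\n' "$(printf -- '-%.0s' $(seq "$width"))"
    # second pass: one row per prompt, ID taken from the file name prefix
    for file in prompts/[0-9][0-9]_*.md; do
        title=$(sed -n 's/^title: *//p' "$file" | head -n1)
        printf '| %-3s | %-*s |\n' "$(basename "$file" | cut -c1-2)" "$width" "$title"
    done
}
```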

To execute the prompt, I launch OpenCode in the root of the project and simply paste the path to the prompt file. OpenCode then starts working on the task.

If it does not follow the instructions, I either create a new session or give it some manual input.
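
Pasting the path works fine in the interactive session. If you ever want this step to be scriptable from the task file too, a small wrapper might do; I am assuming a non-interactive `run` subcommand here, which OpenCode provides at the time of writing, but verify against `opencode --help`:

```bash
run-prompt() { # hypothetical helper, not part of the taskfile.build library
    local prompt="$1"
    # assumes OpenCode's non-interactive `run` subcommand; verify the flags
    opencode run "Work on the task described in $prompt"
}
```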

Finish

After a few seconds, the task is completed. The worklog and summary sections should have been filled in by the agent.

It is crucial not to blindly trust the completeness of your spec or the output of the LLM, so I check the git diff. If something seems off, I either update the prompt or make manual changes.

If I am happy with the result, I stage the changes and run `task update-changelog-with-llm`. This will update the `CHANGELOG.md` according to the Keep a Changelog specification.

To commit the changes, I use `task commit-with-llm`. This command generates the commit message from the git diff according to the Conventional Commits specification.
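
I have not looked inside these two library commands, so what follows is only the rough idea as I understand it: pipe the staged diff into a model and use the answer as the commit message. The `llm` CLI is a stand-in for whatever backend the library actually calls:

```bash
commit-with-llm() { # sketch of the idea, not the library's actual code
    local msg
    # `llm` is a stand-in; swap in whatever model CLI you actually use
    msg=$(git diff --cached | llm -s 'Write a one-line commit message for this diff following the Conventional Commits specification. Output only the message.')
    git commit -e -m "$msg" # -e opens the editor so the message can be reviewed
}
```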

And that’s it!

Afterthoughts

This workflow works well for trivial and well-defined tasks. Once you try more complex tasks and multiple agents, the quality of the project erodes pretty fast.

Some researchers have demonstrated that LLM agents can create an entire browser within a week. Browser engines are well specified, but building one normally takes a very long time. The result met all the requirements, but was very slow and inefficient.

I believe that the AI hype is a challenge for the entire software industry. It’s not just salespeople who are hyping AI, but also engineers who should know better.

Until the bubble pops, I’ll try my best.

Categories: Large Language Model
Tags: llm, augmented, artificial, intelligence