Willison Insights

Simon Willison: Using DSPy to Evaluate LLM Prompts — Schema Without Column Names Causes Agent Retry Loops

Friday, July 3, 2026 • Source: Simon Willison's Weblog

Main idea

Willison used DSPy — a framework for systematically evaluating and optimizing AI prompts — to test system prompts for Datasette Agent's SQL feature. Key finding: schema listings that show only table names without column names force the model to guess columns blindly and enter error-retry loops.
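To make the difference concrete, here is a minimal sketch of the two kinds of schema listing, built with Python's standard sqlite3 module. The `schema_listing` helper, the example tables, and the output layout are assumptions for illustration; the source does not show how Datasette Agent formats its listing.

```python
import sqlite3


def schema_listing(conn: sqlite3.Connection, include_columns: bool = True) -> str:
    """Return one line per table, optionally followed by its column names."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    lines = []
    for table in tables:
        if include_columns:
            # PRAGMA table_info returns one row per column; index 1 is the column name.
            quoted = table.replace('"', '""')
            columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{quoted}")')]
            lines.append(f"{table}({', '.join(columns)})")
        else:
            lines.append(table)
    return "\n".join(lines)


# Hypothetical example database, not the one used in the original research.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE releases (id INTEGER PRIMARY KEY, repo TEXT, tag_name TEXT)")
conn.execute("CREATE TABLE repos (id INTEGER PRIMARY KEY, full_name TEXT, stars INTEGER)")

names_only = schema_listing(conn, include_columns=False)
with_columns = schema_listing(conn, include_columns=True)
print(names_only)
print(with_columns)
```

With the names-only listing, a model that wants repository names has nothing to tell it whether the column is `name`, `title`, or `full_name`. The second listing answers that question before the first query is written.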

Context

This builds directly on his work on the llm library and Datasette. Willison delegated the research to Claude Code, which tested candidate prompt improvements with GPT models against a live database, using auto-generated gold-standard datasets and custom metrics.

Why it matters

A concrete, replicable finding with direct impact on SQL agent prompt engineering: explicitly including column names in schema listings dramatically reduces errors and retry loops. The methodology (DSPy + live database + gold datasets) is a template for systematic agent prompt testing.
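One way such a gold-dataset check could look is an execution-based metric: run the gold SQL and the predicted SQL against the same live database and compare the rows. The sketch below is an assumption, not the metric from the original research. The table, the questions, and the "predicted" SQL are invented, and plain `SimpleNamespace` objects stand in for framework objects so the example runs with the standard library alone. The `(example, prediction, trace=None)` argument order mirrors the shape DSPy metrics conventionally take.

```python
import sqlite3
from types import SimpleNamespace


def run_query(conn, sql):
    """Run SQL and return its rows in a stable order, or None if the query fails."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def make_metric(conn):
    """Build a metric bound to one live database connection."""

    def result_match(example, prediction, trace=None):
        # Score 1.0 only when the predicted SQL runs and returns the gold rows.
        gold_rows = run_query(conn, example.gold_sql)
        predicted_rows = run_query(conn, prediction.sql)
        return float(predicted_rows is not None and predicted_rows == gold_rows)

    return result_match


# Hypothetical live database and gold dataset.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE repos (id INTEGER PRIMARY KEY, full_name TEXT, stars INTEGER);
    INSERT INTO repos (full_name, stars) VALUES ('a/one', 10), ('b/two', 25), ('c/three', 5);
    """
)
gold = [
    SimpleNamespace(question="How many repos are there?",
                    gold_sql="SELECT count(*) FROM repos"),
    SimpleNamespace(question="Which repo has the most stars?",
                    gold_sql="SELECT full_name FROM repos ORDER BY stars DESC LIMIT 1"),
]
# Stand-ins for model output: the second query guesses a column that does not exist.
predictions = [
    SimpleNamespace(sql="SELECT count(id) FROM repos"),
    SimpleNamespace(sql="SELECT name FROM repos ORDER BY stars DESC LIMIT 1"),
]

metric = make_metric(conn)
scores = [metric(example, pred) for example, pred in zip(gold, predictions)]
average = sum(scores) / len(scores)
print(scores, average)
```

Comparing result rows rather than SQL text means a differently worded but equivalent query still scores as correct, while a query that fails on a guessed column scores zero.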

Details / arguments

DSPy acted as a testing harness: agents invoked actual Datasette tools against a live database
Problem: the guidance 'do not call describe_table if you already have the information', combined with a schema listing that showed only table names, led the model to guess column names and fall into retry loops
Solution: include column names in the schema listing or modify the guidance
Research delegated to Claude Code, tested via GPT models with auto-generated gold datasets
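The retry loop described in the points above can be illustrated without any model at all. In this sketch a hard-coded list of candidate queries stands in for the agent's successive guesses, and the helper counts how many fail before one succeeds. The `attempts_until_success` function, the table, and the guessed column names are all hypothetical; no LLM or Datasette tool is involved.

```python
import sqlite3


def attempts_until_success(conn, candidate_queries, max_attempts=5):
    """Try queries in order; return (failed_attempts, rows or None)."""
    failures = 0
    for sql in candidate_queries[:max_attempts]:
        try:
            return failures, conn.execute(sql).fetchall()
        except sqlite3.OperationalError:
            # Raised for errors such as "no such column" on a guessed name.
            failures += 1
    return failures, None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repos (id INTEGER PRIMARY KEY, full_name TEXT, stars INTEGER)")
conn.execute("INSERT INTO repos (full_name, stars) VALUES ('simonw/datasette', 100)")

# Without column names in the prompt, the agent can only guess.
blind_guesses = [
    "SELECT name FROM repos",
    "SELECT title FROM repos",
    "SELECT full_name FROM repos",
]
# With column names in the schema listing, the first query can already be right.
informed = ["SELECT full_name FROM repos"]

blind_failures, blind_rows = attempts_until_success(conn, blind_guesses)
informed_failures, informed_rows = attempts_until_success(conn, informed)
print(blind_failures, informed_failures)
```

A failure count like this could also serve as a custom metric alongside answer correctness, since it captures the cost of retries even when the agent eventually reaches the right result.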
