
Simon Willison: Using DSPy to Evaluate LLM Prompts — Schema Without Column Names Causes Agent Retry Loops

Friday, July 3, 2026. Source: Simon Willison's Weblog

Main idea

Willison used DSPy — a framework for systematically evaluating and optimizing AI prompts — to test system prompts for Datasette Agent's SQL feature. Key finding: schema listings that show only table names without column names force the model to guess columns blindly and enter error-retry loops.

Context

The experiment builds directly on his work on the llm library and Datasette. Willison delegated the research to Claude Code, which tested prompt improvements with GPT models against a live database, using auto-generated gold-standard datasets and custom metrics.
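To make the "gold dataset plus custom metric" idea concrete, here is a minimal sketch of an execution-match metric. The summary does not say which metrics were actually used, so everything here is an assumption for illustration: the `gold_sql` and `sql` field names, the `make_metric` helper, and the toy `repos` table are all hypothetical. The `(example, pred, trace=None)` signature mirrors the shape DSPy expects for metric functions, but the sketch uses only the standard library, so it runs without DSPy or a model.

```python
import sqlite3
from types import SimpleNamespace


def rows_for(conn, sql):
    """Run a query and return its rows in a stable order, or None if it fails."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def make_metric(conn):
    """Build a metric bound to one database connection (hypothetical helper)."""
    def execution_match(example, pred, trace=None):
        # Score 1.0 only when the predicted SQL runs and returns the gold rows.
        gold = rows_for(conn, example.gold_sql)
        got = rows_for(conn, pred.sql)
        return 1.0 if got is not None and got == gold else 0.0
    return execution_match


# Toy database standing in for the live one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repos (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO repos (name) VALUES (?)", [("datasette",), ("llm",)])

metric = make_metric(conn)
example = SimpleNamespace(question="How many repos?",
                          gold_sql="SELECT count(*) FROM repos")
good = SimpleNamespace(sql="SELECT count(id) FROM repos")
guessed = SimpleNamespace(sql="SELECT count(repo_name) FROM repos")  # guessed column

scores = [metric(example, good), metric(example, guessed)]
print(scores)
```

Comparing result rows rather than SQL text lets differently worded but equivalent queries pass, while a query built on a guessed column name fails outright.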

Why it matters

A concrete, replicable finding with direct impact on SQL agent prompt engineering: explicitly including column names in schema listings removes the blind column guessing, and the resulting errors and retry loops, that table-only listings cause. The methodology (DSPy + live database + gold datasets) is a template for systematic agent prompt testing.

Details / arguments

  • DSPy acted as a testing harness: agents invoked actual Datasette tools against a live database
  • Problem: the guidance 'do not call describe_table if you already have the information', combined with a schema listing that showed only table names, led to column-name guessing and retry loops
  • Solution: include column names in the schema listing or modify the guidance
  • Research delegated to Claude Code, tested via GPT models with auto-generated gold datasets
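The difference between the two kinds of schema listing can be sketched with SQLite's own metadata. This is an illustrative sketch only, not Datasette Agent's actual implementation: the `schema_listing` function and the example tables are hypothetical, and it relies only on `sqlite_master` and `PRAGMA table_info` from SQLite itself.

```python
import sqlite3


def schema_listing(conn, include_columns=True):
    """Return one line per table, optionally with column names and types."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    lines = []
    for table in tables:
        if not include_columns:
            lines.append(table)
            continue
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        quoted = table.replace('"', '""')
        cols = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        described = ", ".join(f"{c[1]} {c[2]}".strip() for c in cols)
        lines.append(f"{table}({described})")
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE releases (id INTEGER PRIMARY KEY, repo TEXT, tag TEXT)")
conn.execute("CREATE TABLE repos (id INTEGER PRIMARY KEY, name TEXT)")

names_only = schema_listing(conn, include_columns=False)
with_columns = schema_listing(conn)
print(with_columns)
```

With the names-only listing, a model told not to call describe_table has nothing to go on except the table names. The listing with columns gives it what it needs to write a valid query on the first attempt.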