Tool Generation

Anvil’s core capability is generating working Python code from natural language descriptions. This page explains the generation process in detail.

When you call use_tool(), Anvil constructs a prompt for the LLM that includes:

  1. System instructions - Guidelines for generating clean, safe Python code
  2. Your intent - The natural language description of what the tool should do
  3. Documentation - Fetched API docs (if docs_url provided)
  4. Input schema - Parameter definitions (if inputs provided)
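
The assembly of these four parts can be sketched as follows. This is a minimal illustration, not Anvil's actual implementation: the function name `build_prompt`, the section headings, and the `SYSTEM_INSTRUCTIONS` text are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical placeholder; Anvil's real system instructions are not shown on this page.
SYSTEM_INSTRUCTIONS = "Generate clean, safe Python code."


def build_prompt(
    intent: str,
    docs: Optional[str] = None,
    inputs: Optional[list] = None,
) -> str:
    """Join the prompt parts, skipping the optional ones when they are absent."""
    sections = [
        f"## System instructions\n{SYSTEM_INSTRUCTIONS}",
        f"## Intent\n{intent}",
    ]
    if docs:
        sections.append(f"## Documentation\n{docs}")
    if inputs:
        lines = [
            f"- {p['name']} ({p['param_type']}): {p.get('description', '')}"
            for p in inputs
        ]
        sections.append("## Input schema\n" + "\n".join(lines))
    return "\n\n".join(sections)


prompt = build_prompt(
    intent="Get stock price for a ticker symbol",
    inputs=[{"name": "ticker", "param_type": "str", "description": "Ticker symbol"}],
)
print(prompt)
```

Because no documentation is passed here, the resulting prompt contains only the system instructions, the intent, and the input schema.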

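The quality of generated code depends heavily on your intent description. In the examples below, the first three intents are specific enough to generate useful code; the last two are too vague.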

# Specific and actionable
anvil.use_tool(
    name="search_notion",
    intent="Search a Notion workspace for pages matching a query using the official API"
)

# Includes key details
anvil.use_tool(
    name="send_slack",
    intent="Send a message to a Slack channel using the Web API, handling rate limits"
)

# Specifies return format
anvil.use_tool(
    name="get_stock",
    intent="Get stock price for a ticker symbol, returning price and daily change as a dict"
)

# Too vague
anvil.use_tool(name="search", intent="Search stuff")

# Missing context
anvil.use_tool(name="send", intent="Send a message")

Providing a docs_url significantly improves code quality:

anvil.use_tool(
    name="create_github_issue",
    intent="Create a new issue in a GitHub repository",
    docs_url="https://docs.github.com/en/rest/issues/issues#create-an-issue"
)

Anvil will:

  1. Fetch the documentation page using FireCrawl
  2. Extract relevant API details
  3. Include them in the generation prompt

This gives the LLM accurate endpoint URLs, authentication methods, and request/response formats.

For complex tools, explicitly define inputs:

from anvil import Anvil, InputParam

anvil.use_tool(
    name="create_calendar_event",
    intent="Create a Google Calendar event",
    inputs=[
        InputParam(
            name="title",
            param_type="str",
            required=True,
            description="Event title"
        ),
        InputParam(
            name="start_time",
            param_type="str",
            required=True,
            description="Start time in ISO 8601 format"
        ),
        InputParam(
            name="duration_minutes",
            param_type="int",
            required=False,
            default=60,
            description="Event duration in minutes"
        ),
        InputParam(
            name="attendees",
            param_type="list",
            required=False,
            default=[],
            description="List of attendee email addresses"
        ),
    ]
)

Supported param_type values:

Type     Python Type   Example
str      str           "hello"
int      int           42
float    float         3.14
bool     bool          True
list     list          ["a", "b"]
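
To illustrate how these type names might be used, here is a sketch that checks call arguments against a list of parameter definitions and fills in defaults. `ParamSpec` is a hypothetical stand-in for `InputParam` so the example runs without Anvil installed, and `validate_inputs` is an assumed helper, not part of Anvil's documented API.

```python
import copy
from dataclasses import dataclass
from typing import Any

# Maps the type names listed above to the Python types they denote.
TYPE_MAP = {"str": str, "int": int, "float": float, "bool": bool, "list": list}


@dataclass
class ParamSpec:
    """Hypothetical stand-in for InputParam."""
    name: str
    param_type: str
    required: bool = True
    default: Any = None


def validate_inputs(specs: list, values: dict) -> dict:
    """Return the values with defaults filled in; raise on missing or mistyped inputs."""
    checked = {}
    for spec in specs:
        if spec.name not in values:
            if spec.required:
                raise ValueError(f"missing required input: {spec.name}")
            # Copy the default so a mutable default (such as []) is never shared.
            checked[spec.name] = copy.deepcopy(spec.default)
            continue
        value = values[spec.name]
        expected = TYPE_MAP[spec.param_type]
        # bool is a subclass of int in Python, so reject it unless bool is expected.
        if isinstance(value, bool) and expected is not bool:
            raise TypeError(f"{spec.name} must be {spec.param_type}, got bool")
        if not isinstance(value, expected):
            raise TypeError(
                f"{spec.name} must be {spec.param_type}, got {type(value).__name__}"
            )
        checked[spec.name] = value
    return checked


specs = [
    ParamSpec("title", "str"),
    ParamSpec("duration_minutes", "int", required=False, default=60),
    ParamSpec("attendees", "list", required=False, default=[]),
]
checked = validate_inputs(specs, {"title": "Standup"})
print(checked)  # {'title': 'Standup', 'duration_minutes': 60, 'attendees': []}
```

The check is deliberately strict: an int passed where a float is declared is rejected. Whether Anvil coerces such values is not stated here, so treat that choice as an assumption of the sketch.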

Anvil regenerates tool code when:

  1. Intent changes - The hash of your intent differs from the cached version
  2. Force flag - You explicitly request regeneration
  3. Self-healing - A tool execution failed

# Force regeneration by changing intent
anvil.use_tool(
    name="search",
    intent="Search the web with advanced filters"  # Different from before
)

# Or clear the cache
# $ anvil clean
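
The three regeneration conditions can be expressed as a single check. The sketch below is illustrative only: the function names, the choice of SHA-256, and the `force` and `last_run_failed` parameters are assumptions, since this page does not specify how Anvil hashes intents or stores the cached value.

```python
import hashlib
from typing import Optional


def intent_hash(intent: str) -> str:
    """Hash the intent text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(intent.encode("utf-8")).hexdigest()


def needs_regeneration(
    intent: str,
    cached_hash: Optional[str],
    force: bool = False,
    last_run_failed: bool = False,
) -> bool:
    """Return True if any of the three regeneration conditions holds."""
    if force or last_run_failed:
        return True
    # No cached hash means the tool has never been generated.
    return cached_hash is None or intent_hash(intent) != cached_hash


cached = intent_hash("Search the web")
print(needs_regeneration("Search the web", cached))                        # False
print(needs_regeneration("Search the web with advanced filters", cached))  # True
```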

Each regeneration increments the version:

Version 1.0 → Initial generation
Version 1.1 → After self-healing fix
Version 1.2 → Another fix
Version 2.0 → Major intent change

Version history is tracked in the tool registry.
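
The numbering scheme above suggests a minor bump for self-healing fixes and a major bump for intent changes. A minimal sketch of that rule follows; the `bump_version` function and its `reason` strings are hypothetical names, not Anvil's API, and how a forced regeneration is numbered is not stated here, so the sketch rejects any other reason.

```python
def bump_version(version: str, reason: str) -> str:
    """Return the next version string for a 'major.minor' version."""
    major, minor = (int(part) for part in version.split("."))
    if reason == "intent_change":
        # A changed intent starts a new major version.
        return f"{major + 1}.0"
    if reason == "self_heal":
        # A self-healing fix increments the minor version.
        return f"{major}.{minor + 1}"
    raise ValueError(f"unknown regeneration reason: {reason}")


version = "1.0"
for reason in ["self_heal", "self_heal", "intent_change"]:
    version = bump_version(version, reason)
    print(version)  # 1.1, then 1.2, then 2.0
```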

Anvil supports multiple LLM providers for generation:

from anvil import Anvil

# Use Claude (default)
anvil = Anvil(provider="anthropic")
# Use GPT-4o
anvil = Anvil(provider="openai", model="gpt-4o")
# Use Grok
anvil = Anvil(provider="grok")

Each provider has default models:

  • Anthropic: claude-sonnet-4-20250514
  • OpenAI: gpt-4o
  • Grok: grok-2-latest
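
One way to picture how a provider name resolves to a model is a simple lookup with an optional override. The model names below come from the list above; the `resolve_model` helper itself is a hypothetical illustration rather than Anvil's code.

```python
from typing import Optional

# Default models per provider, as listed above.
DEFAULT_MODELS = {
    "anthropic": "claude-sonnet-4-20250514",
    "openai": "gpt-4o",
    "grok": "grok-2-latest",
}


def resolve_model(provider: str = "anthropic", model: Optional[str] = None) -> str:
    """Return the explicit model if given, else the provider's default."""
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unsupported provider: {provider}")
    return model or DEFAULT_MODELS[provider]


print(resolve_model())                            # claude-sonnet-4-20250514
print(resolve_model("openai", model="gpt-4o"))    # gpt-4o
print(resolve_model("grok"))                      # grok-2-latest
```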

Generate placeholder tools without LLM calls:

anvil = Anvil(use_stub=True)
# Returns a stub function that returns mock data
tool = anvil.use_tool(name="test", intent="Test tool")
result = tool.run() # {'status': 'stub', 'intent': 'Test tool'}

Useful for:

  • Testing without API keys
  • CI/CD pipelines
  • Development without incurring LLM costs
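
Conceptually, a stub tool only needs to echo its intent back in a fixed shape. The class below is a hypothetical sketch that reproduces the return value shown earlier; Anvil's actual stub implementation may differ.

```python
class StubTool:
    """Hypothetical stand-in for the object returned by use_tool() in stub mode."""

    def __init__(self, name: str, intent: str) -> None:
        self.name = name
        self.intent = intent

    def run(self, **kwargs) -> dict:
        # Arguments are accepted but ignored; no LLM call or network access happens.
        return {"status": "stub", "intent": self.intent}


tool = StubTool(name="test", intent="Test tool")
result = tool.run()
print(result)  # {'status': 'stub', 'intent': 'Test tool'}
```

A test can then assert on `result["status"] == "stub"` to confirm that no real generation took place.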