Documentation
Everything you need to build production-grade AI agents with GoGrid.
Getting Started
GoGrid (G2) is a unified system for developing and orchestrating production AI agents in Go. It supports five composable workload types: Single Agent, Team, Pipeline, Graph, and Dynamic Orchestration.
Installation
go get github.com/lonestarx1/gogrid
Project Structure
gogrid/
├── pkg/
│   ├── agent/            # Agent creation and execution
│   ├── llm/              # LLM provider interfaces and implementations
│   │   ├── openai/       # OpenAI provider
│   │   ├── anthropic/    # Anthropic provider
│   │   ├── gemini/       # Google Gemini provider
│   │   └── mock/         # Mock provider for testing
│   ├── memory/           # Memory interfaces and implementations
│   │   ├── file/         # File-backed memory
│   │   ├── shared/       # Shared memory for teams
│   │   └── transfer/     # Transferable state for pipelines
│   ├── tool/             # Tool interface and registry
│   ├── trace/            # Tracing and observability
│   │   ├── otel/         # OTLP JSON exporter
│   │   ├── log/          # Structured JSON logging
│   │   └── metrics/      # Prometheus-compatible metrics
│   ├── cost/             # Cost tracking, budgets, and governance
│   ├── eval/             # Evaluation framework and benchmarks
│   │   └── bench/        # Performance benchmarks
│   └── orchestrator/
│       ├── team/         # Team (chat room) orchestrator
│       ├── pipeline/     # Pipeline (linear) orchestrator
│       ├── graph/        # Graph orchestrator
│       └── dynamic/      # Dynamic orchestration runtime
├── internal/
│   ├── id/               # ID generation
│   ├── config/           # YAML configuration loading
│   ├── runrecord/        # Run record persistence
│   └── cli/              # CLI implementation
│       └── templates/    # Project scaffolding templates
└── cmd/
    └── gogrid/           # CLI entry point
CLI
The gogrid CLI is the primary interface for working with GoGrid projects. Define agents declaratively in gogrid.yaml, run them from the command line, and inspect execution traces and costs — all without writing Go code.
Installation
Build the CLI from source. The binary is written to bin/gogrid.
git clone https://github.com/lonestarx1/gogrid.git
cd gogrid
make build
# Verify the installation
bin/gogrid version
# gogrid dev (darwin/arm64, go1.25.4)
# Optional: embed a version string
make build VERSION=1.0.0
Quick Start
Scaffold a project, set an API key, and run your first agent in under a minute.
# Scaffold a new project
gogrid init --template single my-agent
cd my-agent
# Set up the project
go mod init github.com/example/my-agent
go mod tidy
export OPENAI_API_KEY=sk-proj-...
# Run your agent
gogrid run assistant -input "Explain Go's concurrency model"
Project Templates
Three templates are available for gogrid init, each generating a working project with gogrid.yaml, main.go, Makefile, and README.md.
single
A single agent with instructions and configuration. The simplest starting point for any GoGrid project.
team
Two agents (researcher + reviewer) collaborating via shared memory and consensus. Demonstrates multi-agent coordination.
pipeline
Two sequential stages (drafter + editor) with state transfer between stages. Demonstrates linear workflows.
# Scaffold each template type
gogrid init --template single my-agent
gogrid init --template team my-research-team
gogrid init --template pipeline my-content-pipeline
# Use a custom name
gogrid init --template team --name research-bot ./research
Configuration (YAML)
GoGrid projects are configured via a gogrid.yaml file in the project root. Each agent is defined with a model, provider, system prompt, and execution parameters.
Schema
version: "1"                  # Required. Config schema version.
agents:
  <agent-name>:               # Unique agent identifier.
    model: <string>           # Required. LLM model ID.
    provider: <string>        # Required. One of: openai, anthropic, gemini.
    instructions: <string>    # System prompt for the agent.
    config:
      max_turns: <int>        # Max LLM round-trips. 0 = unlimited.
      max_tokens: <int>       # Max response tokens per turn.
      temperature: <float>    # LLM randomness (0.0-1.0). Omit for default.
      timeout: <duration>     # Wall-clock limit (e.g. "30s", "5m", "1h").
      cost_budget: <float>    # Max cost in USD for a single run.
Full Example
version: "1"
agents:
  researcher:
    model: claude-sonnet-4-5-20250929
    provider: anthropic
    instructions: |
      You are a technical researcher. When given a topic:
      1. Explain the core concepts clearly
      2. Provide concrete examples
      3. Discuss trade-offs and alternatives
      4. Mention common pitfalls
    config:
      max_turns: 10
      max_tokens: 4096
      temperature: 0.7
      timeout: 2m
      cost_budget: 0.50
  code-reviewer:
    model: gpt-4o-mini
    provider: openai
    instructions: |
      You are a senior Go code reviewer. Check for correctness,
      error handling, naming, and Go idioms.
    config:
      max_turns: 5
      max_tokens: 2048
      timeout: 60s
      cost_budget: 0.10
  summarizer:
    model: gpt-4o-mini
    provider: openai
    instructions: |
      Condense the input into 3-5 bullet points. Under 200 words.
    config:
      max_turns: 3
      max_tokens: 1024
      timeout: 30s
      cost_budget: 0.05
Environment Variable Substitution
Config values support ${VAR} and ${VAR:-default} syntax. This is processed before YAML parsing, so you can override any value per environment without editing the config file.
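To make the substitution rule concrete, here is a minimal Go sketch of text-level expansion that runs before any YAML parsing. It is not GoGrid's implementation: the expand function, the regular expression, and the choice to treat an empty variable like an unset one (as POSIX shells do for :-) are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// varPattern matches ${VAR} and ${VAR:-default}.
var varPattern = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}`)

// expand replaces every reference in raw with the looked-up value.
// When the variable is unset or empty, the default (possibly "") is used.
// Assumption: this mirrors shell semantics, not necessarily GoGrid's.
func expand(raw string, lookup func(string) (string, bool)) string {
	return varPattern.ReplaceAllStringFunc(raw, func(ref string) string {
		m := varPattern.FindStringSubmatch(ref)
		if val, ok := lookup(m[1]); ok && val != "" {
			return val
		}
		return m[2]
	})
}

func main() {
	raw := "model: ${MODEL:-gpt-4o-mini}\ntimeout: ${TIMEOUT:-60s}\n"
	// The expanded text is what a YAML parser would then receive.
	fmt.Print(expand(raw, os.LookupEnv))
}
```

In gogrid.yaml, the same syntax looks like this: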
version: "1"
agents:
  assistant:
    model: ${MODEL:-gpt-4o-mini}
    provider: ${PROVIDER:-openai}
    instructions: ${AGENT_INSTRUCTIONS:-You are a helpful assistant.}
    config:
      max_turns: 10
      timeout: ${TIMEOUT:-60s}
# Override model and provider at runtime
MODEL=claude-sonnet-4-5-20250929 PROVIDER=anthropic gogrid run assistant -input "Hello"
# Use a different model for cost savings
MODEL=gpt-4o-mini gogrid run assistant -input "Quick question"
Environment Variables
API keys are resolved from environment variables — never stored in config files.
OPENAI_API_KEY
API key for OpenAI models (gpt-4o, gpt-4o-mini, gpt-4.1, o3, etc.)
ANTHROPIC_API_KEY
API key for Anthropic models (claude-sonnet-4-5, claude-opus-4-6, etc.)
GEMINI_API_KEY
API key for Google Gemini models (gemini-2.5-pro, gemini-2.5-flash, etc.)
Validation
The config is validated on load. The CLI will report clear errors for missing fields, invalid providers, or malformed YAML.
- version must be "1"
- At least one agent must be defined
- Each agent must have model and provider
- provider must be one of: openai, anthropic, gemini
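The rules above can be sketched as a small validation function. The Config and AgentConfig types, the validate function, and the error wording are hypothetical stand-ins for whatever the internal config package actually defines.

```go
package main

import (
	"errors"
	"fmt"
)

// AgentConfig and Config are illustrative stand-ins for the parsed gogrid.yaml.
type AgentConfig struct {
	Model    string
	Provider string
}

type Config struct {
	Version string
	Agents  map[string]AgentConfig
}

var validProviders = map[string]bool{"openai": true, "anthropic": true, "gemini": true}

// validate applies the four documented rules and returns the first violation found.
func validate(c Config) error {
	if c.Version != "1" {
		return fmt.Errorf("version must be %q, got %q", "1", c.Version)
	}
	if len(c.Agents) == 0 {
		return errors.New("at least one agent must be defined")
	}
	for name, a := range c.Agents {
		if a.Model == "" {
			return fmt.Errorf("agent %q: model is required", name)
		}
		if a.Provider == "" {
			return fmt.Errorf("agent %q: provider is required", name)
		}
		if !validProviders[a.Provider] {
			return fmt.Errorf("agent %q: provider must be one of openai, anthropic, gemini; got %q", name, a.Provider)
		}
	}
	return nil
}

func main() {
	cfg := Config{
		Version: "1",
		Agents:  map[string]AgentConfig{"assistant": {Model: "gpt-4o-mini", Provider: "azure"}},
	}
	fmt.Println(validate(cfg))
}
```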
CLI Commands
gogrid init
Scaffold a new GoGrid project from a template. Creates a directory with gogrid.yaml, main.go, Makefile, and README.md.
gogrid init [flags] [directory]
Flags:
  -template string   Project template: single, team, pipeline (default "single")
  -name string       Project name (defaults to directory name)
# Scaffold a single-agent project
$ gogrid init --template single my-agent
Created GoGrid project in my-agent/
gogrid.yaml Agent configuration
main.go Programmatic entry point
Makefile Build targets
README.md Setup instructions
Next steps:
cd my-agent
go mod init github.com/example/my-agent
go mod tidy
export OPENAI_API_KEY=sk-...
gogrid run assistant -input "Hello!"
gogrid list
List all agents defined in the project's gogrid.yaml.
$ gogrid list
NAME            PROVIDER    MODEL
code-reviewer   openai      gpt-4o-mini
researcher      anthropic   claude-sonnet-4-5-20250929
summarizer      openai      gpt-4o-mini
gogrid run
Execute a named agent with the given input. The agent's response is printed to stdout. A run record is saved for later inspection with gogrid trace and gogrid cost.
gogrid run <agent-name> [flags]
Flags:
  -config string    Path to config file (default "gogrid.yaml")
  -input string     Input text to send to the agent (required)
  -timeout string   Override the agent's timeout (e.g. "30s", "5m")
# Run an agent
$ gogrid run researcher -input "Explain Go's context package"
Go's context package provides a way to carry deadlines, cancellation signals,
and request-scoped values across API boundaries...
# The run ID is printed to stderr for later inspection
Run ID: 019479a3c4e80001
# Override timeout
$ gogrid run summarizer -input "Summarize this paper..." -timeout 2m
What happens during a run:
- Loads and validates gogrid.yaml
- Looks up the agent by name
- Resolves the LLM provider using environment variables
- Creates the agent with configured model, instructions, and parameters
- Calls agent.Run() with an in-memory tracer
- Prints the response to stdout
- Saves the run record to .gogrid/runs/
gogrid trace
Inspect execution traces. With no arguments, lists recent runs. With a run ID, renders the span tree showing the full execution flow.
# List recent runs
$ gogrid trace
Recent runs:
019479a3c4e80001 researcher claude-sonnet-4-5-20250929 4.2s
019479a1b2c70002 code-reviewer gpt-4o-mini 1.1s
019479a0a1b60003 summarizer gpt-4o-mini 0.8s
# View span tree for a specific run
$ gogrid trace 019479a3c4e80001
Run: 019479a3c4e80001
Agent: researcher | Model: claude-sonnet-4-5-20250929 | Duration: 4.2s
agent.run (4.2s)
├── memory.load (1ms)
├── llm.complete (2.1s) [prompt: 150, completion: 89]
├── llm.complete (1.8s) [prompt: 280, completion: 145]
└── memory.save (2ms)
# Export as JSON for programmatic use
$ gogrid trace 019479a3c4e80001 -json | jq '.[].name'
gogrid cost
View cost breakdown for agent runs. With no arguments, lists all runs with their total cost. With a run ID, shows a per-model cost breakdown.
# List all runs with costs
$ gogrid cost
RUN ID             AGENT           MODEL                        COST
019479a3c4e80001   researcher      claude-sonnet-4-5-20250929   $0.003280
019479a1b2c70002   code-reviewer   gpt-4o-mini                  $0.000150
019479a0a1b60003   summarizer      gpt-4o-mini                  $0.000090
# Detailed cost breakdown for a specific run
$ gogrid cost 019479a3c4e80001
Run: 019479a3c4e80001
MODEL                        CALLS   PROMPT   COMPLETION   COST
claude-sonnet-4-5-20250929   2       430      234          $0.003280
────────────────────────────────────────────────────────────────
TOTAL                        2       430      234          $0.003280
# Export as JSON
$ gogrid cost -json
gogrid version
$ gogrid version
gogrid 1.0.0 (darwin/arm64, go1.25.4)
Supported Models
GoGrid includes built-in pricing for cost tracking. Any model string is accepted — these have pre-configured pricing:
OpenAI
gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o4-mini
Anthropic
claude-opus-4-6, claude-opus-4-5, claude-sonnet-4-5, claude-sonnet-4-0, claude-haiku-4-5
Google Gemini
gemini-3-pro, gemini-3-flash, gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
Models not in this list work fine — cost tracking will report $0.00 until custom pricing is configured via the Go API.
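As a rough illustration of how a pricing table turns token counts into a cost figure, and why an unlisted model reports $0.00, consider the sketch below. The price type, the estimateCost function, the per-million-token unit, and the numbers in the table are placeholders chosen for the example; they are not GoGrid's actual pricing data or API.

```go
package main

import "fmt"

// price holds USD per one million tokens (an assumed unit for this sketch).
type price struct {
	promptPerM     float64
	completionPerM float64
}

// pricing contains placeholder values only, not real model prices.
var pricing = map[string]price{
	"example-model": {promptPerM: 1.00, completionPerM: 2.00},
}

// estimateCost returns 0 for models that have no entry in the pricing table.
func estimateCost(model string, promptTokens, completionTokens int) float64 {
	p, ok := pricing[model]
	if !ok {
		return 0
	}
	total := float64(promptTokens)*p.promptPerM + float64(completionTokens)*p.completionPerM
	return total / 1_000_000
}

func main() {
	fmt.Printf("$%.6f\n", estimateCost("example-model", 430, 234))
	fmt.Printf("$%.6f\n", estimateCost("unlisted-model", 430, 234))
}
```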
Run Records
Every gogrid run invocation saves a JSON record to .gogrid/runs/<run-id>.json. Run IDs are time-sortable, so newer runs always sort after older ones.
Record Fields
run_id
Unique, time-sortable identifier for the run.
agent / model / provider
Which agent ran, which model and provider were used.
input / output
The user's input and the agent's final response.
turns / usage
Number of LLM round-trips and token counts (prompt, completion, total).
cost
Estimated cost in USD based on built-in model pricing.
spans
Execution trace spans — LLM calls, tool executions, memory operations.
duration / error
Wall-clock duration and error message if the run failed.
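For reading records from Go rather than jq, a struct along these lines may be enough. The JSON keys follow the field names listed above; the Go types are assumptions, and usage, spans, and duration are left out because their exact shapes are not documented here (encoding/json ignores keys that have no matching field).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RunRecord covers the scalar fields of a run record.
// Types are assumed for illustration; check a real record before relying on them.
type RunRecord struct {
	RunID    string  `json:"run_id"`
	Agent    string  `json:"agent"`
	Model    string  `json:"model"`
	Provider string  `json:"provider"`
	Input    string  `json:"input"`
	Output   string  `json:"output"`
	Turns    int     `json:"turns"`
	Cost     float64 `json:"cost"`
	Error    string  `json:"error"`
}

// parseRecord decodes one run record from its JSON bytes.
func parseRecord(data []byte) (RunRecord, error) {
	var rec RunRecord
	err := json.Unmarshal(data, &rec)
	return rec, err
}

func main() {
	sample := []byte(`{"run_id":"019479a3c4e80001","agent":"researcher","model":"claude-sonnet-4-5-20250929","turns":2,"cost":0.00328}`)
	rec, err := parseRecord(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%s %s turns=%d cost=$%.6f\n", rec.RunID, rec.Agent, rec.Turns, rec.Cost)
}
```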
Inspecting Run Records
Run records are plain JSON files. You can inspect them directly, back them up, or pipe them to other tools.
# List run records
ls .gogrid/runs/
# View a raw record
cat .gogrid/runs/019479a3c4e80001.json | jq .
# Extract agent names and costs from all runs
for f in .gogrid/runs/*.json; do
  echo "$(jq -r '.agent' "$f"): $(jq -r '.cost' "$f")"
done
# Total cost across all runs
jq -s 'map(.cost) | add' .gogrid/runs/*.json
Add .gogrid/ to your .gitignore; run records are local development artifacts.
Typical Workflow
# 1. Scaffold a project
gogrid init --template single my-project
cd my-project
# 2. Set up Go module and dependencies
go mod init github.com/example/my-project
go mod tidy
# 3. Set your API key
export OPENAI_API_KEY=sk-proj-...
# 4. List agents
gogrid list
# 5. Run an agent
gogrid run assistant -input "Explain the CAP theorem"
# 6. Inspect the trace
gogrid trace # list recent runs, copy a run ID
gogrid trace <run-id> # view execution span tree
# 7. Check costs
gogrid cost <run-id> # detailed breakdown
gogrid cost # summary of all runs
Core Types
GoGrid is built on a small set of composable interfaces. These are the building blocks for all orchestration patterns.
Message
Messages represent conversation turns. Every LLM interaction flows through the llm.Message type.
// Roles: system, user, assistant, tool
userMsg := llm.NewUserMessage("What is the weather?")
assistantMsg := llm.NewAssistantMessage("The weather is sunny.")
systemMsg := llm.NewSystemMessage("You are a helpful assistant.")
toolMsg := llm.NewToolMessage(callID, "72°F and sunny")
Provider
The llm.Provider interface abstracts LLM backends. Swapping providers is a configuration change — no code changes required.
type Provider interface {
	Complete(ctx context.Context, params Params) (*Response, error)
}
// Params includes Model, Messages, Tools, Temperature, MaxTokens
// Response includes Message, Usage (tokens), and Model
Tool
Tools are functions that agents can call. Each tool has a name, description, JSON Schema for parameters, and an Execute method.
type Tool interface {
	Name() string
	Description() string
	Schema() Schema
	Execute(ctx context.Context, input json.RawMessage) (string, error)
}
Memory
Memory is a first-class primitive, not a plugin. Every agent pattern is designed with memory as a core concern.
type Memory interface {
	Load(ctx context.Context, key string) ([]llm.Message, error)
	Save(ctx context.Context, key string, messages []llm.Message) error
	Clear(ctx context.Context, key string) error
}
Single Agent
The Single Agent pattern is the fundamental unit of work in GoGrid. It combines an LLM provider, tools, memory, and configuration to execute tasks through an iterative tool-use loop.
Creating an Agent
Agents are created with functional options. At minimum, you need a name, provider, and model.
import (
	"os"
	"github.com/lonestarx1/gogrid/pkg/agent"
	"github.com/lonestarx1/gogrid/pkg/llm/openai"
	"github.com/lonestarx1/gogrid/pkg/memory"
)
provider := openai.New(os.Getenv("OPENAI_API_KEY"))
a := agent.New("assistant",
	agent.WithProvider(provider),
	agent.WithModel("gpt-4o"),
	agent.WithInstructions("You are a helpful assistant."),
	agent.WithMemory(memory.NewInMemory()),
	agent.WithTools(mySearchTool, myCalcTool),
	agent.WithConfig(agent.Config{
		MaxTurns:   10,
		MaxTokens:  4096,
		CostBudget: 1.00, // USD
	}),
)
Running an Agent
Call Run with a context and user input. The agent loops: call LLM, execute tool calls, repeat until the LLM gives a final response or limits are hit.
result, err := a.Run(ctx, "What is 42 * 17?")
if err != nil {
	log.Fatal(err)
}
fmt.Println(result.Message.Content) // "42 * 17 = 714"
fmt.Printf("Turns: %d, Cost: $%.4f\n", result.Turns, result.Cost)
fmt.Printf("Tokens: %d prompt, %d completion\n",
	result.Usage.PromptTokens, result.Usage.CompletionTokens)
Agent Loop
The execution flow inside Agent.Run:
- Build messages from system prompt, memory history, and user input
- Call the LLM with messages and tool definitions
- If the LLM responds with tool calls, execute them and loop
- If the LLM responds with a final message, return the result
- Respect max turns, timeout, and cost budget at each iteration
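The loop described above can be sketched without any GoGrid types. Everything in this example is a stand-in: model plays the role of an LLM call, tool is a plain function, and only the max-turns limit is enforced (timeout and cost budget checks would sit at the same point in the loop). Treating maxTurns of 0 as unlimited follows the max_turns description in the configuration schema.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type toolCall struct{ Name, Input string }

// reply is either a final answer (no tool calls) or a request to run tools.
type reply struct {
	Content   string
	ToolCalls []toolCall
}

type model func(transcript []string) reply
type tool func(input string) string

var errMaxTurns = errors.New("max turns reached")

// run calls the model, executes any requested tools, and repeats until the
// model returns a final message or the turn limit is hit.
func run(m model, tools map[string]tool, input string, maxTurns int) (string, int, error) {
	transcript := []string{"user: " + input}
	for turn := 1; maxTurns == 0 || turn <= maxTurns; turn++ {
		r := m(transcript)
		if len(r.ToolCalls) == 0 {
			return r.Content, turn, nil
		}
		for _, c := range r.ToolCalls {
			t, ok := tools[c.Name]
			if !ok {
				transcript = append(transcript, "tool: unknown tool "+c.Name)
				continue
			}
			transcript = append(transcript, "tool: "+t(c.Input))
		}
	}
	return "", maxTurns, errMaxTurns
}

// scriptedModel asks for the calculator once, then answers with its result.
func scriptedModel(transcript []string) reply {
	last := transcript[len(transcript)-1]
	if strings.HasPrefix(last, "tool: ") {
		return reply{Content: "42 * 17 = " + strings.TrimPrefix(last, "tool: ")}
	}
	return reply{ToolCalls: []toolCall{{Name: "calculator", Input: "42 * 17"}}}
}

func main() {
	tools := map[string]tool{"calculator": func(string) string { return "714" }}
	out, turns, err := run(scriptedModel, tools, "What is 42 * 17?", 10)
	fmt.Println(out, turns, err)
}
```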
LLM Providers
GoGrid ships with three built-in providers. All implement the same llm.Provider interface, so swapping is a one-line change.
OpenAI
import "github.com/lonestarx1/gogrid/pkg/llm/openai"
provider := openai.New(os.Getenv("OPENAI_API_KEY"))
// Models: "gpt-4o", "gpt-4o-mini", "gpt-4.1", "o3", etc.
Anthropic
import "github.com/lonestarx1/gogrid/pkg/llm/anthropic"
provider := anthropic.New(os.Getenv("ANTHROPIC_API_KEY"))
// Models: "claude-sonnet-4-5-20250929", "claude-opus-4-6-20250827", etc.
Google Gemini
import "github.com/lonestarx1/gogrid/pkg/llm/gemini"
provider, err := gemini.New(ctx, os.Getenv("GEMINI_API_KEY"))
// Models: "gemini-2.5-pro", "gemini-2.5-flash", etc.
Tools
Tools give agents the ability to take actions. Implement the Tool interface and the agent will call the tool when appropriate.
type CalculatorTool struct{}
func (t *CalculatorTool) Name() string        { return "calculator" }
func (t *CalculatorTool) Description() string { return "Evaluate math expressions" }
func (t *CalculatorTool) Schema() tool.Schema {
	return tool.Schema{
		Type: "object",
		Properties: map[string]*tool.Schema{
			"expression": {Type: "string", Description: "Math expression to evaluate"},
		},
		Required: []string{"expression"},
	}
}
func (t *CalculatorTool) Execute(ctx context.Context, input json.RawMessage) (string, error) {
	var args struct {
		Expression string `json:"expression"`
	}
	if err := json.Unmarshal(input, &args); err != nil {
		return "", err
	}
	// evaluate is a placeholder for your own expression evaluator; it is not part of GoGrid.
	result, err := evaluate(args.Expression)
	if err != nil {
		return "", err
	}
	return result, nil
}
Tool Registry
The tool.Registry provides centralized tool management with duplicate-name prevention.
registry := tool.NewRegistry()
registry.Register(&CalculatorTool{})
registry.Register(&SearchTool{})
t, err := registry.Get("calculator") // retrieve by name
names := registry.List() // ["calculator", "search"]
Memory
Memory is a first-class primitive in GoGrid. The base Memory interface provides Load/Save/Clear. Extended interfaces add search, pruning, and statistics.
InMemory Store
The default memory implementation. Thread-safe, suitable for development and short-lived sessions. Implements all extended interfaces.
mem := memory.NewInMemory()
// Save conversation
_ = mem.Save(ctx, "session-1", []llm.Message{
	llm.NewUserMessage("Hello"),
	llm.NewAssistantMessage("Hi there!"),
})
// Load conversation
msgs, _ := mem.Load(ctx, "session-1")
// Search across all keys (case-insensitive)
entries, _ := mem.Search(ctx, "hello")
// Get aggregate statistics
stats, _ := mem.Stats(ctx)
fmt.Printf("Keys: %d, Entries: %d, Size: %d bytes\n",
	stats.Keys, stats.TotalEntries, stats.TotalSize)
ReadOnly Wrapper
Wraps any Memory, permitting loads but rejecting saves and clears with ErrReadOnly. Useful for reference data.
ro := memory.NewReadOnly(inner)
msgs, _ := ro.Load(ctx, "k") // works
err := ro.Save(ctx, "k", msgs) // returns memory.ErrReadOnly
Extended Interfaces
// SearchableMemory adds keyword search
type SearchableMemory interface {
	Memory
	Search(ctx context.Context, query string) ([]Entry, error)
}
// PrunableMemory adds policy-based pruning
type PrunableMemory interface {
	Memory
	Prune(ctx context.Context, policy PrunePolicy) (int, error)
}
// StatsMemory adds aggregate statistics
type StatsMemory interface {
	Memory
	Stats(ctx context.Context) (*Stats, error)
}
File Memory
File-backed memory persists data to disk as JSON files. Each key maps to a separate file (hex-encoded filename) with a .meta.json sidecar for timestamps and sizes. Easy to inspect, no contention between keys.
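The key-to-filename mapping is plain hex encoding of the key's bytes, which can be reproduced with the standard library. The fileName and keyPaths helpers below are hypothetical names for this sketch, not functions exported by the file package.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// fileName hex-encodes a memory key so it is safe to use as a file name.
func fileName(key string) string {
	return hex.EncodeToString([]byte(key))
}

// keyPaths returns the data file and metadata sidecar paths for a key.
func keyPaths(dir, key string) (data, meta string) {
	name := fileName(key)
	return filepath.Join(dir, name+".json"), filepath.Join(dir, name+".meta.json")
}

func main() {
	data, meta := keyPaths("/var/data/agent-memory", "session-1")
	fmt.Println(data)
	fmt.Println(meta)
}
```

Using the file-backed store from Go: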
import "github.com/lonestarx1/gogrid/pkg/memory/file"
mem, err := file.New("/var/data/agent-memory")
if err != nil {
	log.Fatal(err)
}
// Use like any other Memory
_ = mem.Save(ctx, "session-1", messages)
loaded, _ := mem.Load(ctx, "session-1")
// Also supports Search, Prune, and Stats
results, _ := mem.Search(ctx, "keyword")
stats, _ := mem.Stats(ctx)
On disk, keys are hex-encoded for filesystem safety. A key session-1 becomes:
/var/data/agent-memory/
├── 73657373696f6e2d31.json        # message data
└── 73657373696f6e2d31.meta.json   # metadata sidecar
Transferable State
For pipeline patterns, state must transfer between stages rather than be shared. TransferableState uses a generation counter so that when ownership moves to the next agent, the previous owner's handle becomes invalid.
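A generation counter can invalidate old handles without tracking them individually: each handle remembers the generation at which it was issued, and every operation compares it with the current one. The sketch below shows the idea with hypothetical state and handle types over a plain map; it is a simplified model, not the transfer package's code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errTransferred = errors.New("state transferred to another owner")

// state holds the data plus the generation of the current owner.
type state struct {
	mu   sync.Mutex
	gen  int
	data map[string]string
}

// handle captures the generation at which it was issued.
type handle struct {
	s   *state
	gen int
}

func newState() *state { return &state{data: map[string]string{}} }

// transfer bumps the generation and returns a handle for the new owner,
// which implicitly invalidates every handle issued earlier.
func (s *state) transfer() *handle {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.gen++
	return &handle{s: s, gen: s.gen}
}

func (h *handle) save(key, val string) error {
	h.s.mu.Lock()
	defer h.s.mu.Unlock()
	if h.gen != h.s.gen {
		return errTransferred
	}
	h.s.data[key] = val
	return nil
}

func (h *handle) load(key string) (string, error) {
	h.s.mu.Lock()
	defer h.s.mu.Unlock()
	if h.gen != h.s.gen {
		return "", errTransferred
	}
	return h.s.data[key], nil
}

func main() {
	s := newState()
	h1 := s.transfer() // stage 1 owns the state
	_ = h1.save("pipeline", "draft")
	h2 := s.transfer() // ownership moves to stage 2
	v, _ := h2.load("pipeline")
	_, err := h1.load("pipeline") // stale handle is rejected
	fmt.Println(v, err)
}
```

With the transfer package, the same flow reads: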
import "github.com/lonestarx1/gogrid/pkg/memory/transfer"
// Create transferable state wrapping any Memory
state := transfer.NewState(memory.NewInMemory())
// Stage 1 acquires ownership
h1, _ := state.Acquire("stage-1")
_ = h1.Save(ctx, "pipeline", messages)
// Transfer to stage 2
h2, _ := state.Transfer("stage-1", "stage-2")
loaded, _ := h2.Load(ctx, "pipeline") // works
// Stage 1's handle is now invalid
_, err := h1.Load(ctx, "pipeline")
// err == transfer.ErrStateTransferred
Validation Hooks
Register hooks to enforce policies before transfers occur.
state.OnTransfer(func(from, to string) error {
	if to == "untrusted-agent" {
		return errors.New("transfer to untrusted agent denied")
	}
	return nil
})
// Audit trail (named entries to avoid shadowing the log package)
entries := state.AuditLog()
for _, entry := range entries {
	fmt.Printf("%s -> %s (gen %d)\n", entry.From, entry.To, entry.Generation)
}
Prune Policies
Prune policies control which memory entries get removed. Policies implement the PrunePolicy interface and compose with AnyPolicy.
// Remove entries older than 1 hour
removed, _ := mem.Prune(ctx, memory.NewMaxAge(1*time.Hour))
// Remove entries larger than 10KB
removed, _ = mem.Prune(ctx, memory.NewMaxSize(10_000))
// Keep only the last 100 entries per key
removed, _ = mem.Prune(ctx, memory.NewMaxEntries(100))
// Compose: prune if old OR too large
policy := memory.NewAnyPolicy(
	memory.NewMaxAge(24*time.Hour),
	memory.NewMaxSize(50_000),
)
removed, _ = mem.Prune(ctx, policy)
Team (Chat Room)
The Team pattern orchestrates multiple agents working concurrently on the same input. Agents communicate via a shared message bus, store results in shared memory, and reach decisions through pluggable consensus strategies.
import "github.com/lonestarx1/gogrid/pkg/orchestrator/team"
// Create agents with different perspectives
reviewer := agent.New("reviewer",
	agent.WithProvider(provider),
	agent.WithModel("gpt-4o"),
	agent.WithInstructions("You review code for correctness and clarity."),
)
security := agent.New("security",
	agent.WithProvider(provider),
	agent.WithModel("gpt-4o"),
	agent.WithInstructions("You review code for security vulnerabilities."),
)
perf := agent.New("performance",
	agent.WithProvider(provider),
	agent.WithModel("gpt-4o"),
	agent.WithInstructions("You review code for performance issues."),
)
// Create the team
t := team.New("code-review",
	team.WithMembers(
		team.Member{Agent: reviewer, Role: "code quality"},
		team.Member{Agent: security, Role: "security"},
		team.Member{Agent: perf, Role: "performance"},
	),
	team.WithStrategy(team.Unanimous{}),
	team.WithConfig(team.Config{
		MaxRounds:  1,
		CostBudget: 5.00,
	}),
)
result, err := t.Run(ctx, "Review this function: ...")
if err != nil {
	log.Fatal(err)
}
fmt.Println(result.Decision.Content)
fmt.Printf("Cost: $%.4f, Rounds: %d\n", result.TotalCost, result.Rounds)
Multi-Round Discussions
With MaxRounds > 1, agents see each other's previous responses and can refine their answers across rounds. The team continues until the consensus strategy is satisfied or max rounds are reached.
t := team.New("debate",
	team.WithMembers(
		team.Member{Agent: proAgent, Role: "advocate"},
		team.Member{Agent: conAgent, Role: "skeptic"},
	),
	team.WithStrategy(team.Unanimous{}),
	team.WithConfig(team.Config{MaxRounds: 3}),
)
// Round 1: Both agents respond independently
// Round 2: Each sees the other's Round 1 response
// Round 3: Each sees all prior responses
result, _ := t.Run(ctx, "Should we adopt microservices?")
Per-Agent Results
for name, agentResult := range result.Responses {
	fmt.Printf("[%s] %s\n", name, agentResult.Message.Content)
	fmt.Printf("  Tokens: %d, Cost: $%.4f\n",
		agentResult.Usage.TotalTokens, agentResult.Cost)
}
Message Bus
The message bus provides pub/sub communication within a team. Subscribe to topics to monitor agent responses in real time. Sends are non-blocking — messages are dropped if a subscriber's channel is full.
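Non-blocking delivery with drop-on-full is usually written in Go as a select with a default case. The publish function below is a hypothetical illustration of that idiom over plain buffered channels; the real bus adds topics, history, and unsubscribe handling.

```go
package main

import "fmt"

// publish offers msg to every subscriber without ever blocking the sender.
// It returns how many subscribers missed the message because their buffer was full.
func publish(subscribers []chan string, msg string) (dropped int) {
	for _, ch := range subscribers {
		select {
		case ch <- msg:
			// delivered into the subscriber's buffer
		default:
			dropped++ // buffer full: drop rather than stall the sender
		}
	}
	return dropped
}

func main() {
	fast := make(chan string, 2)
	slow := make(chan string, 1)
	subs := []chan string{fast, slow}

	fmt.Println(publish(subs, "round 1")) // both buffers have room
	fmt.Println(publish(subs, "round 2")) // slow's buffer is already full
}
```

The practical consequence is that a subscriber's buffer size (the second argument to Subscribe) should be large enough for the expected burst of messages. Using the team bus directly: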
bus := team.NewBus()
// Subscribe to agent responses
ch, unsub := bus.Subscribe("team.response", 100)
defer unsub()
// Monitor in a goroutine
go func() {
	for msg := range ch {
		fmt.Printf("[%s] %s (round %s)\n",
			msg.From, msg.Content, msg.Metadata["round"])
	}
}()
// Pass the bus to a team
t := team.New("my-team",
	team.WithBus(bus),
	team.WithMembers(...),
)
t.Run(ctx, "discuss this topic")
// Retrieve full history
history := bus.History()
Consensus Strategies
Strategies determine when a team has reached a decision and how to form the final answer.
Unanimous
Waits for all agents to respond. Combines all responses in alphabetical order. Default strategy.
Majority
Waits for more than half of agents. Returns as soon as a majority have responded.
FirstResponse
Returns immediately when any agent completes. Cancels remaining agents.
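The three thresholds can be compared side by side as plain functions over the responses collected so far. This is a sketch under assumptions: the function names, the "[name] text" output format, and ordering by agent name are illustrative choices, and cancelling the remaining agents for FirstResponse is outside its scope.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// combine joins responses sorted by agent name (assumed ordering and format).
func combine(responses map[string]string) string {
	names := make([]string, 0, len(responses))
	for n := range responses {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for i, n := range names {
		if i > 0 {
			b.WriteString("\n")
		}
		b.WriteString("[" + n + "] " + responses[n])
	}
	return b.String()
}

// unanimous is satisfied only once every agent has responded.
func unanimous(total int, responses map[string]string) (string, bool) {
	if len(responses) < total {
		return "", false
	}
	return combine(responses), true
}

// majority is satisfied once more than half of the agents have responded.
func majority(total int, responses map[string]string) (string, bool) {
	if len(responses)*2 <= total {
		return "", false
	}
	return combine(responses), true
}

// firstResponse is satisfied as soon as any agent has responded.
func firstResponse(_ int, responses map[string]string) (string, bool) {
	if len(responses) == 0 {
		return "", false
	}
	return combine(responses), true
}

func main() {
	got := map[string]string{"security": "no issues", "reviewer": "looks good"}
	_, u := unanimous(3, got)
	_, m := majority(3, got)
	_, f := firstResponse(3, got)
	fmt.Println(u, m, f)
}
```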
Custom Strategies
Implement the Strategy interface for domain-specific consensus logic.
type Strategy interface {
	Name() string
	Evaluate(total int, responses map[string]string) (decision string, reached bool)
}
// Example: converge when all agents agree on a keyword
type KeywordConsensus struct {
	Keyword string
}
func (k KeywordConsensus) Name() string { return "keyword" }
func (k KeywordConsensus) Evaluate(total int, responses map[string]string) (string, bool) {
	if len(responses) < total {
		return "", false
	}
	// Lowercase both sides so the match is case-insensitive.
	keyword := strings.ToLower(k.Keyword)
	for _, content := range responses {
		if !strings.Contains(strings.ToLower(content), keyword) {
			return "", false
		}
	}
	// combineResponses is a helper you supply to merge the responses into one decision.
	return combineResponses(responses), true
}
Coordinator
By default, team decisions are formed by concatenating agent responses. A coordinator is an optional leader agent that receives all member responses and synthesizes a single, coherent final decision — like a team lead who listens to everyone before making the call.
coordinator := agent.New("lead",
	agent.WithProvider(provider),
	agent.WithModel("gpt-4o"),
	agent.WithInstructions("You are the team lead. Synthesize all perspectives into a clear decision."),
)
t := team.New("code-review",
	team.WithMembers(
		team.Member{Agent: reviewer, Role: "correctness"},
		team.Member{Agent: security, Role: "security"},
		team.Member{Agent: perf, Role: "performance"},
	),
	team.WithCoordinator(coordinator),
	team.WithStrategy(team.Unanimous{}),
	team.WithConfig(team.Config{MaxRounds: 1}),
)
result, _ := t.Run(ctx, "Review this function: ...")
// result.Decision.Content is the coordinator's synthesized answer
// result.Responses includes both member and coordinator results
How It Works
- Member agents run their rounds as normal (controlled by the consensus strategy)
- After rounds complete, the coordinator receives the original input and all member responses
- The coordinator produces the final team decision
- If the coordinator fails, the team falls back to the combined member responses
Coordinator Costs
The coordinator's LLM call is included in the team's total cost and token usage. Its result appears in result.Responses alongside member results, and a team.coordinator trace span is emitted.
Pipeline (Linear)
The Pipeline pattern chains agents sequentially — each stage processes input, produces output, and transfers state ownership to the next stage. Previous stages lose access to the data once ownership is transferred.
import "github.com/lonestarx1/gogrid/pkg/orchestrator/pipeline"
p := pipeline.New("research-pipeline",
	pipeline.WithStages(
		pipeline.Stage{Name: "collect", Agent: collector},
		pipeline.Stage{Name: "analyze", Agent: analyzer},
		pipeline.Stage{Name: "summarize", Agent: summarizer},
	),
	pipeline.WithConfig(pipeline.Config{
		Timeout:    5 * time.Minute,
		CostBudget: 2.00,
	}),
)
result, err := p.Run(ctx, "Research the impact of AI on healthcare")
if err != nil {
	log.Fatal(err)
}
fmt.Println(result.Output) // Final stage's output
fmt.Printf("Stages: %d, Cost: $%.4f\n", len(result.Stages), result.TotalCost)
State Ownership Transfer
Pipelines integrate with GoGrid's memory/transfer package. Each stage gets an owned handle — when state transfers to the next stage, the previous handle is invalidated. The audit trail records every transfer.
// The transfer log shows ownership history
for _, entry := range result.TransferLog {
	fmt.Printf("%s -> %s (generation %d)\n",
		entry.From, entry.To, entry.Generation)
}
// Output:
// -> collect (generation 1)
// collect -> analyze (generation 2)
// analyze -> summarize (generation 3)
Pipeline Stages
Each stage can optionally transform its input and validate its output.
pipeline.Stage{
	Name:  "analyze",
	Agent: analyzer,
	// Transform the previous stage's output before passing to this agent
	InputTransform: func(input string) string {
		return "Analyze the following data:\n\n" + input
	},
	// Validate the output before proceeding to the next stage
	OutputValidate: func(output string) error {
		if len(output) < 100 {
			return errors.New("analysis too short")
		}
		return nil
	},
	// Per-stage timeout and cost budget
	Timeout:    30 * time.Second,
	CostBudget: 0.50,
}
Progress Reporting
Track pipeline progress with a callback function.
p := pipeline.New("tracked",
	pipeline.WithStages(...),
	pipeline.WithProgress(func(idx, total int, sr pipeline.StageResult) {
		fmt.Printf("[%d/%d] Stage %q completed\n", idx+1, total, sr.Name)
	}),
)
Retry & Error Handling
Stages support configurable retry policies and error actions.
// Retry up to 3 times with a delay between attempts
pipeline.Stage{
	Name:  "flaky-api",
	Agent: apiAgent,
	Retry: pipeline.RetryPolicy{
		MaxAttempts: 3,
		Delay:       2 * time.Second,
	},
}
// Skip on failure instead of aborting the pipeline
pipeline.Stage{
	Name:    "optional-enrichment",
	Agent:   enrichAgent,
	OnError: pipeline.Skip,
}
// Default: abort the pipeline on failure
pipeline.Stage{
	Name:    "critical-step",
	Agent:   criticalAgent,
	OnError: pipeline.Abort, // default behavior
}
Error Actions
Abort
Stop the pipeline and return the error. This is the default behavior.
Skip
Mark the stage as skipped and continue with the previous stage's output. The pipeline proceeds to the next stage.
When a stage with retry fails all attempts and has OnError: Skip, the stage is skipped after exhausting retries. The StageResult.Attempts field records how many times the stage was executed.
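How retries and error actions interact is easiest to see in a small standalone runner. The stage, stageResult, and runPipeline names below are hypothetical stand-ins that model only the behavior described above: retry up to the attempt limit, then either skip (passing the previous output forward) or abort.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type errorAction int

const (
	abort errorAction = iota // default: stop the pipeline
	skip                     // continue with the previous stage's output
)

type stage struct {
	name        string
	run         func(input string) (string, error)
	maxAttempts int
	delay       time.Duration
	onError     errorAction
}

type stageResult struct {
	name     string
	attempts int
	skipped  bool
}

func runPipeline(stages []stage, input string) (string, []stageResult, error) {
	current := input
	var results []stageResult
	for _, s := range stages {
		limit := s.maxAttempts
		if limit < 1 {
			limit = 1 // no retry policy means a single attempt
		}
		res := stageResult{name: s.name}
		var out string
		var err error
		for res.attempts < limit {
			res.attempts++
			if out, err = s.run(current); err == nil {
				break
			}
			if res.attempts < limit {
				time.Sleep(s.delay)
			}
		}
		if err != nil && s.onError == skip {
			res.skipped = true
			results = append(results, res)
			continue // current is unchanged, so the next stage sees the prior output
		}
		results = append(results, res)
		if err != nil {
			return "", results, fmt.Errorf("stage %q: %w", s.name, err)
		}
		current = out
	}
	return current, results, nil
}

func exclaim(in string) (string, error) { return in + "!", nil }
func alwaysFail(string) (string, error) { return "", errors.New("upstream unavailable") }

func main() {
	out, results, err := runPipeline([]stage{
		{name: "draft", run: exclaim},
		{name: "enrich", run: alwaysFail, maxAttempts: 3, onError: skip},
		{name: "edit", run: exclaim},
	}, "hi")
	fmt.Println(out, err)
	for _, r := range results {
		fmt.Printf("%s attempts=%d skipped=%v\n", r.name, r.attempts, r.skipped)
	}
}
```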
Graph
The Graph pattern extends pipelines with conditional branches, parallel execution (fan-out/fan-in), and loops. Nodes wrap agents and edges connect them with optional condition functions.
import "github.com/lonestarx1/gogrid/pkg/orchestrator/graph"
g, err := graph.NewBuilder("review-pipeline").
	AddNode("draft", draftAgent).
	AddNode("review", reviewAgent).
	AddNode("publish", publishAgent).
	AddEdge("draft", "review").
	AddEdge("review", "publish").
	Options(
		graph.WithConfig(graph.Config{
			MaxIterations: 10,
			Timeout:       5 * time.Minute,
			CostBudget:    3.00,
		}),
		graph.WithTracer(tracer),
	).
	Build()
if err != nil {
	log.Fatal(err)
}
result, err := g.Run(ctx, "Write about AI agents")
if err != nil {
	log.Fatal(err)
}
fmt.Println(result.Output)
fmt.Printf("Nodes: %d, Cost: $%.4f\n", len(result.NodeResults), result.TotalCost)
Fan-Out / Fan-In
Independent branches run concurrently. When multiple edges point to the same node, it waits for all incoming sources to complete before running.
// Diamond pattern: split -> path-a, split -> path-b, path-a -> merge, path-b -> merge
// path-a and path-b run in parallel; merge waits for both before running
g, _ := graph.NewBuilder("diamond").
	AddNode("split", splitAgent).
	AddNode("path-a", agentA).
	AddNode("path-b", agentB).
	AddNode("merge", mergeAgent).
	AddEdge("split", "path-a").
	AddEdge("split", "path-b").
	AddEdge("path-a", "merge").
	AddEdge("path-b", "merge").
	Build()
Graph Builder
The fluent builder API validates the graph at build time — duplicate nodes, missing edge targets, and other structural errors are caught before execution.
b := graph.NewBuilder("my-graph")
// Add nodes (each wraps an agent)
b.AddNode("research", researchAgent)
b.AddNode("analyze", analyzeAgent)
b.AddNode("report", reportAgent)
// Add edges (optionally with conditions)
b.AddEdge("research", "analyze")
b.AddEdge("analyze", "report")
// Apply options
b.Options(graph.WithTracer(tracer))
// Build validates and returns the graph
g, err := b.Build()
if err != nil {
	log.Fatal(err) // e.g., "edge references unknown node"
}
DOT Export
Export the graph to Graphviz DOT format for visualization.
dot := g.DOT()
// Output:
// digraph "my-graph" {
//   rankdir=LR;
//   node [shape=box, style=rounded];
//   "research";
//   "analyze";
//   "report";
//   "research" -> "analyze";
//   "analyze" -> "report";
// }
Loops & Conditions
Conditional edges use graph.When to control routing based on a node's output. Backward edges create loops with a configurable max iteration guard.
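To show how conditional routing and an iteration guard fit together, here is a deliberately simplified walker: it follows a single path (no fan-out), nodes are plain functions rather than agents, and maxIterations is assumed to count total node executions. The graph, edge, and reviewLoop names are hypothetical, and GoGrid's own accounting of iterations may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type edge struct {
	to   string
	when func(output string) bool // nil means unconditional
}

type graph struct {
	nodes map[string]func(input string) string
	edges map[string][]edge
}

var errMaxIterations = errors.New("max iterations exceeded")

// run executes nodes starting at start, following the first outgoing edge
// whose condition matches, until a node has no matching edge.
func (g graph) run(start, input string, maxIterations int) (string, error) {
	current, out := start, input
	for i := 0; i < maxIterations; i++ {
		out = g.nodes[current](out)
		next := ""
		for _, e := range g.edges[current] {
			if e.when == nil || e.when(out) {
				next = e.to
				break
			}
		}
		if next == "" {
			return out, nil
		}
		current = next
	}
	return out, errMaxIterations
}

// reviewLoop builds draft -> review -> (revise -> review)* -> publish.
func reviewLoop() graph {
	contains := func(s string) func(string) bool {
		return func(out string) bool { return strings.Contains(out, s) }
	}
	return graph{
		nodes: map[string]func(string) string{
			"draft": func(string) string { return "draft" },
			"review": func(in string) string {
				if strings.HasPrefix(in, "revised") {
					return "approved"
				}
				return "needs revision"
			},
			"revise":  func(string) string { return "revised draft" },
			"publish": func(string) string { return "published" },
		},
		edges: map[string][]edge{
			"draft":  {{to: "review"}},
			"review": {{to: "revise", when: contains("needs revision")}, {to: "publish", when: contains("approved")}},
			"revise": {{to: "review"}},
		},
	}
}

func main() {
	fmt.Println(reviewLoop().run("draft", "Write about AI agents", 10))
	fmt.Println(reviewLoop().run("draft", "Write about AI agents", 3))
}
```

With the builder, the review loop is declared as: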
g, _ := graph.NewBuilder("review-loop").
AddNode("draft", draftAgent).
AddNode("review", reviewAgent).
AddNode("revise", reviseAgent).
AddNode("publish", publishAgent).
AddEdge("draft", "review").
// Conditional: route based on review output
AddEdge("review", "revise", graph.When(func(out string) bool {
return strings.Contains(out, "needs revision")
})).
AddEdge("review", "publish", graph.When(func(out string) bool {
return strings.Contains(out, "approved")
})).
// Loop back: revise -> review
AddEdge("revise", "review").
Options(graph.WithConfig(graph.Config{
MaxIterations: 5, // prevent infinite loops
})).
Build()
Edge Helpers
// When wraps a simple output check
graph.When(func(output string) bool {
return strings.Contains(output, "approved")
})
// Always is an unconditional edge (same as no condition)
graph.Always()
Dynamic Orchestration
Dynamic Orchestration is GoGrid's most powerful pattern. A Runtime enables agents to spawn child agents, teams, pipelines, or graphs at runtime — the executing agent decides which orchestration to use based on the problem at hand.
import "github.com/lonestarx1/gogrid/pkg/orchestrator/dynamic"
rt := dynamic.New("coordinator",
dynamic.WithConfig(dynamic.Config{
MaxConcurrent: 5,
MaxDepth: 3,
CostBudget: 2.00,
}),
dynamic.WithTracer(tracer),
)
// Embed the runtime in context for child access
ctx := rt.Context(ctx)
// Spawn any orchestration pattern as a child
agentResult, _ := rt.SpawnAgent(ctx, researchAgent, "Find papers on X")
teamResult, _ := rt.SpawnTeam(ctx, reviewTeam, agentResult.Message.Content)
pipeResult, _ := rt.SpawnPipeline(ctx, summarizePipeline, teamResult.Decision.Content)
graphResult, _ := rt.SpawnGraph(ctx, publishGraph, pipeResult.Output)
// Aggregate metrics across all children
res := rt.Result()
fmt.Printf("Children: %d, Cost: $%.4f\n", len(res.Children), res.TotalCost)
Spawning Children
Four spawn methods cover the workload types a runtime can launch as children: single agent, team, pipeline, and graph. Each call blocks until the child completes, inherits the parent's tracing context, and records cost and usage metrics.
// Spawn a single agent
agentResult, err := rt.SpawnAgent(ctx, agent, "input")
// Spawn a team
teamResult, err := rt.SpawnTeam(ctx, team, "discuss this")
// Spawn a pipeline
pipeResult, err := rt.SpawnPipeline(ctx, pipeline, "process this")
// Spawn a graph
graphResult, err := rt.SpawnGraph(ctx, graph, "route this")
Context Propagation
The runtime is stored in context so nested orchestrations can dynamically spawn further children up to the configured depth limit.
// Retrieve runtime from context (e.g., inside a tool)
rt := dynamic.FromContext(ctx)
if rt != nil {
// Spawn a sub-task from within a tool execution
result, _ := rt.SpawnAgent(ctx, helperAgent, "assist with this")
}
// Check current nesting depth
depth := dynamic.DepthFromContext(ctx)
Resource Governance
The runtime enforces resource limits to prevent runaway costs, infinite recursion, and resource exhaustion.
MaxConcurrent
Maximum number of children executing simultaneously. Uses a semaphore — excess spawns block until a slot is available.
MaxDepth
Maximum nesting depth for recursive spawning. Prevents infinite recursion when children spawn further children. Defaults to 10.
CostBudget
Maximum total cost in USD across all children. New spawns are rejected once the budget is exhausted.
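The semaphore behavior described for MaxConcurrent can be sketched with a buffered channel, a common Go idiom for a counting semaphore. The `peakConcurrency` function below is a hypothetical illustration, not GoGrid code; it launches more tasks than the limit allows and records how many were ever running at once.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// peakConcurrency launches n tasks but lets at most limit run at once.
// It returns the highest number of tasks observed running simultaneously.
func peakConcurrency(n, limit int) int {
	sem := make(chan struct{}, limit) // buffered channel as counting semaphore
	var running, peak int64
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire: blocks while limit tasks are running
			defer func() { <-sem }() // release the slot when the task ends

			cur := atomic.AddInt64(&running, 1)
			for {
				old := atomic.LoadInt64(&peak)
				if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
					break
				}
			}
			time.Sleep(5 * time.Millisecond) // simulated work
			atomic.AddInt64(&running, -1)
		}()
	}
	wg.Wait()
	return int(peak)
}

func main() {
	fmt.Println(peakConcurrency(10, 3) <= 3) // true
}
```

Excess tasks simply block on the channel send until a slot frees up, which matches the blocking behavior described above. In GoGrid these limits are set through the runtime config: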
rt := dynamic.New("governed",
dynamic.WithConfig(dynamic.Config{
MaxConcurrent: 3, // at most 3 children running at once
MaxDepth: 4, // max 4 levels of nesting
CostBudget: 1.00, // $1.00 total across all children
}),
)
// Check remaining budget before expensive operations
remaining := rt.RemainingBudget() // -1 if unlimited
Cascading Cancellation
All children use derived contexts. Canceling the parent context automatically cancels all running children — no orphaned goroutines or wasted LLM calls.
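The mechanism behind this is Go's standard context tree. The sketch below uses only the `context` package and a hypothetical `cancelAll` helper; it does not touch GoGrid. Each worker holds a context derived from a shared parent, and canceling the parent releases all of them.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// cancelAll starts n workers on contexts derived from one parent, cancels
// the parent, and returns how many workers stopped because of that
// cancellation.
func cancelAll(n int) int {
	parent, cancel := context.WithCancel(context.Background())
	defer cancel()

	var wg sync.WaitGroup
	var stopped int64
	for i := 0; i < n; i++ {
		// Each child has its own generous timeout but still inherits
		// cancellation from the parent.
		child, childCancel := context.WithTimeout(parent, time.Minute)
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer childCancel()
			<-child.Done() // stands in for a long-running LLM call
			if errors.Is(child.Err(), context.Canceled) {
				atomic.AddInt64(&stopped, 1)
			}
		}()
	}

	cancel() // canceling the parent cancels every derived context
	wg.Wait()
	return int(stopped)
}

func main() {
	fmt.Println(cancelAll(4)) // 4
}
```

With a GoGrid runtime, the equivalent looks like this: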
ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
defer cancel()
ctx = rt.Context(ctx)
// All spawned children will be canceled if the 30s timeout fires
rt.SpawnAgent(ctx, slowAgent, "this might take a while")
Async & Futures
Use the runtime's Go method to launch children in the background. It returns a Future that can be awaited or polled.
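A future of this kind is a small amount of code in Go. The following sketch shows one plausible shape using a done channel. It is an assumption about the general pattern rather than GoGrid's implementation, and the `future` type and `goAsync` function are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
)

// future holds the eventual result of a background task.
type future struct {
	done chan struct{}
	out  string
	err  error
}

// goAsync runs fn in a goroutine and returns a future for its result.
func goAsync(ctx context.Context, fn func(context.Context) (string, error)) *future {
	f := &future{done: make(chan struct{})}
	go func() {
		defer close(f.done) // closing after the assignment publishes the result
		f.out, f.err = fn(ctx)
	}()
	return f
}

// Done returns a channel that is closed when the task has finished.
func (f *future) Done() <-chan struct{} { return f.done }

// Wait blocks until the task finishes or ctx is canceled.
func (f *future) Wait(ctx context.Context) (string, error) {
	select {
	case <-f.done:
		return f.out, f.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// demo launches two tasks concurrently and joins their results.
func demo() (string, error) {
	ctx := context.Background()
	f1 := goAsync(ctx, func(context.Context) (string, error) { return "papers", nil })
	f2 := goAsync(ctx, func(context.Context) (string, error) { return "trends", nil })

	a, err := f1.Wait(ctx)
	if err != nil {
		return "", err
	}
	b, err := f2.Wait(ctx)
	if err != nil {
		return "", err
	}
	return a + " | " + b, nil
}

func main() {
	fmt.Println(demo()) // papers | trends <nil>
}
```

GoGrid's version of the pattern: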
// Launch multiple children concurrently
f1 := rt.Go(ctx, "research", func(ctx context.Context) (string, error) {
r, err := rt.SpawnAgent(ctx, researchAgent, "find papers")
if err != nil {
return "", err
}
return r.Message.Content, nil
})
f2 := rt.Go(ctx, "analyze", func(ctx context.Context) (string, error) {
r, err := rt.SpawnAgent(ctx, analysisAgent, "analyze trends")
if err != nil {
return "", err
}
return r.Message.Content, nil
})
// Await results
research, _ := f1.Wait(ctx)
analysis, _ := f2.Wait(ctx)
// Or wait for all background children at once
rt.Wait()
Future API
// Wait blocks until the future completes or context is canceled
output, err := future.Wait(ctx)
// Done returns a channel that closes when the future completes
select {
case <-future.Done():
output, err := future.Wait(ctx) // returns immediately
case <-ctx.Done():
// timeout
}
Tracing
GoGrid has built-in structured tracing. Every agent run, LLM call, tool execution, and memory operation produces spans with parent-child relationships.
import "github.com/lonestarx1/gogrid/pkg/trace"
// In-memory tracer for testing/debugging
tracer := trace.NewInMemory()
a := agent.New("agent",
agent.WithProvider(provider),
agent.WithModel("gpt-4o"),
agent.WithTracer(tracer),
)
result, _ := a.Run(ctx, "hello")
// Inspect spans
for _, span := range tracer.Spans() {
fmt.Printf("[%s] %s (%v)\n", span.Name, span.ID, span.Attributes)
}
// Output:
// [llm.complete] abc123 {llm.model: gpt-4o, llm.turn: 1}
// [memory.load] def456 {memory.key: agent}
// [memory.save] ghi789 {memory.key: agent, memory.entries: 3}
// [agent.run] jkl012 {agent.name: agent, agent.turns: 1}
Stdout Tracer
For structured logging, the stdout tracer writes spans as JSON lines.
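The JSON-lines format means one JSON object per line, which makes the output easy to grep or feed to a log shipper. The sketch below shows the idea with `encoding/json`. The `span` struct is hypothetical; its field names are borrowed from the sample output in this section and may not match GoGrid's actual type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
	"time"
)

// span is a hypothetical span record; the JSON keys mirror the sample
// output shown in this section.
type span struct {
	ID         string            `json:"id"`
	Name       string            `json:"name"`
	StartTime  time.Time         `json:"start_time"`
	Attributes map[string]string `json:"attributes"`
}

// writeSpan writes s to w as a single JSON line. Encoder.Encode appends
// the trailing newline.
func writeSpan(w io.Writer, s span) error {
	return json.NewEncoder(w).Encode(s)
}

// demoLine encodes one fixed span and returns the resulting line.
func demoLine() string {
	var b strings.Builder
	s := span{
		ID:         "abc123",
		Name:       "agent.run",
		StartTime:  time.Date(2026, 2, 16, 12, 0, 0, 0, time.UTC),
		Attributes: map[string]string{"agent.name": "agent"},
	}
	if err := writeSpan(&b, s); err != nil {
		return "error: " + err.Error()
	}
	return b.String()
}

func main() {
	fmt.Print(demoLine())
}
```

Creating GoGrid's stdout tracer takes one line: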
tracer := trace.NewStdout(os.Stdout)
// Each completed span is written as a JSON line:
// {"id":"...","name":"agent.run","start_time":"...","attributes":{...}}
Team Trace Spans
Team execution produces a trace tree with team, round, and per-agent spans.
// Trace tree for a team run:
// team.run
// └─ team.round (round=1)
// ├─ agent.run (agent-a)
// │ ├─ memory.load
// │ ├─ llm.complete
// │ └─ memory.save
// └─ agent.run (agent-b)
// ├─ memory.load
// ├─ llm.complete
// └─ memory.save
Pipeline Trace Spans
Pipeline execution produces a trace tree with pipeline and per-stage spans.
// Trace tree for a pipeline run:
// pipeline.run
// ├─ pipeline.stage (collect)
// │ ├─ agent.run
// │ │ ├─ memory.load
// │ │ ├─ llm.complete
// │ │ └─ memory.save
// ├─ pipeline.stage (analyze)
// │ └─ agent.run ...
// └─ pipeline.stage (summarize)
// └─ agent.run ...
Graph Trace Spans
// Trace tree for a graph run:
// graph.run
// ├─ graph.node (draft, iteration=1)
// │ └─ agent.run ...
// ├─ graph.node (review, iteration=1)
// │ └─ agent.run ...
// ├─ graph.node (revise, iteration=1)
// │ └─ agent.run ...
// ├─ graph.node (review, iteration=2)
// │ └─ agent.run ...
// └─ graph.node (publish, iteration=1)
// └─ agent.run ...
Dynamic Trace Spans
// Trace tree for dynamic orchestration:
// dynamic.spawn_agent (research)
// └─ agent.run ...
// dynamic.spawn_team (debate)
// └─ team.run ...
// dynamic.go (background-task)
// └─ dynamic.spawn_pipeline (summarize)
// └─ pipeline.run ...
OpenTelemetry Export
The OTLP exporter sends GoGrid trace spans to any OpenTelemetry-compatible backend — Jaeger, Zipkin, Grafana Tempo, and others. Built entirely with the Go standard library — no external OTel SDK required.
import "github.com/lonestarx1/gogrid/pkg/trace/otel"
exporter := otel.NewExporter(
otel.WithEndpoint("http://localhost:4318/v1/traces"),
otel.WithServiceName("my-agent-service"),
otel.WithServiceVersion("1.0.0"),
otel.WithBatchSize(100),
otel.WithFlushInterval(5 * time.Second),
)
defer exporter.Shutdown()
// Use as any tracer
a := agent.New("assistant",
agent.WithTracer(exporter),
agent.WithProvider(provider),
agent.WithModel("gpt-4o"),
)
result, _ := a.Run(ctx, "hello")
Batching & Flushing
Spans are batched in memory and flushed periodically or when the batch size is reached. Shutdown sends all remaining spans.
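The size-triggered half of this behavior, together with the final flush on shutdown, can be sketched as follows. The `batcher` type is hypothetical and standard-library only; the timer-driven flush is left out to keep the example deterministic, and a real exporter would send the batch over HTTP rather than call a local function.

```go
package main

import (
	"fmt"
	"sync"
)

// batcher buffers spans and hands them to flush in groups of size.
type batcher struct {
	mu    sync.Mutex
	size  int
	buf   []string
	flush func(batch []string)
}

// Add buffers one span and flushes once the batch size is reached.
func (b *batcher) Add(span string) {
	b.mu.Lock()
	b.buf = append(b.buf, span)
	var out []string
	if len(b.buf) >= b.size {
		out, b.buf = b.buf, nil
	}
	b.mu.Unlock()

	if out != nil {
		b.flush(out) // called outside the lock so a slow flush does not block Add
	}
}

// Shutdown flushes whatever is still buffered.
func (b *batcher) Shutdown() {
	b.mu.Lock()
	out := b.buf
	b.buf = nil
	b.mu.Unlock()

	if len(out) > 0 {
		b.flush(out)
	}
}

// demoBatchSizes adds five spans with a batch size of two and reports
// the size of each flushed batch.
func demoBatchSizes() []int {
	var sizes []int
	b := &batcher{size: 2, flush: func(batch []string) {
		sizes = append(sizes, len(batch))
	}}
	for i := 0; i < 5; i++ {
		b.Add(fmt.Sprintf("span-%d", i))
	}
	b.Shutdown()
	return sizes
}

func main() {
	fmt.Println(demoBatchSizes()) // [2 2 1]
}
```

The wire format GoGrid uses for each flushed batch is summarized here: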
// Spans are sent as OTLP JSON over HTTP POST
// Content-Type: application/json
// Payload follows the OTLP JSON span format with:
// - resourceSpans[].resource.attributes: service.name, service.version
// - scopeSpans[].spans[]: traceId, spanId, parentSpanId, name, kind, timestamps
// - Attributes prefixed with "gogrid." (e.g., gogrid.agent.name)
// - exception.message for error spans
Semantic Conventions
GoGrid spans follow semantic conventions for attribute naming.
gogrid.agent.name
The agent's name. Attached to agent.run spans.
gogrid.llm.model
The LLM model used. Attached to llm.complete spans.
gogrid.tool.name
The tool's name. Attached to tool.execute spans.
gogrid.cost.usd
Cost in USD. Attached to agent.run spans.
Structured Logging
The trace/log package provides structured JSON logging with automatic trace correlation. When a span exists in the context, the logger includes trace_id and span_id in every log line.
import "github.com/lonestarx1/gogrid/pkg/trace/log"
logger := log.New(os.Stdout, log.Info)
// Basic logging
logger.Info("agent started", "agent", "researcher", "model", "gpt-4o")
// Context-aware logging with trace correlation
logger.InfoCtx(ctx, "LLM call complete", "tokens", "150", "model", "gpt-4o")
// Output (JSON):
// {"level":"info","time":"2026-02-16T12:00:00Z","msg":"LLM call complete",
// "trace_id":"abc123","span_id":"def456",
// "fields":{"tokens":"150","model":"gpt-4o"}}
Log Levels
Debug
Most verbose. Use for detailed diagnostic information.
Info
Default level. Normal operational messages.
Warn
Potential issues that don't prevent operation.
Error
Failures that need attention.
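Level filtering usually works by comparing a message's severity against the logger's configured minimum. The sketch below assumes the conventional ordering Debug < Info < Warn < Error, so a logger set to Warn drops Debug and Info messages. The `logger` type here is a hypothetical stand-in, not the trace/log package.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

type level int

// Levels in increasing order of severity.
const (
	levelDebug level = iota
	levelInfo
	levelWarn
	levelError
)

var levelNames = [...]string{"debug", "info", "warn", "error"}

// logger writes only messages at or above its threshold.
type logger struct {
	w         io.Writer
	threshold level
}

func (l *logger) log(lv level, msg string) {
	if lv < l.threshold {
		return // less severe than the configured level: dropped
	}
	fmt.Fprintf(l.w, "%s: %s\n", levelNames[lv], msg)
}

// demo emits one message per level and returns what was actually written.
func demo(threshold level) string {
	var b strings.Builder
	l := &logger{w: &b, threshold: threshold}
	l.log(levelDebug, "cache miss")
	l.log(levelInfo, "agent started")
	l.log(levelWarn, "retrying")
	l.log(levelError, "provider failed")
	return b.String()
}

func main() {
	fmt.Print(demo(levelWarn))
	// warn: retrying
	// error: provider failed
}
```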
File Logging with Rotation
FileWriter writes to disk with automatic size-based rotation. Configurable max file size and number of rotated files to keep.
fw, err := log.NewFileWriter("/var/log/gogrid.log", log.FileConfig{
MaxSize: 10 * 1024 * 1024, // 10 MB per file
MaxFiles: 5, // keep 5 rotated files
})
if err != nil {
panic(err)
}
defer fw.Close()
logger := log.New(fw, log.Debug)
logger.Info("agent started")
// Rotated files: gogrid.log, gogrid.log.1, gogrid.log.2, ...
Metrics
GoGrid provides Prometheus-compatible metrics with no external dependencies. The metrics.Collector wraps any tracer and automatically populates counters, gauges, and histograms from trace spans.
import "github.com/lonestarx1/gogrid/pkg/trace/metrics"
reg := metrics.NewRegistry()
collector := metrics.NewCollector(innerTracer, reg)
// Use collector as the tracer — metrics are recorded automatically
a := agent.New("assistant",
agent.WithTracer(collector),
agent.WithProvider(provider),
agent.WithModel("gpt-4o"),
)
// Expose metrics for Prometheus scraping
http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "text/plain; version=0.0.4")
fmt.Fprint(w, reg.Export())
})
Auto-Collected Metrics
gogrid_agent_runs_total
Total agent runs, labeled by agent name and status (ok/error).
gogrid_agent_run_duration_seconds
Histogram of agent run durations.
gogrid_llm_calls_total
Total LLM calls, labeled by model and status.
gogrid_llm_call_duration_seconds
Histogram of LLM call latencies.
gogrid_llm_tokens_total
Total tokens consumed, labeled by model and type (prompt/completion).
gogrid_tool_executions_total
Total tool executions, labeled by tool name and status.
gogrid_tool_execution_duration_seconds
Histogram of tool execution durations.
gogrid_memory_operations_total
Total memory operations, labeled by operation type (load/save).
gogrid_cost_usd_total
Total cost in USD, labeled by agent and model.
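For orientation, this is roughly what a labeled counter looks like in the Prometheus text exposition format. The sketch is a hypothetical, single-goroutine `counter` type, not GoGrid's registry, and the label names `agent` and `status` are assumptions based on the descriptions above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// counter is a minimal labeled counter. It is not safe for concurrent use.
type counter struct {
	name, help string
	values     map[string]float64 // keyed by the rendered label set
}

func newCounter(name, help string) *counter {
	return &counter{name: name, help: help, values: make(map[string]float64)}
}

// renderLabels formats labels as {k1="v1",k2="v2"} with sorted keys so
// that equal label sets always map to the same series.
func renderLabels(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return "{" + strings.Join(parts, ",") + "}"
}

// Inc adds one to the series identified by labels.
func (c *counter) Inc(labels map[string]string) {
	c.values[renderLabels(labels)]++
}

// Export renders the counter with HELP and TYPE lines followed by one
// sample line per series, in sorted order.
func (c *counter) Export() string {
	series := make([]string, 0, len(c.values))
	for s := range c.values {
		series = append(series, s)
	}
	sort.Strings(series)

	var b strings.Builder
	fmt.Fprintf(&b, "# HELP %s %s\n# TYPE %s counter\n", c.name, c.help, c.name)
	for _, s := range series {
		fmt.Fprintf(&b, "%s%s %g\n", c.name, s, c.values[s])
	}
	return b.String()
}

// demo records two successful runs and one failed run.
func demo() string {
	runs := newCounter("gogrid_agent_runs_total", "Total agent runs.")
	ok := map[string]string{"agent": "assistant", "status": "ok"}
	runs.Inc(ok)
	runs.Inc(ok)
	runs.Inc(map[string]string{"agent": "assistant", "status": "error"})
	return runs.Export()
}

func main() {
	fmt.Print(demo())
}
```

Running it prints a HELP line, a TYPE line, and one sample line per label combination, which is the shape a Prometheus scraper expects from a /metrics endpoint.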
Custom Metrics
Create your own counters, gauges, and histograms for application-specific metrics.
reg := metrics.NewRegistry()
// Counters
requests := reg.Counter("http_requests_total", "Total HTTP requests")
requests.Inc(map[string]string{"method": "GET", "status": "200"})
// Gauges
connections := reg.Gauge("active_connections", "Active connections")
connections.Set(42, map[string]string{"server": "web-1"})
// Histograms (custom buckets)
latency := reg.Histogram("request_duration_seconds", "Request latency",
0.01, 0.05, 0.1, 0.5, 1, 5)
latency.Observe(0.123, map[string]string{"handler": "api"})
Cost Tracking
Every LLM call is metered. GoGrid includes default pricing for popular models and tracks costs per agent and per team.
// Per-agent cost in Result
result, _ := a.Run(ctx, "hello")
fmt.Printf("Cost: $%.6f\n", result.Cost)
// Per-team aggregate cost
teamResult, _ := t.Run(ctx, "discuss")
fmt.Printf("Team cost: $%.6f\n", teamResult.TotalCost)
// Budget enforcement — agent stops when budget is exceeded
a := agent.New("agent",
agent.WithConfig(agent.Config{CostBudget: 0.50}),
// ...
)
// Team budget — team stops starting new rounds when exceeded
t := team.New("team",
team.WithConfig(team.Config{CostBudget: 5.00}),
// ...
)
Memory Stats in Results
When the agent's memory implements StatsMemory, the result includes aggregate memory statistics.
if result.MemoryStats != nil {
fmt.Printf("Memory: %d keys, %d entries, %d bytes\n",
result.MemoryStats.Keys,
result.MemoryStats.TotalEntries,
result.MemoryStats.TotalSize)
}
Cost Governance
Advanced cost governance features for production deployments. Set budgets with threshold alerts, allocate costs to specific entities, and generate aggregate reports.
Budget Alerts
Register a callback that fires when cost crosses threshold fractions of a configured budget. Each threshold fires at most once.
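The "at most once" rule is the interesting part, so here is a minimal sketch of it. The `budget` type and `newBudget` function are hypothetical and standard-library only; they assume thresholds are fractions of the limit and that each one is marked as fired the first time cumulative spend reaches it.

```go
package main

import "fmt"

// budget tracks cumulative spend against a limit and fires each
// threshold callback at most once.
type budget struct {
	limit      float64
	spent      float64
	thresholds []float64 // fractions of limit, e.g. 0.5 for 50%
	fired      []bool    // fired[i] records whether thresholds[i] has fired
	onAlert    func(threshold, current float64)
}

func newBudget(limit float64, onAlert func(threshold, current float64), thresholds ...float64) *budget {
	return &budget{
		limit:      limit,
		thresholds: thresholds,
		fired:      make([]bool, len(thresholds)),
		onAlert:    onAlert,
	}
}

// Add records a cost and fires every threshold crossed for the first time.
func (b *budget) Add(cost float64) {
	b.spent += cost
	for i, th := range b.thresholds {
		if !b.fired[i] && b.spent >= th*b.limit {
			b.fired[i] = true
			b.onAlert(th, b.spent)
		}
	}
}

// demo spends $13 against a $10 budget in five steps and returns the
// thresholds that fired, in order.
func demo() []float64 {
	var alerts []float64
	b := newBudget(10, func(th, _ float64) {
		alerts = append(alerts, th)
	}, 0.5, 0.8, 1.0)
	for _, c := range []float64{3, 3, 3, 3, 1} {
		b.Add(c)
	}
	return alerts
}

func main() {
	fmt.Println(demo()) // [0.5 0.8 1]
}
```

The final $1 pushes spend further past the limit but fires nothing, because all three thresholds have already been marked. GoGrid's tracker exposes the feature as follows: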
import "github.com/lonestarx1/gogrid/pkg/cost"
tracker := cost.NewTracker()
tracker.SetBudget(10.00) // $10 budget
tracker.OnBudgetThreshold(func(threshold, current float64) {
fmt.Printf("ALERT: %.0f%% of budget reached ($%.2f)\n",
threshold*100, current)
}, 0.5, 0.8, 1.0) // alerts at 50%, 80%, 100%
// As costs accumulate, alerts fire automatically
tracker.Add("gpt-4o", usage) // triggers 50% alert at $5
Cost Allocation
Attribute costs to specific entities — agents, teams, pipelines, or any named component. Track which parts of your system spend the most.
// Attribute costs to specific entities
tracker.AddForEntity("gpt-4o", "research-agent", usage1)
tracker.AddForEntity("gpt-4o", "summary-agent", usage2)
tracker.AddForEntity("gpt-4o-mini", "research-agent", usage3)
// Query per-entity costs
researchCost := tracker.EntityCost("research-agent")
summaryCost := tracker.EntityCost("summary-agent")
fmt.Printf("Research: $%.4f, Summary: $%.4f\n", researchCost, summaryCost)
Cost Reports
Generate aggregate reports breaking down costs by model and entity.
report := tracker.Report()
fmt.Printf("Total: $%.4f across %d calls\n", report.TotalCost, report.RecordCount)
// By model
for model, mr := range report.ByModel {
fmt.Printf(" %s: %d calls, $%.4f, %d tokens\n",
model, mr.Calls, mr.Cost, mr.Usage.TotalTokens)
}
// By entity
for entity, entityCost := range report.ByEntity { // avoid shadowing the cost package
fmt.Printf(" %s: $%.4f\n", entity, entityCost)
}
Mock Provider
The pkg/llm/mock package provides a configurable mock LLM provider for testing GoGrid agents, teams, pipelines, and graphs without API keys.
Basic Usage
import (
"github.com/lonestarx1/gogrid/pkg/llm"
"github.com/lonestarx1/gogrid/pkg/llm/mock"
)
// Fixed response for all calls.
provider := mock.New(mock.WithFallback(&llm.Response{
Message: llm.NewAssistantMessage("mock response"),
Usage: llm.Usage{PromptTokens: 10, CompletionTokens: 5, TotalTokens: 15},
Model: "mock",
}))
agent.New("my-agent", agent.WithProvider(provider), agent.WithModel("mock"))
Sequential Responses
Queue multiple responses for multi-turn conversations. The provider returns them in order, then falls back to the fallback response.
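The queue-then-fallback behavior is simple to picture in isolation. The sketch below uses a hypothetical `scripted` type that returns plain strings; it is not the mock package and does not model requests, usage, or errors.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// scripted returns queued responses in order, then a fallback forever.
type scripted struct {
	mu       sync.Mutex
	queue    []string
	fallback string
	calls    int
}

// Next returns the next queued response, or the fallback once the queue
// is empty. It also counts calls, which is handy for test assertions.
func (s *scripted) Next() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.calls++
	if len(s.queue) > 0 {
		r := s.queue[0]
		s.queue = s.queue[1:]
		return r
	}
	return s.fallback
}

// demo makes four calls against a queue of two responses.
func demo() (string, int) {
	s := &scripted{queue: []string{"first", "second"}, fallback: "fallback"}
	var got []string
	for i := 0; i < 4; i++ {
		got = append(got, s.Next())
	}
	return strings.Join(got, ", "), s.calls
}

func main() {
	fmt.Println(demo()) // first, second, fallback, fallback 4
}
```

With the GoGrid mock provider: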
toolCallResp := &llm.Response{
Message: llm.Message{
Role: llm.RoleAssistant,
ToolCalls: []llm.ToolCall{{ID: "tc-1", Function: "search", Arguments: []byte(`{"q":"test"}`)}},
},
Usage: llm.Usage{PromptTokens: 10, CompletionTokens: 5, TotalTokens: 15},
Model: "mock",
}
finalResp := &llm.Response{
Message: llm.NewAssistantMessage("found it"),
Usage: llm.Usage{PromptTokens: 20, CompletionTokens: 10, TotalTokens: 30},
Model: "mock",
}
provider := mock.New(
mock.WithResponses(toolCallResp, finalResp),
mock.WithFallback(finalResp),
)
Error Injection & Latency
// Always fail.
provider := mock.New(mock.WithError(errors.New("api unavailable")))
// Fail first 2 calls, then succeed.
provider = mock.New(
mock.WithFailCount(2),
mock.WithFallback(successResponse),
)
// Simulate 100ms latency (respects context cancellation).
provider = mock.New(
mock.WithDelay(100 * time.Millisecond),
mock.WithFallback(response),
)
Call Recording
The mock provider records all calls for assertions in tests.
provider := mock.New(mock.WithFallback(response))
// ... run agent ...
fmt.Println(provider.Calls()) // Number of Complete calls
history := provider.History() // All recorded call params
provider.Reset() // Clear history, keep config
Evaluation Framework
The pkg/eval package provides composable evaluators for scoring agent outputs. Evaluators measure both output quality and operational metrics like cost and tool usage.
Evaluator Interface
type Evaluator interface {
Name() string
Evaluate(ctx context.Context, result *agent.Result) (Score, error)
}
type Score struct {
Pass bool // Primary signal: did the result meet criteria?
Value float64 // Normalized 0.0-1.0 for trend analysis
Reason string // Human-readable explanation
}
Built-in Evaluators
import "github.com/lonestarx1/gogrid/pkg/eval"
// Exact string match.
eval.NewExactMatch("expected output")
// Substring containment (Value = fraction of substrings found).
eval.NewContains("Go", "concurrency", "goroutines")
// Cost budget check.
eval.NewCostWithin(0.05) // $0.05 USD max
// Tool usage expectations.
eval.NewToolUse(eval.ToolExpectation{
Name: "search",
MinCalls: 1,
})
// LLM-as-judge (scores 0-10, passes at >= 7).
eval.NewLLMJudge(provider, "gpt-4o", "Rate clarity and accuracy.")
Evaluation Suite
Compose multiple evaluators into a suite. The suite runs all evaluators and aggregates results. SuiteResult.Pass is true only if every evaluator passed.
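The all-must-pass aggregation can be sketched in a few lines. The types below are hypothetical simplifications that score a plain string instead of an agent result. The sketch also assumes that a substring check passes only when every substring is present, with Value reporting the fraction found.

```go
package main

import (
	"fmt"
	"strings"
)

// score mirrors the Pass/Value/Reason shape described in this section.
type score struct {
	Pass   bool
	Value  float64
	Reason string
}

// evaluator is a hypothetical, simplified evaluator over a string output.
type evaluator struct {
	name string
	eval func(output string) score
}

// containsAll passes when every substring is present (an assumption);
// Value is the fraction of substrings found.
func containsAll(subs ...string) evaluator {
	return evaluator{name: "contains", eval: func(out string) score {
		if len(subs) == 0 {
			return score{Pass: true, Value: 1, Reason: "nothing to check"}
		}
		found := 0
		for _, s := range subs {
			if strings.Contains(out, s) {
				found++
			}
		}
		return score{
			Pass:   found == len(subs),
			Value:  float64(found) / float64(len(subs)),
			Reason: fmt.Sprintf("%d of %d substrings found", found, len(subs)),
		}
	}}
}

// minLength passes when the output has at least n bytes.
func minLength(n int) evaluator {
	return evaluator{name: "min_length", eval: func(out string) score {
		if len(out) >= n {
			return score{Pass: true, Value: 1, Reason: "sufficient"}
		}
		return score{Pass: false, Value: 0, Reason: "too short"}
	}}
}

// runSuite runs every evaluator and passes only if all of them pass.
func runSuite(output string, evs ...evaluator) (bool, map[string]score) {
	scores := make(map[string]score, len(evs))
	pass := true
	for _, e := range evs {
		s := e.eval(output)
		scores[e.name] = s
		pass = pass && s.Pass
	}
	return pass, scores
}

func main() {
	pass, scores := runSuite("Go is a compiled language", containsAll("Go", "compiled"), minLength(20))
	fmt.Println(pass, scores["contains"].Value) // true 1

	pass, scores = runSuite("Go", containsAll("Go", "compiled"), minLength(20))
	fmt.Println(pass, scores["contains"].Value) // false 0.5
}
```

A suite built from GoGrid's evaluators: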
suite := eval.NewSuite(
eval.NewContains("Go", "compiled"),
eval.NewCostWithin(0.05),
eval.NewFunc("min_length", func(_ context.Context, r *agent.Result) (eval.Score, error) {
if len(r.Message.Content) >= 20 {
return eval.Score{Pass: true, Value: 1.0, Reason: "sufficient"}, nil
}
return eval.Score{Pass: false, Value: 0.0, Reason: "too short"}, nil
}),
)
result, err := suite.Run(ctx, agentResult)
fmt.Println(result.Pass) // true if all passed
for name, score := range result.Scores {
fmt.Printf("%s: pass=%v value=%.2f reason=%s\n",
name, score.Pass, score.Value, score.Reason)
}
Benchmarks
The pkg/eval/bench package provides benchmarks for GoGrid's core patterns using the mock provider. All benchmarks measure framework overhead, not LLM latency.
# Run all benchmarks
go test -bench=. ./pkg/eval/bench/
# With memory profiling
go test -bench=. -benchmem ./pkg/eval/bench/
# Available benchmarks:
# BenchmarkAgentRun - Basic agent execution
# BenchmarkAgentRunWithToolUse - Agent with tool calling
# BenchmarkAgentRunParallel - Concurrent agent execution
# BenchmarkPipelineThreeStages - Fixed three-stage pipeline
# BenchmarkPipelineScaling - Pipeline: 1, 3, 5, 10 stages
# BenchmarkTeamTwoMembers - Two-agent team
# BenchmarkTeamScaling - Team: 1, 2, 5, 10, 20 members
# BenchmarkSharedMemorySaveLoad - Memory load/save ops
# BenchmarkSharedMemoryContention - Memory: 1, 2, 5, 10 writers