private beta · v0.4

Agents that learn
from every run.

The contextual-learning layer for AI agents. We ingest your production telemetry, score every run, and write what works into a memory your agents read before the next run.

0 runs analyzed (last 30d)
0 lessons learned
+0% mean accuracy lift
−0% repeated-mistake rate
0 lines to instrument
spolm · e28b300f… · complete
reasoning 756ms
authenticate 2ms
reasoning 563ms
fetch_email 292ms
reasoning 776ms
summarize 763ms
generateReply 1062ms
sendReply 524ms
01 · trace

See every step the agent took.

A timeline of every reasoning call, tool invocation, and result — with full inputs, outputs, and timing. Replayable, queryable, framework-agnostic.
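
As a rough illustration of what one of these timelines could look like as data, here is a minimal sketch in plain Python. The `Step` and `Trace` classes and all of their field names are hypothetical, not Spolm's actual schema; the sample values are the first four steps of the Gmail example run on this page.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Step:
    # Hypothetical record for one timeline entry (not Spolm's real schema).
    name: str
    kind: str            # e.g. "llm", "auth", "email"
    duration_ms: int
    status: str          # "success" or "failure"
    input: dict[str, Any] = field(default_factory=dict)
    output: dict[str, Any] = field(default_factory=dict)


@dataclass
class Trace:
    run_id: str
    task: str
    steps: list[Step] = field(default_factory=list)

    def failures(self) -> list[Step]:
        """Steps that ended in failure, in timeline order."""
        return [s for s in self.steps if s.status == "failure"]

    def total_ms(self) -> int:
        """Sum of the recorded step durations."""
        return sum(s.duration_ms for s in self.steps)

    def slowest(self) -> Step:
        """The single longest step in the run."""
        return max(self.steps, key=lambda s: s.duration_ms)


trace = Trace(
    run_id="e28b300f-406c…",  # truncated on the page, kept truncated here
    task="Read the latest email and reply with 'chicken' in it",
    steps=[
        Step("reasoning", "llm", 756, "failure",
             input={"iteration": 1, "context": {"authenticated": False}},
             output={"action": "authenticate"}),
        Step("authenticate", "auth", 2, "success"),
        Step("reasoning", "llm", 563, "success"),
        Step("fetch_email", "email", 292, "success"),
    ],
)

print([s.name for s in trace.failures()], trace.total_ms(), trace.slowest().name)
```

Keeping inputs and outputs on each step is what makes a run queryable after the fact: filtering for failures or sorting by duration becomes a one-line operation.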

spolm · Personal · Logs · e28b300f-406c-45d1…
GmailReadAndReply
Task: Read the latest email and reply with 'chicken' in it
Run ID e28b300f-406c… · Duration 7.147s · Steps 11 · Status Complete
Overview · Logs · Issues · Rubric Evals

Steps
 1. reasoning       llm · gemini          756ms   failure
 2. authenticate    auth · gmail_oauth      2ms   success
 3. reasoning       llm · gemini          563ms   success
 4. fetch_email     email · gmail         292ms   success
 5. reasoning       llm · gemini          776ms   success
 6. summarize       llm · gemini          763ms   success
 7. reasoning       llm · gemini          776ms   success
 8. generateReply   llm · gemini         1062ms   success
 9. reasoning       llm · gemini          927ms   success
10. sendReply       email · gmail         524ms   success
11. reasoning       llm · gemini          664ms   success

reasoning · Step 3c2a574b-ca03-44a4-ba72… · failure
INPUT
{
  "goal": "Read the latest email and reply with 'chicken' in it",
  "iteration": 1,
  "context": {
    "lastAction": null,
    "authenticated": false
  }
}
OUTPUT
{
  "action": "authenticate",
  "reasoning": "The agent is not authenticated, so it needs to authenticate first."
}
02 · score

Every run gets a grade and a diagnosis.

Configurable rubrics evaluate the run end-to-end. Spolm pinpoints what went right and what went wrong, then writes a root-cause summary traced to the exact step that broke.
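
To make the idea of a rubric concrete, here is a minimal sketch under stated assumptions: a run is a list of step dicts, a rubric is a dict of named pass/fail checks, and the root cause is simply the first failed step. The check names, the 0-10 scaling, and the `score_run` helper are all illustrative and do not describe Spolm's actual rubric format.

```python
from typing import Callable, Optional


def first_index(run: list, name: str) -> int:
    """Index of the first step called `name`, or len(run) if it never ran."""
    for i, step in enumerate(run):
        if step["name"] == name:
            return i
    return len(run)


# Hypothetical rubric: each entry maps a check name to a predicate over the run.
RUBRIC: dict[str, Callable[[list], bool]] = {
    "authenticated_before_fetch": lambda run: (
        first_index(run, "authenticate") < first_index(run, "fetch_email")
    ),
    "reply_sent": lambda run: any(
        s["name"] == "sendReply" and s["status"] == "success" for s in run
    ),
    "no_failed_steps": lambda run: all(s["status"] == "success" for s in run),
}


def first_failure(run: list) -> Optional[dict]:
    """Naive root cause: the earliest step that failed (1-based position)."""
    for position, step in enumerate(run, start=1):
        if step["status"] == "failure":
            return {"step": position, "name": step["name"]}
    return None


def score_run(run: list, rubric: dict) -> dict:
    passed = [name for name, check in rubric.items() if check(run)]
    failed = [name for name in rubric if name not in passed]
    return {
        "score": round(10 * len(passed) / len(rubric), 1),  # scaled to 0-10
        "passed": passed,
        "failed": failed,
        "root_cause": first_failure(run),
    }


run = [
    {"name": "reasoning", "status": "failure"},
    {"name": "authenticate", "status": "success"},
    {"name": "fetch_email", "status": "success"},
    {"name": "sendReply", "status": "success"},
]
report = score_run(run, RUBRIC)
print(report)
```

A real diagnosis would need more than "first failed step", but the shape is the same: every check yields a pass or a fail, and the failures point back at a specific step.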

spolm · Logs · Run Overview
Task: What is quantum computing?
Run ID d96995ef-8ce7-4472-ab42… · Duration 59.88s · Tokens 0
Status: complete
Score: 0/10
0 passed · 0 failed
✓ What went right
✕ What went wrong
◐ Root cause summary

03 · learn

Yesterday's failure is today's context.

Successful and failed decision patterns are written to a knowledge base ranked by accuracy. Before each run, the agent retrieves the most relevant lessons and adjusts its strategy before taking the first step.
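
One way to picture the retrieval step is the sketch below. It is a toy, not Spolm's retrieval: relevance is plain word overlap (Jaccard similarity) between the task and hand-written tags, with accuracy lift as the tie-breaker. The `Lesson` class, the tags, and the second and third lessons with their lift values are made up for illustration; only the first lesson and its +24% come from the example on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lesson:
    text: str
    tags: frozenset
    accuracy_lift: float  # 0.24 means +24%


def relevance(task: str, lesson: Lesson) -> float:
    """Jaccard overlap between task words and lesson tags (a stand-in metric)."""
    words = set(task.lower().split())
    if not words or not lesson.tags:
        return 0.0
    return len(words & lesson.tags) / len(words | lesson.tags)


def retrieve(task: str, knowledge_base: list, k: int = 3) -> list:
    """Top-k relevant lessons: most relevant first, higher lift breaks ties."""
    scored = [(relevance(task, lesson), lesson) for lesson in knowledge_base]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: (pair[0], pair[1].accuracy_lift), reverse=True)
    return [lesson for _, lesson in relevant[:k]]


knowledge_base = [
    Lesson("Always authenticate before fetch_email",
           frozenset({"email", "reply", "fetch_email"}), 0.24),
    # The two lessons below are invented examples, not real Spolm output.
    Lesson("Summarize long threads before drafting a reply",
           frozenset({"email", "reply", "thread", "summarize"}), 0.10),
    Lesson("Cite the source document for each claim",
           frozenset({"research", "citation"}), 0.15),
]

lessons = retrieve("reply to email", knowledge_base, k=2)
for lesson in lessons:
    print(f"{lesson.text} (+{lesson.accuracy_lift:.0%})")
```

A production system would more plausibly use embeddings than tag overlap, but the contract is the same: a task goes in, a short ranked list of lessons comes out.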

Session 1 · Yesterday
Run #182 · 8 steps · 4.9s · 2 failures
reasoning 756ms
reasoning 612ms
fetch_email 401ms
authenticate 4ms
fetch_email 292ms
reasoning 776ms
generateReply 1062ms
sendReply 524ms
Lesson learned: Always authenticate before fetch_email
accuracy +24% · tool sequence

Knowledge base · 246 patterns · retrieved as context

Session 2 · Today
Run #183 · 4 steps · 1.8s · 0 failures
authenticate 3ms
fetch_email 280ms
generateReply 998ms
sendReply 510ms
04 · quickstart

Four lines to instrument.

Drop the SDK into your agent loop. Spolm tails your runs, scores them, and exposes a single retrieve() call that returns relevant context for the next prompt.

  • Works with LangChain, LlamaIndex, AutoGen, or a custom loop.
  • OpenTelemetry-compatible. Hosted or self-hosted.
  • SOC 2 in progress. PII redaction on by default.
python · agent.py
# 1. create the client
from spolm import Spolm
sp = Spolm(api_key="sk-…")
task = "reply to email"

# 2. retrieve learned context before each run
ctx = sp.retrieve(task=task)

# 3. instrument the run (`agent` is your existing agent object)
with sp.run(task) as r:
    result = agent.invoke(task, context=ctx)
    r.log_result(result)

# 4. spolm scores it, stores the lessons, ranks them
#    next retrieve() gets smarter automatically.
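
What your agent does with the retrieved context is up to you. As one possibility, the sketch below folds lessons into the prompt as a bulleted preamble. It assumes the context can be treated as a list of short strings, which is an assumption rather than the documented return type of `retrieve()`, and `build_prompt` is a hypothetical helper, not part of the SDK.

```python
def build_prompt(task: str, lessons: list[str], max_lessons: int = 5) -> str:
    """Prepend up to `max_lessons` retrieved lessons to the task prompt."""
    if not lessons:
        return f"Task: {task}"
    bullets = "\n".join(f"- {lesson}" for lesson in lessons[:max_lessons])
    return f"Lessons from previous runs:\n{bullets}\n\nTask: {task}"


prompt = build_prompt(
    "reply to email",
    ["Always authenticate before fetch_email"],
)
print(prompt)
```

Capping the number of lessons keeps the preamble from crowding out the task itself as the knowledge base grows.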
built for

Teams shipping AI agents in production.

customer support
Triage that improves with every ticket.

Catch reasoning loops, hand-off mistakes, and tone drift before they reach customers. Tighter every week, no retraining.

research & retrieval
Stop relitigating the same query.

When one agent navigates a knowledge base for a query, the next inherits the path. Fewer tokens, faster answers.

dev tools
Coding agents that remember your repo.

Naming, internal APIs, test idioms — learned once, applied across every PR. The agent stops re-introducing the same bugs.

we're in private beta

See it on your own agents.