Core Framework¶
The versifai.core package provides the shared agentic infrastructure: the ReAct loop engine, LLM client, memory management, tool system, and configuration.
Agent Base¶
BaseAgent
¶
BaseAgent(display: AgentDisplay, memory: AgentMemory, llm: LLMClient, registry: ToolRegistry)
Base class providing the ReAct loop and shared infrastructure.
Subclasses must:
- Call super().__init__(display, memory, llm, registry)
- Implement _register_tools()
- Set self._system_prompt
Source code in src/versifai/core/agent.py
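A minimal subclass might look like the sketch below. Only the three requirements above come from the source; the class name, prompt text, tool class, and the registry wiring are illustrative assumptions::

    from versifai.core.agent import BaseAgent   # import path assumed from the source location above

    class ProfilingAgent(BaseAgent):            # hypothetical agent
        def __init__(self, display, memory, llm, registry):
            super().__init__(display, memory, llm, registry)
            self._system_prompt = "You are a data profiling agent."

        def _register_tools(self) -> None:
            # ProfileTableTool, register(), and the _registry attribute name
            # are illustrative assumptions, not confirmed API.
            self._registry.register(ProfileTableTool())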
LLM Client¶
LLMClient
¶
LLMClient(model: str = 'claude-sonnet-4-6', max_tokens: int = 8192, api_key: str | None = None, api_base: str | None = None, retry_attempts: int = 3, retry_base_delay: float = 2.0, extended_context: bool = True)
Multi-provider LLM client for tool-use conversations.
Uses LiteLLM (https://docs.litellm.ai/) under the hood to support
Anthropic Claude, OpenAI, Azure, Bedrock, Vertex, and 100+ other providers
through a single unified API.
Provider is inferred from the model string:
- "claude-sonnet-4-6" → Anthropic
- "gpt-4o" → OpenAI
- "bedrock/anthropic.claude-3-5-sonnet" → AWS Bedrock
API keys are resolved from environment variables automatically by LiteLLM
(ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) or can be passed explicitly.
Example::
from versifai.core import LLMClient
# Anthropic Claude (default)
llm = LLMClient()
# OpenAI
llm = LLMClient(model="gpt-4o")
# Explicit key
llm = LLMClient(model="claude-sonnet-4-6", api_key="sk-...")
Source code in src/versifai/core/llm.py
send
¶
send(messages: list[dict], system: str, tools: list[dict]) -> LLMResponse
Send a message with tool definitions and return the response.
For Anthropic models, applies prompt caching on the system prompt and tool definitions to reduce token costs.
Parameters¶
messages : list[dict]
    Conversation history (Anthropic or OpenAI format — auto-converted).
system : str
    System prompt text.
tools : list[dict]
    Tool definitions (Anthropic format — auto-converted for other providers by LiteLLM).
Returns¶
LLMResponse
    Normalized response with content blocks and usage stats.
Source code in src/versifai/core/llm.py
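A hedged usage sketch; the prompt text is made up, and the registry variable is assumed to be a populated ToolRegistry::

    from versifai.core import LLMClient

    llm = LLMClient(model="gpt-4o")
    messages = [{"role": "user", "content": "Profile the incoming dataset."}]
    tools = registry.to_claude_tools()   # Anthropic-format tool definitions
    response = llm.send(messages=messages, system="You are a data agent.", tools=tools)
    print(response.stop_reason, response.usage)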
extract_actions
staticmethod
¶
extract_actions(response: LLMResponse | Any) -> list[dict]
Parse a response into a list of actions.
Each action is one of:
- {"type": "text", "text": "..."}
- {"type": "tool_use", "id": "...", "name": "...", "input": {...}}
Accepts both our LLMResponse and raw Anthropic Message objects
for backward compatibility.
Source code in src/versifai/core/llm.py
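A ReAct-style consumer of extract_actions might look like this sketch; the display call and the build_tool_result_message argument order are assumptions, only the action shapes above come from the source::

    for action in LLMClient.extract_actions(response):
        if action["type"] == "text":
            display.thinking(action["text"])    # exact AgentDisplay signature assumed
        elif action["type"] == "tool_use":
            result = registry.execute(action["name"], **action["input"])
            messages.append(
                # (tool_use_id, result_text) argument order is an assumption
                LLMClient.build_tool_result_message(action["id"], result.to_content_str())
            )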
build_tool_result_message
staticmethod
¶
Build a tool_result message block for the conversation.
Source code in src/versifai/core/llm.py
LLMResponse
dataclass
¶
LLMResponse(content: list[dict] = list(), stop_reason: str = '', usage: dict = dict(), raw: Any = None)
Normalized response from any LLM provider.
Wraps the raw provider response into a consistent shape so the rest of the framework doesn't need to care which provider was used.
Memory¶
AgentMemory
¶
Manages conversation history and decision logging for the agent.
Handles context window management by summarizing old messages when the history gets too long. Supports per-source resets to keep the context clean across data sources.
Source code in src/versifai/core/memory.py
add_user_message
¶
Add a user message (or initial prompt) to the conversation.
add_assistant_message
¶
Add an assistant message to the conversation.
Content is the raw content blocks from Claude's response.
Source code in src/versifai/core/memory.py
add_tool_result
¶
add_tool_result(tool_use_id: str, result: str, is_error: bool = False, image_base64: str = '', image_media_type: str = 'image/png') -> None
Add a tool result message, optionally with an image.
When image_base64 is provided, the tool result content becomes a list of content blocks (text + image) so Claude can see chart output.
Source code in src/versifai/core/memory.py
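For example, a charting tool's output might be recorded like this (the id, text, and encoded_png values are placeholders)::

    memory.add_tool_result(
        tool_use_id="toolu_01",      # id from the matching tool_use block
        result="Saved distribution chart to charts/age_hist.png",
        image_base64=encoded_png,    # base64-encoded PNG produced by the tool
        image_media_type="image/png",
    )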
add_context_note
¶
reset_for_new_source
¶
Reset the conversation for a new data source.
Clears all messages but preserves decisions, source summaries, and context notes. Optionally logs a summary of the previous source.
Source code in src/versifai/core/memory.py
get_carryover_context
¶
Build a context string of everything learned so far. Injected into the prompt for a new source so the agent retains cross-source knowledge (schema decisions, FIPS patterns, etc.).
Source code in src/versifai/core/memory.py
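A schematic hand-off between sources; the add_user_message and reset_for_new_source argument lists are not shown above, and the source name is invented::

    carryover = memory.get_carryover_context()
    memory.reset_for_new_source()    # argument list not shown above; call is schematic
    memory.add_user_message(f"{carryover}\n\nNext source: county_health_rankings")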
strip_images
¶
Remove ALL base64 images from conversation history.
Returns the number of images stripped. Images are replaced with a text placeholder so the agent knows a chart was viewed.
Source code in src/versifai/core/memory.py
compress_all_tool_results
¶
Force-compress ALL tool results regardless of age.
Uses a lower threshold (300 chars) than the normal compression. Called during emergency recalibration after context window overflow.
Source code in src/versifai/core/memory.py
force_summarize
¶
Summarize the conversation regardless of message count.
Used during emergency recalibration to reduce context size.
Source code in src/versifai/core/memory.py
recalibrate
¶
Emergency context reduction pipeline.
Called when the LLM rejects a request due to context window overflow. Applies three progressive reduction steps:
1. Strip all base64 images (biggest per-item savings)
2. Compress all tool results aggressively (300 char threshold)
3. Force-summarize old conversation turns
Source code in src/versifai/core/memory.py
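A plausible place to call this from the agent loop, sketched under the assumption that the overflow surfaces as an exception and that the history lives in memory.messages (both assumptions)::

    try:
        response = llm.send(messages=memory.messages, system=system_prompt, tools=tools)
    except Exception as exc:
        if "context" in str(exc).lower():   # the concrete overflow exception type is an assumption
            memory.recalibrate()            # strip images, compress results, force-summarize
            response = llm.send(messages=memory.messages, system=system_prompt, tools=tools)
        else:
            raise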
log_decision
¶
Log a decision made by the agent.
Source code in src/versifai/core/memory.py
log_source_summary
¶
Store a summary for a completed source (survives context compression).
Display¶
AgentDisplay
¶
Clean chat-style display for the agent.
In Databricks, renders styled HTML bubbles via displayHTML; outside Databricks, prints clean formatted text with Unicode icons.
Source code in src/versifai/core/display.py
dump_progress
¶
Overwrite path with the full plain-text log so far.
Builds the full string first and writes in a single call — Databricks FUSE mounts don't support append and can behave unpredictably with multiple writes within one open().
Source code in src/versifai/core/display.py
phase
¶
Display a major phase header.
Source code in src/versifai/core/display.py
step
¶
Display a minor step (turn counter, etc). Kept minimal.
Source code in src/versifai/core/display.py
thinking
¶
Display the agent's reasoning as a chat bubble.
Source code in src/versifai/core/display.py
tool_call
¶
Display a tool invocation.
Source code in src/versifai/core/display.py
tool_result
¶
Display a tool result.
Source code in src/versifai/core/display.py
success
¶
Display a success message.
Source code in src/versifai/core/display.py
warning
¶
Display a warning.
Source code in src/versifai/core/display.py
error
¶
Display an error.
Source code in src/versifai/core/display.py
summary_table
¶
Display a summary table.
Source code in src/versifai/core/display.py
ask_human
¶
Pause the agent and ask the human operator a question.
Always uses Python input() for reliability — Databricks widgets don't work well in all notebook execution contexts. Displays the question with HTML formatting if in Databricks, then collects the answer via stdin.
Source code in src/versifai/core/display.py
Configuration¶
CatalogConfig
dataclass
¶
Shared Databricks Unity Catalog connection settings.
Used by all agent families that need to read from or write to Unity Catalog tables and volumes.
AgentSettings
dataclass
¶
AgentSettings(max_agent_turns: int = 200, max_turns_per_source: int = 120, max_acceptance_iterations: int = 3, sample_rows: int = 10, profile_sample_size: int = 500)
Tunable parameters for agent behaviour.
Sensible defaults are provided. Override as needed when constructing agent instances.
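For example, tightening the per-source budget while sampling more rows (the import path is assumed by analogy with the LLMClient example)::

    from versifai.core import AgentSettings

    settings = AgentSettings(
        max_turns_per_source=60,
        sample_rows=25,
    )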
Tool System¶
BaseTool
¶
Bases: ABC
Abstract base class for agent tools.
Subclasses must implement
- name (property)
- description (property)
- parameters_schema (property) — JSON Schema dict
- _execute(**kwargs) -> ToolResult
parameters_schema
abstractmethod
property
¶
JSON Schema describing the tool's input parameters.
Example
{ "type": "object", "properties": { "path": {"type": "string", "description": "Volume path to explore"} }, "required": ["path"] }
execute
¶
execute(**kwargs) -> ToolResult
Public entry point. Wraps _execute with error handling so that any exception is captured and returned as an error ToolResult instead of crashing the agent loop.
Special handling for TypeError (missing/wrong parameters): instead of a raw traceback, returns the tool's parameter schema so the agent can self-correct in one retry.
Source code in src/versifai/core/tools/base.py
to_claude_tool_definition
¶
Return the tool definition in the format expected by the Anthropic API's tool_use feature.
Source code in src/versifai/core/tools/base.py
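Putting the BaseTool contract together, a minimal concrete tool might look like the following sketch (the tool name, behaviour, and import path are illustrative assumptions)::

    from versifai.core.tools.base import BaseTool, ToolResult

    class ListVolumeTool(BaseTool):      # hypothetical tool
        @property
        def name(self) -> str:
            return "list_volume"

        @property
        def description(self) -> str:
            return "List files under a Unity Catalog volume path."

        @property
        def parameters_schema(self) -> dict:
            return {
                "type": "object",
                "properties": {"path": {"type": "string", "description": "Volume path to explore"}},
                "required": ["path"],
            }

        def _execute(self, path: str) -> ToolResult:
            files = ["a.csv", "b.parquet"]   # placeholder for a real volume listing
            return ToolResult(success=True, data=files, summary=f"{len(files)} files under {path}")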
ToolResult
dataclass
¶
ToolResult(success: bool, data: Any = None, error: str = '', summary: str = '', image_path: str = '')
Structured result returned by every tool invocation.
to_content_str
¶
Serialize to a string suitable for returning to the LLM.
Source code in src/versifai/core/tools/base.py
ToolRegistry
¶
Central registry of all tools available to the agent.
Source code in src/versifai/core/tools/registry.py
execute
¶
execute(tool_name: str, /, **kwargs) -> ToolResult
Look up a tool by name and execute it.
The / makes tool_name positional-only so it can never
collide with a keyword argument in **kwargs — even if a
tool's own input schema includes a tool_name parameter
(e.g. create_custom_tool).
Source code in src/versifai/core/tools/registry.py
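A registration-and-dispatch sketch; only execute() is documented here, so the no-argument constructor and the register() call are assumptions (ListVolumeTool is the hypothetical tool from the BaseTool example above, and the path is a placeholder)::

    registry = ToolRegistry()
    registry.register(ListVolumeTool())
    result = registry.execute("list_volume", path="/Volumes/main/bronze/raw")
    if not result.success:
        print(result.error)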
to_claude_tools
¶
Run Management¶
RunState
dataclass
¶
RunState(status: str = 'running', entry_point: str = 'run', current_phase: str = '', current_item: str = '', completed_phases: list[str] = list(), completed_items: dict[str, list[str]] = (lambda: {})(), error: str = '', updated_at: str = '')
Tracks execution state of an agent run for stop/resume support.
Persisted inside run_metadata.json under the "state" key so
there is a single file to read on resume.
Typical lifecycle::
state = RunState(entry_point="run")
state.mark_phase_start("orientation")
state.mark_phase_complete("orientation")
state.mark_phase_start("silver")
state.mark_item_complete("silver", "silver_county_master")
...
state.mark_completed() # or mark_interrupted() / mark_failed()
AgentDependency
dataclass
¶
AgentDependency(agent_type: str, config_name: str, run_id: str = '', base_path: str = '', outputs: list[str] = list())
Declares that a config consumes outputs from a previous agent run.
Used by orchestrators to resolve the concrete path to a prior run's outputs (findings, charts, tables, notes) at startup time.
Example::
AgentDependency(
agent_type="scientist",
config_name="geographic_disparity",
base_path="/Volumes/.../research_results/geographic_disparity",
)
generate_run_id
¶
Generate a unique run ID: YYYYMMDD_HHMMSS_XXXX.
Format: timestamp (to the second) + 4 random hex chars for uniqueness, so lexicographic sort order matches chronological order.
Source code in src/versifai/core/run_manager.py
init_run_directory
¶
Create a run-isolated output directory.
Creates base_path/runs/{run_id}/ with standard subdirectories:
charts/, tables/, notes/, models/.
Returns the full run directory path.
Source code in src/versifai/core/run_manager.py
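A schematic start-of-run sequence, assuming init_run_directory takes a base path and the generated run id (its parameter list is not shown above)::

    run_id = generate_run_id()                        # e.g. 20250301_142355_a1b2
    run_dir = init_run_directory(base_path, run_id)   # creates charts/, tables/, notes/, models/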
resolve_dependency
¶
resolve_dependency(dep: AgentDependency) -> str
Resolve a dependency to its concrete run directory path.
If dep.run_id is set, returns dep.base_path/runs/{dep.run_id}/.
If empty, finds the latest run under dep.base_path/runs/.
If no runs directory exists, returns dep.base_path directly
(backward compatibility with pre-run-isolation outputs).
Source code in src/versifai/core/run_manager.py
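Reusing the AgentDependency example above, resolution to a concrete run directory might look like::

    dep = AgentDependency(
        agent_type="scientist",
        config_name="geographic_disparity",
        base_path="/Volumes/.../research_results/geographic_disparity",
    )
    run_dir = resolve_dependency(dep)   # run_id is empty, so the latest run under .../runs/ is used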
resolve_run_path
¶
Resolve a run directory path.
If run_id is given, returns base_path/runs/{run_id}/.
If None, finds the most recent run by lexicographic sort of run IDs.
Raises FileNotFoundError if no runs exist.
Source code in src/versifai/core/run_manager.py
list_runs
¶
List all runs under base_path/runs/.
Returns a list of dicts: [{run_id, path, created, metadata}].
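A schematic look-up over past runs; the base_path argument name is assumed from the descriptions above::

    latest_dir = resolve_run_path(base_path)          # no run_id: picks the most recent run
    for run in list_runs(base_path):
        print(run["run_id"], run["created"], run["path"])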