Feature: Intelligent Skill & Integration Discovery
Date: February 15, 2026
Status: Approved — partially implemented as Vector-First Architecture (VectorRouter, Feb 20)
Version: V1.0
Epic: Tool Discoverability & Scalability
1. Context & Motivation
Implementation note (Feb 20, 2026): The core of this design — embedding-based tool selection — has been implemented as the Vector-First Architecture.
`VectorRouter` in `backend/chat/vector_router.py` routes ~65% of requests without calling the LLM at all. Skills are indexed via `store_if_novel()` with `memory_type="skill_index"`. The full SmartToolSelector/ToolCatalogService described below is the next evolution of this pattern. See ROADMAP.md — Knowledge Pipeline for how this connects to the compilation chain (VectorRouter = the shortcut before any runtime).
Problem Statement
As Morphee's integration ecosystem grows (users can download skills/integrations from marketplace), the current approach of including all available tools in every LLM request becomes inefficient:
- Token overhead: System prompt balloons from 3-5 KB to 10+ KB with 50+ tools
- Latency: LLM must parse and decide among 50+ irrelevant tools
- Cognitive load: AI struggles to pick the right tool from a massive list
- Cost: Token usage increases proportionally with tool count
- Discoverability: Users don't know what tools they have; LLM can't effectively recommend them
Scope
This feature enables intelligent, context-aware tool selection using semantic search (embeddings). Instead of including all tools in every request, the system:
- Embeds all tool descriptions at startup (core integrations, skills, MCP tools)
- At runtime, finds the top 8-12 most relevant tools based on user's message
- Passes only those to the LLM with transparent explanation
- Provides an escape hatch (`tool_discovery__find`) for when users need tools outside the filtered set
Extended Scope: MCP (Model Context Protocol) Integration
This phase also introduces MCP support — a new MCPIntegration that manages external MCP servers and exposes their tools to Morphee:
- MCP servers (local or remote) register their tools
- Tool schemas are automatically extracted and embedded
- MCP tools are discoverable, smart-selectable, and executable like any other tools
- Marketplace integration — users can download MCP server packages alongside skills/integrations
- Security — MCP tool calls respect ACL rules and approval workflows
Example: User downloads "Web Research MCP Server" from marketplace → tools appear in tool_discovery__find("web search") → SmartToolSelector ranks them by relevance → LLM can use them transparently.
Scale
Current state (Phase 3e.5):
- Core integrations: ~14 (LLM, Memory, Frontend, Tasks, Spaces, Notifications, Cron, Google Calendar, Gmail, Filesystem, Webhook, Echo, Settings, Skills)
- Dynamic skills: user-generated (0-100+ per group)
Target state (Phase 3e.6+):
- Users can download 50-200+ skills/integrations from marketplace
- System remains responsive and efficient regardless of tool count
- Tool discovery is transparent and always available
2. Options Investigated
Option A: Embedding-Based Smart Tool Selection (CHOSEN)
Description: Embed all tool descriptions upfront. At runtime, use semantic search (LanceDB locally, pgvector on server) to retrieve the top-K most relevant tools based on the user's message. Always include core tools. Pass only the filtered set to the LLM.
Pros:
- ✅ Massive token savings (60-70%)
- ✅ Faster LLM response (smaller context)
- ✅ Embedding lookup is cheap (~50-100ms, local)
- ✅ Transparent: LLM knows tools are filtered, can ask for discovery
- ✅ Scales to 100+ tools without degradation
- ✅ Leverages existing embedding infrastructure (RAG pipeline)
- ✅ Works offline (LanceDB on desktop)
Cons:
- ⚠️ Embedding lookup adds ~50-100ms latency (minor, offset by token savings)
- ⚠️ Irrelevant tools might be selected if user's message is ambiguous
- ⚠️ Requires maintaining tool catalog (invalidation on updates)
Effort: M (4-6 weeks, 13 steps)
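The retrieval step at the heart of Option A can be sketched with plain numpy (a toy stand-in for the LanceDB/pgvector query; the function name, 3-dim vectors, and tool names here are illustrative, not the real schema):

```python
import numpy as np

def top_k_tools(query_vec: np.ndarray, tool_vecs: np.ndarray,
                names: list[str], k: int = 3) -> list[str]:
    """Rank tools by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    t = tool_vecs / np.linalg.norm(tool_vecs, axis=1, keepdims=True)
    scores = t @ q                    # cosine similarity per tool
    order = np.argsort(-scores)[:k]   # highest similarity first
    return [names[i] for i in order]

# Toy 3-dim "embeddings": the query points mostly along axis 0.
names = ["gmail__send_email", "tasks__create", "calendar__list_events"]
vecs = np.array([[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.3, 0.2, 1.0]])
print(top_k_tools(np.array([0.9, 0.1, 0.0]), vecs, names, k=2))
# → ['gmail__send_email', 'calendar__list_events']
```

In production the vectors are 384-dim (FastEmbed) or 1536-dim (OpenAI) and the nearest-neighbor search runs inside LanceDB, but the ranking principle is exactly this.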
Option B: Tool Discovery Interface Only (No Pre-filtering)
Description:
Keep all tools in the system prompt, but add a tool_discovery integration that the LLM can call to explore tools by query. No pre-filtering.
Pros:
- ✅ LLM always has complete picture
- ✅ Simpler implementation (just add discovery tool)
- ✅ No risk of missing relevant tools
Cons:
- ❌ Token overhead remains unsolved (original problem)
- ❌ LLM still must parse all tools even if it doesn't use them
- ❌ Doesn't scale with marketplace integrations
- ❌ Latency improvements minimal
Effort: S (1 week, just add one integration)
Option C: Hybrid (Pre-filter + Discovery)
Description:
Use embedding-based pre-selection (Option A), but also include tool_discovery as escape hatch. LLM can call discovery if filtered tools aren't sufficient.
Pros:
- ✅ All benefits of Option A
- ✅ LLM can explicitly request more tools if needed
- ✅ Handles edge cases where embedding selection is wrong
Cons:
- ⚠️ More complex than Option A alone
- ⚠️ Extra LLM round-trip if discovery is called (rare)
- ⚠️ Minimal risk (discovery is escape hatch, not main path)
Effort: M + small (Option A + discovery integration)
Option D: Per-User Tool Profiles
Description: During onboarding, ask users which features they care about. Filter tools based on user's declared interests + message context.
Pros:
- ✅ Very targeted filtering
- ✅ Respects user preference
Cons:
- ❌ Assumes static interests (users' needs change)
- ❌ Onboarding becomes longer
- ❌ Requires maintenance as interests evolve
- ❌ Doesn't work for marketplace integrations (users don't know them yet)
Effort: M (onboarding UI + preference storage)
Option E: Dynamic System Prompt
Description: Instead of listing all tools, restructure system prompt to say: "You have access to: Communication (email, chat), Planning (tasks, calendar), Memory, Frontend, etc." Trust LLM to ask for specifics.
Pros:
- ✅ Less token overhead than listing all tools
- ✅ Simpler than embedding-based selection
Cons:
- ❌ LLM doesn't know what tools actually exist
- ❌ More ambiguous than explicit tool list
- ❌ Doesn't solve the core problem (still need to bind to actual tools)
- ❌ Requires extra discovery calls
Effort: S (update system prompt builder)
3. Decision
Chosen approach: Option C — Hybrid (Pre-filter + Discovery)
Recommended configuration:
- Primary path: Embedding-based smart tool selection (8-12 tools per request)
- Always include: Core tools (memory, tasks, frontend, notifications, discovery)
- Escape hatch: `tool_discovery__find()` for when the user needs tools outside the filtered set
- Transparent: System prompt clearly explains selection and discovery option
Reasoning:
- Option A alone is excellent, but discovery tool adds minimal complexity while providing a safety net
- Hybrid handles 99% of requests with Option A speed, but has an escape hatch for edge cases
- Token savings (60-70%) align with Phase 3e.6 goals (performance & scalability)
- Marketplace scalability: Supports growth to 100+ user-downloaded integrations
- Transparent & explainable: LLM knows what's happening, can make informed decisions
- Leverages existing infra: Uses RAG pipeline's embedding provider + LanceDB/pgvector
Trade-offs accepted:
- ⚠️ Embedding lookup adds ~50-100ms latency per request (offset by 60-70% token savings = net win)
- ⚠️ Potential for irrelevant tool selection in ambiguous cases (discovery tool handles this)
- ⚠️ Tool catalog requires invalidation on updates (manageable, infrequent)
4. Implementation Plan
Phase 1: Foundation (Week 1-2, Steps 1-3)
| Step | Description | Effort | Details |
|---|---|---|---|
| 1 | Create ToolCatalogService | M | Build tool registry, embedding, caching |
| 2 | Implement tool_discovery integration | M | New integration with find/list/describe actions |
| 3 | Build SmartToolSelector | M | Query LanceDB, apply ACL, return filtered list |
Deliverables:
- `backend/chat/tool_catalog.py` — ToolCatalogService, ToolCatalogEntry model
- `backend/interfaces/integrations/tool_discovery.py` — ToolDiscoveryIntegration
- `backend/chat/tool_selector.py` — SmartToolSelector with ACL filtering
Phase 2: LLM Integration (Week 2-3, Steps 4-6)
| Step | Description | Effort | Details |
|---|---|---|---|
| 4 | Update orchestrator to use SmartToolSelector | M | Replace actions_to_anthropic_tools() with selection |
| 5 | Revise system prompt builder | S | Add transparency message + selected tools list |
| 6 | Update tool bridge for dynamic tool lists | S | Handle variable-length tool list |
Deliverables:
- Updated `backend/chat/orchestrator.py`
- Updated `backend/chat/prompts.py`
- Updated `backend/chat/tools.py`
Phase 3: Testing & Validation (Week 3-4, Steps 7-10)
| Step | Description | Effort | Details |
|---|---|---|---|
| 7 | Unit tests for ToolCatalogService | M | Test embedding, caching, invalidation |
| 8 | Unit tests for SmartToolSelector | M | Test relevance scoring, ACL filtering, core tool inclusion |
| 9 | Integration tests: chat flow with discovery | M | Full E2E with tool selection + discovery calls |
| 10 | E2E validation: token savings + accuracy | M | Measure savings, verify tool selection quality |
Acceptance Criteria:
- ✅ System prompt size reduced by 60-70%
- ✅ Top-K selection has >85% accuracy (selected tools are relevant)
- ✅ Tool discovery retrieves correct tools when called
- ✅ ACL filtering blocks unauthorized tools
- ✅ Core tools always present
Phase 4: Marketplace Integration (Week 4, Step 11)
| Step | Description | Effort | Details |
|---|---|---|---|
| 11 | Auto-embed on marketplace install | M | Hook into skill/integration install, async embed |
Deliverables:
- Updated `backend/skills/service.py` or `backend/interfaces/integrations/*.py` (install hook)
- Async embedding task for new integrations
Phase 5: Polish & Documentation (Week 4+, Steps 12-13)
| Step | Description | Effort | Details |
|---|---|---|---|
| 12 | Update docs | S | interfaces.md, architecture.md, api.md |
| 13 | Feature doc + rationale | S | This document + IMPLEMENTATION_PLAN.md |
Deliverables:
- Updated `docs/interfaces.md` (tool discovery section)
- Updated `docs/architecture.md` (new component: ToolCatalogService)
- Updated `docs/api.md` (tool_discovery actions)
- `docs/features/2026-02-15-IMPLEMENTATION_PLAN.md` (step-by-step guide)
- `docs/features/QUICK_REFERENCE_Tool_Discovery.md` (user reference)
5. Technical Specification
Data Models
ToolCatalogEntry
```python
from dataclasses import dataclass
from typing import Optional

import numpy as np

from interfaces.models import AIAccess


@dataclass
class ToolCatalogEntry:
    """Entry in the tool catalog for embedding + discovery"""
    interface_name: str               # "gmail"
    action_name: str                  # "send_email"
    full_name: str                    # "gmail__send_email"
    description: str                  # From ActionDefinition.description
    category: str                     # "communication", "planning", "core", "memory", etc.
    vector: np.ndarray                # 384-dim (FastEmbed) or 1536-dim (OpenAI)
    ai_access: AIAccess               # execute/propose/blocked
    available_in_groups: int = 0      # How many groups have this integration
    parameters_summary: str = ""      # Brief param summary for discovery results
    tags: Optional[list[str]] = None  # ["email", "send", "communication"] for search
```
SmartToolSelector Output
```python
@dataclass
class SelectedTools:
    """Result of smart tool selection"""
    selected: list[ToolCatalogEntry]  # Top-K filtered tools
    core_included: list[str]          # Which core tools were included
    excluded_count: int               # How many tools were filtered out
    reason: str                       # Explanation for transparency
    discovery_suggested: bool         # Should LLM know it can use discovery?
```
Service: ToolCatalogService
```python
class ToolCatalogService:
    """
    Manages the tool catalog: embedding all registered tools,
    caching embeddings, and handling invalidation.
    """

    async def initialize(self):
        """Build tool catalog from InterfaceManager at startup"""
        # For each interface in InterfaceManager:
        #   For each action in interface.get_actions():
        #     Create ToolCatalogEntry
        #     Embed description
        #     Store in LanceDB + cache

    async def add_tool(self, interface_name: str, action: ActionDefinition):
        """Add a new tool (called when skill/integration installed)"""
        # Embed + store

    async def remove_tool(self, interface_name: str, action_name: str):
        """Remove a tool (called when uninstalled)"""

    async def get_all_tools(self) -> list[ToolCatalogEntry]:
        """Return all tools (for discovery)"""

    async def search(self, query: str, limit: int = 20) -> list[ToolCatalogEntry]:
        """Search by embedding + text"""

    async def search_vector(self, vector, limit: int = 20, metric: str = "cosine") -> list[ToolCatalogEntry]:
        """Search by a precomputed embedding vector (used by SmartToolSelector)"""

    @property
    def cache_size(self) -> int:
        """Current number of tools in catalog"""
```
Service: MCPIntegration (NEW)
```python
class MCPIntegration(BaseInterface):
    """Manage and expose MCP (Model Context Protocol) servers.

    MCP servers provide external capabilities that extend Morphee.
    This integration acts as a broker: register servers, fetch their schemas,
    convert to tools, embed descriptions, make executable.
    """

    name = "mcp"
    description = "Register and manage MCP (Model Context Protocol) servers"
    config_schema = {}  # Configuration per interface instance

    def get_actions(self) -> List[ActionDefinition]:
        return [
            ActionDefinition(
                name="register_server",
                description="Register a new MCP server (local or remote)",
                parameters=[
                    ActionParameter(
                        name="name",
                        type=ParameterType.STRING,
                        description="Display name for this MCP server (e.g., 'Web Research')",
                        required=True,
                    ),
                    ActionParameter(
                        name="endpoint",
                        type=ParameterType.STRING,
                        description="Server endpoint (URL for remote, path for local)",
                        required=True,
                    ),
                    ActionParameter(
                        name="api_key",
                        type=ParameterType.STRING,
                        description="Optional API key if server requires authentication",
                        required=False,
                    ),
                ],
                ai_access=AIAccess.PROPOSE,  # Requires approval to add new capabilities
                side_effect=SideEffect.WRITE,
            ),
            ActionDefinition(
                name="list_servers",
                description="List all registered MCP servers",
                parameters=[],
                ai_access=AIAccess.EXECUTE,
                side_effect=SideEffect.READ,
            ),
            ActionDefinition(
                name="call_tool",
                description="Call a tool provided by an MCP server",
                parameters=[
                    ActionParameter(
                        name="server_name",
                        type=ParameterType.STRING,
                        description="Name of the MCP server",
                        required=True,
                    ),
                    ActionParameter(
                        name="tool_name",
                        type=ParameterType.STRING,
                        description="Tool name on that server",
                        required=True,
                    ),
                    ActionParameter(
                        name="params",
                        type=ParameterType.OBJECT,
                        description="Tool parameters",
                        required=False,
                    ),
                ],
                ai_access=AIAccess.EXECUTE,   # Or PROPOSE if tool is sensitive
                side_effect=SideEffect.READ,  # Or WRITE/DELETE based on tool
            ),
        ]

    async def execute(self, action_name: str, parameters: dict) -> ActionResult:
        if action_name == "register_server":
            return await self._register_server(
                parameters.get("name"),
                parameters.get("endpoint"),
                parameters.get("api_key"),
            )
        elif action_name == "list_servers":
            return await self._list_servers()
        elif action_name == "call_tool":
            return await self._call_tool(
                parameters.get("server_name"),
                parameters.get("tool_name"),
                parameters.get("params", {}),
            )
        else:
            return ActionResult(success=False, error=f"Unknown action: {action_name}")

    async def _register_server(self, name: str, endpoint: str, api_key: Optional[str]) -> ActionResult:
        """Register a new MCP server and fetch its schema"""
        # 1. Validate endpoint is reachable
        # 2. Fetch MCP server schema (list of available tools)
        # 3. Convert MCP tools to ActionDefinition format
        # 4. Embed tool descriptions
        # 5. Store server config in database
        # 6. Register each tool with ToolCatalogService
        #    NEW tools immediately appear in tool_discovery + SmartToolSelector

    async def _list_servers(self) -> ActionResult:
        """List all registered MCP servers and their tool counts"""

    async def _call_tool(self, server_name: str, tool_name: str, params: dict) -> ActionResult:
        """Execute a tool on a registered MCP server"""
        # 1. Look up server config
        # 2. Format params per MCP protocol
        # 3. Call MCP server endpoint
        # 4. Return result to LLM
```
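For step 2 of `_call_tool`, MCP speaks JSON-RPC 2.0, with tool invocations going through a `tools/call` method carrying the tool name and its arguments. A minimal request-builder sketch (transport — stdio or HTTP — and response handling are out of scope here; the function name is illustrative):

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requires a unique id per request

def build_mcp_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request as used by MCP."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

req = json.loads(build_mcp_tool_call("search_web", {"query": "lancedb"}))
print(req["method"], req["params"]["name"])  # → tools/call search_web
```

The server replies with a matching-id JSON-RPC response whose result is then wrapped into an `ActionResult` for the LLM.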
MCP Tool Registration Flow
```
User: "Install the Web Research MCP server"
    ↓
mcp__register_server(
    name="Web Research",
    endpoint="https://mcp-web-research.example.com",
    api_key="sk_xxx"
)
    ↓
MCPIntegration:
    1. Validates endpoint reachable
    2. Fetches schema: [search_web, fetch_url, extract_content]
    3. Converts to ActionDefinition:
       - web_research__search_web
       - web_research__fetch_url
       - web_research__extract_content
    4. Embeds descriptions → LanceDB
    ↓
ToolCatalogService:
    1. New tools immediately searchable
    ↓
SmartToolSelector:
    1. Next message automatically considers MCP tools
    ↓
tool_discovery__find("search the web"):
    1. Returns: web_research__search_web + other search tools
```
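The schema-conversion step of this flow can be sketched as a pure function: MCP's `tools/list` returns tools with a `name`, `description`, and a JSON-Schema `inputSchema`, which we flatten into the dict shape the catalog expects (output field names here mirror this document's models but are illustrative):

```python
def mcp_tool_to_action(server_prefix: str, tool: dict) -> dict:
    """Convert one entry from an MCP `tools/list` response into a
    catalog-ready action dict (embedding happens downstream)."""
    schema = tool.get("inputSchema", {})
    required = set(schema.get("required", []))
    params = [
        {"name": pname, "type": pdef.get("type", "string"), "required": pname in required}
        for pname, pdef in schema.get("properties", {}).items()
    ]
    return {
        "full_name": f"{server_prefix}__{tool['name']}",
        "description": tool.get("description", ""),
        "parameters": params,
    }

tool = {
    "name": "search_web",
    "description": "Search the web",
    "inputSchema": {"type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"]},
}
action = mcp_tool_to_action("web_research", tool)
print(action["full_name"])  # → web_research__search_web
```

Namespacing with the server prefix is what keeps MCP tools indistinguishable from native integrations downstream.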
Service: SmartToolSelector
```python
class SmartToolSelector:
    """
    Selects relevant tools for a specific user request.

    Algorithm:
    1. Embed user message
    2. Query LanceDB: top-20 similar tools
    3. Filter by ACL: only accessible tools
    4. Add core tools (always)
    5. Return top-K final selection
    """

    def __init__(
        self,
        catalog: ToolCatalogService,
        interface_manager: InterfaceManager,
        acl_service: ACLService,
        max_tools: int = 10,
    ):
        self.catalog = catalog
        self.interface_manager = interface_manager
        self.acl_service = acl_service
        self.max_tools = max_tools
        self.core_tools = {
            "memory__search", "memory__store", "memory__recall", "memory__forget",
            "tasks__list", "tasks__create", "tasks__update_status",
            "frontend__show_card", "frontend__show_form", "frontend__show_choices",
            "notifications__send",
            "tool_discovery__find",
        }

    async def select(
        self,
        user_message: str,
        user_id: UUID,
        group_id: UUID,
        space_id: UUID,
    ) -> SelectedTools:
        """
        Select most relevant tools for this user + message.

        Args:
            user_message: The user's current message
            user_id: User making the request
            group_id: User's group
            space_id: Current space

        Returns:
            SelectedTools with selected tools + explanation
        """
        # 1. Get embedding of user message
        embedding = await embedding_provider.embed(user_message)

        # 2. Query LanceDB for top-20
        candidates = await self.catalog.search_vector(
            vector=embedding.vector,
            limit=20,
            metric="cosine",
        )

        # 3. Filter by ACL (only accessible tools)
        accessible = []
        for tool in candidates:
            if await self.acl_service.check(
                user_id, group_id, space_id, tool.full_name
            ):
                accessible.append(tool)

        # 4. Separate core vs non-core
        core = [t for t in accessible if t.full_name in self.core_tools]
        non_core = [t for t in accessible if t.full_name not in self.core_tools]

        # 5. Build final selection
        slots_for_core = max(2, self.max_tools // 3)  # Reserve ~30% for core
        slots_for_non_core = self.max_tools - slots_for_core
        selected = (core[:slots_for_core] + non_core[:slots_for_non_core])[:self.max_tools]

        return SelectedTools(
            selected=selected,
            core_included=[t.full_name for t in core if t in selected],
            excluded_count=len(candidates) - len(selected),
            reason="Based on your message, I found these most relevant tools:",
            discovery_suggested=len(accessible) > len(selected),  # Let LLM know discovery exists
        )
```
Integration: ToolDiscoveryIntegration
```python
class ToolDiscoveryIntegration(BaseInterface):
    """Discover and explore available integrations and skills"""

    name = "tool_discovery"
    description = "Search and discover available tools and integrations"

    def __init__(self, catalog: ToolCatalogService, **kwargs):
        super().__init__(**kwargs)
        self.catalog = catalog

    def get_actions(self) -> List[ActionDefinition]:
        return [
            ActionDefinition(
                name="find",
                description="Search for tools by query. Returns matching integrations and skills.",
                parameters=[
                    ActionParameter(
                        name="query",
                        type=ParameterType.STRING,
                        description="Search query (e.g., 'send email', 'schedule meeting')",
                        required=True,
                    ),
                    ActionParameter(
                        name="limit",
                        type=ParameterType.INTEGER,
                        description="Max results to return (default 10)",
                        required=False,
                        default=10,
                    ),
                ],
                ai_access=AIAccess.EXECUTE,
                side_effect=SideEffect.READ,
            ),
            ActionDefinition(
                name="list_all",
                description="List all available tools and integrations",
                parameters=[
                    ActionParameter(
                        name="category",
                        type=ParameterType.STRING,
                        description="Optional: filter by category (communication, planning, memory, etc.)",
                        required=False,
                    ),
                ],
                ai_access=AIAccess.EXECUTE,
                side_effect=SideEffect.READ,
            ),
            ActionDefinition(
                name="describe",
                description="Get full details about a specific tool",
                parameters=[
                    ActionParameter(
                        name="tool_name",
                        type=ParameterType.STRING,
                        description="Tool name (e.g., 'gmail__send_email')",
                        required=True,
                    ),
                ],
                ai_access=AIAccess.EXECUTE,
                side_effect=SideEffect.READ,
            ),
        ]

    async def execute(self, action_name: str, parameters: dict) -> ActionResult:
        if action_name == "find":
            return await self._find(parameters.get("query"), parameters.get("limit", 10))
        elif action_name == "list_all":
            return await self._list_all(parameters.get("category"))
        elif action_name == "describe":
            return await self._describe(parameters.get("tool_name"))
        else:
            return ActionResult(success=False, error=f"Unknown action: {action_name}")

    async def _find(self, query: str, limit: int) -> ActionResult:
        """Search for matching tools"""
        results = await self.catalog.search(query, limit=limit)
        return ActionResult(
            success=True,
            output={
                "query": query,
                "count": len(results),
                "tools": [
                    {
                        "name": t.full_name,
                        "description": t.description,
                        "ai_access": t.ai_access.value,
                    }
                    for t in results
                ],
            },
        )

    async def _list_all(self, category: Optional[str]) -> ActionResult:
        """List all available tools"""
        tools = await self.catalog.get_all_tools()
        if category:
            tools = [t for t in tools if t.category == category]
        return ActionResult(
            success=True,
            output={
                "total": len(tools),
                "tools": [
                    {
                        "name": t.full_name,
                        "description": t.description,
                        "category": t.category,
                    }
                    for t in tools
                ],
            },
        )

    async def _describe(self, tool_name: str) -> ActionResult:
        """Get full details about a tool"""
        if "__" not in tool_name:
            return ActionResult(success=False, error=f"Invalid tool name: {tool_name}")
        interface_name, action_name = tool_name.split("__", 1)
        tool = await self.catalog.get_tool(interface_name, action_name)
        if not tool:
            return ActionResult(success=False, error=f"Tool not found: {tool_name}")
        return ActionResult(
            success=True,
            output={
                "name": tool_name,
                "description": tool.description,
                "category": tool.category,
                "ai_access": tool.ai_access.value,
                "parameters_summary": tool.parameters_summary,
            },
        )
```
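Since `describe` takes an LLM-supplied string, the name parsing deserves a defensive helper so malformed input yields a clean error instead of an exception (the helper name is illustrative, not part of the spec above):

```python
def parse_tool_name(tool_name: str) -> "tuple[str, str] | None":
    """Split 'interface__action' safely; return None for malformed names
    so callers can produce an ActionResult error instead of raising."""
    if "__" not in tool_name:
        return None
    interface_name, action_name = tool_name.split("__", 1)
    if not interface_name or not action_name:
        return None
    return interface_name, action_name

print(parse_tool_name("gmail__send_email"))  # → ('gmail', 'send_email')
print(parse_tool_name("not-a-tool"))         # → None
```

Using `split("__", 1)` also keeps action names containing further double underscores intact.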
Updated Orchestrator Flow
```python
# In chat/orchestrator.py

async def chat_with_tools(
    user_id: UUID,
    group_id: UUID,
    space_id: UUID,
    messages: list[dict],
    system_prompt_override: Optional[str] = None,
) -> AsyncGenerator[StreamEvent, None]:
    """
    Agent loop with intelligent tool selection.
    """
    # 1. Determine relevant tools (NEW)
    tool_selector = SmartToolSelector(...)
    selected = await tool_selector.select(
        user_message=messages[-1]["content"],  # Current message
        user_id=user_id,
        group_id=group_id,
        space_id=space_id,
    )

    # 2. Build Anthropic tools list from selected tools (CHANGED)
    tools = actions_to_anthropic_tools(
        interface_manager,
        tools_to_include=[t.full_name for t in selected.selected],
    )

    # 3. Build system prompt with transparency (CHANGED)
    system_prompt = build_system_prompt(..., selected_tools=selected)

    # 4. Agent loop (unchanged)
    for turn in range(max_turns):
        # LLM call with selected tools + system prompt
        stream = await llm.chat(
            messages=messages,
            tools=tools,
            system_prompt=system_prompt,
        )
        # ... rest of loop
```
Updated System Prompt
```
Tool usage guidelines:

I've selected these tools based on your message:
1. memory__search — Find remembered facts, preferences, events
2. calendar__list_events — Check your calendar
3. calendar__create_event — Schedule an event (requires approval)
4. notifications__send — Send yourself an alert

These are the most relevant for what you asked.
If you need something else, ask me: "What other tools do I have?"
I can search all available integrations and suggest more options.

To use a tool:
- memory__search: Look up known facts before answering questions
- calendar__list_events: Check what's on the calendar
- ... [rest of tool guidance as today]
```
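The transparency block above would be rendered from `SelectedTools` by the prompt builder. A minimal sketch, assuming tools arrive as `(name, description)` pairs (the function name is illustrative; the real builder lives in `backend/chat/prompts.py`):

```python
def render_selected_tools(tools: "list[tuple[str, str]]", discovery_suggested: bool) -> str:
    """Render the numbered tool list plus the discovery hint, as shown above."""
    lines = ["I've selected these tools based on your message:"]
    for i, (name, desc) in enumerate(tools, 1):
        lines.append(f"{i}. {name} — {desc}")
    if discovery_suggested:
        lines.append('If you need something else, ask me: "What other tools do I have?"')
    return "\n".join(lines)

print(render_selected_tools(
    [("memory__search", "Find remembered facts"),
     ("calendar__list_events", "Check your calendar")],
    discovery_suggested=True,
))
```

Gating the discovery hint on `discovery_suggested` keeps the prompt shorter when the selection already covers everything accessible.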
Database/Storage
LanceDB (Local, Desktop):
- Table: `tool_catalog`
- Columns: interface_name, action_name, full_name, description, category, vector, ai_access, parameters_summary, tags
- Index: Vector index on `vector` column (cosine distance)
pgvector (Server):
- Table: `tool_catalog`
- Columns: same as LanceDB
- Index: ivfflat on `vector` column (cosine)
- Used as fallback for web client, cache for mobile
Redis Cache:
- Key: `tool_catalog:all` → serialized list of all tools
- TTL: 1 hour or on invalidation
- Used for fast discovery listing
6. Questions & Answers
Q: What if user's message is ambiguous and embedding-based selection picks wrong tools?
A: This is handled by the discovery escape hatch.
- If user says "send something", system might pick email + messaging tools
- If user actually wants "send a notification", they can call `tool_discovery__find("send notification")` and get the right tool
- Discovery tool is always available as escape hatch
Q: How does tool discovery work with ACL restrictions?
A: SmartToolSelector applies ACL filtering before returning results.
- Query LanceDB: get top-20 tools by similarity
- Filter: only tools user has access to (via Space inheritance + ACL rules)
- Return filtered set
- Same ACL logic applies in `ToolDiscoveryIntegration._find()`
Q: What about the embedding latency? 50-100ms adds up per request
A: Offset by token savings:
- Embedding lookup: ~50-100ms (local LanceDB)
- System prompt: 60-70% smaller → LLM processes 600-1400 tokens fewer
- LLM response time: typically 30-50ms per 100 tokens
- Net latency change: roughly −600 ms (faster) to +200 ms (slower), depending on response length
- For most requests (non-verbose responses), we come out ahead
Q: Should Skills and Integrations be truly unified in the code?
A: Yes, partially. In terms of tool catalog & discovery, they're identical:
- Both register as "virtual integrations" with actions
- Both get embedded + discoverable
- Same ACL rules apply
- Same system to call them
However, their creation/lifecycle might differ:
- Skills: created at runtime via SkillEngine, self-register as DynamicSkillInterface
- Integrations: registered at startup, may require configuration
- This distinction is fine to keep in their respective services
- But from the orchestrator's perspective, they're interchangeable
Q: What happens if a tool description is updated?
A: Tool catalog invalidation:
- Option 1: TTL-based (1 hour) — embeddings are rebuilt periodically
- Option 2: Event-based — when tool updates, trigger re-embedding
- Option 3: Manual — admin command to rebuild catalog
- Recommend: Hybrid of Option 2 + Option 1 (event-triggered with TTL fallback)
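The recommended hybrid (event-triggered invalidation with a TTL safety net) can be sketched in a few lines; in production this would sit in Redis, and the class name and injectable clock here are illustrative:

```python
import time

class CatalogCache:
    """Event-triggered invalidation with a TTL fallback, per the
    recommendation above (in-memory sketch, not the Redis version)."""
    def __init__(self, ttl_seconds: float = 3600.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._value = None
        self._stored_at = None

    def get(self):
        if self._value is None or self.clock() - self._stored_at > self.ttl:
            return None  # expired or never built → caller rebuilds the catalog
        return self._value

    def put(self, value):
        self._value, self._stored_at = value, self.clock()

    def invalidate(self):
        # Called on tool install/update/delete events (Option 2)
        self._value = self._stored_at = None

t = [0.0]
cache = CatalogCache(ttl_seconds=10.0, clock=lambda: t[0])
cache.put(["gmail__send_email"])
assert cache.get() == ["gmail__send_email"]
t[0] = 11.0  # TTL elapsed without any update event firing
assert cache.get() is None
```

The TTL bounds staleness even if an update event is lost, which is the whole point of combining Options 1 and 2.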
Q: How do you measure success?
A: Success metrics:
- Token savings: System prompt reduced 60-70% (measure: count tokens in prompts.py before/after)
- Tool selection accuracy: >85% of selected tools are actually used by LLM (measure: log which tools LLM calls)
- Tool discovery adoption: Users call tool_discovery in <5% of requests (baseline metric, should stay low)
- Latency: Request latency unchanged or improved (measure: embedding lookup + prompt parsing time)
- Coverage: No tool discovery calls fail to find relevant tools (measure: discovery call results)
7. Open Items
- Tool catalog invalidation strategy — Decide between TTL, event-based, or manual
  - Recommendation: Event-based (when SkillService creates/deletes skill) + 1-hour TTL as safety net
  - Owner: Backend architect
  - Timeline: During implementation
- Tool tags/categorization — Formalize tool categories
  - Recommendation: Add `category` field to ActionDefinition, migrate all actions to categorize
  - Owner: Product/Architect
  - Timeline: Can be deferred to step 12 (polish)
- Mobile embedding performance — LanceDB on mobile is fast, but confirm latency targets
  - Recommendation: Profile embedding lookup on iOS/Android, optimize if needed
  - Owner: Mobile lead
  - Timeline: Phase 3d M3 (offline mobile)
- Marketplace integration hook — When users download skills/integrations, exactly when does embedding happen?
  - Recommendation: Async background job (don't block the download)
  - Owner: Marketplace/Skills lead
  - Timeline: During step 11
8. References
- docs/interfaces.md — Integration/Interface system
- docs/architecture.md — System architecture
- backend/memory/rag.py — RAG pipeline (embedding + search)
- backend/memory/embedding_manager.py — Embedding provider
- backend/memory/vector_store.py — Vector storage
- backend/chat/tools.py — Tool bridge (current)
- backend/chat/orchestrator.py — Orchestrator (will be updated)
- backend/chat/prompts.py — System prompt builder (will be updated)
- Phase 3b.1 (Skills): backend/skills/
- Phase 2b (Tauri Rust): frontend/src-tauri/src/ (LanceDB, embeddings)
9. Implementation Dependencies
Must be done first:
- Phase 3e.5 complete (latest system prompt, tools stable)
- Embedding provider operational (RAG pipeline working)
Can be done in parallel:
- Marketplace integration (step 11) can start anytime
- Documentation updates (steps 12-13) can follow core implementation
Blocks:
- None — this is a new feature, no breaking changes
Last Updated: February 20, 2026 Owner: Backend Architect + LLM Team