
Feature: Intelligent Skill & Integration Discovery

Date: February 15, 2026
Status: Approved — partially implemented as Vector-First Architecture (VectorRouter, Feb 20)
Version: V1.0
Epic: Tool Discoverability & Scalability


1. Context & Motivation

Implementation note (Feb 20, 2026): The core of this design — embedding-based tool selection — has been implemented as the Vector-First Architecture. VectorRouter in backend/chat/vector_router.py routes ~65% of requests without calling the LLM at all. Skills are indexed via store_if_novel() with memory_type="skill_index". The full SmartToolSelector/ToolCatalogService described below is the next evolution of this pattern. See ROADMAP.md — Knowledge Pipeline for how this connects to the compilation chain (VectorRouter = the shortcut before any runtime).

Problem Statement

As Morphee's integration ecosystem grows (users can download skills/integrations from the marketplace), the current approach of including every available tool in every LLM request becomes inefficient:

  • Token overhead: System prompt balloons from 3-5 KB to 10+ KB with 50+ tools
  • Latency: LLM must parse and decide among 50+ irrelevant tools
  • Cognitive load: AI struggles to pick the right tool from a massive list
  • Cost: Token usage increases proportionally with tool count
  • Discoverability: Users don't know what tools they have; LLM can't effectively recommend them

Scope

This feature enables intelligent, context-aware tool selection using semantic search (embeddings). Instead of including all tools in every request, the system:

  1. Embeds all tool descriptions at startup (core integrations, skills, MCP tools)
  2. At runtime, finds the top 8-12 most relevant tools based on user's message
  3. Passes only those to the LLM with transparent explanation
  4. Provides an escape hatch (tool_discovery__find) for when users need tools outside the filtered set
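The runtime path above can be sketched with a toy in-memory catalog. `CATALOG`, `select_tools`, and the 3-dimensional vectors are illustrative stand-ins for the real embedding provider and the LanceDB index:

```python
import math

# Toy stand-in embeddings (the real system uses FastEmbed/OpenAI vectors).
CATALOG = {
    "gmail__send_email":      [0.9, 0.1, 0.0],
    "calendar__create_event": [0.1, 0.9, 0.0],
    "memory__search":         [0.0, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def select_tools(message_vector, k=2):
    """Rank catalog entries by similarity to the message and keep the top-K."""
    ranked = sorted(
        CATALOG,
        key=lambda name: cosine(CATALOG[name], message_vector),
        reverse=True,
    )
    return ranked[:k]

# A message vector close to the "email" direction picks the email tool first.
print(select_tools([0.8, 0.2, 0.1]))
```

If none of the returned tools fit, the escape hatch (`tool_discovery__find`) lets the LLM search the full catalog instead of this pre-filtered slice.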

Extended Scope: MCP (Model Context Protocol) Integration

This phase also introduces MCP support — a new MCPIntegration that manages external MCP servers and exposes their tools to Morphee:

  • MCP servers (local or remote) register their tools
  • Tool schemas are automatically extracted and embedded
  • MCP tools are discoverable, smart-selectable, and executable like any other tools
  • Marketplace integration — users can download MCP server packages alongside skills/integrations
  • Security — MCP tool calls respect ACL rules and approval workflows

Example: User downloads "Web Research MCP Server" from marketplace → tools appear in tool_discovery__find("web search") → SmartToolSelector ranks them by relevance → LLM can use them transparently.

Scale

Current state (Phase 3e.5):

  • Core integrations: ~14 (LLM, Memory, Frontend, Tasks, Spaces, Notifications, Cron, Google Calendar, Gmail, Filesystem, Webhook, Echo, Settings, Skills)
  • Dynamic skills: user-generated (0-100+ per group)

Target state (Phase 3e.6+):

  • Users can download 50-200+ skills/integrations from the marketplace
  • System remains responsive and efficient regardless of tool count
  • Tool discovery is transparent and always available

2. Options Investigated

Option A: Embedding-Based Smart Tool Selection (CHOSEN)

Description: Embed all tool descriptions upfront. At runtime, use semantic search (LanceDB locally, pgvector on server) to retrieve the top-K most relevant tools based on the user's message. Always include core tools. Pass only the filtered set to the LLM.

Pros:

  • ✅ Massive token savings (60-70%)
  • ✅ Faster LLM response (smaller context)
  • ✅ Embedding lookup is cheap (~50-100ms, local)
  • ✅ Transparent: LLM knows tools are filtered, can ask for discovery
  • ✅ Scales to 100+ tools without degradation
  • ✅ Leverages existing embedding infrastructure (RAG pipeline)
  • ✅ Works offline (LanceDB on desktop)

Cons:

  • ⚠️ Embedding lookup adds ~50-100ms latency (minor, offset by token savings)
  • ⚠️ Irrelevant tools might be selected if user's message is ambiguous
  • ⚠️ Requires maintaining tool catalog (invalidation on updates)

Effort: M (4-6 weeks, 13 steps)


Option B: Tool Discovery Interface Only (No Pre-filtering)

Description: Keep all tools in the system prompt, but add a tool_discovery integration that the LLM can call to explore tools by query. No pre-filtering.

Pros:

  • ✅ LLM always has complete picture
  • ✅ Simpler implementation (just add discovery tool)
  • ✅ No risk of missing relevant tools

Cons:

  • ❌ Token overhead remains unsolved (original problem)
  • ❌ LLM still must parse all tools even if it doesn't use them
  • ❌ Doesn't scale with marketplace integrations
  • ❌ Latency improvements minimal

Effort: S (1 week, just add one integration)


Option C: Hybrid (Pre-filter + Discovery)

Description: Use embedding-based pre-selection (Option A), but also include tool_discovery as escape hatch. LLM can call discovery if filtered tools aren't sufficient.

Pros:

  • ✅ All benefits of Option A
  • ✅ LLM can explicitly request more tools if needed
  • ✅ Handles edge cases where embedding selection is wrong

Cons:

  • ⚠️ More complex than Option A alone
  • ⚠️ Extra LLM round-trip if discovery is called (rare)
  • ⚠️ Relies on the LLM noticing when the filtered set is insufficient (low risk: discovery is an escape hatch, not the main path)

Effort: M + small (Option A + discovery integration)


Option D: Per-User Tool Profiles

Description: During onboarding, ask users which features they care about. Filter tools based on user's declared interests + message context.

Pros:

  • ✅ Very targeted filtering
  • ✅ Respects user preference

Cons:

  • ❌ Assumes static interests (users' needs change)
  • ❌ Onboarding becomes longer
  • ❌ Requires maintenance as interests evolve
  • ❌ Doesn't work for marketplace integrations (users don't know them yet)

Effort: M (onboarding UI + preference storage)


Option E: Dynamic System Prompt

Description: Instead of listing all tools, restructure system prompt to say: "You have access to: Communication (email, chat), Planning (tasks, calendar), Memory, Frontend, etc." Trust LLM to ask for specifics.

Pros:

  • ✅ Less token overhead than listing all tools
  • ✅ Simpler than embedding-based selection

Cons:

  • ❌ LLM doesn't know what tools actually exist
  • ❌ More ambiguous than explicit tool list
  • ❌ Doesn't solve the core problem (still need to bind to actual tools)
  • ❌ Requires extra discovery calls

Effort: S (update system prompt builder)


3. Decision

Chosen approach: Option C — Hybrid (Pre-filter + Discovery)

Recommended configuration:

  • Primary path: Embedding-based smart tool selection (8-12 tools per request)
  • Always include: Core tools (memory, tasks, frontend, notifications, discovery)
  • Escape hatch: tool_discovery__find() for when user needs tools outside filtered set
  • Transparent: System prompt clearly explains selection and discovery option

Reasoning:

  1. Option A alone is excellent, but discovery tool adds minimal complexity while providing a safety net
  2. Hybrid handles 99% of requests with Option A speed, but has an escape hatch for edge cases
  3. Token savings (60-70%) align with Phase 3e.6 goals (performance & scalability)
  4. Marketplace scalability: Supports growth to 100+ user-downloaded integrations
  5. Transparent & explainable: LLM knows what's happening, can make informed decisions
  6. Leverages existing infra: Uses RAG pipeline's embedding provider + LanceDB/pgvector

Trade-offs accepted:

  • ⚠️ Embedding lookup adds ~50-100ms latency per request (offset by 60-70% token savings = net win)
  • ⚠️ Potential for irrelevant tool selection in ambiguous cases (discovery tool handles this)
  • ⚠️ Tool catalog requires invalidation on updates (manageable, infrequent)

4. Implementation Plan

Phase 1: Foundation (Week 1-2, Steps 1-3)

| Step | Description | Effort | Details |
|------|-------------|--------|---------|
| 1 | Create ToolCatalogService | M | Build tool registry, embedding, caching |
| 2 | Implement tool_discovery integration | M | New integration with find/list/describe actions |
| 3 | Build SmartToolSelector | M | Query LanceDB, apply ACL, return filtered list |

Deliverables:

  • backend/chat/tool_catalog.py — ToolCatalogService, ToolCatalogEntry model
  • backend/interfaces/integrations/tool_discovery.py — ToolDiscoveryIntegration
  • backend/chat/tool_selector.py — SmartToolSelector with ACL filtering

Phase 2: LLM Integration (Week 2-3, Steps 4-6)

| Step | Description | Effort | Details |
|------|-------------|--------|---------|
| 4 | Update orchestrator to use SmartToolSelector | M | Replace actions_to_anthropic_tools() with selection |
| 5 | Revise system prompt builder | S | Add transparency message + selected tools list |
| 6 | Update tool bridge for dynamic tool lists | S | Handle variable-length tool list |

Deliverables:

  • Updated backend/chat/orchestrator.py
  • Updated backend/chat/prompts.py
  • Updated backend/chat/tools.py

Phase 3: Testing & Validation (Week 3-4, Steps 7-10)

| Step | Description | Effort | Details |
|------|-------------|--------|---------|
| 7 | Unit tests for ToolCatalogService | M | Test embedding, caching, invalidation |
| 8 | Unit tests for SmartToolSelector | M | Test relevance scoring, ACL filtering, core tool inclusion |
| 9 | Integration tests: chat flow with discovery | M | Full E2E with tool selection + discovery calls |
| 10 | E2E validation: token savings + accuracy | M | Measure savings, verify tool selection quality |

Acceptance Criteria:

  • ✅ System prompt size reduced by 60-70%
  • ✅ Top-K selection has >85% accuracy (selected tools are relevant)
  • ✅ Tool discovery retrieves correct tools when called
  • ✅ ACL filtering blocks unauthorized tools
  • ✅ Core tools always present

Phase 4: Marketplace Integration (Week 4, Step 11)

| Step | Description | Effort | Details |
|------|-------------|--------|---------|
| 11 | Auto-embed on marketplace install | M | Hook into skill/integration install, async embed |

Deliverables:

  • Updated backend/skills/service.py or backend/interfaces/integrations/*.py (install hook)
  • Async embedding task for new integrations

Phase 5: Polish & Documentation (Week 4+, Steps 12-13)

| Step | Description | Effort | Details |
|------|-------------|--------|---------|
| 12 | Update docs | S | interfaces.md, architecture.md, api.md |
| 13 | Feature doc + rationale | S | This document + IMPLEMENTATION_PLAN.md |

Deliverables:

  • Updated docs/interfaces.md (tool discovery section)
  • Updated docs/architecture.md (new component: ToolCatalogService)
  • Updated docs/api.md (tool_discovery actions)
  • docs/features/2026-02-15-IMPLEMENTATION_PLAN.md (step-by-step guide)
  • docs/features/QUICK_REFERENCE_Tool_Discovery.md (user reference)

5. Technical Specification

Data Models

ToolCatalogEntry

from dataclasses import dataclass, field
from typing import Optional

import numpy as np

from interfaces.models import AIAccess


@dataclass
class ToolCatalogEntry:
    """Entry in the tool catalog for embedding + discovery"""

    interface_name: str           # "gmail"
    action_name: str              # "send_email"
    full_name: str                # "gmail__send_email"
    description: str              # From ActionDefinition.description
    category: str                 # "communication", "planning", "core", "memory", etc.
    vector: np.ndarray            # 384-dim (FastEmbed) or 1536-dim (OpenAI)
    ai_access: AIAccess           # execute/propose/blocked
    available_in_groups: int = 0  # How many groups have this integration
    parameters_summary: str = ""  # Brief param summary for discovery results
    tags: list[str] = field(default_factory=list)  # ["email", "send", "communication"] for search

SmartToolSelector Output

@dataclass
class SelectedTools:
    """Result of smart tool selection"""

    selected: list[ToolCatalogEntry]  # Top-K filtered tools
    core_included: list[str]          # Which core tools were included
    excluded_count: int               # How many tools were filtered out
    reason: str                       # Explanation for transparency
    discovery_suggested: bool         # Should LLM know it can use discovery?

Service: ToolCatalogService

class ToolCatalogService:
    """
    Manages the tool catalog: embedding all registered tools,
    caching embeddings, and handling invalidation.
    """

    async def initialize(self):
        """Build tool catalog from InterfaceManager at startup"""
        # For each interface in InterfaceManager:
        #   For each action in interface.get_actions():
        #     Create ToolCatalogEntry
        #     Embed description
        #     Store in LanceDB + cache

    async def add_tool(self, interface_name: str, action: ActionDefinition):
        """Add a new tool (called when skill/integration installed)"""
        # Embed + store

    async def remove_tool(self, interface_name: str, action_name: str):
        """Remove a tool (called when uninstalled)"""

    async def get_all_tools(self) -> list[ToolCatalogEntry]:
        """Return all tools (for discovery)"""

    async def search(self, query: str, limit: int = 20) -> list[ToolCatalogEntry]:
        """Search by embedding + text"""

    @property
    def cache_size(self) -> int:
        """Current number of tools in catalog"""

Service: MCPIntegration (NEW)

class MCPIntegration(BaseInterface):
    """Manage and expose MCP (Model Context Protocol) servers.

    MCP servers provide external capabilities that extend Morphee.
    This integration acts as a broker: register servers, fetch their schemas,
    convert to tools, embed descriptions, make executable.
    """

    name = "mcp"
    description = "Register and manage MCP (Model Context Protocol) servers"
    config_schema = {}  # Configuration per interface instance

    def get_actions(self) -> List[ActionDefinition]:
        return [
            ActionDefinition(
                name="register_server",
                description="Register a new MCP server (local or remote)",
                parameters=[
                    ActionParameter(
                        name="name",
                        type=ParameterType.STRING,
                        description="Display name for this MCP server (e.g., 'Web Research')",
                        required=True,
                    ),
                    ActionParameter(
                        name="endpoint",
                        type=ParameterType.STRING,
                        description="Server endpoint (URL for remote, path for local)",
                        required=True,
                    ),
                    ActionParameter(
                        name="api_key",
                        type=ParameterType.STRING,
                        description="Optional API key if server requires authentication",
                        required=False,
                    ),
                ],
                ai_access=AIAccess.PROPOSE,  # Requires approval to add new capabilities
                side_effect=SideEffect.WRITE,
            ),
            ActionDefinition(
                name="list_servers",
                description="List all registered MCP servers",
                parameters=[],
                ai_access=AIAccess.EXECUTE,
                side_effect=SideEffect.READ,
            ),
            ActionDefinition(
                name="call_tool",
                description="Call a tool provided by an MCP server",
                parameters=[
                    ActionParameter(
                        name="server_name",
                        type=ParameterType.STRING,
                        description="Name of the MCP server",
                        required=True,
                    ),
                    ActionParameter(
                        name="tool_name",
                        type=ParameterType.STRING,
                        description="Tool name on that server",
                        required=True,
                    ),
                    ActionParameter(
                        name="params",
                        type=ParameterType.OBJECT,
                        description="Tool parameters",
                        required=False,
                    ),
                ],
                ai_access=AIAccess.EXECUTE,  # Or PROPOSE if tool is sensitive
                side_effect=SideEffect.READ,  # Or WRITE/DELETE based on tool
            ),
        ]

    async def execute(self, action_name: str, parameters: dict) -> ActionResult:
        if action_name == "register_server":
            return await self._register_server(
                parameters.get("name"),
                parameters.get("endpoint"),
                parameters.get("api_key"),
            )
        elif action_name == "list_servers":
            return await self._list_servers()
        elif action_name == "call_tool":
            return await self._call_tool(
                parameters.get("server_name"),
                parameters.get("tool_name"),
                parameters.get("params", {}),
            )
        else:
            return ActionResult(success=False, error=f"Unknown action: {action_name}")

    async def _register_server(self, name: str, endpoint: str, api_key: Optional[str]) -> ActionResult:
        """Register a new MCP server and fetch its schema"""
        # 1. Validate endpoint is reachable
        # 2. Fetch MCP server schema (list of available tools)
        # 3. Convert MCP tools to ActionDefinition format
        # 4. Embed tool descriptions
        # 5. Store server config in database
        # 6. Register each tool with ToolCatalogService
        # New tools immediately appear in tool_discovery + SmartToolSelector

    async def _list_servers(self) -> ActionResult:
        """List all registered MCP servers and their tool counts"""

    async def _call_tool(self, server_name: str, tool_name: str, params: dict) -> ActionResult:
        """Execute a tool on a registered MCP server"""
        # 1. Look up server config
        # 2. Format params per MCP protocol
        # 3. Call MCP server endpoint
        # 4. Return result to LLM

MCP Tool Registration Flow

User: "Install the Web Research MCP server"

mcp__register_server(
name="Web Research",
endpoint="https://mcp-web-research.example.com",
api_key="sk_xxx"
)

MCPIntegration:
1. Validates endpoint reachable
2. Fetches schema: [search_web, fetch_url, extract_content]
3. Converts to ActionDefinition:
- web_research__search_web
- web_research__fetch_url
- web_research__extract_content
4. Embeds descriptions → LanceDB

ToolCatalogService:
1. New tools immediately searchable

SmartToolSelector:
1. Next message automatically considers MCP tools

tool_discovery__find("search the web"):
1. Returns: web_research__search_web + other search tools

Service: SmartToolSelector

class SmartToolSelector:
    """
    Selects relevant tools for a specific user request.

    Algorithm:
    1. Embed user message
    2. Query LanceDB: top-20 similar tools
    3. Filter by ACL: only accessible tools
    4. Add core tools (always)
    5. Return top-K final selection
    """

    def __init__(
        self,
        catalog: ToolCatalogService,
        interface_manager: InterfaceManager,
        acl_service: ACLService,
        max_tools: int = 10,
    ):
        self.catalog = catalog
        self.interface_manager = interface_manager
        self.acl_service = acl_service
        self.max_tools = max_tools
        self.core_tools = {
            "memory__search", "memory__store", "memory__recall", "memory__forget",
            "tasks__list", "tasks__create", "tasks__update_status",
            "frontend__show_card", "frontend__show_form", "frontend__show_choices",
            "notifications__send",
            "tool_discovery__find",
        }

    async def select(
        self,
        user_message: str,
        user_id: UUID,
        group_id: UUID,
        space_id: UUID,
    ) -> SelectedTools:
        """
        Select the most relevant tools for this user + message.

        Args:
            user_message: The user's current message
            user_id: User making the request
            group_id: User's group
            space_id: Current space

        Returns:
            SelectedTools with selected tools + explanation
        """
        # 1. Get embedding of user message
        embedding = await embedding_provider.embed(user_message)

        # 2. Query LanceDB for top-20
        candidates = await self.catalog.search_vector(
            vector=embedding.vector,
            limit=20,
            metric="cosine",
        )

        # 3. Filter by ACL (only accessible tools)
        accessible = []
        for tool in candidates:
            full_name = f"{tool.interface_name}__{tool.action_name}"
            if await self.acl_service.check(user_id, group_id, space_id, full_name):
                accessible.append(tool)

        # 4. Separate core vs non-core
        core = [t for t in accessible if f"{t.interface_name}__{t.action_name}" in self.core_tools]
        non_core = [t for t in accessible if f"{t.interface_name}__{t.action_name}" not in self.core_tools]

        # 5. Build final selection
        slots_for_core = max(2, self.max_tools // 3)  # Reserve ~30% for core
        slots_for_non_core = self.max_tools - slots_for_core

        selected = (core[:slots_for_core] + non_core[:slots_for_non_core])[:self.max_tools]

        return SelectedTools(
            selected=selected,
            core_included=[f"{t.interface_name}__{t.action_name}" for t in core if t in selected],
            excluded_count=len(candidates) - len(selected),
            reason="Based on your message, I found these most relevant tools:",
            discovery_suggested=len(accessible) > len(selected),  # Let LLM know discovery exists
        )

Integration: ToolDiscoveryIntegration

class ToolDiscoveryIntegration(BaseInterface):
    """Discover and explore available integrations and skills"""

    name = "tool_discovery"
    description = "Search and discover available tools and integrations"

    def __init__(self, catalog: ToolCatalogService, **kwargs):
        super().__init__(**kwargs)
        self.catalog = catalog

    def get_actions(self) -> List[ActionDefinition]:
        return [
            ActionDefinition(
                name="find",
                description="Search for tools by query. Returns matching integrations and skills.",
                parameters=[
                    ActionParameter(
                        name="query",
                        type=ParameterType.STRING,
                        description="Search query (e.g., 'send email', 'schedule meeting')",
                        required=True,
                    ),
                    ActionParameter(
                        name="limit",
                        type=ParameterType.INTEGER,
                        description="Max results to return (default 10)",
                        required=False,
                        default=10,
                    ),
                ],
                ai_access=AIAccess.EXECUTE,
                side_effect=SideEffect.READ,
            ),
            ActionDefinition(
                name="list_all",
                description="List all available tools and integrations",
                parameters=[
                    ActionParameter(
                        name="category",
                        type=ParameterType.STRING,
                        description="Optional: filter by category (communication, planning, memory, etc.)",
                        required=False,
                    ),
                ],
                ai_access=AIAccess.EXECUTE,
                side_effect=SideEffect.READ,
            ),
            ActionDefinition(
                name="describe",
                description="Get full details about a specific tool",
                parameters=[
                    ActionParameter(
                        name="tool_name",
                        type=ParameterType.STRING,
                        description="Tool name (e.g., 'gmail__send_email')",
                        required=True,
                    ),
                ],
                ai_access=AIAccess.EXECUTE,
                side_effect=SideEffect.READ,
            ),
        ]

    async def execute(self, action_name: str, parameters: dict) -> ActionResult:
        if action_name == "find":
            return await self._find(parameters.get("query"), parameters.get("limit", 10))
        elif action_name == "list_all":
            return await self._list_all(parameters.get("category"))
        elif action_name == "describe":
            return await self._describe(parameters.get("tool_name"))
        else:
            return ActionResult(success=False, error=f"Unknown action: {action_name}")

    async def _find(self, query: str, limit: int) -> ActionResult:
        """Search for matching tools"""
        results = await self.catalog.search(query, limit=limit)
        return ActionResult(
            success=True,
            output={
                "query": query,
                "count": len(results),
                "tools": [
                    {
                        "name": f"{t.interface_name}__{t.action_name}",
                        "description": t.description,
                        "ai_access": t.ai_access.value,
                    }
                    for t in results
                ],
            },
        )

    async def _list_all(self, category: Optional[str]) -> ActionResult:
        """List all available tools"""
        tools = await self.catalog.get_all_tools()
        if category:
            tools = [t for t in tools if t.category == category]
        return ActionResult(
            success=True,
            output={
                "total": len(tools),
                "tools": [
                    {
                        "name": f"{t.interface_name}__{t.action_name}",
                        "description": t.description,
                        "category": t.category,
                    }
                    for t in tools
                ],
            },
        )

    async def _describe(self, tool_name: str) -> ActionResult:
        """Get full details about a tool"""
        if "__" not in tool_name:
            return ActionResult(success=False, error=f"Invalid tool name: {tool_name}")
        interface_name, action_name = tool_name.split("__", 1)
        tool = await self.catalog.get_tool(interface_name, action_name)
        if not tool:
            return ActionResult(success=False, error=f"Tool not found: {tool_name}")
        return ActionResult(
            success=True,
            output={
                "name": tool_name,
                "description": tool.description,
                "category": tool.category,
                "ai_access": tool.ai_access.value,
                "parameters_summary": tool.parameters_summary,
            },
        )

Updated Orchestrator Flow

# In chat/orchestrator.py

async def chat_with_tools(
user_id: UUID,
group_id: UUID,
space_id: UUID,
messages: list[dict],
system_prompt_override: Optional[str] = None,
) -> AsyncGenerator[StreamEvent, None]:
"""
Agent loop with intelligent tool selection.
"""
# 1. Determine relevant tools (NEW)
tool_selector = SmartToolSelector(...)
selected = await tool_selector.select(
user_message=messages[-1]["content"], # Current message
user_id=user_id,
group_id=group_id,
space_id=space_id,
)

# 2. Build Anthropic tools list from selected tools (CHANGED)
tools = actions_to_anthropic_tools(
interface_manager,
tools_to_include=[f"{t.interface_name}__{t.action_name}" for t in selected.selected],
)

# 3. Build system prompt with transparency (CHANGED)
system_prompt = build_system_prompt(..., selected_tools=selected)

# 4. Agent loop (unchanged)
for turn in range(max_turns):
# LLM call with selected tools + system prompt
stream = await llm.chat(
messages=messages,
tools=tools,
system_prompt=system_prompt,
)

# ... rest of loop

Updated System Prompt

Tool usage guidelines:
I've selected these tools based on your message:
1. memory__search — Find remembered facts, preferences, events
2. calendar__list_events — Check your calendar
3. calendar__create_event — Schedule an event (requires approval)
4. notifications__send — Send yourself an alert

These are the most relevant for what you asked.

If you need something else, ask me: "What other tools do I have?"
I can search all available integrations and suggest more options.

To use a tool:
- memory__search: Look up known facts before answering questions
- calendar__list_events: Check what's on the calendar
- ... [rest of tool guidance as today]
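A minimal sketch of how the prompt builder might render this transparency block. `build_tool_section` and its `(name, description)` pair input are hypothetical stand-ins for the real `build_system_prompt(..., selected_tools=...)`:

```python
def build_tool_section(selected, discovery_suggested):
    """Render the transparency block of the system prompt.

    `selected` is a list of (full_name, description) pairs; in the real
    system this would come from a SelectedTools result.
    """
    lines = [
        "Tool usage guidelines:",
        "I've selected these tools based on your message:",
    ]
    for i, (name, desc) in enumerate(selected, start=1):
        lines.append(f"{i}. {name} — {desc}")
    lines.append("These are the most relevant for what you asked.")
    if discovery_suggested:
        # Only advertise discovery when tools were actually filtered out.
        lines.append('If you need something else, ask me: "What other tools do I have?"')
    return "\n".join(lines)

print(build_tool_section(
    [("memory__search", "Find remembered facts"),
     ("calendar__list_events", "Check your calendar")],
    discovery_suggested=True,
))
```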

Database/Storage

LanceDB (Local, Desktop):

  • Table: tool_catalog
  • Columns: interface_name, action_name, full_name, description, category, vector, ai_access, parameters_summary, tags
  • Index: Vector index on vector column (cosine distance)

pgvector (Server):

  • Table: tool_catalog
  • Columns: same as LanceDB
  • Index: ivfflat on vector column (cosine)
  • Used as fallback for web client, cache for mobile

Redis Cache:

  • Key: tool_catalog:all → serialized list of all tools
  • TTL: 1 hour or on invalidation
  • Used for fast discovery listing
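A plausible server-side DDL for the pgvector table described above (a sketch, not the shipped migration: 384-dim FastEmbed vectors assumed, column names follow the list, exact types are illustrative):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE tool_catalog (
    id                 BIGSERIAL PRIMARY KEY,
    interface_name     TEXT NOT NULL,
    action_name        TEXT NOT NULL,
    full_name          TEXT NOT NULL UNIQUE,
    description        TEXT NOT NULL,
    category           TEXT,
    vector             vector(384),   -- 1536 if using OpenAI embeddings
    ai_access          TEXT NOT NULL,
    parameters_summary TEXT DEFAULT '',
    tags               TEXT[]
);

-- Approximate nearest-neighbour index using cosine distance
CREATE INDEX tool_catalog_vector_idx
    ON tool_catalog USING ivfflat (vector vector_cosine_ops)
    WITH (lists = 100);
```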

6. Questions & Answers

Q: What if user's message is ambiguous and embedding-based selection picks wrong tools?

A: This is handled by the discovery escape hatch.

  • If user says "send something", system might pick email + messaging tools
  • If user actually wants "send a notification", they can call tool_discovery__find("send notification") and get the right tool
  • Discovery tool is always available as escape hatch

Q: How does tool discovery work with ACL restrictions?

A: SmartToolSelector applies ACL filtering before returning results.

  • Query LanceDB: get top-20 tools by similarity
  • Filter: only tools user has access to (via Space inheritance + ACL rules)
  • Return filtered set
  • Same ACL logic applies in ToolDiscoveryIntegration._find()

Q: What about the embedding latency? 50-100ms adds up per request

A: Offset by token savings:

  • Embedding lookup: ~50-100ms (local LanceDB)
  • System prompt: 60-70% smaller → LLM processes 600-1400 tokens fewer
  • LLM response time: typically 30-50ms per 100 tokens
  • Net latency change: −600ms (faster) to +200ms (slower), depending on response length
  • For most requests (non-verbose responses), we're ahead
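The back-of-the-envelope arithmetic can be made explicit. The figures are the document's rough estimates, not measurements, and `net_latency_change_ms` is an illustrative helper:

```python
def net_latency_change_ms(embed_ms, tokens_saved, ms_per_100_tokens):
    """Added embedding-lookup cost minus time saved on a smaller prompt.

    A negative result means a net speedup.
    """
    return embed_ms - (tokens_saved / 100) * ms_per_100_tokens

# Favourable case: 50ms lookup, 1400 fewer tokens at 50ms per 100 tokens
print(net_latency_change_ms(50, 1400, 50))   # → -650.0 (net win)

# Less favourable case: 100ms lookup, only 600 fewer tokens at 30ms/100
print(net_latency_change_ms(100, 600, 30))   # → -80.0 (still ahead)
```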

Q: Should Skills and Integrations be truly unified in the code?

A: Yes, partially. In terms of tool catalog & discovery, they're identical:

  • Both register as "virtual integrations" with actions
  • Both get embedded + discoverable
  • Same ACL rules apply
  • Same system to call them

However, their creation/lifecycle might differ:

  • Skills: created at runtime via SkillEngine, self-register as DynamicSkillInterface
  • Integrations: registered at startup, may require configuration
  • This distinction is fine to keep in their respective services
  • But from the orchestrator's perspective, they're interchangeable

Q: What happens if a tool description is updated?

A: Tool catalog invalidation:

  • Option 1: TTL-based (1 hour) — embed is re-built periodically
  • Option 2: Event-based — when tool updates, trigger re-embedding
  • Option 3: Manual — admin command to rebuild catalog
  • Recommend: Hybrid of Option 2 + Option 1 (event-triggered with TTL fallback)
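The recommended hybrid can be sketched in a few lines. `CatalogCache` and `rebuild_fn` are illustrative names; `rebuild_fn` stands in for re-embedding the catalog:

```python
import time

class CatalogCache:
    """Hybrid invalidation: event-triggered rebuilds with a TTL fallback."""

    def __init__(self, rebuild_fn, ttl_seconds=3600):
        self._rebuild = rebuild_fn
        self._ttl = ttl_seconds
        self._built_at = None
        self._tools = None

    def get(self, now=None):
        now = time.monotonic() if now is None else now
        if self._tools is None or now - self._built_at > self._ttl:
            self._tools = self._rebuild()  # cold start or TTL fallback (Option 1)
            self._built_at = now
        return self._tools

    def invalidate(self):
        """Event-based path (Option 2): call on tool install/update/delete."""
        self._tools = None

# Demo: one rebuild at first access, one after TTL expiry, one after an event
rebuilds = []
cache = CatalogCache(lambda: rebuilds.append(1) or ["gmail__send_email"], ttl_seconds=10)
cache.get(now=0)     # cold start → rebuild
cache.get(now=5)     # fresh → served from cache
cache.get(now=20)    # TTL expired → rebuild
cache.invalidate()   # tool installed/updated
cache.get(now=21)    # event → rebuild
print(len(rebuilds))  # → 3
```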

Q: How do you measure success?

A: Success metrics:

  1. Token savings: System prompt reduced 60-70% (measure: count tokens in prompts.py before/after)
  2. Tool selection accuracy: >85% of selected tools are actually used by LLM (measure: log which tools LLM calls)
  3. Tool discovery adoption: Users call tool_discovery in <5% of requests (baseline metric, should stay low)
  4. Latency: Request latency unchanged or improved (measure: embedding lookup + prompt parsing time)
  5. Coverage: No tool discovery calls fail to find relevant tools (measure: discovery call results)
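Metric 2 can be computed directly from request logs. Reading "selected tools are actually used" as per-request precision, a sketch (the helper name is illustrative):

```python
def selection_accuracy(selected, actually_called):
    """Fraction of tools offered to the LLM that it actually called
    (per-request precision; average this over a log of requests)."""
    used = set(selected) & set(actually_called)
    return len(used) / len(selected) if selected else 0.0

# One logged request: 4 tools offered, 3 of them used
print(selection_accuracy(
    ["memory__search", "gmail__send_email", "calendar__list_events", "notifications__send"],
    ["memory__search", "gmail__send_email", "calendar__list_events"],
))  # → 0.75
```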

7. Open Items

  1. Tool catalog invalidation strategy — Decide between TTL, event-based, or manual

    • Recommendation: Event-based (when SkillService creates/deletes skill) + 1-hour TTL as safety net
    • Owner: Backend architect
    • Timeline: During implementation
  2. Tool tags/categorization — Formalize tool categories

    • Recommendation: Add category field to ActionDefinition, migrate all actions to categorize
    • Owner: Product/Architect
    • Timeline: Can be deferred to step 12 (polish)
  3. Mobile embedding performance — LanceDB on mobile is fast, but confirm latency targets

    • Recommendation: Profile embedding lookup on iOS/Android, optimize if needed
    • Owner: Mobile lead
    • Timeline: Phase 3d M3 (offline mobile)
  4. Marketplace integration hook — When users download skills/integrations, exactly when does embedding happen?

    • Recommendation: Async background job (don't block the download)
    • Owner: Marketplace/Skills lead
    • Timeline: During step 11

8. References


9. Implementation Dependencies

Must be done first:

  • Phase 3e.5 complete (latest system prompt, tools stable)
  • Embedding provider operational (RAG pipeline working)

Can be done in parallel:

  • Marketplace integration (step 11) can start anytime
  • Documentation updates (steps 12-13) can follow core implementation

Blocks:

  • None — this is a new feature, no breaking changes

Last Updated: February 20, 2026
Owner: Backend Architect + LLM Team