Methodology · v1 advisory

How a verdict is produced.

mcpindex evaluates MCP tools and publishes a finding per tool. Today every published screen verdict is semantic-only: an LLM judge reads the tool description for hidden instructions, and the verdict is bound to the exact tool definition that was seen (tool_definition_hash). The deterministic conformance probe is built but has not yet run against the public corpus, so a conforming ALLOW (which the probe would have to earn) is not produced at v1; the screen emits REVIEW or UNVERIFIED. The finding is what an agent reads before it calls. Confidence is reported but not yet calibrated (calibrated=false); see the honest limits below.

The eval
  • Conformance probe
    roadmap

    Built but not yet run on the public corpus; its run is gated to the D3 labeled-corpus milestone. When it runs, it drives the tool against its declared schema and checks whether observed behavior matches what the description claims, producing a pass/fail dimension verdict with a captured trace. At v1 no public verdict carries a conformance result. When a result lands, it will be monitored, not enforced: it surfaces in the verdict; it does not block the call upstream.

  • Intent judge
    LLM

    Reads the tool description, schema, and example outputs adversarially. Flags hidden instructions, exfiltration patterns, prompt-injection payloads, and overclaims (e.g. 'validates' a field it never checks). Output is a pass/fail dimension verdict with rationale and severity.

  • History
    OTS

    Verdict history is anchored to Bitcoin via OpenTimestamps (OTS). The cadence bound equals confirmation latency (~10 min for a pending anchor; ~1 hour at N=6 confirmations for a Bitcoin-finalized one); ordering inside that window is asserted, not proven. The verdict stream for a tool is hash-chained and timestamped, and the chain is auditable end-to-end once a block confirms.
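
The hash-chaining described under History can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the entry layout, the GENESIS placeholder, the canonical-JSON encoding, and the function names are hypothetical and are not taken from the mcpindex code. The OpenTimestamps anchoring step itself is not shown; under this sketch, the hash of the newest entry is what would be submitted for timestamping.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder predecessor for the first entry


def entry_hash(prev_hash: str, verdict: dict) -> str:
    # Canonical JSON (sorted keys, no whitespace) so equal verdicts hash equally.
    payload = json.dumps(verdict, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, verdict: dict) -> None:
    # Each entry commits to its predecessor's hash, forming the chain.
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"verdict": verdict, "prev": prev, "hash": entry_hash(prev, verdict)})


def verify(chain: list) -> bool:
    # Recompute every link; any edited or reordered entry breaks the chain.
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["verdict"]):
            return False
        prev = entry["hash"]
    return True
```

Because each entry commits to the one before it, timestamping only the latest hash is enough to cover the whole stream up to that point.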

The drift gate method

The screen above is the prior an agent reads before it wires a tool. The drift gate is the live, in-path check during use. It answers a narrower, provable question: did this tool’s contract change since you pinned it? The verdict is a contract-diff, not a safety verdict — but the gate sits in the call path, so it can HOLD the call before your agent acts on the change, not merely report it after. The gate runs deterministically and entirely on your host. Its verdicts are produced by the live gate code, not a hand-written table; the reproducible scenario battery is documented in the whitepaper.

  • TOFU pin
    baseline

    On first sight of a tool (the client's tools/list), the gate pins the tool's contract trust-on-first-use: a hash over name + description + input schema, plus the captured schema. The pin can persist across restarts, so a contract that changes while your agent is offline is still caught on the next call. The first-seen contract is the baseline; the gate cannot catch drift that happened before it was installed.

  • Contract-diff
    deterministic

    On a call, the gate re-derives the live contract and compares it to the pin. A mismatch is classified into a fixed taxonomy (ChangeKind): added-required-param, required-set-expanded, constraint-narrowed, type-changed, enum-values-removed, removed-param, annotation-flip-to-destructive, output-schema-added, output-schema-changed, tool-added/removed. It also scans for injection/exfil markers in the input AND output schema and the description. No LLM, no scoring you cannot trace; a structural surprise it cannot classify fails closed (deep-schema-undiffable), never open.

  • Postures
    policy

    Monitor notifies and proceeds; Guard (default) holds the unambiguously-breaking and dangerous changes while letting a proven-benign drift through; Strict holds on any drift. Under Guard, a benign change (added optional param or new tool, description byte-identical, no risk escalation, no marker) is auto-accepted and re-pinned, so cosmetic churn does not raise a false alarm; anything else holds before the call.
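
The TOFU pin and the contract-diff above can be sketched as follows. This is an illustration under stated assumptions, not the gate's code. The fields name, description, and inputSchema follow the MCP tool definition; the canonical-JSON hashing, the function names, and the pin layout are hypothetical. The classifier covers only a subset of the ChangeKind taxonomy, and the split between added-required-param (a new parameter that is required) and required-set-expanded (an existing parameter that became required) is an assumed reading of those labels.

```python
import hashlib
import json


def contract_hash(tool: dict) -> str:
    # Hash over name + description + input schema (canonical JSON is assumed).
    material = {
        "name": tool["name"],
        "description": tool.get("description", ""),
        "inputSchema": tool.get("inputSchema", {}),
    }
    canonical = json.dumps(material, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin_on_first_sight(pins: dict, tool: dict) -> bool:
    # Pin a tool the first time it is listed; never overwrite an existing pin.
    if tool["name"] in pins:
        return False
    pins[tool["name"]] = {
        "hash": contract_hash(tool),
        "schema": tool.get("inputSchema", {}),
    }
    return True


def classify(pinned: dict, live: dict) -> list:
    # Compare two input schemas and return sorted ChangeKind labels (subset only).
    try:
        old_props, new_props = pinned["properties"], live["properties"]
        old_req = set(pinned.get("required", []))
        new_req = set(live.get("required", []))
        kinds = set()
        for name in new_req - old_req:
            if name in old_props:
                kinds.add("required-set-expanded")
            else:
                kinds.add("added-required-param")
        if old_props.keys() - new_props.keys():
            kinds.add("removed-param")
        for name in old_props.keys() & new_props.keys():
            old, new = old_props[name], new_props[name]
            if old.get("type") != new.get("type"):
                kinds.add("type-changed")
            old_enum = old.get("enum", [])
            if set(old_enum) - set(new.get("enum", old_enum)):
                kinds.add("enum-values-removed")
        return sorted(kinds)
    except (KeyError, TypeError, AttributeError):
        # A structure the sketch cannot walk is reported, never ignored.
        return ["deep-schema-undiffable"]
```

An empty result from this sketch only means that none of the handled kinds matched; a complete gate would also need the remaining taxonomy entries and the marker scan before treating a change as benign.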

Honest limits (the gate)
  • Contract-diff, not a safety verdict. A HOLD means “this tool’s contract changed vs what you pinned”, not that the new contract is unsafe. You review the before/after and re-pin if the change is expected.
  • Advisory in judgment, in-path in effect. The gate does not assert a tool is safe; it asserts what changed. Because it sits in the call path, that judgment can actually HOLD the call, whereas a passive scanner can only alert after the fact.
  • Deterministic tier-0 live; tiers 1-3 built but held off by default. The contract-diff is deterministic, runs first, and is the only leg live by default. Above it the ladder is built as in-path seams: a cloud tier-1 corpus lookup (a contract judged once is cleared or condemned everywhere), a tier-2 LLM consult on the ambiguous, and a tier-3 behavioral verifier that exercises a changed tool to clear or refute the change. Each is held off by default and requires explicit opt-in; the default build egresses nothing and stays fail-closed. When enabled, the behavioral tier clears or refutes a contract change; it is not a proof of safety, and confidence is reported but not yet calibrated (calibrated=false at v1).
  • Fail-closed. A tool with no pin, an unreadable contract, or a diff the gate cannot complete holds rather than proceeds. The gate never silently allows what it could not verify.

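
The three postures and the fail-closed rule can be combined into one decision function. The sketch below is illustrative only: the function name, its parameters, and the use of None to stand for "no pin" or "unreadable contract" are hypothetical. It also assumes that the fail-closed check runs before the posture is consulted, and that a separate step has already decided whether a drift is proven benign.

```python
from enum import Enum
from typing import Optional


class Posture(Enum):
    MONITOR = "monitor"
    GUARD = "guard"  # the default posture
    STRICT = "strict"


def gate(posture: Posture, pinned_hash: Optional[str],
         live_hash: Optional[str], benign: bool = False) -> str:
    # Fail-closed: a missing pin or an unreadable live contract holds.
    if pinned_hash is None or live_hash is None:
        return "HOLD"
    # No drift: the live contract matches the pin.
    if pinned_hash == live_hash:
        return "PROCEED"
    # Monitor notifies and proceeds on drift.
    if posture is Posture.MONITOR:
        return "PROCEED"
    # Guard lets a proven-benign drift through (it would be re-pinned).
    if posture is Posture.GUARD and benign:
        return "PROCEED"
    # Strict on any drift, and Guard on anything not proven benign.
    return "HOLD"
```

For example, gate(Posture.GUARD, "abc", "def") returns "HOLD", while the same drift marked benign=True returns "PROCEED" under Guard and still "HOLD" under Strict.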
The drift network method

The in-path gate catches a change the first time you see it. The drift network catches it before you do. mcpindex crawls the public MCP registry every day, re-derives each tool’s contract, and records every silent change as a fingerprint-only entry. When you pin a tool, the gate can ask the network one question: has the crawler already caught this contract drifting? If it has, you are warned on the first call — a contract-diff advisory that rides alongside the verdict and never moves PROCEED or HOLD. Every drift the crawler catches is public in the drift ledger.

  • Crawler-corroborated, not crowd-sourced. The public corroboration count floors at the crawler (one first-party source); forgeable install reports are excluded from the public number. The warning is real today because the crawler sees the drift, not because other installs reported it.
  • Opt-in, privacy-by-construction. Off by default. When enabled, the only thing that leaves is a salted (HMAC) fingerprint plus closed-vocabulary fields (change type, safety flag, hour-rounded time); a schema, argument, description, URL, or server/tool name never leaves. Fail-open: it never blocks or changes a call.
  • Advisory, never the decision. The fleet advisory informs; the gate’s deterministic contract-diff still decides. The network can raise your attention; it cannot move a PROCEED or a HOLD.

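
A fingerprint-only report of the kind described above might look like the following sketch. It is a guess at the shape, not the wire format: the field names, the key handling, and the function names are hypothetical, the allowed change types are copied from the ChangeKind list earlier on this page, and "hour-rounded" is implemented here as truncation to the start of the UTC hour, which is an assumption.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Closed vocabulary: any change type outside this set is rejected.
CHANGE_TYPES = {
    "added-required-param", "required-set-expanded", "constraint-narrowed",
    "type-changed", "enum-values-removed", "removed-param",
    "annotation-flip-to-destructive", "output-schema-added",
    "output-schema-changed", "tool-added", "tool-removed",
}


def hour_rounded(ts: datetime) -> str:
    # Expects a timezone-aware datetime; truncates to the start of the UTC hour.
    utc = ts.astimezone(timezone.utc)
    return utc.replace(minute=0, second=0, microsecond=0).isoformat()


def build_report(key: bytes, contract_hash: str, change_type: str,
                 safety_flag: bool, ts: datetime) -> dict:
    if change_type not in CHANGE_TYPES:
        raise ValueError("change type outside the closed vocabulary")
    # Keyed fingerprint: the raw contract hash itself never leaves the host.
    fingerprint = hmac.new(key, contract_hash.encode("utf-8"),
                           hashlib.sha256).hexdigest()
    return {
        "fingerprint": fingerprint,
        "change_type": change_type,
        "safety_flag": bool(safety_flag),
        "hour": hour_rounded(ts),
    }
```

The report carries four fields and nothing else, so there is no slot in which a schema, argument, description, URL, or name could travel.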
Four-state verdict

The directive an agent reads is one of three decisions (ALLOW, DENY, REVIEW), on top of a status that says how the eval went; UNVERIFIED is the status an agent sees when no verdict is on file. Together they are four states an agent must distinguish.

  • ALLOW
    decision (roadmap)

    The eval ran end-to-end and the tool cleared its checks at the recorded clearance level; the agent may invoke within that clearance until expires_at. Not produced at v1: a clearing ALLOW requires the behavioral conformance probe, gated to the D3 labeled-corpus milestone. Today the screen emits REVIEW or UNVERIFIED only.

  • DENY
    decision (roadmap)

    The eval ran end-to-end and a finding crossed the deny threshold (high-severity intent flag, conformance regression, or a poisoned description); the agent should not invoke. Reserved in the contract; at v1 a high-severity finding surfaces as REVIEW for human adjudication rather than an automatic public DENY.

  • REVIEW
    decision

    The eval ran but produced ambiguous or partial findings (e.g. medium-severity flag, partial conformance, provider disagreement). Surfaces the dimension findings; the agent should defer to a human or fall back to its own checks.

  • UNVERIFIED
    status

    No verdict on file for this tool yet (the wire term the trust API returns when a tool has not been screened). The agent should NOT infer trust; treat as not-yet-cleared. Coverage rolls out as the corpus expands (adversarial cases first).
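
One way an agent could act on the four states is sketched below. The function name and the returned action strings are hypothetical; only the state names and expires_at come from this page. The sketch assumes the conservative reading: anything that is not a current ALLOW is treated as not cleared, including an ALLOW with a missing or past expires_at and any state the agent does not recognize.

```python
from datetime import datetime, timezone
from typing import Optional


def agent_action(state: str, now: datetime,
                 expires_at: Optional[datetime] = None) -> str:
    if state == "ALLOW":
        # A clearance is only usable until expires_at.
        if expires_at is not None and now < expires_at:
            return "invoke"
        return "treat-as-uncleared"
    if state == "DENY":
        return "do-not-invoke"
    if state == "REVIEW":
        return "defer-to-human"
    # UNVERIFIED, and any unrecognized state, confers no trust.
    return "treat-as-uncleared"
```

At v1 only the REVIEW and UNVERIFIED branches would be reached, since ALLOW and DENY are not yet produced by the screen.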

Honest limits (v1)
  • Definition, not runtime. The eval is bound to the tool definition (description + schema) at evaluation time. Runtime behavior on a specific call is not covered.
  • Conformance built, not yet run; monitored, not enforced. The deterministic conformance probe has not run on the public corpus, so no published screen verdict carries a conformance result today. When it runs, a conformance result is reported in the verdict and the public surface, not enforced on the wire.
  • OTS cadence bound = confirmation latency. The OTS anchor proves the verdict existed by some Bitcoin block; it does not prove minute-level ordering inside the confirmation window.
  • calibrated = false at v1. Confidence scores are reported but not yet calibrated against a held-out adversarial corpus.
  • Advisory, not blocking. mcpindex publishes the verdict. The agent or IDE decides whether to act.
  • D3 graduation gate. >=150 conforming labels, with the upper 95% confidence bound on the false-positive (FP) rate <=2%. Current: 15/150.

The honest-limits list is a contract. If any of these stops being true, the methodology page changes first, the verdict surface changes second, and the network only sees the upgrade after both.
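
The arithmetic behind the D3 graduation gate in the list above can be worked through under stated assumptions. The page does not say which interval method produces the upper-95 figure; this sketch assumes a one-sided exact binomial (Clopper-Pearson) bound and assumes the false-positive rate is measured over the labeled cases. The function names are hypothetical.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    # P(X <= k) for X ~ Binomial(n, p).
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def fp_upper_bound(false_positives: int, n: int, alpha: float = 0.05) -> float:
    # One-sided exact upper bound: the p at which P(X <= k) falls to alpha.
    if false_positives >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection; the CDF decreases as p grows
        mid = (lo + hi) / 2
        if binom_cdf(false_positives, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


def passes_d3(labels: int, false_positives: int) -> bool:
    # Both conditions from the gate: enough labels and a tight enough bound.
    return labels >= 150 and fp_upper_bound(false_positives, labels) <= 0.02
```

Under these assumptions, 150 labels with zero false positives give a bound of about 0.0198, just inside the 2% limit, while a single false positive in 150 pushes the bound above it. At 15 labels, even zero false positives leave a bound of about 0.18, which illustrates why the label count is part of the gate.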

Quality score (directory axis)

The trust verdict answers “does this tool behave as it claims.” The directory still answers a simpler question: which servers look mature from public registry signal. The 0-100 MCP Quality Score is a public-data composite (freshness, completeness, installability, documentation, semver stability) and remains the secondary axis on every page. Source: lib/quality.ts.

Cite this
"mcpindex: the trust-to-act layer for agent tool use." mcpindex.ai/methodology, 2026.
https://mcpindex.ai/methodology

Or just link to a server's detail page. The verdict surface and the score breakdown both render there.