Endara Relay

One endpoint for all your MCP servers. A single Rust binary that aggregates local STDIO servers, remote HTTP/SSE servers, and OAuth servers and serves them at http://localhost:9400/mcp. A separate management API (used by Endara Desktop) is exposed on a local Unix-domain socket / Windows Named Pipe — never on a TCP port.

Overview

Endara Relay is a single Rust binary that sits between your AI client (Claude Desktop, Cursor, ChatGPT, Windsurf, VS Code, Zed, Continue, or any MCP-compatible app) and all the MCP servers you actually use. Point every client at one local endpoint (http://localhost:9400/mcp) and the relay handles the rest: spawning STDIO servers, holding onto SSE / HTTP connections, refreshing OAuth tokens, and merging every server's tool catalog into a single unified list with collision-free names.

It uses one transport-specific adapter per endpoint, namespaces tools with a stable prefix to avoid collisions, and watches config.toml for changes so you can add or remove servers without restarting. STDIO adapters are restarted automatically with exponential backoff if the underlying process crashes.

Optionally, Relay can run in JS execution mode, where instead of advertising hundreds of tool definitions to the model on every turn, it advertises three meta-tools and lets the model run a sandboxed JavaScript program that calls the underlying tools in a single round-trip. See JS execution engine below.

No cloud, no accounts, no telemetry. Everything runs on your machine.

Install

Pick whichever you have set up.

Homebrew (macOS / Linux)

brew install endara-ai/tap/endara-relay

Cargo

cargo install endara-relay

Pre-built binaries

Download the latest release for your platform from github.com/endara-ai/endara-relay/releases.

Or, if you'd rather not run the relay yourself, install Endara Desktop — it bundles the relay, starts and stops it for you, and provides a UI for managing endpoints.

Quick start

Drop the following into ~/.endara/config.toml:

# ~/.endara/config.toml
[relay]
machine_name = "my-mac"

[[endpoints]]
name = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"]

Then start the relay:

endara-relay start

Point any MCP-compatible client at http://localhost:9400/mcp. Claude Desktop only speaks stdio, so use the mcp-remote bridge — drop this into claude_desktop_config.json:

{
  "mcpServers": {
    "endara": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:9400/mcp"]
    }
  }
}

For Cursor, add the same URL under Settings → MCP → Add new MCP server → HTTP. Restart the client and the filesystem tools should appear in its tool list, prefixed with filesystem__.

CLI reference

The relay has a single subcommand, start, which boots the HTTP server, loads the config, and starts watching it for changes.

| Flag | Default | Description |
| --- | --- | --- |
| --data-dir | ~/.endara | Base directory for config, logs, and OAuth tokens. The relay creates it if it doesn't exist and writes a default config.toml on first run. |
| --config | <data-dir>/config.toml | Override the path to the TOML configuration file. |
| --port | 9400 | Port to listen on. The MCP endpoint (/mcp and /mcp/*), /oauth/callback, and /healthz are served on this TCP port. The management API (/api/*) is not served on TCP; it is bound to a Unix-domain socket / Windows Named Pipe (see Management API). |
| --log-format | compact | Log output format for stdout. One of compact, text, or json. The rolling file always uses compact. |
| --color | auto | Colorize stdout logs. One of auto (color only when stdout is a TTY), always, or never. The rolling log file is always ANSI-free. |
| --file-log-level | debug,endara_relay=trace | EnvFilter directive applied to the rolling log-file layer, independent of the stdout filter. Lets the file run at a different verbosity than stdout. |
| --no-toon | (flag) | Disable TOON conversion of JSON tool-call responses for this run. Overrides [relay] toon_output from config.toml. See TOON output. |

Note: The RUST_LOG environment variable overrides the stdout log filter when set. The default is info,endara_relay=debug. Logs are written to both stdout and ~/.endara/logs/relay.log.<YYYY-MM-DD> (the file is filtered by --file-log-level).

Configuration reference

Endara Relay reads a single TOML file. Default location: ~/.endara/config.toml. The file has one [relay] table and any number of [[endpoints]] entries.

[relay] table

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| machine_name | string | no | system hostname | Identifies this relay instance in logs and /api/status. Free-form; pick anything that helps you tell machines apart. When omitted (or when the whole [relay] table is absent) the relay derives it from your hostname. |
| local_js_execution | bool | no | false | When true, the advertised tool catalog is replaced with three meta-tools (list_tools, search_tools, execute_tools) and direct tool calls are rejected. See JS execution engine. |
| token_dir | string (path) | no | <data-dir>/tokens | Override the directory used for OAuth token and DCR-credential storage. Useful when you want a non-default location separate from the data dir. |
| toon_output | bool | no | true | When true, the relay re-encodes JSON-shaped tool-call responses to TOON before forwarding them to MCP clients. Set to false (or pass --no-toon on the command line) to forward JSON pass-through. See TOON output. |
| startup_init_timeout_secs | integer (seconds) | no | 60 | Caps how long the MCP HTTP listener waits for adapters to finish their parallel async init before binding port 9400 anyway. Past the timeout the port comes up and any still-pending adapters keep initializing in the background; their endpoints surface an Initializing… state until they settle to healthy or failed. |

Note: The entire [relay] table is optional. When it is absent, the relay applies sensible defaults (including a machine_name derived from your hostname) and starts normally, so a fresh install works out of the box without a hand-written [relay] table.

[[endpoints]] entries

Each [[endpoints]] table describes one MCP server to connect to. Required fields depend on the chosen transport.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | Unique, non-empty identifier. Used as the default tool prefix (sanitized to lowercase ASCII), and is the path segment in management API URLs (those URLs are served on the management socket; see Management API). |
| description | string | no | Free-form; surfaced in the UI and in logs. |
| tool_prefix | string | no | Override the auto-derived prefix. If omitted, defaults to sanitize_name(name). See Tool prefixing. |
| transport | stdio / sse / http / oauth | yes | Adapter type. Determines which other fields are required. |
| command | string | yes (stdio) | Executable to spawn for STDIO transports. |
| args | array of string | no | Arguments passed to the spawned process. |
| url | string | yes (sse, http, oauth) | Endpoint URL for HTTP-based transports. |
| env | map of string → string | no | Environment variables passed to STDIO subprocesses. Values support $VAR resolution and $$ escaping; see Environment variable resolution. |
| headers | map of string → string | no | Extra HTTP headers for http / sse / oauth transports. Header values support inline $VAR substitution (e.g. Authorization = "Bearer $TOKEN"). |
| disabled | bool | no | Default false. When true, the endpoint is registered but no adapter is started; toggling this does not restart adapters during hot-reload. |
| disabled_tools | array of string | no | Tool names to hide from the advertised catalog without disabling the underlying server. Calls to disabled tools return an MCP error. |
| oauth_server_url | string | yes (oauth) | Authorization server base URL. The relay performs OIDC / metadata discovery against this URL. |
| client_id | string | no | Pre-provisioned OAuth client identifier. If omitted, the relay performs Dynamic Client Registration when needed. |
| scopes | array of string | no | OAuth scopes requested during authorization. |
| token_endpoint | string | no | Override the discovered token endpoint URL. Rarely needed: the relay self-heals stale token endpoints by re-running OAuth discovery on a refresh-time 404. |
| server_type_override | string | no | Override the upstream-derived server type name that the relay advertises to connected MCP clients (the label that shows up in Connected server types: and prefixes tool names). Useful when an upstream MCP server reports a placeholder or unhelpful serverInfo.name (e.g. Google's hosted MCP servers all self-identify as statelessserver). The auto-strip of -mcp-server and friends is not applied to overrides; the value is used as written. Tool-name routing (the tool_prefix) is unaffected. |

Containerized stdio: stdio endpoints run inside a per-endpoint container by default when a container runtime (docker or podman) is detected, for stronger isolation. The relay uses a small mcp-runner image that provides uvx and npx, so the usual command/args work unchanged; with no runtime present it falls back to spawning the process directly. Per-endpoint isolation and host volume mounts are managed from Endara Desktop.

OAuth credentials: OAuth client credentials are not stored in config.toml. They are written via POST /api/endpoints/{name}/oauth/credentials (or the equivalent flow in Endara Desktop) and persisted by the relay's TokenManager under ~/.endara/tokens/ with mode 0600. Dynamic Client Registration (DCR) populates them automatically when the server supports it; otherwise you provide client_id (and client_secret for confidential clients) via the API or desktop UI. Note: /api/* is exposed only on the management socket described in Management API, not on http://localhost:9400. Use curl --unix-socket (or the Desktop UI) to call it.

Transport snippets

STDIO

[[endpoints]]
name = "github"
description = "GitHub MCP server"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "$GITHUB_TOKEN" }

HTTP

[[endpoints]]
name = "context7"
transport = "http"
url = "https://mcp.context7.com/mcp"
headers = { Authorization = "Bearer $CONTEXT7_KEY" }

SSE

[[endpoints]]
name = "remote-sse"
transport = "sse"
url = "https://example.com/mcp/sse"

OAuth

[[endpoints]]
name = "linear"
transport = "oauth"
url = "https://mcp.linear.app/mcp"
oauth_server_url = "https://mcp.linear.app"
scopes = ["read", "write"]
# client_id / client_secret are persisted via the management API, not TOML

Environment variable resolution

Endpoint env values and headers values are passed through a small resolver before adapters start:

If a referenced variable is not set, the relay records the failure through ConfigError::EnvVarMissing. With graceful validation (the default at startup and during hot-reload), the affected endpoint is registered as a failed adapter with the underlying error visible on its entry in GET /api/endpoints (filter the array by name) — startup itself does not fail.
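The resolution behaviour described here ($VAR substitution, $$ escaping, and a failure when a referenced variable is unset) can be modelled in a few lines. This is an illustrative JavaScript sketch, not the relay's Rust code: the function name resolveEnvRefs, the EnvVarMissing class, and the variable-name grammar (letters, digits, underscore) are assumptions.

```javascript
// Hypothetical model of $VAR / $$ resolution; names and grammar are assumptions.
class EnvVarMissing extends Error {
  constructor(variable) {
    super(`environment variable ${variable} is not set`);
    this.name = "EnvVarMissing";
    this.variable = variable;
  }
}

function resolveEnvRefs(value, env) {
  // Match an escaped "$$" or a "$NAME" reference in a single pass, so that
  // "$$HOME" yields the literal "$HOME" instead of triggering a lookup.
  return value.replace(/\$\$|\$([A-Za-z_][A-Za-z0-9_]*)/g, (match, name) => {
    if (match === "$$") return "$";
    if (!Object.prototype.hasOwnProperty.call(env, name)) {
      throw new EnvVarMissing(name);
    }
    return env[name];
  });
}
```

For example, resolveEnvRefs("Bearer $TOKEN", { TOKEN: "abc" }) yields "Bearer abc", while a reference to an unset variable throws, which corresponds to the endpoint being registered as a failed adapter rather than aborting startup.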

Validation rules

Hot reload

The relay watches config.toml via the notify crate and applies changes without a restart. The file is diffed against the running config, and:

Management API

The relay exposes a small JSON API for inspecting state, restarting adapters, completing OAuth flows, and editing the running configuration. Endara Desktop drives this API. Unlike /mcp, the management API does not listen on TCP — it binds to a Unix-domain socket on macOS / Linux ($XDG_RUNTIME_DIR/endara-relay-<suffix>/api.sock, falling back to $TMPDIR/endara-relay-<uid>-<suffix>/api.sock on macOS or <data-dir>/api.sock if no runtime dir is set) and to a per-user Named Pipe on Windows (\\.\pipe\endara-relay-<sessionid>, falling back to \\.\pipe\endara-relay-<data-dir-hash>). The <suffix> is a stable hash of the data-dir, so a dev build (~/.endara-dev) and a production build (~/.endara) get distinct sockets and can run side-by-side on the same machine. To script against the API, use curl --unix-socket (macOS / Linux) or a Named Pipe client (Windows). Keeping the management API off TCP rules out drive-by browser attacks against a local HTTP endpoint; see Security for the broader threat model.

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/status | Process uptime, total endpoint count, and healthy count. |
| GET | /api/endpoints | All endpoints with health, tool count, last-activity timestamps, and lifecycle state. |
| GET | /api/catalog | Full merged tool catalog across all endpoints with applied prefixes, source endpoint name, and current availability (reflects per-endpoint and per-tool disable state plus health). |
| GET | /api/endpoints/:name/tools | Tool definitions for a single endpoint, including each tool’s input schema. |
| GET | /api/endpoints/:name/logs | Recent log lines for an endpoint, for debugging stuck or failing adapters. |
| POST | /api/endpoints/:name/restart | Restart an endpoint’s adapter. Returns immediately; the heavy work runs in the background and lifecycle state is surfaced through GET /api/endpoints. |
| POST | /api/endpoints/:name/refresh | Re-list tools from a healthy endpoint without restarting it. |
| POST | /api/endpoints/:name/disable | Shut down the adapter and mark the endpoint disabled. Persisted to the disabled-state file so the endpoint stays disabled across restarts; its tools disappear from the catalog. |
| POST | /api/endpoints/:name/enable | Clear the disabled flag and re-initialize the adapter. Persisted to the disabled-state file. |
| POST | /api/endpoints/:name/tools/:tool_name/disable | Hide a single tool from the merged catalog without disabling the endpoint as a whole. Persisted to the disabled-state file. |
| POST | /api/endpoints/:name/tools/:tool_name/enable | Re-enable a previously disabled tool on an endpoint. Persisted to the disabled-state file. |
| POST | /api/endpoints | Create a new endpoint and persist it to config.toml. Body is the same JSON shape as a [[endpoints]] entry. The new adapter is initialized asynchronously and surfaces through GET /api/endpoints. |
| PUT | /api/endpoints/:name | Replace an existing endpoint’s definition in-place and persist to config.toml. Restarts the adapter if any meaningful field (transport, command/args, url, env, headers, OAuth fields) changed; preserves stored OAuth tokens when the change is non-credential. |
| DELETE | /api/endpoints/:name | Remove an endpoint from the running registry and persist the deletion to config.toml. |
| GET | /api/config | Current parsed configuration with env values redacted. |
| POST | /api/config/reload | Force an immediate reload from disk (the file watcher does this automatically; this endpoint is for triggering it manually). |
| POST | /api/test-connection | Try connecting with the supplied transport / command / url / headers without persisting an endpoint. Useful for UIs validating user input before saving. |
| POST | /api/endpoints/:name/oauth/start | Start an OAuth authorization flow for the endpoint and return the authorize URL. |
| POST | /api/endpoints/:name/oauth/credentials | Persist OAuth client credentials (client_id / client_secret) for the endpoint. |
| GET | /api/endpoints/:name/oauth/status | Whether the endpoint has tokens, when they expire, and which scopes were granted. |
| POST | /api/endpoints/:name/oauth/revoke | Revoke and delete the stored OAuth tokens for the endpoint. |
| POST | /api/endpoints/:name/oauth/refresh | Force-refresh an access token using the stored refresh token. |
| GET | /api/endpoints/:name/oauth/metrics | In-process OAuth metric counters for the endpoint (e.g. token refreshes, refresh failures), as JSON. |
| POST | /api/oauth/setup | Create a transient OAuth setup session: discovers OAuth metadata, attempts Dynamic Client Registration, and returns the authorize URL, without writing to config.toml. |
| POST | /api/oauth/setup/:id/credentials | Submit manual client_id / client_secret for a setup session when DCR is unavailable, and receive the authorize URL. |
| GET | /api/oauth/setup/:id/status | Poll the status of a setup session (pending / awaiting credentials / authorized / failed). |
| POST | /api/oauth/setup/:id/commit | Persist a successfully authorized setup session: write the new endpoint into config.toml and register the running adapter. Only succeeds once the session has reached the Authorized state. |
| DELETE | /api/oauth/setup/:id | Cancel a setup session and clean up its in-memory state without writing to config. |
| POST | /api/endpoints/:name/credentials | Persist OAuth client credentials (client_id and optional client_secret) for an existing OAuth endpoint via the TokenManager DCR file. Modern replacement for the legacy client_secret TOML field. To seed credentials during initial setup, before the endpoint exists, use POST /api/oauth/setup/:id/credentials instead. |
| GET | /api/endpoints/:name/credentials | Inspect which credential fields are currently set for an endpoint (values are not returned). |
| GET | /api/idp-providers | The static IdP provider-template table (Okta, Entra, Google, Ping, Custom) used to build an organization’s issuer URL. See Enterprise-Managed Authorization. |
| GET | /api/organizations | List configured organizations with provider, resolved IdP issuer, and whether the credential pool currently holds a usable ID / refresh token. |
| POST | /api/organizations | Create an organization from a provider + slug (or a custom issuer URL) and an optional pre-registered client_id, then return the IdP SSO authorize URL. Tokens are never written to config.toml. |
| POST | /api/organizations/:org/reauthenticate | Re-run the IdP sign-in for an organization (e.g. after its refresh token expires) and return a fresh authorize URL. |
| DELETE | /api/organizations/:org | Remove an organization and purge its pooled IdP credentials. Endpoints bound to it stop authenticating. |
| POST | /api/organizations/:org/probe | EMA capability probe: given candidate MCP server URLs, report which ones the org’s IdP credentials can mint an access token for. Bounded, cached, and persists nothing. |

Scripting against the API

Because /api/* lives on a local socket / pipe, the invocation depends on your platform. Methods, paths, JSON bodies, and status codes are standard HTTP — only the transport is local.

# Linux: Unix-domain socket under $XDG_RUNTIME_DIR.
# <suffix> is a stable hash of the data-dir; resolve the actual path
# from /api/status output or by inspecting the relay's startup logs.
curl --unix-socket "$XDG_RUNTIME_DIR/endara-relay-<suffix>/api.sock" \
  http://localhost/api/status

# macOS: Unix-domain socket under $TMPDIR.
# <suffix> is a stable hash of the data-dir.
curl --unix-socket "$TMPDIR/endara-relay-$(id -u)-<suffix>/api.sock" \
  http://localhost/api/status

# Windows (PowerShell): per-user Named Pipe.
# curl 8.x supports --unix-socket against \\.\pipe\<name>
curl.exe --unix-socket "\\.\pipe\endara-relay-$([System.Security.Principal.WindowsIdentity]::GetCurrent().User.Value)" `
  http://localhost/api/status

On all platforms, the host portion of the URL (http://localhost) is ignored by the relay — only the path and method matter. The socket / pipe is owned by the current user with restrictive permissions, and on Unix the relay verifies the peer's UID before accepting a connection.
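If you would rather script the API from Node.js than from curl, the built-in http module can connect to a Unix-domain socket through its socketPath option. The sketch below is a minimal example under stated assumptions: the helper names are hypothetical, the socket path is whatever your relay actually bound, and the shape of the JSON response is not specified in this document.

```javascript
const http = require("node:http");

// Build request options for one management-API call over a local socket.
// `socketPath` is a placeholder for the path your relay bound at startup.
function managementRequestOptions(socketPath, method, path) {
  return {
    socketPath, // connect to the Unix-domain socket instead of a TCP port
    method,
    path,
    headers: { Accept: "application/json", Host: "localhost" },
  };
}

// Send the request and parse the JSON body. Nothing is sent until you call it.
function callManagementApi(socketPath, method, path) {
  return new Promise((resolve, reject) => {
    const options = managementRequestOptions(socketPath, method, path);
    const req = http.request(options, (res) => {
      let body = "";
      res.setEncoding("utf8");
      res.on("data", (chunk) => { body += chunk; });
      res.on("end", () => {
        try {
          resolve({ status: res.statusCode, json: body ? JSON.parse(body) : null });
        } catch (err) {
          reject(err);
        }
      });
    });
    req.on("error", reject);
    req.end();
  });
}
```

A call such as callManagementApi(socketPath, "GET", "/api/status").then(console.log) mirrors the curl examples; as with curl, the Host value is arbitrary because only the method and path matter.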

Enterprise-Managed Authorization (EMA)

Preview / forward-looking: EMA is a preview capability. It has two independent legs: an identity provider issues a cross-app ID-JAG assertion (RFC 8693 token exchange), and a resource authorization server redeems that ID-JAG for an access token. Okta can issue ID-JAGs today (with admin managed-connection setup), but no production MCP provider redeems ID-JAGs yet. The relay's full chain (SSO → exchange → redeem → call) is implemented and validated by mock-IdP / mock-AS integration tests, but a live end-to-end run currently requires both an IdP and a resource AS that support cross-app access / ID-JAG (for example the oktadev reference server). You cannot EMA-connect ordinary catalog providers (GitHub, Linear, Slack, Atlassian, etc.) in production today.

EMA lets one organization sign in to its identity provider once and have that single session authorize every MCP server the org governs. It contrasts with per-endpoint resource OAuth (above), where each server runs its own authorization flow and holds its own tokens. Under EMA, an [[organizations]] entry holds the shared IdP sign-in, and each EMA endpoint references that org by name and carries the resource (its own MCP server URL) the minted access token is scoped to.

# An organization: one IdP sign-in shared across this org's EMA endpoints.
[[organizations]]
name = "Acme Corp"
provider = "okta"
idp = "https://acme.okta.com"
# Optional pre-registered IdP client_id. Required for IdPs without CIMD/DCR
# (Okta, Entra). Omit it to fall back to CIMD (when advertised) or DCR.
client_id = "0oa1example2client3id"

# An EMA endpoint authenticates through the org above and mints a token
# scoped to its own resource (the MCP server URL).
[[endpoints]]
name = "acme-jira"
transport = "http"
url = "https://acme.example.com/mcp"

[endpoints.auth]
type = "ema"
organization = "Acme Corp"
resource = "https://acme.example.com/mcp"
# Tokens are NEVER written here — the org's ID token lives in the credential store.

[[organizations]] fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | Stable, human-readable key (e.g. Acme Corp). Referenced by endpoint auth.organization and used as the credential-pool key. |
| provider | okta / entra / google / ping / custom | yes | Provider template id. Determines how the IdP issuer URL is built; see the per-provider matrix below. |
| idp | string | yes | Resolved IdP issuer URL. Built from the provider template + your org slug (or pasted directly for custom), then validated via RFC 8414 / OIDC discovery. |
| client_id | string | no | Pre-registered OAuth client_id for the org's IdP app. When present it is used verbatim across the authorize URL and every EMA token leg; when absent the relay falls back to the resolution chain below. Required for Okta / Entra. |

Endpoint [endpoints.auth] EMA block

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | ema | yes | Selects Enterprise-Managed Authorization. ema is currently the only supported value. |
| organization | string | yes | Name of the [[organizations]] entry this endpoint authenticates through. The IdP issuer is resolved from the named org. |
| resource | string | yes | The target MCP server URL the EMA access token is minted for (typically the same value as the endpoint's url). |

No tokens in config: The org's pooled ID token and refresh token are never written to config.toml. Only name, provider, idp, and the optional client_id are serialized. Credentials live in the relay's credential store with mode 0600; see Security.

Client registration model

Before it can start the IdP sign-in, the relay needs an OAuth client_id for your org's IdP. It resolves one through a fixed fallback chain:

  1. Explicit client_id on the organization — used verbatim when supplied.
  2. CIMD (Client ID Metadata Document) — only when the IdP advertises client_id_metadata_document_supported in its discovery metadata.
  3. DCR (Dynamic Client Registration, RFC 7591) — only when the IdP exposes a registration_endpoint.
  4. Otherwise the relay returns 422 client_id_required and you must supply a pre-registered client_id.

Okta and Entra support neither CIMD nor DCR for this flow, so they always need a pre-registered client_id. Google, Ping, and most custom OIDC providers also typically require a pre-registered client.
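The fallback chain above can be expressed as a small decision function. This is an illustrative model only: the function name and return shape are hypothetical, while the metadata field names (client_id_metadata_document_supported, registration_endpoint) and the 422 client_id_required outcome come from the list above.

```javascript
// Hypothetical model of the client_id fallback chain; not the relay's code.
function resolveClientIdSource(org, idpMetadata) {
  // 1. An explicit client_id on the organization is used verbatim.
  if (org.client_id) {
    return { ok: true, source: "explicit", clientId: org.client_id };
  }
  // 2. CIMD, only when the IdP advertises support in its discovery metadata.
  if (idpMetadata.client_id_metadata_document_supported === true) {
    return { ok: true, source: "cimd" };
  }
  // 3. DCR, only when the IdP exposes a registration endpoint.
  const endpoint = idpMetadata.registration_endpoint;
  if (typeof endpoint === "string" && endpoint !== "") {
    return { ok: true, source: "dcr", registrationEndpoint: endpoint };
  }
  // 4. Nothing available: the caller must supply a pre-registered client_id.
  return { ok: false, status: 422, error: "client_id_required" };
}
```

For an Okta or Entra organization with no client_id, neither CIMD nor DCR applies, so the function falls through to the 422 branch, which matches the requirement for a pre-registered client.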

Loopback redirect URI

The IdP app you register must whitelist the relay's loopback redirect URI:

http://127.0.0.1:{relay_port}/oauth/callback

The default {relay_port} is 9400 (production) or 9500 (dev). The value must byte-match what the relay sends: use the literal host 127.0.0.1 (not localhost) and the exact port your relay listens on. A mismatched port or host is the most common cause of an invalid redirect_uri error on the IdP consent screen.
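Because the comparison is exact, it can help to generate the value you paste into the IdP console rather than typing it by hand. The helpers below are a hypothetical sketch (the function names are not part of the relay) that build the URI from a port and compare it against a registered value character for character.

```javascript
// Build the loopback redirect URI for a given relay port (hypothetical helper).
function loopbackRedirectUri(relayPort) {
  if (!Number.isInteger(relayPort) || relayPort < 1 || relayPort > 65535) {
    throw new RangeError(`invalid port: ${relayPort}`);
  }
  return `http://127.0.0.1:${relayPort}/oauth/callback`;
}

// Exact string comparison: "localhost", another port, or a trailing slash all fail.
function redirectUriMatches(registered, relayPort) {
  return registered === loopbackRedirectUri(relayPort);
}
```

With the default production port, loopbackRedirectUri(9400) returns http://127.0.0.1:9400/oauth/callback, and redirectUriMatches("http://localhost:9400/oauth/callback", 9400) is false.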

Grant types & scopes

Register the IdP app for the Authorization Code grant and Refresh Token grant. The relay requests the scope openid offline_access. openid yields the ID token EMA exchanges for an ID-JAG; offline_access asks the IdP for a refresh token so the relay can silently re-mint the ID token without sending you back through the browser each time it expires.
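To make the grant and scope requirements concrete, here is a sketch of an authorization request for a public client using PKCE. The exact request the relay sends is not shown in this document, so this assumes the standard OAuth 2.0 authorization-code and PKCE (RFC 7636) parameter names; the helper names and the endpoint, client_id, and state values are placeholders.

```javascript
const crypto = require("node:crypto");

// Base64url without padding, as PKCE requires.
function base64Url(buffer) {
  return buffer.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Random code verifier plus its S256 code challenge.
function createPkcePair() {
  const verifier = base64Url(crypto.randomBytes(32));
  const challenge = base64Url(crypto.createHash("sha256").update(verifier).digest());
  return { verifier, challenge };
}

// `authorizationEndpoint` would come from the IdP's discovery document.
function buildAuthorizeUrl({ authorizationEndpoint, clientId, redirectUri, state, challenge }) {
  const url = new URL(authorizationEndpoint);
  url.searchParams.set("response_type", "code"); // Authorization Code grant
  url.searchParams.set("client_id", clientId);
  url.searchParams.set("redirect_uri", redirectUri);
  url.searchParams.set("scope", "openid offline_access");
  url.searchParams.set("state", state);
  url.searchParams.set("code_challenge", challenge);
  url.searchParams.set("code_challenge_method", "S256");
  return url.toString();
}

// Placeholder values for illustration only.
const pkce = createPkcePair();
const authorizeUrl = buildAuthorizeUrl({
  authorizationEndpoint: "https://idp.example.com/oauth2/authorize",
  clientId: "example-client-id",
  redirectUri: "http://127.0.0.1:9400/oauth/callback",
  state: base64Url(crypto.randomBytes(16)),
  challenge: pkce.challenge,
});
```

The verifier stays with the client and is sent only on the later token request; the scope value carries both openid and offline_access in a single space-separated parameter.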

Per-provider setup matrix

For each provider, register a public/native OAuth client (PKCE, no client secret), whitelist the loopback redirect URI above, and enable the Authorization Code + Refresh Token grants with scopes openid offline_access. The provider-specific details follow. Items marked (unverified — confirm in your IdP console) are based on the issuer templates the relay ships; confirm the exact console steps against your tenant.

Okta

Microsoft Entra ID

Google

PingOne / PingFederate

Custom OIDC

Endpoint profiles

Profiles are named subsets of your registered endpoints served under their own MCP URL. Pointing a client at http://localhost:9400/mcp/{profile} exposes only the tools from the servers you added to that profile, so different agents or clients can share one relay without sharing one catalog.

Profile URL shape

Each profile is reachable at http://localhost:9400/mcp/{profile} over Streamable HTTP, and at http://localhost:9400/mcp/sse/{profile} for legacy SSE clients. The unprefixed /mcp and /mcp/sse endpoints continue to serve the union of every enabled endpoint, exactly as before.

Per-profile JS execution and TOON output overrides

Each profile owns its own local_js_execution and toon_output values, independent of the global [relay] defaults. One profile can serve raw JSON to a downstream tool that requires it while another profile keeps the token-efficient TOON encoding on, all from the same relay process.

Server allow-list

A profile maintains an explicit allow-list of endpoint names. Adding an endpoint to a profile makes its tools visible on that profile's URL; removing it (or never adding it) hides them. Endpoints not in any profile are still reachable on the unprefixed /mcp URL. The allow-list is managed through Endara Desktop's Profiles tab or through the management API.
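The effect of the allow-list can be pictured as a filter over the merged catalog. The data shapes here are assumptions made for illustration (a profile as { name, endpoints } and catalog entries as { name, endpoint }); the relay's internal representation is not described in this document.

```javascript
// Hypothetical shapes: profile = { name, endpoints: [...] },
// catalog entry = { name, endpoint }.
function toolsForProfile(catalog, profile) {
  // No profile means the unprefixed /mcp URL: every enabled endpoint is visible.
  if (!profile) return catalog;
  const allowed = new Set(profile.endpoints);
  return catalog.filter((tool) => allowed.has(tool.endpoint));
}

const exampleCatalog = [
  { name: "github__list_repos", endpoint: "github" },
  { name: "linear__list_issues", endpoint: "linear" },
  { name: "filesystem__read_file", endpoint: "filesystem" },
];
const workProfile = { name: "work", endpoints: ["github", "linear"] };
const workTools = toolsForProfile(exampleCatalog, workProfile).map((t) => t.name);
```

Here workTools contains only the github and linear tools, while a request without a profile still sees all three.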

claude_desktop_config.json snippet

Use mcp-remote to bridge a profile URL into stdio clients like Claude Desktop. The Profiles tab in Endara Desktop renders a copyable snippet equivalent to:

{
  "mcpServers": {
    "endara-work": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:9400/mcp/work"]
    }
  }
}

Tool-catalog change notifications

Endara Relay advertises tools.listChanged: true in its initialize response, so spec-compliant MCP clients subscribe to live tool-catalog updates instead of relying on cached descriptions. The relay drives notifications/tools/list_changed from two sources:

Clients that ignore the tools.listChanged capability keep their old behaviour of caching whatever they pulled at initialize time. Clients that honour it refetch tools/list on every event and stay in sync with both upstream and relay-driven mutations automatically.

JS execution engine

When [relay] local_js_execution = true, the relay replaces its full advertised tool catalog with three meta-tools (list_tools, search_tools, and execute_tools) and rejects any direct tool call with the message "Direct tool calls are not allowed in JS execution mode. Use execute_tools instead." The model is expected to look up the tools it actually needs through search_tools, then call them inside a single sandboxed JavaScript program.

How it works

execute_tools runs the supplied script in an embedded boa_engine JavaScript sandbox: entirely in-process, with no Node.js, no require / import / fetch, and no filesystem or network access of its own. The script body is wrapped in (async function() { ... })() so top-level await works. Whatever value you pass to return becomes the meta-tool's result. Each call gets a fresh context, so no state persists between execute_tools invocations.

Sandbox limits

Functions and globals exposed to the script

Tool naming inside the script

Tool keys on the tools object follow the same prefixing scheme as the underlying catalog. Multi-server mode produces prefix__name with a double underscore between prefix and tool name (e.g. github__list_repos). Single-server mode omits the prefix.

Result shape and the safe-handling pattern

Every tools[...] call returns the standard MCP tool result: { content?: [{ type, text }], structuredContent? }. structuredContent is the server's structured output and is preferred. content[0].text is provider-defined prose and is not guaranteed to be JSON: it may be empty, truncated, or natural language. Use this pattern:

const r = await tools["todoist__get-tasks"]({ limit: 5 });
if (r.structuredContent) return r.structuredContent;
const t = r.content && r.content[0] && r.content[0].text;
return typeof t === "string" && /^\s*[\[{]/.test(t) ? JSON.parse(t) : t;

The three meta-tools

list_tools({ limit?, offset? })

Paginated catalog. limit defaults to 50 and is capped at 200. Returns { tools, total, limit, offset }; each tool entry is { name, description, input_schema, annotations? }. Use this when you want to enumerate every tool the relay knows about.
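The pagination contract can be modelled as follows. This is a sketch, not the relay's implementation: the default of 50 and the cap of 200 come from the description above, while the handling of negative or fractional inputs and the choice to echo back the clamped limit are assumptions.

```javascript
// Hypothetical model of list_tools pagination over an in-memory catalog.
function listTools(catalog, { limit = 50, offset = 0 } = {}) {
  // Clamp limit into [0, 200]; treating odd inputs this way is an assumption.
  const effectiveLimit = Math.min(Math.max(Math.trunc(limit), 0), 200);
  const effectiveOffset = Math.max(Math.trunc(offset), 0);
  return {
    tools: catalog.slice(effectiveOffset, effectiveOffset + effectiveLimit),
    total: catalog.length,
    limit: effectiveLimit,
    offset: effectiveOffset,
  };
}

// Walk every page by advancing offset until total is reached.
function collectAllTools(catalog, pageSize = 200) {
  const all = [];
  let offset = 0;
  for (;;) {
    const page = listTools(catalog, { limit: pageSize, offset });
    all.push(...page.tools);
    offset += page.tools.length;
    if (page.tools.length === 0 || offset >= page.total) return all;
  }
}
```

A client enumerating a large catalog would follow the same loop shape as collectAllTools, using the returned total to decide when to stop.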

search_tools({ query, limit? })

Fuzzy ranked search across tool name, description, endpoint name, and input-schema property names. limit defaults to 20 and is capped at 200. Search is case-insensitive and typo-tolerant (Levenshtein), and respects camelCase / snake_case / kebab-case word boundaries. Ranking goes exact > prefix > substring > fuzzy; field weights are name > description > endpoint; tools matching more query tokens rank higher. Returns an array of { name, description, input_schema, annotations? }.
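To illustrate how tiered matching, field weights, and typo tolerance can combine into one ranking, here is a simplified scorer. It is a sketch of the general technique rather than the relay's algorithm: the numeric weights, the edit-distance threshold of 1, and the omission of input-schema property names are all assumptions.

```javascript
// Classic dynamic-programming Levenshtein distance (two-row variant).
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Split on camelCase, snake_case, and kebab-case boundaries, then lowercase.
function words(text) {
  return text
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map((w) => w.toLowerCase());
}

// Tier for one query token against one field: exact 4, prefix 3, substring 2, fuzzy 1.
function tierScore(token, field) {
  let best = 0;
  for (const w of words(field)) {
    if (w === token) best = Math.max(best, 4);
    else if (w.startsWith(token)) best = Math.max(best, 3);
    else if (w.includes(token)) best = Math.max(best, 2);
    else if (levenshtein(w, token) <= 1) best = Math.max(best, 1);
  }
  return best;
}

// Assumed weights expressing name > description > endpoint.
const FIELD_WEIGHTS = { name: 3, description: 2, endpoint: 1 };

function searchTools(catalog, query, limit = 20) {
  const tokens = words(query);
  return catalog
    .map((tool) => {
      let score = 0;
      for (const token of tokens) {
        let bestForToken = 0;
        for (const [field, weight] of Object.entries(FIELD_WEIGHTS)) {
          bestForToken = Math.max(bestForToken, weight * tierScore(token, tool[field] || ""));
        }
        score += bestForToken; // each additional matching token raises the total
      }
      return { tool, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, Math.min(limit, 200))
    .map((entry) => entry.tool);
}
```

Summing the best per-token score is one simple way to make tools that match more query tokens rank higher; a misspelled query such as "isues" still reaches a tool named list_issues through the fuzzy tier.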

execute_tools({ script })

Runs script under the rules above and returns whatever the script returns. Throws if the script throws or exceeds a sandbox limit; the error message is propagated back to the meta-tool caller.

Why this exists — the token-burn problem

A typical desktop client connects to many MCP servers (filesystem, github, slack, jira, todoist, postgres, …). The combined catalog can easily be hundreds of tools with multi-thousand-character JSON schemas attached to each.

In standard MCP mode, every one of those tool definitions is sent to the model on every request. The catalog alone can cost tens of thousands of input tokens per turn just to advertise capabilities the model probably won't use this turn.

JS mode collapses that advertised surface to three tools. The model uses search_tools to look up the handful of tools it needs for the current task, calls them inside a single execute_tools round-trip, and returns only the distilled answer. Two compounding wins:

Worked examples

Example 1 — discover then call:

// The model doesn't know the exact tool name, so it searches first.
const matches = await tools["search_tools"]({ query: "list github issues", limit: 5 });
const m = matches[0];                        // pick the top hit
const r = await tools[m.name]({ repo: "endara-ai/endara-relay", state: "open" });
return r.structuredContent ?? r.content?.[0]?.text;

Example 2 — chain calls in one round-trip:

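const projects = await tools["todoist__get-projects"]({});
const proj = (projects.structuredContent ?? []).find(p => p.name === "Inbox");
// Bail out early instead of throwing a TypeError when the project is missing.
if (!proj) return { error: "Project 'Inbox' not found" };
const tasks = await tools["todoist__get-tasks"]({ project_id: proj.id });
return { projectId: proj.id, tasks: tasks.structuredContent };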

Example 3 — reduce-and-return (the token-burn-reduction pattern):

// Fetch potentially huge data, but only return what the model needs.
const all = await tools["github__list_issues"]({ repo: "endara-ai/endara-relay", state: "open" });
const issues = all.structuredContent ?? [];
// 200 issues -> 5 stale ones with just the fields we care about.
const stale = issues
  .filter(i => Date.now() - new Date(i.updated_at).getTime() > 30 * 86400_000)
  .sort((a, b) => new Date(a.updated_at) - new Date(b.updated_at))
  .slice(0, 5)
  .map(i => ({ number: i.number, title: i.title, updated_at: i.updated_at }));
return { staleCount: stale.length, stale };

Limits to remember

Any single execute_tools call is bounded by the 30-second wall-clock timeout and the 1M-iteration loop cap. Scripts cannot persist state between calls — each invocation starts from scratch. If a tool call inside the script throws, the sandbox surfaces the error message back to the meta-tool caller.

TOON output

Endara Relay re-encodes JSON-shaped tool-call responses to TOON (Token-Oriented Object Notation) before forwarding them to MCP clients. TOON is a text-based, indentation-driven serialization format with tabular array headers that produces roughly 40-60% fewer tokens than JSON on the structured / tabular shapes most MCP tools return, while remaining losslessly round-trippable. Connected models consume fewer context tokens per tool call without losing any data.
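To give a feel for what a tabular array header looks like, the sketch below renders a uniform array of flat objects in a TOON-like form and compares its size with the JSON equivalent. It is a deliberately simplified illustration, not the relay's encoder: it ignores the quoting, escaping, and nesting rules of the real format, assumes a non-empty array whose rows share the same keys, and compares characters rather than tokens.

```javascript
// Simplified illustration of a tabular header followed by one row per object.
// Assumes every row has the same keys and values need no quoting.
function toTabularToon(key, rows) {
  const fields = Object.keys(rows[0] ?? {});
  const header = `${key}[${rows.length}]{${fields.join(",")}}:`;
  const lines = rows.map((row) => "  " + fields.map((f) => String(row[f])).join(","));
  return [header, ...lines].join("\n");
}

const issues = [
  { number: 12, title: "Fix reconnect", state: "open" },
  { number: 15, title: "Add profile docs", state: "open" },
];

const toonText = toTabularToon("issues", issues);
const jsonText = JSON.stringify({ issues });
console.log(toonText);
console.log(`${toonText.length} characters vs ${jsonText.length} for JSON`);
```

The field names appear once in the header instead of once per row, which is where most of the savings on tabular data come from.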

Conversion applies to both the native tools/call path and the relay's list_tools, search_tools, and execute_tools meta-tool responses. Non-JSON text, scalar values, error responses, image / embedded resources, and structured content pass through unchanged. When TOON is on, the search_tools description picks up a one-line TOON hint so models know to parse responses as TOON.

Enabled by default. Set [relay] toon_output = false in config.toml or pass --no-toon to endara-relay start to restore JSON pass-through. Endara Desktop exposes the same toggle as Settings → TOON Output Format; flipping it reloads the relay sidecar in place.

Tool prefixing

Two MCP servers can ship tools with the same name (for example, both a filesystem and a sandbox server might call something read_file). To keep names unique, the relay prefixes every tool it advertises with the endpoint's prefix and a double underscore, e.g. github__list_repos.

The prefix is taken from the endpoint's tool_prefix if set; otherwise it's derived from name by sanitizing to lowercase ASCII (non-ASCII characters are stripped). If sanitization yields an empty string, set tool_prefix explicitly. When the relay is connected to only one underlying server, prefixes are omitted and tools keep their original names.
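
The naming rules above can be sketched roughly as follows. The helper names `derivePrefix` and `advertisedName` are hypothetical, the sanitization step implements only what the prose states (lowercase, drop non-ASCII), and throwing on an empty prefix is an assumption about how the relay reports that case.

```javascript
// Sketch of prefix derivation; the relay's real rules may differ in edge cases.
function derivePrefix(endpoint) {
  if (endpoint.tool_prefix) return endpoint.tool_prefix; // explicit prefix wins
  const prefix = endpoint.name.replace(/[^\x00-\x7F]/g, "").toLowerCase();
  if (prefix === "") {
    // Assumption: an empty derived prefix is treated as a configuration error.
    throw new Error(`endpoint "${endpoint.name}" needs an explicit tool_prefix`);
  }
  return prefix;
}

function advertisedName(endpoint, toolName, endpointCount) {
  // With a single upstream server the original tool name is kept.
  if (endpointCount === 1) return toolName;
  return `${derivePrefix(endpoint)}__${toolName}`;
}

console.log(advertisedName({ name: "GitHub" }, "list_repos", 2)); // github__list_repos
console.log(advertisedName({ name: "GitHub" }, "list_repos", 1)); // list_repos
```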

When the relay derives the advertised server type from an upstream's serverInfo.name, common boilerplate suffixes (-mcp-server, _mcp_server, -mcp, and _mcp) are stripped automatically, so todoist-mcp-server is advertised as todoist and linear-mcp-server as linear. The stripping applies only to upstream-derived names; values supplied via server_type_override are used verbatim.
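
As a rough illustration of the suffix rule, the sketch below strips at most one of the four listed suffixes. `deriveServerType` is a hypothetical name, and case-sensitive matching is an assumption; the relay may normalize differently.

```javascript
// Longer suffixes come first so "-mcp-server" is not left as "-server" after
// a premature "-mcp" match elsewhere in the list.
const SUFFIXES = ["-mcp-server", "_mcp_server", "-mcp", "_mcp"];

function deriveServerType(serverInfoName, serverTypeOverride) {
  // An override is used verbatim, with no stripping.
  if (serverTypeOverride !== undefined) return serverTypeOverride;
  for (const suffix of SUFFIXES) {
    if (serverInfoName.endsWith(suffix)) {
      return serverInfoName.slice(0, -suffix.length);
    }
  }
  return serverInfoName;
}

console.log(deriveServerType("todoist-mcp-server")); // todoist
console.log(deriveServerType("linear-mcp-server"));  // linear
```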

Crash recovery

STDIO adapters are restarted automatically with exponential backoff when the underlying process exits unexpectedly. Each restart resets the adapter to the initializing lifecycle state and then either back to ready on success or to failed with the most recent error exposed on the endpoint entry returned by GET /api/endpoints. SSE / HTTP / OAuth adapters reconnect on transport errors; OAuth tokens are refreshed automatically when an access token nears expiry.
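
The restart loop can be pictured with the sketch below. The base delay, cap, and attempt limit are assumed values chosen for illustration, since this README does not state the relay's actual numbers, and `restartWithBackoff` is a hypothetical function, not the relay's code.

```javascript
// Assumed parameters: 500 ms base delay, doubling per attempt, capped at 30 s.
function backoffDelayMs(attempt, baseMs = 500, capMs = 30_000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// `spawn` is a caller-supplied async function that starts the process and
// rejects if startup fails. `sleep` is injectable so the loop is testable.
async function restartWithBackoff(spawn, options = {}) {
  const maxAttempts = options.maxAttempts ?? 5; // assumed limit
  const sleep = options.sleep ?? (ms => new Promise(resolve => setTimeout(resolve, ms)));
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Each attempt conceptually starts in the "initializing" state.
    try {
      await spawn();
      return { lifecycle: "ready", attempts: attempt + 1 };
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  // The most recent error is what ends up on the endpoint entry.
  return { lifecycle: "failed", error: String(lastError?.message ?? lastError) };
}
```

With these assumed values the waits would be 0.5 s, 1 s, 2 s, 4 s and so on, which keeps a crash-looping server from consuming CPU while still recovering quickly from a one-off crash.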

OAuth refresh additionally self-heals when the stored token_endpoint in config.toml is missing or stale. On a refresh-time 404, the relay re-runs OAuth discovery against the resource URL (RFC 9728 → RFC 8414), caches the discovered endpoint in memory, and retries the refresh once. Subsequent refreshes use the cached endpoint directly. The Re-authorize flow similarly runs RFC 8414 discovery against an endpoint's oauth_server_url before falling back to the conventional {base}/authorize / {base}/token URLs.
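
The refresh self-healing described above amounts to "try, rediscover on 404, retry once". A minimal sketch, assuming two hypothetical injected functions: `postRefresh(endpoint, refreshToken)` performs the token request and resolves to an object with a `status` field, and `discoverTokenEndpoint()` stands in for the RFC 9728 → RFC 8414 discovery chain. Going straight to discovery when no endpoint is configured is also an assumption.

```javascript
// Sketch only: names and shapes are illustrative, not the relay's internals.
function createRefresher({ configuredTokenEndpoint, postRefresh, discoverTokenEndpoint }) {
  let cachedEndpoint = null; // held in memory only, never written to config.toml

  return async function refresh(refreshToken) {
    const endpoint = cachedEndpoint ?? configuredTokenEndpoint;
    if (endpoint) {
      const first = await postRefresh(endpoint, refreshToken);
      if (first.status !== 404) return first; // success or an unrelated error
    }
    // Endpoint missing or stale: rediscover, cache, and retry exactly once.
    cachedEndpoint = await discoverTokenEndpoint();
    return postRefresh(cachedEndpoint, refreshToken);
  };
}
```

Limiting the retry to one attempt avoids looping when discovery itself returns a bad endpoint; the second response is returned to the caller whatever its status.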

Upstream MCP servers are always advertised to connected MCP clients (Claude Desktop, Cursor, Windsurf, etc.) once they are registered, not only while their adapter is Healthy. Many clients cache tool descriptions and rarely refresh them, so a brief blip that flips a server to Failed or Starting no longer drops it from the model's view. Configured server_type_override values are visible even before the first successful handshake.

File locations

All paths are under the data directory (default ~/.endara):

Known limitations

The relay is under active development. The items below are intentional gaps in current behaviour — not bugs — and are tracked for a future release.

Resources and prompts are not proxied

The relay aggregates MCP tools today. The other two MCP primitives — resources (resources/list, resources/read) and prompts (prompts/list, prompts/get) — are not forwarded from upstream servers through the relay yet. A client that issues those requests against the relay will receive an empty list (or, for resources/read and prompts/get, a method-not-found / unknown-URI style error) even when an upstream server would normally answer them.

This is deferred because cross-endpoint namespacing for resource URIs and prompt names still needs design work. Proxying both is on the roadmap, but there is no committed timeline yet.

Troubleshooting

Port 9400 already in use (EADDRINUSE)

Another endara-relay instance is already listening, or you have both Endara Desktop's bundled relay and a separately installed CLI relay running. Stop one of them, or pass --port to use a different port. See Desktop troubleshooting for how Desktop handles the same conflict.

Endpoint stuck in failed state

Inspect GET /api/endpoints/{name}/logs for recent adapter output and ~/.endara/logs/relay.log.<YYYY-MM-DD> for the relay's own log. Common causes: a missing command, a server that needs an env var that wasn't set, or an OAuth flow that hasn't been completed.

Environment variable resolution failure

If a $VAR reference in env or headers isn't set, the affected endpoint is registered as a failed adapter. Set the variable in the relay's process environment (or in your shell profile if you launch the relay from the shell) and the next reload picks it up. Use $$ to emit a literal dollar sign.
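
For intuition, the substitution rule can be sketched as below. `resolveVars` is a hypothetical helper, and the assumed variable syntax ($ followed by letters, digits, and underscores, not starting with a digit) may not match the relay's parser exactly.

```javascript
// Sketch: replace $NAME with env[NAME], and $$ with a literal "$".
function resolveVars(template, env) {
  return template.replace(/\$\$|\$([A-Za-z_][A-Za-z0-9_]*)/g, (match, name) => {
    if (match === "$$") return "$";
    if (!Object.prototype.hasOwnProperty.call(env, name)) {
      // In the relay, this case marks the endpoint as a failed adapter.
      throw new Error(`environment variable ${name} is not set`);
    }
    return env[name];
  });
}

console.log(resolveVars("Bearer $API_TOKEN", { API_TOKEN: "abc123" })); // Bearer abc123
console.log(resolveVars("price: $$5", {}));                             // price: $5
```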

Tool name collisions

If two endpoints derive the same prefix from their name (for example GitHub and github, which both sanitize to github), set an explicit tool_prefix on one of them. Endpoints with an identical name cannot collide this way, because validation rejects duplicate names outright.