MCP Servers vs REST APIs: Why AI Agents Need a Different Kind of Interface
REST APIs are the backbone of the modern web β so why did the AI world invent something new? A deep dive into why MCP exists, and what problems it solves that REST simply wasn\'t built for.
If you've spent any time in the AI developer ecosystem recently, you've probably heard the term MCP β Model Context Protocol. It's an open standard introduced by Anthropic that lets AI models like Claude connect to external tools and data sources in a structured, consistent way. In a short period, an ecosystem of MCP servers has emerged covering everything from Google Drive to GitHub to Slack to local filesystems.
But here's the question developers always ask: we already have REST APIs β why do we need MCP at all?
It's a completely fair question. REST is everywhere. It's well understood, battle-tested, and supported by virtually every language, framework, and platform on the planet. Billions of API calls happen over REST every day. So what does MCP actually add, and why should you care?
The answer lies in a fundamental mismatch between how REST was designed and how AI agents actually work. To understand it properly, we need to look at the assumptions each model makes about the caller β and why those assumptions matter enormously once you put an AI in the loop.
What REST Was Built For
REST (Representational State Transfer) was conceived by Roy Fielding in his 2000 doctoral dissertation as a set of architectural principles for building web services. It arrived at exactly the right moment: the web was exploding, services needed to communicate over HTTP, and REST gave developers a clean mental model for doing so.
The key assumptions baked into REST are rarely stated explicitly, but they're there:
- The caller knows what it wants. A REST client issues a specific, deliberate request:
GET /users/42,POST /orders,DELETE /files/abc123. The endpoint, method, and parameters are all chosen by a developer who already understands the domain and has a specific outcome in mind. - Schemas are documented for humans, not machines. REST APIs are described in human-readable documentation β OpenAPI specs, Swagger UIs, README files β that developers read, understand, and then write code against. The documentation is a one-time input to the development process, not something consumed at runtime.
- Interaction is stateless and transactional. Each HTTP request is self-contained. The server doesn't know or care what requests came before. State lives on the client. This is by design: statelessness makes REST services easier to scale and reason about.
- Errors are handled by the developer. When an API returns a 422 or a bespoke error object, a developer reads the docs to understand what went wrong and writes code to handle it. The handling logic is predetermined and explicit.
These assumptions are perfectly suited to deterministic systems. A payment processor, a user profile service, a weather API β these fit cleanly into REST because a human engineer has already decided exactly what data flows where, when, and in response to what. The intelligence sitting behind the client is human and operates before runtime, not during it.
What AI Agents Actually Need
An AI agent operates in a fundamentally different way. It doesn't arrive with a predetermined plan. It reasons about a goal, decides dynamically which tools might help, calls them, interprets the results in context, and adapts its next step accordingly. The intelligence is in the loop at every stage β not just at design time.
This creates a set of requirements that REST was never designed to meet.
1. Discovery at Runtime, Not Design Time
When a developer builds a REST client, they've read the API documentation. They know the endpoints, the request shapes, the response formats. That knowledge is baked into the code they write. The API itself doesn't need to explain what it can do at runtime β the developer already knows, because they read the docs last Tuesday.
An AI agent has no developer in the loop at runtime. It can't read documentation in advance for every possible tool it might encounter. It needs to discover what a tool can do at the moment it encounters it β dynamically, from the tool itself, while a task is already underway.
MCP solves this with a mandatory tools/list method. When a host (like Claude.ai) connects to an MCP server, the first thing it does is call tools/list to receive a structured manifest of everything the server can do. Each tool in the response includes a name, a human-readable description, and a JSON Schema describing its input parameters and their types. The model reads this manifest and reasons about which tools are relevant to the current task β dynamically, at inference time.
This is discovery as a first-class protocol primitive. With REST, you'd need to either hardcode every tool schema into the model's context window (bloating every prompt and requiring constant maintenance), or build a bespoke discovery layer yourself. MCP makes it automatic and consistent across every server in the ecosystem.
2. A Shared Vocabulary for Tool Calls
Every REST API is different. Pagination is implemented a dozen ways β cursor-based, offset-based, page-number-based, Link headers. Authentication might be Bearer tokens, API keys in headers, API keys in query strings, or OAuth flows. Error formats are wildly inconsistent: some APIs return {"error": "not found"}, others return {"code": 404, "message": "..."}, others use RFC 7807 Problem Details. Date formats, null handling, nested vs flat structures β every API makes its own choices.
For a human developer, this inconsistency is annoying but manageable. You read the docs for each API, write handling code for each case, and do it once. It's upfront work, not runtime work.
For an AI model generating tool calls dynamically, inconsistency is a much more serious problem. Every inconsistency is a potential point of failure β a response format the model misreads, an error code it doesn't recognise, a parameter name it gets subtly wrong. The model isn't reading docs; it's reasoning in real time, and variability across interfaces degrades reliability in ways that are hard to debug and easy to underestimate.
MCP defines a standard JSON-RPC 2.0-based protocol. Every MCP server, regardless of what underlying system it connects to, looks the same to the model. Tool calls always have the same structure. Results are always wrapped in the same envelope. Errors always use the same format with standardised error codes. The model learns one protocol and can reliably work with any MCP server, whether it wraps a REST API, a PostgreSQL database, a file system, or a local subprocess.
3. Session State Across a Multi-Step Workflow
This is one of the most important β and least discussed β differences between REST and MCP, and it's worth unpacking in detail.
REST is stateless by design. That's a deliberate architectural choice that makes REST services easy to scale horizontally: any server in a cluster can handle any request, because no request depends on what came before. Every request is a fresh transaction. The trade-off is that the client must send all the context needed to process each request, every single time.
For an AI agent working through a multi-step task, this stateless model creates real friction. Consider an agent asked to summarise a 50-page report in Google Drive, extract the action items, and create tasks in a project management tool. This involves a chain of tool calls where results from earlier steps inform later ones. The agent fetches the document, processes it, decides which sections are relevant, extracts action items, then creates tasks one by one, perhaps going back to the document to clarify ambiguity.
MCP is built around a persistent session model. When a host connects to an MCP server, it establishes a session β a long-lived, stateful connection that persists across many individual tool calls within a workflow. This has concrete technical consequences:
- The server can maintain context between calls. If you fetch a large document in step one and need to query specific sections in steps three and seven, the server doesn't have to re-fetch and re-parse the document each time. It can hold it in session memory and respond to partial queries against the cached content. This is both faster and cheaper β particularly relevant for large documents or expensive API calls upstream.
- Authentication happens once per session. Rather than re-authenticating on every request or passing credentials redundantly with every call, the session handshake handles auth once during connection establishment. All subsequent tool calls within the session are already authenticated. For OAuth-based services where token refresh is involved, this is a meaningful simplification.
- Streaming and progressive results become natural. Because the connection persists, MCP servers can push results incrementally rather than forcing the client to wait for a complete response or poll a separate status endpoint. If a tool call triggers a long-running operation β scanning a large codebase, processing a batch of files, running a search over a large dataset β the server streams updates as they become available. The host can show the model (and the user) progress in real time.
- Server-side resources can be scoped to the session. An MCP server can open a database connection, load a large embedding model, or establish a WebSocket to a third-party service once at session start, then reuse it across all tool calls in that session. This is dramatically more efficient than a REST model where each request is effectively a cold start with connection overhead.
Concretely, MCP sessions are implemented over one of two transports. For web-based deployments, the standard is Server-Sent Events (SSE): the client makes a GET request to establish a persistent event stream, then sends tool call requests via POST to a session-specific endpoint. The server pushes responses, progress updates, and notifications back through the SSE stream. For local deployments β an MCP server running as a sidecar process alongside the host application β communication happens over stdio: the host writes JSON-RPC messages to the server's stdin and reads responses from stdout. In both cases, the session lives until either side closes the connection intentionally, or the connection drops. This is a meaningfully different transport model from REST's request-response pattern β designed for the ongoing, stateful interaction that agentic workflows require.
4. Resources: Exposing Data Separately From Actions
REST conflates data retrieval and actions into a single endpoint model. A GET /documents/42 retrieves data; a POST /documents creates it. The distinction between "give me data" and "do something with side effects" is expressed implicitly through HTTP methods β conventions that a REST API might not even follow consistently.
MCP separates these concerns explicitly at the protocol level. In addition to tools β actions the model can invoke, which may have side effects β MCP servers can expose resources. Resources are structured data items the model can read directly: files, database records, configuration values, API responses that have been pre-fetched. They have URIs, MIME types, and content; they're designed to be injected into the model's context rather than processed through a tool invocation.
Why does this separation matter? Because reading a file and deleting a record are fundamentally different kinds of operations in terms of risk. Reading a file has no side effects β it's safe to do speculatively, to cache, to include in context without asking the user. Deleting a record is irreversible. By giving hosts a protocol-level way to distinguish resources (safe to read) from tools (may have side effects), MCP enables intelligent decisions about what requires user approval and what doesn't β something that requires convention and guesswork with REST.
5. Safety, Permissioning, and Human-in-the-Loop Flows
When an AI agent is about to take a consequential action β sending an email, deleting a file, making a purchase, posting publicly β there needs to be a clear, structured way to communicate what the action does and what its consequences are. This is critical for user trust, and for preventing the class of "confident but wrong" mistakes that make people nervous about autonomous agents.
REST APIs don't have a standard mechanism for this. Some document their side effects well; most don't. There's no protocol-level concept of "this operation is irreversible" or "this action affects people outside this session" or "this will cost money". Developers building agents on REST have to handle this ad hoc, and they inevitably handle it inconsistently.
MCP tool definitions include structured, machine-readable descriptions β names, parameter descriptions, and implicit signals about consequence β that both the model and the host UI can use. Because every tool call goes through the standardised MCP protocol, the host can intercept and gate them consistently. A host can present a confirmation UI for operations flagged as destructive, log all tool calls for audit purposes, enforce rate limits on expensive operations, or route specific actions through an approval workflow requiring explicit user sign-off. None of this requires bespoke integration logic for each tool β the protocol gives hosts the structural hooks they need.
6. Prompts: Encoding How to Use the Server, Not Just What It Can Do
MCP introduces a third primitive alongside tools and resources: prompts. A prompt is a reusable, parameterised interaction template β a structured way for an MCP server to expose recommended patterns for using its capabilities.
For example, a code review MCP server might expose a prompt called review_pull_request that takes a PR URL and returns a structured prompt asking the model to evaluate security vulnerabilities, performance implications, test coverage, and code style. The host can surface these prompts in its UI β Claude.ai shows MCP prompts as slash commands in the input box β making server-defined workflows immediately discoverable to end users without any additional developer work on the host side.
This addresses a real gap. REST tells you what endpoints exist. OpenAPI tells you how to call them. But neither tells you the right sequence of calls for a given task, the right context to provide, or the right way to interpret results for a particular use case. Prompts are MCP's answer: a way for server authors to encode their domain expertise about how their capabilities should be used, and expose that knowledge to both AI models and end users directly through the host interface.
7. Sampling: Bidirectional AI Calls
Perhaps the most architecturally novel feature of MCP β and the one with the most significant long-term implications β is sampling: the ability for an MCP server to request a completion from the host model as part of its own processing.
In a standard REST model, data flow is unidirectional: the client calls the server, the server returns data. The server is passive. In MCP with sampling, the flow can be genuinely bidirectional. An MCP server handling a complex tool call can pause its processing, send a sampling/createMessage request back to the host asking the model to reason about something β "given what I've retrieved so far, which of these three approaches should I pursue?" β receive the model's response, incorporate it into its own logic, and then return a final result to the original tool call.
This enables a class of server-side intelligence that has no equivalent in REST. An MCP server wrapping a document search system could use sampling to ask the model to reformulate a query when initial results are poor, then run the reformulated query automatically before returning results. An MCP server connected to a monitoring system could use sampling to classify an anomaly before deciding which alert level to apply. An MCP server acting as a code execution environment could use sampling to ask the model to review the output of a script before deciding whether to proceed with the next step. The server isn't just a passive data provider β it's an active participant in the reasoning process.
Critically, sampling requests from servers flow back through the host, which means the host β and the user β retains full control. The host can inspect, approve, modify, or reject sampling requests, maintaining the human-in-the-loop principle even for server-side AI calls. This is a deliberate design choice: MCP doesn't allow servers to silently spin up their own model calls outside the host's awareness. Every model call, wherever it originates, goes through the host.
So Why Not Just Build a Better REST Standard?
You could argue that these problems could be solved by layering conventions on top of REST. Standardise discovery via OpenAPI extensions. Mandate consistent error formats via RFC 7807. Use WebSockets or SSE for persistent connections. Add a metadata endpoint that documents side effects. Many teams have built exactly this kind of bespoke framework, and some of them work quite well.
The problem is coordination and fragmentation. Every team that builds its own "REST-for-agents" framework makes different choices. The model faces a world of subtly different discovery mechanisms, connection models, and error formats. You've improved on vanilla REST, but you haven't solved the core problem: the model still needs to reason about interface variability at runtime, and variability degrades reliability.
MCP takes a different approach: rather than trying to fix all REST APIs, it defines a thin, consistent adapter layer. MCP servers can wrap any underlying system β a REST API, a database, a message queue, a local binary β and expose it to AI models in a predictable, uniform way. The underlying REST API doesn't change; the MCP server translates it into a form the model can work with reliably.
This is why MCP adoption has accelerated quickly. Developers aren't rewriting their backends. They're writing MCP servers β often a few hundred lines of TypeScript or Python using the official SDK β that wrap their existing services and expose a curated subset of capabilities to AI agents. The cost is low; the benefit is that any MCP-compatible host can immediately use those capabilities, with no bespoke integration work on the host side.
What MCP Doesn't Replace
It's worth being precise about scope. MCP is not a replacement for REST in general. It's a specific protocol for AI agent-to-tool communication β a narrow but increasingly important slice of the overall integration landscape. Your mobile app should still talk to your backend over REST. Your microservices should communicate via REST or gRPC or whatever makes sense for the latency and throughput requirements. Your data pipelines should use the right tool for the volume and batch characteristics involved.
MCP sits at a specific integration point: between an AI model and the external systems it needs to interact with on a user's behalf, in real time, as part of an ongoing reasoning process. It's optimised for that use case and that use case only. Using MCP for general service-to-service communication would be like using GraphQL subscriptions for a simple one-off data export β technically possible, completely the wrong tool.
The Analogy That Helps
REST APIs are like instruction manuals β detailed, comprehensive, and perfectly useful if you already know what you're building and have time to read them carefully before you start. MCP is like a knowledgeable colleague sitting next to the AI during the task: they can answer "what can you do?", hand over the right context mid-workflow, handle back-and-forth across a project that spans multiple steps, flag when something consequential is about to happen, and occasionally ask the AI's opinion before proceeding.
You wouldn't hand a new employee a 400-page API reference and expect them to navigate it in real time while a customer is waiting for an answer. You'd give them a colleague who speaks their language, knows the system, and can guide the interaction moment to moment. That's the role MCP plays for AI agents.
The Bottom Line
REST APIs aren't going anywhere β they're the right tool for deterministic, developer-written integrations, and the vast majority of software will continue to be built on them. But AI agents aren't deterministic, and there's no developer making decisions call by call at inference time.
MCP fills a genuine architectural gap: a protocol designed from first principles for how language models actually work β dynamic, contextual, tool-using, session-aware, and increasingly autonomous. It gives agents a consistent interface to the world, handles the session and state management that multi-step workflows require, separates safe read operations from consequential actions, and builds safety and transparency into the integration layer rather than leaving them as afterthoughts.
If you're building anything that involves AI agents interacting with external systems, understanding MCP isn't optional any more. It's the architectural decision that determines whether your agent fumbles through bespoke, brittle integrations or operates with the reliability and predictability of a skilled professional using tools built for the job.