History and Evolution of Model Context Protocol (MCP) Servers


What is the Model Context Protocol (MCP)?

The Model Context Protocol (MCP) is an open standard designed to connect large language models (LLMs) with external tools, data sources, and systems in a consistent way. In essence, MCP serves as a bridge between AI agents and their environment, providing a standardized interface for AI models to access functions (tools), data (resources), or even preset prompts outside their built-in knowledge. One official description likens MCP to a “USB-C port for AI applications” – a universal connector that lets AI models plug into various peripherals (APIs, databases, file systems, etc.) through a common protocol (Anthropic MCP Overview).

How MCP Works (High-Level):
MCP defines a simple client-server architecture within an AI application. The AI application (often called the Host) runs an MCP client component that interfaces with an external MCP server. The MCP server is essentially an external program or service that offers a set of tools, data, or prompts according to the MCP specification. When an AI agent is running (for example, a chatbot or an IDE assistant), it can query the MCP server for available functions and data. These functions (called Tools in MCP terminology) are actions the LLM can request – e.g., “call an API to get weather data” or “read from a database” – and the MCP server will execute that action and return results. MCP servers can also provide read-only Resources (data the model can fetch, like documents or query results) and predefined Prompts (templates or instructions to help the model perform certain tasks). All of this is done through a standardized messaging format, so the AI agent doesn’t need to know the low-level details of each tool or data source.

In summary, an MCP server acts as a middleware layer that exposes external functionalities to the AI in a model-friendly way. The AI model can discover what “skills” or endpoints the server offers, then call those skills through the MCP protocol. The server executes the request (e.g., actually fetching the data or performing the action) and returns the result, which the AI model can incorporate into its context or response. This allows the AI agent to go beyond its training data and interact with the real world in a controlled, extensible manner.
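
To make the “standardized messaging format” concrete, the sketch below shows roughly what a single tool call looks like on the wire, expressed as Python dictionaries. The JSON-RPC 2.0 framing and the tools/call method name come from the MCP specification; the get_weather tool, its arguments, and the response text are hypothetical.

    import json

    # Simplified sketch of the JSON-RPC 2.0 messages exchanged for one tool call.
    # The method name "tools/call" is from the MCP spec; the weather tool is made up.
    request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": "get_weather", "arguments": {"location": "Berlin"}},
    }

    # A successful result carries the tool output back to the client as content blocks.
    response = {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"content": [{"type": "text", "text": "18°C, partly cloudy"}]},
    }

    print(json.dumps(request, indent=2))
    print(json.dumps(response, indent=2))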


Why Was MCP Created? (Motivations and Origins)

MCP emerged to solve a growing pain point in the AI community: how to give AI agents standardized access to the vast array of tools and information sources they might need. Before MCP, integrating an LLM with external tools or data was often an ad-hoc, one-off process. Developers had to write custom code or plugins for each specific integration – for example, custom code for a Google Drive API, another for a database, another for a calendar – each with its own authentication and data formatting quirks. This led to a “messy, M×N integration problem” where M different AI agents or applications and N different tools would require M×N bespoke integrations (Philschmid MCP Introduction). Not only was this inefficient (duplicating effort across teams and projects), it was also brittle and hard to scale.

Anthropic introduced MCP in late 2024 specifically to tackle this fragmentation (Anthropic Launch Announcement). The idea was to define one open protocol so that tool creators could implement an MCP-compatible wrapper for their service (building an MCP server), and AI platform developers could implement a generic MCP client in their agents. With this arrangement, any AI agent that speaks MCP could connect to any tool or database that provides an MCP interface – turning the integration problem into an M+N scale instead of M×N. In other words, MCP aimed to do for AI-tool integrations what USB did for device connectivity: eliminate the need for custom adapters by agreeing on a universal interface. This allows AI assistants to dynamically tap into new data sources or actions by simply knowing the MCP protocol, without requiring bespoke code for each new integration.

Another motivation was to support more interactive and dynamic tool use by AI agents. Earlier approaches, like the plugin mechanism in ChatGPT, allowed models to call external APIs defined by OpenAPI schemas. However, those were often one-shot calls (the model asks a plugin for some info and gets an answer) and were tightly controlled by specific platforms (e.g., OpenAI’s plugin ecosystem). MCP, being open-source and platform-agnostic, was designed to enable richer two-way interactions: an AI agent can maintain a dialogue or ongoing session with a tool service via streaming, get incremental updates, handle function outputs in multiple steps, etc. This two-way, stateful capability is important for “agentic” behavior – where the AI might call a tool, get partial data, and then decide on the next action iteratively. MCP’s creators envisioned a universal, open integration layer that any AI developer or organization could adopt, rather than each company inventing its own proprietary plugins or tool APIs.

In summary, MCP was created to standardize tool access for AI, reducing integration effort and enabling AI agents to seamlessly augment their knowledge or actions with external capabilities. It came from the realization that scaling AI usefulness required going beyond siloed models – and doing so in a collaborative way across the industry. The goal was to let everyone “speak the same language” when hooking up AI to tools, much like common web protocols did for connecting different systems on the internet.


MCP Architecture and Components

MCP follows a client–server model within an AI agent’s ecosystem. The key components and their roles are:

  • Host Application (Agent Interface):

    • The main AI application or agent that the end-user interacts with (e.g., a chat interface like Claude or ChatGPT, an IDE assistant, or a custom AI agent in an app).
    • The host is where the AI model resides and reasons (the LLM itself) and where results are ultimately displayed or used.
  • MCP Client:

    • The host includes an MCP client library or module which manages the connection to one or more MCP servers.
    • The client is responsible for communication: establishing handshakes, sending requests on behalf of the AI, and receiving streamed responses.
    • Typically there is a 1:1 relationship between an MCP client instance and a specific MCP server.
  • MCP Server:

    • An external program or service that exposes tools, data, or prompts to the AI agent via the MCP protocol.
    • Think of the server as a connector to some external system – it could be a wrapper around a REST API, a database interface, a filesystem handler, etc.
    • The server advertises what capabilities it offers and follows the MCP spec for exchanging messages (requests, results, errors) with the client.
  • Tools:

    • In MCP, a Tool is a function or operation exposed by the server that the LLM can invoke. Tools usually perform an action or computation (e.g., search(query), send_email(to, body)).
    • These are model-controlled functions, meaning the AI agent decides when to call them as part of its reasoning.
    • Tools are described to the AI (with a name, description, input/output schema), often being translated into function-call format for the LLM.
  • Resources:

    • A Resource is a data source that the server can provide to the model, typically on request (e.g., database://customers).
    • These are application-controlled (read-only endpoints that supply context or knowledge to the model).
    • An AI agent might retrieve a resource at the start of a session (e.g., “load the user’s profile data”) to inform the conversation.
  • Prompts:

    • A Prompt in MCP is a predefined prompt template or guideline that can be provided to the model to help it use a tool or resource effectively.
    • These are user or developer-controlled hints (e.g., a template on how to perform a code review).
    • The AI agent can query the server for available prompt templates and insert them into its context when appropriate.

Example Workflow:

  1. A developer writes a GitHub MCP server that exposes tools like list_issues(repo) or commit_code(diff).
  2. An AI application (host) with an MCP client connects to that server and discovers its tools.
  3. The AI model, when asked “Please commit these changes to my repo,” invokes the commit_code tool via the MCP client.
  4. The MCP server executes the GitHub API calls under the hood and returns success or failure.
  5. The AI model continues the conversation, now informed of the commit status.

This separation ensures the model doesn’t need to know how GitHub works; it only knows it can call commit_code. Conversely, the tool provider doesn’t need to know the LLM’s internal logic; it simply executes the request and returns results in MCP’s format.
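
As a minimal sketch of what such a server could look like, the example below uses the official MCP Python SDK’s FastMCP helper. The tool names mirror the workflow above, and the server also exposes a Resource and a Prompt as described earlier; the resource URI, the prompt wording, and all return values are placeholders rather than a real GitHub integration.

    from mcp.server.fastmcp import FastMCP

    # Hypothetical GitHub connector; a real server would call the GitHub REST API.
    mcp = FastMCP("github")

    @mcp.tool()
    def list_issues(repo: str) -> list[str]:
        """Return open issue titles for a repository."""
        return [f"Example issue in {repo}"]  # stubbed result

    @mcp.tool()
    def commit_code(diff: str) -> str:
        """Apply a diff and create a commit; return the resulting status."""
        return "commit created (stub)"

    @mcp.resource("github://{repo}/readme")
    def readme(repo: str) -> str:
        """Expose a repository README as a read-only Resource."""
        return f"# {repo}\n(placeholder README)"

    @mcp.prompt()
    def review_changes(diff: str) -> str:
        """A reusable Prompt template guiding the model through a code review."""
        return f"Review the following diff for bugs and style issues:\n\n{diff}"

    if __name__ == "__main__":
        mcp.run()  # stdio transport by default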


How an MCP Exchange Works (Handshake to Tool Invocation)

When an AI agent uses MCP, the interaction between the MCP client and server follows a well-defined sequence:

  1. Initialization & Handshake:

    • When the host application (agent) starts up, it creates one or more MCP client connections to the available servers.
    • The client and server perform a handshake to exchange information about protocol versions, authentication (e.g., OAuth 2.1 for remote servers), and capabilities.
    • Both sides confirm they support a compatible MCP version and negotiate any optional features.
  2. Capability Discovery:

    • The MCP client asks the server to list all available Tools, Resources, and Prompts.
    • The server responds with a catalog, including names, descriptions, and input/output schemas for each tool or resource.
    • The AI agent now knows what external actions and data are available.
  3. Context Provisioning:

    • The host may fetch certain Resources immediately (e.g., user profile, documents) to give the model contextual information.
    • The host might also load relevant Prompt templates for tool usage.
    • At this point, the agent is prepared for interaction, having incorporated initial external context.
  4. Tool Invocation by the AI:

    • During user interaction, the AI model decides to invoke a tool (e.g., get_weather(location)).
    • The MCP client sends a tool invocation request to the MCP server with the tool name and arguments.
    • The request is usually in a structured schema (e.g., JSON-RPC).
  5. Execution on Server:

    • The MCP server receives the request, executes the underlying logic (calling external APIs, running computations, etc.), and possibly streams partial results.
    • The server can send intermediate events if the operation is long-running (e.g., streaming database query results).
  6. Result Streaming & Response:

    • The server streams back events or a final result to the MCP client.
    • The client passes these messages to the host application.
  7. Incorporation & Continuation:

    • The host delivers the result(s) to the AI model. If in a chat interface, the result might be inserted into the model’s context.
    • The LLM continues generating a response to the user, now enriched by the tool output (or handles errors if the tool failed).

From the AI’s perspective, it’s like calling a function and receiving its return value. MCP abstracts away the complexities of authentication, API calls, and data formatting, allowing the model to focus on reasoning.
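
On the client side, the flow above maps to just a few calls. The sketch below uses the official MCP Python SDK over the stdio transport; weather_server.py, the get_weather tool, and its arguments are hypothetical stand-ins.

    import asyncio

    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    # Launch a hypothetical local server as a subprocess and talk to it over stdio.
    server = StdioServerParameters(command="python", args=["weather_server.py"])

    async def main() -> None:
        async with stdio_client(server) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()          # step 1: handshake
                tools = await session.list_tools()  # step 2: capability discovery
                print([tool.name for tool in tools.tools])
                result = await session.call_tool(   # steps 4-7: invoke and collect result
                    "get_weather", arguments={"location": "Berlin"}
                )
                print(result.content)

    asyncio.run(main())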


Technical Evolution of MCP: From SSE to Streamable HTTP

The MCP specification has evolved since late 2024, especially in how the client and server communicate (the transport layer). Early MCP versions defined two primary transport mechanisms:

  1. Standard I/O (stdio) – Local:

    • The MCP server can run as a local subprocess of the host application, communicating via standard input/output streams.
    • This is simple and efficient when the agent and tool are on the same machine (e.g., a local filesystem tool).
    • The client launches the server program and exchanges JSON messages over the stdin/stdout pipe.
  2. HTTP + SSE (Remote):

    • MCP originally used HTTP with Server-Sent Events (SSE) for remote or long-running tools.
    • The MCP client initiates an HTTP connection, and after an initial handshake, the server streams events back to the client via SSE.
    • SSE is a web standard for one-way server-to-client streaming, letting the server push partial results or progress updates (e.g., streaming database query results).

However, SSE had limitations (e.g., only one-way streaming, connection management complexities). In March 2025, the MCP spec introduced a new Streamable HTTP transport:

  • Streamable HTTP:
    • Uses HTTP chunked responses (and can use SSE under the hood) to allow two-way streaming.
    • Each tool invocation is a standard HTTP request where the response is a stream of chunks or events.
    • Simplifies firewall/NAT traversal and aligns with how modern web APIs handle streaming (e.g., OpenAI’s streaming completions).

The updated spec also added JSON-RPC batching, allowing clients to send multiple requests in one payload and receive combined results, which improves efficiency when fetching multiple data points in parallel. Security was also tightened: when remote servers implement authorization, the spec requires it to follow OAuth 2.1.

As of mid-2025, MCP servers generally fall into three transport categories:

Transport Type           | Description
Standard I/O (stdio)     | Local subprocess communication via stdin/stdout. Used for local plugins/tools; low latency and simple.
HTTP + SSE (legacy)      | HTTP with Server-Sent Events for streaming responses. One-way server→client streaming; being supplanted by Streamable HTTP.
Streamable HTTP (modern) | HTTP requests with chunked, two-way streaming plus JSON-RPC batching. More robust and compatible with web standards.

Developers can choose the mode that fits their use case (some servers support multiple modes). Backward compatibility is maintained via proxies that wrap SSE servers and present them as Streamable HTTP servers.
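
As a small illustration of choosing a mode, a server built with the official Python SDK selects its transport at startup. The transport identifiers below (“sse”, “streamable-http”) are those used by recent releases of that SDK and are an assumption here; other SDKs or older versions may name them differently.

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("demo")
    # ... @mcp.tool() definitions omitted ...

    if __name__ == "__main__":
        # Pick the transport that matches the deployment (see the table above).
        mcp.run(transport="stdio")               # local subprocess via stdin/stdout
        # mcp.run(transport="sse")               # legacy HTTP + SSE
        # mcp.run(transport="streamable-http")   # modern Streamable HTTP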


Adoption by AI Providers and the MCP Ecosystem

After its launch by Anthropic, MCP gained rapid adoption across the AI industry. Below is a timeline of major adopters and how they implemented MCP:

  • Anthropic (Claude):

    • MCP was born in the Anthropic ecosystem, with the initial spec and reference code released in late 2024 (Anthropic MCP Announcement).
    • Claude Desktop (Anthropic’s desktop app) uses a local MCP setup to let Claude read/write files on your machine securely.
    • Anthropic built an MCP Connector into their cloud API, so developers can connect Claude to remote MCP servers directly (Anthropic API Docs).
    • Developers can instruct Claude to use tools defined by an MCP server without custom tool-invocation logic—the Anthropic API handles the MCP protocol.
  • OpenAI:

    • Initially, OpenAI’s tool approach ran through ChatGPT Plugins (defined with OpenAPI schemas) and native function calling. In March 2025, OpenAI announced MCP support in its Agents SDK.
    • The OpenAI Agents SDK includes an MCP client interface, allowing GPT-4–based agents to connect to any MCP server.
    • The SDK recognizes all three transport types (stdio, SSE, Streamable HTTP).
    • MCP tools can appear to GPT models as “functions” to call; under the hood, the SDK routes these through MCP.
    • This bridges MCP into OpenAI’s existing function-calling framework and enables agents to leverage the community’s library of MCP servers.
  • Microsoft:

    • In May 2025, Microsoft introduced MCP support (Preview) in its Copilot Studio platform (Microsoft Blog announcement).
    • Copilot Studio is a toolkit for building enterprise “copilots”; MCP support allows these copilots to connect to external tools through a standard interface.
    • Copilot Studio initially supports the SSE transport for remote tool servers; Streamable HTTP support is expected soon.
    • This integration enables Microsoft AI agents to use third-party MCP servers (e.g., Salesforce connector, custom internal database tools) without writing new glue code.
  • Google:

    • Google took a complementary path, introducing the Agent2Agent (A2A) protocol in April 2025 for inter-agent communication. A2A is designed to complement, not replace, MCP.
    • Google explicitly positions A2A for agent-to-agent messaging while endorsing MCP as the standard for agent-to-tool interactions (Google A2A Announcement).
    • Google Cloud’s Generative AI App Builder and future Gemini integrations are expected to support MCP servers, enabling agents on Google’s platform to use existing MCP tools.
    • By backing MCP, Google ensures platform interoperability and avoids fracturing the ecosystem.
  • AWS and Others:

    • In May 2025, AWS joined the MCP steering committee and integrated MCP support into its AI offerings.
    • AWS released an open-source agent framework called Strands that works with MCP servers (and upcoming A2A support).
    • AWS engineers noted MCP’s design can support agent-to-agent communication and are collaborating on extending the spec with projects like LangGraph, CrewAI, and LlamaIndex.
    • Many smaller open-source projects and companies (e.g., Cursor AI, LibreChat) quickly added MCP support, creating a large library of MCP servers for databases, Google Sheets, Git, Slack bots, and more.
    • A community-curated list now catalogs hundreds of MCP servers ready for use, enabling AI agents to gain new capabilities “for free.”

The broad adoption of MCP within a year of its debut is striking. Microsoft, OpenAI, Anthropic, Google, and AWS all converged on supporting the same protocol, illustrating the need for a neutral, open standard in the rapidly evolving agent ecosystem. MCP’s open nature and industry backing have created network effects: each new MCP server benefits agents on every platform.


Related Standards and Prior Art in Tool Integration

MCP did not emerge in isolation—it was influenced by and addresses limitations of earlier approaches to connecting AI models with external tools and data. Key prior art includes:

  • ChatGPT Plugins / OpenAI’s Plugin Spec (2023):

    • Developers provided an OpenAPI specification for each plugin, which ChatGPT could parse and call.
    • This standardized how a model could call an external API, but:
      • Each plugin required its own web service and manifest.
      • Functionality was limited to one-off calls (no ongoing, stateful interactions).
      • The ecosystem was controlled by OpenAI (only ChatGPT or platforms implementing the same could use those plugins).
    • MCP extended this idea by offering a unified, open framework—anyone can implement an MCP server, and any agent can connect, regardless of vendor.
  • LangChain and Agent Frameworks (2022–2023):

    • Libraries like LangChain let developers define “Tools” in code (Python functions), which an LLM agent could invoke using a ReAct-style loop.
    • LangChain amassed hundreds of tools in its own format, but each agent was tied to that framework.
    • MCP, in contrast, is a protocol standard: MCP servers are standalone services (local or remote) that any MCP-capable agent can call. LangChain now provides adapters so MCP servers can act as LangChain tools, bridging the two ecosystems.
  • Retrieval-Augmented Generation (RAG):

    • RAG pipelines store documents in a vector database; at query time they retrieve top matches and inject them into the LLM prompt.
    • This approach extends a model’s knowledge but is implicit (context is preloaded).
    • MCP provides an explicit mechanism: the model can decide “when” to call a search_documents tool and retrieve exactly what it needs on demand.
    • RAG and MCP are complementary; MCP servers can wrap RAG pipelines, and RAG can feed MCP tools.
  • Other Agent Communication Protocols (2025):

    • Google’s Agent2Agent (A2A) protocol is designed for agent↔agent messaging, whereas MCP is for agent↔tool.
    • A2A and MCP are complementary: agents can use MCP to fetch data or perform actions, and A2A to coordinate with other agents.
    • Other emerging protocols (e.g., “Agent Communication Protocol” variants) also aim to standardize agent collaboration.
    • MCP focuses on the foundational layer of tool access; future protocols will build on top for multi-agent workflows.

These standards and frameworks reflect the community’s evolving understanding of how to safely, efficiently, and scalably connect AI models with external capabilities. MCP represents the crystallization of these learnings into a common protocol that is open, extensible, and platform-agnostic.


Long-Term Vision vs. Current Usage of MCP

Long-Term Vision:
- Enable agent-to-agent communication: MCP could allow one AI agent to expose its own capabilities as an MCP server, so another agent can call it as a tool.
- Foster distributed AI ecosystems: Imagine specialized agents (finance, coding, translation) collaborating via MCP and higher-level protocols like A2A.
- Support autonomous workflows: Chains of tools and agent calls orchestrated through open protocols, enabling AI to perform complex tasks with minimal human intervention.

MCP’s architecture is already suited for this future. By standardizing request/response formats, authentication (OAuth 2.1), and streaming, MCP provides the building blocks for agents to share capabilities and even form networks of microservices.

Current Usage (Mid-2025):

  • Tool Interoperability:

    • MCP is primarily used to let AI agents plug into external tools “for free” (without custom integrations).
    • Common use cases include internet/web access, database queries, productivity apps (email, calendars, Slack), coding/devops tools (GitHub, CI/CD), OS/filesystem access, and specialized APIs (crypto data, ML inference).
    • The expanding library of MCP servers means an AI agent can almost instantly gain new features simply by connecting to the right server.

  • Developer Enthusiasm:

    • Rapid adoption by major cloud providers and open-source projects.
    • Community contributions of MCP servers for a wide variety of services (CRM, Google Sheets, Jira, etc.).
    • Developer tooling (SDKs, proxies, adapters) to make building and consuming MCP servers easier.
  • Platform Integration:

    • Anthropic’s Claude, OpenAI’s Agents SDK, Microsoft Copilot Studio, Google Cloud Generative AI, and AWS all support MCP.
    • Each platform integrates MCP into its existing function-calling or plugin frameworks, providing a smooth developer experience.

While the grand vision of a fully decentralized network of collaborating agents is still emerging, MCP’s adoption as the universal tool interface is well underway. Today’s AI assistants can pull in real-time data, execute tasks, and interact with users more safely and reliably, thanks in large part to MCP’s open protocol. As the specification evolves and the ecosystem grows, MCP is positioned to remain a backbone of agentic AI workflows.


Illiana Reed