We've all been there. You're deep in a coding session with your favorite AI assistant, trying to explain a complex bug in your legacy codebase. You paste in file after file, stack trace after stack trace. Suddenly, the AI starts hallucinating, forgetting what you told it three messages ago, or worse—it hits you with the dreaded "context limit reached."
It’s frustrating, isn't it? It feels like trying to explain a movie plot to a friend who walks out of the room every five minutes.
For the longest time, the solution was just "buy a bigger context window." But treating Large Language Models (LLMs) like bottomless pits where we just dump data isn't sustainable. It’s slow, it’s expensive, and frankly, it’s messy.
If you’ve been hearing whispers about MCP in developer circles and wondering what the fuss is about, you’re in the right place. In this tutorial, we aren't just going to define it; we are going to build with it. We’ll explore how this standard is changing the game for LLM context management, turning our AI models from isolated chatbots into deeply integrated system agents.
Grab a coffee, open your terminal, and let's dive in.
1. Introduction to Model Context Protocol (MCP)
So, what exactly is the Model Context Protocol?
Imagine you’re trying to charge your phone, your laptop, and your headphones. Ten years ago, you needed a different cable for every single device. It was a nightmare. Then came USB-C. Suddenly, one standard port connected everything.
MCP is the USB-C for Artificial Intelligence.
Technically speaking, the Model Context Protocol is an open standard that enables AI models to connect securely to local and remote data sources—like your file system, a PostgreSQL database, or your Slack workspace.
Before MCP, if you wanted ChatGPT or Claude to interact with your internal database, you had to build a custom, brittle integration. You were essentially hardwiring the connection. If the API changed? Broken. If you switched LLM providers? You had to rewrite the whole thing.
MCP standardizes this. It decouples the AI (the "Client") from the data (the "Server").
- The Host: The AI application the user works in (like Claude Desktop or an IDE).
- The Client: The connector inside the host that maintains a one-to-one connection to a server.
- The Server: A lightweight program that exposes your data or tools via the MCP standard.
The beauty here is that once you build an MCP server for your data, any MCP-compliant AI client can use it. It’s write once, use everywhere for the AI era.
2. Why MCP Matters for LLMs
You might be thinking, "I can already paste data into the chat. Why do I need a protocol?"
That’s a fair question. But let’s look at the limitations of the copy-paste method, or even standard RAG (Retrieval-Augmented Generation) pipelines.
The Context Window Bottleneck
Every LLM has a "context window"—think of this as the AI's short-term memory. Whether it’s 8k tokens or 200k tokens, it is finite. When you're working with a sprawling codebase or a huge dataset, you can't just shove terabytes of documentation into that window. It’s like trying to memorize an entire encyclopedia just to answer one trivia question.
MCP changes the approach from "pushing" data to "pulling" data.
Instead of dumping everything into the context window at the start, the LLM uses MCP to reach out and grab exactly what it needs, right when it needs it. This keeps the context window clean and focused on the task at hand.
Real-Time Accuracy
When you paste a snippet of code into a chat, that data is static. If you update the code in your editor five minutes later, the AI doesn't know. It’s looking at a snapshot of the past.
With MCP, the AI is connected to the live source. If you ask it to review a file, it reads the current version of that file through the protocol. It reduces hallucinations because the ground truth is always fetched fresh.
Security and Boundaries
Giving an AI agent access to your computer sounds terrifying, right? MCP is designed with a "human in the loop" philosophy. The protocol requires the user to approve sensitive actions. The AI can say, "I want to read this file," but the protocol ensures you say "Yes" before it happens.
3. Core Concepts of MCP: Context Window, Encoding, Decoding
Before we write code, let's briefly unpack the architecture. Don't worry, I’ll keep the jargon to a minimum.
The Three Pillars of MCP
MCP focuses on three main capabilities that a server can provide to an AI:
- Resources: These are like files. They are data that can be read. Think of logs, database records, or code files. The AI can read them, but usually can't change them directly through this interface.
- Prompts: These are pre-written templates. If you have a specific way you want the AI to analyze a bug report, you can save that as a "Prompt" in your MCP server. The AI can load this template to get the best context immediately.
- Tools: This is where the magic happens. Tools are executable functions. The AI can "call" a tool to perform an action—like querying an API, performing a calculation, or even creating a GitHub issue.
Encoding and Transport
How do these messages travel? Under the hood, every MCP message is a JSON-RPC 2.0 message; what varies is the transport it travels over.
If you've ever dealt with asynchronous JavaScript, you know that communication between systems takes time. MCP handles this by sending JSON messages back and forth over a transport layer.
- Stdio (Standard Input/Output): This is the most common for local tools. The MCP server runs as a subprocess, and the AI client talks to it via the command line streams.
- SSE (Server-Sent Events): Used for remote servers over HTTP.
The "Encoding" part refers to how the request (e.g., "Read file main.ts") is packaged into a JSON object, sent to your server, decoded by your code, processed, and then the result is encoded back to the AI.
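To make that concrete, here is roughly what one round trip looks like on the wire when a client asks your server to read a file. The field values below are illustrative, but the envelope comes from JSON-RPC 2.0 and the method name from the MCP spec.
Request (client → server):
{ "jsonrpc": "2.0", "id": 1, "method": "resources/read", "params": { "uri": "file:///main.ts" } }
Response (server → client):
{ "jsonrpc": "2.0", "id": 1, "result": { "contents": [{ "uri": "file:///main.ts", "mimeType": "text/plain", "text": "..." }] } }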
4. Setting Up Your Environment for MCP
Alright, enough theory. Let’s build something.
We are going to create a simple MCP server using TypeScript. Why TypeScript? Because it provides the type safety we need when dealing with protocol definitions, and it aligns perfectly with the modern web stack.
Prerequisites
- Node.js: Ensure you have Node.js (v18 or higher) installed.
- npm: Comes with Node.
- An MCP Client: The easiest way to test this is using the Claude Desktop App, but you can also use MCP inspectors for debugging.
Initializing the Project
Open your terminal. We’ll create a directory for our project, which we’ll call mcp-notes-server.
mkdir mcp-notes-server
cd mcp-notes-server
npm init -y
Now, let's install the necessary packages. We need the official MCP SDK and Zod (a schema validation library that is absolutely essential for defining data structures safely).
npm install @modelcontextprotocol/sdk zod
npm install -D typescript @types/node tsx
Note: We're using tsx to run TypeScript directly without manually compiling it every time. It’s a lifesaver for development.
Initialize your TypeScript configuration:
npx tsc --init
Update your package.json to include a script to run the server. Add this to the "scripts" section:
"scripts": {
"start": "tsx index.ts"
}
5. Implementing MCP: A Step-by-Step Example
We are going to build a "Notes" server. This server will allow the AI to:
- List all available notes (Resources).
- Read a specific note (Resource).
- Create a new note (Tool).
Create a file named index.ts in your project root.
Step 1: Import and Setup
First, we import the classes we need and set up a simple in-memory database (just a JavaScript object for now).
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
CallToolRequestSchema,
ListResourcesRequestSchema,
ListToolsRequestSchema,
ReadResourceRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { z } from "zod";
// A simple in-memory store for our notes
// In a real app, this would be a database connection
const notes: Record<string, string> = {
"welcome.txt": "Welcome to your MCP Notes system! This data is live.",
"todo.txt": "1. Buy milk\n2. Learn MCP\n3. Build cool AI agents",
};
// Initialize the MCP Server
const server = new Server(
{
name: "mcp-notes-server",
version: "1.0.0",
},
{
capabilities: {
resources: {},
tools: {},
},
}
);
Step 2: Define Resources (Reading Data)
Now, we need to tell the AI what data is available. We do this by implementing a "List Resources" handler.
// Handler: List all available resources (files)
server.setRequestHandler(ListResourcesRequestSchema, async () => {
return {
resources: Object.keys(notes).map((filename) => ({
uri: `note:///${filename}`, // Custom URI scheme
name: filename,
mimeType: "text/plain",
})),
};
});
Next, we need a handler for when the AI actually wants to read one of these files.
// Handler: Read a specific resource
server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
const url = new URL(request.params.uri);
const filename = url.pathname.replace(/^\//, ""); // Remove leading slash
const content = notes[filename];
if (!content) {
throw new Error(`Note ${filename} not found`);
}
return {
contents: [
{
uri: request.params.uri,
mimeType: "text/plain",
text: content,
},
],
};
});
Step 3: Define Tools (Taking Action)
Resources are passive; Tools are active. Let's give the AI the ability to create a new note. This is powerful because it allows the AI to persist information outside of its context window.
// Handler: List available tools
server.setRequestHandler(ListToolsRequestSchema, async () => {
return {
tools: [
{
name: "create_note",
description: "Create a new note with a filename and content",
inputSchema: {
type: "object",
properties: {
filename: { type: "string", description: "Name of the file (e.g., meeting-notes.txt)" },
content: { type: "string", description: "The text content of the note" },
},
required: ["filename", "content"],
},
},
],
};
});
// Zod schema describing the arguments we expect for create_note
const CreateNoteArgsSchema = z.object({
  filename: z.string(),
  content: z.string(),
});

// Handler: Execute the tool
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === "create_note") {
    // Validate input using Zod (Good practice!) - this throws if anything is missing or the wrong type
    const args = CreateNoteArgsSchema.parse(request.params.arguments);

    // Save the note to our memory store
    notes[args.filename] = args.content;

    return {
      content: [
        {
          type: "text",
          text: `Successfully created note: ${args.filename}`,
        },
      ],
    };
  }
  throw new Error("Tool not found");
});
Step 4: Connect the Transport
Finally, we need to actually run the server. We'll use the StdioServerTransport, which allows the AI client (like Claude) to run this script and talk to it via standard input/output.
async function main() {
const transport = new StdioServerTransport();
await server.connect(transport);
console.error("MCP Notes Server running on stdio"); // Use stderr for logs, stdout is for protocol!
}
main().catch((error) => {
console.error("Server error:", error);
process.exit(1);
});
Critical Tip: Notice I used console.error for logging? In MCP stdio mode, console.log writes to standard output, which is used for the protocol messages. If you print random text there, you will break the connection. Always log to stderr!
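Before wiring this into Claude Desktop, it's worth a quick smoke test. The MCP Inspector (one of those debugging tools mentioned in the prerequisites) can spawn your server over stdio and let you browse its resources and call its tools from a local web UI. Something along these lines should do it, assuming npx can fetch the inspector package and run tsx:
npx @modelcontextprotocol/inspector npx tsx index.ts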
6. Advanced MCP Techniques: Dynamic Context and Summarization
We have a basic server, but let's talk about the "Deep Dive" part. How do we make this smart?
Dynamic Context Updates
One of the coolest features of MCP is the ability to subscribe to changes. If your notes object was actually a live database, you wouldn't want the AI to work with stale data.
The MCP protocol supports notifications. If a resource changes, the server can push a notifications/resources/updated message to any client that has subscribed to that resource. This prompts the AI to re-fetch the context if it's currently using it. It’s reactive programming for LLMs.
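As a rough sketch of what pushing such an update might look like in our notes server—assuming we also declared subscribe support under the resources capability, and keeping in mind that SDK versions differ in the convenience helpers they expose:
// Hypothetical helper: tell subscribed clients that a note changed
async function notifyNoteChanged(filename: string) {
  await server.notification({
    method: "notifications/resources/updated",
    params: { uri: `note:///${filename}` },
  });
}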
Context Window Optimization
Remember the "context window" keyword from the title? Here is where it pays off.
Instead of creating a tool that returns all notes (which might blow up the context limit), you can create a tool called search_notes.
- The AI sends a query: "Project Beta deadlines".
- Your MCP server runs a fuzzy search or a vector database lookup.
- The server returns only the relevant snippets.
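Here's a rough sketch of what that handler branch could look like in our notes server, assuming we've also advertised a search_notes tool in the tools list. It does a naive substring match and returns only the matching lines; a real implementation might swap in fuzzy matching or a vector lookup.
// Inside the CallToolRequestSchema handler, alongside create_note
if (request.params.name === "search_notes") {
  // Validate the query argument with Zod
  const { query } = z.object({ query: z.string() }).parse(request.params.arguments);
  const matches: string[] = [];
  for (const [filename, content] of Object.entries(notes)) {
    for (const line of content.split("\n")) {
      // Keep only lines that mention the query, not whole files
      if (line.toLowerCase().includes(query.toLowerCase())) {
        matches.push(`${filename}: ${line}`);
      }
    }
  }
  return {
    content: [
      { type: "text", text: matches.length > 0 ? matches.join("\n") : "No matching notes found." },
    ],
  };
}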
This is how you build scalable systems. You aren't expanding the context window; you are optimizing what goes into it. It’s similar to how we optimize React applications—rendering only what changed. (Speaking of, if you're into frontend optimization, check out our guide on The Complete Guide to React Hooks).
7. Best Practices for MCP Integration
After building a few of these, I’ve learned some hard lessons. Here are the rules I live by:
1. Treat Tools Like APIs
The input schema for your tools is effectively an API contract. Be descriptive. The "description" field in your tool definition is not just for humans; it’s the prompt the LLM reads to understand how to use the tool. If your description is vague, the LLM will fail to use it correctly.
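For example, compare these two ways of describing the same hypothetical search tool; the second one reads like documentation because, to the model, it is documentation:
// Vague: the model has to guess what this does, when to call it, and what comes back
{ name: "search", description: "Searches stuff", inputSchema: { type: "object", properties: { q: { type: "string" } } } }

// Descriptive: the model knows when the tool applies and how to fill in the arguments
{
  name: "search_notes",
  description: "Full-text search across saved notes. Returns matching lines with their filenames.",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Search terms, e.g. 'Project Beta deadlines'" },
    },
    required: ["query"],
  },
}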
2. Error Handling is User Experience
If your tool crashes, the LLM gets a generic error. It doesn't know why it failed. Instead of throwing a raw exception, catch errors and return a meaningful text message in the tool response.
- Bad: Error: ENOENT
- Good: "Could not find file 'data.txt'. Please check the filename and try again."
The LLM can read that error and actually self-correct!
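In practice, that means wrapping the tool body in a try/catch and returning the failure as ordinary text content instead of letting the exception escape (tool results can also carry an isError flag, which I'm setting here). A minimal sketch for our create_note tool:
// Inside the CallToolRequestSchema handler: report failures as readable text
try {
  const args = CreateNoteArgsSchema.parse(request.params.arguments);
  notes[args.filename] = args.content;
  return {
    content: [{ type: "text", text: `Successfully created note: ${args.filename}` }],
  };
} catch (error) {
  return {
    isError: true,
    content: [
      {
        type: "text",
        text: `Could not create the note: ${error instanceof Error ? error.message : String(error)}. Please check the filename and content, then try again.`,
      },
    ],
  };
}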
3. Keep Payload Sizes Small
Just because you can send 5MB of text in a resource doesn't mean you should. Latency matters. Large payloads slow down the decoding process and the LLM's "time to first token."
8. Common Challenges and Troubleshooting
Even with a perfect setup, things go wrong.
"The Client Won't Connect" This is almost always a path issue. When configuring Claude Desktop or another client, you must provide the absolute path to your executable.
- Wrong: node index.js
- Right: /usr/local/bin/node /Users/dev/mcp-notes-server/dist/index.js
Note that the "right" example points at compiled JavaScript output; if you're running the TypeScript source directly with tsx, supply the absolute paths to your tsx binary and index.ts instead.
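For Claude Desktop specifically, that configuration lives in claude_desktop_config.json. A minimal sketch of an entry for our notes server might look like this (the paths are illustrative):
{
  "mcpServers": {
    "notes": {
      "command": "/usr/local/bin/node",
      "args": ["/Users/dev/mcp-notes-server/dist/index.js"]
    }
  }
}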
"JSON-RPC Error: Parse Error"
You probably used console.log somewhere in your code. As mentioned earlier, stdout is reserved for the protocol. Strip out all logs or move them to console.error.
"The AI is looping" Sometimes the AI calls a tool, gets a result, and calls it again immediately. This usually happens when the tool output is ambiguous. Ensure your tool returns a clear "Success" message or definitive data so the LLM knows the task is complete.
9. Conclusion: The Future of Context Management
We are witnessing a shift in how we build AI applications. We are moving away from monolithic "chatbots" toward modular ecosystems of tools and data. This aligns perfectly with the rise of autonomous agents.
The Model Context Protocol is the glue that makes this possible. By implementing MCP, you aren't just managing a context window; you are giving your LLM agency. You are giving it eyes to read your codebase and hands to perform tasks, all within a standardized, secure framework.
It reminds me a bit of the early days of REST APIs. At first, it seemed like extra work. But soon, it became the language of the web. MCP is shaping up to be the language of AI interoperability.
So, take the code we wrote today. Expand it. Connect it to your local Git repository, or maybe a weather API. The context window is no longer a cage—it’s a viewport into a much larger world.
Happy coding!
Frequently Asked Questions
What is the difference between MCP and RAG? RAG (Retrieval-Augmented Generation) is a technique for fetching data to add to a prompt. MCP is a standard protocol for connecting to that data. You can think of MCP as the standardized pipe that facilitates RAG, but it also handles tools and prompts, which standard RAG pipelines often don't.
Does MCP work with OpenAI's GPT models? Yes, technically. While Anthropic championed the standard, MCP is open source. Any client application that interfaces with OpenAI models (like a custom chatbot UI or an IDE extension) can implement the MCP client standard to talk to MCP servers.
Is MCP secure for production databases? MCP is designed with security in mind, favoring a "human-in-the-loop" approach for sensitive actions (tools). However, like any API, you must validate inputs and implement proper access controls within your MCP server code. Never expose a tool that allows raw SQL execution without strict limitations!
Can I use Python for MCP? Absolutely. While we used TypeScript in this tutorial, there is a robust Python SDK for MCP. The concepts—Resources, Tools, and Prompts—remain exactly the same.