From Chat to Action: Practical Local Agents with MCP

PyMUG Meetup (CoderFaculty) — January 2026 

The problem

Basic data analysis is often boring and repetitive: you load a CSV, check the columns, scan for missing values, run describe(), and repeat the same checks again and again. 

In this session, I showed a practical alternative: use a local LLM as the interface and let it call MCP tools to compute results and explain them in plain English. 


What we built

A simple, offline workflow where:

  • I ask questions in natural language
  • The model calls MCP tools to compute answers
  • The analysis runs locally (pandas, NumPy, seaborn, matplotlib)
  • The model summarises findings and next steps in plain English

✅ Local execution: everything runs on my machine (no cloud dependency).



The core problems we want to solve

When you open a dataset, the first questions are nearly always:

  1. What is the dataset about?
  2. Do we have missing data?
  3. Are there suspicious categories or correlations?

The demo agent is designed to answer these quickly and reliably using tools rather than “best guesses”.



MCP in plain terms

Model Context Protocol (MCP) is a standard way for models to call tools.

  • Tools expose structured inputs/outputs
  • The host controls what the model is allowed to do
  • The model chooses when to call tools
  • Tool results come back as structured data

A useful mental model:

MCP is the connector between your local model and your local capabilities (files, analysis functions, etc.)
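
To make that concrete, here is roughly what a tool definition looks like with FastMCP (the library I used for the server). The function signature and docstring become the structured schema the model sees. Names here are illustrative, not the exact demo code:

    from pathlib import Path

    from fastmcp import FastMCP

    mcp = FastMCP("dataset-tools")
    DATA_DIR = Path("data")  # the folder the tools are allowed to read

    @mcp.tool()
    def list_datasets() -> list[str]:
        """List the CSV files available for analysis."""
        return sorted(p.name for p in DATA_DIR.glob("*.csv"))

    if __name__ == "__main__":
        mcp.run()  # stdio transport by default, so an MCP host can launch it

The host starts this script and advertises the tool to the model; the model never touches the filesystem itself, it only sees the tool's structured output.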



Architecture

The setup is intentionally simple and stage-safe:

  • LM Studio: local model + MCP host
  • MCP Server (Python): dataset tools implemented with pandas/numpy
  • Local Data Folder: CSV files in a restricted directory (e.g., a DATA_DIR limited to data/*.csv)
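
The directory restriction is worth showing because it is so small: every tool resolves dataset names through one helper that refuses paths outside the data folder. A minimal sketch (the helper name is mine):

    from pathlib import Path

    DATA_DIR = Path("data").resolve()  # the only directory the tools may read

    def resolve_dataset(name: str) -> Path:
        """Map a dataset name to a CSV inside DATA_DIR, rejecting anything else."""
        path = (DATA_DIR / name).resolve()
        if path.parent != DATA_DIR or path.suffix != ".csv":
            raise ValueError(f"{name!r} is not a dataset in the data folder")
        return path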



What tools did I expose?

To keep the demo reliable, I kept the tool surface area small and practical:

  • List available datasets
  • Read dataset details (rows/columns)
  • Get columns
  • Preview dataset (rows + inferred types)
  • Quick dataset analysis (missingness, stats, categories, correlation)

This is enough to cover 80% of “first contact” dataset work.
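
For flavour, here is a hedged sketch of what the quick-analysis tool can look like, built on the mcp server object and resolve_dataset helper from the sketches above (the exact field names are mine):

    import pandas as pd

    @mcp.tool()
    def quick_analysis(name: str) -> dict:
        """Profile a dataset: missingness, summary stats, top categories, correlations."""
        df = pd.read_csv(resolve_dataset(name))
        numeric = df.select_dtypes("number")
        categorical = df.select_dtypes(exclude="number")
        return {
            "shape": {"rows": len(df), "columns": df.shape[1]},
            "missing": {c: int(n) for c, n in df.isna().sum().items()},
            "stats": numeric.describe().round(3).to_dict(),
            "top_categories": {
                c: {str(k): int(v) for k, v in categorical[c].value_counts().head(5).items()}
                for c in categorical
            },
            "correlation": numeric.corr().round(3).to_dict(),
        }

Returning plain dicts keeps the results structured, so the model interprets numbers it was given rather than numbers it imagined.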



Sample prompts (the live flow)

This is the exact flow I used in the demo:

Step 1 — Find available datasets

Prompt: “What datasets are available?” 

Step 2 — Confirm schema

Prompt: “Show me the columns and a preview with types.” 

Step 3 — Run profiling

Prompt: “Analyse missing values, summary stats, and top categories.” 

Step 4 — Interpretation

Prompt: “Summarise the main data quality issues and next cleaning steps.” 

The important detail is Step 4: the model’s job is to interpret results produced by tools, not invent them.



Guardrails (why it’s stage-safe)

To avoid surprises and keep runtime predictable, I added three simple constraints:

  • Preview capped: tools return at most a fixed number of rows, never a whole file
  • Directory restricted: tools can only read CSVs inside the data folder
  • Analysis restricted: only the predefined, read-only profiling functions can run

✅ This keeps the agent useful without giving it uncontrolled access.
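
In code, these guardrails are one-liners rather than infrastructure. Here is a sketch of the preview tool with the cap applied (MAX_PREVIEW_ROWS is my name for it; mcp and resolve_dataset come from the sketches above). "Analysis restricted" simply means no tool executes arbitrary code, only fixed functions like this one:

    import json

    import pandas as pd

    MAX_PREVIEW_ROWS = 20  # preview capped: the model never receives a whole file

    @mcp.tool()
    def preview_dataset(name: str, rows: int = 5) -> dict:
        """Return the first rows of a dataset plus inferred column types."""
        df = pd.read_csv(resolve_dataset(name))  # directory restricted
        head = df.head(min(rows, MAX_PREVIEW_ROWS))
        return {
            "preview": json.loads(head.to_json(orient="records")),  # JSON-safe values
            "dtypes": {col: str(dtype) for col, dtype in df.dtypes.items()},
        }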


Tools & technologies used

For the demo environment:

  • Language: Python
  • Model: GPT-OSS 20B
  • MCP library: FastMCP
  • Reason for model choice: it runs locally in around 12 GB of VRAM, which is practical for consumer hardware.



What this demonstrates (the real point)

This demo is not about “AI doing data science”.

It is about something more practical:

  • LLM = interface (natural language, summarisation, next-step guidance)
  • Tools = execution (pandas/numpy compute results deterministically)
  • MCP = safe connection between the two

When combined, you get a local agent that can turn chat into action—reliably.


What I would build next

If I expand this into a more complete dataset assistant, the next additions would be:

  • Duplicate detection + row-level quality checks
  • Validity rules (ranges, regex patterns, type enforcement)
  • Report export (Markdown/PDF)
  • Lightweight charts (histograms/bar plots)
  • Dataset comparison (schema drift, distribution drift)
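
Each of these stays small; duplicate detection, for instance, would be one more tool in the same pattern (a sketch, reusing mcp and resolve_dataset from the earlier snippets):

    import json

    import pandas as pd

    @mcp.tool()
    def duplicate_report(name: str) -> dict:
        """Count fully duplicated rows and return a few examples."""
        df = pd.read_csv(resolve_dataset(name))
        examples = df[df.duplicated(keep=False)].head(10)
        return {
            "duplicate_rows": int(df.duplicated().sum()),
            "examples": json.loads(examples.to_json(orient="records")),
        }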
