FastMCP

MD-Models provides MCP (Model Context Protocol) integration that automatically generates MCP tools from your markdown-defined data models. Using FastMCP, MD-Models creates a fully functional MCP server with tools for querying, upserting, and aggregating data from your database.

The integration turns each model in your markdown into a set of MCP tools, so language models can work with your database through a standardized protocol. They can inspect your data schema, upsert records, query existing data, and run aggregations, all through one well-defined interface.

MCP is a protocol that lets AI assistants and language models interact with external systems through tools. Exposing your database as MCP tools lets language models perform database operations on their own, which makes your data accessible to AI-powered applications. The tools are self-describing: a client can list the available operations and their parameters, which reduces the need for manual configuration.
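
To make the discovery step concrete, here is a minimal sketch of the two JSON-RPC 2.0 messages an MCP client sends to list tools and to call one. The helper make_request is a hypothetical name used only for illustration, and passing an empty arguments object to get_schema is an assumption, not something this page confirms.

```python
import json


def make_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request, the envelope MCP messages use."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = make_request(1, "tools/list")

# Invoke one tool by name. The empty arguments object is an assumption.
call_schema = make_request(
    2, "tools/call", {"name": "get_schema", "arguments": {}}
)

print(json.dumps(list_tools))
print(json.dumps(call_schema))
```

In practice an MCP client library builds and sends these messages for you; the sketch only shows their shape.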

Setting Up the MCP Server

Load Your Data Model

```
from mdmodels import DataModel, mcp, sql
from fastmcp import FastMCP

model = DataModel.from_markdown("model.md")
```

This gives you a Library containing Pydantic models (for data validation).

Set Up Database Connection

```
db = sql.DatabaseConnector(library=model, **database_config)
# Creates tables and returns SQLModel classes
db_models = db.create_tables()
```

DatabaseConnector generates SQLModel classes from your library. create_tables() creates the tables and returns those classes for use in database operations. Here, database_config is a dict of keyword arguments holding your connection settings.

Create MCP Server

```
app = FastMCP("mdmodels")
mcp.create_mcp_tools(app=app, db=db)
```

Run the Server

```
if __name__ == "__main__":
    app.run("stdio")
```

The server runs over stdio, the default transport, which desktop MCP clients typically use.

CLI Usage

You can run the MCP integration directly from the CLI using the same TOML config file:

```
mdmodels mcp --config config.toml --transport stdio
```

stdio is the default transport and is typically used for desktop MCP clients. For networked deployments, use --transport sse or --transport streamable-http and optionally set --host and --port. For complete config details, see CLI Configuration.

Database Session Middleware

create_mcp_tools() automatically adds DBSessionMiddleware, which manages database sessions for MCP requests. Each tool invocation gets its own session, and all operations within a single tool call run in one transaction. If any operation fails, the whole transaction is rolled back, which prevents partial updates and keeps the database in a consistent state.
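
The one-transaction-per-call behavior can be pictured with a small stand-alone sketch. It uses Python's built-in sqlite3 module rather than the actual DBSessionMiddleware, and session_scope, the two tool functions, and the molecule table are illustrative names that are not part of MD-Models.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def session_scope(conn):
    """Commit if the body succeeds; roll back everything if it raises."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE molecule (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.commit()


def failing_tool_call(conn):
    with session_scope(conn) as session:
        session.execute("INSERT INTO molecule (name) VALUES ('Water')")
        # Violates NOT NULL, so the first insert is rolled back as well.
        session.execute("INSERT INTO molecule (name) VALUES (NULL)")


def successful_tool_call(conn):
    with session_scope(conn) as session:
        session.execute("INSERT INTO molecule (name) VALUES ('Ethanol')")


try:
    failing_tool_call(conn)
except sqlite3.IntegrityError:
    pass
count_after_failure = conn.execute("SELECT COUNT(*) FROM molecule").fetchone()[0]

successful_tool_call(conn)
count_after_success = conn.execute("SELECT COUNT(*) FROM molecule").fetchone()[0]
```

The failed call leaves no rows behind even though its first insert succeeded, which is the property the middleware is described as providing.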

Generated Tools

create_mcp_tools() automatically generates a comprehensive set of tools for interacting with your database. These tools are designed to be self-describing, meaning language models can discover their capabilities through introspection and use them appropriately.

The generated tools fall into four categories:

Schema tools: These tools help language models understand your data structure. get_schema returns JSON schemas for all models, allowing the model to understand field types, required fields, and validation rules. get_relationships provides information about how models relate to each other, enabling the model to understand foreign key relationships and navigate your data model structure.
Upsert tools: For each model with upserts enabled (see allow_create under MCP Configuration), an Upsert_<ModelName> tool is generated (for example, Upsert_Molecule, Upsert_Reaction). Each payload item accepts an optional top-level row_pk. If row_pk matches an existing row, the row is updated; otherwise, a new row is created. Upsert payloads are intentionally flat: for related objects, use ChildRef payloads ({"row_pk": <id>}) instead of nested inline objects. For array-valued relationships, new references are appended to the existing ones and duplicates are dropped (append+dedupe) when the target row already exists.
Query tools: The select_from_table tool allows querying data with flexible filtering capabilities, while aggregate_from_table supports aggregations like counts, sums, averages, and more. These tools use the FilterTask system for complex query construction, allowing language models to build sophisticated database queries.
Vector search tools: The vector_search tool enables semantic search across your data using vector embeddings. This tool allows you to find records based on semantic similarity rather than exact matches, making it powerful for discovering related content, finding similar entities, or performing content-based searches. The tool supports cross-table similarity search, where you can search one table using embeddings from a related table (e.g., finding experiments by protein sequence similarity).
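
As a rough illustration of what ranking by semantic similarity means, the sketch below scores a few records against a query embedding. Cosine similarity, the three-dimensional embeddings, and the names cosine_similarity and top_k are all assumptions made for the example; this page does not state which metric or embedding model vector_search uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the k rows whose embeddings are most similar to the query."""
    ranked = sorted(
        rows,
        key=lambda row: cosine_similarity(query, row["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings; real ones come from an embedding model.
rows = [
    {"name": "Water", "embedding": [1.0, 0.0, 0.0]},
    {"name": "Heavy water", "embedding": [0.9, 0.1, 0.0]},
    {"name": "Ethanol", "embedding": [0.0, 1.0, 0.0]},
]

nearest = top_k([1.0, 0.0, 0.0], rows, k=2)
names = [row["name"] for row in nearest]
```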

All tools maintain referential integrity. Upsert tools resolve ChildRef entries against existing rows and reuse those rows by primary key.

Filtering and Aggregation

The MCP tools support sophisticated filtering and aggregation capabilities, allowing language models to query your database with precision. These features enable complex data retrieval operations that would typically require writing SQL queries manually.

Filters are expressed as FilterTask objects. Each FilterTask names a table, lists filter conditions (each with a column, an operation, and a value), and states how those conditions are combined (AND or OR). Combining several conditions this way lets you build complex queries.

Here’s how to construct filters:

```
from mdmodels.sql.filter import FilterTask

filters = [
    FilterTask(
        table="Molecule",
        filters=[
            {"column": "name", "operation": "like", "value": "%Water%"}
        ],
        combine="and"
    )
]
```
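
To show how conditions and the combine setting interact, here is a small in-memory sketch that evaluates the same kind of filter against plain dictionaries. It is not how MD-Models builds SQL. It handles only the like operation shown above, treats matching as case-sensitive, and uses a hypothetical formula column; the helpers like_to_regex and matches are illustrative names.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern (% and _) into an anchored regex."""
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")
        elif char == "_":
            parts.append(".")
        else:
            parts.append(re.escape(char))
    return re.compile("^" + "".join(parts) + "$", re.DOTALL)


def matches(row, task):
    """Evaluate each condition on one row, then combine with AND or OR."""
    results = []
    for cond in task["filters"]:
        if cond["operation"] != "like":
            raise ValueError("this sketch only handles the 'like' operation")
        regex = like_to_regex(cond["value"])
        results.append(bool(regex.match(str(row[cond["column"]]))))
    return all(results) if task["combine"] == "and" else any(results)


task = {
    "table": "Molecule",
    "filters": [
        {"column": "name", "operation": "like", "value": "%Water%"},
        {"column": "formula", "operation": "like", "value": "H2_"},
    ],
    "combine": "and",
}

rows = [
    {"name": "Water", "formula": "H2O"},
    {"name": "Heavy Water", "formula": "D2O"},
    {"name": "Ethanol", "formula": "C2H6O"},
]

and_matches = [r["name"] for r in rows if matches(r, task)]
or_matches = [r["name"] for r in rows if matches(r, {**task, "combine": "or"})]
```

With "and", a row must satisfy every condition; with "or", any single condition is enough, so the second list is a superset of the first.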

Aggregations are expressed as Aggregation objects, each specifying a function and the column it operates on. The available functions are count for counting records, sum and avg for numerical totals and means, min and max for finding extremes, and stddev and variance for statistical spread. You can combine aggregations with filters to summarize a specific subset of your data.

Here’s how to define aggregations:

```
from mdmodels.sql.aggregation import Aggregation

aggregations = [
    Aggregation(function="count", column="id")
]
```
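
For intuition about what each function returns, the sketch below computes the seven aggregations over a hypothetical mass column using only the Python standard library. It assumes sample (n - 1) definitions for stddev and variance; databases differ on this, and this page does not say which definition MD-Models ends up with. The names AGGREGATES and aggregate are illustrative.

```python
import statistics

# Plain-Python counterparts of the aggregation function names.
AGGREGATES = {
    "count": len,
    "sum": sum,
    "avg": statistics.fmean,
    "min": min,
    "max": max,
    "stddev": statistics.stdev,       # sample standard deviation (assumption)
    "variance": statistics.variance,  # sample variance (assumption)
}


def aggregate(rows, function, column):
    """Apply one aggregation to a column, skipping missing values."""
    values = [row[column] for row in rows if row[column] is not None]
    return AGGREGATES[function](values)


rows = [
    {"id": 1, "mass": 2.0},
    {"id": 2, "mass": 4.0},
    {"id": 3, "mass": 6.0},
]

summary = {name: aggregate(rows, name, "mass") for name in AGGREGATES}
```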

MCP Configuration

You can customize the behavior of MCP tools for individual models by providing a config parameter to create_mcp_tools. This allows you to control which tools are generated, customize tool descriptions, and configure model-specific behavior.

```
from fastmcp import FastMCP
from mdmodels.mcp import create_mcp_tools, MCPConfig

# db is the DatabaseConnector created during setup
app = FastMCP("mdmodels")

create_mcp_tools(
    app=app,
    db=db,
    config={
        "Molecule": MCPConfig(
            description="Create a new molecule with chemical properties.",
            allow_create=True,
        ),
        "Reaction": MCPConfig(
            description="Create a chemical reaction with educts and products.",
            allow_create=True,
        ),
        "Experiment": MCPConfig(
            allow_create=False,  # No upsert tool for this model (the default)
        ),
    },
)
```

The config parameter accepts a dictionary mapping model names (as strings) to MCPConfig instances. Each MCPConfig supports the following properties:

description (Optional[str]): Custom tool description for the model’s upsert tool. Tool descriptions are crucial for language models—they help the model understand what each tool does and when to use it. If not provided, MD-Models generates a generic description. Custom descriptions should be clear and concise, explaining what the tool does and what kind of data it expects. They should mention the model’s purpose, key fields, and any important relationships or constraints.
allow_create (bool): Controls whether an Upsert_<ModelName> tool is generated for this model. Defaults to False. Set to True to enable upsert/write capability for specific models.
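
The effect of allow_create on which upsert tools exist can be sketched as follows. ConfigSketch is a stand-in dataclass with the two documented properties, not the real MCPConfig, and upsert_tool_names is a hypothetical helper. The sketch also assumes that a model missing from the config dictionary falls back to the defaults, which this page does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigSketch:
    """Stand-in for MCPConfig with the two documented properties."""

    description: Optional[str] = None
    allow_create: bool = False  # documented default


def upsert_tool_names(model_names, config):
    """List the Upsert_<ModelName> tools that would be generated."""
    names = []
    for model in model_names:
        # Assumption: models without an entry use the default config.
        model_config = config.get(model, ConfigSketch())
        if model_config.allow_create:
            names.append(f"Upsert_{model}")
    return names


config = {
    "Molecule": ConfigSketch(allow_create=True),
    "Experiment": ConfigSketch(allow_create=False),
}

tools = upsert_tool_names(["Molecule", "Reaction", "Experiment"], config)
```

Under these assumptions only Molecule gets an upsert tool: Experiment disables it explicitly and Reaction has no entry.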

Upsert Input Semantics

When calling Upsert_<ModelName> tools, payload items follow these rules:

Use top-level row_pk to target an existing row for update.
If row_pk is omitted, the tool creates a new row.
If both the model's primary key and row_pk are provided and they disagree, the operation fails.
For related fields, provide ChildRef values: {"row_pk": <id>}.
For array relationships on existing rows, input refs are merged with existing refs (append+dedupe).

This keeps write behavior explicit and reduces accidental link corruption from guessed or stale IDs.
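
The rules above can be traced with a small in-memory sketch. It is a simplified model of the described behavior, not the MD-Models implementation: the table is a plain dict keyed by primary key, new keys are assigned as the next integer, and upsert, merge_refs, pk_field, and array_fields are hypothetical names. The educts field follows the Reaction example used earlier.

```python
def merge_refs(existing, incoming):
    """Append incoming ChildRef values, skipping row_pk values already linked."""
    seen = {ref["row_pk"] for ref in existing}
    merged = list(existing)
    for ref in incoming:
        if ref["row_pk"] not in seen:
            merged.append(ref)
            seen.add(ref["row_pk"])
    return merged


def upsert(table, payload, pk_field="id", array_fields=()):
    """Apply one payload item to an in-memory table keyed by primary key."""
    row_pk = payload.get("row_pk")
    model_pk = payload.get(pk_field)
    if row_pk is not None and model_pk is not None and row_pk != model_pk:
        raise ValueError("row_pk and the model primary key disagree")

    data = {k: v for k, v in payload.items() if k not in ("row_pk", pk_field)}

    if row_pk in table:
        # Update: scalar fields are overwritten, array refs are merged.
        row = table[row_pk]
        for key, value in data.items():
            if key in array_fields:
                row[key] = merge_refs(row.get(key, []), value)
            else:
                row[key] = value
        return row_pk

    # Create: assign the next integer key (an assumption of this sketch).
    new_pk = max(table, default=0) + 1
    table[new_pk] = {pk_field: new_pk, **data}
    return new_pk


table = {1: {"id": 1, "name": "Combustion", "educts": [{"row_pk": 10}]}}

updated = upsert(
    table,
    {"row_pk": 1, "educts": [{"row_pk": 10}, {"row_pk": 11}]},
    array_fields=("educts",),
)
created = upsert(table, {"name": "Hydrolysis", "educts": []})
```

The update keeps the existing link to row 10 and adds row 11 once, while the payload without row_pk produces a new row.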

Complete Example

Here’s a complete example:

```
from fastmcp import FastMCP
from mdmodels import DataModel, mcp, sql

# Load data model and create database connection
model = DataModel.from_markdown("model.md")

# database_config holds your connection settings and must be defined beforehand
db = sql.DatabaseConnector(library=model, **database_config)
# Creates tables and returns SQLModel classes
db_models = db.create_tables()

# Initialize FastMCP app
app = FastMCP("mdmodels")

# Register MCP tools
mcp.create_mcp_tools(
    app=app,
    db=db,
    config={
        "Molecule": mcp.MCPConfig(allow_create=True),
        "Reaction": mcp.MCPConfig(allow_create=True),
    },
)

# Run the server
if __name__ == "__main__":
    app.run("stdio")
```

This creates a working MCP server with the schema and query tools described above, plus upsert tools for Molecule and Reaction. Language models can connect and use the tools for schema introspection, upsert operations, querying, and aggregation. The server runs in stdio mode, communicating with MCP clients through standard input and output streams.

Once running, the server exposes the generated tools to connected language models. The models can discover the tools through introspection, read their parameters from the JSON schemas, and use them to work with your database. AI-powered applications can then create related records, query with filters, and compute aggregations without manual SQL queries or custom API endpoints.

MCP ensures that tool invocations are properly structured and validated, and FastMCP handles the communication details, so you can focus on defining your data models while MD-Models generates the tools.