Today, we live in an era where we increasingly encounter AIs of various types and uses, whether driven by hype or not: the well-known ChatGPT, Grok, DeepSeek, and others, for simple or complex text or audio chats. What they all have in common is the use of Large Language Models (LLMs), an acronym you have surely come across while studying AI. Amid the hype and euphoria of developers and people outside our tech bubble, however, we often forget the real concept, applications, and limitations of LLMs. In general terms: LLMs are designed to understand and generate natural language text, as well as other forms of content, based on vast amounts of data from diverse sources (files in .txt, .pdf, and other formats).
Therefore, the goal of this blog post is to show you how to run LLMs completely locally on your own computer – no pricey API bills, no sending your data to the cloud, and no usage limits (beyond your machine’s capacity).
We’ll keep everything simple, beginner-friendly, and 100% practical. By the end, you’ll be able to ask questions in plain English (“Show me all customers who bought more than $500 last month”) and have the model instantly write perfect (or almost perfect) SQL for you, along with the results.
Use Case: How can we use an LLM to write SQL queries?
The use case we will solve is creating a simple chat where we can use natural language to query information in an SQLite database. In this chat, when the user asks for data from the database, the response will always include the generated SQL query and a table with the results.
To do this, we will implement and use:
- The Ollama tool, which makes it easy to run LLMs like Gemma 3, DeepSeek-R1, Qwen3, and others locally. In our case, we will use the model qwen2.5:latest, which, as far as I could test, gives the best results in the programming and SQL domains;
- A Model Context Protocol (MCP) server, which implements an open standard that lets AI applications connect to data sources, tools, and workflows, much like USB-C connects electronic devices in a standardized way;
- And finally, a simple SPA built with React.js that will consume our MCP server.
The link to the complete source code for both the MCP server and the frontend is at the end of this post.
The goal throughout this post is to focus on the relevant parts of the server implementation and how to consume it on the front-end side.
Let’s code!
First, we need to install the Ollama tool on our machine. Just access this link and download it for your OS. After the installation is complete, we need to download our LLM, qwen2.5:latest, which is quite simple:

```shell
ollama run qwen2.5:latest
```

After it completes, you should be able to send messages to the downloaded model, like this:

```shell
>>> Send a message (/? for help)
```

Note: Feel free to download and play with other models.
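Before moving on, a quick sanity check can confirm that Ollama is up and the model was pulled. Ollama serves a REST API on http://localhost:11434 by default, and GET /api/tags lists the locally available models. The helper below is a hypothetical sketch (the name listLocalModels is not part of the project), assuming Node 18+ for the built-in fetch:

```typescript
// Hypothetical helper: list the models pulled into the local Ollama instance
// via its REST API (GET /api/tags). Requires Node 18+ for built-in fetch.
export async function listLocalModels(
  host = "http://localhost:11434",
): Promise<string[]> {
  const res = await fetch(`${host}/api/tags`);
  if (!res.ok) throw new Error(`Ollama not reachable: HTTP ${res.status}`);
  const body = (await res.json()) as { models: { name: string }[] };
  return body.models.map((m) => m.name);
}

// Usage (with Ollama running):
// listLocalModels().then(console.log); // e.g. [ "qwen2.5:latest" ]
```

If the call fails, Ollama is probably not running; start it before continuing.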
Once we have downloaded our model, let’s move on to implementing our MCP server. Below is an image that clearly demonstrates how this open protocol works.

The image above illustrates how MCP works: the MCP server consumes external data sources (private or public APIs, databases, documents, and so on) and provides this data to MCP clients such as Claude Desktop, Cursor IDE, and, in our case, our React.js SPA.
We will use Node.js + Express to implement our server, which will consist of just two main files: server.ts and question-to-sql.ts. The latter is the most important because it will handle the use of our LLM and return the SQL query to be executed in our SQLite database as output.
server.ts
The goal of our MCP server is simple: to provide a /mcp endpoint that will be consumed by our MCP client, passing a simple question as text.
Let’s walk through the code.
```typescript
const server = new McpServer({
  name: "demo-server",
  version: "1.0.0",
});
```

1. Create the MCP Server Instance
- McpServer is the core class from the Model Context Protocol SDK.
- Provide it with a name and version; these show up in the LLM’s tool selection UI (e.g. in Claude, Cursor, etc.), so the model knows which server it’s communicating with.
- At this point, the server exists but has no tools yet. So, let’s register a new one!
```typescript
server.registerTool(
  "sql-query-tool",
  {
    title: "SQL Query Tool",
    description:
      "Ask a question in natural language and receive the SQL query and the result",
    inputSchema: { question: z.string() },
    outputSchema: { sql: z.string(), result: z.array(z.any()) },
  },
  async ({ question }) => {
    // ... tool implementation
  },
);
```

2. Register the SQL Query Tool
This is where the magic happens. We register a tool named "sql-query-tool" with the following parameters:
- title and description;
- inputSchema, which uses Zod to declare that the tool expects a single string field called question;
- outputSchema, which defines the structure of the response: an object with sql (a string) and result (an array of rows).
The Actual Tool Handler (the third argument)
```typescript
async ({ question }) => {
  const db = await openDB();
  let output: { sql: string; result: any[] };
  try {
    if (memoCache.has(question)) {
      // Cache hit, reuse the previously generated SQL
      const sql = memoCache.get(question)!;
      const rows = await db.all(sql);
      output = { sql, result: rows };
    } else {
      // Cache miss, ask the LLM (or another service) to turn the question into SQL
      const sql = await questionToSQL(question);
      const rows = await db.all(sql);
      // Store it for next time
      memoCache.set(question, sql);
      output = { sql, result: rows };
    }
    return {
      content: [{ type: "text", text: JSON.stringify(output) }],
      structuredContent: output,
    };
  } finally {
    await db.close(); // Always close the DB, even if something throws
  }
}
```

Let’s check step-by-step what we have inside the tool:
```typescript
const db = await openDB();
```

Opens a connection to the database. It assumes openDB() is a simple function that returns an SQLite database handle.

Next, we check the cache (memoCache, a simple Map declared outside the handler):

- If we have seen this exact question before, we skip the expensive questionToSQL call.
- Otherwise, we generate the SQL once and cache it forever (or until the server restarts).

```typescript
const sql = await questionToSQL(question);
```

This is the key part. questionToSQL is an async function that calls our local LLM with a prompt and returns just the SQL string. The handler then executes that SQL with db.all(sql), getting all rows back as an array of objects.
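For reference, the two helpers the handler relies on are assumed to look roughly like this. This is a sketch with an in-memory stub; in the real server, openDB would open the SQLite file via the sqlite/sqlite3 packages:

```typescript
// Sketch of the helpers assumed by the tool handler. The Db interface mirrors
// just the part of the `sqlite` package's Database that the handler uses.
export interface Db {
  all(sql: string): Promise<Record<string, unknown>[]>;
  close(): Promise<void>;
}

// question text -> generated SQL, shared across requests (cleared on restart)
export const memoCache = new Map<string, string>();

export async function openDB(): Promise<Db> {
  // Real implementation (assumption):
  //   return open({ filename: "./database.sqlite", driver: sqlite3.Database });
  // Stub so this sketch stands alone:
  return {
    all: async (_sql: string) => [],
    close: async () => {},
  };
}
```

Because the cache key is the exact question text, even a one-character change in the question triggers a fresh LLM call.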
Full code:
```typescript
const server = new McpServer({
  name: "demo-server",
  version: "1.0.0",
});

server.registerTool(
  "sql-query-tool",
  {
    title: "SQL Query Tool",
    description:
      "Ask a question in natural language and receive the SQL query and the result",
    inputSchema: { question: z.string() },
    outputSchema: { sql: z.string(), result: z.array(z.any()) },
  },
  async ({ question }) => {
    const db = await openDB();
    let output: { sql: string; result: any[] };
    try {
      if (memoCache.has(question)) {
        const sql = memoCache.get(question)!;
        const rows = await db.all(sql);
        output = { sql, result: rows };
      } else {
        const sql = await questionToSQL(question);
        const rows = await db.all(sql);
        memoCache.set(question, sql);
        output = { sql, result: rows };
      }
      return {
        content: [{ type: "text", text: JSON.stringify(output) }],
        structuredContent: output,
      };
    } finally {
      await db.close();
    }
  },
);

app.post("/mcp", async (req, res) => {
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined,
    enableJsonResponse: true,
  });
  res.on("close", () => {
    transport.close();
  });
  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
});
```

In summary, our server receives a question, checks whether we already have its SQL cached, generates and executes the query against the database if necessary, and sends the results back.
question-to-sql.ts
In this file, we have the questionToSQL function, which is the core of our app, as it handles generating SQL queries from natural language questions. To do this, the function:
- Defines a prompt with detailed instructions on how to interpret and transform the question into a valid SQL query;
- Uses the generateChatAnswer function to obtain the chat response from our downloaded model, "qwen2.5:latest";
- Processes the response to extract only the content of the SQL query.
````typescript
import { Ollama } from "ollama";

const ollama = new Ollama({ host: "http://localhost:11434" });

const generateChatAnswer = async (prompt: string) => {
  const response = await ollama.chat({
    model: "qwen2.5:latest",
    messages: [{ role: "user", content: prompt }],
  });
  return response.message.content.trim();
};

export async function questionToSQL(question: string): Promise<string> {
  const prompt = `
You are an expert SQL query generator. Your task is to take a natural language question from the user and convert it into a valid SQL query that can be executed on one or more database tables.
You must infer the tables, their schemas, and appropriate join types (LEFT, RIGHT, INNER, OUTER) dynamically based on the context of the question, without relying on a static or predefined table schema.

### Instructions:
- Identify the most relevant table(s) and their likely columns based on the natural language question. Assume standard naming conventions for tables and columns (e.g., "users" for user-related data, "id" for primary keys, "name" for text fields, etc.) if not explicitly provided.
- If the question references specific table names, prioritize them. Otherwise, infer table names from context (e.g., "employees" for staff-related queries, "orders" for purchase-related queries, "departments" for organizational data).
- Assume reasonable column names and data types based on the question's context (e.g., "salary" as DECIMAL for employee pay, "created_at" as DATETIME for timestamps, "user_id" as INTEGER for foreign keys).
- Detect when multiple tables are implied and infer appropriate join types (LEFT, RIGHT, INNER, OUTER) based on the relationships described in the question. Use foreign keys (e.g., "user_id", "order_id") to establish relationships between tables when needed.
- Generate only the SQL query. Do not include explanations, additional text, or comments.
- Ensure the query is syntactically correct and uses standard SQL.
- Handle aggregations (e.g., COUNT, SUM), filters (e.g., WHERE), sorting (e.g., ORDER BY), and joins (LEFT, RIGHT, INNER, OUTER) only if explicitly or implicitly needed based on the inferred schema and question.
- If the question is too ambiguous or cannot be translated into a valid query based on reasonable assumptions, output: "Invalid question for the inferred table(s)."
- Output the query in a code block for easy copying.

User question: ${question}

Generated SQL query output (only the SQL raw query without break lines):
`;
  const sqlContent = await generateChatAnswer(prompt);
  // Strip the ```sql ... ``` fences the model was told to emit
  const codeBlockMatch = sqlContent
    .trim()
    .split("```sql\n")
    .join("")
    .split("\n```")
    .join("");
  return codeBlockMatch;
}
````

Note: The more detailed our prompt is, the more accurate the final response will be. Try new prompts and improvements during your development.
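If you prefer a more explicit cleanup step, the fence stripping could also be written as a small standalone helper. This is an alternative sketch, not from the original code:

````typescript
// Extract the SQL from a ```sql ... ``` fenced block; if the model returned
// plain SQL with no fences, fall back to the raw (trimmed) text.
export function extractSQL(raw: string): string {
  const match = raw.trim().match(/```sql\s*([\s\S]*?)\s*```/);
  return (match ? match[1] : raw).trim();
}
````

Being a pure function, it is trivial to unit-test against the different output shapes the model may produce.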
Practical Example
If the input is a question such as: "Select all users with email with domain @yahoo.com", it will generate something similar to:
```sql
SELECT * FROM users WHERE email LIKE '%@yahoo.com';
```

Frontend app
So, with our MCP server implemented, let's see how to integrate it into our chat app, nicknamed "GPTo Database Chat".
Our app is very straightforward in its features, basically allowing the user to:
- Send a question to our database and obtain the SQL query and the returned data;
- Automatically save the questions asked in the browser’s local storage history.
To implement the features above, we used React.js (with TypeScript) together with React Router.
The part of the app most relevant to this post is the integration with our MCP server's API through the custom React hook useMcpSql: a simple wrapper around the useMcp hook that our UI uses to interact with the MCP server.
```typescript
import { useMcp } from "use-mcp/react";

export interface ChatResponse {
  sql: string;
  result: unknown[];
}

export function useMcpSql() {
  const { state, retry, error, callTool } = useMcp({
    url: "http://localhost:3001/mcp",
    clientName: "MCP-SQL-Chat",
    autoReconnect: true,
  });

  const isError = state === "failed";

  const call = async (question: string) => {
    try {
      const result = await callTool("sql-query-tool", { question });
      return result.structuredContent as ChatResponse;
    } catch (error) {
      console.error(error);
      throw error;
    }
  };

  return {
    retry,
    isError,
    error,
    call,
  };
}
```

The useMcp hook is very useful because it provides a good abstraction for instantiating an MCP client to consume any MCP server (local or remote), with states and helpful methods. We only need these four: state, retry, error, and callTool.
- state: used to check whether the request made with callTool was successful or not;
- retry: used by our UI to repeat the same prompt/question request in case of failure;
- error: the exception content itself;
- callTool: the main method, which calls the tool defined on the server (sql-query-tool) with the question parameter; it is invoked by the "Ask Question" UI button.
Finally, the end result is the rendering of the SQL query and its results in table format for better visualization.
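Rendering the rows is straightforward once you derive the column names from the result objects. A small helper along these lines (an illustrative sketch; the name toTable is an assumption, not from the repo) turns the untyped rows into headers plus string cells:

```typescript
// Derive table headers from the first row's keys and stringify every cell,
// so the UI can render any query result without knowing the schema upfront.
export function toTable(
  rows: Record<string, unknown>[],
): { columns: string[]; data: string[][] } {
  const columns = rows.length > 0 ? Object.keys(rows[0]) : [];
  const data = rows.map((row) => columns.map((col) => String(row[col] ?? "")));
  return { columns, data };
}

// Example:
// toTable([{ id: 1, email: "ana@yahoo.com" }])
// // -> { columns: ["id", "email"], data: [["1", "ana@yahoo.com"]] }
```

The UI can then map columns to a header row and data to body rows in a plain HTML table.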

Wrapping up
It’s simple to use and integrate a local LLM via an MCP server, right? I hope this post has made it clearer what LLMs actually are and how we can go beyond the common cloud offerings (ChatGPT, Grok, Copilot, etc.) by adopting open-source models provided by the community. These can be installed locally to solve everyday tasks, like the natural language SQL queries demonstrated in this post, without depending on giant models that require your credit card!
That’s all for today, folks! Until next time!
Check out the complete code by clicking on this link
We want to work with you. Check out our Services page!

