How to Use MCP Toolbox for Databases with Claude Code: BigQuery Setup, Tools, and a Read-Only Production Pattern

Most data teams in 2026 still talk to BigQuery through brittle glue: a notebook with hardcoded SQL, a half-broken dbt project, or a Slack bot that knows three queries. Google quietly shipped MCP Toolbox for Databases last year (formerly Gen AI Toolbox for Databases), and Claude Code reads MCP natively. Wire them together and you get a SQL pair-programmer that actually knows your schema. Here is the working setup, the nine BigQuery tools you get out of the box, and the read-only pattern I run in production.

In This Article

  1. What MCP Toolbox for Databases actually is
  2. Why a data team should care
  3. Setup in five commands
  4. The .mcp.json that Claude Code reads
  5. The nine prebuilt BigQuery tools
  6. Custom tools: when prebuilt is not enough
  7. Local toolbox vs managed MCP server
  8. Read-only mode for production
  9. What MCP Toolbox is not

What MCP Toolbox for Databases actually is

MCP Toolbox for Databases is an open-source MCP server from Google. The repo lives at github.com/googleapis/mcp-toolbox, the docs are at mcp-toolbox.dev, and the latest release at the time of writing is v1.2.0. It connects AI agents, IDEs and applications directly to enterprise databases, BigQuery included, and serves a dual purpose:

  • Prebuilt tools (build-time): launch with --prebuilt=<database> and you instantly get a set of generic tools your agent can call. No YAML, no boilerplate.
  • Custom tools framework (run-time): define your own SQL tools in a tools.yaml file with sources, tools, toolsets and prompts. This is what production agents use.

The wire picture for the BigQuery prebuilt path:

Claude Code (client, .mcp.json) → stdio MCP, JSON → MCP Toolbox (bridge, --prebuilt=bigquery) → BigQuery API, rows + jobs → BigQuery (warehouse, datasets · tables)
Auth via gcloud ADC: ~/.config/gcloud/application_default_credentials.json

Claude Code calls a tool. The toolbox translates it into a BigQuery API call against the project you already authenticated to with gcloud. Results flow back as structured JSON that the agent can reason over.
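
To make "calls a tool" concrete, here is a minimal sketch of what a tool call looks like on the wire. MCP is JSON-RPC 2.0, and the stdio transport sends one JSON message per line. Claude Code builds these messages for you; the helper names below are mine, and the "dataset" argument name is an assumption, so check the input schema the tool advertises.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids only need to be unique per session


def tool_call_request(tool: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def frame_for_stdio(message: dict) -> bytes:
    """Serialize one message as a single newline-terminated line of JSON."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


# "dataset" is an assumed argument name, not confirmed by the article.
request = tool_call_request("list_table_ids", {"dataset": "analytics_prod"})
print(frame_for_stdio(request).decode(), end="")
```

The toolbox answers with a JSON-RPC response carrying the same id, which is how the client pairs results with requests.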

Why a data team should care

Three places where this beats writing your own glue. The first is schema discovery. Instead of being taught INFORMATION_SCHEMA gymnastics, the agent calls list_dataset_ids, then list_table_ids, then get_table_info, and writes the query against the real schema instead of guessing column names. The second is grounded SQL: ask_data_insights takes a question in plain English, looks at the catalog and returns SQL that compiles. The third is the set of analysis primitives Google bolted on: forecast for time series and analyze_contribution for key-driver analysis between two periods. Each used to be a notebook; now you get them as tool calls.
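
The discovery sequence is easier to see as code. This is a sketch only: call_tool stands in for whatever MCP client you use, fake_call_tool is a hypothetical stub so the example runs offline, and the argument names and result shapes are assumptions rather than the toolbox's documented output.

```python
from typing import Any, Callable, Dict, List

ToolCaller = Callable[[str, Dict[str, Any]], Any]


def discover_schema(call_tool: ToolCaller, dataset: str) -> Dict[str, List[str]]:
    """Return {table: [column names]} for one dataset via the discovery tools."""
    if dataset not in call_tool("list_dataset_ids", {}):
        raise LookupError(f"dataset {dataset!r} is not visible to this identity")
    schema = {}
    for table in call_tool("list_table_ids", {"dataset": dataset}):
        info = call_tool("get_table_info", {"dataset": dataset, "table": table})
        # Assumed result shape: {"schema": {"fields": [{"name": ...}, ...]}}
        schema[table] = [field["name"] for field in info["schema"]["fields"]]
    return schema


def fake_call_tool(name: str, args: Dict[str, Any]) -> Any:
    """Hypothetical stand-in for a real MCP client; returns canned results."""
    if name == "list_dataset_ids":
        return ["analytics_prod"]
    if name == "list_table_ids":
        return ["signup_events"]
    if name == "get_table_info":
        return {"schema": {"fields": [{"name": "user_id"}, {"name": "traffic_source"}]}}
    raise ValueError(f"unknown tool: {name}")


print(discover_schema(fake_call_tool, "analytics_prod"))
```

The point of the ordering is that every later call is parameterized by the result of an earlier one, so the agent never has to guess a table or column name.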

Setup in five commands

From a clean macOS laptop with the gcloud CLI installed to a working agent in under five minutes. Replace YOUR_PROJECT with your BigQuery project ID. The VERSION, PLATFORM and BASE variables keep the download URL short:

$ gcloud auth application-default login
$ gcloud config set project YOUR_PROJECT
$ export VERSION=1.2.0 PLATFORM=darwin/arm64
$ BASE=https://storage.googleapis.com/mcp-toolbox-for-databases
$ curl -L -o toolbox "$BASE/v$VERSION/$PLATFORM/toolbox"
$ chmod +x ./toolbox && ./toolbox --version

The first command writes ADC credentials to ~/.config/gcloud/application_default_credentials.json. The toolbox reads from that file at runtime, so you do not need to paste a service account key into your repo. For Linux or Windows, swap darwin/arm64 for the right platform from the releases page. If you would rather skip the binary install, npx -y @toolbox-sdk/server runs the same server on demand.
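
If you script the install across machines, picking the platform segment by hand gets old. A small sketch that derives the download URL from the same pattern as the curl command above. The mapping from Python's platform names to release directories is my assumption, and Windows is left out because the article does not confirm the file name there; check the releases page before relying on it.

```python
import platform

# Bucket URL taken from the curl command in the setup steps.
BASE = "https://storage.googleapis.com/mcp-toolbox-for-databases"

# Assumed mapping from platform.system()/platform.machine() to release dirs.
_OS = {"Darwin": "darwin", "Linux": "linux"}
_ARCH = {"arm64": "arm64", "aarch64": "arm64", "x86_64": "amd64"}


def download_url(version: str, system: str, machine: str) -> str:
    """Build the toolbox download URL for a version and platform."""
    try:
        target = f"{_OS[system]}/{_ARCH[machine]}"
    except KeyError as exc:
        raise ValueError(f"unmapped platform: {system}/{machine}") from exc
    return f"{BASE}/v{version}/{target}/toolbox"


if __name__ == "__main__":
    print(download_url("1.2.0", platform.system(), platform.machine()))
```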

The .mcp.json that Claude Code reads

Drop this into your project root or your global Claude Code config. The npx form is the simplest path:

{
  "mcpServers": {
    "toolbox-bigquery": {
      "command": "npx",
      "args": [
        "-y",
        "@toolbox-sdk/server",
        "--prebuilt=bigquery"
      ],
      "env": {
        "BIGQUERY_PROJECT": "YOUR_PROJECT",
        "BIGQUERY_MAX_QUERY_RESULT_ROWS": "200"
      }
    }
  }
}

If you prefer the binary you downloaded above, swap "command": "npx" for "command": "./toolbox" and drop the -y and @toolbox-sdk/server args. The default row cap is 50; the override above lifts it to 200, which is more useful for real exploration. Restart Claude Code and type /mcp. You should see toolbox-bigquery connected with nine tools registered. If it shows zero tools, the most likely cause is an expired ADC token: run gcloud auth application-default login again. The same JSON shape works for Gemini CLI, Cursor, Windsurf, VS Code Copilot, Cline, Codex and Google Antigravity.
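
When you maintain this config for several projects, generating it beats hand-editing JSON. A minimal sketch that emits both the npx and the local-binary variants described above. The function name and the extra_args parameter are mine. Earlier Toolbox releases needed an explicit --stdio flag to speak MCP over stdio; whether your version does is an assumption to verify, which is why it is passed in rather than hardcoded.

```python
import json
from typing import Optional, Sequence


def mcp_config(project: str, max_rows: int = 200,
               binary: Optional[str] = None,
               extra_args: Sequence[str] = ()) -> dict:
    """Return the .mcp.json structure; `binary` switches from npx to a local toolbox."""
    if binary is None:
        command, args = "npx", ["-y", "@toolbox-sdk/server", "--prebuilt=bigquery"]
    else:
        command, args = binary, ["--prebuilt=bigquery"]
    return {
        "mcpServers": {
            "toolbox-bigquery": {
                "command": command,
                "args": args + list(extra_args),
                "env": {
                    "BIGQUERY_PROJECT": project,
                    # Environment values are strings, so the row cap is stringified.
                    "BIGQUERY_MAX_QUERY_RESULT_ROWS": str(max_rows),
                },
            }
        }
    }


print(json.dumps(mcp_config("YOUR_PROJECT", binary="./toolbox"), indent=2))
```

Redirect the output into .mcp.json at the project root, then restart Claude Code as usual.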

The nine prebuilt BigQuery tools

Worth knowing the surface area before you start prompting. These are the exact tool names registered by --prebuilt=bigquery, grouped by what you would use them for. Google's own prebuilt config also bundles them into two toolsets, data and analytics, which you can target from a custom agent.

  • Discovery (4 tools): list_dataset_ids, list_table_ids, get_dataset_info, get_table_info. Map the warehouse, surface schemas. Read-safe, zero query cost.
  • SQL & search (2 tools): execute_sql, search_catalog. Run ad-hoc SQL, search the catalog. This is where slot cost shows up.
  • Analytics (3 tools): ask_data_insights, forecast, analyze_contribution. Grounded NL-to-SQL, forecast, key-driver attribution. One call replaces a notebook.

A real prompt I run in Claude Code, end to end:

You are a BigQuery analyst. Tools available via the toolbox-bigquery MCP.

Task: figure out which traffic source drove the biggest week-over-week
change in signups in dataset `analytics_prod`.

1. Use search_catalog to find the signup events table.
2. Use get_table_info to confirm the relevant columns.
3. Use analyze_contribution with last week vs the prior week.
4. Return: top 3 drivers with magnitude and a one-line reading.

Use execute_sql with a LIMIT for any ad-hoc lookup.
Do not write to the warehouse.

Custom tools: when prebuilt is not enough

The custom path lives in a tools.yaml file. Three kinds of entry do all the work (source, tool and toolset):

kind: source
name: my-bq-source
type: bigquery
project: YOUR_PROJECT

---
kind: tool
name: search_release_notes_bq
type: bigquery-sql
source: my-bq-source
description: Get Google Cloud release notes for the past week.
statement: |
  SELECT product_name, description, published_at
  FROM `bigquery-public-data`.`google_cloud_release_notes`.`release_notes`
  WHERE DATE(published_at) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
  ORDER BY published_at DESC

---
kind: toolset
name: my_bq_toolset
tools:
  - search_release_notes_bq

Run it locally with ./toolbox --tools-file tools.yaml --ui and the toolbox spins up on port 5000 with a tiny web UI at http://127.0.0.1:5000/ui for poking at tools before you wire them into an agent. The full taxonomy (parameterized queries, OAuth-gated tools, semantic search) lives in the official docs.
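
As tools.yaml grows, the usual mistakes are a tool pointing at a renamed source or a toolset listing a tool that no longer exists. Below is a sketch of a pre-commit style check, assuming the YAML documents have already been parsed into dicts (for example with PyYAML's yaml.safe_load_all, which is a third-party dependency). lint_tools_config and DOCS are illustrative names, and the toolbox may well report the same problems itself at startup.

```python
from typing import Dict, List

# The three documents from the tools.yaml above, as parsed dicts
# (the SQL statement and description are omitted for brevity).
DOCS = [
    {"kind": "source", "name": "my-bq-source", "type": "bigquery",
     "project": "YOUR_PROJECT"},
    {"kind": "tool", "name": "search_release_notes_bq", "type": "bigquery-sql",
     "source": "my-bq-source"},
    {"kind": "toolset", "name": "my_bq_toolset",
     "tools": ["search_release_notes_bq"]},
]


def lint_tools_config(docs: List[Dict]) -> List[str]:
    """Return problems: tools with unknown sources, toolsets with unknown tools."""
    sources = {d["name"] for d in docs if d.get("kind") == "source"}
    tools = {d["name"] for d in docs if d.get("kind") == "tool"}
    problems = []
    for doc in docs:
        if doc.get("kind") == "tool" and doc.get("source") not in sources:
            problems.append(f"tool {doc['name']}: unknown source {doc.get('source')!r}")
        if doc.get("kind") == "toolset":
            for tool in doc.get("tools", []):
                if tool not in tools:
                    problems.append(f"toolset {doc['name']}: unknown tool {tool!r}")
    return problems


print(lint_tools_config(DOCS) or "tools config references look consistent")
```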

Local toolbox vs managed MCP server

Same tools, different operational model. The split that matters for a data team:

                 Local toolbox     Google Cloud MCP Servers   Custom helper
Where it runs    your laptop       Google Cloud, managed      your repo
Auth             gcloud ADC        OAuth 2.0 plus IAM         service account key
Access policy    your roles        centralized IAM            whatever you wrote
Setup time       5 minutes         30 minutes                 a sprint
Best fit         solo data work    team-wide policy           weird custom needs

For a single data lead exploring a warehouse, local Toolbox is right. For a team where you need an audit trail and per-role tool restrictions, the Google Cloud managed MCP servers hand you the same prebuilt experience without anyone running the binary. Custom helpers I would only build if you have a quirk neither covers, for example a dialect MCP Toolbox does not yet speak.

Read-only mode for production

The default execute_sql tool runs whatever SQL the agent passes it, including INSERT, UPDATE, DELETE and DDL. That is fine for a sandbox and dangerous on a shared warehouse. There are two ways to lock it down. The cheap one: skip the prebuilt config and write a tools.yaml that registers only the read-side tools (list_dataset_ids, list_table_ids, get_dataset_info, get_table_info, search_catalog, ask_data_insights) and never execute_sql. The proper one: attach an IAM deny policy that strips write permissions from the principal Toolbox runs as, so even a confused agent cannot mutate state. Use both, belt and braces, if your warehouse holds anything you cannot rebuild from sources.
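
If you do keep execute_sql around in some environment, a client-side pre-filter is a cheap extra layer. This is a sketch of an allowlist plus a deliberately conservative SQL check. It is not a security boundary and does not replace the IAM deny policy; where you run it (a hook, a proxy in front of the toolbox) is up to you. The function names are mine.

```python
import re

# The read-side tools named in the paragraph above.
READ_ONLY_TOOLS = frozenset({
    "list_dataset_ids", "list_table_ids", "get_dataset_info",
    "get_table_info", "search_catalog", "ask_data_insights",
})

# One leading comment: "-- ...", "# ..." or "/* ... */".
_LEADING_COMMENT = re.compile(
    r"\s*(?:--[^\n]*(?:\n|$)|#[^\n]*(?:\n|$)|/\*.*?\*/)", re.DOTALL
)


def is_tool_allowed(tool: str) -> bool:
    """True only for tools on the read-side allowlist."""
    return tool in READ_ONLY_TOOLS


def is_read_only_sql(sql: str) -> bool:
    """Conservative check: one statement whose first keyword is SELECT or WITH."""
    body = sql.strip().rstrip(";").rstrip()
    if ";" in body:
        # Possible multi-statement script. This also rejects harmless
        # semicolons inside strings or comments, which is the safe direction.
        return False
    match = _LEADING_COMMENT.match(body)
    while match:
        body = body[match.end():]
        match = _LEADING_COMMENT.match(body)
    keyword = re.match(r"[A-Za-z_]+", body.lstrip())
    return bool(keyword) and keyword.group(0).upper() in {"SELECT", "WITH"}


print(is_tool_allowed("execute_sql"), is_read_only_sql("SELECT 1"))
```

The check looks for semicolons before it touches comments on purpose: stripping comments first would let a string literal containing "--" hide a second statement.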

What MCP Toolbox is not

It is not a dashboard. Metabase and Looker still answer “show me the chart”. It is not a modeling layer. dbt and Dataform still own the transformation tier. It is not a way to bypass BigQuery slot costs. Every execute_sql call is a BigQuery job with real billing. Treat it like a junior analyst with admin keys, useful but supervised.

FAQ

Does this work with Claude Desktop too? Yes. Put the same mcpServers block into Claude Desktop's config file (claude_desktop_config.json), restart, and the tools appear. Cursor, Windsurf, VS Code Copilot, Cline, Codex, Gemini CLI and Google Antigravity all read the same MCP format.

Can MCP Toolbox connect to other warehouses? Yes. The same binary supports AlloyDB, Cloud SQL (PostgreSQL, MySQL, SQL Server), Spanner, Firestore, Knowledge Catalog (formerly Dataplex), plus PostgreSQL, MySQL, SQL Server, Oracle, MongoDB, Redis, Elasticsearch, CockroachDB, ClickHouse, Couchbase, Neo4j, Snowflake, Trino and others. Swap --prebuilt=bigquery for the right flag and adjust env vars.

What does this cost? The toolbox itself is free (Apache 2.0). You pay BigQuery slot or on-demand pricing for every query the agent issues, plus your Claude API or subscription tokens. Run get_table_info liberally, and run execute_sql with a row limit until you trust the agent. Keep in mind that under on-demand pricing a LIMIT caps the rows returned, not necessarily the bytes scanned.
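
For on-demand billing the arithmetic is simple enough to sketch: cost scales with bytes processed, which you can get from a BigQuery dry run before the real query. The rate is a parameter on purpose. The 6.25 USD per TiB used in the example is an illustrative figure, not a quoted price, so look up the current rate for your region.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def on_demand_cost_usd(bytes_processed: int, usd_per_tib: float) -> float:
    """Estimate the on-demand cost of a query from the bytes it would process."""
    return bytes_processed / TIB * usd_per_tib


def within_budget(bytes_processed: int, usd_per_tib: float, max_usd: float) -> bool:
    """True if the estimated cost stays at or under a per-query budget."""
    return on_demand_cost_usd(bytes_processed, usd_per_tib) <= max_usd


# Example: a query scanning 200 GiB at an assumed rate of 6.25 USD per TiB.
scanned = 200 * 2 ** 30
print(f"{on_demand_cost_usd(scanned, usd_per_tib=6.25):.2f} USD")
```

A guard like within_budget is most useful in front of execute_sql, where a single unfiltered scan of a large table is the expensive mistake.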

How do I scope it to one dataset? Use IAM. Grant the principal running the toolbox roles/bigquery.dataViewer on the specific dataset and roles/bigquery.jobUser at the project level, nothing higher. The toolbox inherits the auth identity’s permissions, so no extra scoping is needed at the toolbox level.
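
The two grants can be sketched as generated commands, which is handy if you provision more than one dataset. The dataset-level grant uses BigQuery's GRANT ... ON SCHEMA statement and the project-level one uses gcloud projects add-iam-policy-binding. The helper names, the service account and the dataset are placeholders, and an admin identity runs these, not the agent.

```python
import shlex
from typing import List


def dataset_viewer_grant_sql(project: str, dataset: str, member: str) -> str:
    """BigQuery DCL granting read access on one dataset.

    Values are interpolated as-is, so only pass identifiers you control.
    """
    return (
        "GRANT `roles/bigquery.dataViewer` "
        f"ON SCHEMA `{project}`.`{dataset}` "
        f'TO "{member}";'
    )


def job_user_binding_cmd(project: str, member: str) -> List[str]:
    """gcloud argument list granting permission to run jobs in the project."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project,
        f"--member={member}", "--role=roles/bigquery.jobUser",
    ]


# Placeholder principal; substitute the identity your toolbox runs as.
member = "serviceAccount:toolbox@YOUR_PROJECT.iam.gserviceaccount.com"
print(dataset_viewer_grant_sql("YOUR_PROJECT", "analytics_prod", member))
print(shlex.join(job_user_binding_cmd("YOUR_PROJECT", member)))
```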
