How to Use OpenClaw for Your Data Team: Self-Hosted Setup, Skills, and Hosted Alternative

OpenClaw is usually pitched as a personal AI assistant, the kind of thing you point at your inbox or Spotify. That framing is why most data leads ignore it. The interesting part for a data team is the same part everyone is talking past: it is a self-hosted gateway between messaging apps and an LLM agent, with persistent memory and a skills system. Which means you can park one in Slack, give it your tools, and let it run the boring half of the week.

In This Article

  1. What OpenClaw actually is
  2. Why a data team should care
  3. My setup: OpenClaw on a Hostinger VPS
  4. Cost split: heartbeat vs primary
  5. Hosted alternative: KiloClaw
  6. Skills I actually run for data work
  7. OpenClaw vs Claude Code vs a plain Slack bot
  8. What OpenClaw is not
  9. Where to go from here

What OpenClaw actually is

OpenClaw is an open-source agent runtime by Peter Steinberger (MIT licensed, mascot is a space lobster called Molty). The runtime sits between chat channels (Slack, Telegram, WhatsApp, Discord, iMessage, Matrix, Teams, Signal) and any LLM you wire up. Configuration lives in ~/.openclaw/openclaw.json. There is a CLI for setup and operations and a browser Control UI on port 18789. There is no vendor SaaS in the loop. The agent runs where you put it.

The shape of a deployment, in one diagram:

     CHANNELS              RUNTIME              TOOLS
   ┌──────────┐          ┌──────────┐         ┌──────────┐
   │  slack   │ ───┐     │          │     ┌── │  notion  │
   ├──────────┤    │     │ openclaw │     │   ├──────────┤
   │ telegram │ ───┼───▶ │ + skills │ ───▶├── │   jira   │
   ├──────────┤    │     │ + memory │     │   ├──────────┤
   │ whatsapp │ ───┤     │          │     ├── │  fathom  │
   ├──────────┤    │     │  llm     │     │   ├──────────┤
   │ discord  │ ───┘     │  router  │     └── │warehouse │
   └──────────┘          └──────────┘         └──────────┘

Why a data team should care

An always-on agent cleanly fills three slots in a data lead’s week. The first is recurring stakeholder updates: a Monday digest pulled from Jira, recent meetings, and a Notion roadmap, written in your voice and posted to Slack. The second is on-call notifications with hands: instead of a pager that just yells, the agent pulls the failing connector’s last sync log, pings the right channel, and drafts the ticket. The third is the ad-hoc pull: someone asks “what was Q1 churn by plan?” in Slack, the agent runs the query against your warehouse helper, and answers in-thread. None of this needs a new dashboard. It needs an agent with the right tools and a stable address to message.
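The ad-hoc pull only works if the agent has a helper it can call safely. Here is a minimal sketch of what such a warehouse helper could look like. Everything in it is an assumption for illustration: the `run_readonly_query` name, the `subscriptions` table, and the in-memory SQLite database standing in for a real warehouse.

```python
import sqlite3

# Conservative allow-list: anything that is not a plain read is rejected.
ALLOWED_PREFIXES = ("select", "with")


def run_readonly_query(conn, sql, max_rows=50):
    """Run one read-only statement and return (column_names, rows)."""
    statement = sql.strip().rstrip(";")
    if ";" in statement:
        raise ValueError("only one statement per call")
    if not statement.lower().startswith(ALLOWED_PREFIXES):
        raise ValueError("only SELECT queries are allowed")
    cursor = conn.execute(statement)
    columns = [col[0] for col in cursor.description]
    # Cap the result so a chat reply never turns into a table dump.
    return columns, cursor.fetchmany(max_rows)


def format_for_chat(columns, rows):
    """Render a small result set as plain text for an in-thread reply."""
    lines = [" | ".join(columns)]
    lines += [" | ".join(str(value) for value in row) for row in rows]
    return "\n".join(lines)


def demo_connection():
    """Hypothetical stand-in data; a real helper would connect to the warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subscriptions (plan TEXT, churned INTEGER)")
    conn.executemany(
        "INSERT INTO subscriptions VALUES (?, ?)",
        [("basic", 1), ("basic", 0), ("pro", 0), ("pro", 0)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    cols, rows = run_readonly_query(
        conn,
        "SELECT plan, AVG(churned) AS churn_rate "
        "FROM subscriptions GROUP BY plan ORDER BY plan",
    )
    print(format_for_chat(cols, rows))
```

The prefix check is a guard rail, not a security boundary. In a real deployment the stronger control is a database role that only has read access, with a check like this layered on top.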

My setup: OpenClaw on a Hostinger VPS

I run OpenClaw on a Hostinger VPS using their official gateway image. The container sits behind a Tailscale address, so the agent is reachable from my laptop and phone but not from the open internet. Docker Compose mounts a workspace directory, and the runtime exposes the gateway on a single port. I keep my helper scripts in a private repo and sync them onto the VPS with a five-minute cron pull, so anything I add on the laptop is on the box by lunch. Secrets ride along encrypted with age, decrypted on the VPS with a key that never leaves the host.

The compose file is short:

# /docker/openclaw/docker-compose.yml
services:
  openclaw:
    image: ghcr.io/hostinger/hvps-openclaw:latest
    restart: unless-stopped
    env_file: .env
    ports:
      - "${PORT}:${PORT}"
    volumes:
      - ./data:/data
      - /workspace:/data/.openclaw/workspace

And the cron job that keeps helpers fresh:

# on the VPS host, crontab -e
# (the entry stays on one line: cron does not support backslash line continuation)
*/5 * * * * docker exec openclaw bash -c 'cd /data/.claude/utils-repo && git pull -q && bash scripts/post-pull.sh'
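The age-encrypted secrets mentioned above have to be decrypted somewhere on the host after each pull. The sketch below shows one way that step could look in Python. It is not the contents of the actual post-pull script; the function names and the example file paths are hypothetical. The `age` flags used (`--decrypt`, `--identity`, `--output`) are the standard CLI options.

```python
import shutil
import subprocess
from pathlib import Path


def age_decrypt_command(encrypted, identity, output):
    """Build the argv list for decrypting one file with the age CLI."""
    return [
        "age",
        "--decrypt",
        "--identity", str(identity),  # private key that never leaves the host
        "--output", str(output),
        str(encrypted),
    ]


def decrypt_secrets(encrypted, identity, output):
    """Decrypt the secrets file and restrict it to the owning user."""
    if shutil.which("age") is None:
        raise RuntimeError("age binary not found on PATH")
    subprocess.run(age_decrypt_command(encrypted, identity, output), check=True)
    Path(output).chmod(0o600)


if __name__ == "__main__":
    # Hypothetical paths, shown only to illustrate the call.
    print(" ".join(age_decrypt_command("secrets.env.age", "age.key", ".env")))
```

Passing the command as a list rather than a shell string means file names with spaces or odd characters cannot change what gets executed.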

Cost split: heartbeat vs primary

OpenClaw runs two model classes at once. The “primary” handles real reasoning. The “heartbeat” runs the cheap chatter that keeps an agent alive across sessions and routes inbound messages. If you put a frontier model on the heartbeat slot you will burn through credits while idle. I run primary as openai-codex/gpt-5.4 through a ChatGPT OAuth login, which means inference comes out of an existing Plus subscription instead of a metered API key. Heartbeat runs on Moonshot Kimi K2.6, which is an order of magnitude cheaper per million tokens. Embeddings for memory search run on text-embedding-3-small. The split is one config change and it cuts a quiet $50 a month off the bill.

The exact commands I ran, followed by a status check:

$ openclaw models auth login --provider openai-codex --set-default
# opens a ChatGPT OAuth flow, paste back the redirect URL

$ openclaw models set moonshot/kimi-k2.6 --slot heartbeat
$ openclaw models status --plain
primary:    openai-codex/gpt-5.4
heartbeat:  moonshot/kimi-k2.6
embeddings: openai/text-embedding-3-small
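To see why the heartbeat slot matters, it helps to run the arithmetic. The sketch below uses made-up inputs: the call frequency, tokens per call, and both per-token prices are hypothetical placeholders, chosen only so that the cheap model is ten times cheaper, in line with the "order of magnitude" gap described above. Plug in your own numbers from your provider's pricing page.

```python
def monthly_cost_usd(calls_per_day, tokens_per_call, usd_per_million_tokens, days=30):
    """Estimated monthly spend for a slot that fires a fixed number of calls per day."""
    total_tokens = calls_per_day * tokens_per_call * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


# All four values below are hypothetical, for illustration only.
CALLS_PER_DAY = 288          # assumes one heartbeat every 5 minutes
TOKENS_PER_CALL = 2_000      # assumed prompt plus completion size
FRONTIER_USD_PER_M = 5.00    # placeholder price for a frontier model
BUDGET_USD_PER_M = 0.50      # placeholder price, 10x cheaper

if __name__ == "__main__":
    frontier = monthly_cost_usd(CALLS_PER_DAY, TOKENS_PER_CALL, FRONTIER_USD_PER_M)
    budget = monthly_cost_usd(CALLS_PER_DAY, TOKENS_PER_CALL, BUDGET_USD_PER_M)
    print(f"frontier heartbeat: ${frontier:.2f}/month")
    print(f"budget heartbeat:   ${budget:.2f}/month")
    print(f"difference:         ${frontier - budget:.2f}/month")
```

With these placeholder inputs the idle heartbeat alone costs roughly $86 a month on the expensive model and under $9 on the cheap one. The real figures depend entirely on how chatty your deployment is.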

Hosted alternative: KiloClaw

If a VPS is more yak-shaving than your current week can absorb, the team at Kilo runs a hosted version called KiloClaw, built on the same OpenClaw runtime. It is cloud-hosted and always on, you assign tasks through Slack, Discord, Telegram or the web, and you pay with Kilo credits that cover both hosting and inference. There is a 7-day free trial with no card required. If you do want a discount on the first paid month, KILOPARTNERSMAY at Stripe checkout takes 50% off (expires May 31, 2026). The honest tradeoff: you give up the model-routing tricks above, but you skip the VPS, the Tailscale config, and the cron job that worries about helper drift.

Skills I actually run for data work

Skills in OpenClaw are small markdown prompts wired to tools, stored under ~/.openclaw/skills/. Mine fall into three buckets. A “humanize” skill that takes a draft executive update and strips the AI tells (em-dashes, “let’s dive in”, suspiciously parallel three-bullet lists) before it reaches a stakeholder. A LinkedIn analytics skill that pulls my Shield CSV, runs seven slices on it (post type, day-of-week, length, hashtags, themes, trends, top-ranked posts) and writes a markdown report into the vault. And a generic “weekly digest” skill that joins meeting recaps from Fathom with active Jira tickets and Notion roadmap pages, then drafts the update in my voice. None of these are clever. They are five-paragraph markdown files. What pays off is the runtime keeping them warm and reachable from any chat app.

The weekly-digest skill, trimmed for the post:

# ~/.openclaw/skills/weekly-digest.md
---
name: weekly-digest
description: Compile last week's data team progress into a stakeholder digest.
tools: [bash, fathom, jira, notion]
---

You are drafting a weekly digest for a non-technical exec.

Steps:
1. Pull last 7 days of recordings via the fathom helper, summarize calls
   that mention data, dashboards, metrics, or pipelines.
2. List Jira tickets in project DATA closed since Monday.
3. Read the "Roadmap" Notion page, note any deadline shifts.
4. Draft 3 short paragraphs: shipped, in-flight, blockers.

Output rules:
- No em-dashes. No "Let's dive in". No "leverage".
- Numbers always with units, never raw counts.
- End with one open question for the exec.
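Prompt rules like the output rules above are requests, not guarantees, so it can be worth checking the draft mechanically before it is posted. The sketch below is one hypothetical way to do that; `lint_digest` is an illustrative name and not part of OpenClaw. It covers the em-dash rule, the banned phrases, and the closing question. The "numbers with units" rule is left out because it is hard to check reliably with string matching.

```python
BANNED_PHRASES = ("let's dive in", "leverage")
EM_DASH = "\u2014"
CURLY_APOSTROPHE = "\u2019"


def lint_digest(text):
    """Return a list of rule violations found in a drafted digest."""
    problems = []
    if EM_DASH in text:
        problems.append("contains an em-dash")
    # Normalize case and curly apostrophes so "Let’s" matches "let's".
    lowered = text.lower().replace(CURLY_APOSTROPHE, "'")
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            problems.append(f"contains banned phrase: {phrase}")
    if not text.rstrip().endswith("?"):
        problems.append("does not end with a question")
    return problems


if __name__ == "__main__":
    draft = (
        "We shipped 3 dashboards last week. Two pipelines are in flight. "
        "Should we prioritize the churn model next?"
    )
    print(lint_digest(draft) or "draft passes")
```

A non-empty result could be fed back to the agent as a revision request, or simply block the post until a human has looked at it.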

OpenClaw vs Claude Code vs a plain Slack bot

Quick read on where each one actually lands when a data team picks one:

              OpenClaw             Claude Code              Slack bot
Interface     chat apps            terminal CLI             slack only
State         persistent memory    session only             stateless
Tools         any helper or MCP    any repo command         fixed actions
Model choice  any provider         any provider             vendor pinned
Best fit      team ops, on-call    code review, refactors   simple alerts

What OpenClaw is not

It is not a code-review agent. If you want PR comments on your dbt models, use Claude Code or a similar IDE-attached CLI. It is not a batch ETL runner. Airflow, Dagster and Prefect still own that. It is not a BI replacement. Metabase and Superset still answer “show me the dashboard”. OpenClaw is the chat-shaped layer on top of those. Pretending it can replace the underlying tools is the fastest way to a six-month migration that delivers nothing.

FAQ

Does OpenClaw work with dbt or Metabase out of the box? Not directly. There is no first-party dbt or Metabase integration. You wire it up by giving the agent a shell tool and a small skill that knows how to call dbt run, query the Metabase REST API, or hit your warehouse. The runtime supplies the loop, you supply the glue.
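As an illustration of that glue, here is a minimal sketch of a helper a skill could call to run a single dbt model. The function names and the `stg_orders` model are hypothetical. The only dbt-specific part is the standard `dbt run --select <model>` invocation.

```python
import re
import subprocess

# Accept only plain identifiers, since the model name may come from a chat message.
MODEL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def dbt_run_command(model):
    """Build the argv list for running one dbt model."""
    if not MODEL_NAME.fullmatch(model):
        raise ValueError(f"unexpected model name: {model!r}")
    return ["dbt", "run", "--select", model]


def run_dbt_model(model, project_dir):
    """Run the model and return (exit_code, last lines of output) for the agent to relay."""
    result = subprocess.run(
        dbt_run_command(model),
        cwd=project_dir,
        capture_output=True,
        text=True,
    )
    tail = "\n".join(result.stdout.splitlines()[-10:])
    return result.returncode, tail


if __name__ == "__main__":
    # Hypothetical model name; prints the command without executing dbt.
    print(" ".join(dbt_run_command("stg_orders")))
```

Validating the name and passing an argv list instead of a shell string keeps a creatively worded Slack message from turning into an arbitrary command on the host.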

Can I run OpenClaw without writing my own skills? Yes. The default install has enough channels, memory, and tool-calling that a plain agent in Slack already does useful work. Skills are how you sharpen it for your team.

OpenClaw vs Claude Code, what is the difference? Claude Code is a CLI that lives in your terminal next to your repo. It is great for code work and bad at being on-call. OpenClaw lives in chat apps and is great at being on-call but does not pretend to be your IDE. Different shapes, different jobs.

Does OpenClaw support MCP? Yes, including a managed browser through the Chrome DevTools MCP. You can plug in any MCP server you already run.

Where to go from here

If your team has the data stack but no one to put an agent on top of it, this is the kind of work Valiotti Data runs as a fractional CDO engagement. We bring an OpenClaw setup, a starter skills pack, and the cost-routing config into your repo in the first 30 days, plus an on-call runbook. Get in touch if you want a working setup instead of another platform conversation.
