
What is Web Analytics?

TL;DR

Web analytics is the practice of capturing, storing, transforming, and acting on data about how people use your website or app. In 2026, that means more than a GA4 tag dropped in the footer. A working analytics stack pulls events from the browser and the server, lands them in a warehouse like BigQuery or Snowflake, models them with dbt, and pushes the results back into the tools your sales, growth, and product teams already use. The companies that get this right know which channel paid for each customer, which feature kept them, and which dashboard nobody reads. The ones that don’t have GA4 open in a tab and a sales team operating on vibes.

In This Article

  1. TL;DR
  2. What web analytics actually is
  3. Why it still matters in 2026
  4. The 4 layers of a modern web analytics stack
  5. Choosing your tool: GA4, Plausible, Mixpanel, Amplitude, PostHog
  6. Cookieless tracking: what actually works in 2026
  7. Metrics that matter
  8. 30-day implementation playbook
  9. Where most companies go wrong
  10. Next steps

What web analytics actually is

After deploying analytics for 50+ companies across SaaS, marketplaces, and consumer subscription, I’ve watched the definition drift. Marketers say web analytics is “the GA4 dashboard”. Engineers say it’s “the events pipeline”. Both are partial.

Web analytics is the discipline that connects four things:

  1. Capture. Events fired from the browser, the server, the mobile app, and third-party tools (Stripe, HubSpot, Intercom). Each event has a timestamp, a user identifier, and a payload.
  2. Storage. A place where every event lands and stays. In 2018 that was a flat CSV. In 2026 it’s a cloud warehouse with petabyte capacity and SQL on top.
  3. Transform. The logic that turns raw events into business concepts: sessions, conversion paths, cohorts, retention curves, LTV. This is where dbt earned its place in the stack.
  4. Activate. Pushing the modeled data back into the systems where people make decisions: BI dashboards, the CRM, the ad platforms, the support tool.

Skip any one of these and you don’t have web analytics. You have telemetry, or a dashboard, or a spreadsheet pretending to be a strategy.
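
To make the capture layer concrete, here is a minimal sketch of the event shape described above: a timestamp, a user identifier, and a payload. The class name, field names, and the `to_row` helper are my own assumptions for illustration, not a schema any particular tool requires.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Event:
    """One captured event (hypothetical schema, for illustration only)."""
    name: str            # e.g. "signup_completed"
    user_id: str         # whatever stable identifier your stack uses
    timestamp: datetime  # must be timezone-aware
    source: str          # "browser", "server", "mobile", "stripe", ...
    payload: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not self.name or not self.user_id:
            raise ValueError("name and user_id are required")
        if self.timestamp.tzinfo is None:
            raise ValueError("timestamp must be timezone-aware")


def to_row(event: Event) -> dict[str, Any]:
    """Flatten an event into a row with the timestamp normalized to UTC."""
    return {
        "event_name": event.name,
        "user_id": event.user_id,
        "event_ts": event.timestamp.astimezone(timezone.utc).isoformat(),
        "source": event.source,
        "payload": event.payload,
    }
```

Rejecting naive timestamps at capture time is a design choice: it is much cheaper than untangling mixed time zones in the transform layer later.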

Why it still matters in 2026

Three shifts have reshaped what “web analytics” means since 2022, and most teams haven’t updated their stack accordingly.

The first is the cookieless era. Safari, Firefox, and Brave block third-party cookies by default. Chrome shipped Privacy Sandbox APIs while repeatedly delaying third-party cookie deprecation, then dropped the deprecation plan in 2024 in favor of user choice, leaving the industry in a transitional state nobody planned for. Client-side GA4 is still blocked or stripped for an estimated 40 to 55 percent of EU traffic and 20 to 30 percent of US traffic, depending on the audience. If your analytics still depends on a JavaScript tag in the browser, you’re flying with one eye closed.

The second is AI Overviews changing the SERP. Google’s AI Overviews now appear above the organic results for the majority of informational queries. Click-through rates on pages ranked 1 through 10 have dropped 25 to 40 percent on those queries since early 2024. The old SEO playbook (rank for keywords, count sessions) no longer reflects what’s actually driving pipeline. You need to track conversation-to-conversion paths, not just keyword positions.

The third is attribution decay. Multi-touch attribution models built in 2019 assumed deterministic cross-device tracking. That assumption is dead. The companies that still report channel-level ROAS down to two decimal places are reporting fiction. The realistic answer in 2026 is a probabilistic model, a holdout test, and humility.
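
The arithmetic behind a holdout test is simple enough to sketch. The function below is a rough illustration, assuming you withheld a channel from a random holdout group and counted conversions in both groups; the function and field names are hypothetical, and a real test also needs a significance check before anyone acts on the number.

```python
def incremental_lift(exposed_users: int, exposed_conversions: int,
                     holdout_users: int, holdout_conversions: int) -> dict[str, float]:
    """Compare conversion rates between the exposed and holdout groups."""
    if exposed_users <= 0 or holdout_users <= 0:
        raise ValueError("both groups need at least one user")
    exposed_rate = exposed_conversions / exposed_users
    holdout_rate = holdout_conversions / holdout_users
    # Conversions the channel plausibly caused, scaled to the exposed group.
    incremental = (exposed_rate - holdout_rate) * exposed_users
    # Relative lift over the baseline; undefined if the holdout never converts.
    relative = (exposed_rate - holdout_rate) / holdout_rate if holdout_rate else float("inf")
    return {
        "exposed_rate": exposed_rate,
        "holdout_rate": holdout_rate,
        "incremental_conversions": incremental,
        "relative_lift": relative,
    }
```

With 10,000 users per group, 300 exposed conversions, and 200 holdout conversions, this reports roughly 100 incremental conversions. That is the number to set against spend, not the 300 the ad platform will claim.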

The teams winning in 2026 treat web analytics as an operational pipeline, not a reporting deliverable. The output of analytics is decisions, not slides.

The 4 layers of a modern web analytics stack

[Figure: the four functional layers of a modern web analytics stack. Capture, events from the world (GA4 / Plausible, server-side GTM, backend / webhooks, Stripe / HubSpot); Storage, one warehouse, one truth (BigQuery, Snowflake, ClickHouse); Transform, raw events into concepts (dbt, Dataform; marts, tests, schedules); Activate, data where decisions happen (BI tools, dashboards, reverse ETL, Meta CAPI, Google EC, LinkedIn API).]

Here’s the stack we deploy when we walk into a fractional engagement and find GA4 alone in the cupboard.

Capture: GA4, Plausible, and server-side GTM

The browser is no longer a reliable place to fire events from. Modern stacks use three capture channels in parallel:

  • Client-side: GA4 for funnels and behavior, or Plausible for privacy-first session counts. Treat this as best-effort.
  • Server-side GTM: a tagging server you run yourself (Cloud Run, App Engine, or a dedicated VM). The browser sends one event to your domain, the server fans it out to GA4, Meta CAPI, TikTok, LinkedIn. This recovers 30 to 50 percent of the conversions that ad blockers eat.
  • Backend events: the source of truth. Stripe webhooks, your app’s signup endpoint, your CRM’s lead-created event. These never get blocked because they never touch a browser.

The combination is what gives you a complete picture. GA4 alone gives you GA4’s picture, which is not the same thing.
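
Running three channels in parallel means the same conversion can arrive up to three times. One way to handle that, sketched below, is to give every conversion a shared `event_id` and keep the copy from the most trustworthy source. The field names and the priority order are assumptions for illustration; your stack may key on an order ID or a transaction ID instead.

```python
# Lower number wins: backend is the source of truth, browser is best-effort.
SOURCE_PRIORITY = {"backend": 0, "server": 1, "browser": 2}


def dedupe_conversions(events: list[dict]) -> list[dict]:
    """Keep one record per event_id, preferring the most reliable source."""
    best: dict[str, dict] = {}
    for event in events:
        key = event["event_id"]
        current = best.get(key)
        if current is None or SOURCE_PRIORITY[event["source"]] < SOURCE_PRIORITY[current["source"]]:
            best[key] = event
    return list(best.values())
```

The records that survive with a `browser` source and no backend twin are worth a look: they are either test traffic or a gap in backend capture.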

Storage: BigQuery, Snowflake, ClickHouse

Pick one warehouse and consolidate. The mistake I see most often is “we have GA4 BigQuery export, Stripe in Redshift, and a Looker Studio dashboard pulling from spreadsheets.” Three sources of truth means zero sources of truth.

  • BigQuery is the default for any team using GA4. The free GA4 export is the single biggest gift Google has given the analytics industry, and ignoring it is malpractice.
  • Snowflake wins when finance is already there and the org needs role-based access at scale.
  • ClickHouse is the right call for very-high-volume event analytics (1B+ events per month) where the query patterns are predictable. It’s 5 to 10× cheaper than BigQuery at that scale, with caveats.

For most companies between $5M and $50M revenue, BigQuery is the correct answer and the conversation is over.

Transform: dbt and Dataform

Raw event data is not analytics. It’s raw event data. The transform layer turns it into the business concepts your stakeholders actually ask about.
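
As an example of what “turning events into concepts” means, here is a small sketch of sessionization: grouping one user’s events into sessions whenever the gap between two events exceeds a cutoff. The 30-minute gap is a common convention I’m assuming here, not a rule. In a real project this logic would usually live in a SQL model with window functions; Python is used only to show the idea.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumption: tune to your product


def assign_sessions(timestamps: list[datetime]) -> list[int]:
    """Return a session index per event for ONE user, in chronological order."""
    session_ids: list[int] = []
    current = 0
    previous = None
    for ts in sorted(timestamps):
        # A long enough silence starts a new session.
        if previous is not None and ts - previous > SESSION_GAP:
            current += 1
        session_ids.append(current)
        previous = ts
    return session_ids
```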

dbt is now the standard. Every serious data team I’ve worked with in the past three years runs dbt, either dbt Core (open-source) or dbt Cloud (managed). The model is simple: SQL files in a git repo, dependency graph, tests, documentation, scheduled runs. If you’re building a transform layer from scratch in 2026, dbt is the answer unless you have a specific reason it isn’t.

Dataform (now part of Google Cloud) is the lighter alternative for BigQuery-only shops. Same idea, tighter integration, less ecosystem.

Activate: BI tools, dashboards, reverse ETL

The last mile. The transformed data has to land in the tools where decisions actually happen.

  • BI: Metabase (open-source, low cost), Looker (Google’s enterprise tool), Mode (analytics-first), Hex (notebooks-meet-BI). Pick one. Two BI tools means political war.
  • Reverse ETL: Hightouch or Census push warehouse data back into Salesforce, HubSpot, Intercom, Braze. This is where the warehouse stops being a black hole and starts paying for itself.
  • Embedded analytics: if you’re building dashboards for customers (not employees), Cube or Preset are the right answer.

Choosing your tool: GA4, Plausible, Mixpanel, Amplitude, PostHog

Five tools dominate the web analytics conversation in 2026. Here’s how to pick.

[Figure: five web analytics tools mapped against analytical depth (shallow to deep) and privacy posture (weak to strong): GA4 (free, ubiquitous, weak privacy), Plausible (privacy-first, depth ceiling), Mixpanel (funnels & product analytics), Amplitude (best-in-class cohorts), PostHog (self-host, all-in-one).]

| Tool | Pricing | Ease of setup | Analytical depth | Privacy posture | Sweet spot | Where it breaks |
|---|---|---|---|---|---|---|
| GA4 | Free; 360 starts at ~$50K/yr | High (snippet + GTM) | Medium; clunky for funnels | Weak; needs consent mode | Marketing teams that need free, ubiquitous tracking | Product analytics, retention cohorts |
| Plausible | $9 to $200/mo | Very high | Low | Strong; no cookies | Privacy-first sites, EU traffic, founder-led startups | Anything beyond pageview counts |
| Mixpanel | Free up to 1M events; $20/mo+ | Medium | High; built for funnels | Medium | Product analytics, feature adoption, B2B SaaS | Marketing attribution, paid channels |
| Amplitude | Free up to 50K MTU; $61K/yr enterprise | Medium | Very high; best-in-class cohorts | Medium | Scaling B2B and B2C, retention focus | Cost at scale, simple sites |
| PostHog | Free self-host; $0 to $450/mo cloud | Medium | High; ships with feature flags + replay | Strong (self-host) | Engineering-led teams that want one tool for everything | Marketing teams used to GA-style reports |

The honest framework: if your team is mostly marketing, start with GA4 plus server-side GTM. If your team is mostly product, start with Mixpanel or PostHog. If you’re a privacy-forward EU brand, start with Plausible and accept the depth ceiling. Don’t try to run two of these in parallel. You’ll spend more time reconciling numbers than acting on them.

Cookieless tracking: what actually works in 2026

The phrase “cookieless tracking” gets thrown around like there’s a clean technical solution. There isn’t. There’s a stack of mitigations, and you implement as many as you need.

Server-side GTM is the single biggest lever. Move your tagging from the browser to a server you control. Your domain serves the tracking endpoint, the browser treats it as first-party, and most ad-blocker filter lists don’t match it. From 15+ production GA4 migrations, server-side GTM typically recovers 30 to 50 percent of the conversions that client-side loses, with the spread depending on audience demographics.

First-party cookies with extended TTL still work where they’re allowed. Configure your server to set a _ga equivalent with SameSite=Lax, Secure, HttpOnly, and a 13-month TTL (the ceiling recommended by regulators such as France’s CNIL; check the rule in your own jurisdiction). This buys you durable user identification within a single browser.
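
Here is a minimal sketch of building that Set-Cookie value with Python’s standard library. The cookie name `fp_vid`, the function name, and the 395-day approximation of 13 months are my assumptions. Consent handling is deliberately left out, so treat this as the mechanics only.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Optional

# Assumption: 13 months approximated as 395 days.
THIRTEEN_MONTHS_SECONDS = 395 * 24 * 60 * 60


def build_visitor_cookie(visitor_id: Optional[str] = None, name: str = "fp_vid") -> str:
    """Return the value for a Set-Cookie header carrying a first-party visitor ID."""
    cookie = SimpleCookie()
    cookie[name] = visitor_id or uuid.uuid4().hex
    morsel = cookie[name]
    morsel["max-age"] = THIRTEEN_MONTHS_SECONDS
    morsel["path"] = "/"
    morsel["secure"] = True      # only sent over HTTPS
    morsel["httponly"] = True    # not readable from page JavaScript
    morsel["samesite"] = "Lax"
    return morsel.OutputString()
```

Your web framework almost certainly has its own cookie helper. The point is the attribute set, and that the server sets the cookie rather than a script in the page.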

Consent Mode v2 is required if you use Google Ads or GA4 for ad measurement and personalization on EEA traffic. In the basic implementation, Google tags stay blocked until the user makes a consent choice, and nothing is sent when consent is denied, so conversion modeling falls back to a general model. In the advanced implementation, tags load before the banner and send cookieless pings without identifiers when consent is denied, which gives Google more signal for conversion modeling. Pick advanced unless you have a privacy reason not to.

Fingerprinting. Stay away. Apple, Mozilla, and the EU have all signaled they treat fingerprinting as a privacy violation. Short-term gains, long-term regulatory risk. Don’t.

Server-to-server conversion APIs. Meta CAPI, Google Enhanced Conversions, TikTok Events API, LinkedIn Conversions API. Each ad platform now ingests conversions directly from your backend, bypassing the browser entirely. Wire these up. They’re the single best ROI work an analytics team can ship in 2026.
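
Conversion APIs generally expect user identifiers such as email to be normalized and SHA-256 hashed before they leave your backend. The sketch below shows only that step. The payload field names are hypothetical and do not match any one platform’s schema, and each platform documents its own normalization rules, so check those before wiring this up.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase; platforms may require further rules."""
    return email.strip().lower()


def hash_identifier(value: str) -> str:
    """SHA-256 hex digest of an already-normalized identifier."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_conversion(email: str, order_id: str, value: float, currency: str) -> dict:
    """Assemble a generic conversion record (field names are illustrative)."""
    return {
        "event_id": order_id,  # reuse the same ID for browser-side dedup
        "hashed_email": hash_identifier(normalize_email(email)),
        "value": round(value, 2),
        "currency": currency.upper(),
    }
```

Hashing after normalization matters: a stray space or capital letter produces a different digest, and the platform silently fails to match the user.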

The realistic expectation: a well-configured stack recovers 60 to 70 percent of what you used to see in the cookie era. Not 100. Anyone selling you 100 is selling fingerprinting.

Metrics that matter

Most analytics dashboards are graveyards of vanity metrics nobody is acting on. Here’s the short list of metrics that actually move the business, by stage.

For a marketing site (pre-product):

  • Sessions by channel (organic, paid, referral, direct, email). Not pageviews. Sessions.
  • Conversion rate by channel. Visitors who became leads, divided by visitors. Tracked per channel because the blended number is meaningless.
  • Cost per qualified lead (CPQL). The only paid-traffic metric that matters once the qualifier is in place.
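
A quick sketch of why the blended number misleads. The row shape (`channel`, `visitors`, `leads`) and the function names are assumptions for illustration.

```python
from typing import Optional


def conversion_by_channel(rows: list[dict]) -> dict[str, float]:
    """Leads divided by visitors, per channel; skips channels with no visitors."""
    return {r["channel"]: r["leads"] / r["visitors"] for r in rows if r["visitors"] > 0}


def blended_rate(rows: list[dict]) -> float:
    """The single site-wide rate that hides the per-channel spread."""
    visitors = sum(r["visitors"] for r in rows)
    leads = sum(r["leads"] for r in rows)
    return leads / visitors if visitors else 0.0


def cost_per_qualified_lead(spend: float, qualified_leads: int) -> Optional[float]:
    """CPQL; None when there are no qualified leads to divide by."""
    return spend / qualified_leads if qualified_leads else None
```

Take organic at 1,000 visitors and 50 leads, and paid at 4,000 visitors and 40 leads. The per-channel rates are 5 percent and 1 percent. The blended rate is 1.8 percent, which describes neither channel.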

For a SaaS product:

  • Activation rate (signup → first value moment, within 24 hours). The single best predictor of week-12 retention.
  • Weekly active users / Monthly active users ratio. Stickiness.
  • Retention cohorts by signup month. The graph that tells you whether the business is actually working.
  • Conversion path from first touch to paid. Track exactly what people did, not what the last-click model claims.
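
The first two metrics are easy to get subtly wrong, so here is a small sketch of both. It assumes you already have, per user, a signup time and the time of their first value moment (however your product defines it); the names are hypothetical.

```python
from datetime import datetime, timedelta

ACTIVATION_WINDOW = timedelta(hours=24)


def activation_rate(signups: dict[str, datetime],
                    first_value: dict[str, datetime]) -> float:
    """Share of signed-up users who hit first value within 24 hours of signup."""
    if not signups:
        return 0.0
    activated = sum(
        1
        for user, signed_up in signups.items()
        if user in first_value
        and timedelta(0) <= first_value[user] - signed_up <= ACTIVATION_WINDOW
    )
    # Denominator is ALL signups, not only the users who ever activated.
    return activated / len(signups)


def stickiness(weekly_active: int, monthly_active: int) -> float:
    """WAU / MAU ratio."""
    return weekly_active / monthly_active if monthly_active else 0.0
```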

For a marketplace or e-commerce site:

  • Repeat purchase rate at 30 / 60 / 90 days.
  • First-purchase contribution margin by acquisition channel.
  • Time-to-first-purchase distribution.
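
Repeat purchase rate at a given window can be sketched as below, assuming you have each customer’s order dates. One simplification to flag: this version does not exclude customers whose first purchase is too recent to have had the full window, which a production model should.

```python
from datetime import date


def repeat_purchase_rate(orders: dict[str, list[date]], window_days: int) -> float:
    """Share of customers with another order within window_days of their first."""
    customers = 0
    repeaters = 0
    for dates in orders.values():
        if not dates:
            continue
        ordered = sorted(dates)
        customers += 1
        first = ordered[0]
        # Any later order inside the window counts as a repeat.
        if any((d - first).days <= window_days for d in ordered[1:]):
            repeaters += 1
    return repeaters / customers if customers else 0.0
```

Calling it with 30, 60, and 90 gives the three points of the curve; the shape between them usually says more than any single value.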

Notice what’s not on this list: bounce rate (mostly noise in 2026), average session duration (useless without context), pageviews (vanity), social shares (vanity). If a metric isn’t tied to a decision someone makes within the next two weeks, it doesn’t belong on a dashboard.

30-day implementation playbook

Most “implement web analytics” engagements bloat into six-month projects that produce a Notion page. Here’s the version that actually ships in 30 days.

Week 1: Audit and consolidate.

  • Inventory every tool currently firing events. List them. Who owns each? When was it last updated?
  • Identify the source of truth for revenue (almost always Stripe or the backend, never GA4).
  • Document the top 10 questions stakeholders ask analytics this quarter. These define what you build.

The common finding in week 1 is that 60 to 70 percent of stakeholder questions can’t be answered because the underlying data was never captured, not because the dashboards are wrong. Don’t add tracking yet; name the gap. Keep the question list in a shared doc, not a slide deck, and assign each question an owner who will see the answer first. The people who get their questions answered fastest become the analytics team’s best advocates inside the org.

Week 2: Stand up the warehouse and pipeline.

  • Connect GA4 to BigQuery (the export is free, but it only collects data from the day you link it, so turn it on first).
  • Pipe Stripe, the product database, and any other source into the same warehouse via Fivetran, Airbyte, or a custom Cloud Function.
  • Deploy server-side GTM on a subdomain of your main site (tag.yourdomain.com).

Server-side GTM is the lever most teams skip in week 2 because it looks scary and isn’t. The container itself runs for $5 to $50 a month on Cloud Run or a small VM. Mirror the existing client-side tags into the server container first, validate they fire correctly against a known event, then start migrating critical conversions one by one. Keep the client-side fallback live until the server numbers reconcile with the source-of-truth backend, then sunset the duplicates.
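
The “reconcile, then sunset” step can be reduced to one comparison per conversion type. A rough sketch, assuming daily counts from the server container and from the backend; the 2 percent tolerance is an arbitrary placeholder, not a recommendation.

```python
from typing import Optional


def reconcile(server_count: int, backend_count: int,
              tolerance: float = 0.02) -> dict[str, Optional[object]]:
    """Compare server-side conversions against the backend source of truth."""
    if backend_count == 0:
        # Nothing to compare against; only OK if the server also saw nothing.
        return {"gap": None, "within_tolerance": server_count == 0}
    # Positive gap: the server container is missing conversions.
    gap = (backend_count - server_count) / backend_count
    return {"gap": gap, "within_tolerance": abs(gap) <= tolerance}
```

Run it per day and per conversion type for a couple of weeks. A gap that is small on average but spikes on specific days usually points to a deploy that broke a tag.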

Week 3: Model the data.

  • Stand up a dbt project with three layers: staging, intermediate, marts.
  • Build the marts that answer the top 10 questions from Week 1. No more.
  • Add tests to every primary key and every revenue model. This is non-negotiable.

The most common mistake in week 3 is building 50 marts because they sound useful. Don’t. Build the 5 to 10 marts that map directly to the questions from week 1. Marts that aren’t queried within 30 days of creation are dead weight and they make the project look bigger than it is when you go to defend it. Lean is faster to ship, easier to test, and easier to explain to a CFO who’s asking what the analytics line item is buying.
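
In dbt, primary-key tests are declared in the project’s YAML using the built-in `unique` and `not_null` tests. To show what those two tests actually assert, here is the same check written out in plain Python. It is an illustration of the logic, not how you would run it in a dbt project, and the function name is hypothetical.

```python
from collections import Counter


def check_primary_key(rows: list[dict], key: str) -> list[str]:
    """Return human-readable failures; an empty list means the key is sound."""
    failures: list[str] = []
    values = [row.get(key) for row in rows]

    # not_null: every row must carry a value for the key.
    nulls = sum(1 for v in values if v is None)
    if nulls:
        failures.append(f"{key}: {nulls} null value(s)")

    # unique: no non-null value may appear more than once.
    counts = Counter(v for v in values if v is not None)
    dupes = sorted(v for v, n in counts.items() if n > 1)
    if dupes:
        failures.append(f"{key}: duplicate value(s) {dupes}")

    return failures
```

A duplicated order ID in a revenue mart double-counts money, which is why these two checks sit on every primary key before anything else gets tested.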

Week 4: Activate.

  • Wire one BI tool to the marts. Build the five dashboards stakeholders actually want.
  • Push the customer LTV model back into HubSpot or Salesforce via Hightouch or Census.
  • Wire Meta CAPI, Google Enhanced Conversions, and LinkedIn Conversions API to the warehouse-fed conversion events.

Activation is where most engagements stall. The warehouse is built, the marts are tested, the dashboards look correct, then nobody opens them. Fix this by writing a one-pager per dashboard before you ship it: who owns it, what decision it informs, what action follows a red number. Without that one-pager, dashboards drift into trivia inside a quarter. With it, they survive leadership changes and become the operating layer the company actually plans against.

After 30 days you have a working stack. The next 60 days are about depth (cohort analysis, attribution modeling, retention curves) and the 60 days after that are about scale (governance, role-based access, SLA on key metrics).

Where most companies go wrong

Five patterns we see in the first audit of every new fractional engagement.

1. Over-instrumentation. The team fires 400 events. Nobody can name what 380 of them are for. Pick the 20 events that map to business outcomes and delete the rest. Yes, delete.

2. No data layer. The site sends events directly from inline JavaScript scattered across templates. Every refactor breaks tracking. Build a proper data layer (a JavaScript object that pages populate, GTM reads from). Engineering will thank you.

3. Ignoring server-side events. The team uses GA4 and only GA4. They have no idea how much they’re losing to ad blockers because GA4 doesn’t tell them. Server-side capture is the floor of professional analytics in 2026.

4. Never auditing. Nobody has checked whether the conversion tag fires correctly in six months. Half the time it doesn’t. Quarterly tag audits should be on someone’s calendar.

5. Dashboards nobody opens. I check this on every audit: pull the BI tool’s access logs and count the dashboards opened in the last 30 days. The typical mid-stage company has 60 dashboards and 4 that get opened weekly. Kill the 56 unused ones. Yes, kill them. The dead ones cost trust.
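
The audit itself is a set difference. A minimal sketch, assuming you can export the full dashboard list and an access log of (dashboard name, opened-at) pairs from your BI tool; the export format is an assumption, since each tool exposes this differently.

```python
from datetime import datetime, timedelta


def unused_dashboards(all_dashboards: set[str],
                      access_log: list[tuple[str, datetime]],
                      now: datetime,
                      window_days: int = 30) -> set[str]:
    """Dashboards with no recorded open inside the trailing window."""
    cutoff = now - timedelta(days=window_days)
    opened = {name for name, opened_at in access_log if opened_at >= cutoff}
    return all_dashboards - opened
```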

Next steps

If you got here, you already know whether your analytics stack is working. The signal isn’t subtle. Either your CEO trusts the numbers and acts on them, or there’s a slow loss of confidence that compounds across quarters.

Three places to start, in order of effort:

  1. Run the Week 1 audit above on your own stack. The output is a list. The list is usually uncomfortable.
  2. Pick the single biggest gap (almost always server-side tracking or a missing transform layer) and ship it.
  3. If the gap is leadership, not tooling, that’s the conversation Valiotti Data has. Fractional CDO engagements start with the same teardown described in this article.

Web analytics in 2026 isn’t a tool decision. It’s a discipline that connects capture to activation, and most of the value lives in the boring middle: the warehouse, the dbt models, the tags nobody wants to maintain. Get that part right and the dashboards become trivial. Skip it and no BI tool will save you.
