Refined an ETL Project for Flawless Performance and Seamless Data Workflows: Scalista Case

Impact

  • Resolution time: <2 weeks
  • Client reports: fully restored
  • System reliability: 99.5%+

Restored critical client reporting pipeline in under 2 weeks. Implemented monitoring and testing to prevent recurrence.

The Challenge

Scalista, an Austrian digital marketing agency, had developed a proprietary product called Cloudista — a data harvesting platform that collected marketing data from their clients’ accounts and exported it to Google BigQuery for reporting in Tableau. However, the system had stopped working correctly: the connection between BigQuery and Tableau was broken, Python library updates had introduced compatibility issues, and clients were no longer receiving their reports.

This wasn’t just a technical problem — it was a business continuity issue. Scalista’s clients depended on these analytics reports for their marketing decisions. Every day the system was down, client satisfaction eroded and Scalista’s reputation was at risk. They needed a partner who could diagnose the root causes quickly, fix the immediate issues, and ensure long-term system stability.

The urgency was compounded by Scalista’s business model, which depended on providing superior analytics to its marketing clients. If a competitor could demonstrate better reporting capabilities, Scalista risked losing accounts, so the broken pipeline threatened the core value proposition of the business. The codebase had also accumulated technical debt over years of incremental changes made without proper testing or documentation, which made it difficult for anyone to understand the full system behavior or predict the impact of a fix.

Our Approach

We structured the engagement as a three-phase technical rescue operation:

  • Phase 1 — Diagnosis: We performed a comprehensive audit of the Cloudista codebase, focusing on the ETL pipeline from data collection through BigQuery loading to Tableau visualization. We identified three distinct failure points: a breaking change in the Google BigQuery Python client library, a Tableau connector configuration issue, and a data schema mismatch that had accumulated over time.
  • Phase 2 — Remediation: We fixed each issue systematically. The BigQuery client was updated with proper version pinning to prevent future breaking changes. The Tableau connection was reconfigured with a more robust approach that didn’t depend on deprecated features. The data schema was cleaned and standardized across all client accounts.
  • Phase 3 — Hardening: Beyond fixing the immediate issues, we implemented safeguards to prevent recurrence: automated testing for the ETL pipeline, monitoring alerts for data freshness, version pinning for all dependencies, and documentation of the system architecture for Scalista’s internal team.
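As an illustration of the version pinning described in the phases above, here is a minimal sketch of a check that flags any dependency not pinned to an exact version. It is not the code used in the Cloudista project: the function name `find_unpinned`, the regular expression, and the sample package versions are all assumptions made for this example, and it only understands the simple `name==version` form (lines with environment markers would be reported as unpinned).

```python
import re

# Hypothetical pattern: "package==1.2.3", optionally with extras such as "package[extra]==1.2.3".
PINNED = re.compile(r"^[A-Za-z0-9_.\-]+(\[[A-Za-z0-9_,.\-]+\])?==[^=\s]+$")


def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned with an exact '==' version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop inline comments and whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options (-r, -e, ...)
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Illustrative input only; the version numbers are not taken from the project.
sample = """
google-cloud-bigquery==3.11.4
pandas>=1.5  # loose constraint, should be flagged
requests
"""
print(find_unpinned(sample))
```

A check like this can run as one of the automated tests, so a loosely specified dependency is caught before a library update reaches production.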

We also redesigned the approach to building Tableau reports from BigQuery data, implementing a more reliable pattern that handled edge cases (empty datasets, schema changes, new client onboarding) gracefully.
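To make the edge cases concrete, the sketch below shows one possible way to conform query results to an expected set of columns before they feed a report. It is an assumption-laden example, not the pattern implemented for Scalista: `conform_rows`, the column names, and the choice to fill missing values with `None` are hypothetical.

```python
from typing import Any, Iterable


def conform_rows(
    rows: Iterable[dict[str, Any]], expected_columns: list[str]
) -> tuple[list[dict[str, Any]], list[str]]:
    """Shape rows to the expected columns.

    Returns the cleaned rows plus a sorted list of unexpected column names,
    so a schema change is reported instead of silently breaking a report.
    An empty input yields an empty result rather than an error.
    """
    unexpected: set[str] = set()
    clean: list[dict[str, Any]] = []
    for row in rows:
        unexpected.update(key for key in row if key not in expected_columns)
        # Missing columns become None so every row has the same shape.
        clean.append({col: row.get(col) for col in expected_columns})
    return clean, sorted(unexpected)


# Hypothetical rows: one carries an extra column, one lacks "clicks".
rows = [
    {"client_id": "a", "clicks": 10, "legacy_col": 1},
    {"client_id": "b"},
]
clean, extra = conform_rows(rows, ["client_id", "clicks"])
print(clean)
print(extra)
```

Returning the unexpected columns rather than raising lets the caller decide whether a new column is a harmless addition or a change that needs attention.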

Beyond the immediate fix, we restructured the codebase for long-term maintainability. We introduced proper error handling patterns, centralized configuration management, and created a deployment pipeline that runs automated tests before any change reaches production. We also automated the setup of new client data flows behind a client onboarding checklist, reducing what had been a manual, error-prone process to a repeatable 15-minute procedure. The monitoring system we put in place checks data freshness for every client every hour and alerts the Scalista team if any client’s reports are stale, preventing the silent failures that had caused the original crisis.
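The freshness check can be sketched roughly as follows. This is an illustrative example under stated assumptions: the function name `find_stale_clients`, the client names, and the 24-hour staleness threshold are hypothetical, and the real monitoring would read the last load time per client from the warehouse rather than from an in-memory dictionary.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def find_stale_clients(
    last_loaded: dict[str, Optional[datetime]],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed threshold, not from the project
) -> list[str]:
    """Return clients whose most recent load is missing or older than max_age."""
    stale = []
    for client, loaded_at in last_loaded.items():
        # A client with no recorded load is treated as stale, not ignored.
        if loaded_at is None or now - loaded_at > max_age:
            stale.append(client)
    return sorted(stale)


# Hypothetical snapshot of last load times per client.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_loaded = {
    "client_a": now - timedelta(hours=2),   # fresh
    "client_b": now - timedelta(hours=30),  # older than the threshold
    "client_c": None,                       # never loaded
}
print(find_stale_clients(last_loaded, now))
```

Run on an hourly schedule, a non-empty result would trigger an alert to the team, which is what turns a silent failure into a visible one.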

Results

  • Root cause identified and resolved — BigQuery-to-Tableau connection fully restored.
  • New, more robust approach to building Tableau reports from BigQuery data.
  • Project maintenance and hardening ensuring flawless ongoing performance.
  • All clients receiving correct, timely analytics reports again.
  • Automated testing and monitoring preventing future silent failures.
  • Comprehensive documentation enabling Scalista’s team to maintain the system independently.

Technologies Used

Python, Google BigQuery, Tableau, GCP, automated testing, CI/CD for data pipelines.

Key Takeaways

01. Maintain an ongoing dialogue between the data analyst and the client. It works best when the client provides as much detail as possible and formalizes the tasks.

02. When a vendor takes over an already-developed project, provide the project documentation. It helps them understand the context much faster.
