Blinkist uses intermix.io to find rogue queries and optimize schema configurations

“With intermix.io, Blinkist has been able to plan each table's sort and distribution key, and keep their table statistics current.”

0%

UNPLANNED DOWNTIME

2x

FASTER QUERIES

37%

REDUCTION IN STORAGE UTILIZATION

The Company

Blinkist lets you read the key insights from 2,000+ nonfiction books. The Blinkist mobile app provides powerful summary packs that you can read or listen to in 15 minutes. The summaries condense key insights into “blinks”, so called because you can read or listen to them in the blink of an eye. Every month, Blinkist delivers book summaries to millions of users via a subscription-based service.

The Blinkist growth engine is paid acquisition, driven by an automated ad-bidding process. Machine learning models analyze behavioral data from the app. By looking at data from past campaigns, the models predict the ROI of new campaigns across different ad types, networks and audiences. The system then submits new bids on a programmatic basis.

The People

Sebastian Schleicher is the Head of IT at Blinkist. Sebastian’s team handles both Blinkist’s production and data infrastructure. The two are tightly coupled, as insights from behavioral data are integrated back into the product experience. Front-end developers constantly instrument new events in the Blinkist app.

The Blinkist data infrastructure is the foundation for a constant flow of high quality data for the machine learning models. By optimizing ad bids, the models help drive down Blinkist’s customer acquisition cost, increasing the overall value of Blinkist’s business.

Sebastian Schleicher
HEAD OF IT, BLINKIST

The Challenge

Blinkist’s Redshift cluster had experienced periods of unexpected downtime because the cluster had filled up. In some cases, a developer instrumented a new event type in the Blinkist app, which would then cause explosive data growth in the data infrastructure. Amazon CloudWatch would trigger an alert, but too late to take action. The data team could not find the underlying transformations and queries in time, or identify the tables driving the growth that would fill up the cluster. In other cases, aggregations wouldn’t run, leading to stale data.

For Sebastian, it was important to get full visibility into data growth and its sources. Speed and throughput were important as well, but the most crucial points were cluster stability and data validation. Cluster downtime and stale data are serious problems for Blinkist. Downtime means less growth, and stale data leads to misaligned bids, which drives up customer acquisition cost.

“If our customer acquisition campaigns stop, our growth stops.”

Tobias Balling

CTO, BLINKIST

The Process

The first step was one of our half-day workshops, “World-class Data Engineering on Amazon AWS”. We went through the best practices for Amazon Redshift and identified which of them had been implemented so far at Blinkist. Together, we also looked at the entire data infrastructure tool chain, which included, for example, Amazon Kinesis Data Streams and Mode Analytics for dashboards and data analysis. The output was a success plan with a set of projects that the Blinkist data team could include in their sprints.

The Solution

Blinkist used our “Discover” feature to realize that most workloads were running through a single user account, in a single default queue. By setting up individual logins, Sebastian’s team was now able to track the specific queries of each user. The visibility into each user allowed the team to use our “Throughput” and “Memory Analysis” features to set up workload management, and fine-tune concurrency and memory. Coupled with choosing suitable sort keys and distribution keys via our “Skew Analysis”, that led to a 2x increase in average query speeds.
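To illustrate what distribution skew means in practice, here is a minimal Python sketch. It is not how the “Skew Analysis” feature is implemented; the function names, the input format (row counts per slice for each table) and the 1.5 threshold are all assumptions made for this example. It compares the fullest slice of a table with the average slice: a ratio near 1.0 suggests an even distribution, while a high ratio suggests the distribution key deserves a second look.

```python
from statistics import mean


def skew_ratio(rows_per_slice):
    """Rows on the fullest slice divided by the mean rows per slice."""
    if not rows_per_slice:
        raise ValueError("need row counts for at least one slice")
    average = mean(rows_per_slice)
    if average == 0:
        return 1.0  # an empty table is treated as perfectly even
    return max(rows_per_slice) / average


def tables_to_review(slice_counts, threshold=1.5):
    """Names of tables whose skew ratio exceeds the (hypothetical) threshold,
    most skewed first."""
    ratios = {name: skew_ratio(counts) for name, counts in slice_counts.items()}
    flagged = [name for name, ratio in ratios.items() if ratio > threshold]
    return sorted(flagged, key=lambda name: ratios[name], reverse=True)


# Made-up row counts per slice, for illustration only.
example = {
    "events": [400, 0, 0, 0],       # every row landed on one slice
    "users": [100, 100, 100, 100],  # evenly distributed
}
flagged_tables = tables_to_review(example)
```

In this made-up example only the `events` table is flagged, because all of its rows sit on a single slice.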

Blinkist used our “Storage” page to identify stale tables and introduced regular vacuuming with our custom vacuum scripts. Stats-off dropped from over 60% to below 5%, and the cleanup freed up space in the cluster. The team started to track the growth of individual tables and schemas via our “Table Analytics”, and set alerts for spikes in data growth. By knowing which table was growing, and which users and queries were responsible for the growth, the Blinkist team was now able to act before the cluster filled up.
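The idea behind a growth alert can be sketched in a few lines of Python. This is an illustrative assumption, not the alerting logic of “Table Analytics”: the function name, the input format (one size reading in MB per table per day) and the 25% daily limit are all hypothetical.

```python
def growth_spikes(daily_sizes_mb, max_daily_growth=0.25):
    """Return (table, day_index, growth) for each day-over-day increase
    that exceeds max_daily_growth (0.25 means 25%)."""
    alerts = []
    for table, sizes in daily_sizes_mb.items():
        for day in range(1, len(sizes)):
            previous, current = sizes[day - 1], sizes[day]
            if previous <= 0:
                continue  # no baseline to compare against
            growth = (current - previous) / previous
            if growth > max_daily_growth:
                alerts.append((table, day, growth))
    return alerts


# Made-up daily table sizes in MB, for illustration only.
sizes = {
    "events": [100, 105, 210],  # doubles on the third day
    "users": [50, 51, 52],      # slow, steady growth
}
alerts = growth_spikes(sizes)
```

Tracking growth per table rather than for the cluster as a whole is what makes such an alert actionable: it points at the table that needs attention while there is still time to react.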

With full visibility into their data infrastructure, the Blinkist data team can now move at a much faster pace. Unplanned cluster downtime is a thing of the past, and the focus is now on building out more data services that can fuel Blinkist’s growth.

“Today, our problems are solved. We can see the historic growth of our data down to the schema, with full visibility into the quality of the tables. intermix.io reads like a story of our data infrastructure – I see immediately what’s up.”

Sebastian Schleicher

HEAD OF IT, BLINKIST
