Founded in 2006, Fuze set out to solve a problem with business conversations: they lacked two key ingredients, communication and collaboration. Today, the Fuze platform unifies voice, video, messaging, and conferencing services on a single, award-winning cloud platform, and delivers intelligent, mobile-ready apps to customers.
A central Amazon Redshift data warehouse combines data from a growing number of sources and events. The tool chain around Amazon Redshift includes ETLeap and Dell Boomi for data ingestion and ETL pipelines. Fuze runs large-scale data transformations within Redshift in SQL. For business intelligence and dashboards Fuze uses Looker.
As Fuze has grown its customer base and employee count, data has become a mission-critical component. More than 300 people query the data warehouse on a constant basis across Fuze’s 19 global office locations.
Each day critical business functions query the data warehouse:
- Finance investigates bookings that have not been activated yet.
- Product uses data to understand platform adoption and arm customer success teams with information for client conversations.
- Sales closes its monthly books by distinguishing commissionable deals from self-service sign-ups.
- Customer Support investigates support tickets in near real-time, troubleshooting issues by calling up data on an account and product usage they can’t get anywhere else.
Shifting to the cloud and Amazon Redshift has been transformational for Fuze, allowing it to support these new types of use cases for data.
Newfound Visibility Increases Data Product Velocity
Stephen Bronstein leads the Data Team at Fuze. It’s a “skinny team” of 3 people that supports all of Fuze’s data needs.
Over the course of the past three years, Stephen has led the team through warehouse transitions and performance tuning, adoption of new data sources, regular surges of new data and use cases, and the on-boarding of hundreds of new data-users. “People care a lot about having the right data at the right time. It’s crucial to drive their work forward,” says Bronstein.
Amazon Redshift kept up with the growth in data volume, in-database transformations, and users querying the warehouse. But with that much activity from growth in users and queries, it can be difficult to spot the queries that degrade overall performance and user experience. A single complex analyst SQL statement or new ETL workflow is enough to cause problems.
“In the past, we would spend a lot of time understanding what particular query caused an issue. A CPU spike or the warehouse filling up. Often, the data to solve the issue wasn’t even available from the tool that wrote the query. We now get visibility in one view across our ETL pipelines, the warehouse, and our BI dashboards. And it’s not just that visibility. The mere fact that we can go back through more history and slice and dice that history in so many ways is a huge benefit to us.”
Steve Bronstein
SVP OF ANALYTICS & DATA SCIENCE, FUZE
Like many customers coming to intermix.io, Steve faced a problem common to all data warehouses: A sudden increase in disk utilization and lack of visibility into the root cause. One bad query by a single user is sufficient.
Elastic resizing fixes the problem. But long term, that’s not a viable strategy as it causes cost problems and doesn’t solve the underlying issue of “expensive” queries and missing SLAs.
“Especially our Tier 4 Support Team relies on near-real-time data from our Amazon Redshift warehouse,” says Bronstein. “When ETL pipelines start lagging, or queries are slow, all because of one bad query, it’s a serious problem. It blocks them from troubleshooting customer issues. We want to make sure they have data in time so they can do their job.”
Another issue was long-running queries. As the questions about the business became more complex, so did the queries: longer SQL statements that scan more tables and more data. Query execution times would spike to anywhere between 30 minutes and 3 hours, and even longer at peak times. That frustrates dashboard users and ties up resources that other users need.
BI and ETL tools that connect to Amazon Redshift mask their individual users, workflows, or originating S3 buckets. For the data team, that means switching between different vendor consoles, trying to triangulate issues. It’s a time-consuming guessing game to figure out which workflow in an ETL tool triggered a single COPY or UNLOAD statement, or which view in a dashboard triggered a SELECT statement.
intermix.io integrates with ETLeap and Looker out-of-the-box. With our App Tracing, data teams get user- and query-level visibility into data sources, workflows and Looker dashboards – and what resources they consume in the data warehouse.
App Tracing provides instant escalation of problems. “Now we can see what jobs are filling up our tables, what pipelines are lagging, or what individual Looker users are suffering from long-running queries. We need to be certain that we hit our SLAs, intermix.io enables that.”
A New World with Query Insights
Today, Steve’s team uses intermix.io for two broad use cases:
- Optimize the Fuze ETL pipelines. That includes monitoring ETL pipelines running in ETLeap, transformations in Dell Boomi, and the general health of the data warehouse. There are no more surprises when it comes to CPU spikes or tables growing and filling up the cluster.
- Find and fix slow queries. With the ETL pipelines under control, the team can dive into queries and find “the biggest offenders”, e.g. queries that use a lot of resources, have high latency, or prevent reports from running.
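As a rough illustration of what hunting for “the biggest offenders” can look like, here is a minimal Python sketch that flags queries exceeding a runtime or temp-disk threshold and ranks them worst first. It is not the intermix.io product or API. The `QueryRecord` fields, the `biggest_offenders` helper, the sample rows, and both thresholds are hypothetical and chosen only for illustration; in practice the records would come from whatever query log or monitoring export a team has available.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    # Hypothetical fields for illustration; not an intermix.io or Redshift schema.
    query_id: int
    app: str             # originating tool, e.g. "looker" or "etl"
    user: str
    duration_s: float    # wall-clock execution time in seconds
    temp_disk_gb: float  # intermediate (temp) disk space used by the query


def biggest_offenders(records, max_duration_s=1800.0, max_temp_disk_gb=100.0):
    """Return the records that exceed either threshold, worst first.

    The default thresholds (30 minutes, 100 GB) are assumptions, not
    recommendations.
    """
    flagged = [
        r for r in records
        if r.duration_s > max_duration_s or r.temp_disk_gb > max_temp_disk_gb
    ]
    # Rank by temp disk usage first, then by duration, both descending.
    return sorted(
        flagged,
        key=lambda r: (r.temp_disk_gb, r.duration_s),
        reverse=True,
    )


# Made-up sample data.
sample = [
    QueryRecord(1, "looker", "analyst_a", 42.0, 0.5),
    QueryRecord(2, "looker", "analyst_b", 5400.0, 12.0),
    QueryRecord(3, "etl", "pipeline_x", 600.0, 1500.0),
]

offenders = biggest_offenders(sample)
for r in offenders:
    print(r.query_id, r.app, r.user, r.duration_s, r.temp_disk_gb)
```

Keeping the originating tool and user on each record is what makes the ranking actionable: the data team can go straight to the dashboard owner or pipeline that issued the query.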
With ETL pipelines, data warehouse growth, and cost under control, Steve’s team can now shift their attention to optimizing queries. Most of the queries run through Looker, including the use of Persistent Derived Tables (PDT). But even with that abstraction layer, as new users write new types of queries and trigger new workflows, problematic SQL statements are inevitable. The effect spans SLAs and query performance.
“The query recommendations in the intermix dashboard were a big ‘a-ha’ moment for us,” says Bronstein. With Query Insights, the Fuze data team tracks down the biggest “offenders” and addresses the part of their SQL statements that drive up query latency and resource usage. Query recommendations enable the data team to work with individual users to re-factor their queries and improve runtimes, sometimes by a factor of 10x.
“With intermix.io we addressed our most urgent concern. Before, we didn’t have the insights needed to diagnose and troubleshoot issues with tools and queries. For example, we can now see how what seems like a pretty simple straightforward query is using 1.5TB of temp disk space.”
Steve Bronstein
SVP OF ANALYTICS & DATA SCIENCE, FUZE
For some of the most urgent issues, the team can fall back on the Expert Services that intermix offers. “Next to the product, we really value the expert support we get from intermix.io. We have a direct Slack channel to our assigned customer support engineers. We’re able to message within that channel where we’re hung up in our own troubleshooting and get near instantaneous support from engineers that would otherwise take us 12 to 24 hours.”
For the data team, using intermix.io has freed up valuable time while increasing insight into cluster and query behavior, focusing the team’s attention where it’s most productive. “The value for our users is that it accelerates our speed to generate insights. We can cherry-pick the highest value use cases, build them in Amazon Redshift and solve the most urgent ‘data pain points’ for our users,” says Bronstein.
For Fuze, the combination of Amazon Redshift and intermix.io increases the value Fuze generates from its data. A scalable cloud warehouse with Amazon Redshift, combined with a single view across their tool chain, fast queries and cost control makes for a better end user experience and drives everyone’s work with data forward.