Announcing Query Groups – Intelligent Query Classification

Query Groups is a powerful feature which intelligently classifies and ranks query workloads on your cluster. Query Groups can answer questions like:

How it Works

Queries are grouped together using a proprietary algorithm, and ranked by volume, execution time, and queue time. More metrics will be added in the future.

All queries in a “query group” share SQL structure and operate on the same tables.
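The actual grouping algorithm is proprietary and not described here. To make the idea of "shared SQL structure" concrete, here is a minimal Python sketch of one common approach: mask the literal values in each query and group on what remains. The `fingerprint` and `group_queries` helpers are hypothetical names used only for illustration, and the regular expressions are a deliberate simplification of real SQL parsing.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(sql: str) -> str:
    """Reduce a query to its structure by masking literals (illustrative only)."""
    text = sql.strip().rstrip(";").lower()
    text = re.sub(r"'(?:[^']|'')*'", "?", text)      # mask string literals
    text = re.sub(r"\b\d+(?:\.\d+)?\b", "?", text)   # mask numeric literals
    text = re.sub(r"\s+", " ", text)                 # collapse whitespace
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:12]


def group_queries(queries):
    """Map each fingerprint to the list of raw queries that share it."""
    groups = defaultdict(list)
    for query in queries:
        groups[fingerprint(query)].append(query)
    return dict(groups)
```

Under this sketch, `SELECT * FROM orders WHERE id = 1` and `select * from orders where id = 42;` land in the same group, while the same statement against a different table does not, because the table name is part of the masked text.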

Example – Find the queries causing a query spike

At 8:17 a.m. on Aug. 7, the cluster below experienced an 8x spike in queries. Typically, this type of event is caused by a handful of new queries whose volume suddenly increased. Which queries were they, and who ran them?

Query Groups can quickly determine which queries are responsible.

Click the new Query Groups page in the left nav. Groups are sorted by Rank by default. In this case, we want to re-sort by “Rank Change”, which orders the list of query groups by the ‘fastest movers’, so you’ll quickly see the groups that moved up the ranks in the past week.
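For readers who want to see what a “Rank Change” sort amounts to, here is a rough Python sketch. It assumes you already have a per-group metric (for example, query volume) for the previous and current periods. The function names and the choice to rank brand-new groups just below every existing group are assumptions made for this example, not a description of how the product computes it.

```python
def rank_by(metric_by_group):
    """Rank groups 1..N by metric, highest first (ties broken by group name)."""
    ordered = sorted(metric_by_group, key=lambda g: (-metric_by_group[g], g))
    return {group: position + 1 for position, group in enumerate(ordered)}


def rank_changes(previous, current):
    """Return (group, places_gained) pairs, fastest movers first."""
    prev_rank = rank_by(previous)
    cur_rank = rank_by(current)
    # Assumption: a group not seen in the previous period is treated as
    # having ranked just below every group that was seen.
    unseen_rank = len(prev_rank) + 1
    changes = {
        group: prev_rank.get(group, unseen_rank) - cur_rank[group]
        for group in cur_rank
    }
    return sorted(changes.items(), key=lambda item: (-item[1], item[0]))
```

With `previous = {"a": 100, "b": 50, "c": 10}` and `current = {"a": 100, "b": 50, "c": 10, "d": 800}`, the new group `d` jumps straight to the top of the list, which is the pattern you would expect behind a sudden spike.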

Sure enough, we see a handful of query groups which suddenly started running. Clicking into the first one, we can isolate the exact queries.

The same procedure can be used to find the queries behind a spike in latency or queue time.

What’s Next

We will expand the ‘grouping’ concept in the future to add:

