We Tested Amazon Redshift Concurrency Scaling: Here are our Results
The Amazon Redshift architecture lets you scale up by adding more nodes to a cluster, but that can lead to over-provisioning nodes just to absorb peak query volume. Unlike adding nodes, Concurrency Scaling adds query processing power on an as-needed basis.
Concurrency Scaling gives Amazon Redshift clusters additional capacity to handle bursts in query load. It works by off-loading queries to new, “parallel” clusters in the background, routing queries based on your WLM configuration and rules.
Pricing for Concurrency Scaling is based on a credit model, with a free usage tier and beyond that a charge based on the time that a Concurrency Scaling cluster runs queries.
We tested Concurrency Scaling for one of our internal clusters. In this post, we present the results of our usage of the feature and information on how to get started.
To be eligible for concurrency scaling, an Amazon Redshift cluster must meet the following three requirements:
- EC2-VPC platform
- Node type must be dc2.8xlarge, ds2.8xlarge, dc2.large, or ds2.xlarge
- # of nodes is between 2 and 32 (no single-node clusters)
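The three requirements above are simple to encode as a quick sanity check. This is just an illustrative sketch of the rules as listed at the time of writing, not an AWS API:

```python
# Sketch: check whether a cluster configuration is eligible for
# Concurrency Scaling, per the three requirements listed above.
ELIGIBLE_NODE_TYPES = {"dc2.8xlarge", "ds2.8xlarge", "dc2.large", "ds2.xlarge"}

def is_eligible(platform: str, node_type: str, num_nodes: int) -> bool:
    return (
        platform == "EC2-VPC"
        and node_type in ELIGIBLE_NODE_TYPES
        and 2 <= num_nodes <= 32  # no single-node clusters
    )

print(is_eligible("EC2-VPC", "dc2.large", 4))  # True
print(is_eligible("EC2-VPC", "dc2.large", 1))  # False: single-node cluster
```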
Eligible Query Types
Concurrency scaling does not work on all query types. For the first release, it handles read-only queries that meet three conditions:
- Read-only SELECT queries (although more types are planned)
- The query does not reference a table with sorting style of INTERLEAVED.
- The query does not use Amazon Redshift Spectrum to reference external tables.
To be routed to a Concurrency Scaling cluster, a query must first encounter queueing. Queries eligible for the SQA (Short Query Acceleration) queue will not run on Concurrency Scaling clusters either.
Queuing and SQA are a function of a properly configured Redshift workload management (WLM) setup. We recommend optimizing your WLM first, because it reduces the need for Concurrency Scaling. That matters because Concurrency Scaling is free only up to a certain number of hours. In fact, AWS claims that Concurrency Scaling will be free for 97% of customers, which brings us to pricing.
We’ve also tested enabling Redshift’s Automatic WLM and captured our experience with it in the blog post “Should I Enable Amazon Redshift’s Automatic WLM?”
Concurrency Scaling Pricing
For Concurrency Scaling, AWS has introduced a credit model. Each active Amazon Redshift cluster earns credits on an hourly basis, up to one hour of free Concurrency Scaling credits per day.
You only pay when your use of Concurrency Scaling clusters exceeds the credits you’ve accrued.
The fee equals the per-second on-demand rate for the transient cluster used in excess of the free credits – only while it’s serving your queries – with a one-minute minimum charge each time a Concurrency Scaling cluster is activated. The per-second on-demand rate follows general Amazon Redshift pricing, i.e. it depends on your cluster’s node type and number of nodes.
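To make the billing model concrete, here is a minimal sketch of the logic described above. The $0.25/hour per-node rate and the four-node cluster are illustrative assumptions, not current AWS pricing; look up the actual on-demand rate for your node type:

```python
# Sketch of the Concurrency Scaling billing model described above.
# hourly_rate_per_node and num_nodes are illustrative assumptions.
def scaling_charge(burst_seconds, free_credit_seconds,
                   hourly_rate_per_node=0.25, num_nodes=4):
    # One-minute minimum each time a scaling cluster is activated.
    billed = max(burst_seconds, 60)
    # Free credits (accrued at up to one hour per day) are applied first.
    billable = max(billed - free_credit_seconds, 0)
    per_second_rate = hourly_rate_per_node * num_nodes / 3600.0
    return billable * per_second_rate

# 30 minutes of burst covered by 1 hour of accrued credits costs nothing:
print(scaling_charge(1800, 3600))  # 0.0
# 2 hours of burst with 1 hour of credits bills the remaining hour:
print(scaling_charge(7200, 3600))  # 1.0  (4 nodes * $0.25/hr * 1 hr)
```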
Enabling Concurrency Scaling
Concurrency scaling is enabled on a per-WLM queue basis. Go to the AWS Redshift Console and click on “Workload Management” from the left-side navigation menu. Select your cluster’s WLM parameter group from the subsequent pull-down menu.
You should see a new column called “Concurrency Scaling Mode” next to each queue. The default is ‘off’. Click ‘Edit’ and you’ll be able to modify the settings for each queue.
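The same per-queue setting is also expressed in the queue definitions of the wlm_json_configuration parameter, where each queue takes a concurrency_scaling field of "auto" or "off". Here is a sketch of such a configuration; the user group and slot counts are made up for illustration:

```python
import json

# Sketch: a WLM configuration with Concurrency Scaling enabled ("auto")
# on one queue and left disabled ("off") on the default queue.
# Queue contents (user groups, slot counts) are illustrative.
wlm_config = [
    {
        "user_group": ["analysts"],     # hypothetical user group
        "query_concurrency": 5,
        "concurrency_scaling": "auto",  # route queued queries to scaling clusters
    },
    {
        "query_concurrency": 5,
        "concurrency_scaling": "off",   # default queue stays on the main cluster
    },
]

# This JSON string is the value you would store in wlm_json_configuration.
print(json.dumps(wlm_config))
```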
How we Configured Redshift Concurrency Scaling
Concurrency scaling works by routing eligible queries to new, dedicated clusters. The new clusters have the same size (node type and number) as the main cluster.
The number of clusters used for concurrency scaling defaults to one (1), with the option to configure up to ten (10) total clusters.
The total number of clusters used for concurrency scaling is set by the parameter max_concurrency_scaling_clusters. Increasing the value of this parameter provisions additional standby clusters.
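If you script your cluster configuration, the parameter can be changed through the Redshift API. Below is a sketch of the request payload for boto3's modify_cluster_parameter_group call; the parameter group name is a placeholder for your cluster's WLM parameter group:

```python
# Sketch: payload to raise max_concurrency_scaling_clusters via the
# Redshift API. "my-wlm-parameter-group" is a placeholder.
params = {
    "ParameterGroupName": "my-wlm-parameter-group",
    "Parameters": [
        {
            "ParameterName": "max_concurrency_scaling_clusters",
            "ParameterValue": "3",  # up to 10 total scaling clusters
        }
    ],
}

# With boto3 installed and AWS credentials configured, this applies it:
# import boto3
# boto3.client("redshift").modify_cluster_parameter_group(**params)
print(params["Parameters"][0]["ParameterValue"])
```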
Monitoring our Concurrency Scaling test
There are a few additional charts in the AWS Redshift console. There is a chart called “Max Configured Concurrency Scaling Clusters” which plots the value of max_concurrency_scaling_clusters over time.
The number of Active Scaling clusters is also shown in the UI under Concurrency Scaling Activity:
The Queries tab in the UI also has a column to show if the query ran on the Main cluster or on the Concurrency Scaling cluster:
Whether a particular query ran on the main cluster or via a Concurrency Scaling cluster is stored in stl_query.concurrency_scaling_status.
A value of 1 means the query ran on a Concurrency Scaling cluster, and other values mean it ran on the main cluster.
redshiftcluster_2=# select concurrency_scaling_status, count(*)
                    from stl_query
                    where endtime < '2019-03-29 15:00:00'
                    group by concurrency_scaling_status;
 concurrency_scaling_status | count
----------------------------+--------
                          2 |     21
                          0 | 310790
                          4 |  19818
                          6 |  69082
                         11 |      7
                          3 | 853546
                          8 | 228977
Concurrency Scaling information is also stored in other system tables and views, e.g. SVCS_CONCURRENCY_SCALING_USAGE. AWS maintains a list of the catalog tables that store concurrency scaling information.
We enabled Concurrency Scaling for a single queue on an internal cluster at approximately 2019-03-29 18:30:00 GMT. We changed the max_concurrency_scaling_clusters parameter to 3 at approximately 2019-03-29 20:30:00.
To simulate query queuing, we lowered the number of slots for the queue from 15 to 5.
Below is a chart from the intermix.io dashboard, showing the running versus queuing queries for this queue, after cranking down the number of slots.
We observed that queue wait times went up, peaking at more than five minutes.
Here’s the corresponding summary in the AWS console of what happened during that time:
Redshift spun up three (3) concurrency scaling clusters as requested. It appears that these clusters were not fully utilized, even though our cluster had many queries that were queuing.
The usage chart correlates closely with the scaling activity chart:
After a few hours, we checked the system tables, and it looked like six queries had run with concurrency scaling. We also spot-checked two of them against the UI. We haven’t checked how the status value behaves when multiple concurrency scaling clusters are active.
redshiftcluster_2=# select concurrency_scaling_status, count(*)
                    from stl_query
                    where endtime > '2019-03-29 18:30:00'
                    group by concurrency_scaling_status;
 concurrency_scaling_status | count
----------------------------+-------
                          4 |   108
                          6 |   333
                          1 |     6
                          0 |   913
                          3 |  4495
                          8 |   304
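To put those counts in perspective, the share of queries that were offloaded (status 1) falls out directly from the numbers above:

```python
# Counts by concurrency_scaling_status from the run above.
# Status 1 = ran on a Concurrency Scaling cluster; everything else
# ran on the main cluster.
counts = {4: 108, 6: 333, 1: 6, 0: 913, 3: 4495, 8: 304}

total = sum(counts.values())
offloaded = counts.get(1, 0)
print(f"{offloaded} of {total} queries ({offloaded / total:.2%}) were offloaded")
```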
Conclusion: Is Redshift Concurrency Scaling worth it?
Concurrency Scaling can mitigate queue times during bursts in query load.
From this basic test, it appears that a portion of our query load improved as a result. However, simply enabling Concurrency Scaling didn’t fix all of our concurrency problems. The limited impact is likely due to the limitations on the types of queries that can use Concurrency Scaling. For example, we have a lot of tables with interleaved sort keys, and much of our workload is writes.
While concurrency scaling doesn’t appear to be a silver bullet solution for WLM tuning in all cases, using the feature is easy and transparent.
We do recommend enabling the feature on your WLM queues. Start with a single concurrency cluster, and monitor the peak load via the console to determine whether the new clusters are being fully utilized.
As AWS adds support for additional query/table types, Concurrency Scaling should become more and more effective.