What Causes “Serializable Isolation Violation Errors” in Amazon Redshift?

As an Amazon Redshift administrator, you may have come across an error in your logs that looks like this:

ERROR: 1023 DETAIL: Serializable isolation violation on table – 15464033, transactions forming the cycle are: 44753006, 44753010 (pid:22973)

This post will work through how to understand the root cause of these errors and how to fix them.

Background

Amazon Redshift implements a scheme called serializable isolation. It is necessary for achieving high concurrency on write operations: it allows many transactions to write to a single table at the same time.

Here’s how it works:

Take two transactions, T1 and T2, which both write to TABLE1. Whenever either T1 or T2 runs, it changes TABLE1. Let’s run these transactions in two scenarios:

Scenario 1: T1 runs first, followed by T2
Scenario 2: T2 runs first, followed by T1

If TABLE1 always looks the same after either scenario 1 or scenario 2, then Redshift is able to run T1 and T2 in parallel. No problem!

But in rare cases, it is possible for parallel execution of T1 and T2 to yield a different result. This would violate data integrity for the database. Redshift will act to prevent this, by aborting the transaction (and rolling it back).

Here’s an example. Consider an empty table. Transaction T1 SELECTs all the rows in the table and then INSERTs a row. Transaction T2 INSERTs a row into the table and then SELECTs all the rows in the table.

Scenario 1

T1 executes to completion, followed by T2. T1 will SELECT no rows from the empty table and INSERT a row, and then T2 will INSERT a row and SELECT 2 rows. So the SELECTed output would be 0 rows for T1 and 2 rows for T2.

Scenario 2

Reverse the order, so T2 goes first. T2 will INSERT a row and then SELECT all the rows (i.e. 1) in the table, and then T1 will SELECT all rows and then INSERT a row. So the SELECTed output of the transactions will be 1 row for T2 and 1 row for T1.
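The two transactions can be sketched in SQL. This is a minimal illustration; the table name, column, and values are hypothetical:

```sql
-- A hypothetical empty table for the example
CREATE TABLE table1 (id INT);

-- Transaction T1: SELECT, then INSERT
BEGIN;
SELECT * FROM table1;           -- read all rows
INSERT INTO table1 VALUES (1);  -- then write one row
COMMIT;

-- Transaction T2: INSERT, then SELECT
BEGIN;
INSERT INTO table1 VALUES (2);  -- write one row
SELECT * FROM table1;           -- then read all rows
COMMIT;
```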

The Violation

Let’s look at what might happen when execution is parallel:

  1. T2 could INSERT first,
  2. T1 could then SELECT and INSERT, and
  3. then T2 could SELECT.

This would result in T1 SELECTing one row but T2 SELECTing two rows. We cannot get this result through any serial execution of T1 and T2. Redshift aborts T1, since T1 performed the WRITE that triggered the violation.
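Written out as a single timeline, the problematic interleaving looks like this (the table name and values are hypothetical, and the table starts empty):

```sql
INSERT INTO table1 VALUES (2);  -- T2 writes its row
SELECT * FROM table1;           -- T1 reads: sees 1 row
INSERT INTO table1 VALUES (1);  -- T1 writes its row
SELECT * FROM table1;           -- T2 reads: sees 2 rows
-- T1 saw 1 row and T2 saw 2 rows; no serial order of T1 and T2
-- produces that pair of results, so Redshift aborts T1.
```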

Database “Snapshots”

How does Redshift do this? Redshift takes a snapshot of the database following any of these statements:

SELECT
COPY
DELETE
INSERT
UPDATE
TRUNCATE
ALTER TABLE
CREATE TABLE
DROP TABLE

Monitoring for violations begins immediately after the snapshot. You can assume it’s happening all the time.

Debugging a Real-World Scenario

Let’s look at a real-world scenario, and how we could debug it. Here is an example that we recently came across in one of our clusters.

Serializable isolation violation on table – 13187250, transactions forming the cycle are: 40805622, 40805600, 40805609 (pid:4029)

First, let’s find which of these three transactions caused the offending WRITE. We can do this by selecting from the STL_TR_CONFLICT table, which logs a row for each occurrence of this error.

 

dev=# select * from stl_tr_conflict where xact_id in (40805622, 40805600, 40805609);

xact_id | process_id | xact_start_ts | abort_time | table_id

----------+------------+----------------------------+----------------------------+----------

40805600 | 4029 | 2018-01-26 03:40:14.026031 | 2018-01-26 03:40:24.657363 | 13187250

(1 row)

Redshift aborted transaction 40805600 because it would have resulted in a serialization violation. We can now execute the following SQL to find all the READs and WRITEs that led up to this violation:

WITH aborted_transactions AS (
  SELECT tc.xact_id AS aborted_xid,
         tc.table_id AS tbl,
         tc.xact_start_ts AS transaction_start_time,
         tc.abort_time AS transaction_abort_time
  FROM stl_tr_conflict tc
  WHERE tc.xact_id = 40805600
),
concurrent_transactions AS (
  SELECT at.aborted_xid AS aborted_xid,
         at.tbl AS aborted_table,
         s.xid AS concurrent_xid,
         MIN(s.starttime) AS transaction_starttime,
         MAX(s.endtime) AS transaction_endtime
  FROM aborted_transactions at,
       svl_statementtext s
  WHERE s.xid IN (40805622, 40805600, 40805609)
  GROUP BY aborted_xid, aborted_table, concurrent_xid
),
-- find all tables touched by these transactions
concurrent_operations AS (
  -- delete statements
  SELECT ct.aborted_xid, ct.concurrent_xid, d.query, q.starttime, q.endtime,
         d.tbl, 'D' AS operation, q.aborted, SUBSTRING(querytxt, 1, 50) AS querytxt
  FROM stl_delete d, stl_query q, concurrent_transactions ct
  WHERE d.query = q.query AND ct.concurrent_xid = q.xid
  UNION
  -- insert, update, copy statements
  SELECT ct.aborted_xid, ct.concurrent_xid, i.query, q.starttime, q.endtime,
         i.tbl, 'W' AS operation, q.aborted, SUBSTRING(querytxt, 1, 50) AS querytxt
  FROM stl_insert i, stl_query q, concurrent_transactions ct
  WHERE i.query = q.query AND ct.concurrent_xid = q.xid
  UNION
  -- select statements
  SELECT ct.aborted_xid, ct.concurrent_xid, s.query, q.starttime, q.endtime,
         s.tbl, 'R' AS operation, q.aborted, SUBSTRING(querytxt, 1, 50) AS querytxt
  FROM stl_scan s, stl_query q, concurrent_transactions ct
  WHERE s.type = 2 AND s.query = q.query AND ct.concurrent_xid = q.xid
  UNION
  -- aborted statements
  SELECT ct.aborted_xid, ct.concurrent_xid, q.query, q.starttime, q.endtime,
         stc.table_id AS tbl, 'A' AS operation, q.aborted, SUBSTRING(querytxt, 1, 50) AS querytxt
  FROM stl_tr_conflict stc, stl_query q, concurrent_transactions ct
  WHERE stc.xact_id = ct.concurrent_xid AND q.xid = stc.xact_id AND q.aborted = 1
)
SELECT co.aborted_xid, co.concurrent_xid, co.query, co.starttime, co.endtime,
       co.tbl, tn.table_id IS NULL AS is_temp_table,
       co.operation, co.aborted, co.querytxt
FROM concurrent_operations co
LEFT JOIN svv_table_info tn ON co.tbl = tn.table_id
ORDER BY co.starttime;

To run this in your cluster,

  1. replace the xact_id in the aborted_transactions WHERE clause with the transaction id SELECTed from the stl_tr_conflict table (in our case 40805600).
  2. replace the transaction list in the concurrent_transactions WHERE clause with the full transaction list reported in the error message (in our case 40805622, 40805600, 40805609).

This produces the following output. We omit the actual query text for brevity.

 aborted_xid | concurrent_xid | query | starttime | endtime | tbl | is_temp_table | operation | aborted
-------------+----------------+----------+----------------------------+----------------------------+----------+---------------+-----------+---------
40805600 | 40805600 | 12145452 | 2018-01-26 03:40:14.173419 | 2018-01-26 03:40:24.62772 | 13187208 | f | R | 0
40805600 | 40805600 | 12145452 | 2018-01-26 03:40:14.173419 | 2018-01-26 03:40:24.62772 | 15037734 | t | W | 0
40805600 | 40805600 | 12145452 | 2018-01-26 03:40:14.173419 | 2018-01-26 03:40:24.62772 | 13187242 | f | R | 0
40805600 | 40805600 | 12145452 | 2018-01-26 03:40:14.173419 | 2018-01-26 03:40:24.62772 | 13187210 | f | R | 0
40805600 | 40805609 | 12145474 | 2018-01-26 03:40:16.096666 | 2018-01-26 03:40:16.688598 | 15037750 | t | W | 0
40805600 | 40805609 | 12145480 | 2018-01-26 03:40:16.689452 | 2018-01-26 03:40:16.935325 | 15037750 | t | R | 0
40805600 | 40805609 | 12145485 | 2018-01-26 03:40:17.024012 | 2018-01-26 03:40:17.682866 | 15037750 | t | W | 0
40805600 | 40805609 | 12145491 | 2018-01-26 03:40:17.69563 | 2018-01-26 03:40:18.473838 | 13187208 | f | D | 0
40805600 | 40805609 | 12145491 | 2018-01-26 03:40:17.69563 | 2018-01-26 03:40:18.473838 | 15037750 | t | R | 0
40805600 | 40805609 | 12145491 | 2018-01-26 03:40:17.69563 | 2018-01-26 03:40:18.473838 | 13187208 | f | R | 0
40805600 | 40805609 | 12145500 | 2018-01-26 03:40:18.483172 | 2018-01-26 03:40:18.965172 | 15037750 | t | R | 0
40805600 | 40805609 | 12145500 | 2018-01-26 03:40:18.483172 | 2018-01-26 03:40:18.965172 | 13187208 | f | W | 0
40805600 | 40805622 | 12145512 | 2018-01-26 03:40:20.353269 | 2018-01-26 03:40:24.941206 | 13187210 | f | R | 0
40805600 | 40805622 | 12145512 | 2018-01-26 03:40:20.353269 | 2018-01-26 03:40:24.941206 | 13187208 | f | R | 0
40805600 | 40805622 | 12145512 | 2018-01-26 03:40:20.353269 | 2018-01-26 03:40:24.941206 | 13187250 | f | R | 0
40805600 | 40805600 | 12145557 | 2018-01-26 03:40:24.645869 | 2018-01-26 03:40:24.661953 | 15037734 | t | R | 1
40805600 | 40805600 | 12145557 | 2018-01-26 03:40:24.645869 | 2018-01-26 03:40:24.661953 | 13187250 | f | A | 1
(17 rows)

Operations on temp tables are flagged with t in the is_temp_table column. Temp tables cannot trigger a violation error since they are per-session. Still, they are useful for debugging, because operations on them take snapshots and so start serializable isolation monitoring.

Here’s what we learned:

  1. Transaction 40805600 reads from table 13187208.
  2. Transaction 40805609 reads from and writes to 13187208.
  3. Transaction 40805622 reads and joins tables 13187208 and 13187250 (this is only evident in the omitted query text).
  4. Transaction 40805600 deletes from 13187250 (evident in the omitted query text). This delete triggers the serialization violation.

We can see how this cycle of transactions triggered the error. 40805600 read at the beginning of the cycle. Then 40805609 read and wrote. Finally, 40805600 read and wrote again (via the delete). And due to its join, 40805622 introduced a dependency between the two table writes.

Fixing Serialization Violation Errors

So far, we’ve diagnosed and understood the cause of our serialization violation. How can we avoid this problem? There are three approaches:

Move SELECT statements out of the offending transactions.

When possible, move all the SELECTs out of the transaction. This resolves the conflict. But it may not be possible when your WRITE operation depends on the result of a SELECT.
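For example, a SELECT that only feeds reporting or validation can run in its own implicit transaction. A sketch, with hypothetical table names:

```sql
-- Before: the read and the write share one transaction,
-- so the SELECT participates in the conflict cycle
BEGIN;
SELECT count(*) FROM staging_events;
DELETE FROM fact_events WHERE load_id = 42;
COMMIT;

-- After: the SELECT commits on its own,
-- and only the DELETE remains inside the transaction
SELECT count(*) FROM staging_events;
BEGIN;
DELETE FROM fact_events WHERE load_id = 42;
COMMIT;
```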

Pessimistic locking.

You can prevent the conflict by locking the table using the LOCK statement. In our case, we could lock 13187208 at the start of 40805600 or 40805609. This comes at the cost of concurrency and performance. Locking a table can affect transactions outside of this cycle if they need to read that table.
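A sketch of this approach using Redshift’s LOCK statement. The table name and statements are hypothetical stand-ins; in our example, the table to lock would be the one with id 13187208:

```sql
BEGIN;
-- Take an exclusive lock on the contested table up front.
-- Concurrent transactions that touch it will block until we COMMIT.
LOCK my_contested_table;
SELECT count(*) FROM my_contested_table;  -- reads can no longer form a cycle
DELETE FROM my_contested_table WHERE load_id = 42;
COMMIT;
```

The lock is released at COMMIT or ROLLBACK, so keep such transactions short to limit the concurrency cost.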

Reschedule transactions.

Change the query scheduler to run transactions at different times. Transaction 40805609 is a COPY job and 40805600 and 40805622 are transformations. We could avoid the conflict completely by ensuring they don’t run at the same time.

Going beyond troubleshooting

We hope this post helped you find and fix serializable isolation violations in Amazon Redshift. To go further, take a look at our “Top 14 Performance Tuning Techniques for Amazon Redshift”, where we talk in depth about how to improve query performance and the challenges of scaling your workloads.
