Redshift is Amazon’s data warehouse product available on AWS. To help users understand the contents and performance of their Redshift cluster, users are provided with some tables and views, including:
- a series of Postrges-inspired
pg_catalogtables and views
- the ANSI-standard
- a series of Redshift-specific system tables and views
This blog focuses on the Redshift-specific objects, and the queries that I often come back to when I need to look “under the hood” to try and understand and improve performance issues.
Understanding the Redshift-specific system objects
All Redshift system objects have prefixes on their name which follow a naming convention:
- First, they have
Sfor “system object”
- Then they have either
Tfor “table”, or
Vfor “view”, depending on the object type
- Finally, they have either
L_for “logs”, or
V_for, erm, “snapshots” (I haven’t been able to find anything which explains why this is
That gives you object names like
STV_INFLIGHT which is a system table with a (v)snapshot of the current queries running on the cluster. Simple! Now we understand that, let’s look at the SQL itself.
This SQL snippet tells you what’s actively running on the cluster (you need to be a superuser to see all queries running):
SELECT userid, query pid starttime text FROM stv_inflight;
text field will only store the first 200 characters of the queries running, but the real benefit here is that you can identify long-running queries. If you have a query you need to kill for any reason, you note the
pid from the results, and run
pg_terminate_backend(pid) to stop it running.
If you need the full text of the query, you can make a note of the
query from here, and plug it into this SQL:
SELECT q.query, q.starttime, t.text, t.sequence FROM stl_query q INNER JOIN stl_querytext t ON q.query = t.query WHERE q.query = 1234 ORDER BY sequence;
If you are seeing an unusually long-running operation, it may be in a deadlocked transaction, which is visible in the
STV_LOCKS table. Again, you can use the
lock_owner_pid of the relevant sessions along with
pg_terminate_backend(pid) to kill the offending session (start with the oldest lock first):
SELECT table_id, last_update, last_commit, lock_owner_pid, lock_status FROM stv_locks ORDER BY last_update;
If you are digging around the system tables, you’re probably unhappy with the performance of something. There’s a series of objects which are aimed at providing useful information.
SVV_TABLE_INFO provides information about state of your tables, which can used to troubleshoot poor-performance. I use this view as a guide for when to
VACUUM a table. I tend to just select all columns from the view:
SELECT * FROM svv_table_info ORDER BY database, schema, table;
Here are some of the columns in the results I look at to see if there are any “low hanging fruit” with respect to improving performance:
emptyshows how many blocks are waiting to be freed by a vacuum.
unsortedshows the percent of the table that is unsorted. The cluster will need to scan this entire section for every query. If this creeps up too high you need to vacuum to re-sort the table.
skey_sortkey1shows the ratio of the size of the first column of the sortkey to the size of the largest non-sortkey column. You can use this value to evaluate the effectiveness of the sortkey.
skew_rowsshows the ratio of rows from most on a slice to least on a slice. Use it to evaluate distkey.
pct_usedshows the amount of disk that the one table occupies. Use it to review if a table, or portion of it, could be stored in S3 or elsewhere. This might be the case for less frequently accessed data, or if you have “backup” tables.
STL_ALERT_EVENT_LOG table is important for optimising queries. When the cluster executes a query, it will record issues found by the query planner into
STL_ALERT_EVENT_LOG, along with suggested fixes.
Some problems can be fixed by running
VACUUM, while others might require rewriting the query or how you store your data. Amazon themselves can advise you there, with their table usage notes.
Again, I tend to just select all the columns from the table — the
solution column will tell you in plain language what will improve performance:
SELECT * FROM STL_ALERT_EVENT_LOG;
I hope you find these queries useful. Feel free to connect with me on LinkedIn, and leave your comments and suggestions below.