Redshift is Amazon’s data warehouse product available on AWS. To help users understand the contents and performance of their Redshift cluster, users are provided with a number of tables and views, including:

- a series of Postgres-inspired `pg_catalog` tables and views
- the ANSI-standard `INFORMATION_SCHEMA` views
- a series of Redshift-specific system tables and views

This blog focuses on the Redshift-specific objects, and the queries I often come back to when I need to look “under the hood” to understand and improve performance.
## Understanding the Redshift-specific system objects

All Redshift system objects have prefixes on their names which follow a naming convention:

- First, they have `S` for “system object”
- Then they have either `T` for “table”, or `V` for “view”, depending on the object type
- Finally, they have either `L_` for “logs”, or `V_` for, erm, “snapshots” (I haven’t been able to find anything which explains why this is `V_` and not `S_`!)

That gives you object names like `STV_INFLIGHT`, which is a system table with a (v)snapshot of the current queries running on the cluster. Simple! Now we understand that, let’s look at the SQL itself.
## Queries
This SQL snippet tells you what’s actively running on the cluster (you need to be a superuser to see all queries running):
```sql
SELECT
  userid,
  query,
  pid,
  starttime,
  text
FROM stv_inflight;
```
The `text` field will only store the first 200 characters of the queries running, but the real benefit here is that you can identify long-running queries. If you have a query you need to kill for any reason, note its `pid` from the results and run `pg_terminate_backend(pid)` to stop it running.
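For example, terminating a runaway query looks like this (the process ID 8585 here is a made-up placeholder; substitute the `pid` you noted from the results):

```sql
-- 8585 is a placeholder process ID taken from the stv_inflight results
SELECT pg_terminate_backend(8585);
```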
If you need the full text of the query, you can make a note of the `query` ID from here, and plug it into this SQL:
```sql
SELECT
  q.query,
  q.starttime,
  t.text,
  t.sequence
FROM stl_query q
INNER JOIN stl_querytext t ON q.query = t.query
WHERE q.query = 1234
ORDER BY sequence;
```
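If you’d rather see the full statement as a single row instead of scanning the 200-character segments yourself, you can stitch them back together with Redshift’s `LISTAGG` aggregate. This is just a sketch; `1234` is again a placeholder query ID:

```sql
-- Reassemble the 200-character segments into one string per query.
-- 1234 is a placeholder query ID; substitute the one you noted earlier.
SELECT
  query,
  LISTAGG(text) WITHIN GROUP (ORDER BY sequence) AS full_query_text
FROM stl_querytext
WHERE query = 1234
GROUP BY query;
```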
## Locks
If you are seeing an unusually long-running operation, it may be stuck in a deadlocked transaction, which is visible in the `STV_LOCKS` table. Again, you can take the `lock_owner_pid` of the relevant sessions and use `pg_terminate_backend(pid)` to kill the offending session (start with the oldest lock first):
```sql
SELECT
  table_id,
  last_update,
  last_commit,
  lock_owner_pid,
  lock_status
FROM stv_locks
ORDER BY last_update;
```
## Query/Table Performance
If you are digging around the system tables, you’re probably unhappy with the performance of something. There’s a series of system objects aimed at providing useful information here.
### SVV_TABLE_INFO

`SVV_TABLE_INFO` provides information about the state of your tables, which can be used to troubleshoot poor performance. I use this view as a guide for when to `VACUUM` a table. I tend to just select all columns from the view:
```sql
SELECT *
FROM svv_table_info
ORDER BY "database", "schema", "table";
```
Here are some of the columns in the results I look at to see if there are any “low hanging fruit” with respect to improving performance:

- `empty` shows how many blocks are waiting to be freed by a vacuum.
- `unsorted` shows the percent of the table that is unsorted. The cluster will need to scan this entire section for every query. If this creeps up too high, you need to vacuum to re-sort the table.
- `skew_sortkey1` shows the ratio of the size of the first column of the sortkey to the size of the largest non-sortkey column. You can use this value to evaluate the effectiveness of the sortkey.
- `skew_rows` shows the ratio of rows on the slice with the most rows to the slice with the fewest. Use it to evaluate your distkey.
- `pct_used` shows the amount of available disk space that the table occupies. Use it to review whether a table, or a portion of it, could be stored in S3 or elsewhere. This might be the case for less frequently accessed data, or if you have “backup” tables.
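As a quick way to surface the tables most in need of attention, something like the following works. The 20 percent cut-off is an arbitrary threshold I’m using for illustration, not an AWS recommendation:

```sql
-- Tables with the most unsorted data; the 20 percent cut-off is arbitrary.
SELECT "schema", "table", unsorted, empty, pct_used
FROM svv_table_info
WHERE unsorted > 20
ORDER BY unsorted DESC;
```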
### STL_ALERT_EVENT_LOG

The `STL_ALERT_EVENT_LOG` table is important for optimising queries. When the cluster executes a query, it records issues found by the query planner into `STL_ALERT_EVENT_LOG`, along with suggested fixes.

Some problems can be fixed by running `ANALYZE` or `VACUUM`, while others might require rewriting the query or changing how you store your data. Amazon themselves can advise you there, with their table usage notes.
Again, I tend to just select all the columns from the table; the `solution` column will tell you in plain language what will improve performance:

```sql
SELECT *
FROM stl_alert_event_log;
```
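To see which problems occur most often, rather than reading the log row by row, you can group the alerts. This is just one way to slice it:

```sql
-- Count alerts by type and suggested fix to find the most common issues.
SELECT event, solution, COUNT(*) AS occurrences
FROM stl_alert_event_log
GROUP BY event, solution
ORDER BY occurrences DESC;
```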
I hope you find these queries useful. Feel free to connect with me on LinkedIn, and leave your comments and suggestions below.