Previously, we discussed the role of Amazon Redshift's sort keys and compared how both compound and interleaved keys work in theory. Throughout that post we used some dummy data and a set of Postgres queries in order to explore the Z-order curve and interleaved sorting without getting bogged down in implementation details. In this post, we will explore some of these implementation details, discuss a common tactic that can benefit from using compound and interleaved sort keys together, and run some benchmark queries against a data set with billions of rows. Throughout this post we will link to code that can be used to recreate our results.

`stv_blocklist`

In Part I, we discussed how Redshift eschews the B-tree in favor of another secondary data structure called a zone map. The zone maps for each table, as well as additional metadata, are stored in the `stv_blocklist` system table. If we take a look at the definition of the `stv_blocklist` table, we can see that each row has a width of about 100 bytes. This is consistent with our estimates from Part I regarding the small total size of this secondary data structure relative to the tables it describes.
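As a rough back-of-the-envelope check on that size claim, the sketch below assumes one `stv_blocklist` row of about 100 bytes per 1 MB data block. The function name and the 1 TB table size are illustrative choices, and the calculation ignores per-slice and per-column minimum allocations.

```python
BLOCK_BYTES = 1 << 20   # Redshift stores column data in 1 MB blocks
ROW_BYTES = 100         # approximate width of one stv_blocklist row (from the text)

def blocklist_overhead(table_bytes):
    """Estimate the bytes of block metadata needed to describe a table (hypothetical helper)."""
    blocks = -(-table_bytes // BLOCK_BYTES)  # ceiling division: number of 1 MB blocks
    return blocks * ROW_BYTES

table_bytes = 1 << 40   # an illustrative 1 TB table
overhead_bytes = blocklist_overhead(table_bytes)
print(f"metadata: {overhead_bytes / (1 << 20):.0f} MB, "
      f"ratio: {overhead_bytes / table_bytes:.6f}")
```

Under these assumptions the metadata is roughly one ten-thousandth of the size of the data it describes.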

Taking a look at `stv_blocklist` for a table with a compound key confirms what we've learned about zone maps for compound keys. Below we can see that the zone map for `c_custkey`, the leading column of the sort key for the `orders_compound` table, would prove helpful in pruning irrelevant blocks. However, the zone map for any of the other columns in the key would be useless.

*The zone map for a compound sort key (code).*
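The same effect can be reproduced on toy data. The following is a minimal sketch, not Redshift's implementation: it sorts a hypothetical two-column table by a compound key, builds a per-block (min, max) zone map for each column, and counts how many blocks an equality predicate would still have to scan. The column `other_col`, the block size, and the helper names are assumptions made for illustration.

```python
import random

def zone_map(rows, col, block_size):
    """Return one (min, max) pair of column `col` per block of consecutive rows."""
    zmap = []
    for start in range(0, len(rows), block_size):
        values = [row[col] for row in rows[start:start + block_size]]
        zmap.append((min(values), max(values)))
    return zmap

def blocks_to_scan(zmap, value):
    """Count the blocks whose [min, max] range could contain `value`."""
    return sum(lo <= value <= hi for lo, hi in zmap)

random.seed(0)
# Hypothetical table: (c_custkey, other_col), both uniform in 0..99.
rows = [(random.randrange(100), random.randrange(100)) for _ in range(10_000)]
rows.sort()  # compound sort: by c_custkey first, then other_col

leading = zone_map(rows, col=0, block_size=500)
trailing = zone_map(rows, col=1, block_size=500)
print("c_custkey = 42:", blocks_to_scan(leading, 42), "of", len(leading), "blocks")
print("other_col = 42:", blocks_to_scan(trailing, 42), "of", len(trailing), "blocks")
```

With this data, a predicate on the leading column touches at most two of the 20 blocks, while a predicate on the second column cannot rule out any block, because every block spans nearly the full range of `other_col`.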

If we take a look at the zone map for `orders_interleaved_4`, which has an interleaved sort key, we can see that the zone maps for all of the sort key columns look pretty useful.

*The zone map for an interleaved sort key (code).*

However, what we are seeing here for each block is not the minimum and maximum column values but the minimum and maximum sort key values.
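To get a feel for why every column of an interleaved key can prune blocks, here is a small sketch that sorts toy data along the Z-order curve from Part I. It is an illustration under stated assumptions, not Redshift's implementation: it tracks plain column minimums and maximums per block rather than the sort key values noted above, and the table, block size, and helper names are hypothetical.

```python
import random

def interleave_bits(x, y, bits=8):
    """Z-order (Morton) value: interleave the low `bits` bits of x and y."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i + 1)  # bits of x go to odd positions
        z |= ((y >> i) & 1) << (2 * i)      # bits of y go to even positions
    return z

def zone_map(rows, col, block_size):
    """Return one (min, max) pair of column `col` per block of consecutive rows."""
    zmap = []
    for start in range(0, len(rows), block_size):
        values = [row[col] for row in rows[start:start + block_size]]
        zmap.append((min(values), max(values)))
    return zmap

def blocks_to_scan(zmap, value):
    """Count the blocks whose [min, max] range could contain `value`."""
    return sum(lo <= value <= hi for lo, hi in zmap)

random.seed(0)
# Hypothetical two-column table, both columns uniform in 0..255.
rows = [(random.randrange(256), random.randrange(256)) for _ in range(16_384)]
rows.sort(key=lambda r: interleave_bits(r[0], r[1]))  # interleaved-style sort

total_blocks = len(rows) // 1024
scanned = {col: blocks_to_scan(zone_map(rows, col, 1024), 42) for col in (0, 1)}
for col, n in scanned.items():
    print(f"column {col} = 42: scan {n} of {total_blocks} blocks")
```

Neither column prunes as sharply as the leading column of a compound key, but both columns allow some blocks to be skipped, which is the trade-off an interleaved key makes.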

© 2019 Data Science Central
