Database Doctor
Writing on databases, performance, and engineering.

Posts with tag: statistics


TPC-H Query 11 – Diving into Statistics

Times have been busy since I joined Floe. I highly recommend you check out our blog there to see what we are up to.

I am not giving up on my TPC-H series! Today we are halfway through the workload and have arrived at Query 11. If I am succeeding in what I set out to do, regular readers now have a good grasp of:

  • The large, often orders-of-magnitude, impact of good query optimisation.
  • How to come up with good query plans manually and validate those made by machines.
  • The basics of statistics and what they mean for query planners.

Today's query is pretty simple. Your new skills will let you find the optimal query plan easily.

I am going to take this chance to talk about statistics and how they relate to Query 11. We will also be talking more about bloom filters and what they can do for your analytical workload.

Read More...


TPC-H Query 10 - Histograms and Functional Dependency

Welcome back to the TPC-H series, dear reader. And happy holidays to those of you who've already shut down.

In today's educational blog, I'm going to teach you about:

  • The importance of histograms
  • When not to do bushy joins
  • Functional dependencies and how they speed up queries
  • Bloom filters

This is a lot of ground to cover in the roughly 5-15 minutes I have your attention. Every deep dive starts at the surface, so let us jump into the deep sea.

Read More...