Re: Thinking About Correlated Columns (again)

From:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To:	sthomas(at)optionshouse(dot)com
Cc:	"pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Thinking About Correlated Columns (again)
Date:	2013-05-15 15:52:22
Message-ID:	5193AF36.3040409@vmware.com
Lists:	pgsql-performance

On 15.05.2013 18:31, Shaun Thomas wrote:
> I've seen conversations on this since at least 2005. There were even
> proposed patches every once in a while, but never any consensus. Anyone
> care to comment?

Well, as you said, there has never been any consensus.

There are basically two pieces to the puzzle:

1. What metric do you use to represent correlation between columns?

2. How do you collect that statistic?

Based on the prior discussions, collecting the stats seems to be tricky.
It's not clear for which combinations of columns it should be collected
(all possible combinations? That explodes quickly...), or how it can be
collected without scanning the whole table.
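
To put a rough number on "explodes quickly", here is a minimal sketch that counts how many column combinations a table has. The function names are made up for illustration, and it assumes a statistic would be kept for every subset of two or more columns, which is the worst case and not something anyone has proposed implementing.

```python
from math import comb


def column_pairs(n_columns: int) -> int:
    """Number of distinct two-column combinations."""
    return comb(n_columns, 2)


def column_combinations(n_columns: int, min_size: int = 2) -> int:
    """Number of distinct column subsets containing at least min_size columns."""
    return sum(comb(n_columns, k) for k in range(min_size, n_columns + 1))


if __name__ == "__main__":
    # Compare pairs-only against all multi-column subsets for a few table widths.
    for n in (5, 10, 20, 40):
        print(n, column_pairs(n), column_combinations(n))
```

Restricting the statistic to pairs grows quadratically with the number of columns, while covering every subset grows as 2^n, so even a modest 20-column table has over a million candidates.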

I think it would be pretty straightforward to use such a statistic, once
we have it. So perhaps we should get started by allowing the DBA to set
a correlation metric manually, and use that in the planner.
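
As a rough illustration of how a manually set metric might be used, the sketch below blends two selectivity estimates for a condition like "a = x AND b = y". Everything here is an assumption for the sake of the example: the single "dependence" number in [0, 1], the linear blend, and the function name are hypothetical, and none of it is existing planner code or a settled proposal.

```python
def combined_selectivity(sel_a: float, sel_b: float, dependence: float) -> float:
    """Estimate the selectivity of two ANDed conditions.

    dependence = 0.0 treats the columns as independent (multiply the
    selectivities); dependence = 1.0 treats one column as fully determined
    by the other (take the more selective condition). Values in between
    interpolate linearly. The metric itself is a hypothetical, DBA-supplied
    number.
    """
    for name, value in (("sel_a", sel_a), ("sel_b", sel_b), ("dependence", dependence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    independent = sel_a * sel_b        # estimate when columns are unrelated
    fully_dependent = min(sel_a, sel_b)  # estimate when one column implies the other
    return dependence * fully_dependent + (1.0 - dependence) * independent


if __name__ == "__main__":
    # Same two conditions under increasing assumed dependence.
    for d in (0.0, 0.5, 1.0):
        print(d, combined_selectivity(0.1, 0.2, d))
```

With dependence left at 0 the estimate is the plain product of the two selectivities, so a table with no metric set would be planned as if the columns were independent.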

- Heikki

In response to

Thinking About Correlated Columns (again) at 2013-05-15 15:31:46 from Shaun Thomas

Responses

Re: Thinking About Correlated Columns (again) at 2013-05-15 16:27:29 from Shaun Thomas
Re: Thinking About Correlated Columns (again) at 2013-05-15 17:30:57 from Nikolas Everett
Re: Thinking About Correlated Columns (again) at 2013-05-15 20:22:33 from Gavin Flower
