Re: cross column correlation revisted

From: PostgreSQL - Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Subject: Re: cross column correlation revisted
Date: 2010-07-14 14:13:47
Message-ID: 79DD5105-8545-4705-9780-591AF8E9A1F7@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

hello tom,

i think that having stats on an index is a problem by itself for 2 reasons - for cross column correlation at least:

a.) joins cannot be covered by an index on two tables - we would fix "inside a table correlation problems" but not joins.
b.) who says that there is actually an index in place? assume you are doing some big seq scan to do analytics. you don't want it to be indexed for 10 different types of queries.

i think i is pretty hard to determine automatically what to collect because we cannot know which permutations of cross-column magic people will use.
i was thinking along the line of having it automatic as well but i could not figure out how to do it.
i think we can suggest addition stats to the user and we can write tools to figure our somehow what would be useful but personally i cannot see anything which is better than a command here.

many thanks,

hans

On Jul 14, 2010, at 4:01 PM, Tom Lane wrote:

> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> On 14/07/10 13:12, PostgreSQL - Hans-Jürgen Schönig wrote:
>>> maybe somehow like this ...
>>> ALTER TABLE x SET CORRELATION STATISTICS FOR (id = id2 AND id3=id4)
>
>> +1 is my general feeling, it's good if you can tell the system to
>> collect additional statistics where needed.
>
> The previous discussions about this went in the direction of
> "automatically collect stats if there is an index on that combination of
> columns". Do we really need a command?
>
>> However, the problem is how to represent and store the
>> cross-correlation.
>
> Yes, whatever the triggering mechanism is for collecting cross-column
> stats, actually doing something useful is the hard part.
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-07-14 14:22:01 Re: cross column correlation revisted
Previous Message Tom Lane 2010-07-14 14:01:26 Re: cross column correlation revisted