Re: WIP: multivariate statistics / proof of concept

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: multivariate statistics / proof of concept
Date: 2014-10-30 09:17:07
Message-ID: CAApHDvp_ONYK=u0c_tmzfmmq-SoHbEhCeO_z755oTZFn+Bo-Wg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 30, 2014 at 12:48 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:

> On 29 October 2014, 12:31, Petr Jelinek wrote:
> >> I've not really gotten around to looking at the patch yet, but I'm also
> >> wondering if it would be simple to allow functional statistics
> >> too. The pg_mv_statistic name seems to indicate multi columns, but how
> >> about stats on date(datetime_column), or perhaps any non-volatile
> >> function. This would help to solve the problem highlighted here
> >>
> http://www.postgresql.org/message-id/CAApHDvp2vH=7O-gp-zAf7aWy+A-WHWVg7h3Vc6=5pf9Uf34DhQ@mail.gmail.com
> >> . Without giving it too much thought, perhaps any expression that can be
> >> indexed should be allowed to have stats? Would that be really difficult
> >> to implement in comparison to what you've already done with the patch so
> >> far?
> >>
> >
> > I would not over-complicate requirements for the first version of this,
> > I think it's already complicated enough.
>
> My thoughts, exactly. I'm not willing to put more features into the
> initial version of the patch. Actually, I'm thinking about ripping out
> some experimental features (particularly "hashed MCV" and "associative
> rules").
>
>
That's fair, but I didn't mean to imply that you should work on that too,
or that it should be part of this patch.
My point was more that I don't really agree with the table name for the new
stats. At some later date someone will want to add expression stats, so we'd
better come up with a design that is friendly towards that. At this time the
only problem I can see is that the table name might not suit expression
stats. I'd hate to see someone have to invent a 3rd table to support these
when we could likely come up with something that can be extended later and
still makes sense both today and in the future.

I was just looking at how expression indexes are stored in pg_index, and I
see that for an expression index the expression is stored in the indexprs
column, which is of type pg_node_tree. So quite possibly at some point in
the future the new stats table could just have an extra column added, and
for today we'd just need to come up with a future-proof name. Perhaps
pg_statistic_ext or pg_statisticx, with functions and source files named
along those lines?
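
To make the pg_index pattern above concrete, here is a minimal Python sketch
(a model, not PostgreSQL code) of a stats catalog row that covers plain
columns today and could later carry expressions in one extra column. It
borrows the pg_index convention where a zero in the key list marks an
expression slot. The names StatsEntry, describe, relid, keys and exprs are
all hypothetical, and plain strings stand in for the pg_node_tree value.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StatsEntry:
    """Hypothetical row of an extensible statistics catalog (illustration only)."""

    relid: int                               # OID of the table the stats describe
    keys: Tuple[int, ...]                    # attribute numbers; 0 marks an expression slot
    exprs: Optional[Tuple[str, ...]] = None  # stand-in for a pg_node_tree column

    def __post_init__(self) -> None:
        # Every 0 in keys must be matched by exactly one stored expression.
        n_slots = sum(1 for k in self.keys if k == 0)
        n_exprs = len(self.exprs) if self.exprs else 0
        if n_slots != n_exprs:
            raise ValueError(
                f"{n_slots} expression slot(s) but {n_exprs} expression(s) stored"
            )


def describe(entry: StatsEntry) -> list:
    """List what the entry covers, in key order, filling 0 slots with expressions."""
    exprs = iter(entry.exprs or ())
    return [next(exprs) if k == 0 else f"attnum {k}" for k in entry.keys]
```

With this shape, a multi-column entry simply leaves exprs empty, while an
entry for something like date(datetime_column) uses a zero key plus one
stored expression, so no third table is needed.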

Regards

David Rowley
