Re: WIP: multivariate statistics / proof of concept

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: multivariate statistics / proof of concept
Date: 2014-10-29 09:41:08
Message-ID: CAApHDvpHNGUkp=n=drHssZymoN2yuUwJGjGZZYYvYSwuifDd5A@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 13, 2014 at 11:00 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:

> Hi,
>
> attached is a WIP patch implementing multivariate statistics. The code
> certainly is not "ready" - parts of it look as if written by a rogue
> chimp who got bored of attempts to type the complete works of William
> Shakespeare, and decided to try something different.
>
>
I'm really glad you're working on this. I had been thinking of looking into
doing this myself.

> The last point is really just "unfinished implementation" - the syntax I
> propose is this:
>
> ALTER TABLE ... ADD STATISTICS (options) ON (columns)
>
> where the options influence the MCV list and histogram size, etc. The
> options are recognized and may give you an idea of what it might do, but
> it's not really used at the moment (except for storing in the
> pg_mv_statistic catalog).
>
>
>
I've not really gotten around to looking at the patch yet, but I'm also
wondering if it would be simple to allow functional statistics too.
The pg_mv_statistic name seems to indicate multiple columns, but how about
stats on date(datetime_column), or perhaps any non-volatile function? This
would help to solve the problem highlighted here:
http://www.postgresql.org/message-id/CAApHDvp2vH=7O-gp-zAf7aWy+A-WHWVg7h3Vc6=5pf9Uf34DhQ@mail.gmail.com
Without giving it too much thought, perhaps any expression that can be
indexed should be allowed to have stats? Would that be really difficult to
implement in comparison to what you've already done with the patch so far?

I'm quite interested in reviewing your work on this, but it appears that
some of your changes are not C89:

src\backend\commands\analyze.c(3774): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(3774): error C2133: 'indexes' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4302): error C2133: 'ndistincts' : unknown
size [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2057: expected constant
expression [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2466: cannot allocate an
array of constant size 0 [D:\Postgres\a\postgres.vcxproj]
src\backend\commands\analyze.c(4775): error C2133: 'keys' : unknown size
[D:\Postgres\a\postgres.vcxproj]

The compiler I'm using is a bit too stupid to understand the C99 syntax.

I guess you'd need to palloc() these arrays instead in order to comply with
the project standards.
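
To illustrate the kind of change I mean, here is a minimal sketch. It assumes
the errors above come from C99 variable-length arrays (that is what C2057
"expected constant expression" usually points at), and it is not taken from
the patch: count_distinct() and sketch_alloc() are made-up names, and
sketch_alloc() is only a malloc() stand-in so the example compiles outside
the backend. In analyze.c the calls would be palloc() and pfree().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for palloc() so the sketch builds standalone.
 * Like palloc(), it never returns NULL; it aborts on failure instead.
 */
static void *
sketch_alloc(size_t size)
{
	void	   *ptr = malloc(size);

	if (ptr == NULL)
		abort();
	return ptr;
}

/* qsort() comparator for plain ints */
static int
compare_ints(const void *a, const void *b)
{
	int			ia = *(const int *) a;
	int			ib = *(const int *) b;

	if (ia < ib)
		return -1;
	if (ia > ib)
		return 1;
	return 0;
}

/*
 * Count the distinct values in an array, using a scratch copy whose size
 * is only known at run time.
 *
 * The C99 form would be:   int scratch[nvalues];
 * A C89 compiler rejects that, so the array is allocated explicitly.
 */
int
count_distinct(const int *values, int nvalues)
{
	int		   *scratch;
	int			ndistinct;
	int			i;

	if (nvalues <= 0)
		return 0;

	scratch = (int *) sketch_alloc(sizeof(int) * nvalues);
	memcpy(scratch, values, sizeof(int) * nvalues);
	qsort(scratch, nvalues, sizeof(int), compare_ints);

	/* after sorting, each change between neighbours is a new value */
	ndistinct = 1;
	for (i = 1; i < nvalues; i++)
	{
		if (scratch[i] != scratch[i - 1])
			ndistinct++;
	}

	free(scratch);				/* pfree() in the backend */
	return ndistinct;
}
```

All declarations sit at the top of their block and only /* */ comments are
used, so this should get through a C89-only compiler. Inside the backend the
explicit free is often optional, since palloc'd memory goes away with its
memory context, but that depends on which context the code runs in.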

http://www.postgresql.org/docs/devel/static/install-requirements.html

I'm going to sign myself up to review this, so my first piece of feedback
is probably this compile problem.

Regards

David Rowley
