|From:||Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>|
|To:||Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>|
|Cc:||pgsql-hackers(at)postgresql(dot)org, tomas(dot)vondra(at)2ndquadrant(dot)com, dean(dot)a(dot)rasheed(at)gmail(dot)com|
|Subject:||Re: extended statistics: n-distinct|
Kyotaro HORIGUCHI wrote:
> At Mon, 20 Mar 2017 16:02:20 -0300, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote in <20170320190220(dot)ixlaueanxegqd5gr(at)alvherre(dot)pgsql>
> > This is a new thread to present a version of the n-distinct patch that
> > IMO is close enough to commit. There are some work items still.
> > There's some discussion on the topic of cross-column statistics:
> > https://wiki.postgresql.org/wiki/Cross_Columns_Stats
> > This problem is important enough that Kyotaro Horiguchi submitted
> > another patch that does the same thing:
> > https://www.postgresql.org/message-id/flat/20150828.173334.114731693.horiguchi.kyotaro%40lab.ntt.co.jp
> > This patch aims to provide the same functionality, keeping the design
> > general enough that other kinds of statistics can be added later (such
> > as functional dependencies, histograms and MCVs, all of which have been
> > previously submitted as patches by Tomas).
> I may be stupid but I don't get the picture here, specifically
> about the relation to Tomas's patch. Does this work as
> infrastructure for Tomas's mv patch? Or in some other
Well, this patch is Tomas' first patch, which I've reviewed and reworked
-- I changed some things that weren't properly finished, cleaned up the
code, made it all more robust, and made sure the sane cases work sanely
while the others are rejected promptly (rather than throwing bogus error
messages at a later time, or crashing).
I didn't review your own n-distinct patch. I don't think there's any
common code, but it would be very useful if you could try your test
scenarios and make sure they are handled sanely by this patch.
Regarding your question:
> Are you planning to correct the estimation of joins
> perplexed by strong correlations?
There is a later patch in Tomas' series which I would like to get to
before PG10 closes, but it's not this patch. It needs to be rebased on
top of this one.
Attached is v30, which includes some more cleanup. Detailed commits can
be seen here:
In particular, this includes code from Tomas to consider mixing
ndistinct estimates from multiple multivariate statistic objects, which
is better than the old approach of only using the estimate when a
perfect match was found. However, I lobotomized Tomas' selfuncs.c code
and I need to revert that part before pushing -- essentially I
removed examine_variable() processing, which seemed a bit on the
expensive side for what we were doing, but that was a silly mistake.
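
To illustrate the difference between the two approaches, here is a rough
sketch in Python. It is not the code from the patch: the function name,
the dictionary layout and the greedy "largest covering entry first" rule
are all assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, Iterable


def estimate_groups(
    columns: Iterable[str],
    per_column: Dict[str, float],
    multivariate: Dict[FrozenSet[str], float],
    exact_match_only: bool = False,
) -> float:
    """Estimate the number of distinct combinations of `columns`.

    Hypothetical helper: `per_column` maps a column to its own ndistinct,
    `multivariate` maps a column combination to its joint ndistinct.
    """
    remaining = set(columns)
    estimate = 1.0

    if exact_match_only:
        # Old approach: use a multivariate estimate only when an entry
        # covers exactly the requested column set.
        key = frozenset(remaining)
        if key in multivariate:
            return multivariate[key]
    else:
        # Mixing approach (assumed greedy rule): repeatedly take the entry
        # covering the most still-uncovered columns, then drop those columns.
        while True:
            usable = [k for k in multivariate if len(k) >= 2 and k <= remaining]
            if not usable:
                break
            best = max(usable, key=lambda k: (len(k), sorted(k)))
            estimate *= multivariate[best]
            remaining -= best

    # Columns not covered by any entry are treated as independent.
    for col in remaining:
        estimate *= per_column[col]
    return estimate


per_column = {"a": 100, "b": 100, "c": 10, "d": 10}
multivariate = {frozenset({"a", "b"}): 200, frozenset({"c", "d"}): 20}

mixed = estimate_groups("abcd", per_column, multivariate)
exact = estimate_groups("abcd", per_column, multivariate, exact_match_only=True)
```

With these made-up numbers, mixing the two entries gives 200 * 20 = 4000,
whereas the exact-match-only rule finds no entry for (a, b, c, d) and falls
back to the product of the per-column values, 1000000.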
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services