Re: statistics for array types

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: statistics for array types
Date: 2015-09-10 21:28:32
Message-ID: 20150910212832.GS2912@alvherre.pgsql
Lists: pgsql-hackers

Jeff Janes wrote:

> The attached patch forces there to be at least one element in MCE, keeping
> the one element with the highest predicted frequency if the MCE would
> otherwise be empty. Then any other element queried for is assumed to be no
> more common than this most common element.

Hmm, what happens if a common-but-not-an-MCE element is pruned out of
the array when a bucket is filled? I imagine it's going to mis-estimate
the selectivity (though I imagine the effect is going to be pretty
benign anyway, I mean it's still going to be better than stock 0.5%.)
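
For readers following along, here is a rough sketch of the behaviour being discussed. It is not the patch itself and not the actual analyze code. The function names, the dict-based representation, and the use of the smallest stored frequency as the cap are all assumptions made for illustration; only the 0.5% fallback comes from the discussion above.

```python
# Illustrative sketch only; names and data layout are hypothetical.
DEFAULT_CONTAIN_SEL = 0.005  # the "stock 0.5%" fallback mentioned above


def build_mce(freqs, cutoff):
    """Keep elements whose frequency reaches the cutoff.

    If none qualify, keep the single most frequent candidate so the
    resulting MCE list is never empty when candidates exist.
    """
    mce = {elem: f for elem, f in freqs.items() if f >= cutoff}
    if not mce and freqs:
        best = max(freqs, key=freqs.get)
        mce[best] = freqs[best]
    return mce


def estimate_selectivity(elem, mce):
    """Estimate how often elem appears, given the stored MCE list."""
    if elem in mce:
        return mce[elem]
    if not mce:
        # No information at all: fall back to the default.
        return DEFAULT_CONTAIN_SEL
    # Assumption: an element absent from the list is no more common than
    # the least frequent stored element, and never above the default.
    return min(DEFAULT_CONTAIN_SEL, min(mce.values()))
```

With candidates {"a": 0.0004, "b": 0.0001} and a cutoff of 0.001, nothing reaches the cutoff, so only "a" is kept and any other element is estimated at 0.0004 rather than 0.005. An element that was pruned before it ever became a candidate gets the same capped estimate, which is the mis-estimate in question.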

> I'd also briefly considered just having the part of the code that pulls the
> stats out of pg_stats interpret a MCE array as meaning that nothing is more
> frequent than the threshold, but that would mean that that part of the code
> needs to know about how the threshold is chosen, which just seems wrong.

I wonder if we shouldn't add a separate STATISTIC_KIND for this,
instead of trying to transfer knowledge.

Given how simple this patch is, I am tempted to apply it anyway. It
needs a few additional comments to explain what is going on, though.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
