Re: Using multiple extended statistics for estimates

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Mark Dilger <hornschnorter(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using multiple extended statistics for estimates
Date: 2019-11-10 17:33:30
Message-ID: 20191110173330.aki2dcpbwnraeo4x@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Nov 09, 2019 at 12:33:05PM -0800, Mark Dilger wrote:
>
>
>On 11/6/19 11:58 AM, Tomas Vondra wrote:
>>On Wed, Nov 06, 2019 at 08:54:40PM +0100, Tomas Vondra wrote:
>>>On Mon, Oct 28, 2019 at 04:20:48PM +0100, Tomas Vondra wrote:
>>>>Hi,
>>>>
>>>>PostgreSQL 10 introduced extended statistics, allowing us to consider
>>>>correlation between columns to improve estimates, and PostgreSQL 12
>>>>added support for MCV statistics. But we still had the limitation that
>>>>we only allowed using a single extended statistics per relation, i.e.
>>>>given a table with two extended stats
>>>>
>>>> CREATE TABLE t (a int, b int, c int, d int);
>>>> CREATE STATISTICS s1 (mcv) ON a, b FROM t;
>>>> CREATE STATISTICS s2 (mcv) ON c, d FROM t;
>>>>
>>>>and a query
>>>>
>>>> SELECT * FROM t WHERE a = 1 AND b = 1 AND c = 1 AND d = 1;
>>>>
>>>>we only ever used one of the statistics (and we considered them in a not
>>>>particularly well determined order).
>>>>
>>>>This patch addresses this by using as many extended stats as possible,
>>>>by adding a loop to statext_mcv_clauselist_selectivity(). In each step
>>>>we pick the "best" applicable statistics (in the sense of covering the
>>>>most attributes) and factor it into the oveall estimate.
>
>Tomas,
>
>Your patch compiles and passes the regression tests for me on debian
>linux under master.
>

Thanks.

>Since your patch does not include modified regression tests, I wrote a
>test that I expected to improve under this new code, but running it
>both before and after applying your patch, there is no change. Please
>find the modified test attached. Am I wrong to expect some change in
>this test's output? If so, can you provide a test example that works
>differently under your patch?
>

Those queries are not improved by the patch, because we only support
clauses "Var op Const" for now - your tests are using "Var op Var" so
that doesn't work.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-11-10 17:34:28 Re: Using multiple extended statistics for estimates
Previous Message Tom Lane 2019-11-10 16:24:34 Re: [bug fix] Produce a crash dump before main() on Windows