From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: using extended statistics to improve join estimates |
Date: | 2021-12-13 13:20:40 |
Message-ID: | 0213d237-efd5-8e7e-fa29-3f36e6e26023@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 11/22/21 02:23, Justin Pryzby wrote:
> Your regression tests include two errors, which appear to be accidental, and
> fixing the error shows that this case is being estimated poorly.
>
> +-- try combining with single-column (and single-expression) statistics
> +DROP STATISTICS join_test_2;
> +ERROR: statistics object "join_test_2" does not exist
> ...
> +ERROR: statistics object "join_stats_2" already exists
>
D'oh, what a silly mistake ...
You're right fixing the DROP STATISTICS results in worse estimate, but
that's actually expected for a fairly simple reason. The join condition
has expressions on both sides, and dropping the statistics means we
don't have any MCV for the join_test_2 side. So the optimizer ends up
not using the regular estimates, as if there were no extended stats.
A couple lines later the script creates an extended statistics on that
expression alone, which fixes this. An expression index would do the
trick too.
Attached is a patch fixing the test and also the issue reported by
Zhihong Yu some time ago.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
0001-Estimate-joins-using-extended-statistics-20211213.patch | text/x-patch | 67.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Gunnar "Nick" Bluth | 2021-12-13 13:21:11 | Re: [PATCH] pg_stat_toast |
Previous Message | talk to ben | 2021-12-13 13:07:36 | Re: Probable memory leak with ECPG and AIX |