Re: Planner makes sub-optimal execution plan

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Алексей Борщёв <aborschev(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Planner makes sub-optimal execution plan
Date: 2025-09-02 04:41:06
Message-ID: CAApHDvr78vT9=v=0EK0aAvhTqwjWYwwG_KouEzaJSYpim9MtZg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, 2 Sept 2025 at 00:41, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> After disabling MIN/MAX optimization in the grouping_planner function:
> /*
> * Preprocess MIN/MAX aggregates, if any. Note: be careful about
> * adding logic between here and the query_planner() call.
> Anything
> * that is needed in MIN/MAX-optimizable cases will have to be
> * duplicated in planagg.c.
> */
> //if (parse->hasAggs)
> // preprocess_minmax_aggregates(root);
>
> I got a better query plan, but I’m still investigating what went wrong.

This is basically just the standard issue people have with how we cost LIMIT.

If you look at this part:

-> Limit (cost=0.43..7.91 rows=1 width=8)
Output: docum.dt
-> Index Scan using docum_dt_7ee1d676 on public.docum
(cost=0.43..420487.43 rows=56222 width=8)

you can see that the Limit total cost is startup (0.43) plus 420487.43
/ 56222 * 1.

The problem is that the planner assumes there's no correlation between
dt and dt_real. It thinks because a decent number of rows have dt_real
>= '2025-08-14 09:44:09.033592' that it'll find the LIMIT 1 row fairly
quickly by scanning the docum_dt_7ee1d676 index. Unfortunately, it's
not quick because the rows matching dt_real >= '2025-08-14
09:44:09.033592' are nearly at the end of the index.

In v16+, using something like MIN(docum.dt ORDER BY dt) AS "dt__min"
should force the planner into giving a better plan. ORDER BY /
DISTINCT aggregates are a bit less efficient in versions earlier than
that, so doing that would require a bit of additional work if using
v14. It might not be too bad, though.

David

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Matt Long 2025-09-08 19:13:41 Re: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column
Previous Message Alena Rybakina 2025-09-01 12:41:29 Re: Planner makes sub-optimal execution plan