Quick Links

Re: postgres chooses objectively wrong index

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Alexey Ermakov <alexius(dot)work(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject:	Re: postgres chooses objectively wrong index
Date:	2026-03-18 18:38:07
Message-ID:	CAHyXU0zhdKYDvt_iON5kVSa4AXa=p_vLP1JMs3u0FFA29rP-aQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Tue, Mar 17, 2026 at 11:27 PM Alexey Ermakov <alexius(dot)work(at)gmail(dot)com>
wrote:

> On 2026-03-18 04:52, Merlin Moncure wrote:
>
> Hello. I think planner doesn't have information about distribution of
>> *async.task_execution_state(task)* unless it's part of any full index. I
>> would try to give that with extended statistics (postgresql 14+):
>>
>> create statistics (mcv) task_task_execution_state_stat on ((async.task_execution_state(task))) from async.task;
>> analyze async.task;
>>
>> If that won't help - please show distribution from pg_stats_ext view for
>> extended statistic above.
>>
>
> This unfortunately fails, probably because the table type includes system
> columns (despite not using them).
>
> orchestrator_service_user(at)orchestrator=> create statistics task_stats
> (mcv) on (async.task_execution_state(task)) from async.task;
> ERROR: statistics creation on system columns is not supported
>
> This would require some refactoring to fix.
>
> Interesting... In that case functional index should help (as it also makes
> statistic for the planner):
>
> create index concurrently on task_task_execution_state_idx async.task using btree ((async.task_execution_state(task)));
>
> analyze async.task;
>
>
This can't help performance, as the index...
CREATE INDEX ON async.task(concurrency_pool, priority, entered)
WHERE async.task_execution_state(task) = 'READY';

...is very precisely configured to provide exactly what's needed; I need
tasks for that exact pool in that exact order if and only if ready. The
partial predicate is designed to keep the index nice and small since only a
small portion of tasks would be eligible at any specific time.

@Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> I'm pretty sure you were following me, but my
abstraction earlier was a bit off;

Simplified, the query is:
> SELECT * FROM foo WHERE a=? AND b=K ORDER BY c, d LIMIT N;
> CREATE INDEX ON foo(a,b,c) WHERE b=K;

Should have been:
SELECT * FROM foo WHERE a=? AND d=K ORDER BY b, c LIMIT N;
CREATE INDEX ON foo(a,b,c) WHERE d=K;

Point being, the index match in on (=, order, order). If a contains any
less than 100% of the total records, and N is small relative to table size,
this ought to be the ideal index for just about any case, the exact match
on partial qual is just gravy.

I think the planner is not giving enough bonus for an exact match versus an
inexact match on partial index mathcing, (A=A should be better than A
IN(A,B,C)), and it's unclear why the planner things bitmap heap + sort is
outperforming a raw read off the index base on marginal estimated row
counts. Lowering random_page_cost definitely biases the plan I like,
but it skews both estimates.

@Alexey Ermakov <alexius(dot)work(at)gmail(dot)com>
If you're interested in more context, see:

pgasync <https://github.com/merlinm/pgasync>
pgflow <https://github.com/merlinm/pgflow>
graph example <https://imgur.com/a/LZNpTC1>

merlin

In response to

Re: postgres chooses objectively wrong index at 2026-03-18 05:27:26 from Alexey Ermakov

Responses

Re: postgres chooses objectively wrong index at 2026-03-18 20:12:13 from Alexey Ermakov
Re: postgres chooses objectively wrong index at 2026-03-19 07:09:33 from Andrei Lepikhov

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Alexey Ermakov	2026-03-18 20:12:13	Re: postgres chooses objectively wrong index
Previous Message	Pavel Stehule	2026-03-18 06:35:27	Re: proposal: schema variables