Quick Links

Re: postgres chooses objectively wrong index

From:	Alexey Ermakov <alexius(dot)work(at)gmail(dot)com>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-performance(at)lists(dot)postgresql(dot)org
Subject:	Re: postgres chooses objectively wrong index
Date:	2026-03-18 05:27:26
Message-ID:	381e7d47-8021-458d-9a64-0cdb80b004f0@gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On 2026-03-18 04:52, Merlin Moncure wrote:
> On Tue, Mar 17, 2026 at 4:16 PM Alexey Ermakov
> <alexius(dot)work(at)gmail(dot)com> wrote:
>
> On 2026-03-18 03:01, Merlin Moncure wrote:
>> I've been maintaining an airflow style orchestrator in pl/pgsql,
>> and it's revealed a performance issue I just can't solve. There
>> is a table, task, which may normally contain billions of rows,
>> but only a tiny portion is interesting for specific reasons—a
>> common pattern in task-type systems.
>>
>> ...
>>
>> I'm wondering if there are other tricks that might apply here,
>> for example, multi column index statistics...curious if anyone
>> has thoughts on that.
>>
>> Any suggestions?
>>
>> merlin
>>
> Hello. I think planner doesn't have information about distribution
> of *async.task_execution_state(task)* unless it's part of any full
> index. I would try to give that with extended statistics
> (postgresql 14+):
>
> create statistics (mcv) task_task_execution_state_stat on ((async.task_execution_state(task))) from async.task;
> analyze async.task;
>
> If that won't help - please show distribution from pg_stats_ext
> view for extended statistic above.
>
>
> This unfortunately fails, probably because the table type includes
> system columns (despite not using them).
> orchestrator_service_user(at)orchestrator=> create statistics task_stats
> (mcv) on (async.task_execution_state(task)) from async.task;
> ERROR: statistics creation on system columns is not supported
>
> This would require some refactoring to fix.

Interesting... In that case functional index should help (as it also
makes statistic for the planner):

create index concurrently on task_task_execution_state_idx async.task using btree ((async.task_execution_state(task)));

analyze async.task;

Perhaps multicolumn index will also help for queries but hard to say
without knowing distributions. We could check state distribution info
after index creation and analyze with query like this:

select * from pg_stats where tablename = 'task_task_execution_state_idx' \gx

Alexey Ermakov

In response to

Re: postgres chooses objectively wrong index at 2026-03-17 22:52:42 from Merlin Moncure

Responses

Re: postgres chooses objectively wrong index at 2026-03-18 18:38:07 from Merlin Moncure

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Pavel Stehule	2026-03-18 06:35:27	Re: proposal: schema variables
Previous Message	Merlin Moncure	2026-03-17 22:52:42	Re: postgres chooses objectively wrong index