Re: Potential bug introduced in PG17 with query parallelization - plan flip

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Jon Jenkins <jjenkins(at)gitlab(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Potential bug introduced in PG17 with query parallelization - plan flip
Date: 2025-10-30 22:41:03
Message-ID: CAApHDvqgT-xap-+GZAxBUmvJOoqGE4jqf1TibpSrew0zMBqXAQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 31 Oct 2025 at 10:40, Jon Jenkins <jjenkins(at)gitlab(dot)com> wrote:
> -> Index Scan using index_on_merge_requests_for_latest_diffs
> on public.merge_requests (cost=0.57..1011.36 rows=773 width=4)
> (actual time=0.015..10.901 rows=9271 loops=1)
> Index Cond: (merge_requests.target_project_id = 16567575)

> However, on 17, a "more expensive" plan is proposed that does a gather
> between workers, but largely performs the exact same:

> -> Index Scan using
> index_merge_requests_on_target_project_id_and_created_at_and_id
> (cost=0.57..19664.54 rows=8262 width=4) (actual time=0.021..7.890
> rows=4498 loops=2)

In the two fragments above, you'll notice the row estimates are quite
different. I suspect this is the reason v17 went with the parallel
plan. What could cause this would be just a difference in sampled rows
from ANALYZE. If you're still in a testing environment, do you see the
plan change back to the non-parallel version if you run ANALYZE on the
merge_requests table and then run the query again? Try it a few times.
You might find that the rows that are sampled sometimes leads to
better or worse estimates. It's a bit of a fact of life if you have
large skewed data as it's not practical to remember how many of each
and every value exist in the statistics. You could increase the number
by increasing the statistics targets for that column, per [1]. That
might still not be enough, however. If you can't go high enough, you
might be able to do something by setting it to not gather stats for
this column and then tuning n_distinct for the column to some value
that suits. The drawback there is that doing that is fraught with the
possible difficulties of having the planner always thinking there's
the same number of rows for every target_project_id. You might end up
doing an Index Scan when a Seq Scan is better (which it could be if
you had a particular target_project_id that dominated the table).

> I don't know enough about planner internals to specifically track this
> code, but I did see that this patch has been added to 17 that deals
> with parallelizing:
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e08d74ca1

This isn't related to what you're seeing.

David

[1] https://www.postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-DESC-SET-STATISTICS

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Jacobson 2025-10-30 23:08:04 Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
Previous Message Arne Roland 2025-10-30 22:07:50 Re: apply_scanjoin_target_to_paths and partitionwise join