Re: Inefficiency in parallel pg_restore with many tables

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Inefficiency in parallel pg_restore with many tables
Date: 2023-07-16 13:45:54
Message-ID: 3787685.1689515154@sss.pgh.pa.us

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 2023-07-15 Sa 13:47, Tom Lane wrote:
>> I wonder if we could replace the sorted ready-list with a priority heap,
>> although that might be complicated by the fact that pop_next_work_item
>> has to be capable of popping something that's not necessarily the
>> largest remaining job. Another idea would be to sort the list a
>> little less eagerly; I think in practice scheduling wouldn't get much
>> worse if we only re-sorted every so often.

> Yeah, I think that last idea is reasonable. Something like: if the
> number added since the last sort is more than min(50, list_length/4),
> then sort. That shouldn't be too invasive.
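
For concreteness, here is a minimal sketch of that threshold heuristic.
All the names in it (WorkItem, ReadyList, nadded_since_sort,
maybe_resort) are invented for illustration; this is not pg_restore's
actual data structure, just the rule Andrew describes.

typedef struct WorkItem
{
	long	data_length;	/* stand-in for a TocEntry's dataLength */
} WorkItem;

typedef struct ReadyList
{
	WorkItem  **items;
	int			nitems;
	int			nadded_since_sort;	/* pushes since the last qsort */
} ReadyList;

/* Sort descending by data_length, so the biggest job comes first. */
static int
cmp_priority_desc(const void *a, const void *b)
{
	const WorkItem *wa = *(WorkItem *const *) a;
	const WorkItem *wb = *(WorkItem *const *) b;

	if (wa->data_length != wb->data_length)
		return (wb->data_length > wa->data_length) ? 1 : -1;
	return 0;
}

/* Re-sort only once enough unsorted entries have piled up. */
static void
maybe_resort(ReadyList *rl)
{
	int			threshold = rl->nitems / 4 < 50 ? rl->nitems / 4 : 50;

	if (rl->nadded_since_sort > threshold)
	{
		qsort(rl->items, rl->nitems, sizeof(WorkItem *),
			  cmp_priority_desc);
		rl->nadded_since_sort = 0;
	}
}

The min(50, nitems/4) cap keeps a short list sorted almost all the
time, while letting a huge ready-list amortize the qsort cost over many
insertions.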

Actually, as long as we're talking about approximately-correct behavior:
let's make the ready_list be a priority heap, and then just make
pop_next_work_item scan forward from the array start until it finds an
item that's runnable per the lock heuristic. If the heap root is
blocked, the next things we'll examine will be its two children.
We might pick the lower-priority of those two, but it's still known to
be higher priority than at least 50% of the remaining heap entries, so
it shouldn't be too awful as a choice. The argument gets weaker the
further you go into the heap, but we're not expecting that having most
of the top entries blocked will be a common case. (Besides which, the
priorities are pretty crude to begin with.) Once selected, pulling out
an entry that is not the heap root is no problem: you fill the hole with
the last array element and re-heapify from that slot (a sift-down, or a
sift-up in the rarer case where the replacement outranks its new parent).
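
To make that concrete, here is a minimal sketch of such a ready-heap,
assuming a plain array-backed binary max-heap. Every name in it
(BinHeap, ReadyItem, can_run_now, and so on) is invented for
illustration; this shows the technique described above, not the actual
pg_restore code.

/* Sketch only: all names here are hypothetical. */
#include <stdbool.h>

typedef struct ReadyItem
{
	long	priority;		/* e.g. dataLength: bigger jobs sort higher */
	void   *te;				/* the TocEntry this slot stands for */
} ReadyItem;

typedef struct BinHeap
{
	ReadyItem  *items;		/* array-backed max-heap on priority */
	int			nitems;
} BinHeap;

static void
swap_items(BinHeap *h, int i, int j)
{
	ReadyItem	tmp = h->items[i];

	h->items[i] = h->items[j];
	h->items[j] = tmp;
}

static void
sift_up(BinHeap *h, int i)
{
	while (i > 0)
	{
		int			parent = (i - 1) / 2;

		if (h->items[parent].priority >= h->items[i].priority)
			break;
		swap_items(h, i, parent);
		i = parent;
	}
}

static void
sift_down(BinHeap *h, int i)
{
	for (;;)
	{
		int			left = 2 * i + 1;
		int			largest = i;

		if (left < h->nitems &&
			h->items[left].priority > h->items[largest].priority)
			largest = left;
		if (left + 1 < h->nitems &&
			h->items[left + 1].priority > h->items[largest].priority)
			largest = left + 1;
		if (largest == i)
			break;
		swap_items(h, i, largest);
		i = largest;
	}
}

/* Delete slot i: fill the hole with the last element, then repair the
 * invariant.  At most one of the two sifts actually moves anything. */
static ReadyItem
heap_remove_at(BinHeap *h, int i)
{
	ReadyItem	result = h->items[i];

	h->items[i] = h->items[--h->nitems];
	if (i < h->nitems)
	{
		sift_down(h, i);
		sift_up(h, i);
	}
	return result;
}

/* Scan in array order for the first entry the lock heuristic lets us
 * run.  Slot 0 is the largest ready job, and each later slot still
 * outranks its whole subtree, so early slots are decent picks. */
static bool
pop_next_work_item(BinHeap *h, bool (*can_run_now) (void *te),
				   ReadyItem *out)
{
	int			i;

	for (i = 0; i < h->nitems; i++)
	{
		if (can_run_now(h->items[i].te))
		{
			*out = heap_remove_at(h, i);
			return true;
		}
	}
	return false;				/* every ready item is lock-blocked */
}

Note that the scan only moves past slot 0 when the biggest ready jobs
are lock-blocked, which we aren't expecting to be the common case, and
removal stays O(log n) wherever the chosen entry sits in the array.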

The main advantage of this over the only-sort-sometimes idea is that
we can guarantee that the largest ready item will always be dispatched
as soon as it can be (because it will be the heap root). So cases
involving one big table (with big indexes) and a lot of little ones
should get scheduled sanely, which is the main thing we want this
algorithm to ensure. With the other approach we can't really promise
much at all.

regards, tom lane
