Re: pg_dump test instability

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump test instability
Date: 2018-09-12 16:06:35
Message-ID: 6771.1536768395@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> Some small comments on the code:

> Maybe add a ready_list_free() to go with ready_list_init(), instead of
> calling pg_free(ready_list.tes) directly.

> get_next_work_item() has been changed to remove the work item from the
> ready_list. Maybe rename to something like pop_next_work_item()?

Both seem reasonable, will do.

> I'm confused by what ready_list_remove() is doing when it's not removing
> the first item. It looks like it's removing all leading items up to the
> i'th one. Is that what we want? In some cases, we are skipping over
> things that we are not interested in at all, so this would work, but if
> we're just skipping over an item because of a lock conflict, then it's
> not right.

No. In both code paths, the array slot at index first_te is being
physically dropped from the set of valid entries (by incrementing
first_te). In the first path, that slot holds the item we want to
remove logically from the set, so that incrementing first_te is
all we have to do: the remaining entries are still in the range
first_te..last_te, and they're still sorted. In the second code
path, the item that was in that slot is still wanted as part of
the set, so we copy it into the valid range (overwriting the item
in slot i, which is no longer wanted). Now the valid range is
probably not sorted, so we have to flag that a re-sort is needed.

I expect that most of the time the first code path will be taken,
because usually we'll be able to dispatch the highest-priority
ready entry. We'll only take the second path when we have to postpone
the highest-priority entry because of a potential lock conflict
against some already-running task. Any items between first_te and i
are other tasks that also have lock conflicts and can't be dispatched
yet; we certainly don't want to lose them, and this code doesn't.
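
To make the two code paths concrete, here is a minimal sketch of the
removal logic as described above. It is not the code from the patch:
the struct and function names are hypothetical, the entries are plain
ints rather than TocEntry pointers, and the "sorted" flag is an assumed
way of recording that a re-sort is needed. Only tes, first_te and
last_te are names taken from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ready list (ints instead of TocEntry *). */
typedef struct
{
    int    *tes;        /* entry array; valid entries are first_te..last_te */
    int     first_te;   /* index of first valid entry */
    int     last_te;    /* index of last valid entry */
    bool    sorted;     /* assumed flag: is the valid range still sorted? */
} ReadyListSketch;

/* Remove the entry at index i from the set of valid entries. */
static void
ready_list_remove_sketch(ReadyListSketch *rl, int i)
{
    int     f = rl->first_te;

    assert(i >= f && i <= rl->last_te);

    if (i != f)
    {
        /*
         * Second path: slot f holds an item we still want, so copy it
         * over the unwanted item in slot i.  The valid range may now be
         * out of order, so note that a re-sort is needed.
         */
        rl->tes[i] = rl->tes[f];
        rl->sorted = false;
    }

    /* Both paths: physically drop slot f from the valid range. */
    rl->first_te++;
}
```

With entries {10, 20, 30, 40} and first_te = 0, removing index 0 just
advances first_te. Removing index 3 afterwards copies 20 into slot 3 and
advances first_te again, leaving {30, 20} in the valid range: the two
postponed items are both still present, only their order has changed.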

If you can suggest comments that would clarify this more,
I'm all ears.

regards, tom lane
