Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Subject: Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Date: 2018-01-12 19:55:07
Message-ID: CAH2-Wz=oT=PePbGjS2k5p2SivA80J3S865tvCKHPEGOUJPp-+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 12, 2018 at 6:14 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Jan 12, 2018 at 8:19 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> 1.
>> + if (!IsBootstrapProcessingMode() && !indexInfo->ii_Concurrent)
>> {
>> - snapshot = RegisterSnapshot(GetTransactionSnapshot());
>> - OldestXmin = InvalidTransactionId; /* not used */
>> + OldestXmin = GetOldestXmin(heapRelation, true);
>>
>> I think leader and workers should have the same idea of oldestXmin for
>> the purpose of deciding the visibility of tuples. I think this is
>> ensured in all form of parallel query as we do share the snapshot,
>> however, same doesn't seem to be true for Parallel Index builds.
>
> Hmm. Does it break anything if they use different snapshots? In the
> case of a query that would be disastrous because then you might get
> inconsistent results, but if the snapshot is only being used to
> determine what is and is not dead then I'm not sure it makes much
> difference ... unless the different snapshots will create confusion of
> some other sort.

I think that this is fine. GetOldestXmin() is only used when we have a
ShareLock on the heap relation, and the snapshot is SnapshotAny. We're
only talking about the difference between HEAPTUPLE_DEAD and
HEAPTUPLE_RECENTLY_DEAD here. Indexing a heap tuple when that wasn't
strictly necessary by the time you got to it is normal.

However, it's not okay that GetOldestXmin()'s second argument is true
in the patch, rather than PROCARRAY_FLAGS_VACUUM. That's due to bitrot
that was not caught during some previous rebase (commit af4b1a08
changed the signature). Will fix.

You've given me a lot more to work through in your most recent mail,
Robert. I will probably get the next revision to you on Monday.
Doesn't seem like there is much point in posting what I've done so
far.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message legrand legrand 2018-01-12 20:26:32 Re: PoC: custom signal handler for extensions
Previous Message Mark Rofail 2018-01-12 19:47:19 Re: [HACKERS] GSoC 2017: Foreign Key Arrays