|From:||Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>|
|To:||Peter Geoghegan <pg(at)bowt(dot)ie>|
|Cc:||Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>|
|Subject:||Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)|
|Views:||Raw Message | Whole Thread | Download mbox|
Thanks Peter and Thomas for the review comments.
On Wed, Nov 1, 2017 at 3:59 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Thu, Oct 26, 2017 at 4:22 AM, Rushabh Lathia
> <rushabh(dot)lathia(at)gmail(dot)com> wrote:
> > Attaching the re based patch according to the v22 parallel-hash patch
> I took a quick look at this today, and noticed a few issues:
> * make_name() is used to name files in sharedtuplestore.c, which is
> what is passed to BufFileOpenShared() for parallel hash join. Your
> using your own logic for that within the equivalent logtape.c call to
> BufFileOpenShared(), presumably because make_name() wants to identify
> participants by PID rather than by an ordinal identifier number.
> I think that we need some kind of central registry for things that use
> shared buffiles. It could be that sharedtuplestore.c is further
> generalized to support this, or it could be that they both call
> something else that takes care of naming. It's not okay to have this
> left to random chance.
> You're going to have to ask Thomas about this. You should also use
> MAXPGPATH for the char buffer on the stack.
Used MAXPGPATH for the char buffer.
> * This logtape.c comment needs to be updated, as it's no longer true:
* successfully. In general, workers can take it that the leader will
> * reclaim space in files under their ownership, and so should not
> * reread from tape.
> * Robert hated the comment changes in the header of nbtsort.c. You
> might want to change it back, because he is likely to be the one that
> commits this.
* You should look for similar comments in tuplesort.c (IIRC a couple
> of places will need to be revised).
> * tuplesort_begin_common() should actively reject a randomAccess
> parallel case using elog(ERROR).
> * tuplesort.h should note that randomAccess isn't supported, too.
> * What's this all about?:
> + /* Accessor for the SharedBufFileSet that is at the end of Sharedsort. */
> + #define GetSharedBufFileSet(shared) \
> + ((BufFileSet *) (&(shared)->tapes[(shared)->nTapes]))
> You can't just cast from one type to the other without regard for the
> underling size of the shared memory buffer, which is what this looks
> like to me. This only fails to crash because you're only abusing the
> last member in the tapes array for this purpose, and there happens to
> be enough shared memory slop that you get away with it. I'm pretty
> sure that ltsUnify() ends up clobbering the last/leader tape, which is
> a place where BufFileSet is also used, so this is just wrong. You
> should rethink the shmem structure a little bit.
Fixed this by adding a SharedFileSet directly into the Sharedsort struct.
Thanks Thomas Munro for the offline help here.
* There is still that FIXME comment within leader_takeover_tapes(). I
> believe that you should still have a leader tape (at least in local
> memory in the leader), even though you'll never be able to do anything
> with it, since randomAccess is no longer supported. You can remove the
> FIXME, and just note that you have a leader tape to be consistent with
> the serial case, though recognize that it's not useful. Note that even
> with randomAccess, we always had the leader tape, so it's not that
> different, really.
> I suppose it might make sense to make shared->tapes not have a leader
> tape. It hardly matters -- perhaps you should leave it there in order
> to keep the code simple, as you'll be keeping the leader tape in local
> memory, too. (But it still won't fly to continue to clobber it, of
> course -- you still need to find a dedicated place for BufFileSet in
> shared memory.)
Attaching the latest patch (v13) here and I will continue working on the
improvement part for nbtsort.c and tuplesort.c. Also will perform more
with the attached patch.
Patch is re-base of v25 patch set of Parallel hash.
|Next Message||David Rowley||2017-11-14 09:51:47||Re: [HACKERS] Proposal: Local indexes for partitioned table|
|Previous Message||Moon Insung||2017-11-14 09:36:30||RE: [HACKERS][PATCH]pg_buffercache add a buffer state column, Add fuction to decode buffer state|