Re: Parallel bitmap heap scan

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel bitmap heap scan
Date: 2017-02-15 12:17:58
Message-ID: CAFiTN-tVUec79T0qyi4Z4wrtUY9bA0Hu6ywv2PuPVdxZnrY5ig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 14, 2017 at 10:18 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Thanks, the external interface to this looks much cleaner now.
> Scrutinizing the internals:

Thanks for the comments, I am working on these. I have few doubts for
some of the comments.

> I suggest removing both "entry1" and "status" from
> TBMSharedInfo and making tbm_prepare_shared_iterate smarter to
> compensate.

This is easily doable and it will look much cleaner. While doing this
I am facing one problem related to
relptr_store and relptr_access. The problem is that when I call
"relptr_store (base, rp, val)" with both base and val as a same
address then it will store offset as 0 which is fine, but when we use
relptr_access with 0 offset it is returning NULL instead of giving the
actual address. I expect it should return directly base when offset is
0. (I am facing this problem because in this case (TBM_ONE_PAGE),
there will be only one page and address of that will be same as
base.).

The question can be why we did not face this issue while converting
the pagetable to page array, because at least one entry will be there
which should have the same address as the base. But coincidently we
did not get that problem because in that case as of now we were
storing dsa_pointer in the head of the element array, therefore first
pagetable element was not same as the base.

There can be multiple solutions to this but none of them looks cleaner to me.
sol1: If relptr_access return NULL then directly use the base address
as our PagetableEntry.
sol2: Instead of using base as start of the element array, just try to
use some modified base as e.g base=base - some number.
sol3: change relptr_access to not return NULL in this case. But, this
will change the current behaviour of this interface and need to
analyze the side effects.

>
> I don't see why you think you need to add sizeof(dsa_pointer) to the
> allocation and store an extra copy of the dsa_pointer in the
> additional space. You are already storing it in tbm->dsa_data and
> that seems good enough. pagetable_free() needs the value, but it has
> a pointer to the TIDBitmap and could just pass tbm->dsa_data directly
> instead of fishing the pointer out of the extra space.

If you see the code of SH_GROW, first it needs to allocate the bigger
chunk copy data from smaller chunk to the bigger chunk and then free
the smaller chunk.

So by the time it call the pagetable_free, it would have already
called the pagetable_allocate for the newer bigger chunk of memory so
now, tbm->dsa_data points to the latest memory, but pagetable_free
wants to free older one.

One solution to this could be that I keep two dsa_pointer in TBM, one
point to old memory and another to new. (After this here we will get
the same problem of relptr_access because now we will have first entry
pointer is same as base pointer)

Please provide your thought.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amul Sul 2017-02-15 12:26:20 Re: Bug in to_timestamp().
Previous Message Amit Kapila 2017-02-15 12:11:18 Re: Parallel Index Scans