Refactoring the API for amgetmulti

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Cc: "Jie Zhang" <jzhang(at)greenplum(dot)com>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Subject: Refactoring the API for amgetmulti
Date: 2006-07-25 22:49:02
Message-ID: 25079.1153867742@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

One of the complaints I had about the bitmap index patch was the extent
to which it wants to modify (and largely create duplicate code paths in)
the existing executor support for bitmap scans. Now maybe I'm missing
something but I don't think that's where the value-add of this patch is.

It strikes me that some of the problem comes from the current definition
of the API for index AM getmulti functions: they're supposed to return
heap TIDs into a caller-supplied array. Now the only caller of
index_getmulti is nodeBitmapIndexscan.c, which simply turns around and
stuffs the TIDs into a tidbitmap --- and tbm_add_tuples() has no
calculation it can amortize across multiple tuples per call, so there's
no efficiencies of scale there. AFAICS the only thing this protocol is
doing for us is incurring an extra index AM call per thousand tuples
(or whatever the intermediate array size is).

What if we dropped the array convention, and simply passed the tidbitmap
object to the index AM's getmulti function, with the instructions "stuff
all the TIDs into this bitmap, and don't come back till you're done"?
For the existing index AMs this'd be only trivially different, but it
should result in some fractional savings of call overhead when scanning
a large number of index entries.

But for a bitmap index this is considerably more interesting, because
it could stuff its data into the tidbitmap without the overhead of
converting to an explicit array-of-TID format. In particular we could
imagine adding some entry points to tidbitmap.c that accept data in a
more friendly format, and that would all be between tidbitmap.c and the
bitmap index AM, without the need to invade large swaths of the executor
to make it happen.

Comments? For the existing AMs this is a pretty trivial change, and
I'd be willing to commit to making it happen before feature freeze if
it seems useful.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2006-07-25 22:52:17 Re: GUC with units, details
Previous Message Bort, Paul 2006-07-25 22:22:04 Re: GUC with units, details