Re: Page at a time index scan

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Page at a time index scan
Date: 2006-05-02 15:40:45
Message-ID: Pine.OSF.4.61.0605021750180.463955@kosh.hut.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

On Tue, 2 May 2006, Simon Riggs wrote:

> On Mon, 2006-05-01 at 22:00 +0300, Heikki Linnakangas wrote:
>>
>> 1. An index scan now needs to allocate enough memory to hold potentially a
>> whole page worth of items. And if you use markpos/restrpos, twice that
>> much. I don't know if that's an issue, but I thought I'd bring that up.
>
> AFAICS the code:
>
> - allocates memory for the markpos whether or not its ever needed? Most
> index scans never call markpos and not all merge joins either, so that
> seems wasteful. We could allocate when btmarkpos() is called for the
> first time, if ever.

Right. I'll do that.

> - allocates 1024 offsets in every case. If this were just a unique index
> retrieval, that would be too much. When scan->is_multiscan == true, go
> straight for 1024, otherwise start with just 32 offsets and double that
> when/if required.

I wonder if that gets a bit too complicated for saving a small amount of
memory? I'll do it if people think it's worth it.

Also, could we calculate a better estimate of the maximum number of
offsets an index page can hold? There's no way a page can really hold
1024 items. Page headers and special space, line pointers, and the actual
keys need some space.

>> However, btbulkdelete doesn't know if the vacuum is a full or
>> lazy one. The patch just assumes it's a lazy vacuum, but the API really
>> needs to be changed to pass that information earlier than at vacuum_cleanup.
>
> Looks like it needs work. Do you have suggestions while you're there?

Now that I look at it: Why do we have a separate vacuum_cleanup function
at all? Calls to index_vacuum_cleanup go hand in hand with calls to
index_bulk_delete. I thought that index_vacuum_cleanup would only be
called after the last cycle of a multi-cycle vacuum, but that doesn't seem
to be the case.

>> 3. Before the patch, a scan would keep the current page pinned to keep
>> vacuum from deleting the current item. The patch doesn't change that
>> behaviour, but it now seems to me that even a pin is no longer needed.
>
> Agreed. The pin has two functions:
> - keep the page from being moved out of the bufmgr - no need anymore
> - stop a vacuum from removing the page - no need anymore. We'll not stop
> on a removable row anymore, so no need.

At the moment, backward scan returns to the page to walk left from there.
It could be changed to take a copy of the left sibling pointer
like forward scans do, eliminating the need to return to the original
page in most cases.

Also, if there's dead tuples on the page, the scan will need to return
to the page to kill them.

But you can always re-pin the page when needed.

> Some of the code doesn't use standard spacing e.g. "if(" should be "if
> (", but other than that it looks very neat and well implemented.

Thanks, I'll fix the spacings.

- Heikki

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2006-05-02 16:04:21 Re: Page at a time index scan
Previous Message Simon Riggs 2006-05-02 13:44:04 Re: Page at a time index scan