Re: Sequential scans

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Simon Riggs <simon(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sequential scans
Date: 2007-05-03 08:25:29
Message-ID: 46399C79.4050400@enterprisedb.com
Lists: pgsql-hackers

Jeff Davis wrote:
> On Wed, 2007-05-02 at 23:59 +0100, Heikki Linnakangas wrote:
>> Jeff Davis wrote:
>>> On Wed, 2007-05-02 at 20:58 +0100, Heikki Linnakangas wrote:
>>>> Jeff Davis wrote:
>>>>> What should be the maximum size of this hash table?
>>>> Good question. And also, how do you remove entries from it?
>>>>
>>>> I guess the size should somehow be related to the number of backends.
>>>> Each backend will realistically be doing just 1 or at most 2 seq scans
>>>> at a time.
>>>> It also depends on the number of large tables in the databases, but we
>>>> don't have that information easily available. How about using just
>>>> NBackends? That should be plenty, but wasting a few hundred bytes of
>>>> memory won't hurt anyone.
>>> One entry per relation, not per backend, is my current design.
>> Umm, you naturally have just one entry per relation, but we were talking
>> about how many entries the table needs to hold... Your patch had a
>> hard-coded value of 1000, which is quite arbitrary.
>
> I think I'm missing something from your statement "I guess the size
> should somehow be related to number of backends." If there is only one
> entry per relation, why do more backends require a larger hash table?

The hash table keeps track of ongoing seq scans. That's presumably
related to the number of backends; I can't imagine a plan where a backend
executes more than one seq scan at a time, so that gives us an upper
bound of NBackends simultaneous seq scans in the system.
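
If we went that way, the sizing could look something like the sketch
below. SyncScanEntry and scan_locations are names I've invented just to
illustrate, not from your patch; MaxBackends is the existing global
behind what I called NBackends above:

#include "postgres.h"
#include "miscadmin.h"          /* MaxBackends */
#include "storage/block.h"      /* BlockNumber */
#include "storage/shmem.h"      /* ShmemInitHash */
#include "utils/hsearch.h"      /* HASHCTL, hash_estimate_size */

/* Invented entry layout, keyed by relation OID. */
typedef struct SyncScanEntry
{
    Oid         relid;          /* hash key: relation being scanned */
    BlockNumber location;       /* last reported scan position */
} SyncScanEntry;

static HTAB *scan_locations;

Size
SyncScanShmemSize(void)
{
    /* One entry per backend: each runs at most one seq scan at a time. */
    return hash_estimate_size(MaxBackends, sizeof(SyncScanEntry));
}

void
SyncScanShmemInit(void)
{
    HASHCTL     info;

    MemSet(&info, 0, sizeof(info));
    info.keysize = sizeof(Oid);
    info.entrysize = sizeof(SyncScanEntry);
    info.hash = oid_hash;

    scan_locations = ShmemInitHash("Sync Scan Locations",
                                   MaxBackends, MaxBackends,
                                   &info,
                                   HASH_ELEM | HASH_FUNCTION);
}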

Sure, if you only have, say, 5 tables that are large enough that we try
to keep track of them, there can't be more than 5 entries in the table.
But we don't know how many such tables there are at startup time.

A hard-coded constant in the range of 10-100 might be just fine in
practice.

> If I just step back for a second to consider a simpler design:
>
> We will most likely have no more than a few relations larger than the
> minimum threshold for Sync Scanning in any database being scanned
> concurrently.

Note that the list would be shared across all databases in the cluster.
Your point is still valid, though.

> What if I just make an LRU that occupies a fixed size? Any reads or
> updates can start at the MRU item and work backward, and any evictions
> can happen at the LRU item.
>
> Does a hash table really save us cycles if we keep the list small, say,
> 100 items? Looking at all 100 items would only be required perhaps on a
> scan startup. I don't have a good sense of scale here: is following 50
> pointers while holding a lock a significant source of contention?

Hmm. If you only need to scan through it when you start the scan, I
suppose it would be ok.
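
Concretely, here's a minimal standalone sketch of the list you're
describing. The entry layout, the fixed size of 100 and all the names
are invented for illustration; in reality the array would live in shared
memory, protected by a lock:

#include <stddef.h>

typedef unsigned int Oid;           /* as in postgres_ext.h */
typedef unsigned int BlockNumber;   /* as in storage/block.h */

#define SS_NENTRIES 100             /* arbitrary, per the discussion */

typedef struct SSEntry
{
    Oid         relid;              /* relation being seq-scanned */
    BlockNumber location;           /* last reported scan position */
    struct SSEntry *prev;           /* towards the MRU end */
    struct SSEntry *next;           /* towards the LRU end */
} SSEntry;

static SSEntry  entries[SS_NENTRIES];
static SSEntry *mru;                /* head: most recently used */
static SSEntry *lru;                /* tail: least recently used */

/* Thread the fixed array into one list at startup. */
static void
ss_init(void)
{
    int         i;

    for (i = 0; i < SS_NENTRIES; i++)
    {
        entries[i].relid = 0;       /* InvalidOid: slot not in use yet */
        entries[i].prev = (i > 0) ? &entries[i - 1] : NULL;
        entries[i].next = (i < SS_NENTRIES - 1) ? &entries[i + 1] : NULL;
    }
    mru = &entries[0];
    lru = &entries[SS_NENTRIES - 1];
}

/* Look up a relation's scan position, walking from the MRU end. */
static SSEntry *
ss_lookup(Oid relid)
{
    SSEntry    *e;

    for (e = mru; e != NULL; e = e->next)
    {
        if (e->relid == relid)
            return e;
    }
    return NULL;
}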

> 100 is of course arbitrary, and that could grow or shrink automatically
> at runtime.

Except that it can't shrink, because we don't have any convenient moment
to remove an entry from the list, other than dropping the least recently
used entry out of the list when inserting a new one. And since it's in
shared memory, the allocated size is fixed at server start, so we can't
grow it either.
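
To spell that out, continuing the same illustrative sketch as above:
insertion is the only maintenance the list ever gets, and it recycles
the LRU slot in place rather than freeing anything, so the entry count
is fixed for the life of the server:

/* Unlink an entry and relink it at the MRU end. */
static void
ss_move_to_front(SSEntry *e)
{
    if (e == mru)
        return;                     /* already at the head */

    /* unlink */
    e->prev->next = e->next;
    if (e->next != NULL)
        e->next->prev = e->prev;
    else
        lru = e->prev;              /* e was the tail */

    /* relink at the head */
    e->prev = NULL;
    e->next = mru;
    mru->prev = e;
    mru = e;
}

/*
 * Report a scan position.  If the relation is already in the list we
 * update it in place; otherwise the LRU slot is evicted and reused.
 * Nothing is ever freed, which is why the list can never shrink.
 */
static void
ss_report_location(Oid relid, BlockNumber location)
{
    SSEntry    *e = ss_lookup(relid);

    if (e == NULL)
    {
        e = lru;                    /* evict the least recently used */
        e->relid = relid;
    }
    e->location = location;
    ss_move_to_front(e);
}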

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
