Re: Sequential scans

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Simon Riggs <simon(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sequential scans
Date: 2007-05-02 22:59:51
Message-ID: 463917E7.4000803@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jeff Davis wrote:
> On Wed, 2007-05-02 at 20:58 +0100, Heikki Linnakangas wrote:
>> Jeff Davis wrote:
>>> What should be the maximum size of this hash table?
>> Good question. And also, how do you remove entries from it?
>>
>> I guess the size should somehow be related to number of backends. Each
>> backend will realistically be doing just 1 or max 2 seq scan at a time.
>> It also depends on the number of large tables in the databases, but we
>> don't have that information easily available. How about using just
>> NBackends? That should be plenty, but wasting a few hundred bytes of
>> memory won't hurt anyone.
>
> One entry per relation, not per backend, is my current design.

Umm, you naturally have just entry per relation, but we were talking
about how many entries the table needs to hold.. You're patch had a
hard-coded value of 1000 which is quite arbitrary.

>> I think you're going to need an LRU list and counter of used entries in
>> addition to the hash table, and when all entries are in use, remove the
>> least recently used one.
>>
>> The thing to keep an eye on is that it doesn't add too much overhead or
>> lock contention in the typical case when there's no concurrent scans.
>>
>> For the locking, use a LWLock.
>
> Ok. What would be the potential lock contention in the case of no
> concurrent scans?

If you only have one seq scan in the system, there's no contention. If
you have more, they will all need to acquire the lock to update the page
number in the hash table, regardless of the table their scanning, so
there's potential for contention.

To put things into perspective, though, any time you need to evict a
buffer from the buffer cache to read in a new buffer, you need to
acquire the BufFreelistLock. If we only update the page number say every
10 pages, the overhead and lock contention of that lock would be roughly
1/10th of that arising from BufFreelistLock. And we can make it a
conditional acquire, in which case the backends won't need to slow down
their scans, but the updates of the entries in the hash table will get
less frequent instead. We could also take just a shared lock to update
the counter, and rely on the atomicity of the write like your patch did.
You'd only need to take an exclusive lock to move entries in the LRU
list or to add or remove an entry.

But let's keep it simple at first, and do some testing..

> Also, is it easy to determine the space used by a dynahash with N
> entries? I haven't looked at the dynahash code yet, so perhaps this will
> be obvious.

Yes, see hash_estimate_size.

BTW: Thanks for the patch!

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-05-02 23:26:27 Re: Avoiding unnecessary reads in recovery
Previous Message Jeff Davis 2007-05-02 22:37:01 Re: Sequential scans