Re: Sequential scans

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Simon Riggs <simon(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sequential scans
Date: 2007-05-02 21:28:26
Message-ID: 4639027A.7080809@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jeff Davis wrote:
> On Wed, 2007-05-02 at 20:02 +0100, Heikki Linnakangas wrote:
>> Gregory Stark wrote:
>>> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>>>
>>>> Let's use a normal hash table instead, and use a lock to protect it. If we only
>>>> update it every 10 pages or so, the overhead should be negligible. To further
>>>> reduce contention, we could modify ReadBuffer to let the caller know if the
>>>> read resulted in a physical read or not, and only update the entry when a page
>>>> is physically read in. That way all the synchronized scanners wouldn't be
>>>> updating the same value, just the one performing the I/O. And while we're at
>>>> it, let's use the full relfilenode instead of just the table oid in the hash.
>>> It's probably fine to just do that. But if we find it's a performance
>>> bottleneck we could probably still manage to avoid the lock except when
>>> actually inserting a new hash element. If you just store in the hash an index
>>> into an array stored in global memory then you could get away without a lock
>>> on the element in the array.
>>>
>>> It starts to get to be a fair amount of code when you think about how you
>>> would reuse elements of the array. That's why I suggest only looking at this
>>> if down the road we find that it's a bottleneck.
>> Another trick you could do is to use acquire the lock conditionally when
>> updating it. But I doubt it's a problem anyhow, if we put some sane
>> lower limit in there so that it's not used at all for small tables.
>>
>
> The more sophisticated the data structure the less able we are to avoid
> locking, correct? For instance, if we have an LRU list it might be
> tricky or impossible to avoid locking even on just reads.

Agreed. I'm not concerned about reads, though. You only need to read
from the structure once when you start a scan. It's the updates that
cause most of the traffic.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-05-02 21:28:39 Re: Optimization in convert_string_datum?
Previous Message Tom Lane 2007-05-02 21:13:08 Re: pg_index updates and SI invalidation