Re: old synchronized scan patch

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, "Hannu Krosing" <hannu(at)skype(dot)net>, "Luke Lonergan" <llonergan(at)greenplum(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, "Eng" <eng(at)intranet(dot)greenplum(dot)com>
Subject: Re: old synchronized scan patch
Date: 2006-12-06 19:04:32
Message-ID: 1165431873.3839.405.camel@silverbirch.site
Lists: pgsql-hackers

On Tue, 2006-12-05 at 13:25 -0500, Tom Lane wrote:
> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> > "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> >> Sure, it should hang around for awhile, and will. The problem is that
> >> its lifetime will be artificially inflated, so that the seqscan ends up
> >> kicking out other blocks that are really of greater importance, rather
> >> than recycling its own old blocks as it should.
>
> > I thought you had switched this all to a clock sweep algorithm.
>
> Yeah ... it's a clock sweep with counter. A buffer's counter is
> incremented by each access and decremented when the sweep passes over
> it. So multiple accesses allow the buffer to survive longer. For a
> large seqscan you really would rather the counter stayed at zero,
> because you want the buffers to be recycled when the sweep comes back
> the first time.
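
[Illustration, not part of the original message.] The clock sweep with a usage counter that Tom describes can be sketched as a toy model. This is only a sketch under stated assumptions: the class name, the counter cap `max_usage`, and the choice to count the initial load as an access are all hypothetical and are not taken from the PostgreSQL source.

```python
class ClockSweepPool:
    """Toy buffer pool: clock sweep with a per-buffer usage counter."""

    def __init__(self, nbuffers, max_usage=5):
        self.blocks = [None] * nbuffers   # block held by each buffer slot
        self.usage = [0] * nbuffers       # per-buffer usage counter
        self.hand = 0                     # clock hand position
        self.max_usage = max_usage        # hypothetical cap on the counter

    def access(self, block):
        """Touch a block, loading it into a victim buffer on a miss."""
        if block in self.blocks:
            slot = self.blocks.index(block)
        else:
            slot = self._evict()
            self.blocks[slot] = block
            self.usage[slot] = 0
        # Every access increments the counter (up to the cap).
        self.usage[slot] = min(self.usage[slot] + 1, self.max_usage)
        return slot

    def _evict(self):
        """Advance the hand, decrementing counters, until one is zero."""
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % len(self.blocks)
            if self.usage[slot] == 0:
                return slot
            self.usage[slot] -= 1


pool = ClockSweepPool(4)
pool.access("hot")            # touched once by a single backend
for _ in range(3):            # same block touched by three synchronized scans
    pool.access("scan-1")
for block in ("a", "b", "c"): # further misses force the sweep to run
    pool.access(block)
```

In this toy run the block touched three times outlives the block touched once, which is the inflated lifetime the quoted text warns about when several scans hit the same buffers.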

If you focus the backends together by synchronizing them, you then
definitely also need to solve the problem of false cache reinforcement.

I envisaged that we would handle the problem by having a large SeqScan
reuse its previous buffers, so that it would avoid polluting the cache.
If we kept track of how many backends were in lock-step together (a
"Conga"), we would be able to check that a block had not been used by
anyone but the Conga members.

So we'd need rules about:
- when to allow a Conga to form (if the file is very big we check, if
not we don't; there is no real need for exact synchronisation in all
cases)
- how to join a Conga
- how to leave a Conga if you fall behind
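
[Illustration, not part of the original message.] The three rules above, plus the "only used by Conga members" check, could be modelled roughly as follows. This is a minimal sketch and not a proposed implementation: the class, the `min_blocks` size threshold, the `max_lag` limit, and the idea of measuring progress in blocks are all assumptions made for the example, and it does no locking, whereas a real version would live in shared memory.

```python
class Conga:
    """Toy model of a group of backends scanning one table in lock-step."""

    def __init__(self, table_blocks, min_blocks=1000, max_lag=16):
        self.table_blocks = table_blocks
        self.min_blocks = min_blocks   # hypothetical "very big" threshold
        self.max_lag = max_lag         # hypothetical lag limit, in blocks
        self.position = {}             # backend id -> progress counter
        self.head = 0                  # progress of the furthest member

    def may_form(self):
        """Rule 1: only bother synchronising scans of big tables."""
        return self.table_blocks >= self.min_blocks

    def join(self, backend):
        """Rule 2: a joiner starts where the group currently is."""
        if not self.may_form():
            return False
        self.position[backend] = self.head
        return True

    def advance(self, backend):
        """Record one more scanned block; rule 3: drop members too far behind."""
        self.position[backend] += 1
        self.head = max(self.head, self.position[backend])
        laggards = [b for b, pos in self.position.items()
                    if self.head - pos > self.max_lag]
        for b in laggards:
            del self.position[b]
        return laggards

    def only_used_by_members(self, users):
        """True if every backend that touched a block is a current member."""
        return set(users) <= set(self.position)


conga = Conga(table_blocks=5000, max_lag=2)
conga.join("A")
conga.join("B")
conga.advance("A")
conga.advance("A")
dropped = conga.advance("A")   # "B" is now more than max_lag blocks behind
```

A block for which `only_used_by_members` returns true would be a candidate for reuse by the scan itself rather than for cache reinforcement.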

The cost of synchronisation (i.e. LWLocks) is much less than the cost of
non-synchronisation (i.e. lots more I/O).

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
