Re: Contrib -- PostgreSQL shared variables

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Contrib -- PostgreSQL shared variables
Date: 2004-08-28 17:26:20
Message-ID: Pine.OSF.4.60.0408282002160.230146@kosh.hut.fi
Lists: pgsql-hackers pgsql-patches

On Sat, 28 Aug 2004 pgsql(at)mohawksoft(dot)com wrote:

>
>> I don't see how this is different from "CREATE TABLE shared_variables
>> (name VARCHAR PRIMARY KEY, value VARCHAR)" and
>> inserting/updating/deleting/selecting from that. Perhaps these are
>> per-session shared variables? In which case, what is the utility of
>> sharing them across shared memory?
>>
>> - --
>> Jonathan Gardner
>
> Well, the issue you don't see is this:
>
> What if you have to update the variables [n] times a second?
>
> You have to vacuum very frequently. If you update a variable a hundred
> times a second, and vacuum only once every minute, the time it takes to
> update ranges from reading one row from the database to reading 5999 dead
> rows to get to the live one. Then you vacuum, then you are back to one row
> again.
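
To put numbers on the quoted scenario, here is a minimal sketch of the
arithmetic. It follows the quoted message's own counting (N updates leave N
row versions, one of which is live); the function name and parameters are
made up for illustration and are not part of any PostgreSQL interface.

```python
def dead_rows_before_vacuum(updates_per_second: int,
                            vacuum_interval_seconds: int) -> int:
    """Dead row versions that pile up just before the next vacuum.

    Assumption (matching the quoted message): every update adds one row
    version, and exactly one of the accumulated versions is live.
    """
    total_versions = updates_per_second * vacuum_interval_seconds
    return max(total_versions - 1, 0)


# 100 updates per second, vacuum once a minute
print(dead_rows_before_vacuum(100, 60))  # 5999
# Same update rate, vacuum once a second
print(dead_rows_before_vacuum(100, 1))   # 99
```

The worst-case number of dead rows scales linearly with the vacuum interval,
which is why vacuuming the hot pages much more often matters here.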

I think the right approach is to tackle that problem instead of working
around it with a completely new variable mechanism.

I've been playing with the idea of a quick vacuum that runs through the
shmem buffers. The idea is that since the pages are already in memory,
the vacuum runs very quickly. Vacuuming the hot pages frequently
enough should avoid the problem you describe. It also saves I/O in the
long run since dirty pages are vacuumed before they are written to
disk, eliminating the need to read in, vacuum and write the same pages
again later.

The problem is of course that to vacuum the heap pages, you have to make
sure that there are no references to the dead tuples from any indexes.

The trivial case is that the table has no indexes. But I believe that
even if the table has ONE index, it's very probable that the corresponding
index pages of the dead tuples are also in memory, since the tuple was
probably accessed through the index.

As the number of indexes gets bigger, the chances of all corresponding
index pages being in memory get smaller.

If the "quick" vacuum, or opportunistic vacuum as I call it, is clever
enough to recognize that there is a dead tuple in memory, and that all the
index pages that reference it are in memory too, it could reliably vacuum
just those tuples without scanning through the whole relation and without
doing any extra I/O.
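
As a rough illustration of that check, here is a toy model in Python. It is
not the attached patch and does not touch any real PostgreSQL structures;
DeadTuple, opportunistic_vacuum and the (index name, page number) keys are
all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class DeadTuple:
    tuple_id: str
    # Hypothetical: (index name, page number) of every index page that
    # holds an entry pointing at this tuple. Empty if the table has no
    # indexes.
    index_pages: list = field(default_factory=list)


def opportunistic_vacuum(heap_pages_in_memory, index_pages_in_memory):
    """Return ids of dead tuples that could be removed without extra I/O.

    heap_pages_in_memory: dict of heap page number -> list of DeadTuple
    index_pages_in_memory: set of (index name, page number) now buffered
    """
    removable = []
    for page_no in sorted(heap_pages_in_memory):
        for dead in heap_pages_in_memory[page_no]:
            # No indexes means an empty list, so all() is True: that is
            # the trivial case.
            if all(p in index_pages_in_memory for p in dead.index_pages):
                removable.append(dead.tuple_id)
    return removable


heap = {
    7: [DeadTuple("a"), DeadTuple("b", [("idx_name", 3)])],
    9: [DeadTuple("c", [("idx_name", 3), ("idx_value", 12)])],
}
buffered_index_pages = {("idx_name", 3)}
print(opportunistic_vacuum(heap, buffered_index_pages))  # ['a', 'b']
```

Tuple "c" is skipped because one of its index pages is not buffered; it
would be left for a regular vacuum.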

I've written some code that implements the trivial case of no indexes. I'm
hoping to extend it to handle the indexes too if I have time. Then we'll
see if it's any good. I've attached a patch with my current ugly
implementation if you want to give it a try.

> On top of that, all the WAL logging that has to take place for each
> "transaction."

How is that a bad thing? You don't want to give up ACID, do you?

- Heikki

Attachment: oppvacuum.diff (text/plain, 18.0 KB)
