Re: Visibility map thoughts

From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Visibility map thoughts
Date: 2007-11-05 16:13:07
Message-ID: 9362e74e0711050813o7eb2a353q7a7ea6cc9b86b608@mail.gmail.com
Lists: pgsql-hackers

On 11/5/07, Heikki Linnakangas <heikki(at)enterprisedb(dot)com> wrote:
>
> Gokulakannan Somasundaram wrote:
> > a) The inserts won't increase the size of the table. If it increases, it has
> > to lock one full page of Visibility map and this is not suitable for tables,
> > which are short-lived like partitioned tables
>
> The overhead of locking a page is very small.
>
> Actually, extending a heap only needs to touch the visibility map when
> we need a new visibility map page, if we initialize all bits to zero.
> Like we do already anyway.

As you have pointed out, one page in the visibility map covers roughly 65536
pages in the heap (8192 bytes x 8 bits, at one bit per heap page). So even if
we lock the visibility map page only briefly, the lock will affect every scan
that needs to check any of those heap pages.
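
To put a number on that coverage, here is a minimal sketch of the arithmetic.
It assumes 8 kB blocks, one bit per heap page, and (by default) no header in
the map page; the function name and its parameters are made up for
illustration and are not taken from any patch.

```python
def heap_pages_per_map_page(block_size=8192, header_bytes=0, bits_per_heap_page=1):
    """Heap pages covered by one visibility-map page (illustrative only)."""
    # Every usable byte of the map page contributes 8 bits.
    usable_bits = (block_size - header_bytes) * 8
    return usable_bits // bits_per_heap_page


pages = heap_pages_per_map_page()
heap_megabytes = pages * 8192 // (1024 * 1024)
print(pages, "heap pages, i.e.", heap_megabytes, "MB of heap per map page")
```

With a page header the figure drops slightly, but the point stands: a lock on
one map page sits in front of several hundred megabytes of heap.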

> > b) Even if the inserts don't increase the size of the table, it might make
> > DSM useless, if lot of inserts keep converting the all-visible ones to
> > uncertain ones. For that matter, even the Deletes and Updates are also going
> > to make lot of pages into uncertain ones.
>
> Sure. If you have a lot of (random) inserts/updates/deletes, it becomes
> much less useful.
>
> A small mitigating factor is that an insert/update/delete will fetch the
> heap page to memory anyway. Therefore having to access it just after the
> update is cheap. This helps inserts in particular, because after the
> inserting transaction is < OldestXmin, we can set the bit again.

But we can set this bit only in a Vacuum pass. The bit may stay unset until
the Vacuum process comes along and marks the page.
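
The condition being discussed can be sketched as a toy model. This is only an
illustration, not PostgreSQL code: the HeapTuple class and the
page_all_visible function are hypothetical, every xmin is assumed to have
committed, a deleted or updated tuple is assumed to disqualify the page, and
XID wraparound is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HeapTuple:
    xmin: int                   # id of the inserting transaction
    xmax: Optional[int] = None  # id of the deleting/updating transaction, if any


def page_all_visible(tuples: List[HeapTuple], oldest_xmin: int) -> bool:
    """True if every tuple is live and was inserted before oldest_xmin."""
    return all(t.xmax is None and t.xmin < oldest_xmin for t in tuples)


page = [HeapTuple(xmin=100), HeapTuple(xmin=105)]
print(page_all_visible(page, oldest_xmin=103))  # inserter 105 is not yet old enough
print(page_all_visible(page, oldest_xmin=106))  # both inserters precede OldestXmin
```

Whoever runs this check, the bit cannot be set before OldestXmin has moved
past the last inserter, so the page stays uncertain in the meantime.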

> > c) Visibility map gets useless, when there is a long running batch query /
> > periodic background queries which run for longer times
>
> Yeah, long running transactions are a bitch in many ways.

But these are dictated by business requirements, and we need to provide
efficient ways of dealing with them.

> > d) More updates- more blocks of uncertainity - space usage by DSM and the
> > reference made to DSM is just an overhead
> > e) Lot of times, people may not need index-only scans. Again this gets to be
> > a overhead
>
> The beauty of this approach is that the overhead is very small.
>
> > f) If there are scheduled reboots, the DSM crashes and periodic slow-downs
> > in the queries during the time, the DSM gets re-constructed.
>
> That's rubbish.

I think the DSM is not WAL-logged. So when it has to be reconstructed every
time for a big table, isn't that an overhead?

> > I am not opposing this, as it is a redundant feature for Thick indexes.
> > After all every one of us, want Postgres to be the fastest one in the world.
>
> And also the easiest to maintain, most space-efficient, most reliable
> and so forth...

Provided it supports Vacuuming and Freezing.

> > But because DSM has a inherent assumption that lot of tables will become
> > static and all the tuples would be visible to everyone. If there are such
> > tables, then definitely Thick index becomes a overhead in terms of space.
> > But DSM should not become overhead at any cost, as it is a memory resident
> > one at all times and also always gets into the lifecycle of a query. Only
> > way to achieve it is to make it a dual purpose one. It should help Vacuum,
> > freezing and visibility checks.
>
> I don't understand this paragraph.

Because updates, inserts and deletes reduce the usefulness of the Visibility
map, it seems to be designed for mostly static tables, which see few of these
operations.
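
As a rough, back-of-the-envelope illustration of how quickly that usefulness
erodes, assume every page starts out all-visible, each insert/update/delete
touches one heap page chosen uniformly at random, and no Vacuum runs in
between. Under those assumptions (mine, not measurements), the expected
fraction of pages still marked all-visible after k modifications on an N-page
table is (1 - 1/N)^k. The function name is hypothetical.

```python
def expected_all_visible_fraction(n_pages: int, n_modifications: int) -> float:
    """Expected share of pages never touched, under uniform random access."""
    # Each modification misses a given page with probability 1 - 1/n_pages.
    return (1.0 - 1.0 / n_pages) ** n_modifications


for k in (0, 1_000, 10_000):
    print(k, round(expected_all_visible_fraction(10_000, k), 3))
```

Once the number of modifications reaches the number of pages, only a little
over a third of the pages are still expected to be all-visible, which is why
the map pays off mainly on tables that rarely change.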

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)
