Re: Analyze on large changes...

From: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Rod Taylor" <rbt(at)zort(dot)ca>
Cc: "Hackers List" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Analyze on large changes...
Date: 2002-05-01 18:13:59
Message-ID: 5.1.0.14.1.20020502011225.02ef8d80@192.228.128.13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tom,

(Please correct me where I'm wrong)

Is it possible to reduce the performance impact of dead tuples esp when the
index is used? Right now performance goes down gradually till we vacuum
(something like a 1/x curve).

My limited understanding of current behaviour is the search for a valid
row's tuple goes from older tuples to newer ones via forward links (based
on some old docs[1]).

How about searching from newer tuples to older tuples instead, using
backward links?

My assumption is newer tuples are more likely to be wanted than older ones
- and so the number of tuples to search through will be less this way.

**If index update is ok.
If a tuple is inserted, the index record is updated to point to inserted
tuple, and the inserted tuple is made to point to a previous tuple.
e.g.

Index-> old tuple->older tuple->oldest tuple
Index-> New tuple->old tuple->older tuple->oldest tuple

**if index update not desirable
Index points to first tuple (valid or not).

If a tuple is inserted, the first tuple is updated to point to inserted
tuple, and the inserted tuple is made to point to a previous tuple.
e.g.

Index-> first tuple->old tuple->older tuple->oldest tuple
Index-> first tuple-> New tuple->old tuple->older tuple->oldest tuple

If this is done performance might not deterioriate as much when using index
scans right? I'm not sure if a backward links would help for sequential
scans, which are usually best done forward.

Regards,
Link.

[1] http://developer.postgresql.org/pdf/transactions.pdf
Tuple headers contain:
• xmin: transaction ID of inserting transaction
• xmax: transaction ID of replacing/ deleting transaction (initially NULL)
• forward link: link to newer version of same logical row, if any
Basic idea: tuple is visible if xmin is valid and xmax is not. "Valid"
means
"either committed or the current transaction".
If we plan to update rather than delete, we first add new version of row
to table, then set xmax and forward link in old tuple. Forward link will
be needed by concurrent updaters (but not by readers).

At 10:53 AM 5/1/02 -0400, Tom Lane wrote:
>estimates. [ thinks... ] Actually I think we might just be
>double-counting if we did. The dead tuples surely should not count as
>part of the number of returned rows. We already do account for the
>I/O effort to read them (because I/O is estimated based on the total

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message mlw 2002-05-01 18:24:30 PostgreSQL mission statement?
Previous Message Tom Lane 2002-05-01 18:10:39 Re: Analyze on large changes...