Re: Core reported from vaccum function.

From: "Mavinakuli, Prasanna (STSD)" <prasanna(dot)b-m(at)hp(dot)com>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Rao, Srikanth R (STSD)" <srikanth-r(dot)rao-k(at)hp(dot)com>, "Racharla, Chakravarthy (STSD)" <chakravarthy(dot)racharla(at)hp(dot)com>, "Manchenahalli, Raghunandan (STSD)" <raghunandan(dot)manchenahalli(at)hp(dot)com>, "Hebbar, Raghavendra (STSD)" <raaghav(at)hp(dot)com>, "Mavinakuli, Prasanna (STSD)" <prasanna(dot)b-m(at)hp(dot)com>
Subject: Re: Core reported from vaccum function.
Date: 2007-07-31 03:45:18
Message-ID: FFE5D42C74BFBE48A79FA506C15E1E7C06AE521D@bgeexc04.asiapacific.cpqcorp.net
Lists: pgsql-general


Thanks, Alvaro, for your detailed explanation. We would like a few more
clarifications, as we are new to PostgreSQL.

1) When "upgrade" is mentioned, it does NOT mean an upgrade of the table,
but rather the upgrade that happens as a result of executing the vacuum
query. Is that understanding right? (We hit the problem during normal
query execution, not during a PostgreSQL upgrade.)

2) From what we could gather, there is *a* chance of data corruption
during a vacuum query, which might lead to a core dump as well.

The backtrace we have is:

(gdb) bt
#0 0x449c210:0 in HeapTupleSatisfiesNow+0xb0 ()
#1 0x40ec3f0:0 in heap_fetch+0x6f0 ()
#2 0x41c1940:0 in analyze_rel+0x1540 ()
#3 0x42351d0:0 in vacuum+0x370 ()
#4 0x436adb0:0 in ProcessUtility+0xb00 ()
#5 0x4367b50:0 in PortalRunUtility+0x1c0 ()
#6 0x4368600:0 in PortalRun+0x950 ()
#7 0x435eab0:0 in exec_simple_query+0x530 ()
#8 0x4364550:0 in PostgresMain+0x45a0 ()
#9 0x4301c50:0 in ServerLoop+0x15e0 ()
#10 0x4306050:0 in PostmasterMain+0x2050 ()
#11 0x42858c0:0 in main+0x470 ()

Is there any reason to think that this is the result of that corruption?
As we can observe, the core dump occurred during the execution of
HeapTupleSatisfiesNow, which had a fix in a later version for the said
problem. Does that confirm that the core dump happened only because of
the corruption bug present in the earlier version of Postgres?
(Unfortunately, we are still using an older version, 7.4.2, which does
not have that fix.)

Thanks again,
Prasanna.

-----Original Message-----
From: Alvaro Herrera [mailto:alvherre(at)commandprompt(dot)com]
Sent: Monday, July 30, 2007 10:00 PM
To: Mavinakuli, Prasanna (STSD)
Cc: pgsql-general(at)postgresql(dot)org; Tom Lane; Rao, Srikanth R (STSD);
Racharla, Chakravarthy (STSD); Manchenahalli, Raghunandan (STSD);
Hebbar, Raghavendra (STSD)
Subject: Re: [GENERAL] Core reported from vaccum function.

Mavinakuli, Prasanna (STSD) wrote:
>
> Hello Alvaro,
>
> Thanks for your reply.
>
> We could see
> "Fix potential-data-corruption bug in how VACUUM FULL handles UPDATE
> chains (Tom, Pavan Deolasee) " in 7.4.17 release notes.
>
> Could you please elaborate on the above problem? That is, what was the
> actual problem, and what fix was made?

Here is the CVS log entry:

2007-03-14 14:48 tgl

* src/backend/commands/vacuum.c (1.263.2.3):

Fix a longstanding bug in VACUUM FULL's handling of update chains. The
code did not expect that a DEAD tuple could follow a RECENTLY_DEAD tuple
in an update chain, but because the OldestXmin rule for determining
deadness is a simplification of reality, it is possible for this
situation to occur (implying that the RECENTLY_DEAD tuple is in fact
dead to all observers, but this patch does not attempt to exploit that).
The code would follow a chain forward all the way, but then stop before
a DEAD tuple when backing up, meaning that not all of the chain got
moved. This could lead to copying the chain multiple times (resulting
in duplicate copies of the live tuple at its end), or leaving dangling
index entries behind (which, aside from generating warnings from later
vacuums, creates a risk of wrong query results or bogus duplicate-key
errors once the heap slot the index entry points to is repopulated).

The fix is to recheck HeapTupleSatisfiesVacuum while following a chain
forward, and to stop if a DEAD tuple is reached. Each contiguous group
of RECENTLY_DEAD tuples will therefore be copied as a separate chain.
The patch also adds a couple of extra sanity checks to verify correct
behavior.

Per report and test case from Pavan Deolasee.
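
As a rough illustration of the rule described in the log entry (stop
following a chain forward when a DEAD tuple is reached, so that each
contiguous group of non-DEAD tuples is moved as its own chain), here is a
toy model in Python. It is not PostgreSQL code: the Status enum and the
function names follow_chain_forward and split_into_movable_chains are
hypothetical, and an update chain is simplified to a plain list of tuple
states in chain order.

```python
from enum import Enum


class Status(Enum):
    """Simplified tuple states; names mirror the log entry, not the real API."""
    DEAD = "DEAD"
    RECENTLY_DEAD = "RECENTLY_DEAD"
    LIVE = "LIVE"


def follow_chain_forward(chain, start):
    """Walk forward from `start`, stopping before the first DEAD tuple.

    Returns the indices of the tuples that would be moved together.
    """
    members = []
    for i in range(start, len(chain)):
        if chain[i] is Status.DEAD:
            break  # the fix: recheck status and stop at a DEAD tuple
        members.append(i)
    return members


def split_into_movable_chains(chain):
    """Split a chain into contiguous groups of non-DEAD tuples."""
    groups = []
    i = 0
    while i < len(chain):
        if chain[i] is Status.DEAD:
            i += 1  # DEAD tuples are never moved; skip them
            continue
        members = follow_chain_forward(chain, i)
        groups.append(members)
        i = members[-1] + 1
    return groups


if __name__ == "__main__":
    # A DEAD tuple following RECENTLY_DEAD tuples: the case the old code
    # did not expect.
    chain = [
        Status.RECENTLY_DEAD,
        Status.RECENTLY_DEAD,
        Status.DEAD,
        Status.RECENTLY_DEAD,
        Status.LIVE,
    ]
    print(split_into_movable_chains(chain))  # [[0, 1], [3, 4]]
```

In this model, the forward walk and the resulting groups agree on where
the chain ends, which is the consistency the old code lacked when it
followed the chain all the way forward but stopped at a DEAD tuple when
backing up.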

--
Alvaro Herrera
http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
