Re: [HACKERS] Re: pg_dump possible fix, need testers.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: prlw1(at)cam(dot)ac(dot)uk
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Re: pg_dump possible fix, need testers.
Date: 2000-01-25 02:39:39
Message-ID: 8253.948767979@sss.pgh.pa.us
Lists: pgsql-hackers

Patrick Welche <prlw1(at)newn(dot)cam(dot)ac(dot)uk> writes:
> Rerunning the ordinary regression "runtest", the sanity_check passes. The
> difference being that this time I wasn't running a select at the same time.
> The parallel test "runcheck" fails on different parts at different times eg:

> test select_into ... FAILED
> because
> ! psql: connection to database "regression" failed - Backend startup failed

Do you see these failures just from running the parallel regress tests,
without anything else going on? (Theoretically, since the parallel test
script is running its own private postmaster, whatever you might be
doing under other postmasters shouldn't affect it. But you know the
difference between theory and practice...)

> (BTW in resultmap, I need the .*-.*-netbsd rather than just netbsd, I think
> it's because config.guess returns i386-unknown-netbsd1.4P, and just netbsd
> would imply the string starts with netbsd)

Right. My oversight --- fix committed.
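
(A small illustration of the point above, not the regression driver's real matching code: it assumes, as Patrick describes, that the resultmap pattern is matched anchored at the start of the platform string. Python's `re.match` is used here only as a stand-in for that behavior, and `matches_from_start` is a hypothetical helper name.)

```python
import re

# config.guess output quoted in the message above
host = "i386-unknown-netbsd1.4P"

def matches_from_start(pattern, platform):
    # re.match only succeeds if the pattern matches at the beginning
    # of the string, which models an anchored comparison.
    return re.match(pattern, platform) is not None

# Bare "netbsd" would require the platform string to begin with "netbsd".
bare = matches_from_start("netbsd", host)

# ".*-.*-netbsd" lets the cpu and vendor fields precede the OS name.
wild = matches_from_start(".*-.*-netbsd", host)

print("netbsd       ->", bare)
print(".*-.*-netbsd ->", wild)
```

Under that assumption the bare pattern never matches a full cpu-vendor-os triple, which is why the wildcard prefix is needed.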

> 3 times in a row now, gmake runtest on its own is fine, gmake runtest with a
> concurrent join select makes sanity_check fail with

> + NOTICE: RegisterSharedInvalid: SI buffer overflow
> + NOTICE: InvalidateSharedInvalid: cache state reset
> + NOTICE: Index onek_stringu1: NUMBER OF INDEX' TUPLES (1000) IS NOT THE SAME AS HEAP' (2000).
> + Recreate the index.
> + NOTICE: Index onek_hundred: NUMBER OF INDEX' TUPLES (1000) IS NOT THE SAME AS HEAP' (2000).
> + Recreate the index.

> Ah - some joins work. ... It seems to just be a matter of quantity of data
> going down the connection.

Hmm. I betcha that the critical factor here is how much *time* the
outside join takes, not exactly how much data it emits.

If that backend is tied up for long enough then it will cause the SI
buffer to overflow, just as you show above. In theory, that shouldn't
cause any problems beyond the appearance of the overflow/cache reset
NOTICEs (and in fact it did not the last time I tried running things
with a very small SI buffer). It looks like we have recently broken
something in SI overrun handling.
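
(To make the mechanism concrete, here is a toy model of a fixed-size invalidation queue with one busy reader. It is only a sketch under my own assumptions, not the backend's actual SI code: the class `ToySIBuffer`, its methods, and the choice to flag only the backends that fell behind are all hypothetical simplifications.)

```python
class ToySIBuffer:
    """Toy shared-invalidation queue; NOT PostgreSQL's real implementation."""

    def __init__(self, capacity, backends):
        self.capacity = capacity
        self.messages = []                 # queued messages, oldest first
        self.base = 0                      # sequence number of messages[0]
        self.next_read = {b: 0 for b in backends}
        self.must_reset = set()            # backends that missed messages
        self.overflows = 0

    def register(self, msg):
        if len(self.messages) >= self.capacity:
            # Overflow: discard the queue and flag every backend that
            # had not yet read all of it, so it resets its caches.
            self.overflows += 1
            end = self.base + len(self.messages)
            for backend, pos in self.next_read.items():
                if pos < end:
                    self.must_reset.add(backend)
                    self.next_read[backend] = end
            self.messages.clear()
            self.base = end
        self.messages.append(msg)

    def read(self, backend):
        """Return (cache_reset_needed, unread_messages) for one backend."""
        reset = backend in self.must_reset
        self.must_reset.discard(backend)
        start = self.next_read[backend] - self.base
        unread = self.messages[start:]
        self.next_read[backend] = self.base + len(self.messages)
        # Drop messages that every backend has now consumed.
        drop = min(self.next_read.values()) - self.base
        del self.messages[:drop]
        self.base += drop
        return reset, unread


buf = ToySIBuffer(capacity=4, backends=["fast", "slow"])
fast_resets = 0
for i in range(6):
    buf.register(i)
    reset, _ = buf.read("fast")        # keeps up after every message
    fast_resets += int(reset)

# "slow" was tied up the whole time (think: a long-running join).
slow_reset, slow_unread = buf.read("slow")
print(buf.overflows, fast_resets, slow_reset, slow_unread)
```

In this model the only consequence of the overrun is that the lagging backend is told to reset and then carries on with the messages queued afterwards, which is the harmless outcome the theory predicts.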

(In other words, I don't think this has anything to do with Alfred's
changes ... he'll be glad to hear that ;-). Hiroshi may be on the
hook though. I'm going to rebuild with a small SI buffer and see
if it breaks.)

regards, tom lane
