Re: Cygwin PostgreSQL Regression Test Problems (Revisited)

From: Jason Tishler <Jason(dot)Tishler(at)dothill(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-ports(at)postgresql(dot)org
Subject: Re: Cygwin PostgreSQL Regression Test Problems (Revisited)
Date: 2001-03-28 22:34:49
Message-ID: 20010328173449.E510@dothill.com
Lists: pgsql-ports

Tom,

On Wed, Mar 28, 2001 at 04:40:30PM -0500, Tom Lane wrote:
> Jason Tishler <Jason(dot)Tishler(at)dothill(dot)com> writes:
> > I've done the above and it seems to indicate that all backends exited
> > with a status of 0. So, I still don't know why some backends "aborted."
>
> Hm. So what exactly is the failure mode? Do the psql processes report
> any errors? Have they produced (any/all of) the expected output?

The failure mode is always something like the following:

The regression test proceeds normally until one of the larger parallel
groups is running. Then it will hang after output such as:

parallel group (18 tests): point lseg box path circle date polygon time abstime inet interval reltime type_sanity oidjoins opr_sanity timestamp...

If I do a ps, I will see the postmaster process and one or more psql
processes. The corresponding postgres processes are no longer running.
(Were they ever running?) The NT Task Manager shows essentially 100% idle.

I usually kill the psql processes with the following command:

kill $(ps | fgrep psql | awk '{print $1}')
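
A slightly more defensive form of the command above might look like the sketch below. It assumes that the PID is the first field of each ps line and that the command path is the last field, which may not hold for every ps implementation. The function names psql_pids and kill_hung_psql are hypothetical, chosen only for illustration.

```shell
# Print the PIDs of psql processes found in ps-style output on stdin.
# Assumption: PID is the first field, command path is the last field.
psql_pids() {
    awk '$NF ~ /(^|\/)psql(\.exe)?$/ { print $1 }'
}

# Hypothetical wrapper: send the default signal (SIGTERM) to every
# matching process, but only if at least one PID was found.
kill_hung_psql() {
    pids=$(ps | psql_pids)
    if [ -n "$pids" ]; then
        kill $pids
    fi
}
```

Matching only the command field avoids hits elsewhere in the line, and the emptiness check keeps kill from being run with no arguments when no psql process is left.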

Then the regression test will continue with output like the following:

...Signal 15
Signal 15
comments tinterval
point ... ok
lseg ... ok
box ... ok
path ... ok
polygon ... ok
circle ... ok
date ... ok
time ... ok
timestamp ... ok
interval ... ok
abstime ... ok
reltime ... ok
tinterval ... FAILED
inet ... ok
comments ... FAILED
oidjoins ... ok
type_sanity ... ok
opr_sanity ... ok
test geometry ... ok
..

I believe that the "failures" above correspond to the psql processes
that I killed.
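
One way to check that correspondence is to pull the failed test names out of the regression output and compare them with the tests whose psql processes were killed. The sketch below assumes result lines of the form "name ... FAILED", as in the output above. The function name failed_tests is hypothetical.

```shell
# Print the names of tests marked FAILED in regression output on stdin.
# Assumption: result lines end in "<name> ... FAILED".
failed_tests() {
    awk 'NF >= 3 && $NF == "FAILED" && $(NF-1) == "..." { print $(NF-2) }'
}
```

Fed the output shown above, this would list tinterval and comments, the same two names that appear right after the "Signal 15" lines.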

Sometimes the regression test will run to completion without any more
hangs. Sometimes it will hang at one or more large parallel groups. If
I continue to kill the psql processes as above, the regression test will
eventually complete (with more "failures").

I've tried another experiment: killing a postgres backend to see if
the psql process notices the backend dying. It does, but I was only able
to kill the postgres backend with kill -9. Otherwise, postgres ignored
the signal. So, I don't know if my experiment was valid. If a backend
exits normally while a psql is connected, will the psql process notice
this event?

Any other suggestions? Or, should I just run the serial_schedule and
stop my head banging?

Thanks,
Jason
--
Jason Tishler
Director, Software Engineering       Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp.               Fax:   +1 (732) 264-8798
82 Bethany Road, Suite 7             Email: Jason(dot)Tishler(at)dothill(dot)com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com
