Re: Cygwin PostgreSQL Regression Test Problems (Revisited)

From: Jason Tishler <Jason(dot)Tishler(at)dothill(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hiroshi Inoue <Inoue(at)tpf(dot)co(dot)jp>, pgsql-ports(at)postgresql(dot)org
Subject: Re: Cygwin PostgreSQL Regression Test Problems (Revisited)
Date: 2001-04-01 03:07:22
Message-ID: 20010331220722.A2591@dothill.com
Lists: pgsql-ports

Tom,

On Sat, Mar 31, 2001 at 05:45:45PM -0500, Tom Lane wrote:
> "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> writes:
> > Oh I found the same description yesterday though I've had no time
> > to test it. If your patch resolves *hang*, it may be the right solution
> > at least for cygwin port.
>
> It seems clear that it's a good idea for fe-misc.c to check the
> exceptfds bit as well as read/write ready --- I'm surprised we have not
> seen problems associated with that on other platforms. I think it
> should check exceptfds all the time, regardless of whether we are
> waiting to read or to write.

I'm glad that you agree. Please post to the list when the change is in
CVS and I will test that this solves the Cygwin regression test (i.e.,
psql) hangs.

BTW, this will also solve the problem of Cygwin psql hanging when no
postmaster is running which I stumbled across when enabling Unix domain
socket support. Previously, I thought that it was a Cygwin problem but
now I know that it is caused by the same pqWait() problem.
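
For illustration, here is a minimal sketch of the kind of wait being
discussed: a select() call that always puts the socket in exceptfds,
whether the caller is waiting to read or to write. This is not the actual
pqWait() code from fe-misc.c. The names wait_for_fd() and
demo_wait_on_pipe() and the millisecond timeout are assumptions made for
the example, and EINTR handling is omitted.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the real pqWait()).
 * Returns 1 if fd became readable/writable as requested OR has an
 * exceptional condition pending, 0 on timeout, -1 if select() failed.
 * Reporting "ready" on an exceptional condition lets the caller's next
 * read/write/connect step surface the error instead of waiting forever.
 */
int wait_for_fd(int fd, int for_read, int for_write, int timeout_ms)
{
    fd_set rfds, wfds, efds;
    struct timeval tv;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);

    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    FD_ZERO(&efds);
    if (for_read)
        FD_SET(fd, &rfds);
    if (for_write)
        FD_SET(fd, &wfds);
    FD_SET(fd, &efds);          /* always watch for exceptions */

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &rfds, &wfds, &efds, &tv);
    if (rc < 0)
        return -1;              /* select() itself failed */
    if (rc == 0)
        return 0;               /* timed out, nothing pending */
    return 1;                   /* read, write, or except bit is set */
}

/*
 * Usage sketch: an empty pipe times out for reading, then becomes
 * readable once a byte is written. Returns 0 if both observations hold.
 */
int demo_wait_on_pipe(void)
{
    int fds[2];
    int before, after;

    if (pipe(fds) != 0)
        return -1;
    before = wait_for_fd(fds[0], 1, 0, 50);
    if (write(fds[1], "x", 1) != 1)
        before = -1;
    after = wait_for_fd(fds[0], 1, 0, 50);
    close(fds[0]);
    close(fds[1]);
    return (before == 0 && after == 1) ? 0 : 1;
}
```

Treating an exceptional condition as "ready" rather than as a separate
return code is a design choice of this sketch. It keeps the caller's
control flow unchanged and leaves error reporting to the following I/O
call.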

> I'm inclined to also accept Jason's change to do the connect() in
> blocking mode on Cygwin.

Actually, the pqWait() fix obviates the blocking connect() change for
Cygwin. So I no longer recommend making the blocking connect() change
for Cygwin, unless you also make it for the other Unixes.

> If we do both of those things, have we
> resolved the issue on Cygwin, or is there still a problem?

If you make both of these changes, then the pqWait() fix will never be
triggered under Cygwin. When I tested my hacky patch to pqWait(), I had
to back out my blocking connect() patch in order for the pqWait() changes
to take effect. With only the pqWait() patch applied, the regression test
still did not hang, although I continued to have spurious failures due to
connection refused conditions.

On Sat, Mar 31, 2001 at 10:15:08AM +0900, Hiroshi Inoue wrote:
> BTW I've never passed the parallel regression test without a hang or
> refusal (with your previous patch applied) under my cygwin
> environment. I added one more connect() call after the refusal and
> passed all regression tests successfully. Hmm, it may be a preferable
> solution.

I'm wondering whether it makes sense to add a simple connection retry
policy, as Hiroshi suggests above. Otherwise, make check will report
spurious failures due to connection refused conditions.
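
As a rough sketch of what such a retry policy could look like (this is
not libpq code), the hypothetical helper below retries a blocking
connect() only when it fails with ECONNREFUSED. The function name, the
IPv4-only address handling, the attempt count, and the delay are all
assumptions made for the example. Calling it with max_attempts = 2
corresponds to Hiroshi's "one more connect() call after the refusal".

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: connect to ip:port, retrying on ECONNREFUSED.
 * Returns a connected socket, or -1 with errno set to the last error.
 * EINTR during connect() is not handled in this sketch.
 */
int connect_with_retry(const char *ip, unsigned short port,
                       int max_attempts, int delay_ms)
{
    struct sockaddr_in sa;
    int attempt;

    assert(max_attempts >= 1 && delay_ms >= 0);

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
    {
        errno = EINVAL;
        return -1;
    }

    for (attempt = 1;; attempt++)
    {
        struct timeval tv;
        int saved_errno;
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        if (fd < 0)
            return -1;
        if (connect(fd, (struct sockaddr *) &sa, sizeof sa) == 0)
            return fd;          /* connected */

        /* A socket whose connect() failed is closed, not reused. */
        saved_errno = errno;
        close(fd);

        /* Only a refusal is retried, and only up to max_attempts. */
        if (saved_errno != ECONNREFUSED || attempt >= max_attempts)
        {
            errno = saved_errno;
            return -1;
        }

        /* Short pause before retrying, using select() as a timer. */
        tv.tv_sec = delay_ms / 1000;
        tv.tv_usec = (delay_ms % 1000) * 1000;
        select(0, NULL, NULL, NULL, &tv);
    }
}
```

Retrying only on ECONNREFUSED keeps other failures (an unreachable host,
for instance) visible immediately, and the bounded attempt count keeps
the "no postmaster running" case from looping forever.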

If it is considered too late in the release cycle for such a change,
then I offer the following suggestions:

1. Change make check to use the serial_schedule or at least allow it to
be easily selected via a make variable (e.g., make schedule=serial_schedule
check).

2. Change the backlog parameter to listen() in src/backend/libpq/pqcomm.c
to a number that will "ensure" that the parallel_schedule version of the
regression test does not generate connection refused conditions. Note
that I'm not even sure this will really work on all (or any) platforms.
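
To make suggestion 2 concrete, the sketch below shows where the backlog
value enters a listener's setup. It is not the code in pqcomm.c, and
open_listener() is a hypothetical name. Note that the operating system
is allowed to cap the value passed to listen() (SOMAXCONN is the usual
advertised limit), which is one reason a larger number may not help
everywhere.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a loopback TCP listener whose pending
 * connection queue length is requested by the caller.
 * Returns the listening socket, or -1 on failure.
 */
int open_listener(unsigned short port, int backlog)
{
    struct sockaddr_in sa;
    int fd;

    assert(backlog > 0);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(port);  /* 0 lets the kernel pick a free port */

    /* backlog is only a hint; the kernel may silently reduce it. */
    if (bind(fd, (struct sockaddr *) &sa, sizeof sa) != 0 ||
        listen(fd, backlog) != 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}
```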

Thanks,
Jason

--
Jason Tishler
Director, Software Engineering Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp. Fax: +1 (732) 264-8798
82 Bethany Road, Suite 7 Email: Jason(dot)Tishler(at)dothill(dot)com
Hazlet, NJ 07730 USA WWW: http://www.dothill.com
