Re: "incomplete startup packet" on SGI

From: David Rysdam <drysdam(at)ll(dot)mit(dot)edu>
To:
Cc: "pg >> Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: "incomplete startup packet" on SGI
Date: 2005-12-15 13:31:43
Message-ID: 43A1703F.2010505@ll.mit.edu
Lists: pgsql-general

David Rysdam wrote:

> David Rysdam wrote:
>
>> Tom Lane wrote:
>>
>>> David Rysdam <drysdam(at)ll(dot)mit(dot)edu> writes:
>>>
>>>
>>>> Just finished building and installing on *Sun* (also
>>>> "--without-readline", not that I think that could be the issue):
>>>> Works fine. So it's something to do with the SGI build in particular.
>>>>
>>>
>>>
>>>
>>> More likely it's something to do with weird behavior of the SGI kernel's
>>> TCP stack. I did a little googling for "transport endpoint is not
>>> connected" without turning up anything obviously related, but that or
>>> ENOTCONN is probably what you need to search on.
>>>
>>> regards, tom lane
>>>
>> It's acting like a race condition or pointer problem. When I add
>> random debug printfs/PQflushs to libpq it sometimes works.
>>
> Not a race condition: no threads.
> Not a pointer problem: Electric Fence says nothing. And it works when
> Electric Fence is running, whereas a binary that uses the same libpq
> without linking efence does not work.
>
I know nobody is interested in this, but I think I should document the
"solution" for anyone who finds this thread in the archives. My theory
is that IRIX can't keep up with how fast the PostgreSQL client is
going, and that the debug statements/efence stuff slow the client down
enough for IRIX to catch up and make sure the socket really is there,
connected and working. To that end, I inserted a sleep(1) in
fe-connect.c just before the pqPacketSend(...startpacket...) call.
It's stupid and hacky, but it gets me where I need to be, and maybe this
hint will inspire somebody who knows (and cares) about IRIX to find a
real fix.
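
For anyone who wants something less blunt than a fixed sleep(1): if the
theory above is right, it might be enough to wait until the kernel
reports the socket as writable with no pending error before sending the
startup packet. The following is only a sketch of that idea. The name
wait_until_connected is made up for illustration and is not a libpq
function, and that it cures the IRIX symptom is an assumption that has
not been tested on IRIX.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of libpq): block until fd is writable
 * and has no pending socket error, or until timeout_ms elapses.
 * Returns 0 when the socket looks usable, -1 otherwise with errno set.
 */
static int
wait_until_connected(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int         soerr = 0;
    socklen_t   len = sizeof(soerr);
    int         rc;

    assert(fd >= 0);

    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    /* Retry if a signal interrupts the wait. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;              /* poll() itself failed; errno is set */
    if (rc == 0) {
        errno = ETIMEDOUT;      /* not writable within the timeout */
        return -1;
    }

    /* A failed connect shows up in SO_ERROR, so check it explicitly. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &soerr, &len) < 0)
        return -1;
    if (soerr != 0) {
        errno = soerr;
        return -1;
    }
    return 0;
}
```

The intent would be to call this with the connection's socket descriptor
at the spot where the sleep(1) went, and to send the startup packet only
if it returns 0. Unlike the sleep, it returns as soon as the socket is
ready and reports a failure instead of hiding it.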
