Re: Unhappy with error handling in psql's handleCopyOut()

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unhappy with error handling in psql's handleCopyOut()
Date: 2014-02-11 22:03:11
Message-ID: 20140211220311.GV2921@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > I've not gotten back to it yet, but I ran into a related-seeming issue
> > where psql would happily chew up 2G of memory trying to send "COPY
> > failed" notices when it gets disconnected from a server that it's trying
> > to send data to mid-COPY. conn->sock was -1, connection was
> > 'CONNECTION_BAD', but the loop towards the end of handleCopyIn doesn't
> > care and nothing in libpq is changing PQresultStatus():
>
> [ scratches head... ] Offhand I'd have expected PQgetResult to start
> returning error indications. It definitely will do that if it gets
> error indications from the I/O layer. Perhaps it didn't see any
> because asyncStatus had already been reset from PGASYNC_BUSY?

The I/O layer was definitely returning errors, I traced through that
using gdb (tho I don't have it open any more, had to move on to other
things; been planning to reproduce it). Here's the full p *conn:

$9 = {pghost = 0x7fa742753300 "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", pghostaddr = 0x0, pgport = 0x7fa742753340 "XXXXX", pgunixsocket = 0x0, pgtty = 0x7fa742753360 "", connect_timeout = 0x0, client_encoding_initial = 0x0,
pgoptions = 0x7fa742753380 "", appname = 0x0, fbappname = 0x7fa7427533a0 "psql", dbName = 0x7fa7427532e0 "XXXX", replication = 0x0, pguser = 0x7fa7427532c0 "XXXXXXXXXX", pgpass = 0x7fa742755070 "XXXXXXXXXXX", keepalives = 0x0,
keepalives_idle = 0x0, keepalives_interval = 0x0, keepalives_count = 0x0, sslmode = 0x7fa7427533c0 "prefer", sslcompression = 0x7fa7427533e0 "1", sslkey = 0x0, sslcert = 0x0, sslrootcert = 0x0, sslcrl = 0x0, requirepeer = 0x0,
krbsrvname = 0x7fa742753400 "postgres", Pfdebug = 0x0, noticeHooks = {noticeRec = 0x7fa73fed3ec0 <defaultNoticeReceiver>, noticeRecArg = 0x0, noticeProc = 0x7fa740572f20 <NoticeProcessor>, noticeProcArg = 0x0}, events = 0x0,
nEvents = 0, eventArraySize = 0, status = CONNECTION_BAD, asyncStatus = PGASYNC_COPY_IN, xactStatus = PQTRANS_INTRANS, queryclass = PGQUERY_SIMPLE,
last_query = 0x7fa7427809f0 "COPY XXXXXXXXXXXXXXXXXXXXXXXXXX (XXXXXX, XXXXXX, XXXXXX, XXXXXX, XXXXXX) FROM stdin;", last_sqlstate = "\000\000\000\000\000", options_valid = 1 '\001',
nonblocking = 0 '\000', singleRowMode = 0 '\000', copy_is_binary = 0 '\000', copy_already_done = 0, notifyHead = 0x0, notifyTail = 0x0, sock = -1, laddr = {addr = {ss_family = 2, __ss_align = 0,
__ss_padding = '\000' <repeats 111 times>}, salen = 16}, raddr = {addr = {ss_family = 2, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}, salen = 16}, pversion = 196608, sversion = 90302, auth_req_received = 1 '\001',
password_needed = 1 '\001', dot_pgpass_used = 1 '\001', sigpipe_so = 0 '\000', sigpipe_flag = 0 '\000', addrlist = 0x0, addr_cur = 0x0, addrlist_family = 0, setenv_state = SETENV_STATE_IDLE, next_eo = 0x0, send_appname = 1 '\001',
be_pid = 25857, be_key = 393001970, md5Salt = "XXXXXX", pstatus = 0x7fa742788630, client_encoding = 6, std_strings = 1 '\001', verbosity = PQERRORS_DEFAULT, lobjfuncs = 0x0, inBuffer = 0x7fa74274a740 "G", inBufSize = 16384,
inStart = 0, inCursor = 0, inEnd = 0, outBuffer = 0x7fa63a495010 "00229878919\t1", '0' <repeats 12 times>, "13462\tsupplemental\tt\nd", outBufSize = 2147475456, outCount = 2147475445, outMsgStart = 2147475446, outMsgEnd = 2147475450,
rowBuf = 0x7fa742752760, rowBufLen = 32, result = 0x0, next_result = 0x0, allow_ssl_try = 1 '\001', wait_ssl_try = 0 '\000', ssl = 0x0, peer = 0x0, engine = 0x0, gctx = 0x0, gtarg_nam = 0x0, ginbuf = {length = 0, value = 0x0},
goutbuf = {length = 0, value = 0x0}, errorMessage = {data = 0x7fa742752970 "cannot allocate memory for output buffer\n", len = 41, maxlen = 256}, workBuffer = {data = 0x7fa742752a80 "SAVEPOINT", len = 9, maxlen = 256}}

You can see asyncStatus is still PGASYNC_COPY_IN and also how
outBufSize grew up to the 2G mark, along with the 'cannot allocate'
error.

The p *res looked like:

$2 = {ntups = 0, numAttributes = 0, attDescs = 0x0, tuples = 0x0, tupArrSize = 0, numParameters = 0, paramDescs = 0x0, resultStatus = PGRES_COPY_IN,
cmdStatus = "\000\177\000\000(at)\000\000\000\000\000\000\000!\000\000\000\000\000\000\000\210ף?\247\177\000\000\000BxB\247\177\000\000`\000\000\000\000\000\000\000\060\000\000\000\000\000\000\000\300dxB\247\177\000\000\000\000\000",
binary = 0, noticeHooks = {noticeRec = 0x7fa73fed3ec0 <defaultNoticeReceiver>, noticeRecArg = 0x0, noticeProc = 0x7fa740572f20 <NoticeProcessor>, noticeProcArg = 0x0}, events = 0x0, nEvents = 0, client_encoding = 6, errMsg = 0x0,
errFields = 0x0, null_field = "", curBlock = 0x0, curOffset = 0, spaceLeft = 0}

> If so, maybe we need an explicit test on the connection being good before
> we return valid PGRES_COPY_IN etc results.

Right, PQresultStatus() just checks if the PGresult is non-NULL, it's
not doing any checking of the connection state itself. I had been
trying to work out if something else should have been responsible for
changing asyncStatus away from PGASYNC_COPY_IN when the connection
failure is detected, but having PQresultStatus() check the connection
status seems like it would work.

Thanks,

Stephen

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2014-02-11 22:16:32 Re: truncating pg_multixact/members
Previous Message Tom Lane 2014-02-11 21:29:02 Re: Unhappy with error handling in psql's handleCopyOut()