Missing CHECK_FOR_INTERRUPTS in hash joins

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Missing CHECK_FOR_INTERRUPTS in hash joins
Date: 2017-02-15 01:22:00
Message-ID: 6044.1487121720@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I ran into a case where a hash join took a really long time to respond
to a cancel request --- long enough that I gave up and kill -9'd it,
because its memory usage was also growing to the point where the kernel
would likely soon choose to do that for me. The culprit seems to be
that there's no CHECK_FOR_INTERRUPTS anywhere in this loop in
ExecHashJoinNewBatch():

while ((slot = ExecHashJoinGetSavedTuple(hjstate,
innerFile,
&hashvalue,
hjstate->hj_HashTupleSlot)))
{
/*
* NOTE: some tuples may be sent to future batches. Also, it is
* possible for hashtable->nbatch to be increased here!
*/
ExecHashTableInsert(hashtable, slot, hashvalue);
}

so that if you try to cancel while it's sucking a really large batch file
into memory, you lose. (In the pathological case I was checking, the
batch file was many gigabytes in size, and had certainly never all been
resident earlier.)

Adding a C.F.I. inside this loop is the most straightforward fix, but
I am leaning towards adding one in ExecHashJoinGetSavedTuple instead,
because that would also ensure that all successful paths through
ExecHashJoinOuterGetTuple will do a C.F.I. somewhere, and it seems good
for that to be consistent. The other possibility is to put one inside
ExecHashTableInsert, but the only other caller of that doesn't really need
it, since it's relying on ExecProcNode to do one.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-02-15 02:08:35 Re: [Bug fix] PQsendQuery occurs error when target_session_attrs is set to read-write
Previous Message Tom Lane 2017-02-15 00:12:53 Re: WAL consistency check facility