Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
Cc: "Peter Geoghegan" <pg(at)bowt(dot)ie>, "PostgreSQL mailing lists" <pgsql-bugs(at)postgresql(dot)org>, "Peter Eisentraut" <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values
Date: 2017-08-06 20:00:51
Message-ID: 9397.1502049651@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

I wrote:
> "Daniel Verite" <daniel(at)manitou-mail(dot)org> writes:
>> If someone wants to reproduce the problem, I've made
>> a custom dump (40MB) available at
>> http://www.manitou-mail.org/vrac/words_test.dump

> Thanks for providing this test data. I've attempted, so far
> unsuccessfully, to reproduce the crash.

OK, so after installing ICU 52.1 from upstream sources, I can reproduce
these problems, and I think there's nothing to see here except ICU bugs.

In the case where Daniel reports huge core dumps, the problem seems to be
that ucol_strcollRegular() gets into an infinite loop here:

// We fetch CEs until we hit a non ignorable primary or end.
uint32_t sPrimary;
do {
// We get the next CE
sOrder = ucol_IGetNextCE(coll, sColl, status);
// Stuff it in the buffer
UCOL_CEBUF_PUT(&sCEs, sOrder, sColl, status);
// And keep just the primary part.
sPrimary = sOrder & UCOL_PRIMARYMASK;
} while(sPrimary == 0);

It keeps on stuffing tokens into "sCEs", which is an expansible buffer in
the same spirit as our StringInfos, and once that gets past 2^31 tokens
all hell breaks loose, because the expansion code is entirely unconcerned
about the possibility of integer overflow:

void ucol_CEBuf_Expand(ucol_CEBuf *b, collIterate *ci, UErrorCode *status) {
uint32_t oldSize;
uint32_t newSize;
uint32_t *newBuf;

ci->flags |= UCOL_ITER_ALLOCATED;
oldSize = (uint32_t)(b->pos - b->buf);
newSize = oldSize * 2;
newBuf = (uint32_t *)uprv_malloc(newSize * sizeof(uint32_t));
if(newBuf == NULL) {
*status = U_MEMORY_ALLOCATION_ERROR;
}
else {
uprv_memcpy(newBuf, b->buf, oldSize * sizeof(uint32_t));
...

The buffer size seems to always be a power of 2. After enough iterations
oldSize is 2^31, so newSize overflows to zero, and it allocates a
zero-length newBuf and proceeds to copy ~8GB of data there. No surprise
that we get a SIGSEGV, but before that happens it'll have trashed a ton of
memory. I suspect whatever problems valgrind is reporting are just false
positives induced by that massive memory stomp.

So there are two separate bugs there --- one is that that loop fails to
terminate, for reasons that are doubtless specific to particular collation
data, and the other is that the expansible-buffer code is not robust.

The other class of failures amounts to this loop iterating till it falls
off the end of memory:

while(schar > (tchar = *(UCharOffset+offset))) { /* since the contraction codepoints should be ordered, we skip all that are smaller */
offset++;
}

which is unsurprising, because (in my core dump) schar is 1113834 which is
larger than any possible UChar value, so the loop cannot terminate except
by crashing. The crash occurred while trying to process this string:

buf2 = 0x1614d20 "requ\364\217\273\252te",

and I do not think it's coincidence that that multibyte character
there corresponds to U+10FEEA or decimal 1113834. Apparently
they've got some bugs with dealing with characters beyond U+FFFF,
at least in certain locales.

In short then, ICU 52.1 is just too buggy to contemplate using.
I haven't compared versions to see when these issues were introduced
or fixed. But I'm now thinking that trying to support old ICU
versions is a mistake, and we'd better serve our users by telling
them not to use ICU before some-version-after-52.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Peter Geoghegan 2017-08-06 20:06:21 Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values
Previous Message Andres Freund 2017-08-06 17:48:22 Re: Replication to Postgres 10 on Windows is broken

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2017-08-06 20:06:21 Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values
Previous Message Alik Khilazhev 2017-08-06 18:22:26 Re: [WIP] Zipfian distribution in pgbench