Re: chr() function leads to OOM / killed connection with 8.1, 8.2

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: wiktor(dot)wodecki(at)Net-m(dot)de, pgsql-bugs(at)postgresql(dot)org
Subject: Re: chr() function leads to OOM / killed connection with 8.1, 8.2
Date: 2007-07-19 20:12:40
Message-ID: 469FC5B8.3050609@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Tom Lane wrote:
> Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
>> Tom Lane wrote:
>>> I can reproduce an out-of-memory condition (basically, replace() is
>>> going into an infinite loop because of the invalid input) but I'm
>>> not seeing any crash.
>
>> replace_text reads past the end of source string, byte by byte (or
>> character by character, not sure), and eventually tries to read from an
>> invalid address which causes a segfault. It happens here when start_posn
>> == 367368.
>
> Hm, must be memory-layout-dependent. On mine, the output string buffer
> is growing fast enough to ensure there's still RAM to read, up till the
> kernel says no more.
>
> Anyway the problem is that pg_utf2wchar_with_len silently drops the
> trailing incomplete character in its input, causing text_position_next
> to think the pattern is empty, causing an infinite loop because
> curr_posn never advances. replace_text already tried to guard against
> empty pattern, but it doesn't know about this case.
>
> What I intend to do to fix this is to modify the users of
> text_position_next to believe the string lengths saved by
> text_position_setup, rather than using TEXTLEN() to compute
> the lengths. This will effectively make replace_text and
> friends consistently act as though the partial character isn't there.
>
> In the long run it might be better to make pg_utf2wchar_with_len throw
> an error for bad input, but I'm quite unsure of the consequences of
> that, in view of the existing comment "not ours to throw error".
> Anyway such a potentially-significant behavioral change doesn't seem
> like a good idea to back-patch. (We seem to have this bug in one form
> or another clear back to 7.3...)

I agree we should do the above changes for the sake of robustness, but
isn't the real problem here that chr function can return invalid byte
sequences? That was actually discussed a while back (starting at
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00010.php), but
that was inconclusive.

IMHO chr should at the very least not return invalid byte sequences.
Limiting it to ascii range is not a bad idea either, though that might
break some applications.

Is there any other known loopholes to get invalid data in the database?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2007-07-19 20:26:30 Re: chr() function leads to OOM / killed connection with 8.1, 8.2
Previous Message Tom Lane 2007-07-19 19:48:45 Re: chr() function leads to OOM / killed connection with 8.1, 8.2