| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
| Cc: | PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Potential buffer overrun in spell.c's CheckAffix() |
| Date: | 2026-04-22 13:44:40 |
| Message-ID: | 959933.1776865480@sss.pgh.pa.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-bugs |
Andrey Borodin <x4mmm(at)yandex-team(dot)ru> writes:
>> On 21 Apr 2026, at 22:32, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> + /* protect against buffer overrun */
>> + if (len < Affix->replen || len >= 2 * MAXNORMLEN ||
>> + len - Affix->replen + findlen >= 2 * MAXNORMLEN)
>> + return NULL;
>> +
>> strcpy(newword, word);
>> strcpy(newword + len - Affix->replen, Affix->find);
> Is there a reason for an asymmetric check "len >= 2 * MAXNORMLEN ||”?
Yes. Because of that initial "strcpy(newword, word);", we do actually
need "word" to fit in the output buffer, even if the final output
string is shorter. The other path does not require that.
I suppose we could replace the strcpy with
memcpy(newword, word, len - Affix->replen);
and then we would not need the "len >= 2 * MAXNORMLEN" test
and both paths could share the same check. There's something
to be said for that, though it would be changing the logic to
a greater extent than just "add some safety checks".
>> I chose to do this by silently truncating the input before it can
>> overrun the buffer, using logic comparable to the existing logic in
>> get_nextfield(). Certainly there's at least as good an argument for
>> raising an error, but for now let's follow the existing precedent.
> Is there a reason not to emit WARNING? The data is obviously suspicious…
> Perhaps, there’s a reason, so maybe just document it then.
I could agree with changing all of these cases (including the existing
get_nextfield checks) to throw ERROR; but I don't think that's
something to do in a back-patched bug fix.
Another thing I don't love, but wouldn't change in back branches,
is the use of BUFSIZ for the string lengths. That's far more than
necessary (why not just MAXNORMLEN?), causing these functions to eat
much more stack space than they need. Also it seems like an
unnecessary platform dependency. Maybe BUFSIZ is 8K everywhere,
but I'm not sure of that.
regards, tom lane
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ayush Tiwari | 2026-04-22 14:18:00 | to_date()/to_timestamp() silently accept month=0 and day=0 |
| Previous Message | Ayush Tiwari | 2026-04-22 11:58:41 | Re: Bug in CREATE TABLE .. LIKE .. INCLUDING STATISTICS? |