From: | Sergey Mirvoda <sergey(at)mirvoda(dot)com> |
---|---|
To: | andrew(at)tao11(dot)riddles(dot)org(dot)uk |
Cc: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Андрей Бородин <borodin(at)octonica(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #15420: Server crash. Segmentation fault when parsing xml file |
Date: | 2018-10-05 12:08:48 |
Message-ID: | CALkWAriUN-6GsYyURvAB5f5+HsDbb_bx1YgsXMjs0xsMvCd-xQ@mail.gmail.com |
Lists: | pgsql-bugs |
On Fri, Oct 5, 2018 at 10:08 AM Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
wrote:
> >>>>> "Andrey" == Andrey Borodin <x4mmm(at)yandex-team(dot)ru> writes:
>
> >> You're sure about that libxml2 version? I can reproduce a crash on
> >> 2.9.4, but have as yet failed to do so on 2.9.7 (fails with an error
> >> message instead)
>
> Andrey> You are right, there was default 2.9.4 from OS, and 2.9.7 from
> Andrey> brew was not used.
>
> Andrey> x4mmm-osx:pgsql x4mmm$ xmllint --version
> Andrey> xmllint: using libxml version 20904
>
> I have a complete diagnosis of why it crashes on 2.9.4, and I can see
> why it does not crash the same way on 2.9.7, but I would not bet
> anything on 2.9.7 not having some comparable issue.
>
> What happens on 2.9.4 is this (this is all inside libxml2):
>
> - at some point when parsing an element tag, the code decides to raise
> a fatal error and call xmlHaltParser
>
> - xmlHaltParser works by resetting the input buffer's "base" and "cur"
> pointers to point to a literal "" in the code (thus, a null byte)
>
> - xmlParseStartTag2 detects that input->base has changed, and assumes
> that this is because the buffer got reallocated; in the process of
> dealing with this, it resets input->cur to input->base + cur where
> "cur" is a local variable holding the previous offset in the buffer
> (which is now of course nonsense, so input->cur points into the
> weeds)
>
> - something later tries to access the byte at *input->cur and likely
> crashes (depending on many random factors, including load addresses
> of shared libraries and where in the buffer the original error was
> detected)
>
> Between 2.9.4 and 2.9.7 xmlParseStartTag2 was changed to handle buffer
> reallocations differently so it doesn't fail the same way (it no longer
> tries to modify input->cur). But there are so many ways that this error
> path can screw itself up that I honestly would not trust it for one
> second.
>
> --
> Andrew (irc:RhodiumToad)
>
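The 2.9.4 failure sequence in the quoted diagnosis can be illustrated with a toy model. This is only a sketch in Python, not libxml2 code: `ToyMemory`, `ParserInput`, `halt_parser`, `parse_start_tag`, `run`, the addresses, and the `rebase_cur` flag are all hypothetical names and values chosen for illustration. The model assumes, as described above, that halting repoints `base` and `cur` at a one-byte empty string, and that the 2.9.4-style caller then recomputes `cur` from a saved offset.

```python
from dataclasses import dataclass


class SimulatedSegfault(Exception):
    """Raised when the toy memory model is read at an unmapped address."""


@dataclass
class ParserInput:
    base: int  # address of the first byte of the input buffer
    cur: int   # address of the byte the parser will read next


class ToyMemory:
    """Maps a few address ranges to bytes; everything else is unmapped."""

    def __init__(self):
        self.regions = {}  # start address -> bytes

    def map_region(self, start, data):
        self.regions[start] = data

    def read_byte(self, addr):
        for start, data in self.regions.items():
            if start <= addr < start + len(data):
                return data[addr - start]
        raise SimulatedSegfault(f"read at unmapped address {addr:#x}")


def halt_parser(inp, empty_addr):
    # Model of the halt step: point base and cur at a literal "" (one null byte).
    inp.base = empty_addr
    inp.cur = empty_addr


def parse_start_tag(mem, inp, empty_addr, rebase_cur):
    old_base = inp.base
    saved_offset = inp.cur - inp.base  # the local "cur" offset in the description
    halt_parser(inp, empty_addr)       # a fatal error is raised mid-tag
    if rebase_cur and inp.base != old_base:
        # 2.9.4-style assumption: base changed, so the buffer was reallocated.
        inp.cur = inp.base + saved_offset  # stale offset applied to the wrong base
    return mem.read_byte(inp.cur)          # "something later" reads *input->cur


def run(rebase_cur, error_offset=300):
    doc_addr, empty_addr = 0x1000, 0x9000  # arbitrary illustrative addresses
    mem = ToyMemory()
    mem.map_region(doc_addr, b"<a " * 200)  # 600-byte stand-in document
    mem.map_region(empty_addr, b"\x00")     # the literal ""
    inp = ParserInput(base=doc_addr, cur=doc_addr + error_offset)
    return parse_start_tag(mem, inp, empty_addr, rebase_cur)


if __name__ == "__main__":
    print(run(rebase_cur=False))  # cur left alone: reads the null byte
    try:
        run(rebase_cur=True)      # cur rebased with a stale offset
    except SimulatedSegfault as exc:
        print("crash:", exc)
```

In this model, calling `run(rebase_cur=True, error_offset=0)` does not fail, because the stale offset happens to land on the mapped null byte. That loosely mirrors the point that a real crash depends on where the error was detected and on what happens to be mapped at the resulting address.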
Sorry for the top-posting and spelling; T9 and mobile Gmail are not very usable.
Some notes:
If I set xmloption to document,
this code works as expected:
postgres=# select d::xml from
convert_from(pg_read_binary_file('EGRUL_FULL_2018-01-01_X.XML'),'windows-1251')
g(d);
....
postgres=# select xml_is_well_formed(d) from
convert_from(pg_read_binary_file('EGRUL_FULL_2018-01-01_X.XML'),'windows-1251')
g(d);
xml_is_well_formed
--------------------
t
(1 row)
But all other XML functions still crash the server,
for example:
postgres=# select xpath_exists('//СвЮЛ'::text,d::xml) from
convert_from(pg_read_binary_file('egrul/EGRUL_FULL_2018-01-01_X.XML'),'windows-1251')
g(d);
--
--Regards, Sergey Mirvoda