From: | Sergey Mirvoda <sergey(at)mirvoda(dot)com> |
---|---|
To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Андрей Бородин <borodin(at)octonica(dot)com>, michael(at)paquier(dot)xyz, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #15420: Server crash. Segmentation fault when parsing xml file |
Date: | 2018-10-04 14:11:47 |
Message-ID: | CALkWArjA5ApwXTnWWGMSmw6CFUaaTWHiL5gmJuMZXsMsb0tqeQ@mail.gmail.com |
Lists: | pgsql-bugs |
Thu, 4 Oct 2018, 19:03 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>
>
> Thu, 4 Oct 2018 at 13:47, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
>>
>>
>> Thu, 4 Oct 2018 at 13:43, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
>> wrote:
>>
>>>
>>>
>>> 4 Oct 2018, at 16:38, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>>> wrote:
>>>
>>>
>>>
>>>
>>> Actually we found this error on a very fresh installation of Ubuntu 16.04
>>>> and postgres 10.5
>>>> After that we checked every configuration we have.
>>>> And only postgres 9.4 works as expected.
>>>>
>>>
>>> This issue is related to libxml2 limits, and it cannot work with
>>> modern libxml2 libraries.
>>>
>>> Yes, the root cause is inside the libxml2 code.
>>>
>>> Can we protect the postmaster from crashing on a libxml2 error? There is a
>>> bunch of PG_TRY blocks there, but they do not help.
>>>
>>
>> Unfortunately, no. You cannot handle a crash. PostgreSQL doesn't start
>> a separate process for libxml2 calls, and a fault there is fatal.
>>
>
> I played with it, and it looks like some problem with libxml2 and your
> specific document (maybe too many multibyte chars, .. I don't know)
>
> I imported a 200MB XML document with 1M items. So it makes no sense to
> limit XML size on the PostgreSQL side.
>
> It looks like your XML document hits some corner case of libxml2 where it is
> extremely memory expensive. From what I can see, there is a lot of long
> content inside attributes.
>
> Regards
>
Pavel, thank you for your interest.
It is definitely something inside this document.
Actually we loaded about 10k different documents like this one, about 10 GB
of content, and the crash occurs only on this one.
But every other parser we tried (.NET, Java, Python) handled it just
fine.
For now we ended up with a custom plpython function for parsing XML, and it is
slow as hell.
This looks like a regression: pg 9.4 loads this document without any
problem.
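
For illustration only, the kind of Python-side parsing mentioned above could look roughly like the sketch below. It is not the actual function from this report. The names iter_item_attributes, "item" and sample are hypothetical, and the element layout is an assumption. It uses the standard library's xml.etree.ElementTree, which is backed by expat rather than libxml2, and walks the document incrementally instead of building the whole tree first.

```python
import io
import xml.etree.ElementTree as ET


def iter_item_attributes(xml_bytes, tag):
    """Yield the attribute dict of every <tag> element, one at a time.

    Hypothetical helper: the tag name and document layout are assumptions.
    """
    # iterparse reports each element once its end tag has been read.
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == tag:
            # Copy the attributes first, because clear() also empties elem.attrib.
            yield dict(elem.attrib)
            # Release the element's content; the emptied element itself
            # stays attached to its parent.
            elem.clear()


# Tiny made-up document standing in for the real (much larger) file.
sample = b'<items><item id="1" note="abc"/><item id="2" note="def"/></items>'
rows = list(iter_item_attributes(sample, "item"))
```

Inside the server, a body like this would sit in a function declared with LANGUAGE plpython3u. Whether it is fast enough for 10k documents is a separate question.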