Re: BUG #15420: Server crash. Segmentation fault when parsing xml file

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: sergey(at)mirvoda(dot)com
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15420: Server crash. Segmentation fault when parsing xml file
Date: 2018-10-04 11:36:39
Message-ID: CAFj8pRCoGkBvaGb4zemawvwQViRUDVjW8xPPHc3ZMBgbxaJprw@mail.gmail.com
Lists: pgsql-bugs

On Thu, Oct 4, 2018 at 13:20, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
wrote:

>
>
> On Thu, Oct 4, 2018 at 12:18, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
>> Hi
>>
>> On Thu, Oct 4, 2018 at 12:12, Sergey Mirvoda <sergey(at)mirvoda(dot)com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Oct 4, 2018 at 2:11 PM Michael Paquier <michael(at)paquier(dot)xyz>
>>> wrote:
>>>
>>>> If you can, could you please attach this file to this thread? This is
>>>> important for the archives.
>>>> --
>>>> Michael
>>>>
>>>
>>> It looks like it is too big to send uncompressed, so here it is as a zip archive.
>>>
>>
>> I tried to import this XML into Postgres with pgimportdoc
>>
>> https://github.com/okbob/pgimportdoc
>>
>> and it looks like a libxml2 issue.
>>
>> pgimportdoc: Unexpected result status: PGRES_FATAL_ERROR
>> pgimportdoc: Error: ERROR: invalid XML content
>> DETAIL: line 178950: internal error: Huge input lookup
>> � органе Пенсионного фонда Российской Федер
>>
>> ^
>> line 178950: attributes construct error
>>
>
> I checked Sergey's example, and it doesn't crash on Linux. The error is
> displayed correctly. It looks like an MS Windows issue in libxml2.
>
> postgres=# select xml_is_well_formed(d) from
> convert_from(pg_read_binary_file('error.xml'),'windows-1251') g(d);
> ┌────────────────────┐
> │ xml_is_well_formed │
> ╞════════════════════╡
> │ f                  │
> └────────────────────┘
> (1 row)
>
> This issue may be triggered by the relatively new libxml2 limits:
>
> https://mail.gnome.org/archives/commits-list/2012-August/msg00645.html
>
> Unfortunately, the default configuration uses xmlParseBalancedChunkMemory to
> parse content, and this function does not accept parser options such as
>
> XML_PARSE_HUGE
>
> so it is hard to fix.
>

Fixing it probably requires refactoring the XML parsing along the lines of
http://xmlsoft.org/examples/parse4.c

Regards

Pavel

> Regards
>
> Pavel
>
>>
>>
>>
>>
>>> --
>>> --Regards, Sergey Mirvoda
>>>
>>
