generating a large XML document

From: Julius Tuskenis <julius(at)nsoft(dot)lt>
To: pgsql-performance(at)postgresql(dot)org
Subject: generating a large XML document
Date: 2011-06-16 19:33:56
Message-ID: 4DFA5AA4.8090001@nsoft.lt
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello, postgresql guru's

When developing ticket export mechanism for our ticketing system I
thought the wise thing would be to generate the result (XML file) in
stored function. This worked fine for small datasets, but for the larger
ones I see it takes much longer to generate the document than to select
the data needed for it.

Here are several "explain alalyze"s :

explain analyze
SELECT * from tex.fnk_access_control_tickets(8560, 0);

"Function Scan on fnk_access_control_tickets (cost=0.25..10.25
rows=1000 width=246) (actual time=479.276..484.429 rows=16292 loops=1)"
"Total runtime: 486.774 ms"

explain analyze
SELECT
--xmlagg(
XMLELEMENT ( NAME "bar",
XMLELEMENT ( NAME "code", tick_barcode),
XMLELEMENT ( NAME "stat", status),
CASE WHEN type IS NOT NULL THEN XMLELEMENT ( NAME "tip", type)
ELSE NULL END,
CASE WHEN sec_name IS NOT NULL THEN XMLELEMENT ( NAME "sec",
sec_name) ELSE NULL END,
CASE WHEN row_name IS NOT NULL THEN XMLELEMENT ( NAME "row",
row_name) ELSE NULL END,
CASE WHEN seat_name IS NOT NULL THEN XMLELEMENT ( NAME "plc",
seat_name) ELSE NULL END,
CASE WHEN substr(tick_barcode,length(tick_barcode),1)= '1' THEN
XMLELEMENT ( NAME "groups",
XMLELEMENT ( NAME "group", 1)
)
ELSE NULL END
)
-- )
FROM tex.fnk_access_control_tickets(8560, 0);

"Function Scan on fnk_access_control_tickets (cost=0.25..17.75
rows=1000 width=238) (actual time=476.446..924.785 rows=16292 loops=1)"
"Total runtime: 928.768 ms"

explain analyze
SELECT
xmlagg(
XMLELEMENT ( NAME "bar",
XMLELEMENT ( NAME "code", tick_barcode),
XMLELEMENT ( NAME "stat", status),
CASE WHEN type IS NOT NULL THEN XMLELEMENT ( NAME "tip", type)
ELSE NULL END,
CASE WHEN sec_name IS NOT NULL THEN XMLELEMENT ( NAME "sec",
sec_name) ELSE NULL END,
CASE WHEN row_name IS NOT NULL THEN XMLELEMENT ( NAME "row",
row_name) ELSE NULL END,
CASE WHEN seat_name IS NOT NULL THEN XMLELEMENT ( NAME "plc",
seat_name) ELSE NULL END,
CASE WHEN substr(tick_barcode,length(tick_barcode),1)= '1' THEN
XMLELEMENT ( NAME "groups",
XMLELEMENT ( NAME "group", 1)
)
ELSE NULL END
)
)
FROM tex.fnk_access_control_tickets(8560, 0);

"Aggregate (cost=12.75..12.77 rows=1 width=238) (actual
time=16110.847..16110.848 rows=1 loops=1)"
" -> Function Scan on fnk_access_control_tickets (cost=0.25..10.25
rows=1000 width=238) (actual time=500.029..520.974 rows=16292 loops=1)"
"Total runtime: 16111.264 ms"

It seems the aggregate combining the elements is to blame... What I did
next was rewriting it in stored function using FOR loop and combining it
in a text variable. Sadly the result was the same (approximately 16
seconds).

From that I had to conclude that text := text + some_text operation is
an expensive one, but now I have no ideas left how to solve the problem.
Any notices, recommendations, advices are very welcome.

I've also tried google'ing on XML creation in Postgresql, but found no
warnings or even mentioning xmlagg could cause a headache. I have
nowhere to turn for help now, so please advice...

Not sure if that will help, but anyway:
Server:
PostgreSQL 9.0.3 on i486-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Debian 4.4.5-10) 4.4.5, 32-bit
see the result of select * from pg_settings in attachment if needed.
Client:
Windows XP, pgAdmin 1.12.3

Thank you in advance.

--
Julius Tuskenis
Programavimo skyriaus vadovas
UAB nSoft
mob. +37068233050

Attachment Content-Type Size
pg_settings.txt text/plain 5.3 KB

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Smith 2011-06-16 22:12:08 Re: Performance advice for a new low(er)-power server
Previous Message Merlin Moncure 2011-06-16 19:04:16 Re: Performance advice for a new low(er)-power server