Re: Another issue with invalid XML values

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Noah Misch <noah(at)leadboat(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Another issue with invalid XML values
Date: 2011-07-19 23:42:54
Message-ID: 23388.1311118974@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Florian Pflug <fgp(at)phlo(dot)org> writes:
> Updated patch attached. Do you think this is "Ready for Committer"?

I've been looking through this patch. While it's mostly good, I'm
pretty unhappy with the way that the pg_xml_init/pg_xml_done code is
deliberately designed to be non-reentrant (ie, throw an Assert if
pg_xml_init is called twice without pg_xml_done in between).
There are at least two issues with that:

1. If you forget pg_xml_done in some code path, you'll find out from
an Assert at the next pg_xml_init, which is probably far away from where
the actual problem is.

2. I don't think it's entirely unlikely that uses of libxml could be
nested.

xpath_table in particular calls an awful lot of stuff between
pg_xml_init and pg_xml_done, and is at the very least open to loss of
control via an elog before it's called pg_xml_done.

I think this patch has already paid 90% of the notational price for
supporting fully re-entrant use of libxml. What I'm imagining is
that we move all five of the static variables (xml_strictness,
xml_err_occurred, xml_err_buf, xml_structuredErrorFunc_saved,
xml_structuredErrorContext_saved) into a struct that's palloc'd
by pg_xml_init and eventually passed to pg_xml_done. It could be
passed to xml_errorHandler via the currently-unused context argument.
A nice side benefit is that we could get rid of PG_XML_STRICTNESS_NONE.

Now the risk factor if we do that is that if someone misses a
pg_xml_done call, we leave an error handler installed with a context
argument that's probably pointing at garbage, and if someone then tries
to use libxml without re-establishing their error handler, they've
got problems. But they'd have problems anyway with the current form of
the patch. We could provide some defense against this by including a
magic identifier value in the palloc'd struct and making
xml_errorHandler check it before doing anything dangerous. Also, we
could make pg_xml_done warn if libxml's current context pointer is
different from the struct passed to it, which would provide another
means of detection that somebody had missed a cleanup call.

Unless someone sees a major hole in this idea, or a better way to do it,
I'm going to modify the patch along those lines and commit.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2011-07-20 00:45:36 Re: A few user-level questions on Streaming Replication and pg_upgrade
Previous Message Joshua D. Drake 2011-07-19 22:02:25 PgWest CFP closes in two weeks