Re: Native XML

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Anton" <antonin(dot)houska(at)gmail(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Native XML
Date: 2011-03-01 19:24:16
Message-ID: 24401.1299007456@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> I apparently didn't express myself very well, since you seem to have
> *completely* missed my point. I know we can do tsearch2 searches
> against XML, or JSON, or YAML, or (insert next week's new favorite
> format here). What we can't currently do efficiently is search for
> particular values in some particular place in the hierarchy of a
> document. I've had loads of fun approximating it with regular
> expressions, but some days I'd like life to be easier.

Check.

> What I was arguing for is a new type which would represent the
> structure in a fashion which was independent of the particular text
> format and was efficient to traverse hierarchically. Done right,
> that would map well to GiST. Although, thinking about that some
> more, perhaps there would be a way to create a GiST index suitable
> for that straight from the XML text, and avoid the sharded column.
> A GiST index actually seems pretty close to what such a structure
> would look like anyway....

FWIW, GIN might be a more natural match, at least for the cases where
"place in the document" has a scalar value. If you need to search for
"place" with something other than equality or prefix match semantics,
maybe not.

But in any case I think your point is that this is an indexing problem,
and whether the full document in the table column is pre-parsed or not
isn't all that relevant for performance. I agree. tsearch2 is really a
precedent for your argument, not a distinct approach, because it doesn't
expect pre-parsed text columns either.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-03-01 19:26:37 Re: wrapping up this CommitFest (was Re: knngist - 0.8)
Previous Message Jan Urbański 2011-03-01 19:20:58 Re: pl/python tracebacks