Re: Decomposing xml into table

From: Surafel Temesgen <surafel3000(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Decomposing xml into table
Date: 2020-06-23 12:25:47
Message-ID: CALAY4q826YiwYEn4f5oV=cZDeDMmqUkm130zBRaOqeEQk3F-fQ@mail.gmail.com
Lists: pgsql-hackers

Hey Tom,

On Mon, Jun 22, 2020 at 10:13 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Big -1 on that. COPY is not for general-purpose data transformation.
> The more unrelated features we load onto it, the slower it will get,
> and probably also the more buggy and unmaintainable.

Adding a new format handler mostly amounts to a format check in a few
places, so I don't think it would have a noticeable performance impact.
As far as I can see, COPY is extensible by design, and I don't think
adding an additional format would be a huge undertaking.

> There's also a
> really fundamental mismatch, in that COPY is designed to do row-by-row
> processing with essentially no cross-row state. How would you square
> that with the inherently nested nature of XML?
>
>
In the XML case the difference is the row delimiter. In XML mode the user
specifies a row-delimiter tag name; everything from a start tag of that
name up to its matching end tag is treated as a single row, and the text
content of the elements in between becomes the column values.
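
As a rough illustration of the decomposition I have in mind (just a
sketch, not the patch itself; the tag name "row" and the two-column
document are made up for the example):

    # Sketch: split an XML document into rows by a user-supplied
    # row-delimiter tag name, the way the proposed COPY mode would.
    import xml.etree.ElementTree as ET

    doc = """
    <data>
      <row><id>1</id><name>foo</name></row>
      <row><id>2</id><name>bar</name></row>
    </data>
    """

    row_delimiter = "row"   # tag name the user would specify
    for row in ET.fromstring(doc).iter(row_delimiter):
        # each child element's text becomes one column value
        print([col.text for col in row])
    # -> ['1', 'foo']
    # -> ['2', 'bar']
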

>
> The big-picture question here, though, is why expend effort on XML at all?
> It seems like JSON is where it's at these days for that problem space.
>

There are legacy systems to consider, and I think XML is still fairly popular.

regards
Surafel
