From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: logical changeset generation v6.2 |
Date: | 2013-10-15 15:02:39 |
Message-ID: | CA+TgmoazUmHbC1qUhsqhw6n8zJUX2x_BSQV3sQ0k77uXJN2JFg@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Oct 15, 2013 at 10:48 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > What about columns like:
>> > * action B|I|U|D|C
>>
>> BEGIN and COMMIT?
>
> That's B and C, yes. You'd rather not have them? When would you replay
> the commit without an explicit message telling you to?
No, BEGIN and COMMIT sounds good, actually. Just wanted to make sure
I understood.
>> Repeating the column names for every row strikes me as a nonstarter.
>> [...]
>> Sure, some people may want JSON or XML
>> output that reiterates the labels every time, but for a lot of people
>> that's going to greatly increase the size of the output and be
>> undesirable for that reason.
>
> But I argue that most simpler users - which are exactly the ones a
> generic output plugin is aimed at - will want all column names since it
> makes replay far easier.
Meh, maybe.
>> If the plugin interface isn't rich enough to provide a convenient way
>> to avoid that, then it needs to be fixed so that it is, because it
>> will be a common requirement.
>
> Oh, it surely is possible to avoid repeating it. The output plugin
> interface simply gives you a relcache entry, that contains everything
> necessary.
> The output plugin would need to keep track of whether it has output data
> for a specific relation and it would need to check whether the table
> definition has changed, but I don't see how we could avoid that?
Well, it might be nice if there were a callback for, hey, schema has
changed! Seems like a lot of plugins will want to know that for one
reason or another, and rechecking for every tuple sounds expensive.
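
To make the trade-off concrete, here is a minimal sketch of how a plugin might emit column names only once per relation and rely on a schema-change notification instead of comparing definitions for every tuple. It is illustrative Python, not the output plugin interface: `SchemaTracker`, `emit`, `on_schema_change`, and the tuple shapes are all hypothetical names invented for this example.

```python
from typing import Dict, List, Set, Tuple


class SchemaTracker:
    """Hypothetical helper: remembers which relations already had
    their column names written to the output stream."""

    def __init__(self) -> None:
        self._announced: Set[int] = set()

    def emit(self, rel_id: int, columns: List[str], values: List[object]) -> List[Tuple]:
        """Return the records to write for one changed tuple.

        The column list is emitted only the first time a relation is
        seen, or the first time after on_schema_change() was called.
        """
        out: List[Tuple] = []
        if rel_id not in self._announced:  # cheap set lookup per tuple
            out.append(("schema", rel_id, tuple(columns)))
            self._announced.add(rel_id)
        out.append(("row", rel_id, tuple(values)))
        return out

    def on_schema_change(self, rel_id: int) -> None:
        """What a (hypothetical) schema-changed callback could do:
        forget the relation so its columns are announced again."""
        self._announced.discard(rel_id)
```

Without such a callback, `emit` would have to compare the current column list against a remembered copy on every call, which is the per-tuple recheck discussed above.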
>> > What still needs to be determined is:
>> > * how do we separate and escape multiple values in one CSV column
>> > * how do we represent NULLs
>>
>> I consider the escaping a key design decision. Ideally, it should be
>> something that's easy to reverse from a scripting language; ideally
>> also, it should be something similar to how we handle COPY. These
>> goals may be in conflict; we'll have to pick something.
>
> Note that parsing COPYs is a major PITA from most languages...
>
> Perhaps we should make the default output json instead? With every
> action terminated by a nullbyte?
> That's probably easier to parse from various scripting languages than
> anything else.
I could go for that. It's not quite as compact as I might hope, but
JSON does seem to make people awfully happy.
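
As a rough illustration of why that framing is easy to consume from a scripting language, here is a minimal Python sketch that splits a byte stream on null bytes and decodes each piece as JSON. The field names (`action`, `table`, `new`) are assumptions made up for the example; only the action letters B|I|U|D|C come from this thread, and no output format has been agreed on.

```python
import json
from typing import Iterator


def iter_actions(stream: bytes) -> Iterator[dict]:
    """Yield one decoded JSON object per NUL-terminated record."""
    for chunk in stream.split(b"\x00"):
        if chunk:  # skip the empty piece after the final terminator
            yield json.loads(chunk.decode("utf-8"))


# Hypothetical sample stream: BEGIN, one INSERT, COMMIT.
sample = (
    b'{"action":"B"}\x00'
    b'{"action":"I","table":"t","new":{"id":1,"note":null}}\x00'
    b'{"action":"C"}\x00'
)

actions = list(iter_actions(sample))
```

Splitting on the null byte is safe because valid JSON text cannot contain a raw NUL; control characters inside strings must be escaped. The NULL question also gets a natural answer in this encoding, since a JSON `null` decodes to Python's `None`.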
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company