From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Steve Singer <steve(at)ssinger(dot)info>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Replication identifiers, take 4 |
Date: | 2015-04-07 15:08:16 |
Message-ID: | 20150407150816.GF12291@awork2.anarazel.de |
Lists: | pgsql-hackers |

On 2015-03-24 23:11:26 -0400, Robert Haas wrote:
> On Mon, Feb 16, 2015 at 4:46 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> At a quick glance, this basic design seems workable. I would suggest
> >> expanding the replication IDs to regular 4 byte oids. Two extra bytes is a
> >> small price to pay, to make it work more like everything else in the system.
> >
> > I don't know. Growing from 3 to 5 byte overhead per relevant record (or
> > even 0 to 5 in case the padding is reused) is rather noticeable. If we
> > later find it to be a limit (I seriously doubt that), we can still
> > increase it in a major release without anybody really noticing.
>
> You might notice that Heikki is making the same point here that I've
> attempted to make multiple times in the past: limiting to replication
> identifier to 2 bytes because that's how much padding space you happen
> to have available is optimizing for the wrong thing. What we should
> be optimizing for is consistency and uniformity of design. System
> catalogs have OIDs, so this one should, too. You're not going to be
> able to paper over the fact that the column has some funky data type
> that is unlike every other column in the system.
>
> To the best of my knowledge, the statement that there is a noticeable
> performance cost for those 2 extra bytes is also completely
> unsupported by any actual benchmarking.

I'm starting benchmarks now.

But I have to say: I don't find the idea that you'd need more than 2^16
identifiers anytime soon very credible. The likelihood that replication
identifiers would be the limiting factor in such a setup seems
incredibly small. Just consider how you'd apply changes from so many
remotes, how you'd stream changes to them, and how you'd even configure
such a complex setup. We can easily change the size limit in the next
major release without inconveniencing anybody.

We've gone to quite some lengths to reduce the overhead of WAL. I
don't understand why it's important that we not make compromises
here, when that doesn't seem to matter elsewhere.

Greetings,

Andres Freund