Re: Chronic performance issue with Replication Failover and FSM.

From: Daniel Farina <daniel(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Chronic performance issue with Replication Failover and FSM.
Date: 2012-03-14 01:41:48
Message-ID: CAAZKuFZ7rDAjMZayCbhnqUhsX-SxRYaypfAq56MHvJrbRQAhcg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 13, 2012 at 4:53 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> All,
>
> I've discovered a built-in performance issue with replication failover
> at one site, which I couldn't find searching the archives.  I don't
> really see what we can do to fix it, so I'm posting it here in case
> others might have clever ideas.
>
> 1. The Free Space Map is not replicated between servers.
>
> 2. Thus, when we fail over to a replica, it starts with a blank FSM.
>
> 3. I believe the replica also starts with zero counters for autovacuum.
>
> 4. On a high-UPDATE workload, this means that the replica assumes tables
> have no free space until it starts to build a new FSM or autovacuum
> kicks in on some of the tables, much later on.
>
> 5. If your hosting is such that you fail over a lot (such as on AWS),
> then this causes cumulative table bloat which can only be cured by a
> VACUUM FULL.
>
> I can't see any way around this which wouldn't also bog down
> replication.  Clever ideas, anyone?

Would it bog it down by "much"?

(1 byte per 8kB page) * 2TB = ~250MB. Even if you doubled or tripled it for
pointer-overhead reasons it's pretty minor, whereas VACUUM traffic is
already pretty intense. Still, it's clearly...work.
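
For anyone who wants to check that back-of-envelope figure, here is a rough sketch of the arithmetic. It assumes the default 8kB block size and counts only one byte per heap page; the upper levels of the FSM tree and any replication framing are ignored. The function and constant names are made up for illustration and are not PostgreSQL identifiers.

```python
# Rough estimate of FSM size for a given amount of heap data.
# Assumptions (not taken from the PostgreSQL source tree):
#   - 8 kB pages, the default block size
#   - exactly 1 byte of free-space information per heap page
#   - upper levels of the FSM tree and page headers are ignored

PAGE_SIZE = 8 * 1024        # bytes per heap page (assumed default)
FSM_BYTES_PER_PAGE = 1      # bytes of FSM data per heap page (assumed)


def fsm_leaf_bytes(heap_bytes, page_size=PAGE_SIZE):
    """Return the estimated FSM bytes needed to cover heap_bytes of data."""
    pages = -(-heap_bytes // page_size)  # ceiling division
    return pages * FSM_BYTES_PER_PAGE


if __name__ == "__main__":
    for label, size in (("2 TB (decimal)", 2 * 10**12),
                        ("2 TiB (binary)", 2 * 1024**4)):
        est = fsm_leaf_bytes(size)
        print(f"{label}: {est} bytes, about {est / 10**6:.0f} MB")
```

Depending on whether "2TB" is read as decimal or binary, this lands at roughly 244 MB or 268 MB, so "250MB" is about right either way.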

--
fdr
