From: | James Coleman <jtc331(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: massive FPI_FOR_HINT load after promote |
Date: | 2020-08-11 16:53:30 |
Message-ID: | CAAaqYe8CmnqEbdJ7QL=QRDZYDB0X9+YQJUH=VEdO-qpDYtPURQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Aug 11, 2020 at 2:55 AM Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Tue, 11 Aug 2020 at 07:56, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> >
> > Last week, James reported to us that after promoting a replica, some
> > seqscan was taking a huge amount of time; on investigation he saw that
> > there was a high rate of FPI_FOR_HINT wal messages by the seqscan.
> > Looking closely at the generated traffic, HEAP_XMIN_COMMITTED was being
> > set on some tuples.
> >
> > Now this may seem obvious to some as a drawback of the current system,
> > but I was taken by surprise. The problem was simply that when a page is
> > examined by a seqscan, we do HeapTupleSatisfiesVisibility of each tuple
> > in isolation; and for each tuple we call SetHintBits(). And only the
> > first time the FPI happens; by the time we get to the second tuple, the
> > page is already dirty, so there's no need to emit an FPI. But the FPI
> > we sent only had the bit on the first tuple ... so the standby will not
> > have the bit set for any subsequent tuple. And on promotion, the
> > standby will have to have the bits set for all those tuples, unless you
> > happened to dirty the page again later for other reasons.
> >
> > So if you have some table where tuples gain hint bits in bulk, and
> > rarely modify the pages afterwards, and promote before those pages are
> > frozen, then you may end up with a massive amount of pages that will
> > need hinting after the promote, which can become troublesome.
>
> Did the case you observed not use hot standby? I thought the impact of
> this issue could be somewhat alleviated in hot standby cases since
> read queries on the hot standby can set hint bits.

We do have hot standby enabled, and large queries that do seq scans
sometimes run against a replica. But there are multiple replicas (and
each one would have to have the bits set), and the replica that gets
promoted in our topology isn't guaranteed to be one that has seen
those reads.

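For illustration only, here is a toy Python model of the mechanism
Alvaro describes above. It is a sketch, not PostgreSQL code: `Page`,
`seqscan`, and the 100-tuples-per-page figure are hypothetical names
and numbers made up for this example, and it assumes hint-bit changes
are WAL-logged at all (data checksums or wal_log_hints enabled).

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a heap page (not a PostgreSQL structure)."""
    hinted: list          # hinted[i] is True once tuple i has its hint bit
    dirty: bool = False   # has the page been dirtied since the last checkpoint?


def seqscan(primary: Page, standby: Page) -> int:
    """Set hint bits tuple by tuple on `primary`; return the FPIs emitted.

    Only the first hint-bit change on a clean page emits a full-page
    image, and that image captures the page as it looks at that moment.
    """
    fpis = 0
    for i in range(len(primary.hinted)):
        if primary.hinted[i]:
            continue                      # already hinted, nothing to do
        primary.hinted[i] = True          # set the hint bit on this tuple
        if not primary.dirty:
            primary.dirty = True
            # The FPI replays on the standby with only the bits set so far.
            standby.hinted = list(primary.hinted)
            fpis += 1
        # Page already dirty: later hint bits produce no WAL at all.
    return fpis


if __name__ == "__main__":
    primary = Page([False] * 100)
    standby = Page([False] * 100)
    print("FPIs emitted on primary:", seqscan(primary, standby))
    print("tuples hinted on primary:", sum(primary.hinted))
    print("tuples hinted on standby:", sum(standby.hinted))
```

In this model the standby ends up with a single hinted tuple per page,
so after promotion the first scan of each such page has to set the
remaining bits and emits one more FPI per page.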
James