Re: Query cancellation on hot standby because of buffer pins

From: Drazen Kacar <drazen(dot)kacar(at)oradian(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Query cancellation on hot standby because of buffer pins
Date: 2015-02-23 08:19:59
Message-ID: CAFxrd3terpV1ot6JPeOua_tFmQurDm_KcCZRpSTFLq7viX0tjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On 20 February 2015 at 23:48, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:

> Drazen Kacar <drazen(dot)kacar(at)oradian(dot)com> wrote:
>
> > I'm seeing query cancellation on hot standby (Postgres 9.3.4,
> > configured with hot_standby_feedback=on) because of buffer pins
> > being held for too long.
> >
> > Database message looks like this:
> >
> > ERROR: canceling statement due to conflict with recovery
> > DETAIL: User was holding shared buffer pin for too long.
> > STATEMENT: ...
>
> It would be helpful to see what that statement was that you
> replaced with the ellipsis.
>

It's largish:

SELECT
products.name AS
product,
null AS
branch_name,
null AS
credit_officer_name,
COUNT(DISTINCT account_id) AS
accounts,
COUNT(DISTINCT client_id) AS
clients,
SUM(balance) AS
balance,
ROUND((SUM(balance) / COUNT(DISTINCT account_id)), 2) AS
averageBalance
FROM deposit_accounts
JOIN accounts ON deposit_accounts.account_id = accounts.id
JOIN clients ON accounts.client_id = clients.id
JOIN products ON accounts.product_id = products.id
JOIN organisation_structure_nodes ON
(clients.organisation_structure_node_id = organisation_structure_nodes.id)
WHERE organisation_structure_nodes.tree_path <@ (SELECT
tree_path FROM organisation_structure_nodes WHERE
organisation_structure_nodes.id = $1)
AND organisation_structure_nodes.tree_path <@ (SELECT
tree_path FROM user_top_visible_node_tree_paths_view WHERE user_id = $2)
AND accounts.status_id = 3 OR accounts.status_id = 4
GROUP BY products.id
ORDER BY product

>
> I you don't already have log_autovacuum_min_duration set to 0 (or
> something very close to it) on the primary, I recommend you set
> that; it would be useful to see if this corresponds to a vacuum on
> the table.
>

It is already at zero. There was no vacuuming on the primary around that
time. I've seen some other cancellations which were probably due to the
vacuuming on the primary (even though hot_standby_feedback is set to on),
but then the error on the standby looked like:

DETAIL: User was holding a relation lock for too long.

I'm not sure why that happened either, but I suppose it has a different
cause. At the time they happened on the standby there was vacuuming of one
table participating in the select query on the primary.

> What is your maintenance_work_mem setting on the primary? Does
> increasing it change the behavior? (You may want to jack that up
> really high for diagnostic purposes.)
>

It's set to 400MB on the primary. These cancellations happened on the
production system and they are very rare (maybe once or twice a week) and I
don't have a way to reproduce them. So I don't think I have a way to
correlate changing of the configuration parameters to query cancellation.

BTW, above mentioned cancellation because of the relation lock and
vacuuming on the primary also happens very rarely and I don't know how to
reproduce that either.

> If you can associate the problem with a particular table, seeing
> the definition of that table wouldn't hurt, either.
>

I don't think I can associate the problem with one table in the above query.

>
>
> Oh, and please update to 9.3.6; there are some nasty bugs fixed.
>
>
It's scheduled for today. :-)

>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Simon Riggs 2015-02-23 10:12:28 Re: Query cancellation on hot standby because of buffer pins
Previous Message Kevin Grittner 2015-02-22 14:27:25 Re: How can i keep my databases when installing PostgreSQL ?