Re: pg_stat_activity crashes

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: petr(at)2ndquadrant(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_stat_activity crashes
Date: 2016-04-21 02:56:40
Message-ID: 20160421.115640.176305136.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

At Wed, 20 Apr 2016 15:14:16 +0200, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote in <571780A8(dot)4070902(at)2ndquadrant(dot)com>
> I noticed sporadic segfaults when selecting from pg_stat_activity on
> current HEAD.
>
> The culprit is the 53be0b1add7064ca5db3cd884302dfc3268d884e commit
> which added more wait info into the pg_stat_get_activity(). More
> specifically, the following code is broken:
>
> + proc = BackendPidGetProc(beentry->st_procpid);
> + wait_event_type = pgstat_get_wait_event_type(proc->wait_event_info);
>
> This needs to check if proc is NULL. When reading the code I noticed
> that the new functions pg_stat_get_backend_wait_event_type() and
> pg_stat_get_backend_wait_event() suffer from the same problem.

Good catch.

> Here is PoC patch which fixes the problem. I am wondering if we should
> raise warning in the pg_stat_get_backend_wait_event_type() and
> pg_stat_get_backend_wait_event() like the pg_signal_backend() does
> when proc is NULL instead of just returning NULL which is what this
> patch does though.

It still leaves the two relevant columns in pg_stat_activity
inconsistent with each other, since it reads the procarray entry
twice without a lock on procarray.

The attached patch adds pgstat_get_wait_event_info to read
wait_event_info in a more appropriate way, and changes
pg_stat_get_wait_event(_type) to take the wait_event_info.
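
To illustrate the idea only (this is not the attached patch), here is a
minimal, self-contained sketch. Everything prefixed with mock_ is
hypothetical, and the bit layout and the names "ClassA" / "EventThree"
are invented for the example; the real PGPROC, procarray and locking
are not modeled. It shows the lookup doing the NULL check and reading
wait_event_info once, with both columns then derived from that single
value so they cannot disagree with each other.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a procarray entry; only the field discussed. */
typedef struct MockProc
{
    int      pid;
    uint32_t wait_event_info;   /* assumption: 0 means "not waiting" */
} MockProc;

/* Hypothetical process table used instead of the real procarray. */
static MockProc mock_procs[] = {
    {101, 0x01000003u},         /* waiting: class 0x01, event 3 */
    {102, 0x00000000u},         /* not waiting */
};

/* Returns NULL when the backend has already gone away. */
static MockProc *
mock_pid_get_proc(int pid)
{
    size_t i;

    for (i = 0; i < sizeof(mock_procs) / sizeof(mock_procs[0]); i++)
        if (mock_procs[i].pid == pid)
            return &mock_procs[i];
    return NULL;
}

/* Single read of wait_event_info, guarded against a missing proc. */
static uint32_t
mock_get_wait_event_info(int pid)
{
    MockProc *proc = mock_pid_get_proc(pid);

    return (proc != NULL) ? proc->wait_event_info : 0;
}

/* Both decoders take the already-read value, never the proc itself. */
static const char *
mock_wait_event_type(uint32_t info)
{
    if (info == 0)
        return NULL;
    return ((info >> 24) == 0x01) ? "ClassA" : "Other";
}

static const char *
mock_wait_event(uint32_t info)
{
    if (info == 0)
        return NULL;
    return ((info & 0xFFFF) == 3) ? "EventThree" : "Unknown";
}

/* Builds one "pid|type|event" row from a single snapshot of the value. */
static void
mock_format_row(int pid, char *buf, size_t buflen)
{
    uint32_t    info = mock_get_wait_event_info(pid);
    const char *type = mock_wait_event_type(info);
    const char *event = mock_wait_event(info);

    snprintf(buf, buflen, "%d|%s|%s", pid,
             type ? type : "NULL", event ? event : "NULL");
}
```

Calling mock_format_row() for a pid that is no longer in the table
yields NULL for both columns instead of dereferencing a NULL pointer.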

Does this work for you?

We may still have an inconsistency between wait_event and query,
or beentry itself, but preventing it would require local
copies of wait_event_info for all corresponding entries in
procarray, which would be overkill.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment: fix-pgstat-proc-null-v2.diff (text/x-patch, 4.5 KB)
