| From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
|---|---|
| To: | petr(at)2ndquadrant(dot)com |
| Cc: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: pg_stat_activity crashes |
| Date: | 2016-04-21 02:56:40 |
| Message-ID: | 20160421.115640.176305136.horiguchi.kyotaro@lab.ntt.co.jp |
| Lists: | pgsql-hackers |
Hello,
At Wed, 20 Apr 2016 15:14:16 +0200, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote in <571780A8(dot)4070902(at)2ndquadrant(dot)com>
> I noticed sporadic segfaults when selecting from pg_stat_activity on
> current HEAD.
>
> The culprit is the 53be0b1add7064ca5db3cd884302dfc3268d884e commit
> which added more wait info into the pg_stat_get_activity(). More
> specifically, the following code is broken:
>
> + proc = BackendPidGetProc(beentry->st_procpid);
> + wait_event_type = pgstat_get_wait_event_type(proc->wait_event_info);
>
> This needs to check if proc is NULL. When reading the code I noticed
> that the new functions pg_stat_get_backend_wait_event_type() and
> pg_stat_get_backend_wait_event() suffer from the same problem.
Good catch.
> Here is PoC patch which fixes the problem. I am wondering if we should
> raise warning in the pg_stat_get_backend_wait_event_type() and
> pg_stat_get_backend_wait_event() like the pg_signal_backend() does
> when proc is NULL instead of just returning NULL which is what this
> patch does though.
It still leaves the two relevant columns in pg_stat_activity
inconsistent with each other, since it reads the procarray entry
twice without holding a lock on procarray.
The attached patch adds pgstat_get_wait_event_info to read
wait_event_info in a more appropriate way, and changes
pg_stat_get_wait_event(_type) to take the wait_event_info.
Does this work for you?
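To illustrate the idea, here is a minimal standalone sketch. It is not
the attached patch: MockProc, mock_pid_get_proc, get_wait_event_info,
wait_event_type_name and wait_event_id are hypothetical stand-ins, and
the bit layout and the names returned are assumptions made only for
this example. The point is that wait_event_info is read once, with a
NULL check, and both columns are then derived from that single value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a procarray entry (not the real PGPROC). */
typedef struct MockProc
{
    int         pid;
    uint32_t    wait_event_info;
} MockProc;

static MockProc mock_procs[] = {
    {101, 0x01000003u},         /* waiting: class 1, event 3 (made up) */
    {102, 0}                    /* not waiting */
};

/* Stand-in for BackendPidGetProc(): returns NULL if the pid is gone. */
static MockProc *
mock_pid_get_proc(int pid)
{
    size_t      i;

    for (i = 0; i < sizeof(mock_procs) / sizeof(mock_procs[0]); i++)
    {
        if (mock_procs[i].pid == pid)
            return &mock_procs[i];
    }
    return NULL;
}

/* Read wait_event_info exactly once; 0 means "no proc / not waiting". */
static uint32_t
get_wait_event_info(int pid)
{
    MockProc   *proc = mock_pid_get_proc(pid);

    return (proc != NULL) ? proc->wait_event_info : 0;
}

/* Both decoders take the already-read value, so they cannot disagree. */
static const char *
wait_event_type_name(uint32_t info)
{
    if (info == 0)
        return NULL;
    /* Assumed layout for this sketch: class in the top byte. */
    return ((info >> 24) == 1) ? "ClassOne" : "OtherClass";
}

static uint32_t
wait_event_id(uint32_t info)
{
    /* Assumed layout for this sketch: event id in the low 16 bits. */
    return info & 0x0000FFFFu;
}
```

A caller would fetch the value once with get_wait_event_info(pid) and
pass the same value to both decoders, so a backend that exits between
the two lookups can no longer cause a crash or a mismatched pair.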
We may still have an inconsistency between wait_event and query,
or within beentry itself, but preventing that would require keeping
local copies of wait_event_info for all corresponding entries in
procarray, which would be overkill.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
| Attachment | Content-Type | Size |
|---|---|---|
| fix-pgstat-proc-null-v2.diff | text/x-patch | 4.5 KB |