From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com> |
Cc: | Fabio Isabettini <fisabettini(at)voipfuture(dot)com>, Arne Roland <A(dot)Roland(at)index(dot)de>, Sand Stone <sand(dot)m(dot)stone(at)gmail(dot)com>, Rick Otten <rottenwindfish(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: dsa_allocate() faliure |
Date: | 2019-02-04 08:22:28 |
Message-ID: | CAEepm=2aHnTfPJnPbeS3AxO-ENoUg5-akuD-7PWYbn8+-c9JmQ@mail.gmail.com |
Lists: | pgsql-hackers pgsql-performance |
On Mon, Feb 4, 2019 at 6:52 PM Jakub Glapa <jakub(dot)glapa(at)gmail(dot)com> wrote:
> I see the error showing up every night on 2 different servers. But it's a bit of a heisenbug because if I go there now it won't be reproducible.
Huh. OK, well, that's a lot more frequent than I thought. Is it always
the same query? Any chance you can get the plan? Are there more
things going on on the server, like perhaps concurrent parallel
queries?
> It was suggested by Justin Pryzby that I recompile pg src with his patch that would cause a coredump.
Small correction to Justin's suggestion: don't abort() after
elog(ERROR, ...), it'll never be reached.
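The reason is that elog(ERROR, ...) does not return to its caller: it
jumps out to the enclosing error handler. Below is a minimal standalone
sketch of that control flow, using plain setjmp/longjmp as a stand-in.
The names fake_elog_error, run_check and reached_after_error are
hypothetical and exist only for illustration; this is not PostgreSQL's
actual error-handling code.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the server's error handling; not the real API. */
static jmp_buf error_context;
static int reached_after_error = 0;

/* Mimics elog(ERROR, ...): reports the message, then jumps to the handler.
 * It never returns to its caller. */
static void fake_elog_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    longjmp(error_context, 1);
}

/* Returns 1 if the simulated error path was taken, 0 otherwise. */
static int run_check(int corrupted)
{
    if (setjmp(error_context) != 0)
        return 1;                 /* control resumes here after the jump */

    if (corrupted)
    {
        fake_elog_error("simulated failure");
        reached_after_error = 1;  /* never executed */
        abort();                  /* never executed, so no core dump */
    }
    return 0;
}
```

Calling run_check(1) prints the simulated error and returns 1 without
ever reaching abort(), so a core dump has to be forced at or before the
point where the error is raised, not after it.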
> But I don't feel comfortable doing this, especially if I would have to run this with prod data.
> My question is: can I do anything like increasing the logging level or enabling some additional options?
> It's a production server but I'm willing to sacrifice a bit of its performance if that would help.
If you're able to run a throwaway copy of your production database on
another system that you don't have to worry about crashing, you could
just replace ERROR with PANIC and run a high-speed loop of the query
that crashed in production, or something. This might at least tell us
whether it's reaching that condition via something dereferencing a
dsa_pointer or something manipulating the segment lists while
allocating/freeing.
In my own 100% unsuccessful attempts to reproduce this I was mostly
running the same query (based on my guess at what ingredients are
needed), but perhaps it requires a particular allocation pattern that
will require more randomness to reach... hmm.
--
Thomas Munro
http://www.enterprisedb.com