Re: Funny hang on PostgreSQL 10 during parallel index scan on slave

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: chris(dot)travers(at)adjust(dot)com
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Funny hang on PostgreSQL 10 during parallel index scan on slave
Date: 2018-09-05 18:03:43
Message-ID: CAEepm=1fg5Sm92WQB0XjAnsCkMjRges0bxs-2AwM4m9CwhHGSw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 5, 2018 at 10:13 AM Chris Travers <chris(dot)travers(at)adjust(dot)com> wrote:
> On Wed, Sep 5, 2018 at 6:55 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>> > On Wed, Sep 5, 2018 at 6:40 PM Chris Travers <chris(dot)travers(at)adjust(dot)com>
>> > wrote:
>> > >> Do you mean this loop in dsm_impl_posix_resize() is getting
>> > >> interrupted constantly and never completing?
>> > >>
>> > >> /* We may get interrupted, if so just retry. */
>> > >> do
>> > >> {
>> > >>     rc = posix_fallocate(fd, 0, size);
>> > >> } while (rc == EINTR);
>> > >>
>>
>> Probably worthwhile to check that the dsm code is properly robust if
>> errors are thrown from within here.

Yeah, currently dsm_impl_posix_resize() returns and lets
dsm_impl_posix() clean up (close(), shm_unlink()) before raising
errors. We can't just let CHECK_FOR_INTERRUPTS() take a non-local
exit from inside that loop, because that would skip the cleanup and
leak the fd and the segment. Some refactoring involving
PG_TRY()/PG_CATCH() may be the simplest way forward.
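
To make the shape of that refactoring concrete, here is a rough
standalone sketch, not a patch. PG_TRY()/PG_CATCH() are built on
sigsetjmp()/siglongjmp(), so plain setjmp()/longjmp() is used here as a
stand-in. Everything named below (error_context, interrupt_pending,
check_for_interrupts(), resize_segment(), create_segment()) is
hypothetical, and an ordinary file with open()/unlink() stands in for
shm_open()/shm_unlink().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for the exception stack and the pending-interrupt flag. */
static jmp_buf *error_context = NULL;
static volatile sig_atomic_t interrupt_pending = 0;

/* Stand-in for CHECK_FOR_INTERRUPTS(): takes a non-local exit if the flag is set. */
static void
check_for_interrupts(void)
{
    if (interrupt_pending && error_context != NULL)
    {
        interrupt_pending = 0;
        longjmp(*error_context, 1);
    }
}

/* The retry loop quoted above, with an interrupt check added on each pass. */
static int
resize_segment(int fd, off_t size)
{
    int         rc;

    do
    {
        check_for_interrupts();
        rc = posix_fallocate(fd, 0, size);
    } while (rc == EINTR);

    return rc;
}

/*
 * Create and size a segment file.  Returns the fd on success, -1 on an
 * ordinary error (errno set), or -2 if an interrupt arrived.  In both
 * failure cases the fd is closed and the file removed before returning.
 */
int
create_segment(const char *path, off_t size)
{
    jmp_buf     local_context;
    jmp_buf    *saved_context = error_context;
    int         fd;
    int         rc;

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    if (setjmp(local_context) == 0)
    {
        /* Plays the role of the PG_TRY() block. */
        error_context = &local_context;
        rc = resize_segment(fd, size);
        error_context = saved_context;
    }
    else
    {
        /* Plays the role of the PG_CATCH() block: clean up, then report. */
        error_context = saved_context;
        close(fd);
        unlink(path);
        return -2;
    }

    if (rc != 0)
    {
        close(fd);
        unlink(path);
        errno = rc;
        return -1;
    }
    return fd;
}
```

In the real code the catch branch would presumably end with
PG_RE_THROW() rather than a return code, so the interrupt still
propagates once the fd and the segment are gone.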

--
Thomas Munro
http://www.enterprisedb.com
