Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Index Scan vs BTP_DELETED and BTP_HALF_DEAD
Date: 2017-12-13 02:41:15
Message-ID: CAA4eK1KCm96hcz4uyYMfkFFx0HagyuURLh_H9ETyo-hH9FaM0Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 13, 2017 at 7:02 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> Hi,
>
> Here's a reproducer which enabled me to reach this stuck state:
>
>   pid  |  wait_event   |                                    query
> -------+---------------+-----------------------------------------------------------------------------
>  64617 |               | select pid, wait_event, query from pg_stat_activity where state = 'active';
>  64619 | BufferPin     | VACUUM jobs
>  64620 | ExecuteGather | SELECT COUNT(*) FROM jobs
>  64621 | ExecuteGather | SELECT COUNT(*) FROM jobs
>  64622 | ExecuteGather | SELECT COUNT(*) FROM jobs
>  64623 | ExecuteGather | SELECT COUNT(*) FROM jobs
>  84167 | BtreePage     | SELECT COUNT(*) FROM jobs
>  84168 | BtreePage     | SELECT COUNT(*) FROM jobs
>  96440 |               | SELECT COUNT(*) FROM jobs
>  96438 |               | SELECT COUNT(*) FROM jobs
>  96439 |               | SELECT COUNT(*) FROM jobs
> (11 rows)
>
> The main thread deletes stuff in the middle of the key range (not sure
> if this is important) and vacuums in a loop, and meanwhile 4 threads
> (probably not important, might as well be 1) run Parallel Index Scans
> over the whole range, in the hope of hitting the interesting case. In
> the locked-up case I just saw, opaque->btpo_flags had the
> BTP_DELETED bit set, not BTP_HALF_DEAD (I could tell because I added
> logging).
>

Good. I hope that the patch I have posted above is able to resolve
this problem. I am asking as you haven't explicitly mentioned that.
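
For readers following the thread, the flag distinction in the quoted
report can be sketched roughly as below. This is only an illustration, not
code from the patch: the two bit values are copied from
src/include/access/nbtree.h so that the snippet compiles on its own, and
PageState, classify_page() and page_should_be_skipped() are hypothetical
names that do not exist in PostgreSQL. The second helper tests the same two
bits that nbtree's P_IGNORE() macro tests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as defined in src/include/access/nbtree.h, repeated here
 * only so that this sketch is self-contained. */
#define BTP_DELETED   (1 << 2)  /* page has been deleted from the tree */
#define BTP_HALF_DEAD (1 << 4)  /* empty, but still in the tree */

/* Hypothetical enum for illustration; not part of PostgreSQL. */
typedef enum { PAGE_LIVE, PAGE_HALF_DEAD, PAGE_DELETED } PageState;

/* Hypothetical helper: report which of the two states a page is in,
 * given the btpo_flags value from its opaque area. */
static PageState
classify_page(uint16_t btpo_flags)
{
    if (btpo_flags & BTP_DELETED)
        return PAGE_DELETED;
    if (btpo_flags & BTP_HALF_DEAD)
        return PAGE_HALF_DEAD;
    return PAGE_LIVE;
}

/* Hypothetical helper: true if either bit is set, which is the
 * condition under which a scan should move on to the next page. */
static bool
page_should_be_skipped(uint16_t btpo_flags)
{
    return (btpo_flags & (BTP_DELETED | BTP_HALF_DEAD)) != 0;
}
```

Logging the result of something like classify_page() at the point where
the scan gets stuck is one way to tell the two cases apart, as was done
with the added logging in the report above.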

> Clearly pages are periodically being marked half-dead but I
> haven't yet managed to get an index scan to hit one of those.
>

I think Kuntal has already been able to hit that case, so maybe that is enough.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
