Re: BUG #15641: Autoprewarm worker fails to start on Windows with huge pages in use

From: Mithun Cy <mithun(dot)cy(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Hans Buschmann <buschmann(at)nidsa(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15641: Autoprewarm worker fails to start on Windows with huge pages in use
Date: 2019-03-18 17:42:24
Message-ID: CADq3xVYcKxPantnV+HoHXER7Dg-n5ipsHKs11ibq0ueVYSG-7Q@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

Thanks Robert,
On Mon, Mar 18, 2019 at 9:01 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, Mar 18, 2019 at 3:04 AM Mithun Cy <mithun(dot)cy(at)gmail(dot)com> wrote:
> > The autoprewarm worker should not be restarted. As per the code in
> > apw_start_database_worker, the master starts a worker per database and
> > waits until it exits by calling WaitForBackgroundWorkerShutdown. That
> > call cannot handle the case where the worker was restarted:
> > WaitForBackgroundWorkerShutdown() gets the status BGWH_STOPPED from
> > GetBackgroundWorkerPid() if the worker got restarted. So the master
> > will next detach the shared memory, and the restarted worker keeps
> > failing, going into an unending loop.
>
> Ugh, that seems like a silly oversight. Does it fix the reported problem?
>

-- Yes, this fixes the reported issue. Hans Buschmann gave the steps below
to reproduce it.

> This seems easy to reproduce:
>
> - Install/create a database with autoprewarm on and pg_prewarm loaded.
> - Fill the autoprewarm cache with some data
> - pg_dump the database
> - drop the database
> - create the database and pg_restore it from the dump
> - start the instance and logs are flooded

-- It was explained earlier [1] that they used an older autoprewarm.blocks
file, generated before the database was dropped. So on restart the
autoprewarm worker failed to connect to the dropped database, which led to
the retry loop. This patch should fix that.
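
To make the retry loop concrete, here is a small standalone model in plain
C. It does not use the PostgreSQL headers; ModelWorker,
MODEL_NEVER_RESTART and model_count_launches are names I made up for this
sketch, with MODEL_NEVER_RESTART standing in for BGW_NEVER_RESTART. It only
illustrates the restart policy, under the assumption that a worker which
cannot connect simply fails and is relaunched unless restarts are disabled.

```c
#include <stdbool.h>

/* Stand-in for PostgreSQL's BGW_NEVER_RESTART; this name is hypothetical. */
#define MODEL_NEVER_RESTART (-1)

/* Hypothetical, simplified model of one per-database worker registration. */
typedef struct ModelWorker
{
    int  restart_time;  /* seconds before relaunch, or MODEL_NEVER_RESTART */
    bool db_exists;     /* false: database was dropped after the dump file was written */
} ModelWorker;

/*
 * Count how many times the worker gets launched, giving up after
 * max_launches so that the "restart forever" case still terminates.
 */
int
model_count_launches(const ModelWorker *w, int max_launches)
{
    int launches = 0;

    while (launches < max_launches)
    {
        launches++;
        if (w->db_exists)
            break;      /* connected, prewarmed, exited cleanly */
        if (w->restart_time == MODEL_NEVER_RESTART)
            break;      /* failed once and is never relaunched */
        /* otherwise the failed worker is relaunched and fails again */
    }
    return launches;
}
```

With db_exists set to false, a worker whose restart_time is 0 uses up every
allowed launch, while one registered with MODEL_NEVER_RESTART is launched
exactly once.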

NOTE: Another kind of error a user might see because of the same bug is a
restarted worker connecting to the next database in autoprewarm.blocks,
because the autoprewarm master has updated the shared data
("apw_state->database = current_db;") to start a new worker for the next
database. Both the restarted worker and the newly created worker will then
connect to the same (next) database and try to load the same pages, ending
up with spurious log messages like "LOG: autoprewarm successfully
prewarmed 13 of 11 previously-loaded blocks".
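
The overcount in that log line can be sketched with another standalone
model in plain C. ModelSharedState, model_run_worker and
model_prewarm_two_dbs are hypothetical names, and the model assumes only
two databases, where the worker for the first one fails and, once
relaunched, picks up whatever database the master has published since.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shared state; only the fields needed here. */
typedef struct ModelSharedState
{
    int database;          /* database the next worker should connect to */
    int prewarmed_blocks;  /* running total reported in the final log line */
} ModelSharedState;

/* A worker reads its target from shared state, then loads those blocks. */
static void
model_run_worker(ModelSharedState *state, const int *blocks_per_db)
{
    int db = state->database;   /* read at worker start */

    state->prewarmed_blocks += blocks_per_db[db];
}

/* Returns the number of blocks counted as prewarmed across two databases. */
int
model_prewarm_two_dbs(const int *blocks_per_db, bool stale_restart)
{
    /* Database 0: the worker fails to connect and loads nothing. */
    ModelSharedState state = {0, 0};

    /* The master moves on and publishes database 1 for the next worker. */
    state.database = 1;
    model_run_worker(&state, blocks_per_db);

    /* The relaunched copy of the failed worker now also reads database 1. */
    if (stale_restart)
        model_run_worker(&state, blocks_per_db);

    return state.prewarmed_blocks;
}
```

When stale_restart is true, the blocks of the second database are counted
twice, so the total can exceed the number of blocks recorded in the dump
file.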

> If I understand correctly, the commit message would be something like this:
>
> ==
> Don't auto-restart per-database autoprewarm workers.
>
> We should try to prewarm each database only once. Otherwise, if
> prewarming fails for some reason, it will just keep retrying in an
> infinite loop. The existing code was intended to implement this
> behavior, but because it neglected to set worker.bgw_restart_time, the
> per-database workers keep restarting, contrary to what was intended.
>
> Mithun Cy, per a report from Hans Buschmann
> ==
>
> Does that sound right?
>

-- Yes, I agree.

[1]
https://www.postgresql.org/message-id/D2B9F2A20670C84685EF7D183F2949E202569F21%40gigant.nidsa.net

--
Thanks and Regards
Mithun Chicklore Yogendra
EnterpriseDB: http://www.enterprisedb.com
