more autovacuum fixes

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: more autovacuum fixes
Date: 2007-06-19 17:49:21
Message-ID: 20070619174921.GZ4265@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

I attach a patch to fix some issues in the autovac process handling
code. Mainly, instead of rechecking periodically in the launcher
whether a worker was able to start or not, what we do is signal it from
the postmaster when it fails.

In order to do this I had to make the postmaster set a flag in shared
memory, much like the pmsignal.c code does. According to previous
discussions this is an acceptable thing for postmaster to do.

I must admit I did not try to fill shared memory with garbage and see if
it caused a postmaster crash. What I did test was creating a database,
making sure that the launcher was picking it up, and then delete the
database's directory. This correctly causes a worker to log errors when
trying to connect, and it is handled well -- i.e. other databases are
not starved. (One remaining problem I see is that a database in this
state, or similar, that fails to be vacuumed, *will* cause starvation if
in danger of Xid wraparound).

Another thing I noticed is that I can change the "autovacuum" parameter
in postgresql.conf and send SIGHUP to postmaster, which correctly
starts the launcher if it was previously disabled. To make the shutdown
case act accordingly, I made the launcher check the "start daemon" flag
after rereading the config file, and quit if it's off. I tried multiple
combinations of having it set and unset, changing
stats_collect_tuple_level and it works fine.

One problem with the patch is this (new code):

bn = (Backend *) malloc(sizeof(Backend));
! if (bn)
{
! bn->pid = StartAutoVacWorker();
! bn->is_autovacuum = true;
! /* we don't need a cancel key */

! if (bn->pid > 0)
! {
! /* FIXME -- unchecked memory allocation here */
! DLAddHead(BackendList, DLNewElem(bn));

If the palloc() inside DLNewElem fails, we will fail to report a "fork
failure" to the launcher. I am not sure how serious this is. One idea
that came to mind was using a PG_TRY block, sending the signal in the
CATCH block, and then rethrowing the exception. Is this acceptable?

All in all, this patch increases the reliability of autovacuum, so I
intend to apply it shortly unless there are objections. Further
improvements might come later as I become aware of possible fixes.

--
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Hay quien adquiere la mala costumbre de ser infeliz" (M. A. Evans)

Attachment Content-Type Size
autovacuum-procs.patch text/x-diff 18.5 KB

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2007-06-19 17:53:17 Re: [HACKERS] 'Waiting on lock'
Previous Message Michael Paesold 2007-06-19 17:37:40 Re: WIP: rewrite numeric division