From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Bogus cleanup code in PostgresNode.pm |
Date: | 2016-04-26 04:21:03 |
Message-ID: | CAB7nPqSo-GpjEHqKi94U-sbsnPrJw4NkMfkS9NphXC+0JCapHQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Apr 25, 2016 at 11:51 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I noticed that even when they are successful, buildfarm members bowerbird
> and jacana tend to spew a lot of messages like this in their bin-check
> steps:
>
> Can't remove directory /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata/global: Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
> Can't remove directory /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata/pg_xlog: Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
> Can't remove directory /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf/pgdata: Permission denied at /usr/lib/perl5/5.8/File/Temp.pm line 898
> Can't remove directory /home/pgrunner/bf/root/HEAD/pgsql.build/src/bin/scripts/tmp_check/data_main_DdUf: Directory not empty at /usr/lib/perl5/5.8/File/Temp.pm line 898
> ### Signalling QUIT to 9156 for node "main"
> # Running: pg_ctl kill QUIT 9156
>
> What is happening here is that the test script is not bothering to do an
> explicit $node->stop operation, and if it doesn't, the automatic cleanup
> steps happen in the wrong order: the File::Temp destructor for the temp
> data directory runs before PostgresNode.pm's DESTROY function, which is
> what's issuing the "pg_ctl kill" command. On Unix that's just messy,
> but on Windows it fails because you can't delete a process's working
> directory. I am not sure whether this is guaranteed wrong or just
> sometimes wrong; the Perl docs I can find say that destructors are run in
> unspecified order once interpreter shutdown begins. But by adding some
> debug printout I was able to verify on my own machine that the data
> directory was already gone when DESTROY runs.
The File::Temp docs say that the temporary directory is removed once the
object goes out of scope:
http://search.cpan.org/~dagolden/File-Temp-0.2304/lib/File/Temp.pm
So basically it means that when we enter PostgresNode's DESTROY, the
temporary folder has just "gone out of scope" and has already been removed?
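To make the scope behaviour concrete, here is a minimal sketch using the object interface (File::Temp->newdir) that the docs describe. Note the assumption: this is not taken from TestLib. The function form, tempdir(CLEANUP => 1), is documented to remove the directory at program exit instead, so which timing applies depends on which interface is in use.

```perl
use strict;
use warnings;
use File::Temp;

my $path;
{
    # newdir() returns an object; the directory lives as long as the object.
    my $tmp = File::Temp->newdir();
    $path = $tmp->dirname;
    print "inside scope:  ", (-d $path ? "exists" : "missing"), "\n";
}
# The last reference is gone, so File::Temp's destructor has removed it.
print "outside scope: ", (-d $path ? "exists" : "missing"), "\n";
```
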
DESTROY is run once per object, whereas END is a global destructor that
is called at the very end of execution. And actually, one reason for
using a DESTROY block instead of END is given by Alvaro here:
http://www.postgresql.org/message-id/20151201231121.GI2763@alvherre.pgsql
"
- I changed start/stop/restart so that they keep track of the postmaster
PID; also added a DESTROY sub to PostgresNode that sends SIGQUIT.
This means that when the test finishes, the server gets an immediate
stop signal. We were getting a lot of errors in the server log about
failing to write to the stats file otherwise, until the node noticed
that the datadir was gone.
"
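For reference, the relative ordering of DESTROY and END can be observed with a small standalone script. This is only a sketch: the Node package and the record helper are hypothetical and have nothing to do with PostgresNode.pm. It shows that an object freed during normal execution gets its DESTROY right away, while an object still alive at exit only gets its DESTROY after the END blocks have run.

```perl
use strict;
use warnings;

our @events;
# Hypothetical helper: remember each event and print it.
sub record { push @events, $_[0]; print "$_[0]\n"; }

package Node;    # hypothetical class, not PostgresNode
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class; }
sub DESTROY { main::record("DESTROY $_[0]{name}"); }

package main;

END { record("END block"); }

{
    # Freed at the end of this block, so DESTROY runs immediately.
    my $scoped = Node->new('scoped');
}

# Package variable: still alive at exit, destroyed during global destruction.
our $global = Node->new('global');
record("end of script");
```

Here "DESTROY scoped" is printed first, then "end of script", then "END block", and "DESTROY global" comes last.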
> I believe we can fix this by forcing postmaster shutdown in an END
> routine instead of a DESTROY routine, and hence propose the attached
> patch, which does things in the right order for me. I'm a pretty
> poor Perl programmer, so I'd appreciate somebody vetting this.
Another, perhaps more solid, approach would be to put the DESTROY method
in charge of removing PGDATA and to extend TestLib::tempdir with an
argument that allows switching to CLEANUP => 0 at will. We would then use
this argument for PGDATA, and DESTROY would remove the directory after
sending SIGQUIT.
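A rough sketch of that idea, under stated assumptions: SketchNode is a hypothetical stand-in for PostgresNode, the directory is created directly with File::Temp::tempdir rather than through an extended TestLib::tempdir, and a plain kill stands in for "pg_ctl kill QUIT". It is not a patch, just an illustration of the ordering.

```perl
use strict;
use warnings;

package SketchNode;    # hypothetical stand-in for PostgresNode
use File::Temp ();
use File::Path ();

sub new
{
    my ($class) = @_;
    # CLEANUP => 0: File::Temp never removes this directory on its own.
    my $dir = File::Temp::tempdir('data_XXXX', TMPDIR => 1, CLEANUP => 0);
    return bless { basedir => $dir, pid => undef }, $class;
}

sub DESTROY
{
    my ($self) = @_;
    local ($?, $@, $!);    # do not disturb the exit status of the script
    # 1. Signal the server first (the real module runs "pg_ctl kill QUIT").
    kill 'QUIT', $self->{pid} if defined $self->{pid};
    # 2. Only then remove the data directory.
    File::Path::rmtree($self->{basedir});
}

package main;

my $node = SketchNode->new;
my $dir  = $node->{basedir};
undef $node;    # triggers DESTROY
print((-d $dir) ? "still there\n" : "removed\n");
```

One caveat with this sketch: SIGQUIT is only sent, not waited for, so on Windows the removal could presumably still fail if the process has not exited yet.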
--
Michael