Re: ssl tests aren't concurrency safe due to get_free_port()

From: Andres Freund <andres(at)anarazel(dot)de>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: ssl tests aren't concurrency safe due to get_free_port()
Date: 2022-11-16 01:51:05
Message-ID: 20221116015105.g6qxu2ugpj6agt26@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-11-15 15:56:37 -0500, Andrew Dunstan wrote:
> On 2022-11-06 Su 11:30, Andrew Dunstan wrote:
> >
> > One possible addition would be to add removing the reservation files in
> > an END handler. That would be pretty simple.
> >
> >
>
>
> Here's a version with that. I suggest we try it out and see if anything
> breaks.

Thanks! I agree it makes sense to go ahead with this.

I'd guess we should test drive this a bit in HEAD but eventually backpatch?

> diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
> index d80134b26f..85fae32c14 100644
> --- a/src/test/perl/PostgreSQL/Test/Cluster.pm
> +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm

I think this should also update the comment for get_free_port(), since the
race mentioned there is largely addressed by this patch?

> @@ -140,6 +143,27 @@ INIT
>
> # Tracking of last port value assigned to accelerate free port lookup.
> $last_port_assigned = int(rand() * 16384) + 49152;
> +
> + # Set the port lock directory
> +
> + # If we're told to use a directory (e.g. from a buildfarm client)
> + # explicitly, use that
> + $portdir = $ENV{PG_TEST_PORT_DIR};
> + # Otherwise, try to use a directory at the top of the build tree
> + if (! $portdir && $ENV{MESON_BUILD_ROOT})
> + {
> + $portdir = $ENV{MESON_BUILD_ROOT} . '/portlock'
> + }
> + elsif (! $portdir && ($ENV{TESTDATADIR} || "") =~ /\W(src|contrib)\W/p)
> + {
> + my $dir = ${^PREMATCH};
> + $portdir = "$dir/portlock" if $dir;
> + }
> + # As a last resort use a directory under tmp_check
> + $portdir ||= $PostgreSQL::Test::Utils::tmp_check . '/portlock';
> + $portdir =~ s!\\!/!g;
> + # Make sure the directory exists
> + mkpath($portdir) unless -d $portdir;
> }

Perhaps we should just export a directory in configure instead of this
guessing game?

> =pod
> @@ -1505,6 +1529,7 @@ sub get_free_port
> last;
> }
> }
> + $found = _reserve_port($port) if $found;
> }
> }
>
> @@ -1535,6 +1560,40 @@ sub can_bind
> return $ret;
> }
>
> +# Internal routine to reserve a port number
> +# Returns 1 if successful, 0 if port is already reserved.
> +sub _reserve_port
> +{
> + my $port = shift;
> + # open in rw mode so we don't have to reopen it and lose the lock
> + my $filename = "$portdir/$port.rsv";
> + sysopen(my $portfile, $filename, O_RDWR|O_CREAT)
> + || die "opening port file $filename";

Perhaps add $! to the message so e.g. permission denied errors or such are
easier to debug?

> + # take an exclusive lock to avoid concurrent access
> + flock($portfile, LOCK_EX) || die "locking port file $filename";

dito

> + # see if someone else has or had a reservation of this port
> + my $pid = <$portfile>;
> + chomp $pid;
> + if ($pid +0 > 0)
> + {
> + if (kill 0, $pid)
> + {
> + # process exists and is owned by us, so we can't reserve this port
> + close($portfile);
> + return 0;
> + }
> + }
> + # All good, go ahead and reserve the port, first rewind and truncate.
> + # If truncation fails it's not a tragedy, it just might leave some
> + # trailing junk in the file that won't affect us.
> + seek($portfile, 0, SEEK_SET);
> + truncate($portfile, 0);

Perhaps check truncate's return value?

> + print $portfile "$$\n";
> + close($portfile);
> + push(@port_reservation_files, $filename);
> + return 1;
> +}

Perhaps it'd be better to release the file lock explicitly? flock() has this
annoying behaviour of only releasing the lock when the last file descriptor
for a file is closed. We shouldn't end up with dup'd FDs or forks here, but it
seems like it might be more robust to just explicitly release the lock?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ian Lawrence Barwick 2022-11-16 01:52:35 Re: Documentation for building with meson
Previous Message Peter Smith 2022-11-16 01:38:56 Re: [DOCS] Stats views and functions not in order?