Re: Improving connection scalability: GetSnapshotData()

From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Ian Barwick <ian(dot)barwick(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Bruce Momjian <bruce(at)momjian(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: Improving connection scalability: GetSnapshotData()
Date: 2020-10-01 20:00:20
Message-ID: 2355d1f0-0244-da9c-ef0c-7542b944e1ac@2ndQuadrant.com
Lists: pgsql-hackers


On 10/1/20 2:26 PM, Andres Freund wrote:
> Hi Ian, Andrew, All,
>
> On 2020-09-30 15:43:17 -0700, Andres Freund wrote:
>> Attached is an updated version of the test (better utility function,
>> stricter regexes, bailing out instead of failing just the current when
>> psql times out). I'm leaving it in this test for now, but it's fairly
>> easy to use this way, in my opinion, so it may be worth moving to
>> PostgresNode at some point.
> I pushed this yesterday. Ian, thanks again for finding this and helping
> with fixing & testing.
>
>
> Unfortunately currently some buildfarm animals don't like the test for
> reasons I don't quite understand. Looks like it's all windows + msys
> animals that run the tap tests. On jacana and fairywren the new test
> fails, but with a somewhat confusing problem:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2020-10-01%2015%3A32%3A34
> Bail out! aborting wait: program timed out
> # stream contents: >>data
> # (0 rows)
> # <<
> # pattern searched for: (?m-xis:^\\(0 rows\\)$)
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2020-10-01%2014%3A12%3A13
> Bail out! aborting wait: program timed out
> stream contents: >>data
> (0 rows)
> <<
> pattern searched for: (?^m:^\\(0 rows\\)$)
>
> I don't know what the -xis indicates on jacana, and why it's not present
> on fairywren. Nor do I know why the pattern doesn't match in the first
> place, sure looks like it should?
>
> Andrew, do you have an insight into how mingw's regex match differs
> from native windows and proper unixoid systems? I guess it's somewhere
> around line endings or such?
>
> Jacana successfully deals with 013_crash_restart.pl, which does use the
> same mechanism as the new 021_row_visibility.pl - I think the only real
> difference is that I used ^ and $ in the regexes in the latter...

My strong suspicion is that we're getting unwanted CRs. Note the
presence of numerous instances of this in PostgresNode.pm:

$stdout =~ s/\r\n/\n/g if $Config{osname} eq 'msys';

So you probably want something along those lines at the top of the loop
in send_query_and_wait:

$$psql{stdout} =~ s/\r\n/\n/g if $Config{osname} eq 'msys';

possibly also for stderr, just to make it more future-proof, and at the
top of the file:

use Config;

Do you want me to test that first?
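
To illustrate the suspected failure mode, here is a minimal standalone sketch. It is not the actual test code: normalize_newlines is a hypothetical helper, and the CRLF-terminated string is an assumed stand-in for what psql's stdout might look like on msys. Unlike the suggested fix, it strips CRs unconditionally instead of checking $Config{osname}.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PostgresNode.pm): turn CRLF into LF.
sub normalize_newlines
{
	my ($text) = @_;
	$text =~ s/\r\n/\n/g;
	return $text;
}

# Assumed shape of the psql output on msys: every line ends in CRLF.
my $raw = "data\r\n(0 rows)\r\n";

# Same kind of anchored, multi-line pattern the new test uses.
my $pattern = qr/^\(0 rows\)$/m;

# With /m, $ matches only before "\n" or at end of string, so the stray
# "\r" between ")" and "\n" prevents a match on the raw text.
my $matched_raw   = ($raw =~ $pattern) ? 1 : 0;
my $matched_clean = (normalize_newlines($raw) =~ $pattern) ? 1 : 0;

print "raw: $matched_raw, normalized: $matched_clean\n";
```

If the CR theory is right, this would also explain why 013_crash_restart.pl passes: a pattern without the $ anchor never has to match across the trailing "\r".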

The difference in how perl stringifies the regex ((?m-xis:...) on jacana
vs. (?^m:...) on fairywren) is due to perl version differences. It
shouldn't affect matching.
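
A quick way to see which form a given perl produces is to stringify a compiled pattern, as in the sketch below. As far as I know the (?^...) form, where ^ means "default flags", arrived in perl 5.14 and older versions spell out the disabled flags as -xis; treat the exact version as an assumption. The variable names are only for illustration.

```perl
use strict;
use warnings;

# Compile the pattern with only /m enabled, as in the buildfarm logs.
my $pattern = qr/^\(0 rows\)$/m;

# Stringifying a qr// object yields perl's canonical spelling of it,
# which is what ends up in diagnostic output.
my $as_string = "$pattern";
print "perl ", $], ": ", $as_string, "\n";

# Either spelling compiles back to an equivalent regex.
my $recompiled = qr/$as_string/;
my $still_matches = ("data\n(0 rows)\n" =~ $recompiled) ? 1 : 0;
print "still matches: $still_matches\n";
```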

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
