Re: Finding cause of test fails on the cfbot site

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Finding cause of test fails on the cfbot site
Date: 2021-02-19 14:54:34
Message-ID: ae2334fa-492c-f5d7-2e56-b3015185b639@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 2/18/21 11:01 AM, Andrew Dunstan wrote:
> On 2/17/21 3:42 PM, Thomas Munro wrote:
>> On Thu, Feb 18, 2021 at 9:18 AM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>> On 2/17/21 11:06 AM, Tom Lane wrote:
>>>> Peter Smith <smithpb2250(at)gmail(dot)com> writes:
>>>>> I saw that one of our commitfest entries (32/2914) is recently
>>>>> reporting a fail on the cfbot site [1]. I thought this was all ok a
>>>>> few days ago.
>>>>> ...
>>>>> Is there any other detailed information available anywhere, e.g.
>>>>> logs?, which might help us work out what was the cause of the test
>>>>> failure?
>>>> AFAIK the cfbot doesn't capture anything beyond the session typescript.
>>>> However, this doesn't look that hard to reproduce locally ... have you
>>>> tried, using similar configure options to what that cfbot run did?
>>>> Once you did reproduce it, there'd be logs under
>>>> contrib/test_decoding/tmp_check/.
>>> yeah. The cfbot runs check-world which makes it difficult for it to know
>>> which log files to show when there's an error. That's a major part of
>>> the reason the buildfarm runs a much finer grained set of steps.
>> Yeah, it's hard to make it print out just the right logs without
>> dumping so much stuff that it's hard to see the wood for the trees;
>> perhaps if the Makefile had an option to dump relevant stuff for the
>> specific tests that failed, or perhaps the buildfarm is already better
>> at that and cfbot should just use the buildfarm client directly. Hmm.
>> Another idea would be to figure out how to make a tarball of all log
>> files that you can download for inspection with better tools at home
>> when things go wrong. It would rapidly blow through the 1GB limit for
>> stored "artefacts" on open source/community Cirrus accounts though, so
>> we'd need to figure out how to manage retention carefully.
>
> I did some thinking about this. How about if we have the make files and
> the msvc build system create a well known file with the location(s) to
> search for log files if there's an error. Each bit of testing could
> overwrite this file before starting testing, and then tools like cfbot
> would know where to look for files to report? To keep things clean for
> other users the file would only be created if, say,
> PG_NEED_ERROR_LOG_LOCATIONS is set. The well known location would be
> something like "$(top_builddir)/error_log_locations.txt", and individual
> Makefiles would have entries something like:,
>
>
> override ERROR_LOG_LOCATIONS =
> $(top_builddir)/contrib/test_decoding/tmp_check/log
>
>
> If this seems like a good idea I can go and try to make that happen.
>
>

here's a very small and simple (and possibly naive)  POC patch that
demonstrates this and seems to do the right thing.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment Content-Type Size
error_log_locations_POC.patch text/x-patch 2.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-02-19 14:57:22 Re: pg_config_h.in not up-to-date
Previous Message Seamus Abshere 2021-02-19 14:53:46 Re: A reloption for partitioned tables - parallel_workers