sidewinder has one failure

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: sidewinder has one failure
Date: 2020-01-03 12:01:03
Message-ID: CAA4eK1LHhERi06Q+MmP9qBXBBboi+7WV3910J0aUgz71LcnKAw@mail.gmail.com
Lists: pgsql-hackers

After my recent commit d207038053837ae9365df2776371632387f6f655,
sidewinder is failing with the error "insufficient file descriptors .."
in test 006_logical_decoding.pl [1]. The failure log shows the
following messages:

006_logical_decoding_master.log
2020-01-02 19:51:05.567 CET [26174:3] 006_logical_decoding.pl LOG:
statement: ALTER SYSTEM SET max_files_per_process = 26;
2020-01-02 19:51:05.570 CET [2777:4] LOG: received fast shutdown request
2020-01-02 19:51:05.570 CET [26174:4] 006_logical_decoding.pl LOG:
disconnection: session time: 0:00:00.005 user=pgbf database=postgres
host=[local]
2020-01-02 19:51:05.571 CET [2777:5] LOG: aborting any active transactions
2020-01-02 19:51:05.572 CET [2777:6] LOG: background worker "logical
replication launcher" (PID 23736) exited with exit code 1
2020-01-02 19:51:05.572 CET [15764:1] LOG: shutting down
2020-01-02 19:51:05.575 CET [2777:7] LOG: database system is shut down
2020-01-02 19:51:05.685 CET [24138:1] LOG: starting PostgreSQL 12.1
on x86_64-unknown-netbsd7.0, compiled by gcc (nb2 20150115) 4.8.4,
64-bit
2020-01-02 19:51:05.686 CET [24138:2] LOG: listening on Unix socket
"/tmp/sxAcn7SAzt/.s.PGSQL.56110"
2020-01-02 19:51:05.687 CET [24138:3] FATAL: insufficient file
descriptors available to start server process
2020-01-02 19:51:05.687 CET [24138:4] DETAIL: System allows 19, we
need at least 20.
2020-01-02 19:51:05.687 CET [24138:5] LOG: database system is shut down

Here, I think it is clear that the failure happens because the test
sets max_files_per_process to 26, which is too low for this machine.
It seems to me that the reason is that seven files are already open by
the time we reach set_max_safe_fds, which leaves only 19 of the 26
(matching "System allows 19" above). On my CentOS system, the value of
already_open is 3, 6 and 6 for HEAD, 12 and 10 respectively. We can
easily see the number of already-open files by changing the error level
of the elog message in set_max_safe_fds from DEBUG2 to LOG. It is not
clear to me how many files we can expect to be open during startup.
Can the number vary across setups?
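
The arithmetic behind the DETAIL line can be sketched as follows. This
is a minimal Python model, not the server code. The constants
NUM_RESERVED_FDS = 10 and FD_MINFREE = 10 are an assumption about what
fd.c uses in version 12 (they are not stated in this mail); they are
chosen because they add up to the "at least 20" in the log. The
function names are hypothetical.

```python
# Assumed constants (not taken from this mail); together they give the
# "we need at least 20" seen in the log.
NUM_RESERVED_FDS = 10
FD_MINFREE = 10


def startup_fd_check(max_files_per_process, already_open, usable_fds=None):
    """Model the startup check: return (ok, allowed, needed).

    allowed and needed correspond to the two numbers in
    "System allows N, we need at least M".
    """
    budget = max_files_per_process - already_open
    if usable_fds is not None:
        # The kernel limit may be lower than the configured budget.
        budget = min(usable_fds, budget)
    max_safe_fds = budget - NUM_RESERVED_FDS
    allowed = max_safe_fds + NUM_RESERVED_FDS
    needed = FD_MINFREE + NUM_RESERVED_FDS
    return max_safe_fds >= FD_MINFREE, allowed, needed


def min_setting(already_open):
    """Smallest max_files_per_process that passes under this model."""
    return already_open + FD_MINFREE + NUM_RESERVED_FDS


# sidewinder: 26 configured, 7 already open
print(startup_fd_check(26, 7))
# CentOS, versions 12 and 10: 26 configured, 6 already open
print(startup_fd_check(26, 6))
print(min_setting(7))
```

Under these assumptions, the smallest setting that passes with seven
files already open is 27, and a setting of 35 would tolerate up to 15
already-open files.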

One possible fix is to change the test to set max_files_per_process to
a slightly higher number, say 35, but I am not sure what a safe value
would be. Alternatively, we can think of removing the test entirely,
but it exercises a useful corner case, which is why we added it in the
first place.

I am planning to investigate this further by checking which files are
kept open and why. I will share my findings, but in the meantime, if
anyone has any thoughts on this matter, please feel free to share
them.
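
As a rough illustration of what "which files are kept open" means at
the process level, the sketch below probes low-numbered descriptors in
the current process with fstat. It is a hypothetical helper, not
PostgreSQL code and not how the server itself counts; the limit of 256
is an arbitrary assumption.

```python
import os


def open_fds(limit=256):
    """Return the descriptors below `limit` that are open in this process."""
    found = []
    for fd in range(limit):
        try:
            os.fstat(fd)  # raises OSError (EBADF) if fd is not open
        except OSError:
            continue
        found.append(fd)
    return found


# Typically at least stdin, stdout and stderr are listed.
print(open_fds())
```

Running something equivalent early in startup on different platforms
would show whether the baseline differs, which is the open question
above.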

[1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2020-01-02%2018%3A45%3A25

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
