Re: log : bad file dscriptor????

From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: pgsql-bugs(at)postgresql(dot)org, Ati Rosselet <ati(dot)rosselet(at)gmail(dot)com>
Subject: Re: log : bad file dscriptor????
Date: 2009-03-14 18:30:20
Message-ID: 65937bea0903141130n28f593fcm823486849d98297b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

--- Sorry about the previous mail; that did not have the proper subject line
(for no fault of mine)

Hi All,

Acting on a customer's report I analyzed this bug, and have found a fix
for it. It is not a critical error, but it definitely is a bug, and can have
security implications.

This error is raised from syslogger.c, and since this sub-process is not
responsible for any backend communication, the backend (or the non-backend)
that raised this error has already successfully done its work of
communicating the message back to the application, if any.

The security implication is that, that the actual error which was
supposed to be also logged in the server log file, is lost; and instead we
have this error message in the WindowsErrorLog/Syslog.

Issue:
====

On Windows, the write to log file is done by a thread (whose main
function is pipeThread() ), and since it works completely independent of the
SysLoggerMain() ( which is responsible for calling logfile_rotate()
periodically, which in turn changes the global variable syslogFile) this is
causing a race condition due to an error in the way we are using the related
critical section.

Here's the flow in my opinion which is causing this error:

main => SysLoggerMain() and thread => pipeThread()

main : calls logfile_rotate()
thread: calls process_pipe_input()/flush_pipe_input()
-> calls write_syslogger_file()
thread: reads in the current value of global variable
syslogFile.
main : enters critical section sysfileSection, and
assigns a new value to the global variable
syslogFile
thread: blocked by the same critical section
main : leaves critical section
thread: enters the critical section and attempts to
write to the old value of syslogFile.
OS : throws an error, as that handle is already
closed by main

The solution is to read the global variable inside the critical section, in
write_syslogger_file().

How to reproduce:
=================

Apply the syslogger_race_bug_reproduce.patch patch.

Set the following GUCs in postgresql.conf:

logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_size = 1kB
log_min_messages = notice

Restart the DB for these changes to take effect.

Create the following plpgsql function:

create or replace function raise_notices() returns void as $$
begin
loop
raise notice 'dummy';
end loop;
end;
$$ language plpgsql;

Execute: select raise_notices();

Keep an eye on Windows error log (refresh often;
screenshot<http://img5.imageshack.us/img5/9477/windowserrorlogq.jpg>
).

Attachments:
=========

syslogger_race_bug_reproduce.patch :
To reproduce the bug.

syslogger_race_bug_reproduce_temp_fix.patch :
To see the effect of final patch, but with other support code.

syslogger_race_bugfix.patch :
Final patch, ready for application to HEAD.

Best regards,

On Sat, Oct 25, 2008 at 8:43 PM, Ati Rosselet <ati(dot)rosselet(at)gmail(dot)com>wrote:

> I'm still getting a lot of these entries in my eventlog whenever I have a
> reasonably large amount of logging:
>
> Event Type: Error
> Event Source: PostgreSQL
> Event Category: None
> Event ID: 0
> Date: 10/22/2008
> Time: 9:36:28 AM
> User: N/A
> Computer: ----------
> Description:
> could not write to log file: Bad file descriptor
>
> production server 8.3.3 (tried with 8.3.4 - same thing), win 2003 server.
> I'd register this as a formal bug, but I can't figure out how to reproduce
> it.. and
> apparently noone else has seen this? Or is noone else running postgres on
> win2003??? (survey says??)
>
> The timing of each appears to be when log is rolled over (size limited to
> 10MB) and postgres attempts to write a log entry at the same time:
>
> Any ideas? help? How to proceed???
> Cheers
> Ati
>
> My log settings from postgresql.conf:
>
> #-----------------------------
> -------------------------------------------------
> # ERROR REPORTING AND LOGGING
>
> #------------------------------------------------------------------------------
> # - Where to Log -
> log_destination = 'stderr' # Valid values are combinations of
> # This is used when logging to stderr:
> logging_collector = on # Enable capturing of stderr and csvlog
>
> # These are only used if logging_collector is on:
> #log_directory = 'pg_log' # directory where log files are written,
> # can be absolute or relative to PGDATA
> #log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name
> pattern,
> # can include strftime() escapes
> #log_truncate_on_rotation = off # If on, an existing log file of the
> # same name as the new log file will be
> # truncated rather than appended to.
> # But such truncation only occurs on
> # time-driven rotation, not on restarts
> # or size-driven rotation. Default is
> # off, meaning append to existing files
> # in all cases.
> #log_rotation_age = 1d # Automatic rotation of logfiles will
> # happen after that time. 0 to disable.
> log_rotation_size = 10MB # Automatic rotation of logfiles will
> # happen after that much log output.
> # 0 to disable.
>
> # These are relevant when logging to syslog:
> #syslog_facility = 'LOCAL0'
> #syslog_ident = 'postgres'
>
>
> # - When to Log -
>
> #client_min_messages = notice # values in order of decreasing
> detail:
>
> #log_min_messages = notice # values in order of decreasing detail:
>
> #log_error_verbosity = default # terse, default, or verbose messages
>
> #log_min_error_statement = error # values in order of decreasing detail:
> #log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
> # and their durations, > 0 logs only
> # statements running at least this time.
>
> #silent_mode = off # DO NOT USE without syslog or
> # logging_collector
> # (change requires restart)
> # - What to Log -
>
> #debug_print_parse = off
> #debug_print_rewritten = off
> #debug_print_plan = off
> #debug_pretty_print = off
> #log_checkpoints = off
> log_connections = on
> log_disconnections = on
> #log_duration = off
> #log_hostname = off
> #log_line_prefix = '%t ' # special values:
>
> log_line_prefix = '%t [%h:%u(at)%d] '
> #log_lock_waits = off # log lock waits >= deadlock_timeout
> log_statement = 'all' # none, ddl, mod, all
> #log_temp_files = -1 # log temporary files equal or larger
> # than specified size;
> # -1 disables, 0 logs all temp files
> #log_timezone = unknown # actually, defaults to TZ environment
>

--
gurjeet[(dot)singh](at)EnterpriseDB(dot)com
singh(dot)gurjeet(at){ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device

Attachment Content-Type Size
syslogger_race_bug_reproduce.patch application/octet-stream 744 bytes
syslogger_race_bug_reproduce_temp_fix.patch application/octet-stream 1.4 KB
syslogger_race_bugfix.patch application/octet-stream 1.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Dave Page 2009-03-14 19:16:19 Re: BUG #4705: Software developer
Previous Message Dave Page 2009-03-14 18:25:21 Re: BUG #4705: Software developer

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2009-03-14 18:31:51 Re: Has anybody think about changing BLCKSZ to an option of initdb?
Previous Message Josh Berkus 2009-03-14 18:26:53 Re: Column Aliases WAS: Prepping to break every past release...