Re: final CSVlog patch

From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: final CSVlog patch
Date: 2007-08-12 04:06:33
Message-ID: F7A9F697-998F-45C4-B519-7B39BB0D47CF@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Looks like a nice feature for 8.3! I've looked over what I could. A
typo and a couple of wording suggestions below.

On Aug 10, 2007, at 21:21 , Andrew Dunstan wrote:

> ***************
> *** 2280,2285 ****
> --- 2287,2293 ----
> might not appear in <application>syslog</> output (a
> common example
> is dynamic-linker failure messages).
> This parameter can only be set at server start.
> + It is required to be on if CSV logs are to be generated.

<varname>start_log_collector</varname> must be on to generate CSV logs.

> ***************
> *** 2324,2329 ****
> --- 2332,2344 ----
> This parameter can only be set in the
> <filename>postgresql.conf</>
> file or on the server command line.
> </para>
> + <para>
> + If <varname>log_destination</varname> is set to
> <systemitem>csvlog</systemitem>,
> + <literal>.csv</literal> will be appended to the file
> name, to obtain the lofile

s/lofile/logfile/

"If log_destination ... just overwritten." might be rewritten as

If <varname>log_destination</varname> is set to <systemitem>csvlog</
systemitem>,
<literal>.csv</literal> will be appended to the timestamped
<varname>log_filename</varname>
to create the final log file name. (If log_filename ends in .log,
the .log suffix is overwritten.)

> + <para>
> + Putting csvlog into the log_destination list gives an
> + easy way to import the log files into a database table.
> + Here is a sample table that csvlog output could be
> + copied into:
> + </para>

Including csvlog in the log_destination list provides an convenient
way to import log files into a database table. Here is a sample table
to store csvlog output.

> + CREATE TABLE pglog

CREATE TABLE postgres_log

(pglog might be confused with the pg_ prefixed system catalogs.)

> + <programlisting>
> + COPY pglog FROM '/full/path/to/logfile.csv' WITH csv;

COPY postgres_log FROM /full/path/to/logfile.csv' WITH csv;

> + </programlisting>
> +
> + <para>
> + There are a few things you will need to do in order for
> + csvlog importing to produced files that you can import
> + easily and automatically:

There are a few things you need to import csvlog files easily and
automatically.

> +
> + <orderedlist>
> + <listitem>
> + <para>
> + Use a consistant, predictable naming scheme for your
> log files
> + with <varname>log_filename</varname>. This lets you
> guess what
> + the file name will be and know when an individual log
> file is
> + complete and therefore ready to be imported.

This lets you predict what the file name will be when it is ready to
be imported.

(*When* the log file is ready would be controlled by log_rotation_age
and perhaps other factors and not by the log_filename, wouldn't it?)

> + <listitem>
> + <para>
> + Set <varname>log_rotation_size</varname> = 0 to
> disable size-based
> + log rotation. That feature makes it more difficult
> to predict what
> + the log file name you need to import will be.

Set <varname>log_rotation_size</varname> to 0 to disable size-based
log rotation, as it
makes the log filename difficult to predict.

> Therefore, if you import
> + a partial log file, then later import the entire thing,
> + the unique index violation will cause the import to fail.
> + You need to wait until the log file is complete and
> closed
> + before trying to import the whole thing. Operating
> that way
> + will also protect you from accidentally trying to import
> + a partial line that hasn't been completely written
> yet, which
> + will also cause the COPY to fail.

If you import a partial log file and later import the file again when
its complete,
the primary key violation will cause the import to fail. Wait until
the log
is complete and closed before import. This will also protect against
accidently importing an partial line that hasn't been completely
written,
which would also cause the COPY to fail.

> Index: doc/src/sgml/func.sgml

> --- 11244,11251 ----
>
> <para>
> <function>pg_rotate_logfile</> signals the log-file manager
> to switch
> ! to a new output file immediately. This works only when the
> built-in
> ! log collector, is used for logging, since otherwise there
> is no log-file manager subprocess.

<function>pg_rotate_logifle</> signals the log-file manager to switch
immediately switch to a new output file. This works only when using the
built-in log collector, as otherwise there is no log-file manager
subprocess.

> *** src/include/postmaster/syslogger.h 25 Jul 2007 12:22:53 -0000 1.10
> --- src/include/postmaster/syslogger.h 11 Aug 2007 02:01:20 -0000
> ***************
> *** 24,32 ****
> * also cope with non-protocol data coming down the pipe, though
> we cannot
> * guarantee long strings won't get split apart.
> *
> ! * We use 't' or 'f' instead of a bool for is_last to make the
> protocol a tiny
> ! * bit more robust against finding a false double nul byte
> prologue. But we
> ! * still might find it in the len and/or pid bytes unless we're
> careful.
> */
>
> #ifdef PIPE_BUF
> --- 24,32 ----
> * also cope with non-protocol data coming down the pipe, though
> we cannot
> * guarantee long strings won't get split apart.
> *
> ! * We use non-nul bytes in is_last to make the protocol a tiny bit
> ! * more robust against finding a false double nul byte prologue. But
> ! * we still might find it in the len and/or pid bytes unless
> we're careful.

I guess there's a distinction between null and nul that I don't yet
understand as the spelling is consistent, but I'm just pointing it
out on the off-chance it's a typo.

> Index: src/backend/postmaster/syslogger.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/postmaster/syslogger.c,v
> retrieving revision 1.36
> diff -c -r1.36 syslogger.c
> *** src/backend/postmaster/syslogger.c 4 Aug 2007 01:26:53 -0000 1.36
> --- src/backend/postmaster/syslogger.c 11 Aug 2007 02:01:13 -0000

> *** 1058,1063 ****
> --- 1200,1215 ----
> Log_filename, (unsigned long) timestamp);
> }
>
> + if (suffix != NULL)
> + {
> + len = strlen(filename);
> + if (len > 4 && (strcmp(filename+(len-4),".log") == 0))
> + {
> + len -= 4;
> + }

Just a style thing, but did you want to drop the braces since it's
only one line? There were a couple of other places like this, but
they also included comments pertaining to the branch, so I assumed
the braces were appropriate to clarify the branch extent.

Michael Glaesemann
grzm seespotcode net

In response to

Browse pgsql-patches by date

  From Date Subject
Next Message ITAGAKI Takahiro 2007-08-13 01:40:31 Re: Reduce the size of PageFreeSpaceInfo on 64bit platform
Previous Message Pavel Stehule 2007-08-11 18:00:48 ON DELETE SET NULL clauses do error when more than two columns are referenced to one table