Re: performance for high-volume log insertion

From: david(at)lang(dot)hm
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Ben Chobot <bench(at)silentmedia(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: performance for high-volume log insertion
Date: 2009-04-21 16:59:01
Message-ID: alpine.DEB.1.10.0904210948070.12662@asgard.lang.hm
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, 21 Apr 2009, Stephen Frost wrote:

> * Ben Chobot (bench(at)silentmedia(dot)com) wrote:
>> On Mon, 20 Apr 2009, david(at)lang(dot)hm wrote:
>>> one huge advantage of putting the sql into the configuration is the
>>> ability to work around other users of the database.
>>
>> +1 on this. We've always found tools much easier to work with when they
>> could be adapted to our schema, as opposed to changing our process for
>> the tool.
>
> I think we're all in agreement that we should allow the user to define
> their schema and support loading the data into it. The question has
> been if the user really needs the flexibility to define arbitrary SQL to
> be used to do the inserts.
>
> Something I'm a bit confused about, still, is if this is really even a
> problem. It sounds like rsyslog already allows arbitrary SQL in the
> config file with some kind of escape syntax for the variables. Why not
> just keep that, but split it into a prepared query (where you change the
> variables to $NUM vars for the prepared statement) and an array of
> values (to pass to PQexecPrepared)?
>
> If you already know how to figure out what the variables in the
> arbitrary SQL statement are, this shouldn't be any more limiting than
> today, except where a prepared query can't have a variable argument but
> a non-prepared query can (eg, table name). You could deal with that
> with some kind of configuration variable that lets the user choose to
> use prepared queries or not though, or some additional syntax that
> indicates certain variables shouldn't be translated to $NUM vars (eg:
> $*blah instead of $blah).

I think the key thing is that rsyslog today doesn't know anything about
SQL variables, it just creates a string that the user and the database say
looks like a SQL statement.

an added headache is that the rsyslog config does not have the concept of
arrays (the closest that it has is one special-case hack to let you
specify one variable multiple times)

if the performance win of the prepared statement is significant, then it's
probably worth the complication of changing these things.

David Lang

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Smith 2009-04-21 17:17:41 Re: performance for high-volume log insertion
Previous Message Stephen Frost 2009-04-21 16:37:48 Re: performance for high-volume log insertion