Re: performance for high-volume log insertion

From: david(at)lang(dot)hm
To: PFC <lists(at)peufeu(dot)com>
Cc: Glenn Maynard <glennfmaynard(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: performance for high-volume log insertion
Date: 2009-05-02 00:49:38
Message-ID: alpine.DEB.1.10.0905011748090.15782@asgard
Lists: pgsql-performance

On Sat, 2 May 2009, PFC wrote:

>> Blocking round trips to another process on the same server should be
>> fairly cheap--that is, writing to a socket (or pipe, or localhost TCP
>> connection) where the other side is listening for it; and then
>> blocking in return for the response. The act of writing to an FD that
>> another process is waiting for will make the kernel mark the process
>> as "ready to wake up" immediately, and the act of blocking for the
>> response will kick the scheduler to some waiting process, so as long
>> as nothing else is competing for the CPU, each write/read
>> will wake up the other process instantly. There's a task switching
>> cost, but that's too small to be relevant here.
>>
>> Doing 1000000 local round trips, over a pipe: 5.25s (5 *microseconds*
>> each), code attached. The cost *should* be essentially identical for
>> any local transport (pipes, named pipes, local TCP connections), since
>> the underlying scheduler mechanisms are the same.
>
> Roundtrips can be quite fast but they have a hidden problem, which is
> that everything gets serialized.
> This means if you have a process that generates data to insert, and a
> postgres process, and 2 cores on your CPU, you will never use more than 1
> core, because both are waiting on each other.
> Pipelining is a way to solve this...
> In the ideal case, if postgres is as fast as the data-generating
> process, each would use 1 core, yielding 2x speedup.
> Of course if one of the processes is like 10x faster than the other,
> it doesn't matter.

In the case of rsyslog, there are config options that allow multiple
threads to work on the inserts, so it doesn't need to be serialized as
badly as you fear. There is locking involved, though, so it doesn't
scale perfectly.
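
As a rough illustration of the idea (this is not how rsyslog is
implemented, just a sketch of the general pattern), several worker
threads can drain one shared queue and each perform its own inserts.
The names run_workers and insert are hypothetical. In real use, insert
would write to the database over a connection owned by the calling
thread. The lock inside the shared queue is the kind of contention
point that keeps this from scaling perfectly.

```python
import queue
import threading


def run_workers(messages, insert, num_workers=4):
    """Drain `messages` using several threads, calling insert(msg) on each.

    `insert` is a hypothetical callable standing in for a database insert.
    """
    q = queue.Queue()  # guarded by an internal lock shared by all workers
    for m in messages:
        q.put(m)

    def worker():
        while True:
            try:
                msg = q.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            insert(msg)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    stored = []
    run_workers((f"log line {i}" for i in range(1000)), stored.append)
    print(len(stored), "messages handled")
```

Ordering across threads is not preserved here, which is usually
acceptable for log rows that carry their own timestamps.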

David Lang
