Re: Running PostgreSQL as fast as possible no matter the consequences

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Running PostgreSQL as fast as possible no matter the consequences
Date: 2010-11-15 15:06:04
Message-ID: AANLkTi=4YuXEphsR3XW3Ua=eQGnvbpPsx7hf0T9sVi3C@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Nov 5, 2010 at 8:12 AM, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> wrote:
> On Fri, Nov 5, 2010 at 7:08 AM, Guillaume Cottenceau <gc(at)mnc(dot)ch> wrote:
>> Marti Raudsepp <marti 'at' juffo.org> writes:
>>
>>> On Fri, Nov 5, 2010 at 13:32, A B <gentosaker(at)gmail(dot)com> wrote:
>>>> I was just thinking about the case where I will have almost 100%
>>>> selects, but still needs something better than a plain key-value
>>>> storage so I can do some sql queries.
>>>> The server will just boot, load data, run,  hopefully not crash but if
>>>> it would, just start over with load and run.
>>>
>>> If you want fast read queries then changing
>>> fsync/full_page_writes/synchronous_commit won't help you.
>>
>> That illustrates how knowing the reasoning of this particular
>> requests makes new suggestions worthwhile, while previous ones
>> are now seen as useless.
>
> I disagree that they are useless - the stated mechanism was "start,
> load data, and run". Changing the params above won't likely change
> much in the 'run' stage but would they help in the 'load' stage?

Yes, they certainly will. And they might well help in the run stage,
too, if there are temporary tables in use, or checkpoints flushing
hint bit updates, or such things.

It's also important to crank up checkpoint_segments and
checkpoint_timeout very high, especially for the bulk data load but
even afterwards if there is any write activity at all. And it's
important to set shared_buffers correctly, too, which helps on
workloads of all kinds. But as said upthread, turning off fsync,
full_page_writes, and synchronous_commit are the things you can do
that specifically trade reliability away to get speed.

In 9.1, I'm hopeful that we'll have unlogged tables, which will even
better than turning these parameters off, and for which I just posted
a patch to -hackers. Instead of generating WAL and writing WAL to the
OS and then NOT trying to make sure it hits the disk, we just won't
generate it in the first place. But if PostgreSQL or the machine it's
running on crashes, you won't need to completely blow away the cluster
and start over; instead, the particular tables that you chose to
create as unlogged will be truncated, and the rest of your data,
including the system catalogs, will still be intact.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2010-11-15 15:12:59 Re: Difference between explain analyze and real execution time
Previous Message Artur Zając 2010-11-15 08:21:34 Difference between explain analyze and real execution time