Re: turn off caching for performance test

From: Willy-Bas Loos <willybas(at)gmail(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, robertmhaas(at)gmail(dot)com, Pierre C <lists(at)peufeu(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: turn off caching for performance test
Date: 2010-09-30 16:41:55
Message-ID: AANLkTikmB=1H=EwqxSzjz4D-Li3cAj9eQXgmUAX_NfdO@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

Sorry for the late answer.
I found the query i was looking for in the log (duration) and could
prove that the seqscan is faster if the data were not cached.
This particular one was 22% faster.
It is "a query which will get turned into a nested loop index scan for
a lot of rows, on a huge table", but it's only 22% slower without a
seqscan.
(there's no advantage with seqscans off, as long as the cache is empty)

I found few queries that did sequential scans in the normal mode on
tables that matter.
I found one query that did a seqscan anyway(with enable_seqscan off),
because doing an index scan would be more than 1M points more
expensive (to the planner).

$ grep ^[^#] /etc/postgresql/8.3/main/postgresql.conf|grep -e ^[^[:space:]]
data_directory =<blah> # use data in another directory
hba_file = <blah> # host-based authentication file
ident_file = <blah> # ident configuration file
external_pid_file = <blah> # write an extra PID file
listen_addresses = '*' # what IP address(es) to listen on;
port = 5432 # (change requires restart)
max_connections = 200 # (change requires restart)
unix_socket_directory = '/var/run/postgresql' # (change requires restart)
tcp_keepalives_idle = 120 # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 0 # TCP_KEEPCNT;
shared_buffers = 2GB # min 128kB or max_connections*16kB
temp_buffers = 24MB # min 800kB
work_mem = 100MB # min 64kB
maintenance_work_mem = 256MB # min 1MB
max_fsm_pages = 600000 # min max_fsm_relations*16, 6 bytes each
synchronous_commit = off # immediate fsync at commit
checkpoint_segments = 16 # in logfile segments, min 1, 16MB each
effective_cache_size = 4GB
log_min_duration_statement = 2000 # -1 is disabled, 0 logs all
statements --> milliseconds
log_line_prefix = '%t ' # special values:
autovacuum = on # Enable autovacuum subprocess? 'on'
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8' # locale for system error message
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8' # locale for number formatting
lc_time = 'en_US.UTF-8' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 128 # min 10

We have 15K rpm SAS disks in RAID10.
We have 16 GB of RAM and 4 modern processor cores (i think xeons,
might also be opteron)
We run Debian Lenny.
It's a dedicated DB server, there is one other cluster on it without
very much data and with few connections to it daily.
df -h on the data dir gives me 143G
we're growing
there are many queries that should be optimized
the seqscan option is in the connection string, not in the postgresql.conf

Cheers,

On Fri, Aug 27, 2010 at 7:57 PM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
> On Thu, Aug 26, 2010 at 4:32 AM, Willy-Bas Loos <willybas(at)gmail(dot)com> wrote:
>> Hi,
>>
>> I have a colleague that is convinced that the website is faster if
>> enable_seqscan is turned OFF.
>> I'm convinced of the opposite (better to leave it ON), but i would like to
>> show it, prove it to him.
>
> Stop, you're both doing it wrong.  The issue isn't whether or not
> turning off seq scans will make a few things faster here and there,
> it's why is the query planner choosing sequential scans when it should
> be choosing index scans.
>
> So, what are your non-default settings in postgresql.conf?
> Have you increased effective_cache_size yet?
> Lowered random_page_cost?
> Raised default stats target and re-analyzed?
>
> Have you been looking at the problem queries with explain analyze?
> What does it have to say about the planners choices?
>

--
"Patriotism is the conviction that your country is superior to all
others because you were born in it." -- George Bernard Shaw

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Marcelo Zabani 2010-09-30 18:33:03 gist indexes for distance calculations
Previous Message adrian.kitchingman 2010-09-30 06:00:30 Re: postgresql-9.0 Windows service stops after database transaction