Skip site navigation (1) Skip section navigation (2)

Re: select on 1milion register = 6s

From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: "Bruno Rodrigues Siqueira" <bruno(at)ravnus(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: select on 1milion register = 6s
Date: 2007-07-29 04:51:41
Message-ID: dcc563d10707282151m7a2de11ex5f593cfc19b98095@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
On 7/28/07, Bruno Rodrigues Siqueira <bruno(at)ravnus(dot)com> wrote:
>
> Ok.
> QUERY PLAN
> Sort  (cost=11449.37..11449.40 rows=119 width=8) (actual
> time=14431.537..14431.538 rows=2 loops=1)
>   Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
>   ->  HashAggregate  (cost=11448.79..11448.96 rows=119 width=8) (actual
> time=14431.521..14431.523 rows=2 loops=1)
>         ->  Index Scan using detalhamento_bas_idx3003 on detalhamento_bas
> (cost=0.00..11442.95 rows=11679 width=8) (actual time=0.135..12719.155
> rows=2335819 loops=1)

See the row mismatch there?  It expects about 11k rows, gets back 2.3
million.  That's a pretty big misestimate.  Have you run analyze
recently on this table?

Is there a reason you're doing this:


to_char( data_encerramento ,'yyyy-mm')
between   '2006-12' and  '2007-01'

when you should be able to just do:

data_encerramento between   '2006-12-01' and  '2007-01-31'
?  that should be able to use good estimates from analyze.  My guess
is the planner is making a bad guess because of the way you're
handling the dates.

>         SERVER
>                   DELL PowerEdge 2950
>                   XEON Quad-Core 3.0Ghz
>                   4Gb RAM
>                   Linux CentOS 5.0 64-bits
>      Postgres 8.1.4

>      Postgresql.conf
> # - Memory -
>
> shared_buffers = 50000                  # min 16 or max_connections*2, 8KB

400 Meg is kind of low for a server with 4 G ram.  25% is more
reasonable (i.e. 125000 buffers)

> work_mem = 3145728                      # min 64, size in KB
> maintenance_work_mem = 4194304          # min 1024, size in KB

Whoa nellie!  thats ~ 3 Gig of work mem, and 4 gig of maintenance work
mem.  In a machine with 4 gig ram, that's a recipe for disaster.

Something more reasonable would be 128000 (~125Meg) for each since
you've limited your machine to 10 connections you should be ok.
setting work_mem too high can run your machine out of memory and into
a swap storm that will kill performance.

> fsync = off                             # turns forced synchronization on or
> off

So, the data in this database isn't important?  Cause that's what
fsync = off says to me.  Better to buy yourself a nice battery backed
caching RAID controller than turn off fsync.

> effective_cache_size = 41943040         # typically 8KB each

And you're machine has 343,604,830,208 bytes of memory available for
caching?  Seems a little high to me.

> random_page_cost = 1                    # units are one sequential page
> fetch

Seldom if ever is it a good idea to bonk the planner on the head with
random_page_cost=1.  setting it to 1.2 ot 1.4 is low enough, but 1.4
to 2.0 is more realistic.

> stats_start_collector = off
> #stats_command_string = off
> #stats_block_level = off
> #stats_row_level = off
> #stats_reset_on_server_start = off

I think you need stats_row_level on for autovacuum, but I'm not 100% sure.

Let us know what happens after fixing these settings and running
analyze and running explain analyze, with possible changes to the
query.

In response to

Responses

pgsql-performance by date

Next:From: Jay KangDate: 2007-07-29 14:22:30
Subject: Questions on Tags table schema
Previous:From: Bruno Rodrigues SiqueiraDate: 2007-07-28 23:07:18
Subject: RES: select on 1milion register = 6s

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group