Re: dedicated server & postgresql 8.1 conf tunning

From: Richard Huxton <dev(at)archonet(dot)com>
To: paul(at)wayr(dot)org
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: dedicated server & postgresql 8.1 conf tunning
Date: 2008-10-01 11:36:48
Message-ID: 48E360D0.8080403@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

paul(at)wayr(dot)org wrote:
> Hello
>
> I'm running pgsql 8.1.11 (from debian stable) on a server with 16GB RAM
> (Linux helios 2.6.18-6-amd64 #1 SMP Tue Aug 19 04:30:56 UTC 2008 x86_64
> GNU/Linux).

Unless you're committed to this version, I'd seriously look into 8.3
from backports (or compiled yourself). I'd expect some serious
performance improvements for the workload you describe.

> I have a table "tickets" with 1 000 000 insert by month ( ~2600 each 2hours
> ) (for the moment 13000000 rows for 5GB )
> and i have to extract statistics ( number of calls, number of calls less
> than X seconds, number of news calles, number of calls from the new
> callers, ...)

OK, so not a lot of updates, but big aggregation queries. You might want
to pre-summarise older data as the system gets larger.

> 1°) The server will handle max 15 queries at a time.
> So this is my postgresql.conf
>
> max_connections = 15

Well, I'd allow 20 - just in case.

> shared_buffers = 995600 # ~1Go
> temp_buffers = 1000
> work_mem = 512000 # ~512Ko

I'd be tempted to increase work_mem by a lot, possibly even at the
expense of shared_buffers. You're going to be summarising large amounts
of data so the larger the better, particularly as your database is
currently smaller than RAM. Start with 5MB then try 10MB, 20MB and see
what difference it makes.

> maintenance_work_mem = 1048576 # 1Mo
>
> max_fsm_pages = 41522880 # ~40Mo
> max_fsm_relations = 8000

See what a vacuum full verbose says for how much free space you need to
track.

> checkpoint_segments = 10
> checkpoint_timeout = 3600

With your low rate of updates shouldn't matter.

> effective_cache_size = 13958643712 # 13Go

Assuming that's based on what "top" or "free" say, that's fine. Don't
forget it will need to be reduced if you increase work_mem or
shared_buffers.

> stats_start_collector = on
> stats_command_string = on
> stats_block_level = on
> stats_row_level = on
> autovacuum = off

Make sure you're vacuuming if autovacuum is off.

> How can i optimize the configuration?

Looks reasonable, so far as you can tell from an email. Try playing with
work_mem though.

> 2°) My queries look like
> SELECT tday AS n,
> COUNT(DISTINCT(a.appelant)) AS new_callers,
> COUNT(a.appelant) AS new_calls
> FROM cirpacks.tickets AS a
> WHERE LENGTH(a.appelant) > 4
> AND a.service_id IN ( 95, 224, 35, 18 )
> AND a.exploitant_id = 66
> AND a.tyear = 2008
> AND a.tmonth = 08

Index on (tyear,tmonth) might pay off, or one on exploitant_id perhaps.

> AND EXISTS ( SELECT 1 FROM cirpacks.clients AS b WHERE b.appelant =
> a.appelant AND b.service_id IN ( 95, 224, 35, 18 ) AND b.heberge_id = 66
> HAVING to_char(MIN(b.premier_appel), 'YYYYMMDD') = to_char(a.date,
> 'YYYYMMDD') )

It looks like you're comparing two dates by converting them to text.
That's probably not the most efficient way of doing it. Might not be an
issue here.

> GROUP BY n
> ORDER BY n;
>
> or select ... SUM( CASE WHEN condition THEN value ELSE 0) ... FROM
> cirpacks.tickets WHERE tyear = ... and tmonth = ... and tday = ... AND
> audiotel IN ( '...', '...' ....);
> or select ... SUM( CASE WHEN condition THEN value ELSE 0) ... FROM
> cirpacks.tickets WHERE '2007-01-01' <= date AND date <= '2008-08-31' AND
> audiotel IN ( '...', '...' ....);
>
>
> which indexes are the best ?

The only way to find out is to test. You'll want to run EXPLAIN after
adding each index to see what difference it makes. Then you'll want to
see what impact this has on overall workload.

Mostly though, I'd try out 8.3 and see if that buys you a free
performance boost.

--
Richard Huxton
Archonet Ltd

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David logan 2008-10-01 12:11:33 Mystefied at poor performance of a standard query
Previous Message Matthew Wakeling 2008-10-01 10:49:54 Re: Confusing Query Performance