Re: Less rows -> better performance?

From: "Christian GRANDIN" <christian(dot)grandin(at)gmail(dot)com>
To: "Richard Huxton" <dev(at)archonet(dot)com>
Cc: "Andreas Hartmann" <andreas(at)apache(dot)org>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Less rows -> better performance?
Date: 2008-07-21 14:00:23
Message-ID: 1568f9ad0807210700k78d55744mcd36838df5b78e8e@mail.gmail.com
Lists: pgsql-performance

Hi,

Reducing the amount of data will only have an effect on table scans and
large index scans. If your queries are selective and optimized, it will
make little difference.
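
For example, with a selective query on a hypothetical "courses" table
(the names are only for illustration), EXPLAIN ANALYZE shows both the
plan the planner picks and the real execution time:

EXPLAIN ANALYZE
SELECT * FROM courses WHERE course_id = 42;

If this reports an Index Scan, shrinking the table will barely change
the timing; a Seq Scan over a large table is the case where less data
actually helps.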

Before looking for solutions, the first thing to do is to understand what's
happening.

If you already know the queries, then EXPLAIN them. Otherwise, log
statement durations with the log_statement and log_min_duration_statement
parameters in postgresql.conf.
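
For instance (the 200 ms threshold is only a starting point, adjust it
to your workload):

# in postgresql.conf
log_min_duration_statement = 200   # log statements slower than 200 ms
log_statement = 'none'             # or 'all' for a short diagnostic window

Reload the configuration afterwards and watch the logs for the slowest
queries.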

Before this, you should at least run VACUUM ANALYZE on the database to
collect up-to-date statistics, so the explain plans reflect the current
state of the data.
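
From psql, connected to the database in question:

VACUUM ANALYZE;              -- whole database
VACUUM ANALYZE some_table;   -- or one table at a time (name is hypothetical)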

Best regards.

Christian

2008/7/21 Richard Huxton <dev(at)archonet(dot)com>

> Andreas Hartmann wrote:
>
>>
>> Here's some info about the actual amount of data:
>>
>> SELECT pg_database.datname,
>> pg_size_pretty(pg_database_size(pg_database.datname)) AS size
>> FROM pg_database where pg_database.datname = 'vvz_live_1';
>>
>> datname | size
>> ---------------+---------
>> vvz_live_1 | 2565 MB
>>
>> I wonder why the actual size is so much bigger than the data-only dump -
>> is this because of index data etc.?
>>
>
> I suspect Guillaume is right and you've not been vacuuming. That or you've
> got a *LOT* of indexes. If the database is only 27MB dumped, I'd just
> dump/restore it.
>
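
[Inline note: to see what is taking the space, you can list the largest
relations; a minimal sketch, should work on 8.1 and later:

SELECT relname, relkind, pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
ORDER BY pg_relation_size(oid) DESC
LIMIT 10;

relkind 'i' rows are indexes, 'r' rows are tables; a table far bigger
than its dumped data usually means dead rows from missing VACUUMs.]
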
> Since the database is read-only it might be worth running CLUSTER on the
> main tables if there's a sensible ordering for them.
>
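
[Inline note: this is cheap to try on a read-only database. A sketch,
assuming a hypothetical "courses" table with an index matching the most
common access order (the USING form needs 8.3; earlier versions spell it
CLUSTER indexname ON tablename):

CLUSTER courses USING courses_semester_idx;  -- hypothetical table/index
ANALYZE courses;   -- refresh statistics after the physical reorder
]
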
>>> What in particular is slow?
>>
>> There's no particular bottleneck (at least that we're aware of). During
>> the first couple of days after the beginning of the semester the application
>> request processing tends to slow down due to the high load (many students
>> assemble their schedule). The customer upgraded the hardware (which already
>> helped a lot), but they asked us to find further approaches to performance
>> optimization.
>>
>
> 1. Cache sensibly at the application (I should have thought there's plenty
> of opportunity here).
> 2. Make sure you're using a connection pool and have sized it reasonably
> (try 4,8,16 see what loads you can support).
> 3. Use prepared statements where it makes sense. Not sure how you'll manage
> the interplay between this and connection pooling in JDBC. Not a Java man
> I'm afraid.
>
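
[Inline note: at the SQL level a prepared statement looks like this
(table and parameter are hypothetical); JDBC's PreparedStatement achieves
the same effect:

PREPARE course_lookup (int) AS
  SELECT * FROM courses WHERE semester_id = $1;
EXECUTE course_lookup(42);

The query is planned once at PREPARE time and the plan is reused by
every EXECUTE, which saves planning overhead during peak load.]
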
> If you're happy with the query plans you're looking to reduce overheads as
> much as possible during peak times.
>
> 4. Offload more of the processing to clients with some fancy ajax-ed
> interface.
> 5. Throw in a spare machine as an app server for the first week of term.
> Presumably your load is 100 times average at this time.
>
> --
> Richard Huxton
> Archonet Ltd
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
