Re: Small Queries Really Fast, Large Queries Really Slow...

From: tv(at)fuzzy(dot)cz
To: "Tom Wilcox" <hungrytom(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Small Queries Really Fast, Large Queries Really Slow...
Date: 2010-06-24 11:20:04
Message-ID: 3376.85.160.2.47.1277378404.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> Any suggestions on what I can do to speed things up? I presume if I turn
> off
> Sequential Scan then it might default to Index Scan.. Is there anything
> else?
>
> Cheers,
> Tom

Well, I doubt turning off the sequential scan will improve the performance
in this case - actually the first case (running 400 sec) uses an index
scan, while the 'fast' one uses sequential scan.

Actually I'd try exactly the oposite - disabling the index scan, i.e.
forcing it to use sequential scan in the first case. You're selecting
about 4% of the rows, but we don't know how 'spread' are those rows
through the table. It might happen PostgreSQL actually has to read all the
blocks of the table.

This might be improved by clustering - create and index on the
'match_data_id' colunm and then run

CLUSTER match_data_id_idx ON match_data;

This will sort the table accoring to match_data_id column, which should
improve the performance. But it won't last forever - it degrades through
time, so you'll have to perform clustering once a while (and it locks the
table, so be careful).

How large is the table anyway? How many rows / pages are there? Try
something like this

SELECT reltuples, relpages FROM pg_class WHERE relname = 'match_data';

Multiply the blocks by 8k and you'll get the occupied space. How much is
it? How much memory (shared_buffers) is there?

You could try partitioning accoring to the match_data_id column, but there
are various disadvantages related to foreign keys etc. and it's often a
major change in the application, so I'd consider other solutions first.

BTW I have no experience with running PostgreSQL inside a Virtual Box VM,
so it might be another source of problems. I do remember we had some
serious problems with I/O (network and disks) when running vmware, but it
was a long time ago and now it works fine. But maybe this the root cause?
Can you run dstat / vmstat / iostat or something like that in the host OS
to see which of the resources is causing problems (if any)?

Tomas

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2010-06-24 11:35:16 Re: PostgreSQL as a local in-memory cache
Previous Message Tom Wilcox 2010-06-24 10:45:25 Small Queries Really Fast, Large Queries Really Slow...