Re: not using indexes on large table

From: Andreas Kostyrka <andreas(at)kostyrka(dot)org>
To: Jeroen Kleijer <jeroen(dot)kleijer(at)xs4all(dot)nl>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: not using indexes on large table
Date: 2007-04-21 21:17:04
Message-ID: 20070421211704.GS4118@andi-lap
Lists: pgsql-performance

* Jeroen Kleijer <jeroen(dot)kleijer(at)xs4all(dot)nl> [070421 23:10]:
>
> Hi all,
>
> I'm a bit new to PostgreSQL and database design in general so forgive me
> for asking stupid questions. ;-)
>
> I've set up a PostgreSQL database on a Linux machine (2 processors, 1GB
> mem) and while the database itself resides on a NetApp filer, via NFS,
> this doesn't seem to impact the performance too drastically.
>
> I basically use it for indexed tables without any relation between 'em;
> so far this has worked perfectly.
>
> For statistics I've created the following table:
> volume varchar(30),
> qtree varchar(255),
> file varchar(512),
> ctime timestamp,
> mtime timestamp,
> atime timestamp
> and created separate indexes on the volume and qtree columns.
>
> This table gets filled with the copy command and, about 2 hours and
> some 40 million records later, I issue a reindex command to make sure the
> indexes are accurate. (For what it's worth, there are some 35 distinct
> values for volume and some 1450 for qtree.)
>
> While filling this table, my database grows to an (expected) 11.5GB.
>
> The problem comes when I try to do a query without a where clause,
> because then it completely ignores the indexes and does a full
> table scan which takes over half an hour! (40.710.725 rows, 1110258
> pages, 1715 seconds)
>
> I've tried several things, but a query like:
> select distinct volume from project_access_times
> or
> select distinct qtree from project_access_times
> always results in a full sequential table scan, even after a 'vacuum' and
> 'vacuum analyze'.
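
(For reference, the schema described above would look roughly like this;
the column list is taken from the report, the index names are just
placeholders:)

    -- rough reconstruction of the table and indexes described above;
    -- index names are assumed
    CREATE TABLE project_access_times (
        volume varchar(30),
        qtree  varchar(255),
        file   varchar(512),
        ctime  timestamp,
        mtime  timestamp,
        atime  timestamp
    );
    CREATE INDEX pat_volume_idx ON project_access_times (volume);
    CREATE INDEX pat_qtree_idx  ON project_access_times (qtree);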

Try:
select volume from project_access_times group by volume;
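
To check whether that helps, compare the plans for both forms (a rough
sketch; the exact plan output depends on your version, but the GROUP BY
form can use a HashAggregate instead of sorting all 40 million rows the
way DISTINCT does):

    -- compare the sort-based DISTINCT plan with the GROUP BY plan
    EXPLAIN ANALYZE SELECT DISTINCT volume FROM project_access_times;
    EXPLAIN ANALYZE SELECT volume FROM project_access_times GROUP BY volume;

Both queries still have to read the whole table; the win is in skipping
the sort.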

No matter what, running a database over NFS smells like a dead rat.

Hopefully you've mounted it hard, but NFS still does not have normal
filesystem semantics, e.g. for locking.

Next thing: as you've got only one client for that NFS mount, try to
make it cache metadata aggressively. The ac-prefixed options in
nfs(5) come to mind.
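
For example, something like this in /etc/fstab (the export path, mount
point and timeout are only illustrative; see nfs(5) for actimeo and the
acregmin/acregmax/acdirmin/acdirmax options):

    # illustrative only: hard mount with a long attribute-cache timeout
    filer:/vol/pgdata  /var/lib/pgsql/data  nfs  hard,intr,actimeo=600  0  0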

Andreas
