Nikk Anderson wrote on Wed, 20.11.2002 at 20:08:
> I tried a test cluster on a copy of our real data - all 10 million
> rows or so. WOW! The normal select performance improved: selecting
> three months' worth of data had been taking 146 seconds; after
> clustering it took 7.7 seconds! We are now looking into ways we can
> automate clustering to keep the table up to date. The cluster itself
> took around 2.5 hours.
> As our backend systems are writing hundreds of rows per minute into
> the table that needs clustering, will CLUSTER handle locking the
> table while it drops the old data and renames the clustered copy?
> What happens to the data being added to the table while CLUSTER is
> running? Our backend systems may have problems if the table does not
> exist when they try to insert, and we don't want to lose any data.
You could use a staging table that takes all the inserts, and whose
contents are moved to the main table (begin; insert into big select *
from small; delete from small; commit; vacuum full small;) once a day
(or week or month), just before clustering the big one - a fleshed-out
sketch follows below.
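Spelled out, that periodic move might look like this (a sketch only:
"big" and "small" are the placeholder names from above, "big_date_idx"
is a hypothetical index, and the LOCK is my addition, to keep a row
from landing in small between the copy and the delete):

    BEGIN;
    -- briefly block the writers: under READ COMMITTED the INSERT and
    -- the DELETE take separate snapshots, so without the lock a row
    -- committed in between would be deleted without being copied
    LOCK TABLE small IN EXCLUSIVE MODE;
    INSERT INTO big SELECT * FROM small;
    DELETE FROM small;
    COMMIT;
    -- reclaim the space left behind by the deleted rows
    VACUUM FULL small;
    -- re-cluster the big table on the index the date-range selects use
    -- (the index-first syntax is the 7.x form)
    CLUSTER big_date_idx ON big;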
Then do all your selects from a UNION view on both tables - thus you
have a big, fast, clustered table and a non-clustered "live" table
which stays small. That should make your selects fast(er).
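The view could be as simple as this (again a sketch: "alldata" and the
timestamp column "ts" are made-up names):

    -- present the clustered history and the live rows as one relation
    CREATE VIEW alldata AS
        SELECT * FROM big
        UNION ALL
        SELECT * FROM small;

    -- the selects then go through the view, e.g. the 3-month query:
    SELECT * FROM alldata WHERE ts >= now() - interval '3 months';

Note the UNION ALL rather than a plain UNION, so the planner does not
add a duplicate-eliminating sort over the 10 million rows.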