Re: Expected performance of querying 5k records from 4 million records?

From: Samuel Gendler <sgendler(at)ideasculptor(dot)com>
To: Anish Kejariwal <anishkej(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, josh(at)agliodbs(dot)com, pgsql-performance(at)postgresql(dot)org
Subject: Re: Expected performance of querying 5k records from 4 million records?
Date: 2012-06-18 16:49:39
Message-ID: CAEV0TzBC_MVO_D65dWMoRXFSMCnFfFa7ZDikcom9aEKFXsjL0A@mail.gmail.com
Lists: pgsql-performance

On Mon, Jun 18, 2012 at 9:39 AM, Anish Kejariwal <anishkej(at)gmail(dot)com> wrote:

>
> So, it looks like clustering the index appropriately fixes things! Also,
> I'll recreate the index switching the order to (dataset_id, stat_id,data_id)
>
Just keep in mind that clustering is a one-time operation. Inserts and
updates will change the order of records in the table, so you'll need to
re-cluster periodically to keep performance high if there are a lot of
inserts and updates to the tables. I didn't re-read the thread, but I
seem to recall a partitioned table, so assuming you are partitioning in a
manner that keeps the number of partitions actively being
inserted/updated to a minimum, you only need to cluster the active
partitions, which isn't usually terribly painful. Also, if you are bulk
loading data (and not creating gaps in the table by deleting and
updating), you can potentially order the data on the way into the table
and avoid the need to cluster repeatedly.
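
The advice above might look something like the following in practice. This is
a minimal sketch; the table, partition, and index names (stats,
stats_2012_06, stats_2012_06_idx, staging_stats) are hypothetical, chosen to
match the (dataset_id, stat_id, data_id) index order discussed earlier in the
thread:

```sql
-- One-time physical reorder of just the active partition,
-- using the index whose order the queries depend on:
CLUSTER stats_2012_06 USING stats_2012_06_idx;

-- Refresh planner statistics after the rewrite:
ANALYZE stats_2012_06;

-- Bulk-loading in index order keeps the partition physically
-- ordered, avoiding the need to re-cluster after each load:
INSERT INTO stats_2012_06 (dataset_id, stat_id, data_id, value)
SELECT dataset_id, stat_id, data_id, value
FROM staging_stats
ORDER BY dataset_id, stat_id, data_id;
```

Note that CLUSTER takes an ACCESS EXCLUSIVE lock and rewrites the partition,
which is why limiting it to the small set of active partitions matters.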

--sam
