On Wed, Feb 8, 2012 at 12:03 PM, David Yeu <david(dot)yeu(at)skype(dot)net> wrote:
> Hi there,
> We've got a pretty large table that sees millions of new rows a day, and
> we're trying our best to optimize queries against it. We're hoping to find
> some guidance on this list.
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
> | id | group_id | created_at | everything else...
> Where `id' is the primary key, auto-incrementing, `group_id' is the
> foreign key that we always scope against, and `created_at' is the
> insertion time. We have indices against the primary key and the group_id.
> Our queries essentially fall into the following cases:
> * ... WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
> * ... WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
> * ... WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
> * ... WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
> In human words, we're looking for:
> * The most recent (20) rows.
> * The most recent rows after a given `id'.
> * Twenty rows before a given `id'.
> * Pages of twenty rows.
You can probably optimize this significantly. But first, can we see
the EXPLAIN ANALYZE output for the affected queries?
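
Something along these lines, one per query shape, would be enough to go on.
This is only a sketch: the table name `messages`, the `SELECT *` column list,
and the literal values are placeholders, since the original post does not
give them.

```sql
-- Hypothetical table name (messages), column list, and literal values;
-- substitute the real ones. Note that EXPLAIN ANALYZE actually runs the
-- query, so the reported timings are real execution times.

-- 1. The most recent 20 rows for a group
EXPLAIN ANALYZE
SELECT * FROM messages
WHERE group_id = 42
ORDER BY created_at DESC
LIMIT 20;

-- 2. The most recent rows after a given id
EXPLAIN ANALYZE
SELECT * FROM messages
WHERE group_id = 42 AND id > 1000000
ORDER BY created_at DESC;

-- 3. Twenty rows before a given id
EXPLAIN ANALYZE
SELECT * FROM messages
WHERE group_id = 42 AND id < 1000000
ORDER BY created_at DESC
LIMIT 20;

-- 4. Pages of twenty rows
EXPLAIN ANALYZE
SELECT * FROM messages
WHERE group_id = 42
ORDER BY created_at DESC
LIMIT 20 OFFSET 200;
```

Running these with values that are representative of production (a busy
group, a deep offset) should show which index each query uses and whether a
sort step dominates the runtime.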