Skip site navigation (1) Skip section navigation (2)

Re: index for group by

From: "A(dot) Kretschmer" <andreas(dot)kretschmer(at)schollglas(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: index for group by
Date: 2008-07-22 11:27:24
Message-ID: 20080722112724.GD2742@a-kretschmer.de (view raw or flat)
Thread:
Lists: pgsql-sql
am  Tue, dem 22.07.2008, um 13:18:30 +0200 mailte Patrick Scharrenberg folgendes:
> Hi,
> 
> is there a way to speedup "group by" queries with an index?
> 
> In particular if I have a table like this:
> 
> CREATE TABLE data
> (
>    id1 integer,
>    id2 integer,
>    somedata character varying,
>    ts timestamp with time zone
> );
> 
> where continously data is logged about "id1" and "id2" into "somedata",
> together with the timestamp when it was logged.
> 
> So I have multiple rows with the same id1 and id2 but different
> timestamp (and data maybe).
> 
> At the moment I have ~40.000.000 rows in that table so doing a
> 
> 	SELECT id1, id2 FROM data GROUP BY id1, id2;


without a where-clause every select forces a seq-scan.


> 
> takes some time (~10 minutes)
> and return about 1.000.000 rows.
> 
> I created an index on both colums id1 and id2 (together) which takes
> about 800 MB but doesn't speedup things.
> In fact it even doesn't seem to be used.

The database has to read all rows, an index can't help in this case.


> 
> Is there any way to speedup this "group by" or does it seem more likely
> that I have a conceptional flaw?

Hard to say...


Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

In response to

pgsql-sql by date

Next:From: Christian KindlerDate: 2008-07-22 18:12:56
Subject: Re: How to Select a Tupl by Nearest Date
Previous:From: Patrick ScharrenbergDate: 2008-07-22 11:18:30
Subject: index for group by

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group