Quick Links

Re: Query Question (one group by vs two separate)

From:	Joris Dobbelsteen <joris(at)familiedobbelsteen(dot)nl>
To:	Jonathon Suggs <jsuggs(at)murmp(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Query Question (one group by vs two separate)
Date:	2008-12-14 00:51:05
Message-ID:	49445879.7030209@familiedobbelsteen.nl
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Jonathon Suggs wrote, On 10-12-08 20:12:
> I'm asking this as a more general question on which will perform
> better. I'm trying to get a set of comments and their score/rankings
> from two tables.
>
> *comments*
> cid (integer, primary key)
> title
> body
>
> *comment_ratings*
> cid (integer, primary key)
> uid (integer, primary key)
> score
>
> *Option 1* (Single group by query)
> select c.cid,c.title,c.body,coalsece(sum(r.score),0) as score
> from comments c left outer join comment_ratings r on c.cid = r.cid
> group by c.cid,c.title,c.body;
>
> *Option 2* (Two simple queries)
> select cid,title,body from comments;
> select cid,sum(score) from comment_ratings;
>
> I know that in general you always want to keep the number of queries to
> a minimum and that Option 1 is the most logical/preferred way to
> proceed. However, I've greatly simplified Option 1. In practice I'm
> going to be pulling a lot more columns and will be inner joining to
> additional tables. /So my real question is whether having a lot of
> items in the group by clause will have an adverse effect on performance./

You can answer that yourself... Create the database and fill it with a
decent set of sample data (if that doesn't already exists). Run analyze
to get your statistics. Use explain/explain analyze on the queries you
are interested in.

The manual has very good references. If you like, pgadmin can even
create graphical representation (which is what I prefer).

The obvious advantage depends on what you want to achieve? If its just a
dump of a table and an aggregate, you second option might be better.
If you need both data to transform it into something the user will see,
probably the first option is better, as you leave combining to the
highly specialized and optimized software and don't have to do it
yourself. The choice has to do with index scans and table scans, that
depend on the situation.

In fact, if the first one doesn't come out nice (first join than
aggregate, do aggeration in a subquery). So: learn explain!!! Every
(decent) RDBMS has such a facility.

- Joris

In response to

Query Question (one group by vs two separate) at 2008-12-10 19:12:49 from Jonathon Suggs

Responses

Re: Query Question (one group by vs two separate) at 2008-12-15 16:40:34 from Jonathon Suggs

Browse pgsql-general by date

	From	Date	Subject
Next Message	Michael Hall	2008-12-14 12:00:47	Relational database design book
Previous Message	SunWuKung	2008-12-13 19:13:39	application readable error message