Skip site navigation (1) Skip section navigation (2)

Query optimization....

From: Karl Denninger <karl(at)denninger(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Query optimization....
Date: 2007-07-31 01:08:44
Message-ID: 46AE8B9C.1000208@denninger.net (view raw or flat)
Thread:
Lists: pgsql-performance
In a followup to a question I put forward here on performance which I 
traced to the "stats" bug (and fixed it).  Now I'm trying to optimize 
that query and....... I'm getting confused fast...

I have the following (fairly complex) statement which is run with some 
frequency:

select post.forum, post.subject, post.replied from post where toppost = 
1 and (replied > (select lastview from forumlog where login='someone' 
and forum=post.forum and number is null)) is not false AND (replied > 
(select lastview from forumlog where login='someone' and 
forum=post.forum and number=post.number)) is not false order by pinned 
desc, replied desc;

This gives me exactly what I'm looking for BUT can be quite slow.

The "forumlog" table has one tuple for each post and user; it has the 
fields "forum", "number", "login" and "lastview".  The "post" items have 
a "forum", "number" and "replied" field (which is used to match the 
"lastview" one.) 

When you look at a "post" (which may have replies) the application 
updates your existing entry in that table if there is one, or INSERTs a 
new tuple if not.

Therefore, for each post you have viewed, there is a tuple in the 
"forumlog" table which represents the last time you looked at that item.

The problem is that for a person who has NOT visited a specific thread 
of discussion, there is no "forumlog" entry for that person and post in 
the table.  Thus, to get all posts which (1) you've not seen at all, or 
(2) you've seen but someone has added to since you saw them, the above 
complex query is what I've come up with; there may be a "null" table 
entry which a "wildcard" match if its present - if there is no match 
then the item also must treated as new.  The above statement works - but 
its slow.

The following query is VERY fast but only returns those in which there 
IS an entry in the table (e.g. you've visited the item at least once)

select post.forum, post.subject, post.replied from post, forumlog where 
post.number = forumlog.number and post.toppost = 1 and post.replied > 
forumlog.lastview and forumlog.login='someone' order by pinned desc, 
replied desc;

What I haven't been able to figure out is how to structure a query that 
is both fast and will return the posts for which you DO NOT have a 
matching entry in the "forumlog" table for the specific post but DO 
either (1) match the "null" number entry (that is, they're posted later 
than that) OR (2) have no match at all.  (The first statement matches 
these other two cases)

Any ideas?  (Its ok if that query(s) are separate; in other words, its 
cool if I have to execute two or even three queries and get the results 
separately - in fact, that might be preferrable in some circumstances)

Ideas?

-- 
Karl Denninger (karl(at)denninger(dot)net)
http://www.denninger.net




%SPAMBLOCK-SYS: Matched [(at)postgresql(dot)org+], message ok

pgsql-performance by date

Next:From: Decibel!Date: 2007-07-31 01:46:37
Subject: Re: Vacuum looping?
Previous:From: Luke LonerganDate: 2007-07-30 21:41:38
Subject: Re: Postgres configuration for 64 CPUs, 128 GB RAM...

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group