Optimize query for listing un-read messages

From: Andreas Joseph Krogh <andreas(at)visena(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Optimize query for listing un-read messages
Date: 2014-05-01 11:06:24
Message-ID: OfficeNetEmail.a.39b074ef232dff6c.145b76dce83@prod2
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all,   I have a schema where I have lots of messages and some users who
might have read some of them. When a message is read by a user I create an
entry i a table message_property holding the property (is_read) for that user.  
The schema is as follows:   drop table if exists message_property;
drop table if exists message;
drop table if exists person;   create table person(
    id serial primary key,
    username varchar not null unique
);   create table message(
    id serial primary key,
    subject varchar
);   create table message_property(
    message_id integer not null references message(id),
    person_id integer not null references person(id),
    is_read boolean not null default false,
    unique(message_id, person_id)
);   insert into person(username) values('user_' || generate_series(0, 999));
insert into message(subject) values('Subject ' || random() ||
generate_series(0, 999999)); insert into message_property(message_id,
person_id, is_read) select id, 1, true from message order by id limit 999990;
insert into message_property(message_id, person_id, is_read) select id, 1,
false from message order by id limit 5 offset 999990;     So, for person 1
there are 10 unread messages, out of a total 1mill. 5 of those unread does not
have an entry in message_property and 5 have an entry andis_read set to FALSE.  
I have the following query to list all un-read messages for person with id=1:  
SELECT
    m.id                          AS message_id,
    prop.person_id,
    coalesce(prop.is_read, FALSE) AS is_read,
    m.subject
FROM message m
    LEFT OUTER JOIN message_property prop ON prop.message_id = m.id AND
prop.person_id = 1
WHERE 1 = 1
      AND NOT EXISTS(SELECT
                         *
                     FROM message_property pr
                     WHERE pr.message_id = m.id AND pr.person_id =
prop.person_id AND prop.is_read = TRUE)
    ;   The problem is that it's not quite efficient and performs badly,
explain analyze shows:
                                                                                        
QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Anti Join  (cost=1.27..148784.09 rows=5 width=40) (actual
time=907.412..907.419 rows=10 loops=1)
   Merge Cond: (m.id = pr.message_id)
   Join Filter: (prop.is_read AND (pr.person_id = prop.person_id))
   Rows Removed by Join Filter: 5
   ->  Merge Left Join  (cost=0.85..90300.76 rows=1000000 width=40) (actual
time=0.039..524.070 rows=1000000 loops=1)
         Merge Cond: (m.id = prop.message_id)
         ->  Index Scan using message_pkey on message m  (cost=0.42..34317.43
rows=1000000 width=35) (actual time=0.014..112.840 rows=1000000 loops=1)
         ->  Index Scan using message_property_message_id_person_id_key on
message_property prop  (cost=0.42..40983.40 rows=999995 width=9) (actual
time=0.019..129.035 rows=999995 loops=1)
               Index Cond: (person_id = 1)
   ->  Index Only Scan using message_property_message_id_person_id_key on
message_property pr  (cost=0.42..40983.40 rows=999995 width=8) (actual
time=0.023..138.327 rows=999995 loops=1)
         Index Cond: (person_id = 1)
         Heap Fetches: 999995
 Total runtime: 907.480 ms
(13 rows)     Does anyone have suggestions on how to optimize the query or
schema? It's important that any message not having an entry in message_property
for a user is considered un-read.   Thanks!   -- Andreas Jospeh Krogh CTO /
Partner - Visena AS Mobile: +47 909 56 963 andreas(at)visena(dot)com
<mailto:andreas(at)visena(dot)com> www.visena.com <https://www.visena.com>
<https://www.visena.com>

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Hengky Liwandouw 2014-05-01 15:04:02 Re: Return and sql tuple descriptions are incompatible
Previous Message Francisco Olarte 2014-05-01 10:32:29 Re: Vacuuming strategy