Re: problems maintaining boolean columns in a large table

From: "Timo Klecker" <klecker(at)decoit(dot)de>
To: "'Ben Campbell'" <ben(at)scumways(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: problems maintaining boolean columns in a large table
Date: 2010-02-11 14:23:25
Message-ID: 006c01caab25$c5bf7900$513e6b00$@de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Ben,

you can check weather one of your indexes is used within the Query by simply
using EXPLAIN ANALYZE.

EXPLAIN ANALYZE SELECT * from journo WHERE id IN (SELECT journo_id FROM
journo_attr WHERE article_id=$AnyExistingIdHere$);

Maybe you have another trigger on the journo table, that is triggered on
update? This would indeed slow everything down. As you mentioned you should
move the needs_indexing flag out of the article table. This could simply
hold all the article_ids that need indexing.

Greetings
Timo Klecker

-----Ursprüngliche Nachricht-----
Von: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org] Im Auftrag von Ben Campbell
Gesendet: Donnerstag, 11. Februar 2010 14:45
An: pgsql-general(at)postgresql(dot)org
Betreff: Re: [GENERAL] problems maintaining boolean columns in a large table

Timo Klecker wrote:
> could you post your trigger function? When you need to rebuild the index,
> you could disable the trigger setting the flag if the article is modified.
> This could speed up your UPDATE.

Embarrassingly, when I checked, I found that I'd never gotten around to
writing that particular trigger function... (It's just being handled at
the app level).
However, there _is_ a trigger function which sets another flag somewhere
which I bet is responsible for a lot of the time... it sets a "modified"
flag on any journalist associated with the article:

-------------------------------

-- article table trigger
CREATE OR REPLACE FUNCTION article_setjournomodified_onupdate() RETURNS
TRIGGER AS $$
BEGIN
-- whenever article is modified, set the modified flag on any
attributed journos
UPDATE journo SET modified=true WHERE id IN (SELECT journo_id FROM
journo_attr WHERE article_id=NEW.id);
return NULL;
END;
$$ LANGUAGE 'plpgsql';

CREATE TRIGGER article_update AFTER UPDATE ON article FOR EACH ROW
EXECUTE PROCEDURE article_setjournomodified_onupdate();

-------------------------------

(excuse the bad linebreaks!)
I bet the subselect in that trigger slows things down.
"article_id" in journo attr is a foreign key:

"journo_attr_article_id_fkey" FOREIGN KEY (article_id) REFERENCES
article(id) ON DELETE CASCADE

Can the SELECT use such a foreign key index to speed things up? Or do I
need to explicitly add another index? (and yes, I know that's a stupid
newbie question!)

Either way, I'll have a go at disabling the trigger to see what impact
it has on the bulk update of 'article.needs_indexing'!

Actually, I think it's a good argument for moving the needs_indexing
flag out of the article table - modifying any other article fields
should cause attributed journos to be marked 'modified', but the
'needs_indexing' doesn't need to do this - it's just a little
implementation detail rather than real data...
(and the same goes for 'journo.modified'!)

Thanks,
Ben.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Koichi Suzuki 2010-02-11 14:28:00 Re: Problem with pg_compresslog'd archives
Previous Message Filip Rembiałkowski 2010-02-11 14:20:06 Re: problems maintaining boolean columns in a large table