Re: stored proc and inserting hundreds of thousands of rows

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Joel Reymont" <joelr1(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: stored proc and inserting hundreds of thousands of rows
Date: 2011-04-30 18:24:12
Message-ID: 4DBC0D7C020000250003D09E@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

[rearranging to correct for top-posting]

Joel Reymont <joelr1(at)gmail(dot)com> wrote:
> Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> Joel Reymont <joelr1(at)gmail(dot)com> wrote:
>>
>>> We have 2 million documents now and linking an ad to all of them
>>> takes 5 minutes on my top-of-the-line SSD MacBook Pro.
>>
>> How long does it take to run just the SELECT part of the INSERT
>> by itself?

> Are you suggesting eliminating the physical linking and
> calculating matching documents on the fly?

I'm not suggesting anything other than it being a good idea to
determine where the time is being spent before trying to make it
faster. You showed this as the apparent source of the five minute
delay:

INSERT INTO doc_ads (doc_id, ad_id, distance)
SELECT doc, (t).ad_id, (t).distance
FROM (SELECT ads_within_distance(topics, threshold) AS t
FROM docs
WHERE id = doc) AS x;

What we don't know is how much of that time is due to writing to the
doc_ads table, and how much is due to reading the other tables. We
can find that out by running this:

SELECT doc, (t).ad_id, (t).distance
FROM (SELECT ads_within_distance(topics, threshold) AS t
FROM docs
WHERE id = doc) AS x;

If this is where most of the time is, the next thing is to run it
with EXPLAIN ANALYZE, and post the output. It's a whole different
set of things to try to tune if that part is fast and the INSERT
itself is slow.

Of course, be aware of caching effects when you time this.

-Kevin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2011-04-30 18:36:46 Re: stored proc and inserting hundreds of thousands of rows
Previous Message Joel Reymont 2011-04-30 18:14:31 Re: stored proc and inserting hundreds of thousands of rows