Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT

From: Gunnlaugur Thor Briem <gunnlaugur(at)gmail(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Date: 2015-03-05 15:01:58
Message-ID: CAPs+M8JtK3EqjPp=o=CLpOCr4L_0zwmKnbr6ib2T=r+-Z4xP7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi, thanks for your follow-up questions.

- postgres version is 9.1.13
- the number of rows (in this latest instance) is 28,474,842
- I've clustered and vacuum-full-ed and analyzed this table frequently,
attempting to troubleshoot this. (Running vacuum full on the whole catalog
seems a little excessive, and unlikely to help.)
- no other processes are likely to be interfering; nothing other than
PostgreSQL runs on this machine (except for normal OS processes and New
Relic server monitoring service); concurrent activity in PostgreSQL is
low-level and unrelated, and this effect is observed systematically
whenever this kind of operation is performed on this table
- no override for this table; the system default_statistics_target is 100
(the default)
- yes, an ANALYZE is performed on the temp table after the COPY and before
the INSERT
- no index on the temp table (but I'm scanning the whole thing anyway).
There are indexes on f_foo as detailed in my original post, and I expect
the PK to make the WHERE NOT EXISTS filtering efficient (as it filters on
exactly all columns of the PK) ... but even if it didn't, I would expect
that to only slow down the actual insert execution, not the EXPLAIN.

Cheers,

Gulli

On Wed, Mar 4, 2015 at 8:10 PM, Marc Mamin <M(dot)Mamin(at)intershop(dot)de> wrote:

> >Hi,
> >we are seeing cases of EXPLAIN INSERT INTO foo SELECT ... taking over an
> hour, with disk I/O utilization (percent of time device is busy) at 100%
> the whole time, although I/O bandwidth is not saturated. This is on
> PostgreSQL 9.1.13.
> >What could cause this? Note that there is no ANALYZE. Is it possible that
> the SELECT is actually executed, in planning the INSERT?
> >When executing the INSERT itself (not EXPLAIN) immediately afterwards,
> that logs a "temporary file" message, but the EXPLAIN invocation does not
> (though the disk I/O suggests that a large on-disk sort is occurring):
> >LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp6016.0", size
> 744103936
> >STATEMENT: INSERT INTO f_foo SELECT
> >[...]
> >During that actual execution, there's a lower disk I/O utilization
> (though a higher MB/s rate).
> >Charts of disk I/O utilization and rate are at
> http://postimg.org/image/628h6jmn3/ ... the solid 100% span is the
> EXPLAIN statement, ending at 6:13:30pm, followed by the INSERT statement
> ending at 6:32pm. Metrics are collected by New Relic; their definition of
> I/O utilization is at https://discuss.newrelic.com/t/disk-i-o-metrics/2900
> >Here's the EXPLAIN statement:
> >LOG: duration: 3928778.823 ms statement: EXPLAIN INSERT INTO f_foo
> SELECT
> > t_foo.fk_d1,
> > t_foo.fk_d2,
> > t_foo.fk_d3,
> > t_foo.fk_d4,
> > t_foo.fk_d5,
> > t_foo.fk_d6,
> > t_foo.value
> >FROM t_foo
> >WHERE NOT (EXISTS
> > (SELECT *
> > FROM f_foo
> > WHERE f_foo.fk_d2 = t_foo.fk_d2
> > AND f_foo.fk_d3 = t_foo.fk_d3
> > AND f_foo.fk_d4 = t_foo.fk_d4
> > AND f_foo.fk_d5 = t_foo.fk_d5
> > AND f_foo.fk_d6 = t_foo.fk_d6
> > AND f_foo.fk_d1 = t_foo.fk_d1))
> >(where t_foo is a temp table previously populated using COPY, and the NOT
> EXISTS subquery refers to the same table we are inserting into)
> >Here's the EXPLAIN output:
> >Insert on f_foo (cost=8098210.50..9354519.69 rows=1 width=16)
> > -> Merge Anti Join (cost=8098210.50..9354519.69 rows=1 width=16)
> > Merge Cond: ((t_foo.fk_d2 = public.f_foo.fk_d2) AND
> > (t_foo.fk_d3 = public.f_foo.fk_d3) AND
> > (t_foo.fk_d4 = public.f_foo.fk_d4) AND
> > (t_foo.fk_d5 = public.f_foo.fk_d5) AND
> > (t_foo.fk_d6 = public.f_foo.fk_d6) AND
> > (t_foo.fk_d1 = public.f_foo.fk_d1))
> > -> Sort (cost=3981372.25..4052850.70 rows=28591380 width=16)
> > Sort Key: t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4,
> t_foo.fk_d5,
> > t_foo.fk_d6, t_foo.fk_d1
> > -> Seq Scan on t_foo (cost=0.00..440461.80 rows=28591380
> > width=16)
> > -> Sort (cost=4116838.25..4188025.36 rows=28474842 width=12)
> > Sort Key: public.f_foo.fk_d2, public.f_foo.fk_d3,
> > public.f_foo.fk_d4, public.f_foo.fk_d5,
> > public.f_foo.fk_d6, public.f_foo.fk_d1
> > -> Seq Scan on f_foo (cost=0.00..591199.42 rows=28474842
> > width=12)
> >The INSERT is indeed rather large (which is why we're issuing an EXPLAIN
> ahead of it to log the plan). So its long execution time is expected. But I
> want to understand why the EXPLAIN takes even longer.
> >The table looks like this:
> >\d f_foo
> >Table "public.f_foo"
> > Column | Type | Modifiers
> >--------+----------+-----------
> > fk_d1 | smallint | not null
> > fk_d2 | smallint | not null
> > fk_d3 | smallint | not null
> > fk_d4 | smallint | not null
> > fk_d5 | smallint | not null
> > fk_d6 | smallint | not null
> > value | integer |
> >Indexes:
> > "f_foo_pkey" PRIMARY KEY, btree (fk_d2, fk_d6, fk_d4, fk_d3, fk_d5,
> fk_d1) CLUSTER
> > "ix_f_foo_d4" btree (fk_d4)
> > "ix_f_foo_d3" btree (fk_d3)
> > "ix_f_foo_d5" btree (fk_d5)
> > "ix_f_foo_d6" btree (fk_d6)
> >Foreign-key constraints:
> > "f_foo_d2_fkey" FOREIGN KEY (fk_d2) REFERENCES d2(id) DEFERRABLE
> > "f_foo_d6_fkey" FOREIGN KEY (fk_d6) REFERENCES d6(id) DEFERRABLE
> > "f_foo_d5_fkey" FOREIGN KEY (fk_d5) REFERENCES d5(id) DEFERRABLE
> > "f_foo_d4_fkey" FOREIGN KEY (fk_d4) REFERENCES d4(id) DEFERRABLE
> > "f_foo_d3_fkey" FOREIGN KEY (fk_d3) REFERENCES d3(id) DEFERRABLE
> >Conceivably relevant (though I don't know how): this database has a very
> large number of table objects (1.3 million rows in pg_class). But other
> EXPLAINs are not taking anywhere near this long in this DB; the heavy
> EXPLAIN is only seen on INSERT into this and a couple of other tables with
> tens of millions of rows.
> >Any ideas?
> >Thanks, best regards,
> >- Gulli
> >
>
> Hi,
> I've no clue for the time required by EXPLAIN
> but some more information are probably relevant to find an explanation:
>
> - postgres version
> - number of rows inserted by the query
> - how clean is your catalog in regard to vacuum
> ( can you run vacuum full verbose & analyze it, and then retry the
> analyze statement ?)
> - any other process that may interfere, e.g. while locking some catalog
> tables ?
> - statistic target ?
> - is your temp table analyzed?
> - any index on it ?
>
> We have about 300'000 entries in our pg_class tables, and I've never seen
> such an issue.
>
> regards,
> Marc Mamin
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Vladimir Sitnikov 2015-03-05 16:25:08 Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Previous Message Sven R. Kunze 2015-03-05 09:17:51 Re: Reverse Key Index