Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT

From: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>
To: Gunnlaugur Thor Briem <gunnlaugur(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Date: 2015-03-04 20:10:56
Message-ID: B6F6FD62F2624C4C9916AC0175D56D8828B82149@jenmbs01.ad.intershop.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

>Hi,
>we are seeing cases of EXPLAIN INSERT INTO foo SELECT ... taking over an hour, with disk I/O utilization (percent of time device is busy) at 100% the whole time, although I/O bandwidth is not saturated. This is on PostgreSQL 9.1.13.
>What could cause this? Note that there is no ANALYZE. Is it possible that the SELECT is actually executed, in planning the INSERT?
>When executing the INSERT itself (not EXPLAIN) immediately afterwards, that logs a "temporary file" message, but the EXPLAIN invocation does not (though the disk I/O suggests that a large on-disk sort is occurring):
>LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp6016.0", size 744103936
>STATEMENT: INSERT INTO f_foo SELECT
>[...]
>During that actual execution, there's a lower disk I/O utilization (though a higher MB/s rate).
>Charts of disk I/O utilization and rate are at http://postimg.org/image/628h6jmn3/ ... the solid 100% span is the EXPLAIN statement, ending at 6:13:30pm, followed by the INSERT statement ending at 6:32pm. Metrics are collected by New Relic; their definition of I/O utilization is at https://discuss.newrelic.com/t/disk-i-o-metrics/2900
>Here's the EXPLAIN statement:
>LOG: duration: 3928778.823 ms statement: EXPLAIN INSERT INTO f_foo SELECT
> t_foo.fk_d1,
> t_foo.fk_d2,
> t_foo.fk_d3,
> t_foo.fk_d4,
> t_foo.fk_d5,
> t_foo.fk_d6,
> t_foo.value
>FROM t_foo
>WHERE NOT (EXISTS
> (SELECT *
> FROM f_foo
> WHERE f_foo.fk_d2 = t_foo.fk_d2
> AND f_foo.fk_d3 = t_foo.fk_d3
> AND f_foo.fk_d4 = t_foo.fk_d4
> AND f_foo.fk_d5 = t_foo.fk_d5
> AND f_foo.fk_d6 = t_foo.fk_d6
> AND f_foo.fk_d1 = t_foo.fk_d1))
>(where t_foo is a temp table previously populated using COPY, and the NOT EXISTS subquery refers to the same table we are inserting into)
>Here's the EXPLAIN output:
>Insert on f_foo (cost=8098210.50..9354519.69 rows=1 width=16)
> -> Merge Anti Join (cost=8098210.50..9354519.69 rows=1 width=16)
> Merge Cond: ((t_foo.fk_d2 = public.f_foo.fk_d2) AND
> (t_foo.fk_d3 = public.f_foo.fk_d3) AND
> (t_foo.fk_d4 = public.f_foo.fk_d4) AND
> (t_foo.fk_d5 = public.f_foo.fk_d5) AND
> (t_foo.fk_d6 = public.f_foo.fk_d6) AND
> (t_foo.fk_d1 = public.f_foo.fk_d1))
> -> Sort (cost=3981372.25..4052850.70 rows=28591380 width=16)
> Sort Key: t_foo.fk_d2, t_foo.fk_d3, t_foo.fk_d4, t_foo.fk_d5,
> t_foo.fk_d6, t_foo.fk_d1
> -> Seq Scan on t_foo (cost=0.00..440461.80 rows=28591380
> width=16)
> -> Sort (cost=4116838.25..4188025.36 rows=28474842 width=12)
> Sort Key: public.f_foo.fk_d2, public.f_foo.fk_d3,
> public.f_foo.fk_d4, public.f_foo.fk_d5,
> public.f_foo.fk_d6, public.f_foo.fk_d1
> -> Seq Scan on f_foo (cost=0.00..591199.42 rows=28474842
> width=12)
>The INSERT is indeed rather large (which is why we're issuing an EXPLAIN ahead of it to log the plan). So its long execution time is expected. But I want to understand why the EXPLAIN takes even longer.
>The table looks like this:
>\d f_foo
>Table "public.f_foo"
> Column | Type | Modifiers
>--------+----------+-----------
> fk_d1 | smallint | not null
> fk_d2 | smallint | not null
> fk_d3 | smallint | not null
> fk_d4 | smallint | not null
> fk_d5 | smallint | not null
> fk_d6 | smallint | not null
> value | integer |
>Indexes:
> "f_foo_pkey" PRIMARY KEY, btree (fk_d2, fk_d6, fk_d4, fk_d3, fk_d5, fk_d1) CLUSTER
> "ix_f_foo_d4" btree (fk_d4)
> "ix_f_foo_d3" btree (fk_d3)
> "ix_f_foo_d5" btree (fk_d5)
> "ix_f_foo_d6" btree (fk_d6)
>Foreign-key constraints:
> "f_foo_d2_fkey" FOREIGN KEY (fk_d2) REFERENCES d2(id) DEFERRABLE
> "f_foo_d6_fkey" FOREIGN KEY (fk_d6) REFERENCES d6(id) DEFERRABLE
> "f_foo_d5_fkey" FOREIGN KEY (fk_d5) REFERENCES d5(id) DEFERRABLE
> "f_foo_d4_fkey" FOREIGN KEY (fk_d4) REFERENCES d4(id) DEFERRABLE
> "f_foo_d3_fkey" FOREIGN KEY (fk_d3) REFERENCES d3(id) DEFERRABLE
>Conceivably relevant (though I don't know how): this database has a very large number of table objects (1.3 million rows in pg_class). But other EXPLAINs are not taking anywhere near this long in this DB; the heavy EXPLAIN is only seen on INSERT into this and a couple of other tables with tens of millions of rows.
>Any ideas?
>Thanks, best regards,
>- Gulli
>

Hi,
I've no clue for the time required by EXPLAIN
but some more information are probably relevant to find an explanation:

- postgres version
- number of rows inserted by the query
- how clean is your catalog in regard to vacuum
( can you run vacuum full verbose & analyze it, and then retry the analyze statement ?)
- any other process that may interfere, e.g. while locking some catalog tables ?
- statistic target ?
- is your temp table analyzed?
- any index on it ?

We have about 300'000 entries in our pg_class tables, and I've never seen such an issue.

regards,
Marc Mamin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jim Nasby 2015-03-05 08:25:42 Re: Re: [pgadmin-support] Issue with a hanging apply process on the replica db after vacuum works on primary
Previous Message Soni M 2015-03-04 19:31:23 slow server : s_lock and _bt_checkkeys on perf top