From: | Danylo Hlynskyi <abcz2(dot)uprola(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Delete tables difference involves seq scan |
Date: | 2017-12-01 13:17:30 |
Message-ID: | CANZg+yeKu3S=V=pf8xZzL=pfq9s1_Vbaa2x6dn8WAgtkKMKVKw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Oh, sorry, this happens on Postgresql 9.6.6. I've checked that on
Postgresql 10.0 query plan from setup (1)
now uses two seq scans, like in setup (2).
2017-12-01 15:03 GMT+02:00 Danylo Hlynskyi <abcz2(dot)uprola(at)gmail(dot)com>:
> Hello. I want to remove rows from first table, that exist in second
> (equality is done using PK). However I experience seq scan on second table,
> which counters my intuition - I think it should be index-only. Because
> tables are large, performance of query is very bad.
>
> However I got mixed results when trying to reproduce this behavior on
> syntetic tables. Here I'll show 3 different plans, which I got for the same
> query.
>
> 1. Setup is:
> ---------------------------
> create table diff (id uuid constraint diff_pkey primary key);
> create table origin (id uuid constraint origin_pkey primary key);
> ---------------------------
>
> The query generates correct plan, which performs only index scans:
>
> explain delete from origin where exists (select id from diff where
> origin.id = diff.id);
> QUERY PLAN
> ------------------------------------------------------------
> -------------------------------
> Delete on origin (cost=0.30..105.56 rows=1850 width=12)
> -> Merge Semi Join (cost=0.30..105.56 rows=1850 width=12)
> Merge Cond: (origin.id = diff.id)
> -> Index Scan using origin_pkey on origin (cost=0.15..38.90
> rows=1850 width=22)
> -> Index Scan using diff_pkey on diff (cost=0.15..38.90
> rows=1850 width=22)
> (5 rows)
>
> 2. Setup is:
> --------------------------------
> create table origin (id uuid constraint origin_pkey primary key, data
> jsonb);
> create table diff (id uuid constraint diff_pkey primary key, data jsonb);
> --------------------------------
>
> The query generates plan with two seq scans:
>
> explain delete from origin where exists (select id from diff where
> origin.id = diff.id);
> QUERY PLAN
> ------------------------------------------------------------
> ---------------
> Delete on origin (cost=34.08..69.49 rows=1070 width=12)
> -> Hash Semi Join (cost=34.08..69.49 rows=1070 width=12)
> Hash Cond: (origin.id = diff.id)
> -> Seq Scan on origin (cost=0.00..20.70 rows=1070 width=22)
> -> Hash (cost=20.70..20.70 rows=1070 width=22)
> -> Seq Scan on diff (cost=0.00..20.70 rows=1070 width=22)
> (6 rows)
>
> 3. My real `origin` table has 26 fields and 800 billion rows, real `diff`
> table has 12 million rows and the query generates plan with nested loop and
> seq scan on `diff` table:
>
> explain delete from drug_refills origin where exists (select id from
> drug_refills_diff diff where origin.id = diff.id);
> QUERY PLAN
> ------------------------------------------------------------
> ----------------------------------------------
> Delete on drug_refills origin (cost=0.57..22049570.11 rows=11975161
> width=12)
> -> Nested Loop (cost=0.57..22049570.11 rows=11975161 width=12)
> -> Seq Scan on drug_refills_diff diff (cost=0.00..720405.61
> rows=11975161 width=22)
> -> Index Scan using drug_refills_pkey on drug_refills origin
> (cost=0.57..1.77 rows=1 width=22)
> Index Cond: (id = diff.id)
> (5 rows)
>
> I have run ANALYZE on both tables, but it didn't help. Here are column
> types in origin and diff (same schema), if that matters:
>
> uuid
> timestamp with time zone
> timestamp with time zone
> character varying(255)
> character varying(255)
> character varying(1024)
> numeric(10,4)
> integer
> numeric(14,8)
> numeric(14,8)
> numeric(14,8)
> numeric(14,8)
> numeric(14,8)
> character varying(16)
> character varying(16)
> character varying(16)
> character varying(16)
> character varying(16)
> character varying(16)
> date
> jsonb
> text[]
> uuid
> uuid
> uuid
> uuid
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Danylo Hlynskyi | 2017-12-01 14:52:54 | Re: Delete tables difference involves seq scan |
Previous Message | Danylo Hlynskyi | 2017-12-01 13:03:35 | Delete tables difference involves seq scan |