From: | Marco Colli <collimarco91(at)gmail(dot)com> |
---|---|
To: | Michael Lewis <mlewis(at)entrata(dot)com> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Slow "not in array" operation |
Date: | 2019-11-12 19:04:15 |
Message-ID: | CAFvCgN6bQhf03jdakdk1eVMad8r=6nFV=cmy2v_sp0f8_kZ-bg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
To be honest, I have simplified the question above. In order to show you
the plan, I must show you the actual query, which is this:
=== QUERY ===
SELECT COUNT(*) FROM "subscriptions" WHERE "subscriptions"."project_id" =
123 AND "subscriptions"."trashed_at" IS NULL AND NOT (tags @>
ARRAY['en']::varchar[]);
=== QUERY PLAN ===
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=2152593.04..2152593.05 rows=1 width=8) (actual
time=70555.561..70555.561 rows=1 loops=1)
-> Gather (cost=2152592.31..2152593.02 rows=7 width=8) (actual
time=70540.641..70702.365 rows=8 loops=1)
Workers Planned: 7
Workers Launched: 7
-> Partial Aggregate (cost=2151592.31..2151592.32 rows=1
width=8) (actual time=70537.376..70537.377 rows=1 loops=8)
-> Parallel Seq Scan on subscriptions
(cost=0.00..2149490.49 rows=840731 width=0) (actual time=0.742..70479.359
rows=611828 loops=8)
Filter: ((trashed_at IS NULL) AND (NOT (tags @>
'{en}'::character varying[])) AND (project_id = 123))
Rows Removed by Filter: 4572769
Planning Time: 1.304 ms
Execution Time: 70702.463 ms
(10 rows)
=== INDEXES ===
Indexes:
"subscriptions_pkey" PRIMARY KEY, btree (id)
"index_subscriptions_on_project_id_and_created_at" btree (project_id,
created_at DESC)
"index_subscriptions_on_project_id_and_tags" gin (project_id, tags)
WHERE trashed_at IS NULL
"index_subscriptions_on_project_id_and_trashed_at" btree (project_id,
trashed_at DESC)
=== NOTES ===
Running the query without the last filter on tags takes only 500ms.
Unfortunately I cannot make strict assumptions on data or tags: for example
I also have to count subscriptions in a project that don't have tag A and
don't have tag B, etc. This means that I cannot simply calculate the total
and then make a subtraction.
On Tue, Nov 12, 2019 at 7:40 PM Michael Lewis <mlewis(at)entrata(dot)com> wrote:
> What's the plan for the slow one? What's the time to just count all rows?
>
>>
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Lewis | 2019-11-12 19:20:10 | Re: Slow "not in array" operation |
Previous Message | Michael Lewis | 2019-11-12 18:39:46 | Re: Slow "not in array" operation |