Re: Slow Count-Distinct Query

From:	Shaun Thomas <sthomas(at)optionshouse(dot)com>
To:	'Christopher Jackson' <crjackso(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Slow Count-Distinct Query
Date:	2014-03-31 13:17:55
Message-ID:	0683F5F5A5C7FE419A752A034B4A0B979785A92E@sswchi5pmbx2.peak6.net
Lists:	pgsql-performance

> tl;dr - How can I speed up my count-distinct query?

You can't.

Doing a count(distinct x) is very different from a count(1), which can simply scan available indexes. To build a distinct, it has to read every valid email, deduplicate them in memory (by hashing or sorting), and count the distinct values that remain. This will pretty much never be fast, especially with 2M rows involved.

I could be wrong about this, and the back-end folks might have a different answer, but I wouldn't hold my breath.
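
To make the difference between the two kinds of count concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so it runs without a PostgreSQL server, and the table name participants, the column email, and the generated addresses are all hypothetical, not taken from the original query. It shows a plain row count next to two equivalent ways of writing the distinct count.

```python
import sqlite3

# ASSUMPTION: SQLite stands in for PostgreSQL so the sketch runs anywhere.
# The table "participants" and its "email" column are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE participants (id INTEGER PRIMARY KEY, email TEXT)")

# 5000 rows that share only 1000 different email addresses.
rows = [(i, f"user{i % 1000}@example.com") for i in range(5000)]
conn.executemany("INSERT INTO participants VALUES (?, ?)", rows)

# A plain count only needs to know how many rows exist.
total = conn.execute("SELECT count(*) FROM participants").fetchone()[0]

# count(DISTINCT email) has to look at every email value and deduplicate.
distinct_direct = conn.execute(
    "SELECT count(DISTINCT email) FROM participants"
).fetchone()[0]

# Equivalent form: deduplicate in a subquery, then count the resulting rows.
# The IS NOT NULL filter keeps it equivalent when the column contains NULLs,
# because count(DISTINCT ...) ignores NULL values.
distinct_subquery = conn.execute(
    "SELECT count(*) FROM ("
    "  SELECT DISTINCT email FROM participants WHERE email IS NOT NULL"
    ") AS s"
).fetchone()[0]

print(total, distinct_direct, distinct_subquery)
```

Both distinct forms return the same number. A database may plan them differently, so if you try the subquery form on PostgreSQL, compare the two with EXPLAIN ANALYZE on your own data rather than assuming one is faster.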

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd | Suite 400 | Chicago IL, 60604
312-676-8870
sthomas(at)optionshouse(dot)com

In response to

Slow Count-Distinct Query at 2014-03-30 19:45:51 from Christopher Jackson