Re: count(*) performance improvement ideas

From: "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Subject: Re: count(*) performance improvement ideas
Date: 2008-03-18 23:24:16
Message-ID: F0238EBA67824444BC1CB4700960CB4804EAC60E@dmpeints002.isotach.com
Lists: pgsql-hackers

(Sorry for the repeat email, Tom; I forgot the cc's.)

Tom Lane wrote:
> "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz> writes:
> > I initially thought that what you meant by "having
> > transactions enter "delta" entries" was that I have a trigger
> > that would create a row each time it was called, writing how
> many records were inserted or deleted. I didn't understand
> > how this would be much of an improvement, as each of my rows
> > would contain either +1 or -1.
>
> Well, ideally you'd aggregate all the deltas caused by a particular
> transaction into one entry in the counting table.

Yes, that's what I was attempting to do, but without changing the application code.

Using txid_current() makes that possible: each of a large number of individual inserts or deletes within a transaction updates the same delta row for that transaction. I haven't found any references to this being used as a solution, so I thought it was worth recording.
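
For the record, the trigger ends up looking roughly like this (a sketch only: table and function names are illustrative, and error handling is omitted):

CREATE TABLE count_deltas (
    txid  bigint PRIMARY KEY,  -- txid_current() of the writing transaction
    delta bigint NOT NULL      -- net rows inserted minus rows deleted
);

CREATE OR REPLACE FUNCTION count_deltas_trig() RETURNS trigger AS $$
DECLARE
    d integer := CASE TG_OP WHEN 'INSERT' THEN 1 ELSE -1 END;
BEGIN
    -- all changes within one transaction hit the same delta row
    UPDATE count_deltas SET delta = delta + d WHERE txid = txid_current();
    IF NOT FOUND THEN
        INSERT INTO count_deltas VALUES (txid_current(), d);
    END IF;
    RETURN NULL;  -- AFTER ROW trigger, so the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER my_table_count
    AFTER INSERT OR DELETE ON my_table
    FOR EACH ROW EXECUTE PROCEDURE count_deltas_trig();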

> > However I'm not after a fast count(*) from table, but more like a fast
> > select grouping_id, count(*) from my_table group by grouping_id
>
> You could apply the same technique across each group id, though this
> certainly is getting beyond what any built-in feature might offer.

Agreed. I've tried it out now, and am fairly happy with what I've got.
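
The grouped variant just adds grouping_id to the delta table's key (again, illustrative names; my real table carries a few extra columns):

CREATE TABLE group_count_deltas (
    txid        bigint NOT NULL,   -- txid_current() of the writing transaction
    grouping_id integer NOT NULL,
    delta       bigint NOT NULL,
    PRIMARY KEY (txid, grouping_id)
);

The trigger is the same idea, updating the row matching (txid_current(), NEW.grouping_id) or (txid_current(), OLD.grouping_id), and the fast grouped count becomes:

SELECT grouping_id, sum(delta) AS row_count
  FROM group_count_deltas
 GROUP BY grouping_id;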

> > Can you clarify the lack of MVCC problems?
>
> The point there is that the "right answer" actually depends on the
> observer, since each observer might have a different snapshot and
> therefore be able to "see" a different set of committed rows in the
> underlying table. The multiple-delta table handles this
> automatically,
> because you can "see" a delta entry if and only if you could "see"
> the underlying-table changes it represents.
>
> > Does this idea apply with the same efficiency in pre 8.3, non-HOT
> > implementations?
>
> I didn't claim it was amazingly efficient in any implementation ;-).
> HOT in particular is nearly useless since most rows in the count
> table will never be updated, only inserted and eventually deleted.
> You might get some mileage on the base row, but that'd be about it.
> The count table will need frequent vacuums as well as frequent
> aggregation scans.
>
> It should beat scanning a large underlying table, but it's hardly
> gonna be free.

What I was asking with those questions is this: if a single row is inserted in transaction A, then updated 1000 times still within transaction A, and transaction A commits, does a single row version get written, or 1001 versions, 1000 of which are removable?

I tested this with a small summary table freshly vacuum full'ed.

10 transactions, one after the other; each transaction inserted three delta rows, updated one of them three times, and updated the other two five times each.
So 3 inserts and 13 updates per transaction.
The updates only affect non-indexed fields in rows created in the same transaction.
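
Via the trigger, each test transaction did the equivalent of this (grouping ids 1 to 3 stand in for the real values):

BEGIN;
-- the first change to each group inserts its delta row...
INSERT INTO group_count_deltas VALUES (txid_current(), 1, 1);
INSERT INTO group_count_deltas VALUES (txid_current(), 2, 1);
INSERT INTO group_count_deltas VALUES (txid_current(), 3, 1);
-- ...and each later change updates it in place: three times for
-- group 1, five times each for groups 2 and 3 (13 updates in all)
UPDATE group_count_deltas SET delta = delta + 1
 WHERE txid = txid_current() AND grouping_id = 1;
COMMIT;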

The next vacuum full found 52 removable row versions.

I repeated the test, and got 13 removable row versions.

I repeated the test again, and got 13 removable row versions.

I repeated just one of the ten transactions, 13 removable row versions.

All the inserts and updates probably land on the one page, which has a fair bit of free space.

Is it possible to update the HOT code to re-use row versions on the same page if they were created in the same transaction?

Conclusion: making use of txid_current(), I can get single delta rows with deltas of 10000, but doing so creates 10000 dead row versions.
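
The frequent aggregation scans Tom mentions would then periodically collapse the committed deltas back down to one row per group. A sketch, assuming SERIALIZABLE isolation so that both statements work from the same snapshot (I haven't stress-tested this under concurrent load):

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- fold every visible delta row into one new row per group...
INSERT INTO group_count_deltas
    SELECT txid_current(), grouping_id, sum(delta)
      FROM group_count_deltas
     GROUP BY grouping_id;
-- ...then drop the rows that were just folded in
DELETE FROM group_count_deltas WHERE txid <> txid_current();
COMMIT;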

Regards,
Stephen Denne.

