Re: Collect frequency statistics for arrays

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Nathan Boley <npboley(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collect frequency statistics for arrays
Date: 2012-01-23 18:38:49
Message-ID: CAPpHfdsqAaWN4fqzLXK-fmY3VeXti1=i0Q-DEsyP2VrOcnSzuw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 23, 2012 at 7:58 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:

> > + /* Take care about events with low probabilities. */
> > + if (rest > DEFAULT_CONTAIN_SEL)
> > + {
>
> Why the change from "rest > 0" to this in the latest version?
>
Earlier, adding the "rest" distribution required O(m) time. Now there is a
more accurate, proven estimate, but it takes O(m^2) time. It doesn't make
the overall asymptotic complexity worse, but the cost is significant. That's
why I decided to skip it for low values of "rest", which don't change the
distribution significantly.

>
> > + /* emit some statistics for debug purposes */
> > + elog(DEBUG3, "array: target # mces = %d, bucket width = %d, "
> > + "# elements = %llu, hashtable size = %d, usable entries = %d",
> > + num_mcelem, bucket_width, element_no, i, track_len);
>
> That should be UINT64_FMT. (I introduced that error in v0.10.)
>
>
> I've attached a new version that includes the UINT64_FMT fix, some edits of
> your newest comments, and a rerun of pgindent on the new files. I see no
> other issues precluding commit, so I am marking the patch Ready for
> Committer.
>
Great!

> If I made any of the comments worse, please post another update.
>
The changes look reasonable to me. Thanks!

------
With best regards,
Alexander Korotkov.
