Re: width_bucket function for timestamps

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Jeremy Drake <pgsql(at)jdrake(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: width_bucket function for timestamps
Date: 2006-10-09 15:53:47
Message-ID: 20061009155346.GY72517@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sinte we already have width_bucket, I'd argue this should go in core. If
someone's feeling adventurous, there should probably be a double
precision version as well. Hrm... and maybe text...

Doesn't the backend already have something like this for calculating
histograms?

On Sun, Oct 08, 2006 at 10:30:47PM -0700, Jeremy Drake wrote:
> I just came across this code I wrote about a year ago which implements a
> function equivilant to width_bucket for timestamps.
>
> I wrote this when I was trying to plot some data over time, and I had more
> points than I needed. This function allowed me to create a pre-determined
> number of "bins" to average the data inside of so that I could get a sane
> number of points. Part of the problem was that there were so many data
> points, that a sql implementation of the function (or plpgsql, I forget,
> it was a year ago) was painfully slow. This C function provided much
> better performance than any other means at my disposal.
>
> I wanted to share this code since it may be useful for someone else, but I
> don't know exactly what to do with it. So I am putting it out there, and
> asking what the proper home for such a function might be. I believe it
> would be generally useful for people, but it is so small that it hardly
> seems like a reasonable pgFoundry project. Maybe there is a home for such
> a thing in the core distribution in a future release?
>
> The code can be found at
> http://www.jdrake.com/postgresql/bintimestamp.tar.gz for a buildable PGXS
> module, or I attached just the C code. There is no documentation, the
> parameters work the same as the width_bucket function. The code is not
> necessarily the most readable in the world, I was trying to get as much
> speed out of it as possible, since I was calling it over a million times
> as a group by value.
>
> Thanks for any pointers...
>
> --
> Fortune's Office Door Sign of the Week:
>
> Incorrigible punster -- Do not incorrige.

> /*****************************************************************************
> * file: $RCSfile: bintimestamp.c,v $ $Revision: 1.1 $
> * module: timestamp
> * authors: jeremyd
> * last mod: $Author: jeremyd $ at $Date: 2005/10/28 20:26:38 $
> *
> * created: Fri Oct 28 13:26:38 PDT 2005
> *
> *****************************************************************************/
>
> #include <string.h>
> #include <math.h>
> #include "postgres.h"
>
> #include "fmgr.h"
> #include "libpq/pqformat.h"
> #include "utils/builtins.h"
> #include "funcapi.h"
> #include "utils/timestamp.h"
>
> #ifndef JROUND
> # define JROUND(x) (x)
> #endif
>
> Datum timestamp_get_bin_size(PG_FUNCTION_ARGS);
> Datum timestamp_bin(PG_FUNCTION_ARGS);
>
> PG_FUNCTION_INFO_V1(timestamp_get_bin_size);
> Datum
> timestamp_get_bin_size(PG_FUNCTION_ARGS)
> {
> Timestamp start = PG_GETARG_TIMESTAMP(0);
> Timestamp stop = PG_GETARG_TIMESTAMP(1);
> int32 nbuckets = PG_GETARG_INT32(2);
> Interval * retval = (Interval *)palloc (sizeof(Interval));
>
> if (!retval)
> {
> ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY), errmsg("insufficient memory for Interval allocation")));
> PG_RETURN_NULL();
> }
>
> memset (retval, 0, sizeof(Interval));
>
> retval->time = JROUND ((stop - start) / nbuckets);
>
> PG_RETURN_INTERVAL_P(retval);
> }
>
> PG_FUNCTION_INFO_V1(timestamp_bin);
> Datum
> timestamp_bin(PG_FUNCTION_ARGS)
> {
> /*Timestamp op = PG_GETARG_TIMESTAMP(0);*/
> Timestamp start = PG_GETARG_TIMESTAMP(1);
> /*Timestamp stop = PG_GETARG_TIMESTAMP(2);*/
> Timestamp binsz;
> /*int32 nbuckets = PG_GETARG_INT32(3)*/;
>
> binsz = (PG_GETARG_TIMESTAMP(2) - start) / PG_GETARG_INT32(3);
>
> PG_RETURN_TIMESTAMP(JROUND((int)((PG_GETARG_TIMESTAMP(0) - start) / binsz) * binsz + start));
> }

>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-10-09 16:02:12 Re: width_bucket function for timestamps
Previous Message Mark Woodward 2006-10-09 15:50:10 Re: Upgrading a database dump/restore