Re: Do we want a hashset type?

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Joel Jacobson <joel(at)compiler(dot)org>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Tom Dunstan <pgsql(at)tomd(dot)cc>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Do we want a hashset type?
Date: 2023-06-15 02:22:09
Message-ID: CACJufxFUeadb3qj8sSM_z2Z9YxGtQn_9YEL3Y=xd=xVVZ_=SaA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 15, 2023 at 5:04 AM Joel Jacobson <joel(at)compiler(dot)org> wrote:

> On Wed, Jun 14, 2023, at 15:16, Tomas Vondra wrote:
> > On 6/14/23 14:57, Joel Jacobson wrote:
> >> Would it be feasible to teach the planner to utilize the internal hash table of
> >> hashset directly? In the case of arrays, the hash table construction is an
> ...
> > It's definitely something I'd leave out of v0, personally.
>
> OK, thanks for guidance, I'll stay away from it.
>
> I've been doing some preparatory work on this todo item:
>
> > 3) support for other types (now it only works with int32)
>
> I've renamed the type from "hashset" to "int4hashset",
> and the SQL-functions are now prefixed with "int4"
> when necessary. The overloaded functions with
> int4hashset as input parameters don't need to be prefixed,
> e.g. hashset_add(int4hashset, int).
>
> Other changes since last update (4e60615):
>
> * Support creation of empty hashset using '{}'::hashset
> * Introduced a new function hashset_capacity() to return the current capacity
>   of a hashset.
> * Refactored hashset initialization:
>   - Replaced hashset_init(int) with int4hashset() to initialize an empty hashset
>     with zero capacity.
>   - Added int4hashset_with_capacity(int) to initialize a hashset with
>     a specified capacity.
> * Improved README.md and testing
>
> As a next step, I'm planning on adding int8 support.
>
> Looks and sounds good?
>
> /Joel

I'm still playing around with hashset-0.0.1-a8a282a.patch.

I think "postgres.h" should be the first include (someone said this on
another email thread; I forget who).

In my local /home/jian/postgres/pg16/include/postgresql/server/libpq/pqformat.h:

> /*
> * Append a binary integer to a StringInfo buffer
> *
> * This function is deprecated; prefer use of the functions above.
> */
> static inline void
> pq_sendint(StringInfo buf, uint32 i, int b)

So I changed the call to pq_sendint32.
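
To illustrate what the replacement call does, here is a standalone sketch that does not use the PostgreSQL headers. SketchBuf and sketch_sendint32 are hypothetical stand-ins for StringInfo and pq_sendint32; the only behaviour assumed from the real function is that it appends a 32-bit value in network byte order, with the width implied by the name rather than passed as a third argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for StringInfo: a fixed buffer plus a length. */
typedef struct SketchBuf
{
    unsigned char data[64];
    size_t        len;
} SketchBuf;

/*
 * Hypothetical analogue of pq_sendint32(buf, i): append a 32-bit integer
 * in network (big-endian) byte order. Unlike the deprecated
 * pq_sendint(buf, i, b), there is no byte-count argument to get wrong.
 */
static void
sketch_sendint32(SketchBuf *buf, uint32_t i)
{
    assert(buf != NULL && buf->len + 4 <= sizeof(buf->data));
    buf->data[buf->len++] = (unsigned char) (i >> 24);
    buf->data[buf->len++] = (unsigned char) (i >> 16);
    buf->data[buf->len++] = (unsigned char) (i >> 8);
    buf->data[buf->len++] = (unsigned char) i;
}
```

In the patch itself the change would amount to replacing a call of the form pq_sendint(buf, value, 4) with pq_sendint32(buf, value).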

Leading and trailing white space, and white space between elements, should
be stripped. The following C example seems OK for now, but I am not sure
about it, and I don't know how to glue it into hashset_in.

Forgive me for the patch name...

/*
gcc /home/jian/Desktop/regress_pgsql/strip_white_space.c && ./a.out
*/

#include<stdio.h>
#include<stdint.h>
#include<string.h>
#include<stdbool.h>
#include <ctype.h>
#include<stdlib.h>

/*
* array_isspace() --- a non-locale-dependent isspace()
*
* We used to use isspace() for parsing array values, but that has
* undesirable results: an array value might be silently interpreted
* differently depending on the locale setting. Now we just hard-wire
* the traditional ASCII definition of isspace().
*/
static bool
array_isspace(char ch)
{
    if (ch == ' ' ||
        ch == '\t' ||
        ch == '\n' ||
        ch == '\r' ||
        ch == '\v' ||
        ch == '\f')
        return true;
    return false;
}

int main(void)
{
    long *temp = malloc(10 * sizeof(long));
    if (temp == NULL)
        exit(EXIT_FAILURE);
    /* zero the whole array, not just the first 10 bytes */
    memset(temp, 0, 10 * sizeof(long));
    char source[5][50] = {{0}};
    snprintf(source[0],sizeof(source[0]),"%s"," { 1 , 20 }");
    snprintf(source[1],sizeof(source[1]),"%s"," { 1 ,20 , 30 ");
    snprintf(source[2],sizeof(source[2]),"%s"," {1 ,20 , 30 ");
    snprintf(source[3],sizeof(source[3]),"%s"," {1 , 20 , 30 }");
    snprintf(source[4],sizeof(source[4]),"%s"," {1 , 20 , 30 } ");
    /* Make a modifiable copy of the input */
    char *p;
    char string_save[50];

    for(int j = 0; j < 5; j++)
    {
        snprintf(string_save,sizeof(string_save),"%s",source[j]);
        p = string_save;

        int i = 0;
        /* skip leading white space; the first real character must be '{' */
        while (array_isspace(*p))
            p++;
        if (*p != '{')
        {
            printf("line: %d should be {\n",__LINE__);
            exit(EXIT_FAILURE);
        }

        for (;;)
        {
            char *q;
            if (*p == '{')
                p++;
            temp[i] = strtol(p, &q,10);
            printf("temp[j=%d] [%d]=%ld\n",j,i,temp[i]);

            if (*q == '}' && (*(q+1) == '\0'))
            {
                printf("all works ok now exit\n");
                break;
            }
            if( !array_isspace(*q) && *q != ',')
            {
                printf("wrong format. program will exit\n");
                exit(EXIT_FAILURE);
            }
            /* skip white space between the number and the next ',' */
            while(array_isspace(*q))
                q++;
            if(*q != ',')
                break;
            else
                p = q+1;
            i++;
        }
    }
    free(temp);
    return 0;
}
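
As one possible way to glue the checks above into an input function, the loop could be folded into a single helper that returns success or failure instead of calling exit(). This is only a sketch under assumptions: sketch_isspace and sketch_parse_int_set are hypothetical names, the values are kept as long without an int32 range check, and a real hashset_in would report errors with ereport() and add each value to the set rather than to an array.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Same hard-wired ASCII definition as array_isspace() above. */
static bool
sketch_isspace(char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\n' ||
           ch == '\r' || ch == '\v' || ch == '\f';
}

/*
 * Hypothetical helper: parse text such as " { 1 , 20 , 30 } " into out[].
 * White space is accepted before '{', around every element, and after '}'.
 * Returns true and stores the element count in *n on success; returns
 * false on any malformed input or if more than max elements are present.
 */
static bool
sketch_parse_int_set(const char *str, long *out, size_t max, size_t *n)
{
    const char *p = str;
    size_t      count = 0;

    assert(str != NULL && out != NULL && n != NULL);

    while (sketch_isspace(*p))
        p++;
    if (*p != '{')
        return false;
    p++;
    while (sketch_isspace(*p))
        p++;

    if (*p != '}')              /* non-empty set */
    {
        for (;;)
        {
            char *end;
            long  v;

            while (sketch_isspace(*p))
                p++;
            errno = 0;
            v = strtol(p, &end, 10);
            if (end == p || errno == ERANGE || count >= max)
                return false;   /* no digits, overflow, or too many items */
            out[count++] = v;
            p = end;
            while (sketch_isspace(*p))
                p++;
            if (*p != ',')
                break;
            p++;                /* step over the separator */
        }
        if (*p != '}')
            return false;
    }
    p++;                        /* step over '}' */
    while (sketch_isspace(*p))
        p++;
    if (*p != '\0')
        return false;           /* trailing garbage after '}' */

    *n = count;
    return true;
}
```

With this shape, the five test strings above either parse completely or are rejected as a whole, and '{}' yields an empty set.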

Attachment Content-Type Size
temp.patch text/x-patch 945 bytes
