Re: Cleaning up array_in()

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Nikhil Benesch <nikhil(dot)benesch(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Cleaning up array_in()
Date: 2023-07-08 19:49:31
Message-ID: 1740062.1688845771@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> On 08/07/2023 19:08, Tom Lane wrote:
>> That got sideswiped by ae6d06f09, so here's a trivial rebase to
>> pacify the cfbot.

> Something's wrong with your attachments.

Yeah, I forgot to run mhbuild :-(

> I spent some time today refactoring it for readability and speed. I
> introduced a separate helper function to tokenize the input. It deals
> with whitespace, escapes, and backslashes. Then I merged ArrayCount()
> and ReadArrayStr() into one function that parses the elements and
> determines the dimensions in one pass. That speeds up parsing large
> arrays. With the tokenizer function, the logic in ReadArrayStr() is
> still quite readable, even though it's now checking the dimensions at
> the same time.

Oh, thanks for taking a look!

> I also noticed that we used atoi() to parse the integers in the
> dimensions, which doesn't do much error checking.

Yup, I'd noticed that too but not gotten around to doing anything
about it. I agree with nailing it down better as long as we're
tightening things in this area.

> Attached are your patches, rebased to fix the conflicts with ae6d06f09
> like you intended. On top of that, my patches. My patches need more
> testing, benchmarking, and review, so if we want to sneak something into
> v16, better go with just your patches.

At this point I'm only proposing this for v17, so additional cleanup
is welcome.

BTW, what's your opinion of allowing "[1:0]={}" ? Although that was
my proposal to begin with, I'm having second thoughts about it now.
The main reason is that the input transformation would be lossy,
eg "[1:0]={}" and "[101:100]={}" would give the same results, which
seems a little ugly. Given the lack of field complaints, maybe we
should leave that alone.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2023-07-08 20:03:47 Re: Cleaning up array_in()
Previous Message Gurjeet Singh 2023-07-08 19:47:32 Re: DecodeInterval fixes