| From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
|---|---|
| To: | li(dot)evan(dot)chao(at)gmail(dot)com |
| Cc: | vik(at)postgresfriends(dot)org, david(dot)g(dot)johnston(at)gmail(dot)com, jacob(dot)champion(at)enterprisedb(dot)com, er(at)xs4all(dot)nl, peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Row pattern recognition |
| Date: | 2025-12-01 12:21:38 |
| Message-ID: | 20251201.212138.1470631793001194849.ishii@postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
>> On Nov 27, 2025, at 11:10, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>>
>> Hi Chao,
>>
>> Any comment on this?
>>
>>>> 13 - 0005 - nodeWindowAgg.c
>>>> ```
>>>> static
>>>> int
>>>> do_pattern_match(char *pattern, char *encoded_str, int len)
>>>> {
>>>> static regex_t *regcache = NULL;
>>>> static regex_t preg;
>>>> static char patbuf[1024]; /* most recent 'pattern' is cached here */
>>>> ```
>>>>
>>>> Using static variable in executor seems something I have never seen, which may persistent data across queries. I think usually per query data are stored in state structures.
>>>
>>> Yeah, such a usage maybe rare. But I hesitate to store the data
>>> (regex_t) in WindowAggState because:
>>>
>>> (1) it bloats WindowAggState which we want to avoid if possible
>>> (2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)
>
> With the static function-scope variables, those data persist across queries, which is error prone. I am afraid that even if I let this pass, other reviewers might have the same concern.
>
> Maybe define a sub structure, say WindowAggCache, and put a pointer of WindowAggCache in WindowAggState, and only allocate memory when a query needs to.
>
>>>
>>>> 14 - 0005 - nodeWindowAgg.c
>>>> ```
>>>> + MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
>>>> +
>>>> + if (regcache != NULL)
>>>> + pg_regfree(regcache); /* free previous re */
>>>> +
>>>> + /* we need to convert to char to pg_wchar */
>>>> + plen = strlen(pattern);
>>>> + data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
>>>> + data_len = pg_mb2wchar_with_len(pattern, data, plen);
>>>> + /* compile re */
>>>> + sts = pg_regcomp(&preg, /* compiled re */
>>>> + data, /* target pattern */
>>>> + data_len, /* length of pattern */
>>>> + cflags, /* compile option */
>>>> + C_COLLATION_OID /* collation */
>>>> + );
>>>> + pfree(data);
>>>> +
>>>> + MemoryContextSwitchTo(oldContext);
>>>> ```
>>>>
>>>> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the comment of pg_regcomp:
>>>> ```
>>>> /*
>>>> * pg_regcomp - compile regular expression
>>>> *
>>>> * Note: on failure, no resources remain allocated, so pg_regfree()
>>>> * need not be applied to re.
>>>> */
>>>> int
>>>> pg_regcomp(regex_t *re,
>>>> const chr *string,
>>>> size_t len,
>>>> int flags,
>>>> Oid collation)
>>>> ```
>>>>
>>>> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory leak.
>>>
>>> I admit current patch leaves the memory unfreed even after a query
>>> finishes. What about adding something like:
>>>
>>> static void do_pattern_match_end(void)
>>> {
>>> if (regcache != NULL)
>>> pg_regfree(regcache);
>>> }
>>>
>>> and let ExecEndWindowAgg() call it?
>>
>
> I’m not sure. But I think if we move the data into WindowAggState, then I guess we don’t have to use TopmemoryContext here.
I decided to add new fields to WindowAggState:
/* regular expression compiled result cache. Used for RPR. */
char *patbuf; /* pattern to be compiled */
regex_t preg; /* compiled re pattern */
Then allocate the memory for them using
winstate->ss.ps.ps_ExprContext->ecxt_per_query_memory;
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Pavel Stehule | 2025-12-01 12:33:52 | Re: Migrate to autoconf 2.72? |
| Previous Message | shveta malik | 2025-12-01 12:13:14 | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |