Re: Row pattern recognition

From: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
To: champion(dot)p(at)gmail(dot)com
Cc: er(at)xs4all(dot)nl, vik(at)postgresfriends(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Row pattern recognition
Date: 2023-11-08 07:37:05
Message-ID: 20231108.163705.1531753254147888242.t-ishii@sranhm.sra.co.jp
Lists: pgsql-hackers

>> It would be nicer if it
>> could be implemented without using recursion.
>
> Yeah. If for some reason we end up going with a bespoke
> implementation, I assume we'd just convert the algorithm to an
> iterative one and optimize it heavily. But I didn't want to do that
> too early, since it'd probably make it harder to add new features...
> and anyway my goal is still to try to reuse src/backend/regex
> eventually.

Ok.

Attached is the v11 patch. Below is a summary of the changes from the
previous version.

- rebase.

- Reduce memory allocation in pattern matching (search_str_set()).
  However, Champion's second stress test still triggers the OOM killer.

  - When carrying an old set over to the next round, move the StringInfo
    into new_str_set rather than copying it from old_str_set. This
    allows pgbench.sql to run against up to 60k rows on my laptop
    (previously 20k).

  - Use enlargeStringInfo() to set the buffer size up front, rather than
    enlarging the buffer incrementally. This does not seem to help much
    in practice, but in theory it should be an improvement.

- Fix "variable not found in subplan target list" error if WITH is
  used.

  - To fix this, apply pullup_replace_vars() to the DEFINE clause in the
    planning phase (perform_pullup_replace_vars()). Also add regression
    test cases for the WITH queries that caused the error in the
    previous version.

- Fix the case when no greedy quantifiers ('+' or '*') are included in
  PATTERN.

  - Previously update_reduced_frame() did not consider this case and
    produced wrong results. Add a separate code path dedicated to
    non-greedy PATTERNs (at this point, that means PATTERNs with no
    quantifiers). Also add a test case for this.

- Remove unnecessary check in transformPatternClause().

  - Previously it checked whether all pattern variables are defined in
    the DEFINE clause. But RPR currently allows such variables to be
    "auto defined" as "varname AS TRUE", so the check was unnecessary.

- FYI, here is a list of what was changed in each patch file.

0001-Row-pattern-recognition-patch-for-raw-parser.patch
- same

0002-Row-pattern-recognition-patch-parse-analysis.patch
- Add markTargetListOrigins() to transformFrameOffset().
- Change transformPatternClause().

0003-Row-pattern-recognition-patch-planner.patch
- Fix perform_pullup_replace_vars()

0004-Row-pattern-recognition-patch-executor.patch
- Fix update_reduced_frame()
- Fix search_str_set()

0005-Row-pattern-recognition-patch-docs.patch
- same

0006-Row-pattern-recognition-patch-tests.patch
- Add test cases for the non-greedy PATTERN and WITH cases

0007-Allow-to-print-raw-parse-tree.patch
- same
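
To illustrate the two memory changes to search_str_set() described
above (moving a buffer into the new set instead of copying it, and
sizing the buffer once instead of growing it step by step), here is a
minimal standalone sketch. It is not code from the patch and does not
use the real StringInfo API: Buf, buf_reserve(), buf_append(),
buf_copy() and buf_move() are hypothetical stand-ins written only for
this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable string buffer (not StringInfoData). */
typedef struct Buf
{
	char	   *data;			/* NUL-terminated contents, or NULL if empty */
	size_t		len;			/* bytes used, excluding the terminator */
	size_t		cap;			/* bytes allocated */
} Buf;

/* Ensure room for "needed" more bytes with at most one reallocation. */
static void
buf_reserve(Buf *b, size_t needed)
{
	size_t		want = b->len + needed + 1;

	if (want > b->cap)
	{
		char	   *p = realloc(b->data, want);

		assert(p != NULL);
		b->data = p;
		b->cap = want;
	}
}

/* Append a C string; no reallocation happens if space was reserved. */
static void
buf_append(Buf *b, const char *s)
{
	size_t		n = strlen(s);

	buf_reserve(b, n);
	memcpy(b->data + b->len, s, n + 1);
	b->len += n;
}

/* Copy: allocates a second buffer and duplicates every byte. */
static Buf
buf_copy(const Buf *src)
{
	Buf			dst = {NULL, 0, 0};

	buf_reserve(&dst, src->len);
	if (src->len > 0)
		memcpy(dst.data, src->data, src->len);
	dst.data[src->len] = '\0';
	dst.len = src->len;
	return dst;
}

/* Move: hands over the existing allocation and empties the source. */
static Buf
buf_move(Buf *src)
{
	Buf			dst = *src;

	src->data = NULL;
	src->len = 0;
	src->cap = 0;
	return dst;
}
```

In this sketch buf_move() costs a few pointer assignments regardless of
the string length, while buf_copy() needs one allocation plus a copy of
the whole string. Calling buf_reserve() once with the expected final
size means later buf_append() calls do not reallocate.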

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Attachment                                                   Content-Type  Size
v11-0001-Row-pattern-recognition-patch-for-raw-parser.patch  text/x-patch  21.1 KB
v11-0002-Row-pattern-recognition-patch-parse-analysis.patch  text/x-patch  11.2 KB
v11-0003-Row-pattern-recognition-patch-planner.patch         text/x-patch  5.9 KB
v11-0004-Row-pattern-recognition-patch-executor.patch        text/x-patch  50.9 KB
v11-0005-Row-pattern-recognition-patch-docs.patch            text/x-patch  9.6 KB
v11-0006-Row-pattern-recognition-patch-tests.patch           text/x-patch  46.2 KB
v11-0007-Allow-to-print-raw-parse-tree.patch                 text/x-patch  749 bytes
