Re: Radix tree for character conversion

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: hlinnaka(at)iki(dot)fi
Cc: daniel(at)yesql(dot)se, peter(dot)eisentraut(at)2ndquadrant(dot)com, robertmhaas(at)gmail(dot)com, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, ishii(at)sraoss(dot)co(dot)jp, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Radix tree for character conversion
Date: 2017-01-10 11:22:23
Message-ID: 20170110.202223.184810013.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello, I found a bug in my portion while rebasing.

The attached patches apply on top of the current master HEAD, not
on Heikki's previous version, and are separated into four parts.

At Tue, 13 Dec 2016 15:11:03 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20161213(dot)151103(dot)157484378(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> > Apart from the aboves, I have some trivial comments on the new
> > version.
> >
> >
> > 1. If we decide not to use old-style maps, UtfToLocal no longer
> > need to take void * as map data. (Patch 0001)

I changed the pointer type wrongly. Combined maps are of the type
*_combined.

> > 2. "use Data::Dumper" doesn't seem necessary. (Patch 0002)
> > 3. A comment contains a superfluous comma. (Patch 0002) (The last
> > byte of the first line below)
> > 4. The following code doesn't seem so perl'ish.
> > 5. download_srctxts.sh is no longer needed. (No patch)
>
> 6. Fixed some inconsistent indentation/folding.
> 7. Fix handling of $verbose.
> 8. Sort segments using leading bytes.

The attached files are as follows. This patchset is incomplete:
it omits the changes to the map files. Those changes are
tremendously large but can be generated.

0001-Add-missing-semicolon.patch

UCS_to_EUC_JP.pl has a line missing its terminating semicolon.
This does no harm but is surely a syntax error. This patch fixes
it. It should perhaps be committed as a separate patch.

0002-Correct-reference-resolution-syntax.patch

convutils.pm mixes different syntaxes for reference resolution
(dereferencing). This patch unifies the syntax.

0003-Apply-pgperltidy-on-src-backend-utils-mb-Unicode.patch

Before adding the radix tree code, I applied pgperltidy and
inserted format-skipping pragmas for the parts where perltidy
seems to do too much.

0004-Use-radix-tree-for-character-conversion.patch

Radix tree body.
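
As a rough illustration of the kind of lookup a radix tree map allows,
here is a minimal sketch. It is not the code in the patch: the names
radix_map, radix_lookup and the demo_* tables are hypothetical, the
tree is limited to two-byte codes, and a result of 0 is assumed to
mean "no mapping".

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical two-level radix tree (not the structure used in the patch).
 * level1 is indexed by the lead byte and yields an offset into level2.
 * level2 holds blocks of 256 values indexed by the trail byte.
 * Offset 0 points at an all-zero block, so unmapped lead bytes yield 0.
 */
typedef struct
{
    const uint16_t *level1;     /* 256 entries, 0 = lead byte not mapped */
    const uint32_t *level2;     /* concatenated 256-entry blocks */
} radix_map;

/* Look up a two-byte code; returns 0 when there is no mapping. */
static uint32_t
radix_lookup(const radix_map *map, uint8_t b1, uint8_t b2)
{
    uint16_t    off = map->level1[b1];

    return map->level2[(size_t) off + b2];
}

/* Illustrative data only: one entry mapping bytes A4 A2 to U+3042. */
static const uint16_t demo_level1[256] = {[0xA4] = 256};
static const uint32_t demo_level2[512] = {[256 + 0xA2] = 0x3042};
static const radix_map demo_map = {demo_level1, demo_level2};
```

With these demo tables, radix_lookup(&demo_map, 0xA4, 0xA2) returns
0x3042, and any other byte pair returns 0. Each lookup costs one array
access per input byte, with no searching.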

The unattached fifth patch is generated by the following steps.

[$(TOP)]$ ./configure
[Unicode]$ make
[Unicode]$ make distclean
[Unicode]$ git add .
[Unicode]$ git commit
=== COMMIT MESSAGE
Replace map files with radix tree files.

These encodings no longer use the former map files and use the new
radix tree files instead.
===

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment                                                    Content-Type  Size
0001-Add-missing-semicolon.patch                              text/x-patch  892 bytes
0002-Correct-reference-resolution-syntax.patch                text/x-patch  3.7 KB
0003-Apply-pgperltidy-on-src-backend-utils-mb-Unicode.patch   text/x-patch  22.3 KB
0004-Use-radix-tree-for-character-conversion.patch            text/x-patch  1.6 MB
