| From: | YANG Xudong <yangxudong(at)ymatrix(dot)cn> | 
|---|---|
| To: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> | 
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, wengyanqing(at)ymatrix(dot)cn, wanghao(at)ymatrix(dot)cn | 
| Subject: | Re: [PATCH] Add loongarch native checksum implementation. | 
| Date: | 2023-08-08 03:06:57 | 
| Message-ID: | da768d47-7534-c347-719a-06d8516cf939@ymatrix.cn | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Thanks for the comment. I have updated the patch to v3. Please have a look.
On 2023/8/7 19:01, John Naylor wrote:
> 
> On Fri, Jun 16, 2023 at 8:28 AM YANG Xudong <yangxudong(at)ymatrix(dot)cn 
> <mailto:yangxudong(at)ymatrix(dot)cn>> wrote:
>  > > +# If the intrinsics are supported, sets 
> pgac_loongarch_crc32c_intrinsics,
>  > > +# and CFLAGS_CRC.
>  > >
>  > > +# Check if __builtin_loongarch_crcc_* intrinsics can be used
>  > > +# with the default compiler flags.
>  > > +# CFLAGS_CRC is set if the extra flag is required.
>  > >
>  > > Same here -- it seems we don't need to set CFLAGS_CRC at all. Can you
>  > > confirm?
>  > >
>  >
>  > We don't need to set CFLAGS_CRC as commented. I have updated the
>  > configure script to make it align with the logic in meson build script.
> 
> (Looking again at v2)
> 
> The compilation test is found in c-compiler.m4, which still has all 
> logic for CFLAGS_CRC, including saving and restoring the old CFLAGS. Can 
> this also be simplified?
Fixed the function in c-compiler.m4 by removing the function argument 
and the logic of handling CFLAGS and CFLAGS_CRC.
> 
> I diff'd pg_crc32c_loongarch.c with the current other files, and found 
> it is structurally the same as the Arm implementation. That's logical if 
> memory alignment is important.
> 
>    /*
> - * ARMv8 doesn't require alignment, but aligned memory access is
> - * significantly faster. Process leading bytes so that the loop below
> - * starts with a pointer aligned to eight bytes.
> + * Aligned memory access is significantly faster.
> + * Process leading bytes so that the loop below starts with a pointer 
> aligned to eight bytes.
> 
> Can you confirm the alignment requirement -- it's not clear what the 
> intention is since "doesn't require" wasn't carried over. Is there any 
> documentation (or even a report in some other context) about aligned vs 
> unaligned memory access performance?
It is in the official document that the alignment is not required.
However, I found this patch in LKML that shows great performance gain 
when using aligned memory access similar to this patch.
https://lore.kernel.org/lkml/20230410115734(dot)93365-1-wangrui(at)loongson(dot)cn/
So I guess using aligned memory access is necessary and I have updated 
the comment in the code.
> 
> --
> John Naylor
> EDB: http://www.enterprisedb.com <http://www.enterprisedb.com>
| Attachment | Content-Type | Size | 
|---|---|---|
| v3-0001-Add-loongarch-native-checksum-implementation.patch | text/plain | 13.9 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrey Lepikhov | 2023-08-08 03:22:49 | Re: [PoC] Reducing planning time when tables have many partitions | 
| Previous Message | Michael Paquier | 2023-08-08 03:05:25 | Re: WIP: new system catalog pg_wait_event |