Re: A bug when use get_bit() function for a long bytea string

From: "movead(dot)li(at)highgo(dot)ca" <movead(dot)li(at)highgo(dot)ca>
To: ashutosh(dot)bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: A bug when use get_bit() function for a long bytea string
Date: 2020-03-28 08:59:01
Message-ID: 2020032816585865712320@highgo.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>I think the second argument indicates the bit position, which would be max bytea length * 8. If max bytea length covers whole int32, the second argument >needs to be wider i.e. int64.
Yes, it makes sence and followed.

>I think we need a similar change in byteaGetBit() and byteaSetBit() as well.
Sorry, I think it's my mistake, it is the two functions above should be changed.

> Some more comments on the patch
> struct pg_encoding
> {
>- unsigned (*encode_len) (const char *data, unsigned dlen);
>+ int64 (*encode_len) (const char *data, unsigned dlen);
> unsigned (*decode_len) (const char *data, unsigned dlen);
> unsigned (*encode) (const char *data, unsigned dlen, char *res);
> unsigned (*decode) (const char *data, unsigned dlen, char *res);
> Why not use return type of int64 for rest of the functions here as well?
> res = enc->encode(VARDATA_ANY(data), datalen, VARDATA(result));
> /* Make this FATAL 'cause we've trodden on memory ... */
>- if (res > resultlen)
>+ if ((int64)res > resultlen)
>
>if we change return type of all those functions to int64, we won't need this cast.
I change the 'encode' function, it needs an int64 return type, but keep other
functions in 'pg_encoding', because I think it of no necessary reason.

>Ok, let's leave it for a committer to decide.
Well, I change all of them this time, because Tom Lane supports on next mail.


>Some more review comments.
>+ int64 res,resultlen;
Done

>We need those on separate lines, possibly.
>+ byteNo = (int32)(n / BITS_PER_BYTE);
>Does it hurt to have byteNo as int64 so as to avoid a cast. Otherwise, please
>add a comment explaining the reason for the cast. The comment applies at other
>places where this change appears.
>- int len;
>+ int64 len;
>Why do we need this change?
> int i;
It is my mistake as describe above, it should not be 'bitgetbit()/bitsetbit()' to be changed.

>It might help to add a test where we could pass the second argument something
>greater than 1G. But it may be difficult to write such a test case.
Add two test cases.

>+
>+select get_bit(
>+ set_bit((repeat('Postgres', 512 * 1024 * 1024 / 8))::bytea, 1024 * 1024 * 1024 + 1, 0)
>+ ,1024 * 1024 * 1024 + 1);

>This bit position is still within int4.
>postgres=# select pg_column_size(1024 * 1024 * 1024 + 1);
> pg_column_size
>----------------
> 4
>(1 row)

>You want something like
>postgres=# select pg_column_size(512::bigint * 1024 * 1024 * 8);
> pg_column_size
>----------------
> 8
>(1 row)
I intend to test size large then 1G, and now I think you give a better idea and followed.

Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Attachment Content-Type Size
long_bytea_string_bug_fix_ver4.patch application/octet-stream 10.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message i.taranov 2020-03-28 09:00:08 [PATCH] postgresql.conf.sample->postgresql.conf.sample.in
Previous Message Dilip Kumar 2020-03-28 08:49:31 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions