Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-09-28 05:19:06
Message-ID: CAFj8pRADh_R8bFixo6Eivnd_Dyw=JsUL40V-56XWE2Vmb2+wQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2012/9/28 Amit Kapila <amit(dot)kapila(at)huawei(dot)com>:
>> On Friday, September 28, 2012 2:28 AM Pavel Stehule wrote:
>> Hello
>>
>> I reduced this patch as just plpgsql (SPI) problem solution.
>>
>> Correct generic solution needs a discussion about enhancing our libpq
>> interface - we can take a number of returned rows, but we should to get
>> number of processed rows too. A users should to parse this number from
>> completationTag, but this problem is not too hot and usually is not
>> blocker, because anybody can process completationTag usually.
>>
>> So this issue is primary for PL/pgSQL, where this information is not
>> possible. There is precedent - CREATE AS SELECT (thanks for sample :)),
>> and COPY should to have same impact on ROW_COUNT like this statement
>> (last patch implements it).
>
> IMO now this patch is okay. I have marked it as Ready For Committer.

Thank you very much for review

Regards

Pavel

>
> With Regards,
> Amit Kapila.
>
>
>>
>>
>> 2012/9/21 Amit Kapila <amit(dot)kapila(at)huawei(dot)com>:
>> > On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:
>> >
>> >>> Basic stuff:
>> >>> ------------
>> >>> - Patch applies OK. but offset difference in line numbers.
>> >>> - Compiles with errors in contrib [pg_stat_statements, sepgsql]
>> >>> modules
>> >>> - Regression failed; one test-case in COPY due to incomplete
>> >>> test-case attached patch. – same as reported by Heikki
>> >>
>> >>fixed patch is in attachment
>> >
>> > After modifications:
>> > ---------------------------
>> > - Patch applies OK
>> > - Compiles cleanly without any errors/warnings
>> > - Regression tests pass.
>> >
>> >>>
>> >>> What it does:
>> >>> --------------
>> >>> Modification to get the number of processed rows evaluated via SPI.
>> >>> The changes are to add extra parameter in ProcessUtility to get the
>> >>> number of rows processed by COPY command.
>> >>>
>> >>> Code Review Comments:
>> >>> ---------------------
>> >>> 1. New parameter is added to ProcessUtility_hook_type function
>> >>> but the functions which get assigned to these functions like
>> >>> sepgsql_utility_command, pgss_ProcessUtility, prototype &
>> >>> definition is not modified.
>> >
>> > Functionality is not fixed correctly for hook functions, In function
>> > pgss_ProcessUtility for bellow snippet of code processed parameter is
>> passed NULL, as well as not initialized.
>> > because of this when "pg_stat_statements" extention is utilized COPY
>> command is giving garbage values.
>> > if (prev_ProcessUtility)
>> > prev_ProcessUtility(parsetree, queryString, params,
>> > dest,
>> completionTag, context, NULL);
>> > else
>> > standard_ProcessUtility(parsetree, queryString,
>> params,
>> > dest,
>> > completionTag, context, NULL);
>> >
>> > Testcase is attached.
>> > In this testcase table has only 1000 records but it show garbage
>> value.
>> > postgres=# show shared_preload_libraries ;
>> > shared_preload_libraries
>> > --------------------------
>> > pg_stat_statements
>> > (1 row)
>> > postgres=# CREATE TABLE tbl (a int);
>> > CREATE TABLE
>> > postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
>> > INSERT 0 1000
>> > postgres=# do $$
>> > declare r int;
>> > begin
>> > copy tbl to '/home/kiran/copytest.csv' csv;
>> > get diagnostics r = row_count;
>> > raise notice 'exported % rows', r;
>> > truncate tbl;
>> > copy tbl from '/home/kiran/copytest.csv' csv;
>> > get diagnostics r = row_count;
>> > raise notice 'imported % rows', r;
>> > end;
>> > $$ language plpgsql;
>> > postgres$#
>> > NOTICE: exported 13281616 rows
>> > NOTICE: imported 13281616 rows
>> > DO
>> >
>> >>>
>> >>> 2. Why to add the new parameter if completionTag hold the number of
>> >>> processed tuple information; can be extracted
>> >>>
>> >>> from it as follows:
>> >>> _SPI_current->processed = strtoul(completionTag
>> >>> + 7, NULL, 10);
>> >>
>> >>this is basic question. I prefer a natural type for counter - uint64
>> >>instead text. And there are no simply way to get offset (7 in this
>> >>case)
>> >
>> > I agree with your point, but currently in few other places we are
>> > parsing the completion tag for getting number of tuples processed. So
>> > may be in future we can change those places as well. For example
>>
>> yes, this step can be done in future - together with libpq enhancing.
>> Is not adequate change API (and break lot of extensions) for this
>> isolated problem. So I propose fix plpgsql issue, and add to ToDo -
>> "enhance libpq to return a number of processed rows as number" - but
>> this change breaking compatibility.
>>
>> Pavel
>>
>> >
>> > pgss_ProcessUtility
>> > {
>> > ..
>> >
>> > /* parse command tag to retrieve the number of affected rows. */ if
>> > (completionTag &&
>> > sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
>> > rows = 0;
>> >
>> > }
>> >
>> > _SPI_execute_plan
>> > {
>> > ..
>> > ..
>> > if (IsA(stmt, CreateTableAsStmt))
>> > {
>> > Assert(strncmp(completionTag, "SELECT ", 7) == 0);
>> > _SPI_current->processed = strtoul(completionTag + 7,
>> >
>> > NULL, 10);
>> >
>> > ..
>> > }
>> >
>> >
>> > With Regards,
>> > Amit Kapila.
>> >
>> >
>> >
>> > --
>> > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To
>> > make changes to your subscription:
>> > http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2012-09-28 10:22:12 Re: data to json enhancements
Previous Message Amit Kapila 2012-09-28 04:47:21 Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]