Re: pgbench - allow to store select results into variables

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench - allow to store select results into variables
Date: 2016-09-12 07:33:39
Message-ID: de93db2a-539e-4044-bca8-47f2afa01ce1@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Fabien,

On 2016/09/07 23:01, Fabien COELHO wrote:
>> Custom script looks like:
>>
>> \;
>> select a \into a
>> from tab where a = 1;
>> \set i debug(:a)
>>
>> I get the following error:
>>
>> undefined variable "a"
>> client 0 aborted in state 1; execution of meta-command failed
>
> Good catch!
>
> Indeed, there is a problem with empty commands which are simply ignored by
> libpq/postgres if there are other commands around, so my synchronization
> between results & commands was too naive.
>
> In order to fix this, I made the scanner also count empty commands and
> ignore comments. I guessed that proposing to change libpq/postgres
> behavior was not an option.

Seems to be fixed.

>> Comments on the patch follow:
>>
>> + <listitem>
>> + <para>
>> + Stores the first fields of the resulting row from the current or
>> preceding
>> + <command>SELECT</> command into these variables.
>>
>> Any command returning rows ought to work, no?
>
> Yes. I put "SQL command" instead.

Check.

>> - char *line; /* text of command line */
>> + char *line; /* first line for short display */
>> + char *lines; /* full multi-line text of command */
>>
>> When I looked at this and related hunks (and also the hunks related to
>> semicolons), it made me wonder whether this patch contains two logical
>> changes. Is this a just a refactoring for the \into implementation or
>> does this provide some new externally visible feature?
>
> There is essentially a refactoring that I did when updating how Command is
> managed because it has to be done in several stages to fit "into" into it
> and to take care of compounds.
>
> However there was small "new externally visible feature": on -r, instead
> of cutting abruptly a multiline command at the end of the first line it
> appended "..." as an ellipsis because it looked nicer.
> I've removed this small visible changed.

There still seems to be a change in behavior of the -r option due to the
patch. Consider the following example:

# no \into used in the script
$ cat /tmp/into.sql
select a from a where a = 1 \;
select a+1 from a where a = 1;
\set a 1
\set b 2
\set i debug(:a)
\set i debug(:b)

$ pgbench -r -n -t 1 -f /tmp/into.sql postgres
<snip>
- statement latencies in milliseconds:
2.889 select a from a where a = 1 ;
0.012 \set a 1
0.009 \set b 2
0.031 \set i debug(:a)
0.014 \set i debug(:b)

# behavior wrt compound statement changes when \into is used
$ cat /tmp/into.sql
select a from a where a = 1 \; \into a
select a+1 from a where a = 1; \into b
\set i debug(:a)
\set i debug(:b)

$ pgbench -r -n -t 1 -f /tmp/into.sql postgres
<snip>
- statement latencies in milliseconds:
2.093 select a from a where a = 1 ; select a+1 from a where a = 1;
0.034 \set i debug(:a)
0.015 \set i debug(:b)

One more thing I observed which I am not sure if it's a fault of this
patch is illustrated below:

$ cat /tmp/into.sql
\;
select a from a where a = 1 \;
select a+1 from a where a = 1;
\set a 1
\set b 2
\set i debug(:a)
\set i debug(:b)

$ pgbench -r -n -t 1 -f /tmp/into.sql postgres
<snip>
- statement latencies in milliseconds:
2.349 ;
0.013 \set a 1
0.009 \set b 2
0.029 \set i debug(:a)
0.015 \set i debug(:b)

Note that the compound select statement is nowhere to be seen in the
latencies output. The output remains the same even if I use the \into's.
What seems to be going on is that the empty statement on the first line
(\;) is the only part kept of the compound statement spanning lines 1-3.

>> Also I wonder if intonames or intoargs would be a slightly better name
>> for the field.
>
> I put "intovars" as they are variable names.

Sounds fine.

>> + fprintf(stderr,
>> + "client %d state %d compound %d: "
>> + "cannot apply \\into to non SELECT statement\n",
>> + st->id, st->state, compound);
>>
>> How about make this error message say something like other messages
>> related to \into, perhaps something like: "\\into cannot follow non SELECT
>> statements\n"
>
> As you pointed out above, there may be statements without "SELECT" which
> which return a row. I wrote "\\into expects a row" instead.

Sounds fine.

>
>> /*
>> * Read and discard the query result; note this is not
>> included in
>> - * the statement latency numbers.
>> + * the statement latency numbers (above), thus if reading the
>> + * response fails the transaction is counted nevertheless.
>> */
>>
>> Does this comment need to mention that the result is not discarded when
>> \into is specified?
>
> Hmmm. The result structure is discarded, but the results are copied into
> variables before that, so the comments seems ok...

Hmm, OK.

>> + bool sql_command_in_progress = false;
>>
>> This variable's name could be slightly confusing to readers. If I
>> understand its purpose correctly, perhaps it could be called
>> sql_command_continues.
>
> It is possible. I like 'in progress' though. Why is it confusing?
> It means that the current command is not finished yet and more is
> expected, that is the final ';' has not been encountered.

I understood that it refers to what you explain here. But to me it
sounded like the name is referring to the progress of *execution* of a SQL
command whereas the code in question is simply expecting to finish
*parsing* the SQL command using the next lines. It may be fine though.

>> + if (index == 0)
>> + syntax_error(desc, lineno, NULL, NULL,
>> + "\\into cannot start a script",
>> + NULL, -1);
>>
>> How about: "\\into cannot be at the beginning of a script" ?
>
> It is possible, but it's quite longer... I'm not a native speaker, so I'm
> do not know whether it would be better.

Me neither, let's leave it for the committer to decide.

> The attached patch takes into all your comments but:
> - comment about discarded results...
> - the sql_command_in_progress variable name change
> - the longer message on into at the start of a script

The patch seems fine without these, although please consider the concern I
raised with regard to the -r option above.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2016-09-12 07:59:27 Re: Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn
Previous Message Michael Paquier 2016-09-12 07:28:41 Re: Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn