Confusing order by error

From: "" <kbrannen(at)pwhome(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Confusing order by error
Date: 2017-03-31 19:39:50
Message-ID: 20170331123950.FE8343B3@m0087793.ppops.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I'm hoping someone can give us a little help understanding an error in the ORDER BY clause, because when I read https://www.postgresql.org/docs/9.5/static/sql-select.html#SQL-ORDERBY I just don't see anything that explains the behavior.

This is with Pg-9.5.1 on Centos (not that I think the OS matters here).

Consider this table and data, stripped down example of real code:

CREATE TABLE IF NOT EXISTS test_table (
pk INTEGER PRIMARY KEY,
id INTEGER NOT NULL,
name TEXT NOT NULL,
ref INTEGER REFERENCES test_table
);

INSERT INTO test_table
( pk, id, name, ref )
VALUES
( 1, 1000, 'fred', null ),
( 2, 2000, 'barney', 1 ),
( 3, 3000, 'betty', 2 ),
( 4, 4000, 'wilma', 1 )
ON CONFLICT DO NOTHING;

select * from test_table;

pk | id | name | ref
----+------+--------+-----
1 | 1000 | fred |
2 | 2000 | barney | 1
3 | 3000 | betty | 2
4 | 4000 | wilma | 1
(4 rows)

So far so good, but when we try to use the data in a more meaningful way:

SELECT t1.pk, t1.name, t1.ref, CONCAT( t2.id , ':', t2.name ) AS ref_display
FROM test_table as t1
LEFT JOIN test_table as t2 ON t1.ref = t2.pk
ORDER BY name;

pk | name | ref | ref_display
----+--------+-----+-------------
2 | barney | 1 | 1000:fred
3 | betty | 2 | 2000:barney
1 | fred | | :
4 | wilma | 1 | 1000:fred
(4 rows)

That looks reasonable ... but if we change the ORDER BY clause to normalize should the name be mixed case:

SELECT t1.pk, t1.name, t1.ref, CONCAT( t2.id , ':', t2.name ) AS ref_display
FROM test_table as t1
LEFT JOIN test_table as t2 ON t1.ref = t2.pk
ORDER BY UPPER(name);

ERROR: column reference "name" is ambiguous
LINE 4: ORDER BY UPPER(name);
^

Eh? The parser (or whatever phase) understood "ORDER BY name" in the first query, so why did that UPPER() string function make a difference in the second query?

I can almost make sense of it in that when the result tuples are created as it works, there are 2 name fields present: t1.name & t2.name. In the first example they should have the same value but in the second they'd potentially have different values (1 raw and 1 up-cased). But that also doesn't really make sense either as I'd think the first query should have the same issue. I'd think (obviously incorrectly :) that we'd get either both working or both failing, not 1 works while the other fails.

So what's going on here?

Thanks,
Kevin

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Brian Dunavant 2017-03-31 19:49:22 Re: Confusing order by error
Previous Message Magnus Hagander 2017-03-31 17:57:32 Re: REFERENCES privilege should not be symmetric (was Re: [GENERAL] Postgres Permissions Article)