Re: [patch] Proposal for \rotate in psql

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [patch] Proposal for \rotate in psql
Date: 2015-08-29 03:57:07
Message-ID: CAFj8pRA55p7X7_h1MmgjuSz74AF02AvrtVjgQ9H7VVg7crwZUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2015-08-29 0:48 GMT+02:00 Daniel Verite <daniel(at)manitou-mail(dot)org>:

> Hi,
>
> This is a reboot of my previous proposal for pivoting results in psql,
> with a new patch that generalizes the idea further through a command
> now named \rotate, and some examples.
>
> So the concept is: having an existing query in the query buffer,
> the user can specify two column numbers C1 and C2 (by default the 1st
> and 2nd) as an argument to a \rotate command.
>
> The query results are then displayed in a 2D grid such that each tuple
> (vx, vy, va, vb,...) is shown as |va vb...| in a cell at coordinates
> (vx,vy).
> The values vx,xy come from columns C1,C2 respectively and are
> represented in the output as an horizontal and a vertical header.
>
> A cell may hold several columns from several rows, growing horizontally and
> vertically (\n inside the cell) if necessary to show all results.
>
> The examples below should be read with a monospaced font as in psql,
> otherwise they will look pretty bad.
>
> 1. Example with only 2 columns, querying login/group membership from the
> catalog.
> Query:
>
> SELECT r.rolname as username,r1.rolname as groupname
> FROM pg_catalog.pg_roles r LEFT JOIN pg_catalog.pg_auth_members m
> ON (m.member = r.oid)
> LEFT JOIN pg_roles r1 ON (m.roleid=r1.oid)
> WHERE r.rolcanlogin
> ORDER BY 1
>
> Sample results:
> username | groupname
> ------------+-----------
> daniel | mailusers
> drupal |
> dv | admin
> dv | common
> extc | readonly
> extu |
> foobar |
> joel |
> mailreader | readonly
> manitou | mailusers
> manitou | admin
> postgres |
> u1 | common
> u2 | mailusers
> zaz | mailusers
>
>
> Applying \rotate gives:
> Rotated query results
> username | admin | common | mailusers | readonly
> ------------+-------+--------+-----------+----------
> daniel | | | X |
> drupal | | | |
> dv | X | X | |
> extc | | | | X
> extu | | | |
> foobar | | | |
> joel | | | |
> mailreader | | | | X
> manitou | X | | X |
> postgres | | | |
> u1 | | X | |
> u2 | | | X |
> zaz | | | X |
>
> The 'X' inside cells is automatically added as there are only
> 2 columns. If there was a 3rd column, the content of that column would
> be displayed instead (as in the next example).
>
> What's good in that \rotate display compared to the classic output is that
> it's more apparent, visually speaking, that such user belongs or not to
> such
> group or another.
>
> 2. Example with a unicode checkmark added as 3rd column, and
> unicode linestyle and borders (to be seen with a mono-spaced font):
>
> SELECT r.rolname as username,r1.rolname as groupname, chr(10003)
> FROM pg_catalog.pg_roles r LEFT JOIN pg_catalog.pg_auth_members m
> ON (m.member = r.oid)
> LEFT JOIN pg_roles r1 ON (m.roleid=r1.oid)
> WHERE r.rolcanlogin
> ORDER BY 1
>
> Rotated query results
> ┌────────────┬───────┬───�”
> �────┬───────────┬────────â
> ��─┐
> │ username │ admin │ common │ mailusers │ readonly │
> ├────────────┼───────┼───�”
> �────┼───────────┼────────â
> ��─┤
> │ daniel │ │ │ ✓ │ │
> │ drupal │ │ │ │ │
> │ dv │ ✓ │ ✓ │ │ │
> │ extc │ │ │ │ ✓ │
> │ extu │ │ │ │ │
> │ foobar │ │ │ │ │
> │ joel │ │ │ │ │
> │ mailreader │ │ │ │ ✓ │
> │ manitou │ ✓ │ │ ✓ │ │
> │ postgres │ │ │ │ │
> │ u1 │ │ ✓ │ │ │
> │ u2 │ │ │ ✓ │ │
> │ zaz │ │ │ ✓ │ │
> └────────────┴───────┴───�”
> �────┴───────────┴────────â
> ��─┘
>
>
> What I like in that representation is that it looks good enough
> to be pasted directly into a document in a word processor.
>
> 3. It can be rotated easily in the other direction, with:
> \rotate 2 1
>
> (Cut horizontally to fit in a mail, the actual output is 116 chars wide).
>
> Rotated query results
> ┌───────────┬────────┬───�”
> �────┬────┬──────┬──────┬─â
> ��──────┬──────┬────
> │ username │ daniel │ drupal │ dv │ extc │ extu │ foobar │
> joel │ mai...
> ├───────────┼────────┼───�”
> �────┼────┼──────┼──────┼─â
> ��──────┼──────┼────
> │ mailusers │ ✓ │ │ │ │ │ │
> │
> │ admin │ │ │ ✓ │ │ │ │
> │
> │ common │ │ │ ✓ │ │ │ │
> │
> │ readonly │ │ │ │ ✓ │ │ │
> │ ✓
> └───────────┴────────┴───�”
> �────┴────┴──────┴──────┴─â
> ��──────┴──────┴────
>
>
> 4. Example with 3 columns and a count as the value to visualize along
> two axis: date and category.
> I'm using the number of mails posted per month in a few PG mailing lists,
> broken down by list (which are tags in my schema).
>
> Query:
> SELECT date_trunc('month', msg_date)::date as month,
> t.name,
> count(*) as cnt
> FROM mail JOIN mail_tags using(mail_id) JOIN tags t
> on(t.tag_id=mail_tags.tag)
> WHERE t.tag_id in (7,8,12,34,79)
> AND msg_date>='2014-05-01'::date and msg_date<'2015-01-01'::date
> GROUP BY date_trunc('month', msg_date)::date, t.name
> ORDER BY 1,2;
>
> Results:
> month | name | cnt
> ------------+-------------+------
> 2014-05-01 | announce | 19
> 2014-05-01 | general | 550
> 2014-05-01 | hackers | 1914
> 2014-05-01 | interfaces | 4
> 2014-05-01 | performance | 122
> 2014-06-01 | announce | 10
> 2014-06-01 | general | 499
> 2014-06-01 | hackers | 2008
> 2014-06-01 | interfaces | 10
> 2014-06-01 | performance | 137
> 2014-07-01 | announce | 12
> 2014-07-01 | general | 703
> 2014-07-01 | hackers | 1504
> 2014-07-01 | interfaces | 6
> 2014-07-01 | performance | 142
> 2014-08-01 | announce | 9
> 2014-08-01 | general | 616
> 2014-08-01 | hackers | 1864
> 2014-08-01 | interfaces | 11
> 2014-08-01 | performance | 116
> 2014-09-01 | announce | 10
> 2014-09-01 | general | 645
> 2014-09-01 | hackers | 2364
> 2014-09-01 | interfaces | 3
> 2014-09-01 | performance | 105
> 2014-10-01 | announce | 13
> 2014-10-01 | general | 476
> 2014-10-01 | hackers | 2325
> 2014-10-01 | interfaces | 10
> 2014-10-01 | performance | 137
> 2014-11-01 | announce | 10
> 2014-11-01 | general | 457
> 2014-11-01 | hackers | 1810
> 2014-11-01 | performance | 109
> 2014-12-01 | announce | 11
> 2014-12-01 | general | 623
> 2014-12-01 | hackers | 2043
> 2014-12-01 | interfaces | 1
> 2014-12-01 | performance | 71
> (39 rows)
>
> \rotate gives:
> Rotated query results
> month | announce | general | hackers | interfaces | performance
> ------------+----------+---------+---------+------------+-------------
> 2014-05-01 | 19 | 550 | 1914 | 4 | 122
> 2014-06-01 | 10 | 499 | 2008 | 10 | 137
> 2014-07-01 | 12 | 703 | 1504 | 6 | 142
> 2014-08-01 | 9 | 616 | 1864 | 11 | 116
> 2014-09-01 | 10 | 645 | 2364 | 3 | 105
> 2014-10-01 | 13 | 476 | 2325 | 10 | 137
> 2014-11-01 | 10 | 457 | 1810 | | 109
> 2014-12-01 | 11 | 623 | 2043 | 1 | 71
>
> Advantage: we can figure out the trends, and notice empty slots,
> much quicker than with the previous output. It seems smaller
> but there is the same amount of information.
>
>
> 5. Example with an additional column showing if the count grows up or down
> compared to the previous month. This shows how the contents get stacked
> inside cells when they come from several columns and rows.
>
> Query:
>
> SELECT to_char(mon, 'yyyy-mm') as month,
> name,
> CASE when lag(name,1) over(order by name,mon)=name then
> case sign(cnt-(lag(cnt,1) over(order by name,mon)))
> when 1 then chr(8593)
> when 0 then chr(8597)
> when -1 then chr(8595)
> else ' ' end
> END,
> cnt
> from (SELECT date_trunc('month', msg_date)::date as mon, t.name,count(*)
> as
> cnt
> FROM mail JOIN mail_tags using(mail_id) JOIN tags t
> on(t.tag_id=mail_tags.tag)
> WHERE t.tag_id in (7,8,12,34,79)
> AND msg_date>='2014-05-01'::date and msg_date<'2015-01-01'::date
> GROUP BY date_trunc('month', msg_date)::date, t.name) l order by 2,1;
>
> Result:
> month | name | case | cnt
> ---------+-------------+------+------
> 2014-05 | announce | | 19
> 2014-06 | announce | ↓ | 10
> 2014-07 | announce | ↑ | 12
> 2014-08 | announce | ↓ | 9
> 2014-09 | announce | ↑ | 10
> 2014-10 | announce | ↑ | 13
> 2014-11 | announce | ↓ | 10
> 2014-12 | announce | ↑ | 11
> 2014-05 | general | | 550
> 2014-06 | general | ↓ | 499
> 2014-07 | general | ↑ | 703
> 2014-08 | general | ↓ | 616
> 2014-09 | general | ↑ | 645
> 2014-10 | general | ↓ | 476
> 2014-11 | general | ↓ | 457
> 2014-12 | general | ↑ | 623
> 2014-05 | hackers | | 1914
> 2014-06 | hackers | ↑ | 2008
> 2014-07 | hackers | ↓ | 1504
> 2014-08 | hackers | ↑ | 1864
> 2014-09 | hackers | ↑ | 2364
> 2014-10 | hackers | ↓ | 2325
> 2014-11 | hackers | ↓ | 1810
> 2014-12 | hackers | ↑ | 2043
> 2014-05 | interfaces | | 4
> 2014-06 | interfaces | ↑ | 10
> 2014-07 | interfaces | ↓ | 6
> 2014-08 | interfaces | ↑ | 11
> 2014-09 | interfaces | ↓ | 3
> 2014-10 | interfaces | ↑ | 10
> 2014-12 | interfaces | ↓ | 1
> 2014-05 | performance | | 122
> 2014-06 | performance | ↑ | 137
> 2014-07 | performance | ↑ | 142
> 2014-08 | performance | ↓ | 116
> 2014-09 | performance | ↓ | 105
> 2014-10 | performance | ↑ | 137
> 2014-11 | performance | ↓ | 109
> 2014-12 | performance | ↓ | 71
> (39 rows)
>
> \rotate:
>
> Rotated query results
> month | announce | general | hackers | interfaces | performance
> ---------+----------+---------+---------+------------+-------------
> 2014-05 | 19 | 550 | 1914 | 4 | 122
> 2014-06 | ↓ 10 | ↓ 499 | ↑ 2008 | ↑ 10 | ↑ 137
> 2014-07 | ↑ 12 | ↑ 703 | ↓ 1504 | ↓ 6 | ↑ 142
> 2014-08 | ↓ 9 | ↓ 616 | ↑ 1864 | ↑ 11 | ↓ 116
> 2014-09 | ↑ 10 | ↑ 645 | ↑ 2364 | ↓ 3 | ↓ 105
> 2014-10 | ↑ 13 | ↓ 476 | ↓ 2325 | ↑ 10 | ↑ 137
> 2014-11 | ↓ 10 | ↓ 457 | ↓ 1810 | | ↓ 109
> 2014-12 | ↑ 11 | ↑ 623 | ↑ 2043 | ↓ 1 | ↓ 71
> (8 rows)
>
> The output columns 3 and 4 of the same row get projected into the same
> cell, laid out horizontally (separated by space).
>
> 6. Example with the same query but rotated differently so that
> it's split into two columns: the counts that go up from the previous
> and those that go down. I'm also cheating a bit by
> casting name and cnt to char(N) for a better alignment
>
> SELECT to_char(mon, 'yyyy-mm') as month,
> name::char(12),
> CASE when lag(name,1) over(order by name,mon)=name then
> case sign(cnt-(lag(cnt,1) over(order by name,mon)))
> when 1 then chr(8593)
> when 0 then chr(8597)
> when -1 then chr(8595)
> else ' ' end
> END,
> cnt::char(8)
> from (SELECT date_trunc('month', msg_date)::date as mon, t.name,count(*)
> as
> cnt
> FROM mail JOIN mail_tags using(mail_id) JOIN tags t
> on(t.tag_id=mail_tags.tag)
> WHERE t.tag_id in (7,8,12,34,79)
> AND msg_date>='2014-05-01'::date and msg_date<'2015-01-01'::date
> GROUP BY date_trunc('month', msg_date)::date, t.name) l order by 2,1;
>
> \rotate 1 3
>
> +---------+-----------------------+-----------------------+
> | month | ↑ | ↓ |
> +---------+-----------------------+-----------------------+
> | 2014-05 | | |
> | 2014-06 | hackers 2008 +| announce 10 +|
> | | interfaces 10 +| general 499 |
> | | performance 137 | |
> | 2014-07 | announce 12 +| hackers 1504 +|
> | | general 703 +| interfaces 6 |
> | | performance 142 | |
> | 2014-08 | hackers 1864 +| announce 9 +|
> | | interfaces 11 | general 616 +|
> | | | performance 116 |
> | 2014-09 | announce 10 +| interfaces 3 +|
> | | general 645 +| performance 105 |
> | | hackers 2364 | |
> | 2014-10 | announce 13 +| general 476 +|
> | | interfaces 10 +| hackers 2325 |
> | | performance 137 | |
> | 2014-11 | | announce 10 +|
> | | | general 457 +|
> | | | hackers 1810 +|
> | | | performance 109 |
> | 2014-12 | announce 11 +| interfaces 1 +|
> | | general 623 +| performance 71 |
> | | hackers 2043 | |
> +---------+-----------------------+-----------------------+
>
> As there are several rows that match the vertical/horizontal filter,
> (for example 3 results for 2014-06 as row and "arrow up" as column),
> they are stacked vertically inside the cell, in addition to
> "name" and "cnt" being shown side by side horizontally.
>
> Note that no number show up for 2014-05; this is because they're not
> associated with arrow up or down; empty as a column is discarded.
> Maybe it shouldn't. In this case, the numbers for 2014-05 would be in a
> column with an empty name.
>
>
> Conclusion, the point of \rotate:
>
> When analyzing query results, these rotated representations may be
> useful or not depending on the cases, but the point is that they require
> no effort to be obtained through \rotate X Y
> It's so easy to play with various combinations to see if the result
> makes sense, and if it reveals something about the data.
> (it still reexecutes the query each time, tough).
>
> We can get more or less the same results with crosstab/pivot, as it's the
> same basic concept, but with much more effort spent on getting the SQL
> right,
> plus the fact that columns not known in advance cannot be returned pivoted
> in a single pass in SQL, a severe complication that the client-side doesn't
> have.
>

simple and user friendy

nice

+1

Pavel

>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2015-08-29 04:37:09 Re: proposal: multiple psql option -c
Previous Message Thomas Munro 2015-08-29 02:55:08 Re: NOTIFY in Background Worker