Re: Patch: Improve Boolean Predicate JSON Path Docs

From: David E(dot) Wheeler <david(at)justatheory(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Erik Wienhold <ewie(at)ewie(dot)name>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Patch: Improve Boolean Predicate JSON Path Docs
Date: 2024-01-21 19:02:12
Message-ID: 7358276C-45EC-42C2-84AD-4525FF7C66FF@justatheory.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jan 20, 2024, at 11:45, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> You sure about that? It would surprise me if we could effectively use
> a not-equal condition with an index. If it is only == that works,
> then the preceding statement seems sufficient.

I’m not! I just assumed it in the same way creating an SQL = operator automatically respects NOT syntax (or so I recall). In fiddling a bit, I can’t get it to use an index:

CREATE TABLE MOVIES (id SERIAL PRIMARY KEY, movie JSONB NOT NULL);
\copy movies(movie) from PROGRAM 'curl -s https://raw.githubusercontent.com/prust/wikipedia-movie-data/master/movies.json | jq -c ".[]" | sed "s|\\\\|\\\\\\\\|g"';
create index on movies using gin (movie);
analyze movies;

david=# explain analyze select id from movies where movie @? '$ ?(@.genre[*] != "Teen")';
QUERY PLAN -----------------------------------------------------------------------------------------------------
Seq Scan on movies (cost=0.00..3741.41 rows=4 width=4) (actual time=19.222..19.223 rows=0 loops=1)
Filter: (movie @? '$?(@."genre"[*] != "Teen")'::jsonpath)
Rows Removed by Filter: 36273
Planning Time: 1.242 ms
Execution Time: 19.247 ms
(5 rows)

But that might be because the planner knows that the query is going to fetch most records, anyway. If I set most records to a single value:

david=# update movies set movie = jsonb_set(movie, '{year}', '2020'::jsonb) where id < 3600;
UPDATE 3599
david=# analyze movies;
ANALYZE
david=# explain analyze select id from movies where movie @? '$ ?(@.year != 2020)';
QUERY PLAN ------------------------------------------------------------------------------------------------------------
Seq Scan on movies (cost=0.00..3884.41 rows=32609 width=4) (actual time=0.065..43.730 rows=32399 loops=1)
Filter: (movie @? '$?(@."year" != 2020)'::jsonpath)
Rows Removed by Filter: 3874
Planning Time: 1.759 ms
Execution Time: 45.368 ms
(5 rows)

Looks like it still doesn’t use the index with !=. Pity.

Best,

David

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2024-01-21 19:07:15 Re: [17] CREATE COLLATION default provider
Previous Message Jeff Davis 2024-01-21 18:57:57 Re: Collation version tracking for macOS