-
Notifications
You must be signed in to change notification settings - Fork 617
Description
Describe the issue
When executing a SELECT DISTINCT ON (some_field) query via the ClickHouse JDBC driver version 0.9.6, the driver's response can be misinterpreted as an UPDATE operation. This manifests as the application receiving an indication that an update was performed, even though the query is a SELECT. The issue is reproducible specifically when the query uses * to select all columns or when any selected column is given an alias.
For example:
SELECT DISTINCT ON (id) * FROM tableSELECT DISTINCT ON (id) id AS user_id, name FROM table
Such queries cause the JDBC driver to behave as if they were update statements, leading to confusion in applications that check update counts to distinguish between queries and updates.
Steps to Reproduce
- Set up a ClickHouse table with sample data containing duplicate keys.
CREATE TABLE test (id UInt32, value String, ts DateTime) ENGINE = MergeTree() ORDER BY id;
INSERT INTO test VALUES (1, 'a', now()), (1, 'b', now() + 1), (2, 'c', now());- Use the ClickHouse JDBC driver v0.9.6 to execute a
SELECT DISTINCT ONquery that includes either*or an aliased column.
Case 1: Using *
(ns clickhouse-distinct-on-issue
(:import (java.sql DriverManager Statement)))
(def db-url "jdbc:ch://localhost:8123/default")
(def username "default")
(def password "")
(defn reproduce-issue []
(with-open [conn (DriverManager/getConnection db-url username password)
stmt (.createStatement conn)]
(let [sql "SELECT DISTINCT ON (id) * FROM test ORDER BY id, ts DESC"
has-result-set? (.execute stmt sql)
update-count (.getUpdateCount stmt)]
(println "has-result-set?" has-result-set?) ; => true (still true, but update-count is 0)
(println "update-count:" update-count) ; => 0 (should be -1 for a SELECT)
(when has-result-set?
(let [rs (.getResultSet stmt)]
;; Attempt to read result set – but application may be confused by update-count
(while (.next rs)
(println (format "id: %d, value: %s, ts: %s"
(.getInt rs "id")
(.getString rs "value")
(.getTimestamp rs "ts")))))))))Case 2: Using an alias
(let [sql "SELECT DISTINCT ON (id) id AS user_id, value, ts FROM test ORDER BY id, ts DESC"
has-result-set? (.execute stmt sql)
update-count (.getUpdateCount stmt)]
;; same inspection as above
)
// Same execution pattern as above- Observe that
has-result-set?is true (so the driver does return a ResultSet), but.getUpdateCountreturns 0 instead of -1. This violates the JDBC specification, which states thatgetUpdateCount()should return-1when the current result is a ResultSet.
Expected behavior
- For any
SELECTquery (including those withDISTINCT ON,*, or aliases), Statement.execute() should return true. Statement.getUpdateCount()should return-1until a different result type (e.g., an update count) is available.- The driver should not conflate a
SELECTstatement with anUPDATEoperation.
Actual behavior
When DISTINCT ON is combined with * or aliased columns, the driver appears to internally treat the response as having an associated update count (0). This likely stems from a misinterpretation of the server's response packet or metadata for these specific query patterns.
Workaround observed
- If the query selects columns without aliases and avoids
*(e.g.,SELECT DISTINCT ON (id) id, value, ts), the driver behaves correctly (ResultSet returned, updateCount = -1). - This suggests the issue is tied to how the driver parses column names or handles the server's response when the column list is not a simple list of identifiers.
Versions
ClickHouse JDBC Driver: 0.9.6
ClickHouse Server: [e.g., 25.8.14.17]
Java: [e.g., OpenJDK 22.0.2]
Clojure version: [e.g., 1.12.3]
Additional context
The problem does not occur with plain SELECT statements or DISTINCT ON without */aliases.