I encountered an error while importing the sample data from https://datasets.clickhouse.com/hits/tsv/hits_v1.tsv.xz
It appears to be a parse error, and I don't understand it: in Row 4531, Column 7 is parsed as "0<TAB>??:?[?<0x03>U" rather than "0". Why is the '\t' parsed as part of the value instead of being treated as the field separator?
First, create the table:
CREATE TABLE hits_NoPrimaryKey
(
    `UserID` UInt32,
    `URL` String,
    `EventTime` DateTime
)
ENGINE = MergeTree
PRIMARY KEY tuple();
Second, import the data:
INSERT INTO hits_NoPrimaryKey SELECT
    intHash32(c11::UInt64) AS UserID,
    c15 AS URL,
    c5 AS EventTime
FROM url('https://datasets.clickhouse.com/hits/tsv/hits_v1.tsv.xz')
WHERE URL != '';
↓ Progress: 4.37 million rows, 8.28 GB (18.25 thousand rows/s., 34.55 MB/s.) (0.0 CPU, 121.22 MB RAM)
0 rows in set. Elapsed: 239.650 sec. Processed 4.37 million rows, 8.28 GB (18.25 thousand rows/s., 34.55 MB/s.)
Received exception from server (version 22.10.1):
Code: 27. DB::Exception: Received from localhost:9000. DB::ParsingException. DB::ParsingException: Cannot parse input: expected '\t' before: 'c??m???\t115\t2668037917139250981\t0\t227\t105\thttp://yandsearch[filter=user_page=http://book-nika/nyurttunian/haberlandsearch&text=all&user_page-148564b4080_1280x1':
Row 4530:
Column 0, name: c1, type: Nullable(DateTime64(9)), parsed text: "7111657564365305139"
Column 1, name: c2, type: Nullable(Int64), parsed text: "1"
Column 2, name: c3, type: Nullable(String), parsed text: <EMPTY>
Column 3, name: c4, type: Nullable(Int64), parsed text: "1"
Column 4, name: c5, type: Nullable(DateTime64(9)), parsed text: "2014-03-17 20:56:03"
Column 5, name: c6, type: Nullable(Date), parsed text: "2014-03-17"
Column 6, name: c7, type: Nullable(String), parsed text: "31440846"
Column 7, name: c8, type: Nullable(DateTime64(9)), parsed text: "3653375523"
Column 8, name: c9, type: Nullable(String), parsed text: "??:?[?<0x03>Uc??m???<0x1A>"
Column 9, name: c10, type: Nullable(String), parsed text: "42"
Column 10, name: c11, type: Nullable(String), parsed text: "2668037917139250981"
Column 11, name: c12, type: Nullable(Int64), parsed text: "0"
Column 12, name: c13, type: Nullable(Int64), parsed text: "227"
Column 13, name: c14, type: Nullable(Int64), parsed text: "105"
Column 14, name: c15, type: Nullable(String), parsed text: "http://yandsearch[filter=user_page=http://book-nika/nyurttunian/haberlandsearch&text=all&user_page-148564b4080_1280x120"
Column 15, name: c16, type: Nullable(String), parsed text: <EMPTY>
Column 16, name: c17, type: Nullable(String), parsed text: "yandex.ru.msn"
Column 17, name: c18, type: Nullable(String), parsed text: <EMPTY>
Column 18, name: c19, type: Nullable(Int64), parsed text: "0"
Column 19, name: c20, type: Nullable(Int64), parsed text: "0"
Column 20, name: c21, type: Array(Nullable(Int64)), parsed text: "[]"
Column 21, name: c22, type: Array(Nullable(Int64)), parsed text: "[]"
Column 22, name: c23, type: Array(Nullable(Int64)), parsed text: "[239]"
Column 23, name: c24, type: Array(Nullable(Int64)), parsed text: "[]"
Column 24, name: c25, type: Nullable(Int64), parsed text: "355"
Column 25, name: c26, type: Nullable(Int64), parsed text: "514"
Column 26, name: c27, type: Nullable(Int64), parsed text: "57"
Column 27, name: c28, type: Nullable(Int64), parsed text: "0"
Column 28, name: c29, type: Nullable(Int64), parsed text: "0"
Column 29, name: c30, type: Nullable(Float64), parsed text: <EMPTY>
Column 30, name: c31, type: Nullable(Int64), parsed text: "0"
Column 31, name: c32, type: Nullable(Int64), parsed text: "0"
Column 32, name: c33, type: Nullable(Int64), parsed text: "44"
Column 33, name: c34, type: Nullable(String), parsed text: "s?"
Column 34, name: c35, type: Nullable(Int64), parsed text: "1"
Column 35, name: c36, type: Nullable(Int64), parsed text: "1"
Column 36, name: c37, type: Nullable(Int64), parsed text: "1"
Column 37, name: c38, type: Nullable(Int64), parsed text: "0"
Column 38, name: c39, type: Nullable(String), parsed text: <EMPTY>
Column 39, name: c40, type: Nullable(String), parsed text: <EMPTY>
Column 40, name: c41, type: Nullable(String), parsed text: "2023156"
Column 41, name: c42, type: Nullable(Int64), parsed text: "0"
Column 42, name: c43, type: Nullable(Int64), parsed text: "0"
Column 43, name: c44, type: Nullable(String), parsed text: <EMPTY>
Column 44, name: c45, type: Nullable(Int64), parsed text: "0"
Column 45, name: c46, type: Nullable(Int64), parsed text: "1"
Column 46, name: c47, type: Nullable(Int64), parsed text: "436"
Column 47, name: c48, type: Nullable(Int64), parsed text: "1002"
Column 48, name: c49, type: Nullable(Int64), parsed text: "296"
Column 49, name: c50, type: Nullable(DateTime64(9)), parsed text: "2014-03-17 07:47:03"
Column 50, name: c51, type: Nullable(Int64), parsed text: "0"
Column 51, name: c52, type: Nullable(Int64), parsed text: "0"
Column 52, name: c53, type: Nullable(String), parsed text: "0"
Column 53, name: c54, type: Nullable(Int64), parsed text: "0"
Column 54, name: c55, type: Nullable(String), parsed text: "utf-8"
Column 55, name: c56, type: Nullable(Int64), parsed text: "315"
Column 56, name: c57, type: Nullable(Int64), parsed text: "0"
Column 57, name: c58, type: Nullable(Int64), parsed text: "0"
Column 58, name: c59, type: Nullable(Int64), parsed text: "1"
Column 59, name: c60, type: Nullable(String), parsed text: "0"
Column 60, name: c61, type: Nullable(String), parsed text: "559851309"
Column 61, name: c62, type: Nullable(Int64), parsed text: "0"
Column 62, name: c63, type: Nullable(Int64), parsed text: "0"
Column 63, name: c64, type: Nullable(Int64), parsed text: "0"
Column 64, name: c65, type: Nullable(Int64), parsed text: "1"
Column 65, name: c66, type: Nullable(Int64), parsed text: "0"
Column 66, name: c67, type: Nullable(String), parsed text: "E"
Column 67, name: c68, type: Nullable(DateTime64(9)), parsed text: "2014-03-17 07:06:29"
Column 68, name: c69, type: Nullable(Int64), parsed text: "55"
Column 69, name: c70, type: Nullable(Int64), parsed text: "1"
Column 70, name: c71, type: Nullable(Int64), parsed text: "3"
Column 71, name: c72, type: Nullable(String), parsed text: "0"
Column 72, name: c73, type: Nullable(Int64), parsed text: "0"
Column 73, name: c74, type: Array(Nullable(Int64)), parsed text: "[72,14]"
Column 74, name: c75, type: Nullable(DateTime64(9)), parsed text: "2044414662"
Column 75, name: c76, type: Nullable(String), parsed text: "??<0x17>?_Y???<0x06>?5Ӵ."
Column 76, name: c77, type: Nullable(String), parsed text: "54527"
Column 77, name: c78, type: Nullable(Int64), parsed text: "-1"
Column 78, name: c79, type: Nullable(Int64), parsed text: "1"
Column 79, name: c80, type: Nullable(String), parsed text: "nD"
Column 80, name: c81, type: Nullable(String), parsed text: "??"
Column 81, name: c82, type: Nullable(String), parsed text: <EMPTY>
Column 82, name: c83, type: Nullable(String), parsed text: <EMPTY>
Column 83, name: c84, type: Nullable(Int64), parsed text: "0"
Column 84, name: c85, type: Nullable(String), parsed text: "1978"
Column 85, name: c86, type: Nullable(Int64), parsed text: "-1"
Column 86, name: c87, type: Nullable(String), parsed text: "-1"
Column 87, name: c88, type: Nullable(String), parsed text: "-1"
Column 88, name: c89, type: Nullable(String), parsed text: "-1"
Column 89, name: c90, type: Nullable(String), parsed text: "-1"
Column 90, name: c91, type: Nullable(String), parsed text: "-1"
Column 91, name: c92, type: Nullable(String), parsed text: "-1"
Column 92, name: c93, type: Nullable(String), parsed text: "-1"
Column 93, name: c94, type: Nullable(String), parsed text: "2852"
Column 94, name: c95, type: Nullable(String), parsed text: "3597"
Column 95, name: c96, type: Nullable(Int64), parsed text: "15"
Column 96, name: c97, type: Nullable(String), parsed text: "-1"
Column 97, name: c98, type: Nullable(String), parsed text: "3888"
Column 98, name: c99, type: Nullable(Int64), parsed text: "-1"
Column 99, name: c100, type: Nullable(Int64), parsed text: "0"
Column 100, name: c101, type: Nullable(String), parsed text: <EMPTY>
Column 101, name: c102, type: Nullable(Int64), parsed text: "0"
Column 102, name: c103, type: Nullable(String), parsed text: <EMPTY>
Column 103, name: c104, type: Nullable(String), parsed text: "<0x07>?<0x1F>"
Column 104, name: c105, type: Nullable(Int64), parsed text: "0"
Column 105, name: c106, type: Array(Nullable(Int64)), parsed text: "[]"
Column 106, name: c107, type: Nullable(String), parsed text: <EMPTY>
Column 107, name: c108, type: Nullable(String), parsed text: <EMPTY>
Column 108, name: c109, type: Nullable(String), parsed text: <EMPTY>
Column 109, name: c110, type: Nullable(String), parsed text: <EMPTY>
Column 110, name: c111, type: Nullable(String), parsed text: <EMPTY>
Column 111, name: c112, type: Nullable(String), parsed text: <EMPTY>
Column 112, name: c113, type: Nullable(String), parsed text: <EMPTY>
Column 113, name: c114, type: Nullable(String), parsed text: <EMPTY>
Column 114, name: c115, type: Nullable(String), parsed text: <EMPTY>
Column 115, name: c116, type: Nullable(String), parsed text: <EMPTY>
Column 116, name: c117, type: Nullable(Int64), parsed text: "0"
Column 117, name: c118, type: Nullable(String), parsed text: "15284527577228392792"
Column 118, name: c119, type: Nullable(String), parsed text: "1303689622826169012"
Column 119, name: c120, type: Nullable(String), parsed text: "0"
Column 120, name: c121, type: Nullable(String), parsed text: "0"
Column 121, name: c122, type: Nullable(String), parsed text: <EMPTY>
Column 122, name: c123, type: Nullable(String), parsed text: <EMPTY>
Column 123, name: c124, type: Nullable(String), parsed text: <EMPTY>
Column 124, name: c125, type: Array(Nullable(String)), parsed text: "[]"
Column 125, name: c126, type: Array(Nullable(String)), parsed text: "[]"
Column 126, name: c127, type: Array(Nullable(String)), parsed text: "[]"
Column 127, name: c128, type: Array(Nullable(String)), parsed text: "[]"
Column 128, name: c129, type: Array(Nullable(String)), parsed text: "[]"
Column 129, name: c130, type: Array(Nullable(Float64)), parsed text: "[]"
Column 130, name: c131, type: Nullable(String), parsed text: "???+???<0x19>?<0x04>??bKQ9"
Column 131, name: c132, type: Nullable(String), parsed text: "6"
Column 132, name: c133, type: Nullable(Int64), parsed text: "1"
Row 4531:
Column 0, name: c1, type: Nullable(DateTime64(9)), parsed text: "8484166349348046735"
Column 1, name: c2, type: Nullable(Int64), parsed text: "1"
Column 2, name: c3, type: Nullable(String), parsed text: "Почта Mail.ru - Почта Mail.Ru | Spor,Magazin,Haberler, Oyun, Video moda.ru"
Column 3, name: c4, type: Nullable(Int64), parsed text: "1"
Column 4, name: c5, type: Nullable(DateTime64(9)), parsed text: "2014-03-17 21:33:13"
Column 5, name: c6, type: Nullable(Date), parsed text: "2014-03-17"
Column 6, name: c7, type: Nullable(String), parsed text: "31440846"
Column 7, name: c8, type: Nullable(DateTime64(9)), parsed text: "0<TAB>??:?[?<0x03>U"
ERROR: garbage after Nullable(DateTime64(9)): "c??m???<0x1A><TAB>1"
: While executing ParallelParsingBlockInputFormat: While executing URL: (in file/uri https://datasets.clickhouse.com/hits/tsv/hits_v1.tsv.xz): (at row 4377757)
. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)
Does anyone know why the exception occurred?
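In case it helps with reproducing this: the column types in the log (for example c8 as Nullable(DateTime64(9))) are not declared anywhere in my query, so they presumably come from automatic schema inference on the file. Below is a minimal sketch of how the inferred types could be inspected and compared against an overridden schema. It assumes the `schema_inference_hints` setting is available in this server version, and the hinted types (UInt32 for c8, String for c9) are only my guesses based on the values shown in Row 4530, not a confirmed schema for this dataset.

```sql
-- Show the column types that schema inference picks for the file.
-- In the log above, c8 was reported as Nullable(DateTime64(9)).
DESCRIBE TABLE url('https://datasets.clickhouse.com/hits/tsv/hits_v1.tsv.xz');

-- Same inspection, but with type hints for the two columns around the failure.
-- Assumption: schema_inference_hints is supported by this server version.
-- Assumption: the hinted types are guesses from the sample values in Row 4530.
DESCRIBE TABLE url('https://datasets.clickhouse.com/hits/tsv/hits_v1.tsv.xz')
SETTINGS schema_inference_hints = 'c8 UInt32, c9 String';
```

If the second statement reports the hinted types for c8 and c9, the same SETTINGS clause could be appended to the INSERT ... SELECT above to check whether the error at Row 4531 still occurs. I have not verified that this resolves the problem.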