ClickHouse: CREATE DATABASE ON CLUSTER ends with timeout

I have a cluster consisting of two ClickHouse nodes. Both instances run in Docker containers. Connectivity between the hosts has been verified: ping, telnet and wget all work fine. In ZooKeeper I can see my submitted queries under the ddl branch.
Every execution of a "create database ... on cluster ..." statement ends with a timeout. What is the problem? Does anybody have any ideas?
Here are fragments of the config file.
Version 20.10.3.30
<remote_servers>
    <history_cluster>
        <shard>
            <replica>
                <host>10.3.194.104</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>10.3.194.105</host>
                <port>9000</port>
            </replica>
        </shard>
    </history_cluster>
</remote_servers>
<zookeeper>
    <node index="1">
        <host>10.3.194.106</host>
        <port>2181</port>
    </node>
</zookeeper>
The "macros" section
<macros incl="macros" optional="true" />
The log fragment
2020.11.20 22:38:44.104001 [ 90 ] {68062325-a6cf-4ac3-a355-c2159c66ae8b} <Error> executeQuery: Code: 159, e.displayText() = DB::Exception: Watching task /clickhouse/task_queue/ddl/query-0000000013 is executing longer than distributed_ddl_task_timeout (=180) seconds. There are 2 unfinished hosts (0 of them are currently active), they are going to execute the query in background (version 20.10.3.30 (official build)) (from 172.17.0.1:51272) (in query: create database event_history on cluster history_cluster;), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long&, unsigned long&, unsigned long&>(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, long&, unsigned long&, unsigned long&) # 0xd8dcc75 in /usr/bin/clickhouse
1. DB::DDLQueryStatusInputStream::readImpl() # 0xd8dc84d in /usr/bin/clickhouse
2. DB::IBlockInputStream::read() # 0xd71b1a5 in /usr/bin/clickhouse
3. DB::AsynchronousBlockInputStream::calculate() # 0xd71761d in /usr/bin/clickhouse
4. ? # 0xd717db8 in /usr/bin/clickhouse
5. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) # 0x7b8c17d in /usr/bin/clickhouse
6. std::__1::__function::__func<ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()&&...)::'lambda'(), std::__1::allocator<ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()&&...)::'lambda'()>, void ()>::operator()() # 0x7b8e67a in /usr/bin/clickhouse
7. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) # 0x7b8963d in /usr/bin/clickhouse
8. ? # 0x7b8d153 in /usr/bin/clickhouse
9. start_thread # 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
10. clone # 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so

The most probable cause is the nodes' Docker-internal IPs/hostnames.
The initiator node (where the 'on cluster' query is executed) puts a task for 10.3.194.104 and 10.3.194.105 into ZK. All nodes constantly check the task queue and pull the tasks addressed to them. If a node sees its own IP/hostname as 127.0.0.1 / localhost, it never finds its tasks, because 10.3.194.104 != 127.0.0.1.
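To make the mismatch concrete, here is a toy model in Python. It is only a sketch of the idea described above, not ClickHouse's actual matching code: the function name task_is_mine and the container address 172.17.0.2 are made up for illustration, while the two task hosts come from the remote_servers config in the question.

```python
def task_is_mine(task_hosts, local_addresses):
    """Return True if any host named in the DDL task is one of this node's own addresses.

    Toy model only: it compares plain strings and ignores ports and DNS resolution.
    """
    return any(host in local_addresses for host in task_hosts)


# Hosts written into the task by the initiator (from remote_servers in the question).
TASK_HOSTS = ["10.3.194.104", "10.3.194.105"]

if __name__ == "__main__":
    # Hypothetical view from inside a container: only loopback and a bridge address.
    inside_container = {"127.0.0.1", "localhost", "172.17.0.2"}
    # Hypothetical view when the node knows itself by the address used in the cluster config.
    host_network = {"127.0.0.1", "localhost", "10.3.194.104"}

    print(task_is_mine(TASK_HOSTS, inside_container))  # the task is never picked up
    print(task_is_mine(TASK_HOSTS, host_network))      # the task is picked up
```

In this model the first node never recognises a task as its own, which is consistent with the "2 unfinished hosts (0 of them are currently active)" message in the log.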

Related

Materialize index throws DB::Exception: Missing columns

I ran into a problem with MATERIALIZE INDEX on version 21.3.
I created a table with 3 columns: u64, i32 and s.
CREATE TABLE test_idx (`u64` UInt64, `i32` Int32, `s` String) ENGINE = MergeTree() ORDER BY u64;
Then I inserted 3 million rows into it and ran OPTIMIZE TABLE test_idx FINAL (to make the part Wide):
select name, part_type from system.parts where table='test_idx' and active=1;
┌─name──────────────────┬─part_type─┐
│ all_1_21762_111_21773 │ Wide      │
└───────────────────────┴───────────┘
Then I added two indexes to the table:
alter table test_idx add INDEX a (u64, s) TYPE minmax GRANULARITY 3;
alter table test_idx add INDEX b (i32 * length(s)) TYPE set(1000) GRANULARITY 4;
Then I materialized index a so that it also covers the existing data:
alter table test_idx materialize index a;
Here are the exception and stack trace:
2022.07.14 04:06:38.192403 [ 11633 ] {} <Error> DB::IBackgroundJobExecutor::jobExecutingTask()::<lambda()>: Code: 47, e.displayText() = DB::Exception: Missing columns: 'i32' while processing query: 'u64, s, i32 * length(s)', required columns: 'u64' 's' 'i32' 'u64' 's' 'i32', Stack trace (when copying this message, always include the lines below):
0. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../contrib/poco/Foundation/src/Exception.cpp:27: Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) # 0xe16fb61 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
1. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../src/Common/Exception.cpp:55: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) # 0x476f358 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
2. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../src/Interpreters/TreeRewriter.cpp:752: DB::TreeRewriterResult::collectUsedColumns(std::__1::shared_ptr<DB::IAST> const&, bool) (.cold) # 0x416fba2 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
3. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../contrib/libcxx/include/new:237: DB::TreeRewriter::analyze(std::__1::shared_ptr<DB::IAST>&, DB::NamesAndTypesList const&, std::__1::shared_ptr<DB::IStorage const>, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, bool, bool) const # 0xa8a1228 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
4. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../contrib/libcxx/include/list:753: DB::MergeTreeDataMergerMutator::getIndicesToRecalculate(std::__1::shared_ptr<DB::IBlockInputStream>&, DB::NamesAndTypesList const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::Context const&) # 0xac80bbb in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
5. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../contrib/libcxx/include/list:753: DB::MergeTreeDataMergerMutator::mutatePartToTemporaryPart(DB::FutureMergedMutatedPart const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::MutationCommands const&, DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo>&, long, DB::Context const&, std::__1::unique_ptr<DB::IReservation, std::__1::default_delete<DB::IReservation> > const&, std::__1::shared_ptr<DB::RWLockImpl::LockHolderImpl>&) # 0xac87d74 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
6. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../contrib/libcxx/include/type_traits:3934: DB::StorageMergeTree::mutateSelectedPart(std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::StorageMergeTree::MergeMutateSelectedEntry&, std::__1::shared_ptr<DB::RWLockImpl::LockHolderImpl>&) # 0xaaf12d2 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
7. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../src/Storages/StorageMergeTree.cpp:967: bool std::__1::__function::__policy_invoker<bool ()>::__call_impl<std::__1::__function::__default_alloc_func<DB::StorageMergeTree::getDataProcessingJob()::'lambda'(), bool ()> >(std::__1::__function::__policy_storage const*) # 0xaaf15dc in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
8. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../src/Storages/MergeTree/BackgroundJobsExecutor.cpp:103: void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<DB::IBackgroundJobExecutor::jobExecutingTask()::'lambda'(), void ()> >(std::__1::__function::__policy_storage const*) # 0xabefd13 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
9. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../contrib/libcxx/include/functional:2212: ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) # 0x47cd3a2 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
10. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../src/Common/ThreadPool.h:181: ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()&&...)::'lambda'()::operator()() # 0x47cd84e in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
11. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../contrib/libcxx/include/functional:2212: ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) # 0x47ccb42 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
12. /root/zyf/workspace/clickhouse/build/RelWithDebInfo/../../contrib/libcxx/include/memory:1655: void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()> >(void*) # 0x47cb4f3 in /root/zyf/workspace/clickhouse/build/RelWithDebInfo/bin/clickhouse
13. start_thread # 0x8609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
14. __clone # 0x11f133 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
(version 21.3.14.1.7)
Column i32 is not used by index a, so why does the error report it as a missing column? Or can't I use a skipping index this way?
Upgrading to the latest version of ClickHouse should resolve your issue. Is there any specific reason for you to use 21.3?
Try detaching all the data parts that cause the error (assuming that the MATERIALIZE mutation has succeeded on most data parts) and attaching them to a new table with the same structure. Run MATERIALIZE on the new table; hopefully it completes without error. Finally, use ALTER TABLE ... MOVE PARTITION ... to move all partitions back to the old table.

ClickHouse init process failed. There is no projection XX in table

2022.01.26 11:28:41.502968 [ 108 ] {} <Error> `454959780851617792`.log_detail (9456207b-f9c8-4174-9456-207bf9c8b174): auto DB::StorageReplicatedMergeTree::processQueueEntry(ReplicatedMergeTreeQueue::SelectedEntryPtr)::(anonymous class)::operator()(DB::StorageReplicatedMergeTree::LogEntryPtr &) const: Code: 582, e.displayText() = DB::Exception: There is no projection log_trace_service_stat_projection in table, Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) # 0x8b6cbba in /usr/bin/clickhouse
1. DB::ProjectionsDescription::get(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) const # 0xfde09b1 in /usr/bin/clickhouse
2. DB::MutationsInterpreter::prepare(bool) # 0xfaa64f6 in /usr/bin/clickhouse
3. DB::MutationsInterpreter::MutationsInterpreter(std::__1::shared_ptr<DB::IStorage>, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::MutationCommands, std::__1::shared_ptr<DB::Context>, bool) # 0xfaa4f3c in /usr/bin/clickhouse
4. DB::MergeTreeDataMergerMutator::mutatePartToTemporaryPart(DB::FutureMergedMutatedPart const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::MutationCommands const&, DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo>&, long, std::__1::shared_ptr<DB::Context>, std::__1::unique_ptr<DB::IReservation, std::__1::default_delete<DB::IReservation> > const&, std::__1::shared_ptr<DB::RWLockImpl::LockHolderImpl>&) # 0x1018c94b in /usr/bin/clickhouse
5. DB::StorageReplicatedMergeTree::tryExecutePartMutation(DB::ReplicatedMergeTreeLogEntry const&) # 0xfefd59d in /usr/bin/clickhouse
6. DB::StorageReplicatedMergeTree::executeLogEntry(DB::ReplicatedMergeTreeLogEntry&) # 0xfeec1c2 in /usr/bin/clickhouse
7. ? # 0xff6f23f in /usr/bin/clickhouse
8. DB::ReplicatedMergeTreeQueue::processEntry(std::__1::function<std::__1::shared_ptr<zkutil::ZooKeeper> ()>, std::__1::shared_ptr<DB::ReplicatedMergeTreeLogEntry>&, std::__1::function<bool (std::__1::shared_ptr<DB::ReplicatedMergeTreeLogEntry>&)>) # 0x102ef74c in /usr/bin/clickhouse
9. DB::StorageReplicatedMergeTree::processQueueEntry(std::__1::shared_ptr<DB::ReplicatedMergeTreeQueue::SelectedEntry>) # 0xff1efbd in /usr/bin/clickhouse
10. ? # 0x100b52b7 in /usr/bin/clickhouse
11. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) # 0x8baf998 in /usr/bin/clickhouse
12. ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()&&...)::'lambda'()::operator()() # 0x8bb135f in /usr/bin/clickhouse
13. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) # 0x8bacedf in /usr/bin/clickhouse
14. ? # 0x8bb0403 in /usr/bin/clickhouse
15. start_thread # 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
16. clone # 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
(version 21.6.5.37 (official build))
The ClickHouse init process failed.
I killed one of the two ClickHouse pods of a shard, but the pod failed to restart.
The projection log_trace_service_stat_projection does not exist on the other ClickHouse node.
I queried system.projection_parts, but could not find it there.
Can someone help me understand why the init process fails?
How can I fix it? Thank you!

ClickHouse - Too many links

I am testing a ClickHouse server under heavy insertion and found that it can get into a state where it stops processing insertions and throws “Too many links” exceptions. Based on my observations, I don’t think it can recover from this state even after I stop inserting. I also noticed that the “Too many links” exception message is logged every millisecond, which fills up the server log files quickly.
Test environment and how to reproduce:
Server: Dual xxx 14 cores @ 2.4 GHz, 56 vCPU with 256GB mem. CentOS 7, clickhouse-server: 21.2.2 revision 54447 (also tested with 21.8)
Engine: MergeTree PARTITION BY toYYYYMMDD(time_generated)
ORDER BY time_generated
15 clients (10 clickhouse-client, 5 C++ clients) continually inserting log data (~150 fields) in TSV format (bulk size is 500K rows) for a day or so
In this state, clickhouse-server uses 1.5 cores and shows no noticeable file I/O activity.
Other queries work.
To recover from the state, I deleted the temporary directories.
I don’t think we would normally insert this way (ignoring "Too many parts") in practice, but I wonder whether getting into this state is an issue. And, besides not inserting data abnormally, is there any way to prevent it?
Thanks in advance.
Logs:
- client
Code: 252. DB::Exception: Received from xx:9000. DB::Exception: Too many parts (303). Merges are processing significantly slower than inserts..
- server:
2021.10.21 09:17:48.649609 [ 21223 ] {} <Error> auto DB::IBackgroundJobExecutor::jobExecutingTask()::(anonymous class)::operator()() const: Poco::Exception. Code: 1000, e.code() = 31, e.displayText() = File access error: Too many links: /var/lib/clickhouse/tmp/store/48c/48cab972-1221-4222-a5f4-ed3960a08f35/tmp_merge_20211021_452585_452597_1, Stack trace (when copying this message, always include the lines below):
0. Poco::FileImpl::handleLastErrorImpl(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) # 0x11c42124 in /usr/bin/clickhouse
1. Poco::FileImpl::createDirectoryImpl() # 0x11c4372f in /usr/bin/clickhouse
2. Poco::File::createDirectories() # 0x11c456b7 in /usr/bin/clickhouse
3. DB::DiskLocal::createDirectories(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) # 0xe79e358 in /usr/bin/clickhouse
4. DB::MergeTreeDataMergerMutator::mergePartsToTemporaryPart(DB::FutureMergedMutatedPart const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo>&, std::__1::shared_ptr<DB::RWLockImpl::LockHolderImpl>&, long, DB::Context const&, std::__1::unique_ptr<DB::IReservation, std::__1::default_delete<DB::IReservation> > const&, bool, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) # 0xf36ad8e in /usr/bin/clickhouse
5. DB::StorageMergeTree::mergeSelectedParts(std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, bool, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, DB::StorageMergeTree::MergeMutateSelectedEntry&, std::__1::shared_ptr<DB::RWLockImpl::LockHolderImpl>&) # 0xf10f108 in /usr/bin/clickhouse
6. ? # 0xf12168c in /usr/bin/clickhouse
7. ? # 0xf2cb076 in /usr/bin/clickhouse
8. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) # 0x8513fb8 in /usr/bin/clickhouse
9. ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()&&...)::'lambda'()::operator()() # 0x8515f6f in /usr/bin/clickhouse
10. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) # 0x851158f in /usr/bin/clickhouse
11. ? # 0x8515023 in /usr/bin/clickhouse
12. ? # 0x7eb5 in /usr/lib64/libpthread-2.17.so
13. __clone # 0xfe8fd in /usr/lib64/libc-2.17.so
(version 21.2.2.8 (official build))
--- with 21.8.
2021.10.25 08:29:18.354200 [ 55326 ] {} <Error> auto DB::IBackgroundJobExecutor::execute(DB::JobAndPool)::(anonymous class)::operator()() const: std::exception. Code: 1001, type: std::__1::__fs::filesystem::filesystem_error, e.what() = filesystem error: in create_directory: Too many links [/var/lib/clickhouse/tmp/store/48c/48cab972-1221-4222-a5f4-ed3960a08f35/tmp_merge_20211024_906198_906236_1], Stack trace (when copying this message, always include the lines below):
0. std::__1::system_error::system_error(std::__1::error_code, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) # 0x1590de6f in ?
1. ? # 0x158a171f in ?
2. ? # 0x158a1136 in ?
3. ? # 0x158a58f8 in ?
4. std::__1::__fs::filesystem::__create_directory(std::__1::__fs::filesystem::path const&, std::__1::error_code*) # 0x158a646b in ?
5. std::__1::__fs::filesystem::__create_directories(std::__1::__fs::filesystem::path const&, std::__1::error_code*) # 0x158a6125 in ?
6. std::__1::__fs::filesystem::__create_directories(std::__1::__fs::filesystem::path const&, std::__1::error_code*) # 0x158a6189 in ?
7. DB::DiskLocal::createDirectories(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) # 0xff032ec in /usr/bin/clickhouse
8. DB::MergeTreeDataMergerMutator::mergePartsToTemporaryPart(DB::FutureMergedMutatedPart const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::BackgroundProcessListEntry<DB::MergeListElement, DB::MergeInfo>&, std::__1::shared_ptr<DB::RWLockImpl::LockHolderImpl>&, long, std::__1::shared_ptr<DB::Context const>, std::__1::unique_ptr<DB::IReservation, std::__1::default_delete<DB::IReservation> > const&, bool, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, DB::MergeTreeData::MergingParams const&, DB::IMergeTreeDataPart const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) # 0x10d14ff8 in /usr/bin/clickhouse
9. DB::StorageMergeTree::mergeSelectedParts(std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, bool, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, DB::StorageMergeTree::MergeMutateSelectedEntry&, std::__1::shared_ptr<DB::RWLockImpl::LockHolderImpl>&) # 0x10eea024 in /usr/bin/clickhouse
10. ? # 0x10ef9937 in /usr/bin/clickhouse
11. ? # 0x10c40e77 in /usr/bin/clickhouse
12. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) # 0x8ffab98 in /usr/bin/clickhouse
13. ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()&&...)::'lambda'()::operator()() # 0x8ffc73f in /usr/bin/clickhouse
14. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) # 0x8ff84ff in /usr/bin/clickhouse
15. ? # 0x8ffb763 in /usr/bin/clickhouse
16. ? # 0x7eb5 in /usr/lib64/libpthread-2.17.so
17. __clone # 0xfe8fd in /usr/lib64/libc-2.17.so
Cannot print extra info for Poco::Exception (version 21.8.5.1.altinity+prestable (altinity build))
Check inode and disk usage:
df -i /var/lib/clickhouse/
df -h /var/lib/clickhouse/
Upgrade ClickHouse to 21.8.10.19: https://github.com/ClickHouse/ClickHouse/issues/26471
https://github.com/ClickHouse/ClickHouse/issues/3174#issuecomment-423435071
https://clickhouse.com/docs/en/operations/settings/merge-tree-settings/#parts-to-throw-insert
# cat /etc/clickhouse-server/config.d/z_parts_to_throw.xml
<yandex>
    <merge_tree>
        <old_parts_lifetime>30</old_parts_lifetime>
        <parts_to_delay_insert>150</parts_to_delay_insert>
        <parts_to_throw_insert>900</parts_to_throw_insert>
        <max_delay_to_insert>5</max_delay_to_insert>
    </merge_tree>
</yandex>
https://clickhouse.com/docs/en/operations/settings/settings/#background_pool_size
# cat /etc/clickhouse-server/users.d/user_substitutes.xml
<?xml version="1.0"?>
<yandex>
    <profiles>
        <default>
            <background_pool_size>32</background_pool_size>
        </default>
    </profiles>
</yandex>
Then restart ClickHouse.
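Besides df, it can help to see which directory is actually crowded. On POSIX systems mkdir fails with "Too many links" (EMLINK) when the parent directory has reached the filesystem's link limit, which in practice means it holds too many subdirectories. The Python sketch below only counts subdirectories; the function names and the threshold are placeholders chosen for illustration, and the real limit depends on your filesystem.

```python
import os


def count_subdirs(path):
    """Return the number of immediate subdirectories of path (symlinks are not counted)."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def crowded_dirs(root, threshold):
    """Return (directory, subdirectory_count) pairs under root with at least threshold subdirectories.

    threshold is a hypothetical value supplied by the caller, not a ClickHouse setting.
    """
    found = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        n = count_subdirs(dirpath)
        if n >= threshold:
            found.append((dirpath, n))
    return found
```

For example, calling crowded_dirs on the ClickHouse data directory with a threshold in the tens of thousands would list directories such as the tmp_merge_* parent from the log if leftover temporary parts have piled up there.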

Struggling with Datetime Parsing

Can I use some built-in ClickHouse functions to convert a datetime in the format 5/11/2021 13:10:25 to YYYY-MM-DD hh:mm:ss?
I'm struggling to figure it out.
I've tried the following:
replaceRegexpOne(toString(START_TIME), '\d{2}/\d{2}/\d{4}', '(\2)-(\1)-(\3)')
But I got:
ERROR: garbage after DateTime: "375291300"
ERROR: DateTime must be in YYYY-MM-DD hh:mm:ss or NNNNNNNNNN (unix timestamp, exactly 10 digits) format.
: (at row 1)
, Stack trace (when copying this message, always include the lines below):
DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) # 0x8b770fa in /usr/bin/clickhouse
DB::throwAtAssertionFailed(char const*, DB::ReadBuffer&) # 0x8bcd437 in /usr/bin/clickhouse
? # 0x105329db in /usr/bin/clickhouse
DB::CSVRowInputFormat::readRow(std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, DB::RowReadExtension&) # 0x10532b2e in /usr/bin/clickhouse
DB::IRowInputFormat::generate() # 0x1051ccc8 in /usr/bin/clickhouse
DB::ISource::tryGenerate() # 0x104a97d5 in /usr/bin/clickhouse
DB::ISource::work() # 0x104a93ba in /usr/bin/clickhouse
DB::ParallelParsingInputFormat::InternalParser::getChunk() # 0x10567dde in /usr/bin/clickhouse
DB::ParallelParsingInputFormat::parserThreadFunction(std::__1::shared_ptr<DB::ThreadGroupStatus>, unsigned long) # 0x1056737e in /usr/bin/clickhouse
ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) # 0x8bb9ed8 in /usr/bin/clickhouse
ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()&&...)::'lambda'()::operator()() # 0x8bbb89f in /usr/bin/clickhouse
ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) # 0x8bb741f in /usr/bin/clickhouse
? # 0x8bba943 in /usr/bin/clickhouse
start_thread # 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
clone # 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
(version 21.6.6.51 (official build))
(I can't show whole stack trace because of private data)
https://clickhouse.tech/docs/en/sql-reference/functions/type-conversion-functions/#parsedatetimebesteffort
https://clickhouse.tech/docs/en/sql-reference/functions/type-conversion-functions/#parsedatetimebesteffortUS
SELECT parseDateTimeBestEffort('5/11/2021 13:10:25')
┌─parseDateTimeBestEffort('5/11/2021 13:10:25')─┐
│                           2021-11-05 13:10:25 │
└───────────────────────────────────────────────┘
https://clickhouse.tech/docs/en/operations/settings/settings/#settings-date_time_input_format
You can set date_time_input_format=best_effort in a session, in a query, or in a user's profile. It enables this best-effort parsing for text input formats (CSV, TSV, JSON*, ...).
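Note that 5/11/2021 is ambiguous: the output above reads it as day/month (5 November), while the US variant linked above reads such values as month/day. If you prefer to normalize the values before loading them, the same choice has to be made explicitly. A minimal Python sketch, assuming the input always has the shape D/M/YYYY hh:mm:ss or M/D/YYYY hh:mm:ss (the function name is made up for illustration):

```python
from datetime import datetime


def reformat_datetime(text, day_first=True):
    """Convert 'D/M/YYYY hh:mm:ss' (or 'M/D/YYYY hh:mm:ss') to 'YYYY-MM-DD hh:mm:ss'.

    day_first=True reads 5/11/2021 as 5 November; day_first=False reads it as 11 May.
    """
    fmt = "%d/%m/%Y %H:%M:%S" if day_first else "%m/%d/%Y %H:%M:%S"
    return datetime.strptime(text, fmt).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(reformat_datetime("5/11/2021 13:10:25"))                   # 2021-11-05 13:10:25
    print(reformat_datetime("5/11/2021 13:10:25", day_first=False))  # 2021-05-11 13:10:25
```

Which interpretation is right depends on where the data comes from, so check a value with a day above 12 before picking one.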

Connection reset by peer while uploading large csv in clickhouse

I get this error while uploading a CSV to ClickHouse with a row count > 2.5M and more than 90 columns.
code: 210. DB::NetException: Connection reset by peer, while writing to socket (10.107.146.25:9000): data for INSERT was parsed from stdin
Here is the CREATE TABLE statement of the table:
CREATE TABLE table_names
(
{column_names and types}
)
ENGINE = MergeTree
ORDER BY tuple()
SETTINGS index_granularity = 8192,
allow_nullable_key = 1
Here is the command I run to insert the CSV:
cat {filepath}.csv | sudo docker run -i --rm yandex/clickhouse-client -m --host {host} -u {user} --input_format_allow_errors_num=10 --input_format_allow_errors_ratio=0.1 --max_memory_usage=15000000000 --format_csv_allow_single_quotes 0 --input_format_skip_unknown_fields 1 --query='INSERT INTO table_name FORMAT CSVWithNames'
This is the error logged in the query_log system table in ClickHouse:
Code: 33, e.displayText() = DB::Exception: Cannot read all data. Bytes read: 65735. Bytes expected: 134377. (version 21.6.5.37 (official build))
Here is the stack trace (again from the query_log table):
0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) # 0x8b6cbba in /usr/bin/clickhouse
1. DB::ReadBuffer::readStrict(char*, unsigned long) # 0x8ba7c4d in /usr/bin/clickhouse
2. DB::CompressedReadBufferBase::readCompressedData(unsigned long&, unsigned long&, bool) # 0xf2347bc in /usr/bin/clickhouse
3. DB::CompressedReadBuffer::nextImpl() # 0xf233f27 in /usr/bin/clickhouse
4. void DB::readVarUIntImpl<false>(unsigned long&, DB::ReadBuffer&) # 0x8ba7eac in /usr/bin/clickhouse
5. ? # 0xf40843b in /usr/bin/clickhouse
6. DB::SerializationString::deserializeBinaryBulk(DB::IColumn&, DB::ReadBuffer&, unsigned long, double) const # 0xf40723b in /usr/bin/clickhouse
7. DB::ISerialization::deserializeBinaryBulkWithMultipleStreams(COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, DB::ISerialization::DeserializeBinaryBulkSettings&, std::__1::shared_ptr<DB::ISerialization::DeserializeBinaryBulkState>&, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, COW<DB::IColumn>::immutable_ptr<DB::IColumn> > > >*) const # 0xf3d4dd5 in /usr/bin/clickhouse
8. DB::SerializationNullable::deserializeBinaryBulkWithMultipleStreams(COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, DB::ISerialization::DeserializeBinaryBulkSettings&, std::__1::shared_ptr<DB::ISerialization::DeserializeBinaryBulkState>&, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, COW<DB::IColumn>::immutable_ptr<DB::IColumn> > > >*) const # 0xf3f550f in /usr/bin/clickhouse
9. DB::NativeBlockInputStream::readData(DB::IDataType const&, COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, DB::ReadBuffer&, unsigned long, double) # 0xfa9f8f5 in /usr/bin/clickhouse
10. DB::NativeBlockInputStream::readImpl() # 0xfaa07b3 in /usr/bin/clickhouse
11. DB::IBlockInputStream::read() # 0xf30f452 in /usr/bin/clickhouse
12. DB::TCPHandler::receiveData(bool) # 0x104403c4 in /usr/bin/clickhouse
13. DB::TCPHandler::receivePacket() # 0x10435bec in /usr/bin/clickhouse
14. DB::TCPHandler::readDataNext(unsigned long, long) # 0x10437e5f in /usr/bin/clickhouse
15. DB::TCPHandler::processInsertQuery(DB::Settings const&) # 0x1043625e in /usr/bin/clickhouse
16. DB::TCPHandler::runImpl() # 0x1042eb09 in /usr/bin/clickhouse
17. DB::TCPHandler::run() # 0x10441839 in /usr/bin/clickhouse
18. Poco::Net::TCPServerConnection::start() # 0x12a3fd4f in /usr/bin/clickhouse
19. Poco::Net::TCPServerDispatcher::run() # 0x12a417da in /usr/bin/clickhouse
20. Poco::PooledThread::run() # 0x12b7ab39 in /usr/bin/clickhouse
21. Poco::ThreadImpl::runnableEntry(void*) # 0x12b76b2a in /usr/bin/clickhouse
22. start_thread # 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
23. __clone # 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
I initially thought this was due to the partitioning and order key, so I removed everything, but the same issue still occurs and the row count differs by more than 1M rows.
The error DB::Exception: Cannot read all data. means the data you're trying to insert is somehow corrupted. Likely, some of your 2.5M rows don't have all fields, or there is a type mismatch in the content of some rows.
I would suggest inserting in smaller batches so you can figure out where the problem with your data is: batches will succeed until one of them contains a corrupted row.
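One way to produce such batches is to split the file while repeating the header row, so that each piece can still be inserted with FORMAT CSVWithNames. This is a sketch under assumptions: the function names and the batch size are arbitrary, and because the rows are re-serialized with Python's csv module, the quoting may differ slightly from the original file.

```python
import csv
import io


def render_batch(header, rows):
    """Serialize a header plus rows back into CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def iter_csv_batches(source, batch_size):
    """Yield CSV text chunks of at most batch_size data rows, each starting with the header.

    source is any iterable of lines, for example an open file object.
    """
    reader = csv.reader(source)
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to yield
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) == batch_size:
            yield render_batch(header, batch)
            batch = []
    if batch:
        yield render_batch(header, batch)
```

Each yielded chunk can be piped to the same clickhouse-client command as in the question; the first chunk that fails narrows the problem down to at most batch_size rows.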
