How to set up replicas and shards in ClickHouse?

Do you know of any web article that can guide me on how to set up replicas or shards in ClickHouse? I am not knowledgeable about ZooKeeper either and don't know how to get started; most of the articles I have read online explain how it works, but I can't find anything on how to set it up.
All of these articles do show some settings; however, I don't know how to put them together.
For example, I don't know where to put the ZooKeeper configuration...
I read these articles
https://clickhouse.yandex/tutorial.html
https://blog.uiza.io/replicated-vs-distributed-on-clickhouse-part-1/
https://blog.uiza.io/replicated-and-distributed-on-clickhouse-part-2/
https://www.altinity.com/blog/2018/5/10/circular-replication-cluster-topology-in-clickhouse
Set the ZooKeeper locations in the configuration file:
<zookeeper-servers>
    <node>
        <host>zoo01.yandex.ru</host>
        <port>2181</port>
    </node>
    <node>
        <host>zoo02.yandex.ru</host>
        <port>2181</port>
    </node>
    <node>
        <host>zoo03.yandex.ru</host>
        <port>2181</port>
    </node>
</zookeeper-servers>

ZooKeeper is a standalone daemon; you need to install it and run it (one instance of the ZooKeeper daemon is enough). After that, you need to add
<zookeeper-servers>
    <node>
        <host>zoo01.yourdomain.com</host>
        <port>2181</port>
    </node>
</zookeeper-servers>
to the config on each ClickHouse server.
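If you are wondering where exactly that snippet goes: a common approach (a sketch, assuming a default package installation under /etc/clickhouse-server/) is to put it in its own file under config.d/ instead of editing config.xml directly, for example:
<!-- /etc/clickhouse-server/config.d/zookeeper.xml (file name and path are just an example) -->
<yandex>
    <zookeeper-servers>
        <node>
            <host>zoo01.yourdomain.com</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
</yandex>
Depending on your ClickHouse version, the section may be called <zookeeper> instead of <zookeeper-servers>, so check the reference config.xml that ships with your package.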
Then add a remote_servers configuration to each ClickHouse server, for example:
<remote_servers>
    <your_cluster_name>
        <shard>
            <weight>1</weight>
            <internal_replication>true</internal_replication>
            <replica>
                <host>clickhouse-ru-1.local</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>clickhouse-ru-2.local</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <weight>1</weight>
            <internal_replication>false</internal_replication>
            <replica>
                <host>clickhouse-eu-1.local</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>clickhouse-eu-2.local</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <weight>1</weight>
            <internal_replication>false</internal_replication>
            <replica>
                <host>clickhouse-us-1.local</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>clickhouse-us-2.local</host>
                <port>9000</port>
            </replica>
        </shard>
    </your_cluster_name>
</remote_servers>
After that, please read about the Distributed table engine:
https://clickhouse.yandex/docs/en/operations/table_engines/distributed/
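To put the pieces together, here is a minimal sketch of what the tables might look like once ZooKeeper and remote_servers are configured. Everything below is illustrative: the table and column names are made up, the default database is assumed, and the {shard}/{replica} macros must be defined in each server's configuration.
-- replicated table, created on every node of the cluster
CREATE TABLE events_local ON CLUSTER your_cluster_name
(
    event_date Date,
    user_id UInt64,
    value Float64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events_local', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id);

-- distributed "umbrella" table that fans queries and inserts out to the shards
CREATE TABLE events_all ON CLUSTER your_cluster_name AS events_local
ENGINE = Distributed(your_cluster_name, default, events_local, rand());
Queries then go to events_all, and the Distributed engine routes them to the local replicated tables on each shard. Note that when the local tables are Replicated*, internal_replication is normally set to true so that replication is handled by the table engine rather than by the Distributed engine.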

Related

How to use the <secret> tag in the remote_servers tag?

I am modifying the remote_servers section in config.xml.
Most of the setup was done with the help of Stack Overflow, thank you.
However, when I use the <secret> tag in the cluster configuration, I get an error.
Below is an example of the settings I have configured:
<clusters>
    <secret>same_user</secret>
    <shard>
        <replica>
            <host>127.0.0.1</host>
            <default_database>local</default_database>
            <port>9000</port>
        </replica>
    </shard>
    <shard>
        <replica>
            <host>remote1</host>
            <default_database>local</default_database>
            <port>9000</port>
        </replica>
    </shard>
    <shard>
        <replica>
            <host>remote2</host>
            <default_database>local</default_database>
            <port>9000</port>
        </replica>
    </shard>
</clusters>
All shards use the same user and the same password.
The user I connect with in clickhouse-client to execute the query is also the same.
But when I execute SELECT COUNT(*) FROM distributedTable in clickhouse-client, an error occurs.
The error is:
Code: 101. DB::Exception: Received from localhost:9000. DB::Exception: Received from remote1:9000. DB::Exception: Hash mismatch.
The query works fine if I specify the <user> and <password> tags instead of the secret tag. What could be the cause? I'm using version 20.11.7.
Please help if there is a way.
Thank you.
I found the problem.
The same value must be set for the <secret> tag in the cluster configuration of every remote shard's config file.
I had only put the <secret> tag on the servers using the Distributed table engine, and that was the mistake.
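In other words, every server that participates in the cluster needs the same <secret> value in its own cluster definition, not just the server that hosts the Distributed table. Roughly, reusing the host names from the question above:
<!-- identical cluster definition, including the <secret> value, on 127.0.0.1, remote1 and remote2 -->
<clusters>
    <secret>same_user</secret>
    <shard>
        <replica>
            <host>remote1</host>
            <default_database>local</default_database>
            <port>9000</port>
        </replica>
    </shard>
    <!-- ... the remaining shards exactly as in the question ... -->
</clusters>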

Why is the password included in the shard directories created under the distributed engine table directory?

I have set up 3 shards, including a local shard, on the cluster as shown below.
<log>
    <secret>users</secret>
    <shard>
        <replica>
            <host>127.0.0.1</host>
            <port>9000</port>
        </replica>
    </shard>
    <shard>
        <replica>
            <host>remote1</host>
            <port>9000</port>
        </replica>
    </shard>
    <shard>
        <replica>
            <host>remote2</host>
            <port>9000</port>
        </replica>
    </shard>
</log>
And I created a Distributed engine table that includes the sharding key.
When an INSERT query is executed on the distributed table, shard directories containing the password are created inside the distributed table's directory in the ClickHouse data directory.
For example, say the ClickHouse data directory is
/etc/clickhouse-server/data/
and a distributed table named log_total is created in the local db.
When I execute an INSERT INTO local.log_total ..... query,
three directories are created in this path (/etc/clickhouse-server/data/local/log_total/), as shown below:
default:password#127%2E0%2E0%2E1:9000#dbname
default:password#remote1:9000#dbname
default:password#remote2:9000#dbname
I wish these directories didn't contain passwords.
I thought using the secret tag would solve it, but it didn't.
Is there a good way to avoid this?
Please share your experience.
Thank you.
There is a setting for this:
--use_compact_format_in_distributed_parts_names arg
Changes the format of directory names for distributed table insert parts.
It is enabled by default:
SELECT
    name,
    value
FROM system.settings
WHERE name = 'use_compact_format_in_distributed_parts_names'

┌─name──────────────────────────────────────────┬─value─┐
│ use_compact_format_in_distributed_parts_names │ 1     │
└───────────────────────────────────────────────┴───────┘
If use_compact_format_in_distributed_parts_names = 1, the folder name becomes shard_1_replica_1, without the username/password.
Which ClickHouse version are you using?
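If your version has the setting but it is turned off, one way to enable it for all queries is in the default user profile. This is only a sketch assuming the standard users.xml layout; the setting can also be changed per session with SET.
<!-- users.xml (or a file under users.d/), default profile -->
<profiles>
    <default>
        <use_compact_format_in_distributed_parts_names>1</use_compact_format_in_distributed_parts_names>
    </default>
</profiles>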

ehcache memory only cache attempting to store to disk

I have the following in my ehcache.xml
<?xml version="1.0" encoding="UTF-8"?>
<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="http://ehcache.org/ehcache.xsd" updateCheck="false">
    <defaultCache
        maxElementsInMemory="500"
        eternal="false"
        overflowToDisk="false"
        overflowToOffHeap="false"
        memoryStoreEvictionPolicy="LRU">
        <persistence strategy="none" />
    </defaultCache>
    <cache name="leadSourceCollectionsCache"
        maxElementsInMemory="500"
        eternal="false"
        overflowToDisk="false"
        overflowToOffHeap="false"
        memoryStoreEvictionPolicy="LRU">
        <persistence strategy="none" />
    </cache>
</ehcache>
When I start my server I see the following ehcache-related debug output:
07.08.2013 15:36:10 DEBUG-ConfigurationFactory: [Configuring ehcache from InputStream]
07.08.2013 15:36:10 DEBUG-BeanHandler: [Ignoring ehcache attribute xmlns:xsi]
07.08.2013 15:36:10 DEBUG-BeanHandler: [Ignoring ehcache attribute xsi:noNamespaceSchemaLocation]
07.08.2013 15:36:10 DEBUG-PropertyUtil: [propertiesString is null.]
07.08.2013 15:36:10 DEBUG-ConfigurationHelper: [No CacheManagerEventListenerFactory class specified. Skipping...]
07.08.2013 15:36:10 DEBUG-Cache: [No BootstrapCacheLoaderFactory class specified. Skipping...]
07.08.2013 15:36:10 DEBUG-Cache: [CacheWriter factory not configured. Skipping...]
07.08.2013 15:36:10 DEBUG-ConfigurationHelper: [No CacheExceptionHandlerFactory class specified. Skipping...]
07.08.2013 15:36:10 DEBUG-Cache: [No BootstrapCacheLoaderFactory class specified. Skipping...]
07.08.2013 15:36:10 DEBUG-Cache: [CacheWriter factory not configured. Skipping...]
07.08.2013 15:36:10 DEBUG-ConfigurationHelper: [No CacheExceptionHandlerFactory class specified. Skipping...]
07.08.2013 15:36:10 DEBUG-MemoryStore: [Initialized net.sf.ehcache.store.NotifyingMemoryStore for leadSourceCollectionsCache]
07.08.2013 15:36:10 DEBUG-Cache: [Initialised cache: leadSourceCollectionsCache]
07.08.2013 15:36:10 DEBUG-ConfigurationHelper: [CacheDecoratorFactory not configured. Skipping for 'leadSourceCollectionsCache'.]
07.08.2013 15:36:10 DEBUG-ConfigurationHelper: [CacheDecoratorFactory not configured for defaultCache. Skipping for 'leadSourceCollectionsCache'.]
07.08.2013 15:36:10 DEBUG-Cache: [No BootstrapCacheLoaderFactory class specified. Skipping...]
07.08.2013 15:36:10 DEBUG-Cache: [CacheWriter factory not configured. Skipping...]
07.08.2013 15:36:10 DEBUG-MemoryStore: [Initialized net.sf.ehcache.store.MemoryStore for leadSourceCollectionCache]
07.08.2013 15:36:10 DEBUG-DiskStorePathManager: [Using diskstore path /tmp]
07.08.2013 15:36:10 DEBUG-DiskStorePathManager: [Holding exclusive lock on /tmp/.ehcache-diskstore.lock]
07.08.2013 15:36:10 DEBUG-DiskStorageFactory: [Failed to delete file lead%0053ource%0043ollection%0043ache.index]
07.08.2013 15:36:10 DEBUG-DiskStorageFactory: [Matching data file missing (or empty) for index file. Deleting index file /tmp/lead%0053ource%0043ollection%0043ache.index]
07.08.2013 15:36:10 DEBUG-DiskStorageFactory: [Failed to delete file lead%0053ource%0043ollection%0043ache.index]
07.08.2013 15:36:10 DEBUG-Cache: [Initialised cache: leadSourceCollectionCache]
So it looks like it is creating the leadSourceCollectionCache as a memory store cache, which is what I want. However, when I call leadSourceCollectionCache.put I get the following
07.08.2013 15:37:46 DEBUG-Segment: [put added 0 on heap]
07.08.2013 15:37:46 ERROR-DiskStorageFactory: [Disk Write of 1-2 failed: ]
java.io.NotSerializableException: intouch.connector.business.LeadSourceCollection
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1181)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541)
at java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:439)
at net.sf.ehcache.Element.writeObject(Element.java:835)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1493)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
at net.sf.ehcache.util.MemoryEfficientByteArrayOutputStream.serialize(MemoryEfficientByteArrayOutputStream.java:97)
at net.sf.ehcache.store.disk.DiskStorageFactory.serializeElement(DiskStorageFactory.java:405)
at net.sf.ehcache.store.disk.DiskStorageFactory.write(DiskStorageFactory.java:384)
at net.sf.ehcache.store.disk.DiskStorageFactory$DiskWriteTask.call(DiskStorageFactory.java:485)
at net.sf.ehcache.store.disk.DiskStorageFactory$PersistentDiskWriteTask.call(DiskStorageFactory.java:1088)
at net.sf.ehcache.store.disk.DiskStorageFactory$PersistentDiskWriteTask.call(DiskStorageFactory.java:1072)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
I cannot for the life of me figure out why this is happening. I do not want my cache to touch the disk in any way. I also don't know why it is attempting to write things to /tmp when persistence is set to none.
facepalm
It seems I had a typo in the name: I had configured leadSourceCollectionsCache, but it should have been leadSourceCollectionCache, without the 's'.
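In other words, the name used in the code (leadSourceCollectionCache) did not match the name configured in ehcache.xml (leadSourceCollectionsCache), so the configured memory-only settings never applied to the cache the code was actually using. A sketch of the corrected entry, with only the name changed from the config above:
<cache name="leadSourceCollectionCache"
    maxElementsInMemory="500"
    eternal="false"
    overflowToDisk="false"
    overflowToOffHeap="false"
    memoryStoreEvictionPolicy="LRU">
    <persistence strategy="none" />
</cache>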
My co-worker got the following extra output which I didn't get:
WARN-ConfigurationFactory: [No configuration found. Configuring ehcache from ehcache-failsafe.xml found in the classpath: jar:file:/Users/foo/.m2/repository/net/sf/ehcache/ehcache-core/2.6.6/ehcache-core-2.6.6.jar!/ehcache-failsafe.xml]

JBoss AS 7 Infinispan Cluster

I have a two-node JBoss AS 7.1.1.FINAL cluster set up in the following way -
master - running on Ubuntu Server 12.10 (VirtualBox VM)
slave - running on Windows 7 (VirtualBox host machine)
I have deployed a Spring web application on both nodes and I'm trying to set up a working replicated cache. My problem is that the cache does not seem to be replicated, even though the clustering apparently works.
My config, in domain.xml (both on master and slave):
<subsystem xmlns="urn:jboss:domain:infinispan:1.2" default-cache-container="cluster">
    <cache-container name="cluster" aliases="ha-partition" default-cache="default" jndi-name="java:jboss/infinispan/cluster" start="EAGER">
        <transport lock-timeout="60000" />
        <replicated-cache name="default" mode="SYNC" batching="true">
            <locking isolation="REPEATABLE_READ"/>
        </replicated-cache>
    </cache-container>
</subsystem>
This is pretty much the default config in domain.xml, except for the jndi-name and the EAGER start.
In the Spring configuration:
<infinispan:container-cache-manager id="cacheManager" cache-container-ref="springCacheContainer" />
<jee:jndi-lookup id="springCacheContainer" jndi-name="java:jboss/infinispan/cluster" />
With this setup, the caching works, but it's not replicated. The caches seem to operate independently of each other. Also, the EAGER start seems to have no effect; the caches seem to be initialized only when they are first used.
From the master log (first time the cache is used):
[Server:server-one] 03:25:55,756 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ajp-192.168.2.13-192.168.2.13-8009-3) ISPN000078: Starting JGroups Channel
[Server:server-one] 03:25:55,762 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ajp-192.168.2.13-192.168.2.13-8009-3) ISPN000094: Received new cluster view: [master:server-one/cluster|1] [master:server-one/cluster, slave:server-one-slave/cluster]
[Server:server-one] 03:25:55,763 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ajp-192.168.2.13-192.168.2.13-8009-3) ISPN000079: Cache local address is master:server-one/cluster, physical addresses are [192.168.2.13:55200]
[Server:server-one] 03:25:55,769 INFO [org.infinispan.factories.GlobalComponentRegistry] (ajp-192.168.2.13-192.168.2.13-8009-3) ISPN000128: Infinispan version: Infinispan 'Brahma' 5.1.2.FINAL
[Server:server-one] 03:25:55,851 INFO [org.jboss.as.clustering.infinispan] (ajp-192.168.2.13-192.168.2.13-8009-3) JBAS010281: Started cluster cache from cluster container
From the slave log (first time the cache is used):
[Server:server-one-slave] 03:29:38,124 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ajp--192.168.2.10-8009-2) ISPN000078: Starting JGroups Channel
[Server:server-one-slave] 03:29:38,129 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ajp--192.168.2.10-8009-2) ISPN000094: Received new cluster view: [master:server-one/cluster|1] [master:server-one/cluster, slave:server-one-slave/cluster]
[Server:server-one-slave] 03:29:38,130 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ajp--192.168.2.10-8009-2) ISPN000079: Cache local address is slave:server-one-slave/cluster, physical addresses are [192.168.2.10:55200]
[Server:server-one-slave] 03:29:38,133 INFO [org.infinispan.factories.GlobalComponentRegistry] (ajp--192.168.2.10-8009-2) ISPN000128: Infinispan version: Infinispan 'Brahma' 5.1.2.FINAL
[Server:server-one-slave] 03:29:38,195 INFO [org.jboss.as.clustering.infinispan] (ajp--192.168.2.10-8009-2) JBAS010281: Started cluster cache from cluster container
I don't think this is a udp/multicast issue, as I have mod_cluster, HornetQ and Quartz set up in this cluster and they all work as expected.
Putting <distributable/> in web.xml did the trick.
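For reference, that means marking the web application as distributable in its deployment descriptor, roughly like this (a minimal sketch; the Servlet 3.0 schema shown here is an assumption, adjust it to whatever your web.xml already declares):
<!-- WEB-INF/web.xml -->
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
         version="3.0">
    <distributable/>
</web-app>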
I had a similar issue where my cache wouldn't replicate until the application was first used. I was able to resolve this by setting the "start" attribute of the replicated-cache to EAGER, along with the cache-container attribute start="EAGER":
<replicated-cache name="default" mode="SYNC" batching="true" start="EAGER">
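Combined with the container-level start="EAGER" already present in the domain.xml above, the relevant fragment would then look roughly like this (a sketch, same names as above):
<cache-container name="cluster" aliases="ha-partition" default-cache="default" jndi-name="java:jboss/infinispan/cluster" start="EAGER">
    <transport lock-timeout="60000" />
    <replicated-cache name="default" mode="SYNC" batching="true" start="EAGER">
        <locking isolation="REPEATABLE_READ"/>
    </replicated-cache>
</cache-container>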

Mahout RecommenderJob not converging

This is my first SO post, so please let me know if I've missed anything important. I am a Mahout/Hadoop beginner, and am trying to put together a distributed recommendation engine.
In order to simulate working on a remote cluster, I have set up Hadoop on my machine to communicate with an Ubuntu VM (using VirtualBox), also located on my machine, which has Hadoop installed on it. This setup seems to be working fine and I am now trying to run Mahout's 'RecommenderJob' on a (very!) small trial dataset as a test.
The input consists of a .csv file (saved on the Hadoop DFS) containing around 50 user preferences in the format: userID, itemID, preference ... and the command I am running is:
hadoop jar /Users/MyName/src/trunk/core/target/mahout-core-0.8-SNAPSHOT-job.jar org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.input.dir=/user/MyName/Recommendations/input/TestRatings.csv -Dmapred.output.dir=/user/MyName/Recommendations/output -s SIMILARITY_PEARSON_CORELLATION
where TestRatings.csv is the file containing the preferences and output is the desired output directory.
At first the job looks like it's running fine, and I get the following output:
12/12/11 12:26:21 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --maxPrefsPerUser=[10], --maxPrefsPerUserInItemSimilarity=[1000], --maxSimilaritiesPerItem=[100], --minPrefsPerUser=[1], --numRecommendations=[10], --similarityClassname=[SIMILARITY_PEARSON_CORELLATION], --startPhase=[0], --tempDir=[temp]}
12/12/11 12:26:21 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/user/Naaman/Delphi/input/TestRatings.csv], --maxPrefsPerUser=[1000], --minPrefsPerUser=[1], --output=[temp/preparePreferenceMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[temp]}
12/12/11 12:26:21 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/12/11 12:26:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/12/11 12:26:22 INFO input.FileInputFormat: Total input paths to process : 1
12/12/11 12:26:22 WARN snappy.LoadSnappy: Snappy native library not loaded
12/12/11 12:26:22 INFO mapred.JobClient: Running job: job_local_0001
12/12/11 12:26:22 INFO mapred.Task: Using ResourceCalculatorPlugin : null
12/12/11 12:26:22 INFO mapred.MapTask: io.sort.mb = 100
12/12/11 12:26:22 INFO mapred.MapTask: data buffer = 79691776/99614720
12/12/11 12:26:22 INFO mapred.MapTask: record buffer = 262144/327680
12/12/11 12:26:22 INFO mapred.MapTask: Starting flush of map output
12/12/11 12:26:22 INFO compress.CodecPool: Got brand-new compressor
12/12/11 12:26:22 INFO mapred.MapTask: Finished spill 0
12/12/11 12:26:22 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/12/11 12:26:22 INFO mapred.LocalJobRunner:
12/12/11 12:26:22 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/12/11 12:26:22 INFO mapred.Task: Using ResourceCalculatorPlugin : null
12/12/11 12:26:22 INFO mapred.ReduceTask: ShuffleRamManager: MemoryLimit=1491035776, MaxSingleShuffleLimit=372758944
12/12/11 12:26:22 INFO compress.CodecPool: Got brand-new decompressor
12/12/11 12:26:22 INFO compress.CodecPool: Got brand-new decompressor
12/12/11 12:26:22 INFO compress.CodecPool: Got brand-new decompressor
12/12/11 12:26:22 INFO compress.CodecPool: Got brand-new decompressor
12/12/11 12:26:22 INFO compress.CodecPool: Got brand-new decompressor
12/12/11 12:26:22 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread started: Thread for merging on-disk files
12/12/11 12:26:22 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread started: Thread for merging in memory files
12/12/11 12:26:22 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread waiting: Thread for merging on-disk files
12/12/11 12:26:22 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Need another 1 map output(s) where 0 is already in progress
12/12/11 12:26:22 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Thread started: Thread for polling Map Completion Events
12/12/11 12:26:22 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
12/12/11 12:26:23 INFO mapred.JobClient: map 100% reduce 0%
12/12/11 12:26:28 INFO mapred.LocalJobRunner: reduce > copy >
12/12/11 12:26:31 INFO mapred.LocalJobRunner: reduce > copy >
12/12/11 12:26:37 INFO mapred.LocalJobRunner: reduce > copy >
But then the last three lines repeat indefinitely (I left it overnight...), with the two lines:
12/12/11 12:27:22 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Need another 1 map output(s) where 0 is already in progress
12/12/11 12:27:22 INFO mapred.ReduceTask: attempt_local_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
repeating every twelve lines.
I'm not sure whether there's something wrong with my input, or whether the tiny size of the trial data is messing things up. Any help and/or advice on the best way to go about this would be much appreciated.
p.s. I was trying to follow the instructions from https://www.box.com/s/041rdjeh7sny128r2uki
This is really a Hadoop or cluster issue. The reducer is waiting on mapper output that is not coming. Look for earlier failures in the mapping phase.
