Change Source Directory in ClickHouse

I'm trying to change /var/lib/clickhouse to something like /mnt/sdc/clickhouse so that I can keep ClickHouse data on another hard disk. I've tried these steps:
1. Stop ClickHouse
2. Move the directory /var/lib/clickhouse to /mnt/sdc/clickhouse
3. Replace every occurrence of /var/lib/ with /mnt/sdc/ in /etc/clickhouse-server/config.xml
4. Start ClickHouse
But the problem is that /var/lib/clickhouse contains hard links, so when I mv the directory, those hard links break.
Is this OK or not?
How should I change the ClickHouse directory?

To copy files while preserving hard links, you can use rsync with the --hard-links (-H) option. For your setup, you should be able to run the following:
rsync -a -H /var/lib/clickhouse/ /mnt/sdc/clickhouse
Note the trailing slash after the first directory to copy the directory contents rather than the directory itself.
Then, as you mentioned, update the /var/lib/ paths to /mnt/sdc/ in /etc/clickhouse-server/config.xml, and restart ClickHouse with systemctl restart clickhouse-server.
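If you want to script the config change, a sed one-liner along these lines should do it (back up the file first, and if you have overrides under /etc/clickhouse-server/config.d/ that reference the old path, update those too):
sudo cp /etc/clickhouse-server/config.xml /etc/clickhouse-server/config.xml.bak
sudo sed -i 's|/var/lib/clickhouse|/mnt/sdc/clickhouse|g' /etc/clickhouse-server/config.xml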
I was able to follow these steps to migrate ClickHouse data to a new disk mount using rsync, and ClickHouse restarted successfully using the new disk (ClickHouse v22.3 on Ubuntu 18.04).

Related

How to execute a bash file containing curl instructions within a Google storage bucket and directly copy the contents to the bucket?

I have a bash script file where each line is a curl command to download a file. This bash file is in a Google bucket.
I would like to either execute the file directly from the storage and copy its downloaded contents there, or execute it locally and directly copy its contents to the bucket.
Basically, I do not want to have these files on my local machine. I have tried things along these lines, but it either failed or simply downloaded everything locally.
gsutil cp gs://bucket/my_file.sh - | bash gs://bucket/folder_to_copy_to/
Thank you!
To do so, the bucket needs to be mounted on the pod (the pod would see it as a directory).
If the bucket supports NFS, you would be able to mount it as shown here.
Also, there is another way, as shown in this question.
Otherwise, you would need to copy the script to the pod, run it, then upload the generated files to the bucket, and lastly clean everything up (see the sketch at the end of this answer).
A better option is to use Filestore, which can be easily mounted using CSI drivers as mentioned here.
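If you take the copy-run-upload route, a rough sketch (assuming the curl commands in my_file.sh write into the current working directory, and reusing the bucket paths from the question) could look like:
gsutil cp gs://bucket/my_file.sh .
mkdir downloads
(cd downloads && bash ../my_file.sh)     # run the curl commands in a scratch directory
gsutil -m cp -r downloads/* gs://bucket/folder_to_copy_to/
rm -rf downloads my_file.sh              # clean up the local copies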

Is it possible to set read-only for myself on unix?

I've been given the path to a very large folder on a shared Unix server that I work on through ssh. I don't want to waste space by creating a duplicate in my home area, so I've linked the folder with ln -s. However, I don't want to risk making any changes to the data within the folder.
How would I go about setting the files to read-only for myself? Do I have to ask the owner of the folder/file? Do I need sudo access? I am not the owner of the file and I do not have root access.
Read about the chmod command to change the permissions on the files the links point to.
The owner or root can restrict access to files.
Also, you probably need to mount that shared folder as read-only, but I am not sure how your folder is connected.
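For example (a sketch with a hypothetical path, to be run by the data owner or root rather than by you):
chmod -R go-w /path/to/shared/folder    # group and others lose write access to everything below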
UPDATE
The desired behaviour can be achieved using the mount tool (see the mount man page).
Note that the filesystem mount options will remain the same as those on the original mount point, and cannot be changed by passing the -o option along with --bind/--rbind. The mount options can be changed by a separate remount command, for example:
mount --bind olddir newdir
mount -o remount,ro newdir
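To confirm the remount took effect, something like this should work (findmnt is part of util-linux and available on most Linux systems):
findmnt -o TARGET,OPTIONS newdir    # the OPTIONS column should now include ro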
Here is a similar question to yours, also solved via the mount tool.

How can I safely move Elasticsearch indices to another mount in Linux?

I have a number of indices which are causing space issues on my Ubuntu machine at the moment. The indices keep growing on a daily basis.
So I thought of moving them to another mount point which has more space. How can I do this safely?
And I have to make sure that the existing ES indices and the Kibana graphs remain intact after the move.
What I did: I followed this SO answer and moved my Elasticsearch data directory to the directory I needed (/data/es_data), but after I did that, I couldn't see my existing indices, nor the Kibana graphs and dashboards I had created.
Am I doing something wrong? Any help would be appreciated.
FWIW, if it were me, I would stop Elasticsearch and Kibana (and Logstash, if this is the only Elasticsearch node in the cluster), then move the old data dir to a new location out of the way:
sudo mv /var/lib/elasticsearch /var/lib/elasticsearch-old
Then set up the new volume (which should be at least 15% larger than the size of the indexes you have on disk, as Elasticsearch won't create new indexes on a disk with less than 15% free space) with a filesystem, find out its UUID, and get ready to mount it:
sudo fdisk /dev/sdX # New volume, use all the space
sudo mkfs.ext4 /dev/sdX1
ls -la /dev/disk/by-uuid/ | grep sdX1 # The symlinks point to ../../sdX1; or skip the grep and look for it manually
Then add the following to your /etc/fstab, replacing with the UUID from previous command:
UUID=<RESPONSE> /var/lib/elasticsearch ext4 defaults 0 0
Make the new directory, since the old one is gone. It probably wants chowning (I assume the owner should be elasticsearch, but you can confirm by checking ownership of the old folder), and you want to copy the content across from the old one:
sudo mkdir /var/lib/elasticsearch
sudo mount -a  # mount the new volume defined in the fstab entry above
sudo chown -R elasticsearch: /var/lib/elasticsearch
sudo cp -rp /var/lib/elasticsearch-old/* /var/lib/elasticsearch
Once everything has finished copying across, you should be able to start Elasticsearch back up. It should find the indexes, as they haven't moved, so the config doesn't need updating.
Once you're happy that everything is working, you can delete /var/lib/elasticsearch-old and reclaim your space. Failing that, you can revert to the old data and it should continue to work.
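As a final check (assuming the services are managed by systemd and Elasticsearch listens on its default port), something like this should confirm the move worked:
sudo systemctl start elasticsearch kibana
df -h /var/lib/elasticsearch                # the data dir should now sit on the new volume
curl -s 'localhost:9200/_cat/indices?v'     # the existing indices should all be listed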

Rsync create symbolic links only

I currently have rsync working well. It copies all my files from one directory to another directory. The only thing is that it physically copies the files.
I have a lot of large files and I don't want duplicates of all of them. I just want to create symbolic links in the new directory so that I can serve the data on a webpage. The source directory has some scripts and files I don't want the public to see, so I'm moving the safe data to the web root (destination).
What I would like rsync to do is create links in the destination for any new files in the source directory. That way I am not using up my hard drive space like I currently am. What I have works perfectly except for the symbolic-link aspect. Is there a way to have rsync track and create symbolic links?
rsync -aP --exclude="file.sql" --exclude="*~" --exclude=".*" --exclude="*.sh" . ${destination}
It's not a symlink, but you might be able to work with --link-dest=DIR. It creates a hard link, which gives the same file a new name. This will behave similarly to a softlink as long as (see the sketch after this list):
Both files are on the same filesystem
You don't plan to delete the original and not the copy (the symlink would break but a hard-link won't)
You don't have anything explicitly checking to see if it's a softlink
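For example, adapting the command from the question (a sketch only: --link-dest must be an absolute path, which is why $(pwd) is used for the current source directory, and source and destination must be on the same filesystem):
rsync -aP --link-dest="$(pwd)" --exclude="file.sql" --exclude="*~" --exclude=".*" --exclude="*.sh" . ${destination}
Unchanged files then show up in ${destination} as hard links to the originals rather than as full copies.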
You could use cp -aR -s (Linux or FreeBSD) or cp --archive --recursive --symbolic-link (Linux) to create symbolic links to the source files in the destination directory instead of copies. Note that -s is non-standard.
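For example (hypothetical paths; with GNU cp the source must be given as an absolute path, otherwise the created symlinks won't resolve):
cp -as /absolute/path/to/source/. ${destination}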
Could lndir be useful to you? According to its manual, it creates a shadow directory of symbolic links to another directory tree.
I think master_delivery is probably the best tool for this. With the already-mentioned --link-dest option of rsync, files which are not the same will still be copied. If you don't mind copies and hard links being mixed, you can use rsync, but if you want to eliminate duplicates completely, use master_delivery.
Usage is:
gem install master_delivery
master_delivery -m <path_to_master> -d <path_to_delivery_root>

Rsync bash script and hard linking files

I am creating a bash script to backup my files with rsync.
Backups all come from a single directory.
I only want new or modified files to be backed up.
Currently, I am telling rsync to backup the dir, and to check the files compared to the last backup.
The way I am doing this is
THE_TIME=`date "+%Y-%m-%dT%H:%M:%S"`
rsync -aP --link-dest=/Backup/Current /usr/home/user/backup /Backup/Backup-$THE_TIME
rm -f /Backup/Current
ln -s /Backup/Backup-$THE_TIME /Backup/Current
I am pretty sure I have the syntax correct for this. Each backup will check against the "Current" folder, and upload only as necessary. It will then delete the Current symlink and re-create it, pointing to the newest backup it just made.
I am getting an error when I run the script:
rsync: link "/Backup/Backup-2010-08-04-12:21:15/dgs1200series_manual_310.pdf"
=> /Backup/Current/dgs1200series_manual_310.pdf
failed: Operation not supported (45)
The host OS is running the HFS filesystem, which supports hard linking. I am trying to figure out whether something else is not supporting this, or whether I have a problem in my code.
Thanks for any help
Edit:
I am able to create a hard link on my local machine.
I am also able to create a hard link on the remote server (when logged in locally)
I am NOT able to create a hard link on the remote server when mounted via afp. Even if both files exist on the server.
I am guessing this is a limitation of afp.
Just in case your command line is only an example: Be sure to always specify the link-dest directory with an absolute pathname! That’s something which took me quite some time to figure out …
Two things from the man page stand out that are worth checking:
If files aren't linking, double-check their attributes. Also
check if some attributes are getting forced outside of rsync's
control, such as a mount option that squishes root to a single
user, or mounts a removable drive with generic ownership (such
as OS X's “Ignore ownership on this volume” option).
and
Note that rsync versions prior to 2.6.1 had a bug that could
prevent --link-dest from working properly for a non-super-user
when -o was specified (or implied by -a). You can work-around
this bug by avoiding the -o option when sending to an old rsync.
Do you have the "ignore ownership" option turned on? What version of rsync do you have?
Also, have you tried manually creating a similar hardlink using ln at the command line?
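For example, a quick manual test on the mounted destination (paths are hypothetical) would tell you whether the mount itself allows hard links:
touch /Volumes/backup/linktest
ln /Volumes/backup/linktest /Volumes/backup/linktest.hl && echo "hard links work here"
rm /Volumes/backup/linktest /Volumes/backup/linktest.hl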
I don't know if this is the same issue, but I know that rsync can't sync a file when the destination is a FAT32 partition and the filename has a ":" (colon) in it. [The source filesystem is ext3, and the destination is FAT32]
Try reconfiguring the date command so that it doesn't use a colon and see if that makes a difference.
e.g.
THE_TIME=`date "+%Y-%m-%dT%H_%M_%S"`
