Comparing files using Ansible

I need a way to compare the Elasticsearch template folders across a given Elasticsearch data host group. Meaning, if the directory is /usr/local/elasticsearch/config/templates/, I need to make sure all the files inside that directory are the same on every host in that Ansible host group:
no extra template files and no template files of a different version. I haven't been able to figure out how to do this.

Try combining ansible with rsync dry-run using the shell module:
ansible -i production data_hosts -l '!~host1' -f 1 -m shell \
-a 'rsync --checksum --delete --dry-run -r -v host1.example.com:/usr/local/elasticsearch/config/templates/ /usr/local/elasticsearch/config/templates'
Explanation
Compares the /usr/local/elasticsearch/config/templates directory on all hosts in the [data_hosts] group to host1
Excludes host1.example.com using the -l limit argument: -l '!~host1'
Uses -f 1 to only run one compare at a time. Optional, but helped in my case because the directories contained large numbers of files (>10K)
Uses --dry-run to prevent rsync from actually syncing the directories
Uses --delete to list extraneous files in the destination directory
Uses --checksum to compare files based on checksum rather than mod-time and size
Notes
You could modify this one-liner to perform the sync by removing --dry-run. Consider adding -z to compress file data during transfer, and -a for archive mode.
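For example, the syncing variant might look like this (a sketch based on the flags above; -a already implies -r, and you would still want to review the dry-run output first):
ansible -i production data_hosts -l '!~host1' -f 1 -m shell \
-a 'rsync --checksum --delete -avz host1.example.com:/usr/local/elasticsearch/config/templates/ /usr/local/elasticsearch/config/templates'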
The production inventory file would look like this:
[data_hosts]
host1.example.com
host2.example.com
host3.example.com
host4.example.com

I did it by first comparing the number of files under the template folder on all hosts in the given group, then getting the list of files and their respective md5sum values and exporting them to the current playbook using include_vars. Then I compared each file's md5sum against the exported values using include_vars and with_items.
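A quicker ad-hoc variant of the same idea (a sketch, not the playbook described above; it assumes md5sum is available on the data hosts) is to dump a sorted checksum list per host and compare the per-host output blocks with diff or by eye:
ansible -i production data_hosts -m shell \
-a 'cd /usr/local/elasticsearch/config/templates && find . -type f | sort | xargs md5sum'
Any host whose block differs from the others has extra, missing, or modified template files.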

Related

dockerfile copy list of files, when list is taken from a local file

I've got a file containing a list of paths that I need to copy with the Dockerfile COPY command during docker build.
My use case: I've got a Python requirements.txt file which references multiple other requirements files inside the project with -r PATH.
Now, I want to COPY all the requirements files alone, run pip install, and then copy the rest of the project (for layer caching and such). So far I haven't managed to do so with the COPY command.
I don't need help fetching the paths from the file - I've managed that - just whether it's possible, and if so, how?
Thanks!
It's not possible in the sense that the COPY directive supports it out of the box; however, if you know the file names or extensions you can use a wildcard in the path, such as COPY folder*something*name somewhere/.
For simple requirements.txt fetching that could be:
# but you need to distinguish it somehow
# otherwise it'll overwrite the files and keep the last one
# e.g. rename package/requirements.txt to package-requirements.txt
# and it won't be an issue
COPY */requirements.txt ./
RUN for item in $(ls *requirements*.txt); do pip install -r "$item"; done
But if it gets a bit more complex (collecting only specific files by some custom pattern, etc.), then no. For that case, use templating instead: either a simple f-string, the format() function, or Jinja. Create a Dockerfile.tmpl (or whatever you want to name the temporary file), collect the paths, insert them into the templated Dockerfile, dump the result to a file, and then run docker build on it.
Example:
# Dockerfile.tmpl
FROM alpine
{{replace}}
# organize files into coherent structures so you don't have too many COPY directives
files = {
    "pattern1": [...],
    "pattern2": [...],
    # ...
}

with open("Dockerfile.tmpl", "r") as file:
    text = file.read()

insert = "\n".join([
    f"COPY {' '.join(values)} destination/{key}/"
    for key, values in files.items()
])

with open("Dockerfile", "w") as file:
    file.write(text.replace("{{replace}}", insert))
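Once the generated Dockerfile has been written, build from it as usual (hypothetical image name):
docker build -t myimage .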
Alternatively, you might pass the list in as a build argument, for example:
FROM ...
ARG files
COPY ${files} ./
and run with
docker build --build-arg files="$(cat list_of_files_to_copy.txt)" .

scp, inconsistency for file structure preservation

My task: collect log files from several servers.
Server file structure: "/remote/path/dir/sub-dirs/files.log", which is the same on all servers. (All servers have the same set of "sub-dirs", though some may be absent, and of course the "files.log" names differ.)
Local file structure: "/local/path/logs"
After the copy I would like to have "/local/path/logs/dir/sub-dirs/files.log"
Method (in a while loop over the servers): scp -r $SERVERS:/remote/path/dir /local/path/logs
Problem: For reasons I don't understand, the first scp command ignores the "dir" folder and I get "/local/path/logs/sub-dirs/files.log", but the following scp commands give me what I intended: "/local/path/logs/dir/sub-dirs/files.log"
Why is this happening and how should I fix/get around it?
Thanks!
Why is this happening [...]
In the command scp -r path/to/source dest:
If dest doesn't exist, the dest directory will be created, and path/to/source/* will be copied into it. For example if you have path/to/source/X then dest/X will be created.
If dest is an existing directory, then dest/source will be created, and path/to/source/* will be copied into it. For example if you have path/to/source/X then dest/source/X will be created.
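A quick illustration of the difference (hypothetical host name "server"):
scp -r server:/remote/path/dir /local/path/logs   # logs does not exist yet -> /local/path/logs/sub-dirs/...
scp -r server:/remote/path/dir /local/path/logs   # logs already exists     -> /local/path/logs/dir/sub-dirs/...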
[...] and how should I fix/get around it?
Create dest in advance, for example:
mkdir -p /local/path/logs
scp -r $SERVERS:/remote/path/dir /local/path/logs
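Putting it together with the loop from the question (a sketch; SERVERS is assumed to be a space-separated list of hostnames):
mkdir -p /local/path/logs
for server in $SERVERS; do
    scp -r "$server:/remote/path/dir" /local/path/logs
done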

Split a folder which has hundreds of subfolders, each having a few files, into one more level of subfolders using shell

I have a following data dir:
root/A/1
root/A/2
root/B/1
root/B/2
root/B/3
root/C/1
root/C/2
And I want to convert it into following file structure:
root2/I/A/1
root2/I/A/2
root2/I/B/1
root2/I/B/2
root2/I/B/3
root2/II/C/1
root2/II/C/2
The purpose is that I want to run a script which takes a home folder (root here) and operates on it, and I want to run it in parallel on many folders (I, II) to speed up the process.
A simplifying assumption about file and folder names: all are alphanumeric, with no periods or underscores.
Edit: I tried the following:
for i in `seq 1 30`; do mkdir -p "root2/folder$i"; find root -type f | head -n 4000 | xargs -i cp "{}" "root2/folder$i"; done
The problem is that it creates something like the following, which is not what I wanted.
root2/I/1
root2/I/2
root2/I/1
root2/I/2
root2/I/3
root2/II/1
root2/II/2
You may wish to use a lesser-known command called dirsplit, the usual application of which is to split a directory into multiple directories for burning purposes.
Use it like this:
dirsplit -m -s 300M /root/ -p /backup/folder1
The options mean the following:
-m|--move Move files to target dirs
-e 2 Special exploration mode; 2 means files in a directory are kept together
-p Prefix to be attached to each directory created (in your case I, II, etc.)
-s Maximum size allowed for each new folder created
For more information see:
dirsplit -H
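If dirsplit isn't available, a plain-shell sketch of the same idea (hypothetical names root and root2, distributing the top-level folders round-robin into two groups while preserving the A/1 structure) could be:
#!/bin/bash
# Distribute root's top-level subdirectories across two group folders.
n=2
i=0
for d in root/*/; do
    group="root2/group$(( i % n + 1 ))"
    mkdir -p "$group"
    cp -r "${d%/}" "$group/"
    i=$(( i + 1 ))
done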

Finding and Removing Unused Files Through Command Line

My website's file structure has gotten very messy over the years from uploading random files to test different things out. I have a list of all my files such as this:
file1.html
another.html
otherstuff.php
cool.jpg
whatsthisdo.js
hmmmm.js
Is there any way I can input my list of files via the command line, search the contents of all the other files on my website, and output a list of the files that aren't mentioned anywhere in my other files?
For example, if cool.jpg and hmmmm.js weren't mentioned in any of my other files then it could output them in a list like this:
cool.jpg
hmmmm.js
Any of the other files mentioned above wouldn't be listed, because they are mentioned somewhere in another file. Note: I don't want it to just automatically delete the unused files, I'll do that manually.
Also, of course I have multiple folders so it will need to search recursively from my current location and output all the unused (unreferenced) files.
I'm thinking the command line would be the fastest/easiest way, unless someone knows of another. Thanks in advance for any help that you guys can give!
Yep! This is pretty easy to do with grep. In this case, you would run a command like:
$ for orphan in $(cat orphans.txt); do \
    echo "Checking for presence of ${orphan} in current directory..." ;
    grep -rl "$orphan" . ; done
orphans.txt would look like your list of files above, one file per line. You can add -i to the grep above if you want to match case-insensitively, and you would want to run that command in /var/www or wherever your distribution keeps its webroot. If a "Checking for..." line is followed by no matches, nothing references that file.
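To get just the list of unreferenced files (the output format asked for in the question), a small variant like this sketch could work; it assumes the file list is in orphans.txt and is run from the webroot:
while read -r f; do
    grep -rqF --exclude="$f" "$f" . || echo "$f"
done < orphans.txt
The --exclude keeps a file from counting as a reference to itself, and -F treats each name as a literal string rather than a regex.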

Unable to create the md5sum file I need to create. Manually doing it would be far too labour-intensive

I need to create/recreate an md5sum file for all files in a directory and all files in all sub-directories of that directory.
I am using a rockettheme template that requires a valid md5sum document and I have made changes to the files, so the originally included md5sum file is no longer valid.
There are over 300 files that need to be checksummed, with the MD5 hashes added to a single file.
The basic structure of the file is as follows:
1555599f85c7cd6b3d8f1047db42200b admin/forms/fields/imagepicker.php
8a3edb0428f11a404535d9134c90063f admin/forms/fields/index.html
8a3edb0428f11a404535d9134c90063f admin/forms/index.html
8a3edb0428f11a404535d9134c90063f admin/index.html
8a3edb0428f11a404535d9134c90063f admin/presets/index.html
b6609f823ffa5cb52fc2f8a49618757f admin/presets/preset1.png
7d84b8d140e68c0eaf0b3ee6d7b676c8 admin/presets/preset2.png
0de9472357279d64771a9af4f8657c2a admin/presets/preset3.png
5bda28157fe18bffe11cad1e4c8a78fa admin/presets/preset4.png
2ff2c5c22e531df390d2a4adb1700678 admin/presets/preset5.png
4b3561659633476f1fd0b88034ae1815 admin/presets/preset6.png
8a3edb0428f11a404535d9134c90063f admin/tips/index.html
2afd5df9f103032d5055019dbd72da38 admin/tips/overview.xml
79f1beb0ce5170a8120ba65369503bdc component.php
caf4a31db542ca8ee63501b364821d9d css/grid-responsive.css
8a3edb0428f11a404535d9134c90063f css/index.html
8697baa2e31e784c8612e2c56a1cd472 css/master-gecko.css
0857bc517aa15592eb796553fd57668b css/master-ie10.css
a4625ce5b8e23790eacb7704742bf735 css/master-ie8.css
This is just a snippet, but the logic is there.
hash path/to/file/relative/to/MD5SUM_file
Can anyone help me write a shell script (bash shell) that I can add to my path that will execute and generate a file called "MD5SUM_new"? I want the output file name to be "MD5SUM_new" so I can review the content before issuing a mv MD5SUM_new MD5SUM
FYI, the MD5SUM_new file needs to be saved in the root level of the template.
Thanks
This is quite easy, really. To hash all files under the current directory:
find . -type f | xargs md5sum > md5sums
Then, you can make sure it's correct:
md5sum -c md5sums
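To produce the exact file the question asks for, a slightly more careful variant (a sketch; run it from the template root) excludes the checksum file itself, handles file names with spaces, and strips the leading ./ so the paths match the original format:
find . -type f ! -name 'MD5SUM*' -print0 | xargs -0 md5sum | sed 's| \./| |' > MD5SUM_new
After reviewing MD5SUM_new you can mv MD5SUM_new MD5SUM as planned, and verify it later with md5sum -c MD5SUM.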
