I'm new to Bash and I need help.
I need to create a shell script that compares two gzipped archives. For each file or directory in each archive (including those in archived subdirectories), the script shall check whether a file or directory of the same name exists in the other archive. If a directory is missing, missing files or subdirectories within that directory shall be ignored. The script shall list the names of all files which do not have a matching counterpart in the other archive.
The expected output when comparing archives arch1.tar.gz and arch2.tar.gz, where the differing files are aa/a.txt and bb/b.txt in arch1.tar.gz and c.txt in arch2.tar.gz:
arch1.tar.gz:aa/a.txt
arch1.tar.gz:bb/b.txt
arch2.tar.gz:c.txt
Here's what I have:
#!/bin/bash
# list the contents of each archive
tar tf "$1" > list1.txt
tar tf "$2" > list2.txt
# entries present only in the first archive
comm -23 <(sort -u list1.txt) <(sort -u list2.txt)
diff list1.txt list2.txt >> contestboth
The thing is that I can't figure out how to produce the required output.
Try this:
diff <(sort -u list1.txt) <(sort -u list2.txt)
With this, two subprocesses are started (the two sort commands) and their output is associated with file descriptors. The <(...) syntax expands to a file name representing such a file descriptor (something like /dev/fd/63). So in the end, diff is called with two files which, when read, (appear to) contain the output of the two processes.
This method works fine for programs which read a file strictly linearly. Seeking in the "file" is not possible, of course.
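Applied to the original question, a rough sketch could look like the following (assuming the two archive names are passed as $1 and $2; the rule about ignoring files inside a missing directory is not handled here):
#!/bin/bash
# List entries unique to each archive, prefixed with the archive name.
a1=$1
a2=$2
comm -23 <(tar tf "$a1" | sort -u) <(tar tf "$a2" | sort -u) | sed "s|^|$a1:|"
comm -13 <(tar tf "$a1" | sort -u) <(tar tf "$a2" | sort -u) | sed "s|^|$a2:|"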
Related
I have a large number of files that are found in three different directories. For some of these files, a file with an identical name exists in another directory. Other files exist in only one directory. I'd like to use bash to copy all of the files from the three directories to a single new directory, but for files with identically named files in more than one directory I want to concatenate the file contents across directories before saving to the new directory.
Here's an example of what my file structure looks like:
ls dir1/
file1.txt
file2.txt
file4.txt
ls dir2/
file2.txt
file5.txt
file6.txt
file9.txt
ls dir3/
file2.txt
file3.txt
file4.txt
file7.txt
file8.txt
file10.txt
Using this example, I'd like to produce a new directory that contains file1.txt through file10.txt, but with the contents of identically named files (e.g. file2.txt, file4.txt) concatenated in the new directory.
I have a list of all the unique file names contained in my three directories (a single instance of each unique name). So far, I have come up with code that takes the file names from one directory and concatenates those files with identically named files in a second directory, but I'm not sure how to use my list of file names as the reference for concatenating and saving the files (instead of the output of ls in the first directory). Any ideas for how to modify this? Thanks very much!
PATH1='/path/to/dir1'
PATH2='/path/to/dir2'
PATH3='/path/to/dir3'
mkdir dir_new
ls "$PATH1" | while read -r FILE; do
cat "$PATH1/$FILE" "$PATH2/$FILE" "$PATH3/$FILE" >> ./dir_new/"$FILE"
done
You can do it like this:
mkdir -p new_dir
for f in path/to/dir*/*.txt; do
cat "$f" >> "new_dir/${f##*/}"
done
This is a common use for substring removal with parameter expansion, in order to use only the basename of the file to construct the output filename.
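For illustration, with a hypothetical path:
f=path/to/dir2/file2.txt
echo "${f##*/}"   # prints "file2.txt": removes the longest prefix matching "*/"
echo "${f%/*}"    # prints "path/to/dir2": removes the shortest suffix matching "/*"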
Or you can use a find command to get the files and execute the command for each one:
find path/to/dir* -type f -name '*.txt' -print0 |\
xargs -0 -n1 sh -c 'cat "$0" >> new_dir/"${0##*/}"'
In the above command, the filenames produced by find are preserved with zero separation (-print0), and xargs also accepts a zero-separated list (-0). For each argument (-n1) the following command is executed. We call sh -c 'command' for convenience, so the substring removal can be used inside it; the argument provided by xargs is accessible there as $0.
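A quick illustration of the $0 mechanism, using a hypothetical path:
sh -c 'echo "got $0, basename is ${0##*/}"' path/to/dir1/file1.txt
# prints: got path/to/dir1/file1.txt, basename is file1.txt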
I have two files: one with file names and one with directory names. I want to map an mv command from each row of the file-names file to the corresponding row in the directory-names file. The files are small.
If it helps, I have files named sequentially (say f1, f2, f3...f1000). Is there any way to do it in a loop, reading one file and one directory at a time?
There can be 3 use cases: one file to many directories, many files to one directory, and many files to many directories (1 file/line = 1 dir/line in my case). My use case is the last one. I have seen xargs being used for some of these use cases, but I am not sure how to adapt it to mine.
The following questions do not help: Moving large number of files
Assuming you have a file named files that contains file names, one per line, and a file named dirs with directories, one per line, both having the same number of entries, e.g.:
files
file1
file2
file3
dirs
dir1
dir2
dir3
Then to move file1 to dir1, file2 to dir2 and so on you can use the command:
paste dirs files | xargs -n2 mv -t
paste joins the lines from both files, then xargs takes two arguments at a time and calls the mv command with them. The -t option selects the destination directory. Below is the relevant fragment from the mv documentation.
mv [OPTION]... -t DIRECTORY SOURCE...
-t, --target-directory=DIRECTORY
move all SOURCE arguments into DIRECTORY
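To illustrate with the example files above, paste produces one directory/file pair per line, and xargs -n2 turns each pair into one mv invocation:
$ paste dirs files
dir1	file1
dir2	file2
dir3	file3
# xargs -n2 mv -t then effectively runs:
#   mv -t dir1 file1
#   mv -t dir2 file2
#   mv -t dir3 file3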
I have files in a directory like this:
file3.proto
file2.proto
file1.proto
I want to delete file1 and file2; the highest number is the latest file, which I don't want to delete. How can I achieve this in a shell script?
The command below does the job, but I want it to be more dynamic. I don't want to change the shell script every time the number increments; for example, if the latest file is file4, I have to change the range to 1..3.
ls | grep '.proto' | rm file{1..2}.proto
ls *.proto | head -n -1 | xargs rm
which with these files
file1.proto
file2.proto
file3.proto
executes the command
rm file1.proto file2.proto
UPDATE: Be warned that the ls command outputs files in alphabetical order, which is not numerical order. For example, if you also have a file25.proto, you'll get this output from ls:
file1.proto
file25.proto
file2.proto
file3.proto
So it would be better (if possible) to rename the files like file001.proto, with the padding depending on the maximum possible number of files in the folder. This is a common issue with file name ordering...
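Alternatively, if your sort supports version sorting (the -V option in GNU coreutils), a sketch that keeps the numerically highest file without renaming anything could be:
ls *.proto | sort -V | head -n -1 | xargs rm
# sort -V orders file2.proto before file25.proto; this is still unsafe for
# file names containing whitespace or newlines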
I have a directory of files with a structure like below:
./DIR01/2019-01-01/Log.txt
./DIR01/2019-01-01/Log.txt.1
./DIR01/2019-01-02/Log.txt
./DIR01/2019-01-03/Log.txt
./DIR01/2019-01-03/Log.txt.1
...
./DIR02/2019-01-01/Log.txt
./DIR02/2019-01-01/Log.txt.1
...
./DIR03/2019-01-01/Log.txt
...and so on.
Each DIRxx directory has a number of subdirectories named by date, which themselves contain a number of log files that need to be concatenated. The number of text files to concatenate varies, but could theoretically be as many as 5. I would like the following command to be performed for each set of files within the dated directories (note that the files must be concatenated in reverse order):
cd ./DIR01/2019-01-01/
cat Log.txt.4 Log.txt.3 Log.txt.2 Log.txt.1 Log.txt > ../../Log.txt_2019-01-01_DIR01.txt
(I understand the above command will give an error that certain files do not exist, but cat will still do what I need of it anyway.)
Aside from cding into each directory and running the above cat command, how can I write this as a Bash shell script?
If you just want to concatenate all files in all subdirectories whose name starts with Log.txt, you could do something like this:
for dir in DIR*/*; do
date=${dir##*/};
dirname=${dir%%/*};
cat "$dir"/Log.txt* > Log.txt_"${date}"_"${dirname}".txt;
done
If you need the files in reverse numerical order, from 5 to 1 and then Log.txt, you can do this:
for dir in DIR*/*; do
date=${dir##*/};
dirname=${dir%%/*};
cat "$dir"/Log.txt.{5..1} "$dir"/Log.txt > Log.txt_"${date}"_"${dirname}".txt;
done
That will, as you mention in your question, complain for files that don't exist, but that's just a warning. If you don't want to see that, you can redirect error output (although that might cause you to miss legitimate error messages as well):
for dir in DIR*/*; do
date=${dir##*/};
dirname=${dir%%/*};
cat "$dir"/Log.txt.{5..1} "$dir"/Log.txt > Log.txt_"${date}"_"${dirname}".txt;
done 2>/dev/null
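As a side note, the {5..1} brace expansion produces the names in reverse order regardless of whether the files exist, which is why cat complains about the missing ones:
echo Log.txt.{5..1}
# Log.txt.5 Log.txt.4 Log.txt.3 Log.txt.2 Log.txt.1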
Not as comprehensive as the other answer, but quick and easy. Use find, sort the output however you like (-zrn is --zero-terminated --reverse --numeric-sort), then iterate over it with read.
find . -type f -print0 |
sort -zrn |
while read -rd ''; do
cat "$REPLY";
done >> log.txt
Hey guys, I used the zip command, but I only want to archive all the files except *.txt. For example, if there are two dirs, file1 and file2, and both of them contain some *.txt files, I want to archive only the non-text files from file1 and file2.
tl;dr: how do I tell Linux to give me all the files that don't match *.txt?
$ zip -r zipfile -x'*.txt' folder1 folder2 ...
Move to your desired directory and run:
ls | grep -P '\.(?!txt$)' | zip -# zipname
This will create a zipname.zip file containing everything but .txt files. In short, what it does is:
List all files in the directory, one per line (this can be forced with the -1 option, but it is not needed here because one-per-line is the default when the output is not a terminal; here it is a pipe).
Extract from that all lines that do not end in .txt. Note that grep is using a Perl regular expression (option -P) so that a negative lookahead can be used.
Zip the list read from stdin (-#) into the zipname file.
Update
The first method I posted fails for files with two dots in the name, as I described in the comments. For some reason, though, I forgot about grep's -v option, which prints only the lines that do not match the regex. Plus, go ahead and include the case-insensitive option.
ls | grep -vi '\.txt$' | zip -# zipname
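For reference, here is roughly why the first variant lets a name with two dots through (the file name is hypothetical):
echo 'notes.backup.txt' | grep -P '\.(?!txt$)'
# prints the line: the first dot is followed by "backup.txt", not by "txt" at the
# end of the line, so the negative lookahead succeeds and the file would be zipped
echo 'notes.backup.txt' | grep -vi '\.txt$'
# prints nothing: the corrected command filters it out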
Simple: use Bash's extended glob option like so:
#!/bin/bash
shopt -s extglob
zip -some -options !(*.txt)
Edit
This isn't as good as zip's built-in -x option, but this solution is generic and works with any command that lacks such a feature.
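For example, with extglob enabled the same pattern can be handed to other commands as well (the target directory here is just a placeholder):
shopt -s extglob
ls -d !(*.txt)                    # everything in the current directory except .txt files
cp -r !(*.txt) /some/backup/dir   # the same pattern passed to cp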