Compare the first n characters of two filenames in a shell script - bash

My requirement is:
localfolder = green.txt, yellow.txt, blue.txt
remotefolder = green_202105050333.txt, yellow_202105050333.txt, blue_202105050333.txt
I want to compare both folders like
IF [[ localfolder == remotefolder ]], e.g. green.txt = green_202105050333.txt (the condition should always look at the first characters of each file name, i.e. ignore the date & time part)
then
display the result as "matched".
Kindly help me with the logic here please.
Thanks in advance!!

You can do the following: create a listing file for each directory (folder), for example with ls:
# list the local directory
ls localfolder > localFolder.txt
# list the remote directory
ls remotefolder > remoteFolder.txt
Suppose the resulting files contain the following:
localFolder.txt:
green.txt
yellow.txt
blue.txt
remoteFolder.txt:
green_202105050333.txt
yellow_202105050333.txt
blue_202105050333.txt
gray_202105050333.txt
Now put both files in the same directory and run:
sed -r 's/\.[a-z]+//g' localFolder.txt | xargs -I{} grep {} remoteFolder.txt | sed -r 's/_[0-9]+//g'
The output will be:
green.txt
yellow.txt
blue.txt
If you want to do it as a script, create a file, for example myScript.sh, with the following content:
#!/bin/bash
sed -r 's/\.[a-z]+//g' "$1" | xargs -I{} grep {} "$2" | sed -r 's/_[0-9]+//g'
Give the script execute permission:
chmod +x myScript.sh
And execute it like this:
./myScript.sh localFolder.txt remoteFolder.txt
You will have the same result.
This script only works for your case; here is an explanation of what each part does:
# Remove the .txt (or any lowercase) extension from each line in localFolder.txt
sed -r 's/\.[a-z]+//g' localFolder.txt
# Each resulting name is passed to grep, which searches for it in remoteFolder.txt
| xargs -I{} grep {} remoteFolder.txt
# Remove the datetime suffix
| sed -r 's/_[0-9]+//g'
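If you prefer to keep the comparison in bash itself, here is a minimal sketch; it assumes localfolder and remotefolder are real directories reachable from the same shell (the directory names and the _202105050333-style suffix come from the question):
#!/bin/bash
# Sketch only: for every local file, check whether a remote file starts with
# the same base name, ignoring the trailing _<datetime> part.
for local in localfolder/*.txt; do
    base=$(basename "$local" .txt)                    # e.g. green
    if compgen -G "remotefolder/${base}_*.txt" > /dev/null; then
        echo "$base.txt matched"
    else
        echo "$base.txt not matched"
    fi
done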

Related

How to find many files from txt file in directory and subdirectories, then copy all to new folder

I can't find posts that help with this exact problem:
On Mac Terminal I want to read a txt file (example.txt) containing file names such as:
20130815 144129 865 000000 0172 0780.bmp
20130815 144221 511 000003 1068 0408.bmp
....100 more
And I want to search for them in a certain folder/subfolders (example_folder). After each find, the file should be copied to a new folder x (new_destination).
Your help would be much appreciated!
Cheers,
Mo
You could use a piped command with a combination of ls, grep, xargs and cp.
So basically you start by getting the list of files
ls
then you filter them with egrep -e, grep -e, or whatever flavor of grep the Mac terminal uses. If you want to find all files ending with .txt you can use the regex .txt$ (which means "ends with '.txt'").
ls | egrep -e "yourRegexExpression"
After that you have an input stream, but cp doesn't read from an input stream; it only takes arguments, which is why we use xargs to convert the stream into arguments. The final step is to add the -t flag so that the next argument is treated as the target directory.
ls | egrep -e "yourRegexExpression" | xargs cp -t DIRECTORY
I hope this helps!
Edit
Sorry, I didn't read the question well enough; I updated it to match your problem. Here you can see that the egrep command builds a rather large regex containing all the file names in the form (filename1|filename2|...|filenameN). The $() evaluates the command inside and uses tr to translate newlines to "|" for the regex.
ls | egrep -e "("$(cat yourtextfile.txt | tr "\n" "|")")" | xargs cp -t DIRECTORY
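If you would rather not build the regex by hand, a minimal sketch under the same assumptions (example.txt, example_folder and new_destination are the names from the question) lets grep -F read the file names as literal strings and lets find descend into the subfolders:
# -F -f treats each line of example.txt as a fixed string to search for;
# xargs -I takes one line per argument, so the spaces in the names survive.
find example_folder -type f | grep -F -f example.txt | xargs -I {} cp {} new_destination/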
You could do something like:
$ for i in `cat example.txt`; do
    find /search/path -type f -name "$i" -exec cp "{}" /new/path \;
  done
This is how it works, for every line within example.txt:
for i in `cat example.txt`
it will try to find a file matching the line $i in the defined path:
find /search/path -type f -name "$i"
And if found it will copy it to the desired location:
-exec cp "{}" /new/path \;

xargs - place the argument in a different location in the command

Let's say I want to write a script that does ls with a prefix for each filename. I tried this
ls | xargs -n 1 echo prefix_
But the result was
prefix_ first_file
prefix_ second_file
...
How can I remove the space between the prefix and the filename? I.e. how do I make xargs put the variable right after the command, without a space? (Or in any other place, for that matter)
The solution: -I
-I lets you name your argument and put it anywhere you like. E.g.
ls | xargs -n 1 -I {} echo prefix_{}
(replace {} with any string)
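For example (hypothetical commands, just to show the placement; backup/ is an assumed directory), the placeholder can sit at the end of an argument or appear more than once:
ls | xargs -I {} echo {}_suffix
ls | xargs -I {} cp {} backup/{}.bak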

Display filename of tar file

I would like to know how to display the filename along with the lines matching a specific word in a tar file.
Command wise :
zcat file | grep "stuff" -r # shows what I want
zcat *.gz | grep "stuff" -ar # this fails
You can use zgrep:
For a single file, append /dev/null so that grep sees more than one input and therefore prints the file name:
zgrep "stuff" file.gz /dev/null
For multiple files:
zgrep "stuff" *.gz
Maybe this related answer can help. It uses tar to untar (you would need to add -z) and pipes each file of the archive to awk for "grepping" inside it.
I'm not quite sure what the question is but if you are looking for tar files on your system then just do something like this. This will recursively search your current directory and any child directories for .tar files. Hope this helps.
find -name "*.tar"
If zcat file | grep "stuff" -r shows what you want, you can do this for multiple files:
for name in *.gz ; do zcat "$name" | grep -a "stuff" | sed -e "s/^/${name}: /" ; done
This command uses globbing (*) to expand to the list of .gz files in your working directory and then, for each of them, calls zcat for extraction, grep for the search, and sed to prefix every output line with the file name.
Note that if you are working with gzipped tarballs, most people give them a .tgz or .tar.gz instead of just .gz extension.
This will output nameOfFileInTar:LineNumber:Match. Invoke with greptar.sh tarfile.tar pattern
If you don't want the line number, remove the -n option. If you only want the line number, add |cut -f1 -d: after the grep
#!/bin/bash
TARFILE=$1
PATTERN=$2
tar ztf "$TARFILE" | while read -r FILE
do
    res=$(tar zxf "$TARFILE" "$FILE" -O | grep -n "$PATTERN")
    if [[ $? == 0 ]]; then
        echo "$res" | while read -r line; do
            echo "$FILE:$line"
        done
    fi
done
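For example, assuming the script is saved as greptar.sh (the name used in the invocation above) and the archive is gzip-compressed (the z in tar ztf/zxf assumes gzip; drop it for a plain .tar):
# hypothetical archive name: search every member of backups.tgz for "error"
./greptar.sh backups.tgz error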

How to know file name from a pipeline of commands

I search for some text in some file list. I have the following command to print these lines:
ls -1 *.log | xargs tail --lines=10000 | grep text_for_search
The command output contains all occurrences of text_for_search, but it gives no information about which file each occurrence comes from. How can I modify the command to provide this information too?
Actually the log files are gigabytes in size, so it's essential to use tail --lines=10000 on each of them.
You could just use a loop instead, which will keep track of the file name for you:
for file in *.log; do
if tail --lines=-10000 "$file" | grep -q text_for_search; then
echo "$file"
fi
done
The -q switch to grep suppresses the output, returning a 0 (success) exit code if the pattern is matched.
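If you also want to see the matching lines rather than only the file names, a variant sketch of the same loop prefixes each match with the file name (it assumes the file names contain no characters that are special to sed):
for file in *.log; do
    # print matches from the last 10000 lines, prefixed with the file name
    tail --lines=10000 "$file" | grep text_for_search | sed "s/^/$file: /"
done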
You can use find command:
find . -name "*.log" -exec grep text_for_search '{}' \;
grep will output filename and matched line. If you just need filenames - add -l switch to grep command.
'{}' is a placeholder that find substitutes with each matched file name in the -exec command,
\; marks the end of the arguments of the command invoked by -exec
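A variant sketch ends -exec with + instead of \; so that find passes many file names to a single grep invocation; grep then prints the file name before each match on its own as soon as it gets more than one file:
find . -name "*.log" -exec grep text_for_search '{}' +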
Replace your tail command with:
awk '{v[NR]=$0}END{for(i=NR-9999;i<=NR;i++)if(i>0)print FILENAME,v[i]}'
The above is just a replacement for the tail command, except that it adds the file name at the beginning of each line.
You should avoid parsing ls output; use the shell's for loop to iterate over all *.log files:
for f in *.log; do
awk -v c=$(wc -l < "$f") 'NR>c-10000 && /text_for_search/{print FILENAME ":" $0}' "$f"
done
EDIT:
You can use awk to search through all *.log files:
awk 'NR>=10000 && /text_for_search/ {print FILENAME ":" $0}' *.log

Search file name using a variable and replace with another variable

I have search string in one variable ($AUD_DATE) and replace string in another variable ($YEST_DATE). I need to search file name in a folder using $AUD_DATE and then replace it with $YEST_DATE.
I tried using this link to do it, but it's not working with variables.
Find and replace filename recursively in a directory
shrivn1 $ AUD_DATE=140101
shrivn1 $ YEST_DATE=140124
shrivn1 $ ls *$AUD_DATE*
NULRL.PREM.DATA.CLRSFIFG.140101.dat NULRL.PREM.DATA.CLRTVEH.140101.dat
shrivn1 $ ls *$AUD_DATE*.dat | awk '{a=$1; gsub("$AUD_DATE","$YEST_DATE");printf "mv \"%s\" \"%s\"\n", a, $1}'
mv "NULRL.PREM.DATA.CLRSFIFG.140101.dat" "NULRL.PREM.DATA.CLRSFIFG.140101.dat"
mv "NULRL.PREM.DATA.CLRTVEH.140101.dat" "NULRL.PREM.DATA.CLRTVEH.140101.dat"
Actual output I need is
mv "NULRL.PREM.DATA.CLRSFIFG.140101.dat" "NULRL.PREM.DATA.CLRSFIFG.140124.dat"
mv "NULRL.PREM.DATA.CLRTVEH.140101.dat" "NULRL.PREM.DATA.CLRTVEH.140124.dat"
Thanks in advance
Approach 1
I generally create mv commands using sed and then pipe the output to sh. This approach allows me to see the commands that will be executed beforehand.
For example:
$ AUD_DATE=140101
$ YEST_DATE=140124
$ ls -1tr | grep "${AUD_DATE}" | sed "s/\(.*\)/mv \1 \1_${YEST_DATE}/"
Once you are happy with the output of the previous command, repeat it and pipe its output to sh, like so:
$ ls -1tr | grep "${AUD_DATE}" | sed "s/\(.*\)/mv \1 \1_${YEST_DATE}/" | sh
Approach 2
You could use the xargs command:
ls -1tr | grep "${AUD_DATE}" | xargs -I target_file mv target_file target_file${YEST_DATE}
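Both approaches above append ${YEST_DATE} to the old name; if you want the exact mv commands shown in the question, with ${AUD_DATE} replaced by ${YEST_DATE} inside the name, a minimal sketch using bash's ${var/pattern/replacement} expansion is:
for f in *"${AUD_DATE}"*.dat; do
    # prints the commands for review; pipe the loop's output to sh to execute them
    echo mv "\"$f\"" "\"${f/$AUD_DATE/$YEST_DATE}\""
done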
