Change string using sed with specific condition - bash

I want to change paths in files using sed.
Currently I run the following:
find . -type f | xargs sed -r -i "s/home\/some_dir/home\/another_dir/g"
But I want this to be applied only to those paths that correspond to actual files in my file system.
For instance: if I don't have the file home/some_dir/lol, then the corresponding string in any file will be ignored.
UPD (explanation) :
Let's imagine I have following file structure:
-home
--some_dir
---file1
--another_dir
--dir_with_configs
---config.txt
And I am in the /home/dir_with_configs directory.
Let config.txt look like this:
/home/some_dir/file1
/home/some_dir/lol
After running
find . -type f | xargs sed -r -i "s/home\/some_dir/home\/another_dir/g"
I will have config.txt like:
/home/another_dir/file1
/home/another_dir/lol
But I don't have the file /home/another_dir/lol. So I somehow want to add a check that a file with the given path exists, and end up with config.txt like:
/home/another_dir/file1
/home/some_dir/lol

Presumably you want to do the following on each file that you find.
Test that each line is an existing file
Test that each existing file also exists on an alternate path
If both are true, replace the line with the alternate path
sed can't do most of this. You need a different tool. Something like perl or python would probably be the most efficient choice. Either way, your program will have to actually read each line of the input file and test if that line represents a real file on your system before doing anything to it.
Here's a bash example that passes the files discovered by find into a small script that reads the lines from those files, tests them, and makes any necessary substitutions to the output.
find . -type f -exec /bin/bash -c '
    for file ; do
        while read -r line ; do
            if [[ -f "${line}" && -f "${line/some_dir/another_dir}" ]] ; then
                printf "%s\n" "${line/some_dir/another_dir}"
            else
                printf "%s\n" "${line}"
            fi
        done <"${file}" >"${file}.new" && mv "${file}.new" "${file}"
    done
' _ {} +
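A note on the invocation: find substitutes the discovered file names for {}, and the trailing + packs as many names as possible into each bash -c call. The _ is bound to $0 inside the inline script, so the file names become the positional parameters that the bare for file ; do loop iterates over.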

Related

Shell Script: How to copy files with specific string from big corpus

I have a small bug and don't know how to solve it. I want to copy files that contain a specific string out of a big folder with many files. For this I use grep, ack or (in this example) ag. When I'm inside the folder it matches without problems, but when I try to do it with a loop over the files in the following script, it doesn't loop over the matches. Here is my script:
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while read -d $'\0' file; do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done
SEARCH_QUERY holds the string I want to find inside the files, INPUT_DIR is the folder where the files are located, and OUTPUT_DIR is the folder the found files should be copied to. Is there something wrong with the while loop?
EDIT:
Thanks for the suggestions! I took this one, because it also looks for files in subfolders and saves a list of all the files.
ag -l "${SEARCH_QUERY}" "${INPUT_DIR}" > "output_list.txt"
while read -r file
do
    echo "${file##*/}"
    cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done < "output_list.txt"
Better to implement it with a find command, like below:
find "${INPUT_DIR}" -name "*.*" | xargs grep -l "${SEARCH_QUERY}" > /tmp/file_list.txt
while read file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
or another option:
grep -l "${SEARCH_QUERY}" "${INPUT_DIR}/*.*" > /tmp/file_list.txt
while read file
do
echo "$file"
cp "${file}" "${OUTPUT_DIR}/${file}"
done < /tmp/file_list.txt
rm /tmp/file_list.txt
If you do not mind doing it in just one line, then:
grep -lr 'ONE\|TWO\|THREE' | xargs -I xxx -P 0 cp xxx dist/
guide:
-l prints just the file name and nothing else
-r searches recursively through the CWD and all sub-directories
'ONE\|TWO\|THREE' matches any of the alternatives: 'ONE', 'TWO' or 'THREE'
| pipes the output of grep to xargs
-I xxx saves each file name in xxx; it is just a placeholder
-P 0 runs all the commands (= cp) in parallel (= as fast as possible)
cp copies each file xxx to the dist directory
If I understand the behavior of ag correctly, then you have to either
adjust the read delimiter to '\n', or
use ag -0 -l to force delimiting by '\0'
to solve the problem in your loop.
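A minimal sketch of the second option (ag's -0 flag emits NUL-separated file names; the basename-style copy mirrors the asker's edit above):
ag -0 -l "${SEARCH_QUERY}" "${INPUT_DIR}" | while IFS= read -r -d '' file; do
    echo "$file"
    cp "${file}" "${OUTPUT_DIR}/${file##*/}"
done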
Alternatively, you can use the following script, which is based on find instead of ag.
while read -r file; do
    echo "$file"
    cp "$file" "$OUTPUT_DIR/$file"
done < <(find "$INPUT_DIR" -name "*$SEARCH_QUERY*" -print)
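If the file names may contain newlines, a NUL-delimited variant of the same loop is safer (a sketch; -print0 and read -d '' are standard find and bash features):
while IFS= read -r -d '' file; do
    echo "$file"
    cp "$file" "$OUTPUT_DIR/$file"
done < <(find "$INPUT_DIR" -name "*$SEARCH_QUERY*" -print0)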

Replace complete filenames with the MD5 hash of the file's content in bash

Problem:
I have a bunch of files in a folder; I want to rename all of them to the MD5 hash of their content.
What I tried:
This is the command I tried:
for i in $(find /home/admin/test -type f);do mv $i $(md5sum $i|cut -d" " -f 1);done
But this fails after some time with the error below, and only some files get renamed, leaving the rest untouched.
mv: missing destination file operand after /home/admin/test/help.txt
Try `mv --help' for more information.
Is the implementation correct? Am I doing something wrong in the script?
Make things simple by making use of the glob patterns that the shell provides, instead of using external utilities like find. Also see Why you don't read lines with "for".
Navigate inside the folder /home/admin/test, and the following should be sufficient:
for file in *; do
    [ -f "$file" ] || continue
    md5sum -- "$file" | { read -r sum _; mv "$file" "$sum"; }
done
Try using echo in place of mv first, to check that the files would be renamed as expected.
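For example, a dry run of the same loop only prints the mv commands without renaming anything:
for file in *; do
    [ -f "$file" ] || continue
    md5sum -- "$file" | { read -r sum _; echo mv "$file" "$sum"; }
done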
To also rename files in the sub-directories below, which I assume would be your requirement too, enable globstar, one of the extended globbing options provided by the shell, and loop over **/* instead, keeping the same loop body:
shopt -s globstar
for file in **/*; do
Note that mv "$file" "$sum" then moves files into your current directory; to keep each renamed file next to its original, use mv "$file" "$(dirname "$file")/$sum" instead.
If you want to recursively rename all files with their md5 hash, you could try this:
find /home/admin/test -type f -exec bash -c 'md5sum "$1" | while read -r s f; do mv "${f#*./}" "$(dirname "${f#*./}")/$s"; done' _ {} \;
The hash and filename are read into the s and f variables. The ${f#*./} removes the prefix added by the md5sum and find commands.
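As a quick illustration of that parameter expansion (the path here is made up):
f="./sub/dir/file.txt"
echo "${f#*./}"    # prints: sub/dir/file.txt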
Note that if some files have exactly the same content, you will end up with only one of them.
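If you would rather not overwrite duplicates silently, one variation is to test the target first (a sketch; the -e check and the skip message are my additions):
find /home/admin/test -type f -exec bash -c '
    md5sum "$1" | while read -r s f; do
        target="$(dirname "$f")/$s"
        if [ -e "$target" ]; then
            echo "skipping $f: $target already exists" >&2
        else
            mv "$f" "$target"
        fi
    done' _ {} \;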

Find pipe to multiple commands (grep and file)

Here is my problem: I am trying to parse a lot of files on a system to find some tokens. The tokens are stored in a file, one token per line (for example token.txt). The paths to parse are also stored in another file, one path per line (for example path.txt).
I use a combination of find and grep to do my stuff. Here is one attempt:
for path in $(cat path.txt)
do
    for line in $(find $path -type f -print0 | xargs -0 grep -anf token.txt 2>/dev/null);
    do
        #some stuffs here
    done
done
It seems to work fine; I don't really know if there is another way to make it faster, though (I am a beginner in programming and shell).
My problem is: among the files found by the find command, I also want to identify all the files that are compressed. For this, I wanted to use the file command. The problem is that I need the output of the find command for both grep and file.
What is the best way to achieve this? To summarize, I would like something like this:
for path in $(cat path.txt)
do
    for line in $(find $path -type f);
    do
        #Use file command to test all the files, and do some stuff
        #Use grep to find some tokens in all the files, and do some stuff
    done
done
I don't know if my explanations are clear; I tried my best.
EDIT: I read that using a for loop to read a file is bad, but some people claim that a while read loop is also bad. I am a bit lost, to be honest; I can't really find the proper way to do this.
The way you are doing it is fine, but here is another way to do it. With this method you won't have to add additional loops to iterate over each item in your configuration files. There are ways to simplify this further, but they would not be as readable.
To test this:
In "${DIR}/path" I have two directories listed (one on each line). Both directories are contained in the same parent directory as this script. In the "${DIR}/token" file, I have three tokens (one on each line) to search for.
#!/usr/bin/env bash
#
# Directory where this script is located
#
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
#
# Loop through each file contained in our path list
#
for f in $(find $(cat "${DIR}/path") -type f); do
    for c in $(cat "${f}" | grep "$(cat ${DIR}/token)"); do
        echo "${f}"
        echo "${c}"
        # Do your file command here
    done
done
I think you need something like this:
find $(cat places.txt) -type f -exec bash -c 'file "$1" | grep -q compressed && echo "$1 looks compressed"' _ {} \;
Sample Output
/Users/mark/tmp/a.tgz looks compressed
This script looks in all the places listed in places.txt and runs a new bash shell for each file it finds. Inside the bash shell it tests whether the file is compressed and echoes a message if it is - I guess you will do something else, but you don't say what.
Another way of writing that more verbosely if you have lots to do:
#!/bin/bash
while read -r d; do
    find "$d" -type f -exec bash -c '
        if file "$1" | grep -q "compressed"; then
            echo "$1" is compressed
        else
            echo "$1" is not compressed
        fi' _ {} \;
done < places.txt

How to add a line of text containing the filename to all text files recursively in terminal?

I have a bunch of javascript files and I need to add a comment on a new line at the beginning of every file that contains the filename e.g.
/* app.js */
I'd like to do this recursively on a folder from the terminal on my mac.
Can anyone help?
Thanks!
First, we have to locate those files:
$ find $folder_name -type f -name "*.js" -print
This will locate all of the files with the .js suffix. The -type f means only look for files (just in case you have a directory name that ends in .js). I will assume that your JavaScript file names don't contain whitespace or control characters like <NL> in the name. Otherwise, we'll have to do a bit more work.
We can use that find statement to locate all of those files; then we have to figure out how to munge each file to get the comment line on top.
We'll take a three-step approach:
First, we'll create a new file with the comment line in it.
Next we'll concatenate the old file on the end of the new file.
We'll rename the new file to the old file.
For the first step, we'll use echo to create a new temporary file containing the comment line. Then we'll use the cat command to concatenate the old file onto the new one. And finally, we'll use mv to rename the newly constructed file to the old name.
find $javascript_dir -type f -name "*.js" -print | while read -r file_name
do
    basename=$(basename "$file_name")
    echo "/* $basename */" > "$basename.tmp"
    cat "$file_name" >> "$basename.tmp"
    mv "$basename.tmp" "$file_name"
done
Note that > redirects the output of the command into the named file, while >> appends the output to the named file. mv stands for move. In Unix, files are actually identified by inode; a table of contents maps each inode to the full name of the file (including the directory). Thus, moving a file in Unix is the same as simply renaming it, even when the directory changes. In Windows batch there are separate move and rename commands; in Unix they're both mv.
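You can see this for yourself with ls -i, which prints a file's inode number (the file names and the inode number below are illustrative):
ls -i app.js           # e.g. 123456 app.js
mv app.js lib/app.js   # "move" to another directory...
ls -i lib/app.js       # ...same inode: 123456 lib/app.js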
Addition
We could use sed to do an in-place prepend of the line:
find . -name "*.js" -print | while read file_name
do
basename=$(basename $file_name)
sed -i "" "1i\\
/* $basename */
" "$file_name"
done
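Note that sed -i "" is the BSD/macOS form of in-place editing, which suits the asker's Mac. GNU sed treats the suffix as part of the -i flag itself, so the equivalent there (a sketch) drops the empty argument:
sed -i "1i\\
/* $basename */
" "$file_name"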
sed is a powerful line editor, and most people use only its substitution command mode. However, sed scripts can be way more complex and rich (and obscure which is why you rarely see them).
You can use cat - and a temp file:
# adds filename to top of each file in $DIR
DIR=. # set to whatever directory
FILES=$(find $DIR -type f)
for FILE in $FILES; do
    FILE=$(echo $FILE | sed -e 's/^\.\///')  # remove leading "./" if DIR is "."
    echo "/* $FILE */" | cat - $FILE > tempfile.tmp  # add the line using a temp file
    mv tempfile.tmp $FILE  # move the temp file back to the original file
done
Note that this will add the full path of the file relative to DIR if DIR is not '.'. To add only the filename itself you could use:
echo "/* $(basename $FILE) */" | cat - $FILE > tempfile.tmp # add the line using a temp file
You can use sed for this; try something like this:
#!/bin/bash
for i in $(find ./ -name "*.js"); do
    sed -i "1i /* $(basename "$i") */" "$i"
done
Inserting a newline with sed can be a problem, since not all versions work the same. For an overview, see this reference. You can, for example, replace the insertion with G or {G;}; it depends on your sed version.
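For instance, G appends a newline plus the (empty) hold space to the pattern space, so this one-liner portably adds a blank line after the first line of a file (test.js is a hypothetical name):
sed '1G' test.js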
Another interesting solution:
If you want to append a newline after the first line you can also use this:
sed -i "1a\ " test.js
So change the script above into this:
#!/bin/bash
for i in $(find ./ -name "*.js"); do
    sed -i "1a\ " "$i"
done

How can I manipulate file names using bash and sed?

I am trying to loop through all the files in a directory.
I want to do some stuff on each file (convert it to xml, not included in example), then write the file to a new directory structure.
for file in `find /home/devel/stuff/static/ -iname "*.pdf"`;
do
    echo $file;
    sed -e 's/static/changethis/' $file > newfile +".xml";
    echo $newfile;
done
I want the results to be:
$file => /home/devel/stuff/static/2002/hello.txt
$newfile => /home/devel/stuff/changethis/2002/hello.txt.xml
How do I have to change my sed line?
If you need to rename multiple files, I would suggest using the rename command:
# remove "-n" after you verify it is what you need
rename -n 's/hello/hi/g' $(find /home/devel/stuff/static/ -type f)
or, if you don't have rename, try this:
find /home/devel/stuff/static/ -type f | while read -r FILE
do
    # modify the line below to do what you need, then remove the leading "echo"
    echo mv $FILE $(echo $FILE | sed 's/hello/hi/g')
done
Are you trying to change the filename? Then
for file in /home/devel/stuff/static/*/*.txt
do
    echo "Moving $file"
    mv "$file" "${file/static/changethis}.xml"
done
Please make sure /home/devel/stuff/static/*/*.txt is what you want before using the script.
First, you have to create the name of the new file based on the name of the initial file. The obvious solution is:
newfile=${file/static/changethis}.xml
Second you have to make sure that the new directory exists or create it if not:
mkdir -p $(dirname $newfile)
Then you can do something with your file:
doSomething < $file > $newfile
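Putting the three steps together as a single loop (a sketch; the glob follows the earlier answer, and doSomething stands in for whatever conversion you actually run):
for file in /home/devel/stuff/static/*/*.txt; do
    newfile=${file/static/changethis}.xml
    mkdir -p "$(dirname "$newfile")"
    doSomething < "$file" > "$newfile"
done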
I wouldn't use the for loop because of the possibility of overloading your command line. Command lines have a limited length, and if you overload one, it'll simply drop the excess without giving you any warning. It might work if your find returns 100 files, it might work if it returns 1000, but at some number of files it will fail, and you'll never know when.
The best way to handle this is to pipe the find into a while read statement, as glenn jackman did.
The sed command only works on STDIN and on files, not on file names, so if you want to munge your file name, you'll have to do something like this:
newname="$(echo $oldname | sed 's/old/new/')"
to get the new name of the file. The $() construct executes the command and captures the command's output.
So, your script will look something like this:
find /home/devel/stuff/static/ -name "*.pdf" | while read -r file
do
    echo $file;
    newfile="$(echo $file | sed -e 's/static/changethis/')"
    newfile="$newfile.xml"
    echo $newfile;
done
Now, since you're renaming the file directory, you'll have to make sure the directory exists before you do your move or copy:
find /home/devel/stuff/static/ -name "*.pdf" | while read -r file
do
    echo $file;
    newfile="$(echo $file | sed -e 's/static/changethis/')"
    newfile="$newfile.xml"
    echo $newfile;
    # Check for directory and create it if it doesn't exist
    dirname=$(dirname "$newfile")
    if [ ! -d "$dirname" ]
    then
        mkdir -p "$dirname"
    fi
    # Directory now exists, so you can do the move
    mv "$file" "$newfile"
done
Note the quotation marks to handle the case there's a space in the file name.
By the way, instead of doing this:
if [ ! -d "$dirname" ]
then
    mkdir -p "$dirname"
fi
You can do this:
[ -d "$dirname"] || mkdir -p "$dirname"
The || means to execute the following command only if the test isn't true. Thus, if [ -d "$dirname" ] is a false statement (the directory doesn't exist), you run mkdir.
It's a fairly common shortcut when you see shell scripts.
find ... | while read file; do
    newfile=$(basename "$file").xml;
    do something to "$file" > "$somedir/$newfile"
done
OUTPUT="$(pwd)";
for file in `find . -iname "*.pdf"`;
do
echo $file;
cp $file $file.xml
echo "file created in directory = {$OUTPUT}"
done
This will create a new file with .xml appended to the name; for hello.pdf the new file created would be hello.pdf.xml.
Remember that the above script finds files under your current working directory whose names match the pattern of the find command (in this case *.pdf), and creates the copy next to each original.
The find command in this particular script only finds files whose names end in .pdf. If you wanted to run this script for files ending in .txt, you would change the find command to find . -iname "*.txt".
Once I wanted to remove the trailing -min from my files, i.e. turn alg-min.jpg into alg.jpg. After some struggle, I managed to figure out something like this:
for f in *; do echo "$f"; mv "$f" "$(echo "$f" | sed 's/-min//g')"; done
Hope this helps someone who wants to REMOVE or SUBSTITUTE some part of their file names.
