I am working on a shell script that replaces the value of the variable "SUBDIRS" in a Makefile.
I used the command below. It works, but it stops after the first occurrence of "SUBDIRS" and leaves the Makefile incomplete.
sed -z -i "s/\(SUBDIRS = \).*/\1$(tr '\n' ' ' < changed_fe_modules.log)\n/g" Makefile
Now I want to keep the rest of my Makefile as it is and replace only the 3 occurrences of "SUBDIRS= abcdefgh".
Please suggest how to replace all 3 occurrences while keeping the rest of the Makefile intact from start to end.
Makefile input sample:
Makefile Desired output sample:
Right now, the current command gives me the output below: it exits after the first replacement and the file is incomplete.
This will be very hard to do.
The reason you're seeing this behavior is that you're using the -z option with sed. With -z, sed treats NUL characters, not newlines, as line separators. Since the Makefile contains no NUL characters, the entire file is treated as a single "line" for the purposes of sed's pattern matching.
So in this regex:
\(SUBDIRS = \).*
the .* matches the entire rest of the file after the first SUBDIRS = match. Then you replace the entire rest of the file with the contents of the changed_fe_modules.log file. After that there's nothing left to match, so sed is done.
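The effect is easy to reproduce on a tiny input. The sketch below assumes GNU sed (needed for -z) and uses a made-up three-line input rather than the real Makefile; the variable names are only for illustration.

```shell
# Hypothetical three-line input, not the asker's real Makefile.
input='SUBDIRS = old1
OTHER = keep
SUBDIRS = old2'

# With -z the whole input is one record, so .* also swallows the newlines
# and everything after the first "SUBDIRS = " is replaced.
with_z=$(printf '%s\n' "$input" | sed -z 's/\(SUBDIRS = \).*/\1new\n/')

# Without -z each line is its own record and .* stops at the end of the line.
without_z=$(printf '%s\n' "$input" | sed 's/\(SUBDIRS = \).*/\1new/')

printf '%s\n---\n%s\n' "$with_z" "$without_z"
```

The part above the --- separator has lost the OTHER line and the second SUBDIRS line, while the part below it keeps all three lines.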
If your original makefile listed all the SUBDIRS on a single line, not using backslash/newline separators, it would be simple; you can just use:
sed -i "s/^SUBDIRS = .*/SUBDIRS = $(tr '\n' ' ' < changed_fe_modules.log)/" Makefile
If you have to keep the backslash/newline continuations, you probably won't be able to make this change with sed. You'll need something more powerful, like Perl, which has non-greedy matching capabilities.
ETA
You could also write it in plain shell:
new_subdirs=$(tr '\n' ' ' < changed_fe_modules.log)
line_cont=false
in_subdirs=false
while IFS= read -r line; do # IFS= keeps the leading tabs that Makefile recipe lines need
if $line_cont; then
case $line in
(*\\) : still continuing ;;
(*) line_cont=false ;;
esac
$in_subdirs || printf '%s\n' "$line"
continue
fi
case $line in
(SUBDIRS =*)
echo "SUBDIRS = $new_subdirs"
in_subdirs=true ;;
(*) printf '%s\n' "$line"
in_subdirs=false ;;
esac
case $line in
(*\\) line_cont=true ;;
esac
done < Makefile > Makefile.new
mv -f Makefile.new Makefile
(note, completely untested)
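The same state machine can also be sketched in awk, which is not what the answer above uses but follows the same idea: print a new SUBDIRS line, then drop the continuation lines of the old value. The Makefile content, the value of new_subdirs and the function name replace_subdirs are all made up for illustration, and the sketch assumes the new value contains no backslashes (awk -v would interpret them as escapes).

```shell
# Throwaway directory with a hypothetical Makefile.
workdir=$(mktemp -d)
printf 'CC = gcc\nSUBDIRS = a \\\n\tb \\\n\tc\nall:\n\t$(MAKE) -C a\nSUBDIRS = d\n' > "$workdir/Makefile"
new_subdirs='mod1 mod2'   # hypothetical replacement value

replace_subdirs() {
  # $1 = new value, $2 = Makefile path; prints the rewritten Makefile.
  awk -v new="$1" '
    # Inside an old SUBDIRS value: drop the line, stop after the last continuation.
    skipping { if ($0 !~ /\\$/) skipping = 0; next }
    # Start of a SUBDIRS assignment: emit the new value instead.
    /^SUBDIRS =/ { print "SUBDIRS = " new; skipping = ($0 ~ /\\$/); next }
    # Everything else is copied through unchanged, tabs included.
    { print }
  ' "$2"
}

rewritten=$(replace_subdirs "$new_subdirs" "$workdir/Makefile")
printf '%s\n' "$rewritten"
rm -rf "$workdir"
```

Writing to a new file and moving it over the original afterwards, as the shell loop above does, works here as well.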
I have a list of files stored in a text file. If a Python file is found in that list, I want to run the corresponding test file using Pytest.
My file looks like this:
/folder1/file1.txt
/folder1/file2.jpg
/folder1/file3.md
/folder1/file4.py
/folder1/folder2/file5.py
When 4th/5th files are found, I want to run the command pytest like:
pytest /folder1/test_file4.py
pytest /folder1/folder2/test_file5.py
Currently, I am using this command:
cat /workspace/filelist.txt | while read line; do if [[ $$line == *.py ]]; then exec "pytest test_$${line}"; fi; done;
which is not working correctly, because each line contains the file path as well. Any idea how to implement this?
You can use Bash's substring removal (parameter expansion) to add the test_ prefix. As a one-liner:
$ while read line; do if [[ $line == *.py ]]; then echo "pytest ${line%/*}/test_${line##*/}"; fi; done < file
In more readable form:
while read line
do
if [[ $line == *.py ]]
then
echo "pytest ${line%/*}/test_${line##*/}"
fi
done < file
Output:
pytest /folder1/test_file4.py
pytest /folder1/folder2/test_file5.py
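To see what the two expansions do on their own, here is a minimal sketch. The function name test_path is made up for illustration, and it assumes every path contains at least one slash, as in the sample list.

```shell
test_path() {
  # ${1%/*}  removes the shortest suffix matching "/*", leaving the directory.
  # ${1##*/} removes the longest prefix matching "*/", leaving the file name.
  printf '%s\n' "${1%/*}/test_${1##*/}"
}

test_path /folder1/file4.py
test_path /folder1/folder2/file5.py
```

Both expansions are plain POSIX parameter expansion, so this part does not depend on Bash.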
Don't know anything about the Google Cloudbuild so I'll let you experiment with the double dollar signs.
Update:
In case there are files already with test_ prefix, use this bash script that utilizes extglob in variable substring removal:
shopt -s extglob # notice
while read line
do
if [[ $line == *.py ]]
then
echo "pytest ${line%/*}/test_${line##*/?(test_)}" # notice
fi
done < file
You can easily refactor all your conditions into a simple sed script. This also gets rid of the useless cat and the similarly useless exec.
sed -n 's%[^/]*\.py$%test_&%p' /workspace/filelist.txt |
xargs -n 1 pytest
The regular expression matches anything after the last slash, which means the entire line if there is no slash; we include the .py suffix to make sure this only matches those files.
The pipe to xargs is a common way to convert standard input into command-line arguments. The -n 1 says to pass one argument at a time, rather than as many as possible. (Maybe pytest allows you to specify many tests; then, you can take out the -n 1 and let xargs pass in as many as it can fit.)
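The difference -n 1 makes can be checked without pytest by letting xargs run echo instead. This is only a sketch; the two paths are taken from the sample list and the variable names are made up.

```shell
# With -n 1, xargs starts one command per input item.
one_at_a_time=$(printf '%s\n' /folder1/test_file4.py /folder1/folder2/test_file5.py |
  xargs -n 1 echo pytest)

# Without -n, xargs packs as many items as fit into one command line.
all_at_once=$(printf '%s\n' /folder1/test_file4.py /folder1/folder2/test_file5.py |
  xargs echo pytest)

printf '%s\n---\n%s\n' "$one_at_a_time" "$all_at_once"
```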
If you want to avoid adding the test_ prefix to files which already have it, one solution is to break up the sed script into two separate actions:
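sed -n 's%test_[^/]*\.py$%&%p;t;s%[^/]*\.py$%test_&%p' /workspace/filelist.txt |
xargs -n 1 pytest
The first substitution replaces the match with itself, so the line is unchanged but gets printed by the p flag. Because t only branches after a successful s command, this is what lets the t skip the rest of the script for that input.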
(MacOS / BSD sed will want a newline instead of a semicolon after the t command.)
sed is arguably a bit of a read-only language; this is already pressing towards the boundary where perhaps you would rewrite this in Awk instead.
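For comparison, a possible Awk version might look like the sketch below. It is not part of the answer above; the function name add_test_prefix is made up, and it splits each line on slashes so that only the last field (the file name) is touched.

```shell
add_test_prefix() {
  # -F/ and OFS=/ split and rejoin the path on slashes; $NF is the file name.
  # Only lines ending in .py are printed, and the prefix is added
  # only when the file name does not already start with test_.
  awk -F/ -v OFS=/ '/\.py$/ { if ($NF !~ /^test_/) $NF = "test_" $NF; print }' "$@"
}

printf '%s\n' /folder1/file3.md /folder1/file4.py /folder1/folder2/test_file5.py |
  add_test_prefix
```

The output can be piped to xargs -n 1 pytest in the same way as the sed version.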
You may want to focus on lines that end with the ".py" string.
You can achieve that with grep and a regex that checks whether a line ends with .py, which eliminates the if statement.
IFS=$'\n'
for file in $(grep '\.py$' /workspace/filelist.txt); do pytest "$file"; done
I have many very large files. Within each file, the content repeats 3 times. My intent is to delete the first repeat from each file so that only the last two repeats remain.
The code I have loops through the lines and identifies the position of each repeat (via a counter) and saves them as a variable (FIRST and END). My hope is that I would then use: sed -i '${FIRST},${END}d ${i}.log' to cut out that section of the file.
However when I run the code I get an error as follows: sed: -e expression #1, char 3: extra characters after command
Here is the code that reads the files, where "Cite" is the keyword that identifies repeats:
while read -r LINE ; do
((LCOUNT++))
if [[ "$LINE" =~ "Cite" ]] ; then
((CITE++))
if [[ "$CITE" = 1 ]] ; then
FIRST=${LCOUNT}
fi
if [[ "$CITE" = 2 ]] ; then
END=$((LCOUNT - 1))
fi
fi
done < "./${i}.log"
Your command
sed -i '${FIRST},${END}d ${i}.log'
does not make sense. You call sed here with two arguments: The option
-i
and a single string which is literally
${FIRST},${END}d ${i}.log
Since you have used single quotes, no parameter expansion occurs, and the whole piece is passed to sed as a single argument to be interpreted as a sed program. sed tries to read from stdin (since you have not passed a file argument), and the sed program obviously does not make sense.
You could do something like
sed -i "${FIRST},${END}d" "${i}.log"
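The quoting difference can be checked on a small scratch file. This is a sketch with made-up values for FIRST and END and a hypothetical four-line demo.log; it runs sed without -i so the result is only captured, not written back.

```shell
FIRST=2
END=3

single='${FIRST},${END}d'   # single quotes: sed would receive this text literally
double="${FIRST},${END}d"   # double quotes: the shell expands it to 2,3d

workdir=$(mktemp -d)
printf '%s\n' one two three four > "$workdir/demo.log"

# Delete lines 2 through 3 and capture what is left.
result=$(sed "$double" "$workdir/demo.log")
rm -rf "$workdir"

printf '%s\n' "$single" "$double" "$result"
```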
As an aside regarding the title of your post: "numerical variables" do not exist in bash. Every variable is a string. You can do a
typeset -i foo
which makes bash evaluate each assigned value as an arithmetic expression, so the result always represents an integer, but it is still stored as a string. For instance,
foo=abc # sets foo to the string 0
foo=00005 # sets foo to the string 5
foo=5a # raises an error
This might work for you (GNU sed):
sed -ni '/Cite/!{p;b};:a;n;//!ba;:b;p;n;bb' file1 file2 ... filen
Turn off implicit printing with -n and turn on in-place editing with -i.
If a line does not match Cite, print it and start the next cycle.
Otherwise skip the following lines until another match, then print that line and all remaining lines until the end of the file.
N.B. The -i option treats each file separately, in the same way the -s option does, but edits the files in place. To be safe, run the command with -s first and, once the results are as expected, substitute the -i option.
I am working on a script in which I need to compare one filename to another and look for a specific change (in this case a "(x)" that OS X adds to a filename when a file with that name already exists in the directory). This is an excerpt of the script, modified so it can be tested without the rest of it.
#!/bin/bash
p2_s2="/Path/to file (name)/containing - many.special chars.docx.gdoc"
next_line="/Path/to file (name)/containing - many.special chars.docx (1).gdoc"
file_ext=$(echo "${p2_s2}" | rev | cut -d '.' -f 1 | rev)
file_name=$(basename "${p2_s2}" ".${file_ext}")
file_dir=$(dirname "${p2_s2}")
esc_file_name=$(printf '%q' "${file_name}")
esc_file_dir=$(printf '%q' "${file_dir}")
esc_next_line=$(printf '%q' "${next_line}")
if [[ ${esc_next_line} =~ ${esc_file_dir}/${esc_file_name}\ \(?\).${file_ext} ]]
then
echo "It's a duplicate!"
fi
What I'm trying to do here is detect if the file next_line is a duplicate of p2_s2. As I am expecting multiple duplicates, next_line can have a (1) appended at the end of a filename or any other number in brackets (Although I am sure no double digits). As I can't do a simple string compare with a wildcard in the middle, I tried using the "=~" operator and escaping all the special chars. Any idea what I'm doing wrong?
You can trim p2_s2's extension, trim next_line's extension including the number inside the parentheses, and see if you get the same file name. If you do, it's a duplicate. To do so, [[ allows us to compare a string against a glob.
I used extglob's +( ... ) pattern, so I can use +([0-9]) to match the number inside the parenthesis. Notice that extglob is enabled by shopt -s extglob.
#!/bin/bash
p2_s2="/Path/to/ps2.docx.gdoc"
next_line="/Path/to/ps2(1).docx.gdoc"
shopt -s extglob
if [[ "${p2_s2%%.*}" = "${next_line%%\(+([0-9])\).*}" ]]; then
printf '%s is a duplicate of %s\n' "$next_line" "$p2_s2"
fi
EDIT:
I now see that you've edited your question, so in case this solution is not enough, I'm positive that it'll be a good template to work with.
The (1) in next_line doesn't come before the final . in the original filename; it comes before the second-to-last one, but you only strip off a single . as the extension.
So when you generate the comparison filename you end up with /Path/to\ file\ \(name\)/containing\ -\ many.special\ chars.docx\ \(?\).gdoc which doesn't match what you expect.
If you had added set -x to the top of your script you'd have seen what the shell was actually doing and seen this.
What does OS X actually do in this situation? Does it add (#) before .gdoc? Does it add it before .docx? Does it depend on whether OS X knows what the filename is (it is some type it can open natively)?
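If the answer turns out to be that " (N)" always lands right before the final extension, as in the sample paths in the question, a plain pattern match is enough and no escaping with printf %q is needed. The sketch below rests on that assumption; the function name is_duplicate is made up, and it only accepts a single digit, as the question expects.

```shell
is_duplicate() {
  # $1 = original path, $2 = candidate path
  stem=${1%.*}    # everything before the last dot
  ext=${1##*.}    # text after the last dot
  case $2 in
    # Quoted parts are matched literally; only [0-9] acts as a pattern.
    "$stem ("[0-9]").$ext") return 0 ;;
    *) return 1 ;;
  esac
}

p2_s2="/Path/to file (name)/containing - many.special chars.docx.gdoc"
next_line="/Path/to file (name)/containing - many.special chars.docx (1).gdoc"

if is_duplicate "$p2_s2" "$next_line"; then
  echo "It's a duplicate!"
fi
```

Because the variable parts are quoted inside the pattern, the parentheses, dots and spaces in the path do not need any special treatment.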
A friend and I are working on a project, and we have to create a script that goes into a file and replaces all occurrences of a certain expression/word/letter with another using sed. It is run against multiple tests that replace these occurrences, and we don't know what they will be, so we have to anticipate anything. We are having trouble with one test where we need to replace 'l*' with 'L' in different files using a loop. The code that I have is
#!/bin/sh
p1="$1"
shift
p2="$1"
shift
for file in "$#" #for any file in the directory
do
# A="$1"
#echo $A
#B="$2"
echo "$p1" | sed -e 's/\([*.[^$]\)/\\\1/g' > temporary #treat all special characters as plain text
A="`cat 'temporary'`"
rm temporary
echo "$p1"
echo "$file"
sed "s/$p1/$p2/g" "$file" > myFile.txt.updated #replace occurances
mv myFile.txt.updated "$file"
cat "$file"
done
I have tried testing this on practice files that contain different words and also 'l*', but whenever I test it, it deletes all the text in the file. Can someone help me with this? We would like to get it done soon. Thanks
It looks like you are trying to set A to a version of p1 with all special characters escaped. But you use p1 later instead of A. Try using the variable A, and also try setting it without a temporary file:
A=$( echo "$p1" | sed -e 's/\([*.[^$]\)/\\\1/g' )
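A quick way to check the escaping is to run it on the 'l*' case from the question. In this sketch the function name escape_pattern and the sample text are made up; note that the bracket expression does not cover backslashes or slashes, so patterns containing those would need extra handling.

```shell
escape_pattern() {
  # Put a backslash in front of * . [ ^ and $ so sed treats them literally.
  printf '%s\n' "$1" | sed -e 's/\([*.[^$]\)/\\\1/g'
}

A=$(escape_pattern 'l*')
printf '%s\n' "$A"

# Use the escaped pattern, not the raw one, in the substitution.
replaced=$(printf '%s\n' 'hello l* world' | sed "s/$A/L/g")
printf '%s\n' "$replaced"
```

Only the literal two-character sequence l* is replaced; the l characters inside hello are left alone.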
I am writing a simple shell script to make automated backups, and I am trying to use basename to create a list of directories and then parse this list to get the first and the last directory from the list.
The problem is: when I use basename in the terminal, all goes fine and it gives me the list exactly as I want it. For example:
basename -a /var/*/
gives me a list of all the directories inside /var without the / in the end of the name, one per line.
BUT, when I use it inside a script and pass a variable to basename, it puts single quotes around the variable:
while read line; do
dir_name=$(echo $line)
basename -a $dir_name/*/ > dir_list.tmp
done < file_with_list.txt
When running with -x:
+ basename -a '/Volumes/OUTROS/backup/test/*/'
and, therefore, the result is not what I need.
Now, I know there must be a thousand ways to go around the basename problem, but then I'd learn nothing, right? ;)
How to get rid of the single quotes?
And if my directory name has spaces in it?
If your directory name could include spaces, you need to quote the value of dir_name (which is a good idea for any variable expansion, whether you expect spaces or not).
while read line; do
dir_name=$line
basename -a "$dir_name"/*/ > dir_list.tmp
done < file_with_list.txt
(As jordanm points out, you don't need to quote the RHS of a variable assignment.)
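The quoting can be tried out on a scratch directory whose name contains a space. This is only a sketch: the directory names are made up, and it assumes a basename that supports -a, as used in the question.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/my backups/first dir" "$workdir/my backups/second dir"
dir_name="$workdir/my backups"

# The variable is quoted so the space survives; the /*/ stays outside
# the quotes so the shell still expands it to the subdirectories.
listing=$(basename -a "$dir_name"/*/)
rm -rf "$workdir"

printf '%s\n' "$listing"
```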
Assuming your goal is to populate dir_list.tmp with a list of directories found under each directory listed in file_with_list.txt, this might do.
#!/bin/bash
inputfile=file_with_list.txt
outputfile=dir_list.tmp
rm -f "$outputfile" # the -f makes rm fail silently if file does not exist
while read line; do
# basic syntax checking
if [[ ! ${line} =~ ^/[a-z][a-z0-9/-]*$ ]]; then
continue
fi
# collect targets using globbing
for target in "$line"/*; do
if [[ -d "$target" ]]; then
printf "%s\n" "$target" >> "$outputfile"
fi
done
done < "$inputfile"
As you develop whatever tool will process your dir_list.tmp file, be careful of special characters (including spaces) in that file.
Note that I'm using printf instead of echo so that targets whose first character is a hyphen won't cause errors.
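The difference shows up with a value that looks like an option. In this sketch the value -n is a made-up example; what echo does with it depends on the shell, which is the reason to prefer printf.

```shell
target='-n'

# printf treats the value as data for %s, so it is always printed as-is.
with_printf=$(printf '%s\n' "$target")

# Many echo implementations (including Bash's builtin) take -n as an option
# and print nothing here.
with_echo=$(echo "$target")

printf 'printf gave: [%s]\necho gave:   [%s]\n' "$with_printf" "$with_echo"
```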
This might work
while read; do
find "$REPLY" >> dir_list.tmp
done < file_with_list.txt