Deleting every line with a specific string in bash

Edit:
I realized that my current code is garbage, but other approaches I tried before also did not work. The problem was that I edited the files in Notepad++ on Windows and ran them on Linux. The program dos2unix does the trick.
Solution:
I used Notepad++ on Windows to write my files, which caused the problem. Running the files through dos2unix fixed it.
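For reference, checking for and stripping Windows line endings is straightforward (a sketch, assuming GNU coreutils/sed; dos2unix may need to be installed separately):
# detect CRLF line endings: file reports them, and cat -A shows each CR as ^M
file 1.test
cat -A 1.test
# fix in place
dos2unix 1.test
# alternative without dos2unix (GNU sed):
sed -i 's/\r$//' 1.test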
I have written a little bash script which should delete every line of $2 that contains a word specified in $1 and write the output to $3. But somehow it does not work like it should.
#!/bin/bash
set -f
while IFS='' read -r i || [[ -n "$i" ]]; do
    sed -i "/$i/d" "$2"
done < "$1"
Edit:
Example
file 1.test:
123
678
456
file 2.test:
dasdas123dasd
3fsef344
678 3423423
r23r23rfsad
456 dasdasd
running the script:
./script.sh 1.test 2.test
The output should be:
3fsef344
r23r23rfsad
but instead it is:
dasdas123dasd
3fsef344
678 3423423
r23r23rfsad

Why are you using a shell loop and sed for this?
$ grep -vFf file1 file2
3fsef344
r23r23rfsad
If that doesn't do what you need, then clarify your question with a truly representative example, because "use a shell loop calling sed multiple times" is not the answer to any question. See https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice for some of the reasons I say that.
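If you also want the third-argument output file from the original script, the whole thing reduces to a single line (a sketch, assuming fixed-string matching of the listed words is acceptable):
#!/bin/bash
# delete every line of $2 that contains any string listed in $1,
# writing the result to $3
grep -vFf "$1" "$2" > "$3"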

You're re-editing $2 in place on every sed call, and $3 is never written at all. Copy $2 to $3 once, then apply all the deletions to that copy:
#!/bin/bash
set -f
cat "$2" > "$3"
while IFS='' read -r i || [[ -n "$i" ]]; do
sed -i "/$i/d" "$3"
done < "$1"

Related

Updating a config file based on the presence of a specific string

I want to be able to comment and uncomment lines which are "managed" using a bash script.
I am trying to write a script which will update all of the config lines that have the word #managed after them, removing the preceding # if it exists.
The rest of the config file needs to be left unchanged. The config file looks like this:
configFile.txt
#config1=abc #managed
#config2=abc #managed
config3=abc #managed
config3=abc
This is the script I have created so far. It iterates over the file, finds lines which contain "#managed", and detects whether they are currently commented.
I then need to write this back to the file; how do I do that?
manage.sh
#!/bin/bash
while read line; do
    STR='#managed'
    if grep -q "$STR" <<< "$line"; then
        echo "debug - this is managed"
        firstLetter=${line:0:1}
        if [ "$firstLetter" = "#" ]; then
            echo "Remove the initial # from this line"
        fi
    fi
    echo "$line"
done < configFile.txt
With your approach, using grep and sed:
str='#managed$'
file=ConfigFile.txt
grep -q "^#.*$str" "$file" && sed "/^#.*$str/s/^#//" "$file"
Looping through files matching *.txt:
#!/usr/bin/env bash
str='#managed$'
for file in *.txt; do
    grep -q "^#.*$str" "$file" &&
        sed "/^#.*$str/s/^#//" "$file"
done
In-place editing with sed requires the -i flag/option, but the syntax varies between versions of sed: GNU sed accepts -i with no backup suffix (no -i.bak argument needed), while BSD sed requires a suffix argument.
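For example, the same in-place edit in both dialects:
sed -i "/^#.*$str/s/^#//" "$file"       # GNU sed: backup suffix optional, none given
sed -i '' "/^#.*$str/s/^#//" "$file"    # BSD/macOS sed: suffix argument required, empty here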
On a Mac, ed should be installed by default, so you can just replace the sed part with:
printf '%s\n' "g/^#.*$str/s/^#//" ,p Q | ed -s "$file"
Replace the Q with w to actually write back the changes to the file.
Remove the ,p if no output to stdout is needed/required.
On a side note, embedding grep and sed in a shell loop that reads the contents of a text file line by line is considered bad practice by experienced shell users/developers. If the file has 100k lines, then grep and sed would each have to run 100k times too!
This sed one-liner should do the trick:
sed -i.orig '/#managed/s/^#//' configFile.txt
It deletes the # character at the beginning of the line if the line contains the string #managed.
I wouldn't do it in bash (because that would be slower than sed or awk, for instance), but if you want to stick with bash:
#! /bin/bash
while IFS= read -r line; do
    if [[ $line = *'#managed'* && ${line:0:1} = '#' ]]; then
        line=${line:1}
    fi
    printf '%s\n' "$line"
done < configFile.txt > configFile.tmp
mv configFile.txt configFile.txt.orig && mv configFile.tmp configFile.txt

'sed: no input files' when using sed -i in a loop

I checked some solutions for this in other questions, but they don't work in my case and I'm stuck, so here we go.
I have a CSV file that I want to convert entirely to uppercase. It has to use a loop and be at least 7 lines of code. I have to run the script with this command:
./c_bash.sh student-mat.csv
So I tried this Script:
#!/bin/bash
declare -i c=0
while read -r line; do
if [ "$c" -gt '0' ]; then
sed -e 's/\(.*\)/\U\1/'
else
echo "$line"
fi
((c++))
done < student-mat.csv
I know there may be a couple of unnecessary things in it, but I want to focus on the sed command because that looks like the problem here.
That script shows this output (first 5 lines):
school,sex,age,address,famsize,Pstatus,Medu,Fedu,Mjob,Fjob,reason,guardian,traveltime,studytime,failures,schoolsup,famsup,paid,activities,nursery,higher,internet,romantic,famrel,freetime,goout,Dalc,Walc,health,absences,G1,G2,G3
GP,F,17,U,GT3,T,1,1,AT_HOME,OTHER,COURSE,FATHER,1,2,0,NO,YES,NO,NO,NO,YES,YES,NO,5,3,3,1,1,3,4,5,5,6
GP,F,15,U,LE3,T,1,1,AT_HOME,OTHER,OTHER,MOTHER,1,2,3,YES,NO,YES,NO,YES,YES,YES,NO,4,3,2,2,3,3,10,7,8,10
GP,F,15,U,GT3,T,4,2,HEALTH,SERVICES,HOME,MOTHER,1,3,0,NO,YES,YES,YES,YES,YES,YES,YES,3,2,2,1,1,5,2,15,14,15
GP,F,16,U,GT3,T,3,3,OTHER,OTHER,HOME,FATHER,1,2,0,NO,YES,YES,NO,YES,YES,NO,NO,4,3,2,1,2,5,4,6,10,10
GP,M,16,U,LE3,T,4,3,SERVICES,OTHER,REPUTATION,MOTHER,1,2,0,NO,YES,YES,YES,YES,YES,YES,NO,5,4,2,1,2,5,10,15,15,15
Now that I see that it works, I want to apply that sed command permanently to the csv file, so I put -i after it:
#!/bin/bash
declare -i c=0
while read -r line; do
if [ "$c" -gt '0' ]; then
sed -i -e 's/\(.*\)/\U\1/'
else
echo "$line"
fi
((c++))
done < student-mat.csv
But instead of applying the changes, the output shows this (first 5 lines):
school,sex,age,address,famsize,Pstatus,Medu,Fedu,Mjob,Fjob,reason,guardian,traveltime,studytime,failures,schoolsup,famsup,paid,activities,nursery,higher,internet,romantic,famrel,freetime,goout,Dalc,Walc,health,absences,G1,G2,G3
sed: no input files
sed: no input files
sed: no input files
sed: no input files
sed: no input files
After checking a lot of different solutions on the internet, I also tried changing single quoting to double quoting.
#!/bin/bash
declare -i c=0
while read -r line; do
if [ "$c" -gt '0' ]; then
sed -i -e "s/\(.*\)/\U\1/"
else
echo "$line"
fi
((c++))
done < student-mat.csv
But in this case, instead of applying the changes, it generates a file with 0 bytes, so there is no output when I do this:
cat student-mat.csv
What I expect is that applying this script permanently changes all the data to uppercase, so that afterwards the command cat student-mat.csv shows this (first 5 lines):
school,sex,age,address,famsize,Pstatus,Medu,Fedu,Mjob,Fjob,reason,guardian,traveltime,studytime,failures,schoolsup,famsup,paid,activities,nursery,higher,internet,romantic,famrel,freetime,goout,Dalc,Walc,health,absences,G1,G2,G3
GP,F,17,U,GT3,T,1,1,AT_HOME,OTHER,COURSE,FATHER,1,2,0,NO,YES,NO,NO,NO,YES,YES,NO,5,3,3,1,1,3,4,5,5,6
GP,F,15,U,LE3,T,1,1,AT_HOME,OTHER,OTHER,MOTHER,1,2,3,YES,NO,YES,NO,YES,YES,YES,NO,4,3,2,2,3,3,10,7,8,10
GP,F,15,U,GT3,T,4,2,HEALTH,SERVICES,HOME,MOTHER,1,3,0,NO,YES,YES,YES,YES,YES,YES,YES,3,2,2,1,1,5,2,15,14,15
GP,F,16,U,GT3,T,3,3,OTHER,OTHER,HOME,FATHER,1,2,0,NO,YES,YES,NO,YES,YES,NO,NO,4,3,2,1,2,5,4,6,10,10
GP,M,16,U,LE3,T,4,3,SERVICES,OTHER,REPUTATION,MOTHER,1,2,0,NO,YES,YES,YES,YES,YES,YES,NO,5,4,2,1,2,5,10,15,15,15
Sed works on files, not on lines. Do not read lines; use sed on the whole file. Sed can exclude the first line by itself; see the sed manual.
You want:
sed -i -e '2,$s/\(.*\)/\U\1/' student-mat.csv
You can make it shorter with s/.*/\U&/.
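For example (note that \U is a GNU sed extension):
sed -i '2,$s/.*/\U&/' student-mat.csv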
Your code does not work the way you think it does. Note that it removes the second line from the output. Your code:
reads the first line with read -r line
echo "$line" prints that first line
c is incremented
read -r line reads the second line (which is never printed)
then sed processes the rest of the file (from line 3 to the end) and prints it in uppercase
then c is incremented again
then read -r line fails, and the loop exits
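You can see this shared-stdin effect in a minimal sketch: read consumes the first line, and the following sed consumes everything left on the same stdin (upper.csv is a hypothetical output name; \U is GNU-specific):
# read takes line 1 (the header); sed then sees line 2 onward
{ read -r header; echo "$header"; sed 's/.*/\U&/'; } < student-mat.csv > upper.csv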

How to continually process last lines of two files when the files change randomly?

I have the following simple snippet:
#!/bin/bash
tail -f "data/top.right.log" | while read val1
do
val2=$(tail -n 1 "data/top.left.log")
echo $(echo "$val1 - $val2" | bc)
done
top.left.log and top.right.log are files to which some other processes continually write. The bash script simply subtracts the last lines of the two files and shows the result.
I would like to make the script more efficient. In pseudo-code I would like to do this:
#!/bin/bash
magiccommand "data/top.right.log" "data/top.left.log" | while read val1 val2
do
echo $(echo "$val1 - $val2" | bc)
done
so that the echo command is called whenever top.left.log OR top.right.log changes.
I have already tried various snippets from StackOverflow, but they often rely on the files not changing, or on both files containing the same number of lines, which is not my case.
If you have inotify-tools, you can use the following command:
inotifywait -q -e modify file1 file2
Description:
inotifywait efficiently waits for changes to files using Linux's inotify(7) interface.
It is suitable for waiting for changes to files from shell scripts.
It can either exit once an event occurs, or continually execute and output events as they occur.
An example:
while :; do
    inotifywait -q -e modify file1 file2
    echo "$(tail -n1 file1)"
    echo "$(tail -n1 file2)"
done
Create a temporary file that you touch each time the files are processed. If any of the files is newer than the temporary file, process the files again.
#!/bin/bash
log1=top.left.log
log2=top.right.log
tmp=last_change
touch "$tmp"
while :; do
    if [[ $log1 -nt $tmp || $log2 -nt $tmp ]]; then
        touch "$tmp"
        x=$(tail -n1 "$log1")
        y=$(tail -n1 "$log2")
        echo $(( x - y ))
    fi
done
You might need to remove the temporary file once the script is killed.
If the files are changing fast, you might miss some lines. Otherwise, adding sleep 1 somewhere would decrease the CPU usage.
Instead of calling tail every time, you can open file descriptors once and read line after line. This makes use of the fact that the files are kept open, and read will always read from the next line of a file.
First, open the files in bash, assigning them file descriptors 3 and 4
exec 3<file1 4<file2
Now you can read from these files using read -u <fd>. In combination with inotifywait from Dawid's answer, this gives you an efficient way to read the files line by line:
while :; do
    # TODO: add some break condition
    # wait until one of the files has changed
    inotifywait -q -e modify file1 file2
    # read the next line of file1 into val1_new;
    # if file1 has not changed and there is no new line, read will return with failure
    read -u 3 val1_new && val1="$val1_new"
    # same for file2
    read -u 4 val2_new && val2="$val2_new"
done
You may extend this by reading until you have reached the last line, or by parsing inotifywait's output to detect which file has changed, as in the sketch below.
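A sketch combining both ideas, parsing inotifywait's monitor-mode output to decide which descriptor to read (this assumes inotify-tools is installed and that the files contain integer values, as elsewhere in this question):
#!/bin/bash
exec 3<file1 4<file2
val1=0 val2=0
# -m keeps inotifywait running; each event prints "<file> <events>"
inotifywait -q -m -e modify file1 file2 |
while read -r changed events; do
    case "$changed" in
        file1) read -u 3 v && val1="$v" ;;
        file2) read -u 4 v && val2="$v" ;;
    esac
    echo "$(( val1 - val2 ))"
done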
A possible way is to parse the output of tail -f and display the difference in value whenever the ==> <== pattern is found.
I came up with this script:
$ cat test.awk
$0 ~ /==>.*right.*<==/ {var=1}
$0 ~ /==>.*left.*<==/ {var=2}
$1~/[0-9]+/ && var==1 { val1=$1 }
$1~/[0-9]+/ && var==2 { val2=$1 }
val1 != "" && val2 != "" && $1~/[0-9]+/{
print val1-val2
}
The script assumes the values are integers ([0-9]+) in both files.
You can use it like this:
tail -f top.right.log top.left.log | awk -f test.awk
Whenever a value is appended to either file, the difference between the last values of the two files is displayed.

How to process lines read from standard input in a UNIX shell script?

I am stuck on this problem:
I wrote a shell script that gets a large file with many lines from stdin; this is how it is executed:
./script < filename
I want to use the file as input to another operation in the script, but I don't know how to store the file's name in a variable.
The script takes a file from stdin as its input and then runs an awk operation on that same file. Say I write in the script:
script:
#!/bin/sh
...
read file
...
awk '...' < "$file"
...
it only reads the first line of the input file.
And I found a way to write it like this:
Min=-1
while read line; do
    n=$(echo $line | awk -F$delim '{print NF}')
    if [ $Min -eq -1 ] || [ $n -lt $Min ]; then
        Min=$n
    fi
done
but it takes a very long time to process; spawning awk for every line seems to be what takes the time.
So how can I improve this?
/dev/stdin can be quite useful here.
In fact, it's just a chain of links to your input.
So cat /dev/stdin will give you all the input from your file, and you can avoid using an input filename at all.
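For example, anywhere the script expects a filename, /dev/stdin can stand in (here with the minimum-field-count awk program from the question; $delim is the question's delimiter variable):
awk -F"$delim" 'NF < min || min == "" { min = NF } END { print min }' /dev/stdin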
Now to answer the question :) Recursively follow the links, beginning at /dev/stdin, and you will get the filename. Bash code:
r() {
    l=`readlink $1`
    if [ $? -ne 0 ]; then
        echo $1
    else
        r $l
    fi
}
filename=`r /dev/stdin`
echo $filename
UPD:
on Ubuntu I found the -f option to readlink, i.e. readlink -f /dev/stdin gives the same output. This option may be absent on some systems.
UPD2: tests (test.sh is the code above):
$ ./test.sh <input # that is a file
/home/sfedorov/input
$ ./test.sh <<EOF
> line
> EOF
/tmp/sh-thd-214216298213
$ echo 1 | ./test.sh
pipe:[91219]
$ readlink -f /dev/stdin < input
/home/sfedorov/input
$ readlink -f /dev/stdin << EOF
> line
> EOF
/tmp/sh-thd-3423766239895 (deleted)
$ echo 1 | readlink -f /dev/stdin
/proc/18489/fd/pipe:[92382]
You're overdoing this. The way you invoke your script:
the file contents are the script's standard input
the script receives no argument
But awk already takes input from stdin by default, so all you need to do to make this work is:
not give awk any file name argument, it's going to be the wrapping shell's stdin automatically
not consume any of that input before the wrapping script reaches the awk part. Specifically: no read
If that's all there is to your script, it reduces to the awk invocation, so you might consider doing away with it altogether and just call awk directly. Or make your script directly an awk one instead of a sh one.
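A minimal sketch of that reduction, assuming the minimum-field-count task from the question (taking the delimiter as the script's first argument is an assumption here):
#!/bin/sh
# usage: ./script delim < filename  -- awk inherits the script's stdin
awk -F"$1" 'NF < min || min == "" { min = NF } END { print min }'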
Aside: the reason your while read line / multiple-awk variant (the one in the question) is slow is that it spawns an awk process for each and every line of the input, and process spawning is orders of magnitude slower than awk processing a single line. The reason the generate-tmpfile / single-awk variant (the one in your answer) is still a bit slow is that it generates the tmpfile line by line, reopening it to append every time.
Modify your script so that it takes the input file name as an argument, then read from the file in your script:
$ ./script filename
In script:
filename=$1
awk '...' < "$filename"
If your script just reads from standard input, there is no guarantee that there is a named file providing the input; it could just as easily be reading from a pipe or a network socket.
How about invoking the script differently: pipe the contents of your file into your script as follows (the standard output of cat filename then becomes the standard input to your script, or in this case to the awk command).
For example, I have a file Names.data and a script showNames.sh; execute as follows:
cat Names.data | ./showNames.sh
Contents of filename Names.data
Huckleberry Finn
Jack Spratt
Humpty Dumpty
Contents of script showNames.sh
#!/bin/bash
# whatever awk commands you need
awk '{ print }'
Well, I finally found this way to solve my problem, although it takes several seconds:
grep '.*' >> /tmp/tmpfile
Min=$(awk -F$delim 'NF < min || min == "" { min = NF } END { print min }' < /tmp/tmpfile)
Just append each line to a temporary file, so that after reading from stdin the tmpfile is the same as the input file.
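That said, the temporary file can be skipped entirely by letting awk read the script's remaining stdin directly (the same program as above):
Min=$(awk -F$delim 'NF < min || min == "" { min = NF } END { print min }')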

BASH: Strip new-line character from string (read line)

I bumped into the following problem: I'm writing a Linux bash script which does the following:
Read line from file
Strip the \n character from the end of the line just read
Execute the command that's in there
Example:
commands.txt
ls
ls -l
ls -ltra
ps as
The execution of the bash file should get the first line and execute it, but with the \n present, the shell just outputs "command not found: ls".
That part of the script looks like this:
read line
if [ -n "$line" ]; then #if not empty line
    #myline=`echo -n $line | tr -d '\n'`
    #myline=`echo -e $line | sed ':start /^.*$/N;s/\n//g; t start'`
    myline=`echo -n $line | tr -d "\n"`
    $myline #execute it
    cat $fname | tail -n+2 > $fname.txt
    mv $fname.txt $fname
fi
The commented-out lines are the things I tried before asking SO. Any solutions? I've been smashing my brains over this for the last couple of hours...
I always like perl -ne 'chomp and print' for trimming newlines. Nice and easy to remember.
e.g. ls -l | perl -ne 'chomp and print'
However, I don't think that is your problem here, although I'm not sure I understand how you're passing the commands in the file through to the 'read' in your shell script.
With a test script of my own like this (test.sh)
read line
if [ -n "$line" ]; then
    $line
fi
and a sample input file like this (test.cmds)
ls
ls -l
ls -ltra
If I run it like this ./test.sh < test.cmds, I see the expected result, which is to run the first command 'ls' on the current working directory.
Perhaps your input file has additional non-printable characters in it?
Mine looks like this:
od -c test.cmds
0000000 l s \n l s - l \n l s - l t
0000020 r a \n
0000023
From your comments below, I suspect you may have carriage returns ("\r") in your input file, which is not the same thing as a newline. Is the input file originally in DOS format? If so, then you need to convert the two-byte DOS line ending "\r\n" to the single-byte UNIX one, "\n", to achieve the expected results.
You should be able to do this by swapping the "\n" for "\r" in any of your commented-out lines.
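For example, the first commented-out line becomes (a sketch of the suggested swap):
myline=`echo -n $line | tr -d '\r'`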
Someone already wrote a program which executes shell commands: sh file
If you really only want to execute the first line of a file: head -n 1 file |sh
If your problem is carriage-returns: tr -d '\r' <file |sh
I tried this:
read line
echo -n $line | od -x
For the input 'xxxx', I get:
0000000 7878 7878
As you can see, there is no \n at the end of the contents of the variable. I suggest running the script with the option -x (bash -x script). This will print all commands as they are executed.
[EDIT] Your problem is that you edited commands.txt on Windows. Now the file contains CRLF (0d0a) as line delimiters, which confuses read (and ls\r is not a known command). Use dos2unix or similar to turn it into a Unix file.
You may also try to replace carriage returns with newlines using only Bash builtins:
line=$'a line\r'
line="${line//$'\r'/$'\n'}"
#line="${line/%$'\r'/$'\n'}" # replace only at line end
printf "%s" "$line" | ruby -0777 -n -e 'p $_.to_s'
You need the eval command:
#!/bin/bash -x
while read cmd; do
    if [ "$cmd" ]; then
        eval "$cmd"
    fi
done
I ran it as
./script.sh < file.txt
And file.txt was:
ls
ls -l
ls -ltra
ps as
Though not working for ls, I recommend having a look at find’s -print0 option.
The following script works (at least for me):
#!/bin/bash
while read I ; do if [ "$I" ] ; then $I ; fi ; done ;
