Bash: capture a specific instance of a pattern and exclude others
I am trying to capture and read into $line the line or lines in file that contain only del (line 2 is an example). Line 3 has del in it, but it also has ins, and my bash script currently captures both. I am not sure how to exclude everything else and capture only the del-only lines. Thank you :).
file
NM_003924.3:c.765_779dupGGCAGCGGCGGCAGC
NM_003924.3:c.765_779delGGCAGCGGCGGCAGC
NM_003924.3:c.765_779delGGCAGCinsGGCGGCAGC
NM_003924.3:c.765_779insGGCAGCGGCGGCAGC
desired output
NM_003924.3:c.765_779delGGCAGCGGCGGCAGC
bash w/ current output
while read line; do
if [[ $line =~ del ]] ; then echo $line; fi
done < file
NM_003924.3:c.765_779delGGCAGCGGCGGCAGC
NM_003924.3:c.765_779delGGCAGCinsGGCGGCAGC
Could you please try the following (if you are OK with awk)?
awk '/del/ && !/ins/' Input_file
Try:
while read -r line; do
[[ $line =~ del && ! $line =~ ins ]] && printf '%s\n' "$line"
done < file
The revised code is also ShellCheck clean and avoids BashPitfall #14.
This solution may fail if the last line in the file does not have a terminating newline. If that is a concern, see the accepted answer to Read last line of file in bash script when reading file line by line for a fix.
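For reference, the usual fix looks roughly like this (a sketch, not necessarily identical to that answer): the extra || [[ -n $line ]] test keeps the loop body running for a final line that lacks a trailing newline.

while read -r line || [[ -n $line ]]; do
[[ $line =~ del && ! $line =~ ins ]] && printf '%s\n' "$line"
done < file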
Here is a sed solution. It negates any match of del followed by ins and prints everything else that has del in it. The -n option suppresses all other output.
$ sed -n -e '/del.*ins/!{/.*del.*/p}' inputFile
NM_003924.3:c.765_779delGGCAGCGGCGGCAGC
Here is another answer using PCRE-enabled grep, with a negative lookahead. This should work with the -P option in GNU grep:
$ grep -P 'del(?!.*ins)' inputFile
NM_003924.3:c.765_779delGGCAGCGGCGGCAGC
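Note that the lookahead only checks the text after del, so a hypothetical line where ins appears before del would still match. If that case matters, one possible variant anchors the check at the start of the line:

$ grep -P '^(?!.*ins).*del' inputFile
NM_003924.3:c.765_779delGGCAGCGGCGGCAGC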
Split it in 2 steps. You do not need a loop:
grep "del" file | grep -v "ins"
Related
Updating a config file based on the presence of a specific string
I want to be able to comment and uncomment lines which are "managed" using a bash script. I am trying to write a script which will update all of the config lines which have the word #managed after them and remove the preceding # if it exists. The rest of the config file needs to be left unchanged. The config file looks like this:

configFile.txt
#config1=abc #managed
#config2=abc #managed
config3=abc #managed
config3=abc

This is the script I have created so far. It iterates the file, finds lines which contain "#managed" and detects if they are currently commented. I need to then write this back to the file; how do I do that?

manage.sh
#!/bin/bash
while read line; do
    STR='#managed'
    if grep -q "$STR" <<< "$line"; then
        echo "debug - this is managed"
        firstLetter=${line:0:1}
        if [ "$firstLetter" = "#" ]; then
            echo "Remove the initial # from this line"
        fi
    fi
    echo "$line"
done < configFile.txt
With your approach, using grep and sed:

str='#managed$'
file=ConfigFile.txt
grep -q "^#.*$str" "$file" && sed "/^#.*$str/s/^#//" "$file"

Looping through files ending in *.txt:

#!/usr/bin/env bash
str='#managed$'
for file in *.txt; do
    grep -q "^#.*$str" "$file" && sed "/^#.*$str/s/^#//" "$file"
done

In-place editing with sed requires the -i flag/option, but its behavior varies between versions of sed: the GNU version does not require an argument to -i (such as -i.bak), while the BSD version does. On a Mac, ed should be installed by default, so just replace the sed part with:

printf '%s\n' "g/^#.*$str/s/^#//" ,p Q | ed -s "$file"

Replace the Q with w to actually write the changes back to the file. Remove the ,p if no output to stdout is needed.

On a side note, embedding grep and sed in a shell loop that reads the contents of a text file line by line is considered bad practice by shell users/developers/coders. Say the file has 100k lines; then grep and sed would have to run 100k times too!
This sed one-liner should do the trick:

sed -i.orig '/#managed/s/^#//' configFile.txt

It deletes the # character at the beginning of the line if the line contains the string #managed. I wouldn't do it in bash (because that would be slower than sed or awk, for instance), but if you want to stick with bash:

#! /bin/bash
while IFS= read -r line; do
    if [[ $line = *'#managed'* && ${line:0:1} = '#' ]]; then
        line=${line:1}
    fi
    printf '%s\n' "$line"
done < configFile.txt > configFile.tmp
mv configFile.txt configFile.txt.orig && mv configFile.tmp configFile.txt
'sed: no input files' when using sed -i in a loop
I checked some solutions for this in other questions, but they are not working in my case and I'm stuck, so here we go. I have a csv file that I want to convert entirely to uppercase. It has to be done with a loop and occupy at least 7 lines of code. I have to run the script with this command:

./c_bash.sh student-mat.csv

So I tried this script:

#!/bin/bash
declare -i c=0
while read -r line; do
    if [ "$c" -gt '0' ]; then
        sed -e 's/\(.*\)/\U\1/'
    else
        echo "$line"
    fi
    ((c++))
done < student-mat.csv

I know that maybe there are a couple of unnecessary things in it, but I want to focus on the sed command because it looks like the problem here. That script shows this output (first 5 lines):

school,sex,age,address,famsize,Pstatus,Medu,Fedu,Mjob,Fjob,reason,guardian,traveltime,studytime,failures,schoolsup,famsup,paid,activities,nursery,higher,internet,romantic,famrel,freetime,goout,Dalc,Walc,health,absences,G1,G2,G3
GP,F,17,U,GT3,T,1,1,AT_HOME,OTHER,COURSE,FATHER,1,2,0,NO,YES,NO,NO,NO,YES,YES,NO,5,3,3,1,1,3,4,5,5,6
GP,F,15,U,LE3,T,1,1,AT_HOME,OTHER,OTHER,MOTHER,1,2,3,YES,NO,YES,NO,YES,YES,YES,NO,4,3,2,2,3,3,10,7,8,10
GP,F,15,U,GT3,T,4,2,HEALTH,SERVICES,HOME,MOTHER,1,3,0,NO,YES,YES,YES,YES,YES,YES,YES,3,2,2,1,1,5,2,15,14,15
GP,F,16,U,GT3,T,3,3,OTHER,OTHER,HOME,FATHER,1,2,0,NO,YES,YES,NO,YES,YES,NO,NO,4,3,2,1,2,5,4,6,10,10
GP,M,16,U,LE3,T,4,3,SERVICES,OTHER,REPUTATION,MOTHER,1,2,0,NO,YES,YES,YES,YES,YES,YES,NO,5,4,2,1,2,5,10,15,15,15

Now that I see that it works, I want to apply that sed command permanently to the csv file, so I put -i after it:

#!/bin/bash
declare -i c=0
while read -r line; do
    if [ "$c" -gt '0' ]; then
        sed -i -e 's/\(.*\)/\U\1/'
    else
        echo "$line"
    fi
    ((c++))
done < student-mat.csv

But instead of applying the changes, the output shows this (first 5 lines):

school,sex,age,address,famsize,Pstatus,Medu,Fedu,Mjob,Fjob,reason,guardian,traveltime,studytime,failures,schoolsup,famsup,paid,activities,nursery,higher,internet,romantic,famrel,freetime,goout,Dalc,Walc,health,absences,G1,G2,G3
sed: no input files
sed: no input files
sed: no input files
sed: no input files

So, after checking a lot of different solutions on the internet, I also tried changing single quoting to double quoting:

#!/bin/bash
declare -i c=0
while read -r line; do
    if [ "$c" -gt '0' ]; then
        sed -i -e "s/\(.*\)/\U\1/"
    else
        echo "$line"
    fi
    ((c++))
done < student-mat.csv

But in this case, instead of applying the changes, it generates a file with 0 bytes, so there is no output when I do this:

cat student-mat.csv

My expected result is that, when I apply the script, it permanently changes all the data to uppercase. After applying the script, cat student-mat.csv should show this (first 5 lines):

school,sex,age,address,famsize,Pstatus,Medu,Fedu,Mjob,Fjob,reason,guardian,traveltime,studytime,failures,schoolsup,famsup,paid,activities,nursery,higher,internet,romantic,famrel,freetime,goout,Dalc,Walc,health,absences,G1,G2,G3
GP,F,17,U,GT3,T,1,1,AT_HOME,OTHER,COURSE,FATHER,1,2,0,NO,YES,NO,NO,NO,YES,YES,NO,5,3,3,1,1,3,4,5,5,6
GP,F,15,U,LE3,T,1,1,AT_HOME,OTHER,OTHER,MOTHER,1,2,3,YES,NO,YES,NO,YES,YES,YES,NO,4,3,2,2,3,3,10,7,8,10
GP,F,15,U,GT3,T,4,2,HEALTH,SERVICES,HOME,MOTHER,1,3,0,NO,YES,YES,YES,YES,YES,YES,YES,3,2,2,1,1,5,2,15,14,15
GP,F,16,U,GT3,T,3,3,OTHER,OTHER,HOME,FATHER,1,2,0,NO,YES,YES,NO,YES,YES,NO,NO,4,3,2,1,2,5,4,6,10,10
GP,M,16,U,LE3,T,4,3,SERVICES,OTHER,REPUTATION,MOTHER,1,2,0,NO,YES,YES,YES,YES,YES,YES,NO,5,4,2,1,2,5,10,15,15,15
Sed works on files, not on lines. Do not read lines; use sed on the whole file. Sed can exclude the first line by itself (see the sed manual). You want:

sed -i -e '2,$s/\(.*\)/\U\1/' student-mat.csv

You can make it shorter with s/.*/\U&/.

Your code does not work the way you think it does. Note that it removes the second line from the output. Your code:
reads the first line with read -r line
echo "$line" prints that first line
c is incremented
read -r line reads the second line
sed then processes the rest of the file (from line 3 to the end) and prints it in upper case
c is incremented
read -r line fails, and the loop exits
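If the assignment really does require a loop, a rough pure-bash sketch (bash 4+, using the ${line^^} uppercase expansion instead of running sed once per line; upper.tmp is just a placeholder name) could be:

#!/bin/bash
c=0
while IFS= read -r line; do
    if [ "$c" -gt 0 ]; then
        printf '%s\n' "${line^^}"   # uppercase every line after the header
    else
        printf '%s\n' "$line"       # keep the header line as-is
    fi
    ((c++))
done < student-mat.csv > upper.tmp
mv upper.tmp student-mat.csv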
How to read specific lines in a file in BASH?
while read -r line will run through each line in a file. How can I have it run through specific lines in a file, for example, lines "1-20", then "30-100"?
One option would be to use sed to get the desired lines:

while read -r line; do
    echo "$line"
done < <(sed -n '1,20p; 30,100p' inputfile)

This would feed lines 1-20 and 30-100 of inputfile to read.
#devnull's sed command does the job. Another alternative is using awk, since it avoids the read and you can do the processing in awk itself:

awk '(NR>=1 && NR<=20) || (NR>=30 && NR<=100) {print "processing", $0}' file
fast way to replace characters in file ignoring comment lines
How can I replace/delete characters in a file while leaving comment lines unchanged? I'm looking for something to the effect of the following lines (where 'X' is replaced with 'Y' in file.txt), just substantially faster:

while read line
do
    if [[ ${line:0:1} = "#" ]]
    then
        echo "$line"
    else
        echo "$line" | tr "X" "Y"
    fi
done < file.txt

Thank you!
Equivalent to, more accurate, and faster than your script is this sed command:

sed '/^ *#/!{s/X/Y/g;}' file.txt

It matches any line that does not start with zero or more spaces followed by #, and on those lines replaces X with Y globally.
I am willing to bet Perl will be faster than all of the above:

perl -i -pe 's/X/Y/g unless /^#/' file.txt
For fast replacement, use sed, and only replace in lines not starting with "#":

sed -e '/^#/! s/X/Y/g' foo.txt
sed -i '/^#/! s/{what_to_replace}/{to_what_to_replace}/g' file.txt
awk version:

awk '!/^ *#/{gsub(/X/,"Y")}1' file.txt

Do look out for word boundaries to prevent substrings of your replacement target from getting replaced; for example, with gawk you can use \< and \>.
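For example, a sketch with gawk's word-boundary operators (assuming the goal is to replace only the standalone word X, not X inside longer tokens):

gawk '!/^ *#/{gsub(/\<X\>/,"Y")}1' file.txt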
Help with Bash script
I'm trying to get this script to basically read input from a file on the command line, match the user id in the file using grep, and output those lines with line numbers starting from 1)...n into a new file. So far my script looks like this:

#!/bin/bash
linenum=1
grep $USER $1 | while [ read LINE ]
do
    echo $linenum ")" $LINE >> usrout
    $linenum+=1
done

When I run it with ./username file I get:

line 4: [: read: unary operator expected

Could anyone explain the problem to me? Thanks
Just remove the [] around read line - they should be used to perform tests (file exists, string is empty etc.).
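Put together, a corrected version of your script might look like this sketch (it also uses $((...)) for the increment, as another answer notes):

#!/bin/bash
linenum=1
grep "$USER" "$1" | while read -r LINE
do
    echo "$linenum) $LINE" >> usrout
    linenum=$((linenum+1))
done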
How about the following?

$ grep $USER file | cat -n >usrout
Leave off the square brackets:

while read LINE; do
    echo $linenum ")" $LINE
    linenum=$((linenum+1))
done >> usrout
Just use awk:

awk -vu="$USER" '$0~u{print ++d") "$0}' file

or

grep $USER file | nl

or with the shell (no need to use grep):

i=1
while read -r line
do
    case "$line" in
        *"$USER"*) echo $((i++)) $line >> newfile;;
    esac
done < "file"
Why not just use grep with the -n (or --line-number) switch?

$ grep -n ${USERNAME} ${FILE}

The -n switch gives the line number that the match was found on in the file. From grep's man page:

-n, --line-number
    Prefix each line of output with the 1-based line number within its input file.

So, running this against the /etc/passwd file in Linux for user test_user gives:

31:test_user:x:5000:5000:Test User,,,:/home/test_user:/bin/bash

This shows that the test_user account appears on line 31 of the /etc/passwd file.
Also, instead of $foo+=1, you should write foo=$(($foo+1)).
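For example:

foo=1
foo=$((foo+1))   # foo is now 2
((foo++))        # arithmetic command form; foo is now 3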