fast way to replace characters in file ignoring comment lines

How can I replace or delete characters in a file while leaving comment lines unchanged? I'm looking for something to the effect of the following loop (which replaces 'X' with 'Y' in file.txt), just substantially faster:
while read line
do
if [[ ${line:0:1} = "#" ]]
then
echo "$line"
else
echo "$line" | tr "X" "Y"
fi
done < file.txt
Thank you!

This sed command does the same job as your script, but faster and more accurately, since it also treats lines with leading spaces before the # as comments:
sed '/^ *#/!{s/X/Y/g;}' file.txt
The address /^ *#/ matches lines that start with zero or more spaces followed by #. The ! negates it, so X is replaced with Y globally on every other line.
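As a rough illustration of the negated address, here is a minimal sketch that wraps the same sed expression in a function. The name replace_outside_comments is made up for this example, and the two arguments are pasted into the sed expression verbatim, so it assumes they contain no /, & or regex metacharacters.

```shell
# Replace every occurrence of $1 with $2, except on comment lines
# (optional leading spaces followed by '#'). Reads stdin, writes stdout.
# Hypothetical helper: $1 and $2 are inserted verbatim into the sed
# expression, so they must not contain '/', '&' or regex metacharacters.
replace_outside_comments() {
    sed '/^ *#/!s/'"$1"'/'"$2"'/g'
}

printf '%s\n' '# X stays' 'aXbX' '  # X stays too' |
    replace_outside_comments X Y
```

Only the middle line changes (aXbX becomes aYbY); both comment lines pass through untouched.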

I am willing to bet Perl will be faster than all of the above (note that -i edits file.txt in place):
perl -i -pe 's/X/Y/g unless /^#/' file.txt

For fast replacement, use sed and only replace on lines that do not start with "#":
sed -e '/^#/! s/X/Y/g' foo.txt

sed -i '/^#/! s/{what_to_replace}/{to_what_to_replace}/g' file.txt

awk version:
awk '!/^ *#/{gsub(/X/,"Y")}1' file.txt
Consider matching on word boundaries so that substrings of longer words are not replaced by accident. For example, gawk provides \< and \> for this.

Related

How to properly validate a part of the output of a command in BASH [duplicate]

Given a file, for example:
potato: 1234
apple: 5678
potato: 5432
grape: 4567
banana: 5432
sushi: 56789
I'd like to grep for all lines that start with potato: but only pipe the numbers that follow potato:. So in the above example, the output would be:
1234
5432
How can I do that?

grep 'potato:' file.txt | sed 's/^.*: //'
grep looks for any line that contains the string potato:, then, for each of these lines, sed replaces (s/// - substitute) any character (.*) from the beginning of the line (^) until the last occurrence of the sequence : (colon followed by space) with the empty string (s/...// - substitute the first part with the second part, which is empty).
or
grep 'potato:' file.txt | cut -d' ' -f2
For each line that contains potato:, cut will split the line into multiple fields delimited by a space (-d' ' sets the delimiter; an escaped space, written as a backslash followed by a space, would also work) and print the second field of each such line (-f2).
or
grep 'potato:' file.txt | awk '{print $2}'
For each line that contains potato:, awk will print the second field (print $2) which is delimited by default by spaces.
or
grep 'potato:' file.txt | perl -e 'for(<>){s/^.*: //;print}'
All lines that contain potato: are sent to an inline (-e) Perl script that takes all lines from stdin, then, for each of these lines, does the same substitution as in the first example above, then prints it.
or
awk '{if(/potato:/) print $2}' < file.txt
The file is sent via stdin (< file.txt sends the contents of the file via stdin to the command on the left) to an awk script that, for each line that contains potato: (if(/potato:/) returns true if the regular expression /potato:/ matches the current line), prints the second field, as described above.
or
perl -e 'for(<>){/potato:/ && s/^.*: // && print}' < file.txt
The file is sent via stdin (< file.txt, see above) to a Perl script that works similarly to the one above, but this time it also makes sure each line contains the string potato: (/potato:/ is a regular expression that matches if the current line contains potato:, and, if it does (&&), then proceeds to apply the regular expression described above and prints the result).

Or use regex assertions: grep -oP '(?<=potato: ).*' file.txt

grep -Po 'potato:\s\K.*' file
-P to use Perl regular expression
-o to output only the match
\s to match the space after potato:
\K to discard everything matched so far, so that potato: and the space are left out of the output
.* to match rest of the string(s)

sed -n 's/^potato:[[:space:]]*//p' file.txt
One can think of Grep as a restricted Sed, or of Sed as a generalized Grep. In this case, Sed is one good, lightweight tool that does what you want -- though, of course, there exist several other reasonable ways to do it, too.
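To show what the ^ anchor buys over a plain grep 'potato:', here is a small sketch built on the same sed command. The function name values_for_potato and the extra sweetpotato line are invented for this example.

```shell
# Print the value after "potato:" only for lines that START with that key.
# -n suppresses the default output; the p flag prints only lines on which
# the substitution succeeded. Hypothetical helper name; reads stdin.
values_for_potato() {
    sed -n 's/^potato:[[:space:]]*//p'
}

printf '%s\n' 'potato: 1234' 'sweetpotato: 99' 'apple: 5678' 'potato: 5432' |
    values_for_potato
```

This prints 1234 and 5432. The sweetpotato line is skipped, whereas an unanchored grep 'potato:' would have let it through.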

This will print everything after each match, on that same line only:
perl -lne 'print $1 if /^potato:\s*(.*)/' file.txt
This will do the same, except it will also print all subsequent lines:
perl -lne 'if ($found){print} elsif (/^potato:\s*(.*)/){print $1; $found++}' file.txt
These command-line options are used:
-n loop around each line of the input file
-l removes newlines before processing, and adds them back in afterwards
-e execute the perl code

You can use grep, as the other answers state. But you don't need grep, awk, sed, perl, cut, or any external tool. You can do it with pure bash.
Try this (semicolons are there to allow you to put it all on one line):
$ while read line;
do
if [[ "${line%%:\ *}" == "potato" ]];
then
echo ${line##*:\ };
fi;
done< file.txt
## tells bash to delete the longest match of the pattern "*: " from the front of $line.
$ while read line; do echo ${line##*:\ }; done< file.txt
1234
5678
5432
4567
5432
56789
Or, if you want the key rather than the value, %% tells bash to delete the longest match of the pattern ": *" from the end of $line.
$ while read line; do echo ${line%%:\ *}; done< file.txt
potato
apple
potato
grape
banana
sushi
The substring to split on is ":\ " because the space character must be escaped with the backslash.
You can find more examples like these at the Linux Documentation Project.
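The difference between the shortest-match (# and %) and longest-match (## and %%) operators only shows up when the separator occurs more than once. A minimal sketch, using made-up sample strings:

```shell
line='potato: 1234'

value=${line#*: }    # remove the shortest leading match of "*: "
key=${line%%: *}     # remove the longest trailing match of ": *"

printf '%s=%s\n' "$key" "$value"    # potato=1234

# With more than one ": " in the string, # and ## give different results.
multi='a: b: c'
first_cut=${multi#*: }     # b: c
last_cut=${multi##*: }     # c
printf '%s\n' "$first_cut" "$last_cut"
```

For the sample file either operator works, because each line contains ": " only once.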

Modern Bash has support for regular expressions:
while read -r line; do
if [[ $line =~ ^potato:\ ([0-9]+) ]]; then
echo "${BASH_REMATCH[1]}"
fi
done < file.txt

grep potato file | grep -o "[0-9].*"

sed/awk look for pattern in string and change another pattern on the same line

I have an F5 bigip.conf text file in which I want to change the route domain from 701 to 703 for all lines showing "10.166.201." The route domain is represented by %701
10.166.201.10%701
10.166.201.15%701
10.166.201.117%701
I am able to do this with bash but the problem is that the "else printf" command (I've also tried echo), which is supposed to print out all other lines, incorrectly parses things like "\r\n" and leaves them as "rn"
#!/bin/bash
while read line
do
if [[ $line = *"10.166.201"* ]];
then
printf '%s\n' "$line" | sed -e 's/701/703/'
else printf '%s\n' "$line"
fi
done < bigip.conf > bigip.conf_updated
Is there a way to stop printf and echo from modifying the "\r\n"?
Is there a better way to do this in sed/awk?
Thanks.

Use a regexp address:
sed '/10\.166\.201\./s/%701/%703/' bigip.conf
Once you have made sure that the command works for you, you can change the file in place using -i. BSD/macOS sed requires a backup-suffix argument, which may be empty:
sed -i '' '/10\.166\.201\./s/%701/%703/' bigip.conf
With GNU sed you can omit the option value for -i:
sed -i '/10\.166\.201\./s/%701/%703/' bigip.conf
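As for the loop in the question: printf '%s\n' does not touch the text. The backslashes are lost earlier, because read without -r treats a backslash as an escape character and drops it. A minimal sketch of the difference, using a made-up sample line rather than real bigip.conf content:

```shell
# Hypothetical sample line containing a literal backslash-r backslash-n.
sample='send "ping\r\n"'

# Without -r, read drops each backslash, so \r\n comes out as rn.
printf '%s\n' "$sample" | { read line; printf '%s\n' "$line"; }

# With -r the backslashes survive; IFS= also keeps leading and
# trailing whitespace intact.
printf '%s\n' "$sample" | { IFS= read -r line; printf '%s\n' "$line"; }
```

Changing the loop header to while IFS= read -r line should therefore be enough to keep those sequences intact, although the sed one-liner above avoids the loop altogether.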

sed:
sed -E 's/(10\.166\.201\.[[:digit:]]+%70)1/\13/'
The captured group (10\.166\.201\.[[:digit:]]+%70) matches 10.166.201. literally, then one or more digits, then %70 literally.
Outside the captured group, 1 matches literally; in the replacement, \1 reinserts the captured group and 3 takes the place of the 1.
Example:
% cat file.txt
10.166.201.10%701
10.166.201.15%701
10.166.201.117%701
% sed -E 's/(10\.166\.201\.[[:digit:]]+%70)1/\13/' file.txt
10.166.201.10%703
10.166.201.15%703
10.166.201.117%703

The following awk command may help:
awk '{gsub(/\r/,"")} /your_string/{sub(/701/,"703")} 1' Input_file
To save the output back into Input_file, write to a temporary file and move it over the original:
awk '{gsub(/\r/,"")} /your_string/{sub(/701/,"703")} 1' Input_file > temp_file && mv temp_file Input_file
EDIT: The {gsub(/\r/,"")} part strips carriage returns in case Input_file contains them; if it does not, you can remove that part.
EDIT2: Replace your_string with the address you are matching, and escape each . as \. in it.

The clear, simple, robust, efficient way is:
awk 'BEGIN{RS=ORS="\r\n"; FS=OFS="%"} index($1,"10.166.201.")==1{ $2="703" } 1' file
Note that you don't need to escape the .s or anchor to avoid partial matches because the above simply treats the IP address as a string appearing at the start of the line. The above uses GNU awk for multi-char RS to preserve your \r\n line endings.

Remove everything in a pipe delimited file after second-to-last pipe

How can I remove everything in a pipe-delimited file after the second-to-last pipe? For example, for the line
David|3456|ACCOUNT|MALFUNCTION|CANON|456
the result should be
David|3456|ACCOUNT|MALFUNCTION

Delete |(string without pipe)|(string without pipe) at the end of each line:
sed 's/|[^|]*|[^|]*$//' inputfile

Using awk, something like
awk -F'|' 'BEGIN{OFS="|"}{NF=NF-2; print}' inputfile
David|3456|ACCOUNT|MALFUNCTION
or use cut if you know the total number of columns (here 6, so keep the first 4):
cut -d'|' -f -4 inputfile
David|3456|ACCOUNT|MALFUNCTION

The command I would use is
sed -r 's/(.*)\|.*\|.*/\1/' input.txt > output.txt

A pure Bash solution:
while IFS= read -r line || [[ -n $line ]] ; do
printf '%s\n' "${line%|*|*}"
done <inputfile
See Reading input files by line using read command in shell scripting skips last line (particularly the answer by Jahid) for details of how the while loop works.
See pattern matching in Bash for information about ${line%|*|*}.
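The loop above can be wrapped in a function to try it on a few inputs. This is only a sketch: the name strip_last_two_fields and the two short test lines are invented for the example.

```shell
# Drop the last two pipe-delimited fields from each input line.
# ${line%|*|*} removes the shortest suffix that contains two pipes; lines
# with fewer than two pipes do not match and pass through unchanged.
# The "|| [ -n "$line" ]" part handles a final line without a newline.
strip_last_two_fields() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%|*|*}"
    done
}

printf '%s\n' 'David|3456|ACCOUNT|MALFUNCTION|CANON|456' 'a|b' 'x' |
    strip_last_two_fields
```

The first line is cut down to David|3456|ACCOUNT|MALFUNCTION, while a|b and x are printed as they are.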

sed or grep to read between a set of parentheses

I'm trying to read a version number from between a set of parentheses, from this output of some command:
Test Application version 1.3.5
card 0: A version 0x1010000 (1.0.0), 20 ch
Total known cards: 1
What I'm looking to get is 1.0.0.
I've tried variations of sed and grep:
command.sh | grep -o -P '(?<="(").*(?=")")'
command.sh | sed -e 's/(\(.*\))/\1/'
and plenty of variations. No luck :-(
Help?

You were almost there! With grep -P, use backslashes rather than double quotes to give the parentheses their literal meaning:
grep -o -P '(?<=\().*(?=\))'

Having GNU grep you can also use the \K escape sequence available in perl mode:
grep -oP '\(\K[^)]+'
\K removes what has been matched so far. In this case the starting ( gets removed from match.
Alternatively you could use awk:
awk -F'[()]' 'NF>1{print $2}'
The command splits input lines using parentheses as delimiters. Once a line has been split into multiple fields (meaning the parentheses were found), the version number is the second field and gets printed.
Btw, the sed command you've shown should be:
sed -ne 's/.*(\(.*\)).*/\1/p'

There are a couple of variations that will work. First with grep and sed:
grep '(' filename | sed 's/^.*[(]\(.*\)[)].*$/\1/'
or with a short shell script:
#!/bin/sh
while read -r line; do
value=$(expr "$line" : ".*(\(.*\)).*")
if [ "x$value" != "x" ]; then
printf "%s\n" "$value"
fi
done <"$1"
Both return 1.0.0 for your given input file.
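For a version that reads the command output directly from a pipe, a sed-only sketch could look like the following. The function name extract_parenthesized is made up, and [^)]* is used instead of .* as a small variation so the match stops at the first closing parenthesis.

```shell
# Print the text inside the last (...) group of each line that has one.
# In a basic regular expression a bare ( is literal, while \( \) capture.
# Hypothetical helper name; reads stdin.
extract_parenthesized() {
    sed -n 's/.*(\([^)]*\)).*/\1/p'
}

printf '%s\n' \
    'Test Application version 1.3.5' \
    'card 0: A version 0x1010000 (1.0.0), 20 ch' \
    'Total known cards: 1' |
    extract_parenthesized
```

With the sample output from the question this prints 1.0.0, so it can be used as command.sh | extract_parenthesized.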
