awk to ignore leading and trailing spaces, blank lines, and commented lines (if any) from a file - shell

Need help on awk:
I want awk to ignore leading and trailing spaces, blank lines, and commented lines (if any) from a file.

Here you go,
grep "MyText" FromMyLog.log |awk -F " " '{print $2}'|awk -F "#" '{print $1}'
Here MyText is the pattern to grep for in the file FromMyLog.log.
-F sets the field separator to the value that follows it, here the space between the quotes.
'{print $2}' prints the 2nd field of the line; use $1, $2, ... as your requirement dictates.
awk -F "#" '{print $1}' splits on # and keeps only what comes before it, so the commented part of a line is dropped.
This is just a hint; modify the code to fit your requirements. It works for me when combined with grep.
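For illustration, here is that pipeline applied to a single hypothetical log line fed in with echo (the grep stage is dropped and the line content is made up):
$ echo 'MyText value1#comment' | awk -F " " '{print $2}' | awk -F "#" '{print $1}'
value1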
To drop blank and commented lines you can also use grep -v '^$\|^\s*#' <filename>, or egrep -v '^[[:space:]]*$|^ *#' <file_name> if the blank lines may contain whitespace.

I think this is what you were asking for:
$> echo -e ' abc \t
\t efg
# alskdjfl
#
awk
# askdfh
' |
awk '
# match if the first non-space character is not a hash sign
/^[[:space:]]*[^#]/ {
# delete any spaces from start and end of line
sub(/^[[:space:]]*/, "");
sub(/[[:space:]]*$/, ""); # both subs operate on $0
print
}'
abc
efg
awk
This can be folded onto a single line if so needed. Any problems, an actual example of the input (in a code block in your question) would be helpful.
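For reference, here is one way to fold it onto a single line (a sketch; using [^#[:space:]] in the pattern also skips lines that contain nothing but whitespace):
awk '/^[[:space:]]*[^#[:space:]]/ { sub(/^[[:space:]]*/, ""); sub(/[[:space:]]*$/, ""); print }' file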

Here's one way to extract the required content while ignoring spaces.
FILE CONTENT
Server: 192.168.XX.XX
Address 1: 192.168.YY.YY
Name: central.google.com
Now to extract the server's address without spaces.
COMMAND
awk -F':' '/Server/ {print $2}' YOURFILENAME | tr -s " "
The -s option squeezes each run of repeated spaces into a single space.
which gives,
192.168.XX.XX
Here, notice that there is one leading space in the address.
To completely ignore spaces you can change that to,
awk -F':' '/Server/ {print $2}' YOURFILENAME | tr -d '[:space:]'
The -d option deletes the listed characters, which here is everything in [:space:].
which gives,
192.168.XX.XX
without any leading or trailing spaces.
tr is a UNIX utility for translating, deleting, or squeezing repeated characters; the name stands for translate.
Example:
echo youareawesome | tr '[:lower:]' '[:upper:]'
gives,
YOUAREAWESOME
Hope that helps.
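As an aside, the trimming can also be done inside awk itself, without piping to tr (a sketch against the same hypothetical file name):
awk -F':' '/Server/ {gsub(/^[[:space:]]+|[[:space:]]+$/, "", $2); print $2}' YOURFILENAME
192.168.XX.XX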

Remove starting substring http from strings using AWK?

I'm wondering: is there a better and cleaner way to remove strings at the beginning and end of each line in a file using AWK only?
Here's what I've got so far:
cat results.txt | awk '{gsub("https://", "") ;print}' | tr -d ":443"
File: results.txt
https://www.google.com:443
https://www.tiktok.com:443
https://www.instagram.com:443
To get the result
www.google.com
www.tiktok.com
www.instagram.com
With GNU awk.
Use / and : as field separators and print the fourth column:
awk -F '[/:]' '{print $4}' results.txt
Or use https:// and : as field separators and print the second column:
awk -F 'https://|:' '{print $2}' results.txt
Output:
www.google.com
www.tiktok.com
www.instagram.com
If it's a list of URLs like that, you could take advantage of the fact that the field separator in awk can be a regular expression:
awk -F':(//)?' '{print $2}'
This says that your field separator is ": optionally followed by //", which would split each line into:
[$1] https
[$2] www.google.com
[$3] 443
And then we print out only field $2.
cat results.txt | awk '{gsub("https://", "") ;print}' | tr -d ":443"
I think you are misunderstanding what tr -d does: it deletes the enumerated characters (not a substring). It only seems to do what you want because your test input
https://www.google.com:443
https://www.tiktok.com:443
https://www.instagram.com:443
does not contain :, 4 or 3 anywhere that should be kept. If you want a test case that shows the malfunction, try
https://www.normandy1944.info:443
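Running the original pipeline on that URL shows the damage: the 4s and the 3 inside the hostname get deleted along with the intended :443 suffix:
$ echo 'https://www.normandy1944.info:443' | awk '{gsub("https://", "") ;print}' | tr -d ":443"
www.normandy19.info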
Also, the code above features the anti-pattern known as "useless use of cat", since GNU AWK can read a file on its own. That is,
cat results.txt | awk '{gsub("https://", "") ;print}'
can be written more succinctly as
awk '{gsub("https://", "") ;print}' results.txt
I would rewrite your whole pipeline (cat, awk, tr) as a single awk command:
awk '{gsub("^https://|:443$","");print}' results.txt
Explanation: replace https:// at the start of the line (^) or (|) :443 at the end of the line ($) with the empty string (i.e. delete these parts), then print. The ^ and $ anchors prevent deleting https:// and :443 in the middle of a string; feel free to drop them if you consider that case unlikely.
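A quick check with the trickier URL from above should confirm that the anchored version leaves the hostname intact:
$ echo 'https://www.normandy1944.info:443' | awk '{gsub("^https://|:443$","");print}'
www.normandy1944.info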

Bash + sed/awk/cut to delete nth character

I'm trying to delete the 6th, 7th and 8th character of each line.
Below is the file, in plain text format.
Actual content:
#cat test
18:40:12,172.16.70.217,UP
18:42:15,172.16.70.218,DOWN
Expected output after formatting:
#cat test
18:40,172.16.70.217,UP
18:42,172.16.70.218,DOWN
I even tried the following, with no luck:
#awk -F ":" '{print $1":"$2","$3}' test
18:40,12,172.16.70.217,UP
#sed 's/^\(.\{7\}\).\(.*\)/\1\2/' test   # here I can remove only one character
18:40:1,172.16.70.217,UP
cut also failed:
#cut -d ":" -f1,2,3 test
18:40:12,172.16.70.217,UP
I need to delete the 6th, 7th and 8th character of each line.
Suggestions, please.
With GNU cut you can use the --complement switch to remove characters 6 to 8:
cut --complement -c6-8 file
Otherwise, you can just select the rest of the characters yourself:
cut -c1-5,9- file
i.e. characters 1 to 5, then 9 to the end of each line.
With awk you could use substrings:
awk '{ print substr($0, 1, 5) substr($0, 9) }' file
Or you could write a regular expression, but the result will be more complex.
For example, to remove the last three characters from the first comma-separated field:
awk -F, -v OFS=, '{ sub(/...$/, "", $1) } 1' file
Or, using sed with a capture group:
sed -E 's/(.{5}).{3}/\1/' file
Capture the first 5 characters and use them in the replacement, dropping the next 3.
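Against the sample file test from the question, that sed command should give:
$ sed -E 's/(.{5}).{3}/\1/' test
18:40,172.16.70.217,UP
18:42,172.16.70.218,DOWN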
It's structured text, so why count the chars when you can describe them?
$ awk '{sub(":..,",",")}1' file
18:40,172.16.70.217,UP
18:42,172.16.70.218,DOWN
remove the seconds.
The solutions below are generic and assume no knowledge of any format; they just delete characters 6, 7 and 8 of any line.
sed:
sed 's/.//8;s/.//7;s/.//6' <file> # from high to low
sed 's/.//6;s/.//6;s/.//6' <file> # from low to high (subtract 1)
sed 's/\(.....\).../\1/' <file>
sed 's/\(.\{5\}\).../\1/' <file>
s/BRE/replacement/n :: substitute nth occurrence of BRE with replacement
awk:
awk 'BEGIN{OFS=FS=""}{$6=$7=$8="";print $0}' <file>
awk -F "" '{OFS=$6=$7=$8="";print}' <file>
awk -F "" '{OFS=$6=$7=$8=""}1' <file>
These are three variations of the same idea: setting the field separator FS to the empty string makes awk treat each character as a field. We empty fields 6, 7 and 8 and reprint the line with an output field separator OFS that is also empty.
cut:
cut -c -5,9- <file>
cut --complement -c 6-8 <file>
Just for fun, perl, where you can assign to a substring
perl -pe 'substr($_,5,3)=""' file
With awk :
echo "18:40:12,172.16.70.217,UP" | awk '{ $0 = ( substr($0,1,5) substr($0,9) ) ; print $0}'
Regards!
If you are running bash, you can use its built-in string manipulation instead of calling awk, sed, cut or some other binary:
while read -r STRING
do
echo "${STRING:0:5}${STRING:8}"
done < myfile.txt
${STRING:0:5} is the first five characters of the string (offsets 0 to 4), and ${STRING:8} is everything from the 9th character (offset 8) to the end of the line. This way you cut out characters 6, 7 and 8 ...
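A quick interactive check with one of the sample lines from the question:
$ STRING='18:40:12,172.16.70.217,UP'
$ echo "${STRING:0:5}${STRING:8}"
18:40,172.16.70.217,UP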

Count number of Special Character in Unix Shell

I have a delimited file that is separated by octal \036 or Hexadecimal value 1e.
I need to count the number of delimiters on each line using a bash shell script.
I was trying to use awk, not sure if this is the best way.
Sample Input (| is a representation of \036)
Example|Running|123|
Expected output:
3
awk -F'|' '{print NF-1}' file
Change | to whatever separator you like. If your file can have empty lines then you need to tweak it to:
awk -F'|' '{print (NF ? NF-1 : 0)}' file
You can try
awk '{print gsub(/\|/,"")}'
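To count the actual \036 delimiter instead of the | used as a stand-in, the same gsub-counting idea should carry over (an untested sketch, relying on awk accepting the octal escape in the regex):
awk '{print gsub(/\036/, "")}' file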
Simply try
awk -F"|" '{print substr($3,length($3))}' OFS="|" Input_file
Explanation: the field separator -F is set to |, and substr($3,length($3)) prints the last character of the 3rd field, which for the sample line happens to equal the expected count of 3. OFS sets the output field separator to |, and Input_file is the name of your input file.
This will work as far as I know
echo "Example|Running|123|" | tr -cd '|' | wc -c
Output
3
This should work for you:
awk -F '\036' '{print NF-1}' file
3
-F '\036' sets the input field delimiter to the octal value 036.
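If you want to try it without a real data file, a matching test line can be generated on the fly with printf (a sketch mirroring the sample input):
$ printf 'Example\036Running\036123\036\n' | awk -F '\036' '{print NF-1}'
3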
Awk may not be the best tool for this. GNU grep has a handy -o option that prints each match on a separate line. You can then count how many matching lines are generated for each input line, and that's the count of your delimiters. E.g. (where ^^ in the file is actually hex 1e):
$ cat -v i
a^^b^^c
d^^e^^f^^g
$ grep -n -o $'\x1e' i | uniq -c
2 1:
3 2:
If you remove the uniq -c you can see how it's working. You'll get "1" printed twice because there are two matching patterns on the first line. Or try it with some regular ASCII characters and it becomes clearer what the -o and -n options are doing.
If you want to print the line number followed by the field count for that line, I'd do something like:
$ grep -n -o $'\x1e' i | tr -d ':' | uniq -c | awk '{print $2 " " $1}'
1 2
2 3
This assumes that every line in the file contains at least one delimiter. If that's not the case, here's another approach that's probably faster too:
$ tr -d -c $'\x1e\n' < i | awk '{print length}'
2
3
0
0
0
This uses tr to delete (-d) all characters that are not (-c) 1e or \n. It then pipes that stream of data to awk which just counts how many characters are left on each line. If you want the line number, add " | cat -n" to the end.

How to truncate trailing space in xargs

I would like to use xargs to list the contents of some files based on the output of command A. The xargs replace-str (-I) seems to be adding a space to the end, causing the command to fail. Any suggestions? I know this can be worked around with a for loop, but I'm curious how to do it with xargs.
lsscsi |awk -F\/ '/ATA/ {print $NF}' | xargs -L 1 -I % cat /sys/block/%/queue/scheduler
cat: /sys/block/sda /queue/scheduler: No such file or directory
The problem is not with xargs -I, which does not append a space to each argument, as can be verified as follows:
$ echo 'sda' | xargs -I % echo '[%]'
[sda]
Incidentally, specifying -L 1 in addition to -I is pointless: -I implies line-by-line processing.
Therefore, it must be the output from the command that provides input to xargs that contains the trailing space.
You can adapt your awk command to fix that:
lsscsi |
awk -F/ '/ATA/ {sub(/ $/,"", $NF); print $NF}' |
xargs -I % cat '/sys/block/%/queue/scheduler'
sub(/ $/,"", $NF) replaces a trailing space in field $NF with the empty string, thereby effectively removing it.
Note how I've (single-)quoted cat's argument so as to make it work even with filenames with spaces.
lsscsi |awk -F\/ '/ATA/ {print $NF}'| awk '{print $NF}' | xargs -L 1 -I % cat /sys/block/%/queue/scheduler
The first awk splits on "/", so everything else is treated as part of the field; in this case the last field is "sda " with a trailing space. But by default awk splits on whitespace and ignores leading and trailing spaces, so after the pipe the second awk prints $NF (the last word of the line) without the trailing space. awk '{print $1}' would do the same here, because there is only one word, "sda", which is both the first and the last field.
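You can see the trimming step in isolation like this (hypothetical input; GNU cat -A marks the end of each output line with $):
$ echo 'sda ' | awk '{print $NF}' | cat -A
sda$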

Bash: How can I read from file and display the output in one line?

I'd like to run:
grep nfs /etc/fstab | awk '{print $2}'
[root@nyproxy5 ~]# grep nfs /etc/fstab | awk '{print $2}'
/proxy_logs
/proxy_dump
/sync_logs
[root@nyproxy5 ~]#
And I'd like to get the output on one line, delimited by spaces.
How can I do that?
If you don't mind a space (and no newline) at the end, you could use this awk script:
awk '/nfs/{printf "%s ", $2}' /etc/fstab
For lines that match the pattern /nfs/, the second column is printed followed by a space. As a general rule, piping grep into awk is unnecessary as awk can do the pattern matching itself.
If you would like a newline at the end, you could use the END block:
awk '/nfs/{printf "%s ", $2}END{print ""}' /etc/fstab
This prints an empty string, followed by the output record separator (which is a newline). This will mean that you always have a newline in the output even if no matching records were found. If that's a problem, you could use a flag:
awk '/nfs/{f=1;printf "%s ", $2}END{if(f)print ""}' /etc/fstab
The flag f is set to true if the pattern is ever matched, causing the newline to be printed.
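With the sample /etc/fstab entries shown in the question, this version should print (still with a trailing space before the final newline):
$ awk '/nfs/{f=1;printf "%s ", $2}END{if(f)print ""}' /etc/fstab
/proxy_logs /proxy_dump /sync_logs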
Newlines or any other character can be removed or replaced with the tr command:
grep nfs /etc/fstab | awk '{print $2}' | tr '\n' ' '
If you want to replace tabs with spaces as well:
| tr '\n\t' ' '
