How to append string to file if it is not included in the file? - bash

Protagonists
The Admin
Pipes
The Cron Daemon
A bunch of text processing utilities
netstat
>> the Scribe
Setting
The Cron Daemon is repeatedly performing the same job where he forces an innocent netstat to show the network status (netstat -n). Pipes then have to pick up the information and deliver it to bystanding text processing utilities (| grep tcp | awk '{ print $5 }' | cut -d "." -f-4). >> has to scribe the important results to a file. As his highness, The Admin, is a lazy and easily annoyed ruler, >> only wants to scribe new information to the file.
*/1 * * * * netstat -n | grep tcp | awk '{ print $5 }' | cut -d "." -f-4 >> /tmp/file
Soliloquy by >>
To append, or not append, that is the question:
Whether 'tis new information to bother The Admin with
and earn an outrageous Fortune,
Or to take Arms against `netstat` and the others,
And by opposing, ignore them? To die: to sleep;
note by the publisher: For all those that had problems understanding Hamlet, like I did, the question is, how do I check if the string is already included in the file and if not, append it to the file?

Unless you are dealing with a very big file, you can use sort and uniq to remove the duplicate lines from the file. This also leaves the file sorted; I don't know whether that is an advantage or a disadvantage for you:
netstat -n | grep tcp | awk '{ print $5 }' | cut -d "." -f-4 >> /tmp/file && sort /tmp/file | uniq > /tmp/file.uniq
This will give you the sorted results without duplicates in /tmp/file.uniq
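If the separate /tmp/file.uniq is not needed, the same de-duplication can be done in place. This is a minimal sketch, assuming a POSIX sort; the function name dedupe_in_place is made up for illustration:

```shell
#!/usr/bin/env bash
# dedupe_in_place FILE (hypothetical helper): sort FILE and drop duplicate
# lines, writing the result back to FILE itself.
dedupe_in_place() {
    # sort reads all of its input before it opens the -o output file,
    # so naming the same file as input and output is safe.
    sort -u -o "$1" "$1"
}
```

In a script run from cron, this would be called right after the append, for example dedupe_in_place /tmp/file.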

What a piece of work is piping, how easy to reason about,
how infinite in use cases, in bash and script,
how elegant and admirable in action,
how like a vim in flexibility,
how like a gnu!
Here is a slightly different take:
netstat -n | awk -F"[\t .]+" '/tcp/ {print $9"."$10"."$11"."$12}' | sort -u | while read -r ip; do if ! grep -qxF -- "$ip" /tmp/file; then echo "$ip" >> /tmp/file; fi; done
Explanation:
awk -F"[\t .]+" '/tcp/ {print $9"."$10"."$11"."$12}'
Awk splits each input line on runs of tabs, spaces and dots. The input is filtered for lines containing "tcp" (instead of using a separate grep invocation). Finally the resulting fields are joined with dots and printed.
sort -u
Sorts the IPs and keeps one copy of each. This eliminates the need for the separate uniq command. Plain -u is used rather than -nu: with -n, sort compares only the leading number of each line, so -u would treat 10.0.0.1 and 10.0.0.2 as duplicates.
if ! grep -qxF -- "$ip" /tmp/file; then echo "$ip" >> /tmp/file; fi
Greps for the IP in the file; if it doesn't find it, the IP gets appended. The -F and -x options make grep compare the IP as a literal string against whole lines, so 10.0.0.1 does not match an existing 10.0.0.11 and the dots are not treated as regex wildcards.
Note: This solution does not remove old entries and clean up the file after each run - it merely appends - as your question implied.
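The per-line grep in the loop can also be replaced by a single grep that uses the file itself as the list of patterns to exclude. This is a sketch, assuming GNU grep (where an empty pattern file matches nothing); the function name append_new_lines is hypothetical:

```shell
#!/usr/bin/env bash
# append_new_lines FILE (hypothetical helper): read candidate lines on stdin
# and append to FILE only those not already present as whole lines.
append_new_lines() {
    local file=$1
    touch "$file"    # make sure the pattern file exists on the first run
    # -f FILE: take patterns from FILE   -F: literal strings
    # -x: match whole lines only         -v: keep the lines that do NOT match
    # grep reads all patterns before it writes anything, so appending to
    # the same file is fine. Exit status 1 only means "nothing new".
    sort -u | grep -vxFf "$file" >> "$file" || [ "$?" -eq 1 ]
}
```

The netstat pipeline from the question would then end in | append_new_lines /tmp/file instead of >> /tmp/file.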

Related

Trouble Allocating Memory in Bash Script

I tried to automate the process of cleaning up various wordlists I am working with. This is the code for it:
#!/bin/bash
# Removes spaces and duplicates in a wordlist
echo "Please be in the same directory as wordlist!"
read -p "Enter Worldlist: " WORDLIST
RESULT=$( awk '{print length, $0}' $WORDLIST | sort -n | cut -d " " -f2- )
awk '!(count[$0]++)' $RESULT > better-$RESULT
This is the error I receive after running the program:
./wordlist-cleaner.sh: fork: Cannot allocate memory
First post, I hope I formatted it correctly.
You didn't describe your intentions or desired output, but I guess this may do what you want:
awk '{print length, $0}' "$WORDLIST" | sort -n | cut -d " " -f2- | uniq > better-RESULT
Notice that it's better-RESULT instead of better-$RESULT: $RESULT holds the entire sorted wordlist, and you don't want that as part of a filename.
Yeah okay, I got it to run successfully. I was trying to clean up wordlists I was downloading off the net. I have some knowledge of basic variable usage in Bash, but not enough of the text manipulation commands like sed or awk. Thanks for the support.
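One plausible reading of the error is that RESULT=$(...) stores the whole sorted wordlist in a shell variable, which is then expanded into the awk command line and the output filename. The sketch below keeps the data in the pipeline instead. The function name clean_wordlist and its argument order are assumptions, not part of the original script:

```shell
#!/usr/bin/env bash
# clean_wordlist INPUT OUTPUT (hypothetical helper): order the words by
# length and drop duplicate lines, streaming through the pipeline so the
# wordlist is never held in a shell variable.
clean_wordlist() {
    local input=$1 output=$2
    awk '{ print length, $0 }' "$input" |   # prefix each line with its length
        sort -n |                           # shortest words first
        cut -d ' ' -f2- |                   # strip the length prefix again
        uniq > "$output"                    # identical lines are now adjacent
}
```

It could be called as clean_wordlist "$WORDLIST" "better-$WORDLIST", which also gives the output a name derived from the input file rather than from its contents.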

How to grep only the first string in a line

I'm writing a script that checks a list of all the users connected to the server (using who) and writes to the file Information the list of usernames of only those having letters a, b, c or d. This is what I have so far:
who | grep '[a-d]' >> Information
However, the command who displays this:
username pts/148 2019-01-29 16:09 (IP address)
What I don't understand is why my grep search is also displaying the pts/148, date, time, and IP address. I just want it to send the username to the file Information.
Any help is appreciated.
Another way is to use the command cut to get the first part of the string only.
who | cut -f 1 -d ' ' | grep '[a-d]' >> Information
Using awk to output the first column of the records where that column matches [a-d]:
$ who | awk '$1 ~ /[a-d]/ { print $1 }' >> Information
Using grep to search for lines with [a-d] before the first space:
$ who | grep -o "^[^ ]*[a-d][^ ]*" >> Information
You need to get the first word, otherwise grep will display the entire line that has the matching text. You could use awk:
who | awk '{ if (substr($1,1,1) ~ /^[a-d]/ ) print $1 }' >>Information
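Note that the substr variant only tests the first letter of the username, while the other answers accept a, b, c or d anywhere in it. The following small sketch of the "anywhere" reading can be tried without a busy server. It assumes, as in the sample line from the question, that the username is the first whitespace-separated field; the function name and the sample users are made up:

```shell
#!/usr/bin/env bash
# users_with_a_to_d (hypothetical helper): read `who`-style lines on stdin
# and print only the usernames that contain a, b, c or d.
users_with_a_to_d() {
    awk '$1 ~ /[a-d]/ { print $1 }'
}

# Canned sample input instead of real `who` output (invented users).
printf '%s\n' \
    'alice pts/1 2019-01-29 16:09 (10.0.0.1)' \
    'zoe   pts/2 2019-01-29 16:10 (10.0.0.2)' \
    'bob   pts/3 2019-01-29 16:11 (10.0.0.3)' |
    users_with_a_to_d
```

With the sample above this prints alice and bob; on a real system the call would be who | users_with_a_to_d >> Information.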

grep: compare string from file with another string

I have a list of files paths that I need to compare with a string:
git_root_path=$(git rev-parse --show-toplevel)
list_of_files=.git/ForGeneratingSBConfigAlert.txt
cd $git_root_path
echo "These files needs new OSB config:"
while read -r line
do
modfied="$line"
echo "File for compare: $modfied"
if grep -qf $list_of_files" $modfied"; then
echo "Found: $modfied"
fi
done < <(git status -s | grep -v " M" | awk '{if ($1 == "M") print $2}')
$modfied is a string variable that stores the path to a file.
Pattern file example:
SVCS/resources/
SVCS/bus/projects/busCallout/
SVCS/bus/projects/busconverter/
SVCS/bus/projects/Resources/ (ignore .jar)
SVCS/bus/projects/Teema/
SVCS/common/
SVCS/domain/
SVCS/techutil/src/
SVCS/tech/mds/src/java/fi/vr/h/service/tech/mds/exception/
SVCS/tech/mds/src/java/fi/vr/h/service/tech/mds/interfaces/
SVCS/app/cashmgmt/src/java/fi/vr/h/service/app/cashmgmt/exception/
SVCS/app/cashmgmt/src/java/fi/vr/h/service/app/cashmgmt/interfaces/
SVCS/app/customer/src/java/fi/vr/h/service/app/customer/exception/
SVCS/app/customer/src/java/fi/vr/h/service/app/customer/interfaces/
SVCS/app/etravel/src/java/fi/vr/h/service/app/etravel/exception/
SVCS/app/etravel/src/java/fi/vr/h/service/app/etravel/interfaces/
SVCS/app/hermes/src/java/fi/vr/h/service/app/hermes/exception/
SVCS/app/hermes/src/java/fi/vr/h/service/app/hermes/interfaces/
SVCS/app/journey/src/java/fi/vr/h/service/app/journey/exception/
SVCS/app/journey/src/java/fi/vr/h/service/app/journey/interfaces/
SVCS/app/offline/src/java/fi/vr/h/service/app/offline/exception/
SVCS/app/offline/src/java/fi/vr/h/service/app/offline/interfaces/
SVCS/app/order/src/java/fi/vr/h/service/app/order/exception/
SVCS/app/order/src/java/fi/vr/h/service/app/order/interfaces/
SVCS/app/payment/src/java/fi/vr/h/service/app/payment/exception/
SVCS/app/payment/src/java/fi/vr/h/service/app/payment/interfaces/
SVCS/app/price/src/java/fi/vr/h/service/app/price/exception/
SVCS/app/price/src/java/fi/vr/h/service/app/price/interfaces/
SVCS/app/product/src/java/fi/vr/h/service/app/product/exception/
SVCS/app/product/src/java/fi/vr/h/service/app/product/interfaces/
SVCS/app/railcar/src/java/fi/vr/h/service/app/railcar/exception/
SVCS/app/railcar/src/java/fi/vr/h/service/app/railcar/interfaces/
SVCS/app/reservation/src/java/fi/vr/h/service/app/reservation/exception/
SVCS/app/reservation/src/java/fi/vr/h/service/app/reservation/interfaces/
kraken_test.txt
namaker_test.txt
shmaker_test.txt
I need to compare the search patterns in the file with a string. Is that possible using grep?
I'm not sure I understand the overall logic, but a few immediate suggestions come to mind.
You can avoid grep | awk in the vast majority of cases.
A while loop with a grep on a line at a time inside the loop is an antipattern. You probably just want to run one grep on the whole input.
Your question would still benefit from an explanation of what you are actually trying to accomplish.
cd "$(git rev-parse --show-toplevel)"
git status -s | awk '!/ M/ && $1 == "M" { print $2 }' |
grep -Fxf .git/ForGeneratingSBConfigAlert.txt
I was trying to think of a way to add back your human-readable babble, but on second thought, this program is probably better without it.
The -x option to grep might be wrong, depending on what you are really hoping to accomplish.
This should work:
git status -s | grep -v " M" | awk '{if ($1 == "M") print $2}' | \
grep --file=.git/ForGeneratingSBConfigAlert.txt --fixed-strings --line-regexp
Piping the awk output directly to grep avoids the while loop entirely. In most cases you'll find you don't really need to print debug messages and the like in it.
--file takes a file with one pattern to match per line.
--fixed-strings avoids treating any characters in the patterns as special.
--line-regexp anchors the patterns so that they only match if a full line of input matches one of the patterns.
All that said, could you clarify what you are actually trying to accomplish?
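If the intent is to flag modified files that live under one of the listed directories, a whole-line match (-x or --line-regexp) would miss them, since SVCS/common/Foo.java is not equal to SVCS/common/. Here is a sketch of prefix matching under that assumption; match_prefixes and the example paths are hypothetical:

```shell
#!/usr/bin/env bash
# match_prefixes PATTERN_FILE (hypothetical helper): read paths on stdin and
# print those that start with any non-empty line of PATTERN_FILE.
match_prefixes() {
    awk -v pfile="$1" '
        BEGIN {
            # Load every non-empty pattern line into the prefix array.
            while ((getline p < pfile) > 0)
                if (p != "") prefix[++n] = p
        }
        {
            # index() == 1 means the path begins with the literal prefix.
            for (i = 1; i <= n; i++)
                if (index($0, prefix[i]) == 1) { print; next }
        }
    '
}
```

It would take the place of the final grep, e.g. git status -s | awk '$1 == "M" { print $2 }' | match_prefixes .git/ForGeneratingSBConfigAlert.txt. Pattern lines are compared literally, so an entry with a trailing remark such as "(ignore .jar)" would need that remark removed first.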

Bash Output different from command line

I have tried all kinds of filters using grep to try and solve this but just cannot crack it.
cpumem="$(ps aux | grep -v 'grep' | grep 'firefox-bin' | awk '{printf $3 "\t" $4}')"
I am extracting the CPU and Memory usage for a process and when I run it from the command line, I get the 2 fields outputted correctly:
ps aux | grep -v 'grep' | grep 'firefox-bin' | awk '{printf $3 "\t" $4}'
> 1.1 4.4
but the same command executed from within the bash script produces this:
cpumem="$(ps aux | grep -v 'grep' | grep 'firefox-bin' | awk '{printf $3 "\t" $4}')"
echo -e $cpumem
> 1.1 4.40.0 0.10.0 0.0
I am guessing that it is picking up 3 records, but I just don't know where from.
I am filtering out any other grep processes by using grep -v 'grep'. Can someone offer any suggestions or a more reliable way?
Maybe you have 3 records because 3 firefox are running (or one is running, and it is threading itself).
You can avoid the grep hassle by giving ps an option to select the processes, e.g. -C to select processes by name. With ps -C firefox-bin you get only the firefox-bin processes. But this does not help when there is more than one such process.
(You can also use the -o option of ps to output only the columns you want, so your line would be something like
ps -C firefox-bin --no-headers -o %cpu,%mem
).
For the triple record, you must decide what should happen when more than one process is running. In a multi-user environment with programs that spawn threads or child processes, there can always be situations where more than one process of a kind exists. There are many possible solutions, and none can be recommended as long as you don't say what you are going to do with the values. One can think of solutions like selecting only the processes of one user, only the one with the lowest PID, or the process group leader; changing the enclosing bash script to loop over the multiple values; or making it behave differently when ps returns multiple results.
I was not able to reproduce the problem, but to help you debug, try printing $11 in your awk command; that will tell you which process each line refers to:
cpumem="$(ps aux | grep -v 'grep' | grep 'firefox-bin' | awk '{printf $3 "\t" $4 "\t" $11 "\n"}')"
echo -e "$cpumem"
It's actually an easy fix for the output display: in your echo statement, wrap the variable in double quotes:
echo -e "$cpumem"
Without double quotes, the shell splits the expanded value on whitespace, so newlines and tabs are collapsed into single spaces. With quotes, the original text of the variable is preserved when it is output.
If your output contains multiple processes (i.e. multiple lines), that means your grep actually matched multiple lines. There's a chance a child process is running for firefox-bin, maybe a plugin or container. With ps aux, the 11th column tells you what the actual process is, so you can update your awk to the following (for debugging):
awk '{printf "%s\t%s\t%s\n", $3, $4, $11}'
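If several matching processes are expected and a single pair of numbers is wanted, one option among those discussed above is to add the values up. This is a sketch under that assumption; sum_cpu_mem is a hypothetical name, and summing is only one possible policy:

```shell
#!/usr/bin/env bash
# sum_cpu_mem (hypothetical helper): read "%cpu %mem" pairs on stdin, one
# process per line, and print the two totals separated by a tab.
sum_cpu_mem() {
    awk '{ cpu += $1; mem += $2 }
         END { printf "%.1f\t%.1f\n", cpu, mem }'
}

# Sample values taken from the three records in the question's output.
printf '%s\n' '1.1 4.4' '0.0 0.1' '0.0 0.0' | sum_cpu_mem
```

With procps ps, the real input could come from ps -C firefox-bin --no-headers -o %cpu,%mem | sum_cpu_mem, which also removes the need for the two grep calls.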

Get open ports as an array

So, I'm using netstat -lt to get open ports. However, I'm not interested in certain values (like SSH or 22), so I want to be able to exclude them. I also want to get them as an array in bash. So far I have netstat -lt | sed -r 's/tcp[^:]+://g' | cut -d' ' -f1 but they're not an array, nor am I excluding anything.
Try using the ss command, which replaces netstat.
ss -atu | awk '{print $5}' | awk -F: '{print $NF}'
The ss command gives you all TCP and UDP ports on the local machine (the only sockets that would have ports). The first awk extracts the column containing the local address and port number. The second awk takes only the last field following a colon; this is necessary in case you have IPv6 sockets on your machine, whose IP address will also include colons.
Once you've done this, you can grep out the ports you don't want. Also, see the documentation referred to by the ss man page for information on filters, which may let you filter out unwanted sockets from the output of ss.
Add ($()) around your statement:
port=($(netstat -ltn | sed -rne '/^tcp/{/:(22|25)\>/d;s/.*:([0-9]+)\>.*/\1/p}'))
Filtering ports 22 and 25.
a=( `netstat -ltn --inet | sed -r 's/tcp[^:]+://g' | cut -d' ' -f1 | sed -e '1,2d' | grep -v "22\|33\|25"` )
The second sed command removes the header lines, if your version of netstat prints them; I have "Active" and "Proto" as the first two lines. Use grep to filter unwanted ports. Add -n to netstat to see port numbers instead of names. --inet forces IPv4; otherwise you may see IPv6 entries, which may confuse your script.
By the way, I'm not sure you need an array. Arrays are usually needed only if you are going to work on a subset of the values you have. If you work on all values, there are simpler constructs, but I'm not sure what you're going to do.
Regards.
Update: you can use a single sed command with two operations instead of two separate invocations:
sed -r -e '1,2d' -e 's/tcp[^:]+://g'
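Note that grep -v "22\|33\|25" also drops any port that merely contains those digits, such as 2200 or 8025. Below is a sketch that excludes exact port numbers only and fills the array with mapfile, assuming bash 4 or later; exclude_ports and the sample port list are hypothetical:

```shell
#!/usr/bin/env bash
# exclude_ports PORT... (hypothetical helper): read one port per line on
# stdin and drop the lines that are exactly one of the given ports.
exclude_ports() {
    local IFS='|'               # makes "$*" expand to e.g. 22|25
    grep -vE -- "^($*)\$"       # anchored, so 22 does not match 220
}

# Sample port list instead of real netstat/ss output.
mapfile -t ports < <(printf '%s\n' 22 80 443 25 8080 220 | exclude_ports 22 25)

printf 'open port: %s\n' "${ports[@]}"
```

Any of the one-port-per-line pipelines above can replace the printf sample as the input to exclude_ports.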

Resources