How to grep only the first string in a line - bash

I'm writing a script that checks a list of all the users connected to the server (using who) and writes to the file Information the list of usernames of only those having letters a, b, c or d. This is what I have so far:
who | grep '[a-d]' >> Information
However, the command who displays this:
username pts/148 2019-01-29 16:09 (IP address)
What I don't understand is why my grep search is also displaying the pts/148, date, time, and IP address. I just want it to send the username to the file Information.
Any help is appreciated.

Another way is to use the command cut to get the first part of the string only.
who | cut -f 1 -d ' ' | grep '[a-d]' >> Information

Using awk to output records where the first column matches [a-d]:
$ who | awk '$1~/[a-d]/' >> Information
Using grep to search for lines with [a-d] before the first space:
$ who | grep -o "^[^ ]*[a-d][^ ]*" >> Information
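For illustration, here is a made-up who line (hypothetical username and IP) showing how the -o flag prints only the matching token instead of the whole line:
$ echo 'dave pts/148 2019-01-29 16:09 (192.168.1.5)' | grep -o "^[^ ]*[a-d][^ ]*"
dave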

You need to get the first word, otherwise grep will display the entire line that has the matching text. You could use awk:
who | awk '{ if (substr($1,1,1) ~ /^[a-d]/ ) print $1 }' >>Information
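The substr call and the ^ anchor do the same job twice here; if the intent is to match usernames whose first character is a, b, c or d, an equivalent and slightly shorter sketch is:
who | awk '$1 ~ /^[a-d]/ { print $1 }' >> Information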

Related

Substring in linux using cut

I would like to grab a substring of a file to get the default password of MySQL in CentOS.
This is the command I am using to get the password:
sudo grep 'temporary password' /var/log/mysqld.log
which results in:
2018-02-21T07:03:11.681201Z 1 [Note] A temporary password is generated for root@localhost: >KkHAt=#z6OV
Now, I am using this command to get the password only and remove the unnecessary stuff, so I can use it in a script:
sudo grep 'temporary password' /var/log/mysqld.log | cut -d ':' -f 4 | cut -d ' ' -f 2
But using 2 cuts seems very ugly. Is there another command or tool that I can use, or a more elegant way to do this?
Using awk:
$ awk '/temporary password/{print $NF}' file
>KkHAt=#z6OV
Bearing in mind that awk splits each line into fields based on a field separator (whitespace by default) and that NF holds the number of fields, you can print the last field with:
$ grep 'temporary password' /var/log/mysqld.log | awk '{print $NF}'
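As a tiny illustration of NF versus $NF (made-up input):
$ echo 'one two three' | awk '{print NF, $NF}'
3 three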

bash - get usernames from command last output

I need to get all users from a file containing information about all logins within some time interval. (The delimiter is : )
So, I need to get all users from the output of the command last -f.
I tried to do this:
last -f file| cut -d ":" -f1
but the output isn't just the usernames. It seems to me that some records take more than one line and therefore it can't distinguish the records. I don't know.
Could you help me please? I would be grateful for any advice.
You could say:
last -f file | awk '{print $1}'
If you want to use cut, say:
last -f file| cut -d " " -f1
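The difference between the two is that cut treats every single space as a field boundary, while awk splits on runs of whitespace. A quick illustration with a made-up line containing two spaces: the second token is field 3 for cut but field 2 for awk, which is why awk tends to be the safer choice for last output:
$ echo 'alice  pts/0' | cut -d ' ' -f3
pts/0
$ echo 'alice  pts/0' | awk '{print $2}'
pts/0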

How to append string to file if it is not included in the file?

Protagonists
The Admin
Pipes
The Cron Daemon
A bunch of text processing utilities
netstat
>> the Scribe
Setting
The Cron Daemon is repeatedly performing the same job where he forces an innocent netstat to show the network status (netstat -n). Pipes then have to pick up the information and deliver it to bystanding text processing utilities (| grep tcp | awk '{ print $5 }' | cut -d "." -f-4). >> has to scribe the important results to a file. As his highness, The Admin, is a lazy and easily annoyed ruler, >> only wants to scribe new information to the file.
*/1 * * * * netstat -n | grep tcp | awk '{ print $5 }' | cut -d "." -f-4 >> /tmp/file
Soliloquy by >>
To append, or not append, that is the question:
Whether 'tis new information to bother The Admin with
and earn an outrageous Fortune,
Or to take Arms against `netstat` and the others,
And by opposing, ignore them? To die: to sleep;
note by the publisher: For all those that had problems understanding Hamlet, like I did, the question is, how do I check if the string is already included in the file and if not, append it to the file?
Unless you are dealing with a very big file, you can use the uniq command to remove the duplicate lines from the file. This also means the file ends up sorted; I don't know whether that is an advantage or a disadvantage for you:
netstat -n | grep tcp | awk '{ print $5 }' | cut -d "." -f-4 >> /tmp/file && sort /tmp/file | uniq > /tmp/file.uniq
This will give you the sorted results without duplicates in /tmp/file.uniq
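As a small aside, sort -u does the sort and uniq steps in one go (and with -o it can even write back over its own input safely), so the same idea can be written as:
netstat -n | grep tcp | awk '{ print $5 }' | cut -d "." -f-4 >> /tmp/file && sort -u /tmp/file -o /tmp/file.uniq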
What a piece of work is piping, how easy to reason about,
how infinite in use cases, in bash and script,
how elegant and admirable in action,
how like a vim in flexibility,
how like a gnu!
Here is a slightly different take:
netstat -n | awk -F"[\t .]+" '/tcp/ {print $9"."$10"."$11"."$12}' | sort -nu | while read ip; do if ! grep -q $ip /tmp/file; then echo $ip >> /tmp/file; fi; done;
Explanation:
awk -F"[\t .]+" '/tcp/ {print $9"."$10"."$11"."$12}'
Awk splits each input line on runs of tabs, spaces and dots. The input is filtered (instead of using a separate grep invocation) to lines containing "tcp". Finally the resulting output fields are concatenated with dots and printed out.
sort -nu
Sorts the IPs numerically and creates a set of unique entries. This eliminates the need for the separate uniq command.
if ! grep -q $ip /tmp/file; then echo $ip >> /tmp/file; fi;
Greps for the ip in the file, if it doesn't find it, the ip gets appended.
Note: This solution does not remove old entries and clean up the file after each run - it merely appends - as your question implied.
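For comparison, here is a minimal single-pass sketch that keeps the original field extraction: grep -vxFf treats the existing lines of /tmp/file as literal, whole-line patterns (so the dots in an IP are not regex wildcards) and lets through only the addresses that are not in the file yet. It assumes /tmp/file already exists (touch it once beforehand) and, like the original, never cleans up old entries:
netstat -n | grep tcp | awk '{ print $5 }' | cut -d "." -f-4 | sort -u | grep -vxFf /tmp/file >> /tmp/file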

Extracting a word from the middle of a sentence

I have a log file like this:
2013-07-10 21:40:54 [INFO] Janus_Mesca joined the game
2013-07-10 21:40:54 [INFO] Fenlig joined the game
2013-07-10 21:41:21 [INFO] BigRedHoodie joined the game
I'm trying to print whatever appears in between "[INFO]" and "joined".
With my attempts I've only been able to remove the two words themselves.
tail -500 $rfile | grep "INFO.*joined the game" | \
sed -e 's/\[INFO\]\(.*\)joined/\1/'
Can you help?
Pure grep version with lookahead/lookbehind.
P.S. Option -P might not be available everywhere, but I thought it was clever.
tail test.log | grep -Po '(?<=\[INFO\] ).*(?= joined .*)'
You're almost there. You just need to make the pattern match the entire line, and replace it with the name you've captured.
You can also eliminate the need for grep by using a lesser-known feature of sed: Use the -n flag to prevent it from printing each line by default, and add a p command to make it print the matching lines:
tail -n 500 $rfile | sed -n 's/.*INFO] \(.*\)joined .*/\1/p'
This is an awk answer:
awk -F" " '{print $4}' data
where data is the input file. Provided the delimiter is a space, the output is like:
Janus_Mesca
Fenlig
BigRedHoodie
If you want to stick more strictly to what lies between [INFO] and joined, here's an alternative:
awk -F"\\[INFO\\] " '{ split( $2, arr, " joined" ); print arr[1] }' data
for which I had to check out this answer to find out how to escape the square brackets. If you want the leading and trailing spaces left in the user name, take them out of each respective pattern.

Using sed/awk to limit/parse output of LDAP DN's

I have a large list of LDAP DN's that are all related in that they failed to import into my application. I need to query these against my back-end database based on a very specific portion of the CN, but I'm not entirely sure how to restrict the strings down to a very specific value that is not necessarily located in the same position every time.
Using the following bash command:
grep 'Failed to process entry' /var/log/tomcat6/catalina.out | awk '{print substr($0, index($0,$14))}'
I am able to return a list of DN's similar to: (sorry for the redacted nature, security dictates)
"cn=[Last Name] [Optional Middle Initial or Suffix] [First Name] [User name],ou=[value],ou=[value],o=[value],c=[value]".
The CN value can be confusing since the surname, given name, middle initial, prefix or suffix can appear in any order, if those values even exist, but one thing does remain consistent: the username is always the last field in the cn (followed by a "," and then the first of many potential OUs). I need to parse out that username for querying, preferably into a comma-separated list for easy copy and paste for use in a SQL IN() query or in a bash script. So as an example, imagine the following short list of abbreviated DNs, only showing the CN value (since the rest of the DN is irrelevant):
"cn=Doe Jr. John john.doe,ou=...".
"cn=Doe A. Jane jane.a.doe,ou=...".
"cn=Smith Bob J bsmith,ou=...".
"cn=Powers Richard richard.powers1,ou=...".
I would like to have a csv list returned that looks like:
john.doe,jane.a.doe,bsmith,richard.powers1
Can a mix of awk and/or sed accomplish this?
sed -e 's/^"[^,]* \([^ ,]*\),.*/\1/'
will isolate the username at the end of the common name. Follow up with
| tr '\n' , | sed -e 's/,$/\n/'
to convert the one-per-line username format into comma-separated form.
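If paste is available, it does the same join in one step and leaves no trailing comma; in place of the tr and sed pair you can append:
| paste -sd, -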
Here is one quick and dirty way of doing it -
awk -v FS="[\"=,]" '{ print $3}' file | awk -v ORS="," '{print $NF}' | sed 's/,$//'
Test:
[jaypal:~/Temp] cat ff
"cn=Doe Jr. John john.doe,ou=...".
"cn=Doe A. Jane jane.a.doe,ou=...".
"cn=Smith Bob J bsmith,ou=...".
"cn=Powers Richard richard.powers1,ou=...".
[jaypal:~/Temp] awk -v FS="[\"=,]" '{ print $3}' ff | awk -v ORS="," '{print $NF}' | sed 's/,$//'
john.doe,jane.a.doe,bsmith,richard.powers1
OR
If you have gawk then
gawk '{ print gensub(/.* (.*[^,]),.*/,"\\1","g",$0)}' filename | sed ':a;{N;s/\n/,/}; ba'
Test:
[jaypal:~/Temp] gawk '{ print gensub(/.* (.*[^,]),.*/,"\\1","g",$0)}' ff | sed ':a;{N;s/\n/,/}; ba'
john.doe,jane.a.doe,bsmith,richard.powers1
Given a file "Document1.txt" containing
cn=Smith Jane batty.cow,ou=ou1_value,ou=oun_value,o=o_value,c=c_value
cn=Marley Bob reggae.boy,ou=ou1_value,ou=oun_value,o=o_value,c=c_value
cn=Clinton J Bill ex.president,ou=ou1_value,ou=oun_value,o=o_value,c=c_value
you can do a
cat Document1.txt | sed -n "s/^cn=.* \([A-Za-z0-9._]*\),ou=.*/\1/p"
which gets you
batty.cow
reggae.boy
ex.president
using tr to translate the end-of-line character
cat Document1.txt | sed -n "s/^cn=.* \([A-Za-z0-9._]*\),ou=.*/\1/p" | tr '\n' ','
produces
batty.cow,reggae.boy,ex.president,
you will need to deal with the last comma
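For example, piping through the same sed 's/,$//' trick used earlier in this thread strips it:
cat Document1.txt | sed -n "s/^cn=.* \([A-Za-z0-9._]*\),ou=.*/\1/p" | tr '\n' ',' | sed 's/,$//'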
but if you want it in a database, say Oracle for example, a script containing:
#!/bin/bash
doc=$1
cat ${doc} | sed -n "s/^cn=.* \([A-Za-z0-9._]*\),ou=.*/\1/p" | while read username
do
sqlplus -s username/password@instance <<+++
insert into mytable (user_name) values ('${username}');
exit
+++
done
N.B.
The A-Za-z0-9._ in the sed expression is every type of character you expect in the username - you may need to play with that one.
caveat - I didn't test the last bit with the database insert in it!
Perl regex solution that I consider more readable than the alternatives, in case you're interested:
perl -ne 'print "$1," if /(([[:alnum:]]|[[:punct:]])+),ou/' input.txt
Prints the string preceding 'ou', accepts alphanumeric and punctuation chars (but no spaces, so it stops at the username).
Output:
john.doe,jane.a.doe,bsmith,richard.powers1,
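If the trailing comma is unwanted, a small variation on the same regex (a sketch) collects the matches and joins them at the end:
perl -ne 'push @u, $1 if /(([[:alnum:]]|[[:punct:]])+),ou/; END { print join(",", @u), "\n" }' input.txt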
It has been over a year since an idea was posted here, but I wanted a place to refer to in the future when this class of question comes up again. Also, I did not see a similar answer posted.
From the pattern of the data provided, my interpretation is that we can strip away everything after the first comma, leaving us with a true CN rather than a DN that starts with a CN.
In the CN, we strip everything before and including the last white space.
This will leave us with the username.
awk -F',' '/^cn=/{print $1}' ldapfile | awk '{print $NF}' >> usernames
Passing your ldap file to awk, with the field separator set to comma, and the match string set to cn= at the beginning of a line, we print everything up to the first comma. Then we pipe that output into an awk with the default field separator and print only the last field, resulting in just the username. We redirect and append this to a file in the current directory named usernames, and we end up with one username per line.
To convert this into a single comma separated line of usernames, we change the last print command to printf, leaving out the \n newline character, but adding a comma.
awk -F',' '/^cn=/{print $1}' ldapfile | awk '{printf $NF","}' >> usernames
This leaves the only line in the file with a trailing comma, but since it is only intended to be used for cut and paste, simply do not cut the last character. :)
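If you would rather avoid the trailing comma entirely, the second awk can do the joining itself (a sketch that keeps the same two-stage structure):
awk -F',' '/^cn=/{print $1}' ldapfile | awk '{line = line sep $NF; sep = ","} END {print line}' >> usernames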
