Read two files simultaneously and create one from them


I am new to Bash scripting, but do understand most of the basics. My scenario is as follows:
I have a server from which I get a load of data via cURL. This is parsed properly (XML format) and from these results I then extract the data I want. The cURL statement writes its output to a file called temp-rec-schedule.txt. The below code is what I use to get the values I want to use in further calculation.
MP=`cat temp-rec-schedule.txt | grep "<ns3:mediapackage" | awk -F' ' '{print $3}' | cut -d '=' -f 2 | awk -F\" '{print $(NF-1)}'`
REC_TIME=`cat temp-rec-schedule.txt | grep "<ns3:mediapackage" | awk -F' ' '{print $2}' | cut -d '=' -f 2 | awk -F\" '{print $(NF-1)}'`
This all still works perfectly. The output of the above code (if written to two separate files) is, respectively:
MP output:
b1706f0d-2cf1-4fd6-ab60-ae4d08608f1f
fd578fcc-342c-4f6c-986a-794ccb1abd0c
ce9f40e9-8e2c-4654-ba1c-7f79d35a69fd
c31a2354-6f4b-4bfe-b51e-2bac80889897
df342d88-c660-490e-9da6-9c91a7966536
49083f88-4264-4629-80fb-fae480d0bb25
946121c7-4948-4254-9cb5-2457e1b99685
f7bd0cad-e8f5-4e3d-a219-650d07a4bb34
REC_TIME output:
2014-09-15T07:30:00Z
2014-09-19T08:58:00Z
2014-09-22T07:30:00Z
2014-10-13T07:30:00Z
2014-10-17T08:58:00Z
2014-10-20T07:30:00Z
2014-10-22T13:28:00Z
2014-10-27T07:30:00Z
What I want to do now is create a file where line 1 of file1 has line 1 of file2 appended to it, and so on for each line, i.e.:
b1706f0d-2cf1-4fd6-ab60-ae4d08608f1f 2014-09-15T07:30:00Z
fd578fcc-342c-4f6c-986a-794ccb1abd0c 2014-09-19T08:58:00Z
and so on.
I am not really familiar with Perl, but do know a little bit about Bash, so if it is possible, I would like to do this in Bash.
Further, from here, I want to compare two files that contain the same MP variable but have two different TIME values assigned: subtract one value from the other, and calculate the number of hours that have passed in between. This is all to calculate the number of hours between the start time of a recording and the publishing of the video on our system. Basically:
File1's output: b1706f0d-2cf1-4fd6-ab60-ae4d08608f1f 2014-09-15T07:30:00Z
File2's output: b1706f0d-2cf1-4fd6-ab60-ae4d08608f1f 2014-09-15T09:30:00Z
The output of my script should yield a value of 2 hours.
How can I do this with Bash?

You're probably better off just using awk for the whole thing. Something like:
awk '/<ns3:mediapackage/{gsub("\"","");
split($3,mp,"=");
split($2,rt,"="); print mp[2],rt[2]}' temp-rec-schedule.txt

The answer to the first question is to write the output to two different files and then use paste.
grep "<ns3:mediapackage" temp-rec-schedule.txt | awk -F' ' '{print $3}' | cut -d '=' -f 2 | awk -F\" '{print $(NF-1)}' > MP_out.txt
grep "<ns3:mediapackage" temp-rec-schedule.txt | awk -F' ' '{print $2}' | cut -d '=' -f 2 | awk -F\" '{print $(NF-1)}' > REC_out.txt
paste MP_out.txt REC_out.txt
That being said (and as @WilliamPursell says in his comment on the question), there is no reason to string this series of commands together, since awk can do everything they do with significantly less overhead and more flexibility.
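For the second part of the question (hours between two timestamps for the same MP id), here is a minimal sketch rather than a definitive solution. It assumes GNU date (for the -d option) and two hypothetical files, file1.txt and file2.txt, in the "id time" format shown in the question. The function name hours_between is made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample files, one "<id> <UTC timestamp>" pair per line.
workdir=$(mktemp -d)
echo 'b1706f0d-2cf1-4fd6-ab60-ae4d08608f1f 2014-09-15T07:30:00Z' > "$workdir/file1.txt"
echo 'b1706f0d-2cf1-4fd6-ab60-ae4d08608f1f 2014-09-15T09:30:00Z' > "$workdir/file2.txt"

hours_between() {
    # awk remembers the time for each id in the first file, then prints
    # "id time1 time2" for every id that also appears in the second file.
    awk 'NR==FNR {t[$1]=$2; next} $1 in t {print $1, t[$1], $2}' "$1" "$2" |
    while read -r id start end; do
        s=$(date -u -d "$start" +%s)   # assumes GNU date; seconds since the epoch
        e=$(date -u -d "$end" +%s)
        echo "$id $(( (e - s) / 3600 ))"   # whole hours, remainder discarded
    done
}

hours_between "$workdir/file1.txt" "$workdir/file2.txt"
# prints: b1706f0d-2cf1-4fd6-ab60-ae4d08608f1f 2
```

The shell only does integer arithmetic, so partial hours are truncated. If you need fractions, the subtraction could be done in awk instead.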

Related

Print all the instances of a matching pattern in a file

I've been trying to print all the instances of a matching pattern from file.
Input file:
{"id":"prod123","a":1.3,"c":"xyz","q":2},
{"id":"prod456","a":1.3,"c":"xyz","q":1}]}
{"id":"prod789","a":1.3,"currency":"xyz","q":2},
{"id":"prod101112","a":1.3,"c":"xyz","q":1}]}
I'd want to print everything between "id":" and ",.
Expected output:
prod123
prod456
prod789
prod101112
I'm using the command
grep -Eo 'id\"\:\"[^"]+"\"\,*' | grep -Eo '^[^"]+'
Am I missing anything here?

What went wrong is the placement of the comma in the first grep:
grep -Eo 'id.\:.[^"]+"\,"' inputfile
You need an extra step to get the desired substring:
grep -Eo 'id.\:.[^"]+"\,"' inputfile | cut -d: -f2 | grep -Eo '[^",]+'
I used cut here. For your example input, cut alone is enough:
cut -d'"' -f4 < inputfile
There are alternatives, such as jq, or
sed -r 's/\{"id":"([^"]*).*/\1/' inputfile
or awk (this solution mirrors the cut one, but is easy to adapt):
awk -F'"' '{print $4}' inputfile
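The cut, sed and awk variants above print one id per input line. If a line could ever hold several "id" entries, a sketch along these lines would print each of them. It relies on grep -o (available in GNU and BSD grep), and the function name extract_ids is made up for illustration.

```shell
extract_ids() {
    # grep -o prints every match of "id":"..." on its own line;
    # cut then takes the 4th field when splitting on double quotes.
    grep -o '"id":"[^"]*"' | cut -d'"' -f4
}

printf '%s\n' \
    '{"id":"prod123","a":1.3,"c":"xyz","q":2},' \
    '{"id":"prod456","a":1.3,"c":"xyz","q":1}]}' | extract_ids
# prints:
# prod123
# prod456
```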

Loop that prints twice in bash

I am writing this bash script that is supposed to print out all the users that have never logged in with an option to sort them. I have managed to get all the input working, however, I am encountering issues when it comes to printing the output. The loop goes as follows:
for user in $(lastlog | grep -i 'never' | awk '{print $1}'); do
grep $user /etc/passwd | awk -F ':' '{print $1, $3}'
done
Of course, this loop doesn't sort the output. From my limited understanding of shells and shell scripting, that should only be a matter of putting ' | sort' after the first "awk '{print $1}'". My problem is that the loop prints every user at least twice, and in some instances four times. Why is that, and how can I fix it?

Well, let's try to debug it:
for user in $(lastlog | grep -i 'never' | awk '{print $1}'); do
echo "The user '$user' matches these lines:"
grep $user /etc/passwd | awk -F ':' '{print $1, $3}'
echo
done
This outputs:
The user 'daemon' matches these lines:
daemon 1
colord 112
The user 'bin' matches these lines:
root 0
daemon 1
bin 2
sys 3
sync 4
games 5
man 6
(...)
And indeed, the entry for colord does contain daemon:
colord:x:112:120:colord colour management daemon,,,:/var/lib/colord:/bin/false
^-- Here
And the games entry does match bin:
games:x:5:60:games:/usr/games:/usr/sbin/nologin
^-- Here
So instead of matching the username string anywhere, we just want to match it from the start of the line until the first colon:
for user in $(lastlog | grep -i 'never' | awk '{print $1}'); do
echo "The user '$user' matches these lines:"
grep "^$user:" /etc/passwd | awk -F ':' '{print $1, $3}'
echo
done
And now each user shows only the single entry it was supposed to, so you can remove the echos and keep going.
If you're interested in finesse and polish, here's an alternative solution that works efficiently across language settings, weird usernames, network auth, large lists, etc:
LC_ALL=C lastlog |
awk -F ' ' '/Never logged in/ {printf "%s\0", $1}' |
xargs -0 getent passwd |
awk -F : '{print $1,$3}' |
sort

Just think what happens with a user named sh. How many users would grep sh match? Probably all of them, since each is using some shell in the shell field.
You should think about
awk -F ':' '$1 == "'"$user"'" {print $1, $3}' /etc/passwd
or with an awk variable for user:
awk -F ':' -vuser="$user" '$1 == user {print $1, $3}' /etc/passwd
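To see the difference without touching the real /etc/passwd, here is a small sketch against a made-up two-line sample file in the same format (the temporary file is purely illustrative):

```shell
# Hypothetical sample in /etc/passwd format.
sample=$(mktemp)
cat > "$sample" <<'EOF'
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
colord:x:112:120:colord colour management daemon,,,:/var/lib/colord:/bin/false
EOF

user=daemon
# A substring match hits both lines, because "daemon" also occurs in colord's comment field.
grep -c "$user" "$sample"
# Comparing the first field exactly selects only the intended account.
awk -F ':' -v user="$user" '$1 == user {print $1, $3}' "$sample"
# prints:
# 2
# daemon 1
```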

Your grep will match multiple lines (man will match manuel, norman, etc.). Anchor it to the beginning of the line and add a trailing :.
grep "^${user}:" /etc/passwd | awk -F ':' '{print $1, $3}'
A better option might be to forget about grepping /etc/passwd completely and use the id command to get the user id:
id=$(id -u "${user}" 2>/dev/null) && printf "%s %d\n" "${user}" "${id}"
If the id command fails nothing is printed, or it could be modified to be:
id=$(id -u "${user}" 2>/dev/null)
printf "%s %s\n" "${user}" "${id:-(User not found)}"
On GNU/Linux I'm pretty sure a user reported by lastlog always exists, since lastlog only reports existing users, so the second example may be pointless.

bash to extract second half of name

OK, so with the new High Sierra, I am trying to write a script to automatically delete the local snapshots that eat up HDD space. I know you can shrink them using thinlocalsnapshots / 1000000000 4, but I feel like that is only a band-aid.
So what I am trying to do is extract the date 2018-02-##-###### from:
sudo tmutil listlocalsnapshots /
com.apple.TimeMachine.2018-02-15-170531
com.apple.TimeMachine.2018-02-15-181655
com.apple.TimeMachine.2018-02-15-223352
com.apple.TimeMachine.2018-02-16-000403
com.apple.TimeMachine.2018-02-16-013400
com.apple.TimeMachine.2018-02-16-033621
com.apple.TimeMachine.2018-02-16-063811
com.apple.TimeMachine.2018-02-16-080812
com.apple.TimeMachine.2018-02-16-090939
com.apple.TimeMachine.2018-02-16-100459
com.apple.TimeMachine.2018-02-16-110325
com.apple.TimeMachine.2018-02-16-122954
com.apple.TimeMachine.2018-02-16-141223
com.apple.TimeMachine.2018-02-16-151309
com.apple.TimeMachine.2018-02-16-161040
I have tried variations of
| awk '{print $ } (insert number after $)
along with
| cut -d ' ' -f 10-.
If you know what I am missing here, I would greatly appreciate the help.
Edit: here is the script that will get rid of those pesky local snapshots, if anyone is interested. Thanks again:
#!/bin/bash
# Take the date part (4th dot-separated field) of each snapshot name.
dates=$(tmutil listlocalsnapshots / | awk -F "." '{print $4}')
for snapshot_date in $dates
do
tmutil deletelocalsnapshots "$snapshot_date"
done

You were close:
somecommand | cut -d"." -f4-
# or
somecommand | awk -F"." '{print $4}'
You can also try sed, but cut is made for this.

1- awk: you can either specify the field separator with the -F option, or print a substring
awk -F. '{print $4}'
awk '{print substr($0,23)}'
2- cut: equivalently.
cut -d. -f4
cut -c23-
3- Pure bash (sloooooow!): same as above.
while IFS=. read s1 s2 s3 d; do echo "$d"; done
while read line; do echo "${line:23}"; done
In practice, with a small number of records as in your use case, speed is not an issue and even pure bash or regexps (as in other answers) can be used. As the number of records grows, the higher speed of awk and cut becomes noticeable.
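Another pure-shell variant, not listed above, is parameter expansion: ${var##*.} strips everything up to and including the last dot. A minimal sketch, where the function name snapshot_date is made up for illustration:

```shell
snapshot_date() {
    # ${1##*.} removes the longest prefix that ends in a dot,
    # leaving only the text after the last dot.
    printf '%s\n' "${1##*.}"
}

snapshot_date com.apple.TimeMachine.2018-02-15-170531
# prints: 2018-02-15-170531
```

This only works because the date part itself contains no dots. Unlike the fixed offset 23 used above, it does not depend on the length of the prefix.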

Using grep and a regex :
$ grep -oP '\d{4}-\d{2}-\d{2}-\d{6}$'
2018-02-15-170531
2018-02-15-181655
2018-02-15-223352
2018-02-16-000403
2018-02-16-013400
2018-02-16-033621
2018-02-16-063811
2018-02-16-080812
2018-02-16-090939
2018-02-16-100459
2018-02-16-110325
2018-02-16-122954
2018-02-16-141223
2018-02-16-151309
2018-02-16-161040

grep serial numbers not starting with specific prefix

I have this file (serials.txt) containing serial numbers:
S/N:175-1915011190
S/N:244-1920023447
S/N:335-1920101144
S/N:244-1920101149
Using grep or similar tool I want to select all serials NOT starting with '244'
I'm able to select all the '244' with grep -Eo '244-[0-9]*' serials.txt but I want the opposite.
Something like grep -Eo '(^244)-[0-9]*' serials.txt
The output should be (without S/N:)
175-1915011190
335-1920101144

The following awk command may help:
awk '!/S\/N:244/' Input_file
EDIT: The code above prints the complete line. If you only need the part from the serial number to the end of the line, the following may help:
awk -F':' '!/S\/N:244/{print $2}' Input_file
EDIT2: Adding a sed solution as well:
sed -n '/:244/d;s/.*://;p' Input_file

The -v option on grep would be helpful here, and then cut to remove the leading cruft:
grep -v ':244-' serials.txt | cut -c5-

Here you go, without S/N:
grep -v ':244' serials.txt | cut -d':' -f2
grep -v inverts the match for :244, then cut splits on the delimiter : and prints field 2.

awk -F':' '$2!~/^244/{print $2}' file
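One detail worth checking in all of these: a pattern like :244 or ^244 also excludes serials whose prefix merely starts with 244. Including the hyphen in the pattern avoids that. A sketch against a made-up sample, where the 2440 line is a hypothetical edge case that is not in the original file:

```shell
serials=$(mktemp)
# The third line is an invented edge case: its prefix is 2440, not 244.
printf '%s\n' 'S/N:175-1915011190' 'S/N:244-1920023447' 'S/N:2440-1920101144' > "$serials"

# Anchor the match at the start of field 2 and include the hyphen,
# so only the exact prefix 244 is excluded.
awk -F':' '$2 !~ /^244-/ {print $2}' "$serials"
# prints:
# 175-1915011190
# 2440-1920101144
```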

Parse file by splitting string in file and get desired output using single command

I'm using bash to look into a file and parse the results. Can someone tell me how to use cut/awk to split the string and get the desired output with a single command? I can get the output below with individual cut calls (two commands plus concatenation), but I want to do it with a single command instead.
test.log:
1/98 | (PASSED) com.yahoo.qa.java.projects.stackoverview.questions.Password_01() | 21:20:20
Tried code:
str1=`cat test.log | tail -1 | cut -d '|' -f 1`
str2=`cat test.log | tail -1 | cut -d '|' -f 2 | sed -e 's/com.yahoo.qa.java.projects./''/g'`
str3="${str1} | ${str2}"
Expected:
1/98 | (PASSED) stackoverview.questions.Password_01

Since this is a simple substitution on an individual line it's better suited to sed than awk and not at all appropriate for cut:
$ sed 's/\(.*| [^ ]* \)com\.yahoo\.qa\.java\.projects\.\([^(]*\).*/\1\2/' file
1/98 | (PASSED) stackoverview.questions.Password_01

The following single awk command may help:
awk 'END{sub(/com\.yahoo\.qa\.java\.projects\./,"",$4);print $1,$2,$3,$4}' Input_file
Or, for all kinds of awk, the following may help too (as per SIR ED's suggestions):
awk '{value=$0} END{split(value, a," ");sub(/com\.yahoo\.qa\.java\.projects\./,"",a[4]);print a[1],a[2],a[3],a[4]}' Input_file

Using awk
$ awk -F "com[.]yahoo[.]qa[.]java[.]projects[.]" 'sub(/\(\).*/,"",$2)' file
1/98 | (PASSED) stackoverview.questions.Password_01
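The question takes only the last line of the log (tail -1). If that matters, sed can restrict itself to the last line with the $ address. A minimal sketch, assuming a hypothetical two-line log in which the first line is invented for illustration. The function name last_result is also made up.

```shell
log=$(mktemp)
# Hypothetical two-line log; only the last line should be reported.
cat > "$log" <<'EOF'
1/97 | (FAILED) com.yahoo.qa.java.projects.stackoverview.questions.Login_01() | 21:20:10
1/98 | (PASSED) com.yahoo.qa.java.projects.stackoverview.questions.Password_01() | 21:20:20
EOF

last_result() {
    # $ addresses the last line only: strip the package prefix,
    # then drop everything from the literal "()" onwards, and print.
    sed -n '${s/com\.yahoo\.qa\.java\.projects\.//;s/().*//;p;}' "$1"
}

last_result "$log"
# prints: 1/98 | (PASSED) stackoverview.questions.Password_01
```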
