AWK print after match with multi search - shell

I have a log like the one below and need to parse it into a new format:
2018-08-14 12:07:06,410 - MAILER - INFO - Email sent! - (TEMPORARY PASSWORD: cristronaldode ) to ['cristronaldode#eeee.com'] - Message ID: 01010165369da693-216f985f-e1b0-4dc2-bcea-8a2cd275a506-000000 Result: {'MessageId': '01010165369da693-216f985f-e1b0-4dc2-bcea-8a2cd275a506-000000', 'ResponseMetadata': {'HTTPHeaders': {'content-length': '338', 'date': 'Tue, 14 Aug 2018 04:07:05 GMT', 'x-amzn-requestid': '81bbc0c4-9f77-11e8-81fe-8502a68e3b7d', 'content-type': 'text/xml'}, 'RetryAttempts': 0, 'RequestId': '81bbc0c4-9f77-11e8-81fe-8502a68e3b7d', 'HTTPStatusCode': 200}}
Desired output:
2018-08-14 12:07:06,410|TEMPORARY PASSWORD: cristronaldode|cristronaldode#eeee.com|'HTTPStatusCode': 200|
I'm trying to use awk and its match() function, but I don't know how to use multiple matches in one line. Thanks.
Update: I was using the command below to pick out fields, but because I'm separating fields by spaces, the field numbers shift from line to line, so I'm looking for other solutions.
awk -F ' ' '{print $1,$2"|"$11,$12,$13,$14,$15,$16,$17,$18,$19"|"$21"|"$48,$49}' | sed -e 's/[()]//g' | sed -e 's/[][]//g'| sed -e 's/}//g'

The field separator of awk can be a regex:
awk -F '[][)(}{ ]+' '{print $1,$2}' file
We now delimit on all the bracket-like characters (and spaces) and treat a run of them as a single separator.
From there you can work out which field numbers you need.
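Alternatively, since the question asked about match(), here is a minimal sketch using several match() calls in one awk program. It assumes every log line carries the same markers (the parenthesised password, the bracketed address, and the HTTPStatusCode field):
awk '{
  ts = $1 " " $2                                         # date and time are the first two fields
  match($0, /TEMPORARY PASSWORD: [^)]+/)                 # text inside the parentheses
  pw = substr($0, RSTART, RLENGTH); sub(/ +$/, "", pw)   # trim the space before ")"
  match($0, /\[[^]]+\]/)                                 # the bracketed address list
  addr = substr($0, RSTART + 2, RLENGTH - 4)             # strip the leading [' and trailing ']
  match($0, "\047HTTPStatusCode\047: [0-9]+")            # \047 is a single quote
  code = substr($0, RSTART, RLENGTH)
  print ts "|" pw "|" addr "|" code "|"
}' file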

Related

Grep and awk use

I've been trying for a day but can't get it fixed. I don't know this method.
content query --uri content://com.android.contacts/contacts | grep "+9053158888" | awk -F'[,,= ]' '{cmd="content delete --uri content://com.android.contacts/contacts/"$(NF-3);system(cmd)}'
but it doesn't find anything.
My string:
Row: 9991 last_time_contacted=0, phonetic_name=NULL, custom_ringtone=NULL, contact_status_ts=NULL, pinned=0, photo_id=NULL, photo_file_id=NULL, contact_status_res_package=NULL, contact_chat_capability=NULL, contact_status_icon=NULL, display_name_alt=+90532555688, sort_key_alt=+90532555688, in_visible_group=1, starred=0, contact_status_label=NULL, phonebook_label=#, is_user_profile=0, has_phone_number=1, display_name_source=40, phonetic_name_style=0, send_to_voicemail=0, lookup=0r10070-24121C1814241820221C1A14.3789r10071-24121C1814241820221C1A14.0r10072-24121C1814241820221C1A14.0r10073-24121C1814241820221C1A14.0r10074-24121C1814241820221C1A14.0r10075-24121C1814241820221C1A14.0r10078-24121C1814241820221C1A14.0r10082-24121C1814241820221C1A14.0r10083-24121C1814241820221C1A14.0r10084-24121C1814241820221C1A14.0r10085-24121C1814241820221C1A14.0r10086-24121C1814241820221C1A14.0r10087-24121C1814241820221C1A14.0r10092-24121C1814241820221C1A14.0r10094-24121C1814241820221C1A14.0r10097-24121C1814241820221C1A14, phonebook_label_alt=#, contact_last_updated_timestamp=1612984348874, photo_uri=NULL, phonebook_bucket=213, contact_status=NULL, display_name=+90532555688, sort_key=+90532555688, photo_thumb_uri=NULL, contact_presence=NULL, in_default_directory=1, times_contacted=0, _id=10097, name_raw_contact_id=10070, phonebook_bucket_alt=213
I need the string "_id=10097".
You may use this grep to find the word _id followed by a = and one or more digits:
... | grep -Eo '\b_id=[0-9]+'
_id=10097
To get all occurrences of _id, try the following, written and tested with the shown samples in GNU grep, where str is a shell variable holding your sample input:
echo "$str" | grep -oP ', \K_id=\d+'
Or try with awk:
echo "$str" |
awk 'match($0,/, _id=[0-9]+/){print substr($0,RSTART+2,RLENGTH-2)}'
The above will output:
_id=10097
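To go one step further toward the original goal (deleting the contact), the extracted id can be fed back into the content delete command from the question. A sketch, assuming exactly one row matches the number:
# extract just the numeric id, then delete that contact row
id=$(content query --uri content://com.android.contacts/contacts |
     grep "+9053158888" | grep -Eo '\b_id=[0-9]+' | cut -d= -f2)
content delete --uri "content://com.android.contacts/contacts/$id"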

using shell commands how to extract field from command (zdb)

I need to expand/resize a ZFS disk, and for that I need to extract the children[0] guid from the output of zdb. The current output looks like this:
root#:/ # zdb
zroot:
    version: 5000
    name: 'zroot'
    state: 0
    txg: 448
    pool_guid: 14102710366601156377
    hostid: 1798585735
    hostname: ''
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 14102710366601156377
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 12530249324826415927
            path: '/dev/gpt/disk0'
            whole_disk: 1
            metaslab_array: 38
            metaslab_shift: 24
            ashift: 12
            asize: 1066926080
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_leaf: 36
            com.delphix:vdev_zap_top: 37
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
To automate this process and put all the steps in a shell script, I came up with this:
zdb | grep -A4 "children\[0" | grep guid | awk -F ": " '{print $2}'
Which returns:
12530249324826415927
Putting it all together, the script looks like this:
#!/bin/sh
DISK=`gpart list | head -n 1 | awk -F ": " '{print $2}'`
GUID=`zdb | grep -A4 "children\[0" | grep guid | awk -F ": " '{print $2}'`
gpart recover ${DISK}
gpart resize -i 3 ${DISK}
zpool online -e zroot ${GUID}
zfs set readonly=off zroot/ROOT/default
This works, but I would like to know if there is a better way of extracting the fields without so much piping. I am doing this on a raw FreeBSD setup where the root zpool is read-only, so I can't install Python, Bash, etc.; I am limited to the base tools in /usr/bin like cut, awk, and sed.
It would be nice if I could get the values directly from commands like zdb, but since I haven't found a straightforward way of doing that, I have to do some shell kung-fu.
Any tips or suggestions for improving this?
You can do it with a single standalone awk, as below:
zdb | awk '/children\[0\]/{flag=1; next} flag && /guid:/{split($0,arr,":"); print arr[2]; flag=0}'
If the leading whitespace is a concern, remove it with the sub() function, sub(/^[[:space:]]/,"",arr[2]), as in:
zdb | awk '/children\[0\]/{flag=1; next} flag && /guid:/{split($0,arr,":"); sub(/^[[:space:]]/,"",arr[2]); print arr[2]; flag=0}'
The idea is to identify the pattern children[0] and set a flag; a guid: line is then matched only while the flag is set, and the flag is cleared on that first match. This way the many other guid: lines in the output are skipped.
And do not use backticks for command substitution; prefer the $(..) form, which is more readable and nests cleanly.
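Putting both suggestions together, the question's script could look like this (a sketch; the gpart and zpool invocations are unchanged from the question):
#!/bin/sh
# grab the disk name and the children[0] guid without the grep pipeline
DISK=$(gpart list | head -n 1 | awk -F ": " '{print $2}')
GUID=$(zdb | awk '/children\[0\]/{flag=1; next} flag && /guid:/{split($0,arr,":"); sub(/^[[:space:]]/,"",arr[2]); print arr[2]; flag=0}')
gpart recover "${DISK}"
gpart resize -i 3 "${DISK}"
zpool online -e zroot "${GUID}"
zfs set readonly=off zroot/ROOT/default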

bash get value from long string

I don't understand sed, awk, or grep. I've tried for the last 3 hours to get a result, but I can't get the right answer.
What I have:
I get this information from icloud.com:
{u'timeStamp': 1470252602355, u'locationFinished': True, u'longitude': XX.XXXXXXXXXXXXX, u'positionType': u'GPS-Course', u'locationType': None, u'latitude': XX.XXXXXXXXXXXXX, u'isOld': False, u'isInaccurate': False, u'horizontalAccuracy': 200.0}
This is the location of my iPhone, but I only need the latitude and the longitude. I have tried with awk, sed, and grep, but I can't get the right result.
In sed (where a.json is your file), this will print just the longitude value and the latitude value:
sed "s/.*u'longitude': \([^,]\+\).*u'latitude': \([^,]\+\).*/\1 \2/g" a.json
This grep pipe will snag those fields:
grep -o "'l[^:]\+itude':[^,]\+"
This looks like JSON. You can parse it using jsawk; I haven't used it, but it looks like it will solve your problem easily.
echo "your json string" | jsawk "this.latitude"
Note: this will give the answer (thanks sjsam). Handling the regex can become more complex if the JSON string varies, so it is better to handle this string as JSON, not as an ordinary string.
here you go...
$ tr ',' '\n' <file | grep '\(lati\|longi\)tude'
u'longitude': XX.XXXXXXXXXXXXX
u'latitude': XX.XXXXXXXXXXXXX
or, without tr
$ grep -o '\(lati\|longi\)tude[^,]*' file
longitude': XX.XXXXXXXXXXXXX
latitude': XX.XXXXXXXXXXXXX
Like jsawk, you can use jq to process your JSON:
json="{u'timeStamp': 1470252602355, u'locationFinished': True, u'longitude': XX.XXXXXXXXXXXXX, u'positionType': u'GPS-Course', u'locationType': None, u'latitude': XX.XXXXXXXXXXXXX, u'isOld': False, u'isInaccurate': False, u'horizontalAccuracy': 200.0}"
echo "$json" | jq '.latitude'
$ awk -v RS=',' -F'[\n:][[:space:]]*' '{for (i=1;i<NF;i++) if ($i ~ /(lat|long)itude/) print $(i+1) }' file
XX.XXXXXXXXXXXXX
XX.XXXXXXXXXXXXX
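To capture just the numeric values into shell variables, a small variation using GNU grep's -P and \K (a sketch; it assumes GNU grep is available and that the string is in a.json):
lat=$(grep -oP "u'latitude': \K[^,]+" a.json)
lon=$(grep -oP "u'longitude': \K[^,]+" a.json)
echo "latitude=$lat longitude=$lon"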

Bash : Huge file size processing issue in vim mode

I have a huge file, about 500 MB in size, and each line has data like the following.
#vim results.txt
{"count": 8, "time_first": 1450801456, "record": "A", "domain": "api.ai.", "ip": "54.240.166.223", "time_last": 1458561052}
{"count": 9, "time_first": 1450801456, "record": "A", "domain": "cnn.com.", "ip": "54.240.166.223", "time_last": 1458561052}
.........
There are 25 million lines in total.
Now I would like to rewrite the results.txt file as:
8,1450801456,A,api.ai,54.240.166.223,1458561052
9,1450801456,A,cnn.com,54.240.166.223,1458561052
....
That is, removing the unwanted key strings: count, time_first, record, domain, ip, time_last.
Right now, in vim, I'm removing each string one by one; for example, I would do %s/{"count": //g.
Each substitution takes a long time.
I'm a beginner in Bash/shell. How can I do this using sed or awk? Any suggestions, please?
With sed:
sed -E 's/[{ ]*"[^"]*": *|["}]//g' file
# -E           use ERE (Extended Regular Expression) syntax
# [{ ]*        leading opening curly bracket and spaces
# "[^"]*": *   a key enclosed in double quotes, plus the colon and spaces
# |            OR
# ["}]         remaining double quotes and the closing bracket
Another way, using xidel, which includes a JSON parser:
xidel -q file -e '$json/*' | sed 'N;N;N;N;N;y/\n/,/'
# -q          quiet mode
# $json       for each JSON string
# /*          all values
# N;N;N;N;N   append the next five lines
# y/\n/,/     translate newlines to commas
A shorter way from @BeniBela that doesn't need sed to join the fields together:
xidel -q file -e '$json/join(*,",")'
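With the sample input, this should print the joined values directly, one line per record, e.g.:
8,1450801456,A,api.ai.,54.240.166.223,1458561052
9,1450801456,A,cnn.com.,54.240.166.223,1458561052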
Something to consider:
$ awk -F'[{}":, ]+' -v OFS=, '{for (i=3;i<NF;i+=2) printf "%s%s", $i, (i<(NF-1)?OFS:ORS)}' file
8,1450801456,A,api.ai.,54.240.166.223,1458561052
9,1450801456,A,cnn.com.,54.240.166.223,1458561052
Get the book Effective Awk Programming, 4th Edition, by Arnold Robbins.

how to clear the specific lines from file based on some string in file in unix

I have a file as below:
Password:
Msg 2401, Level 11, State 2:
Server 'test':
Character set conversion is not available between client character set 'utf8' and server character set 'iso_1'.
No conversions will be done.
|Extraction_Date|Agent_Cde_1|Agent_Cde_2|Agent_Cde_3|Agent_Cde_4|Agent_Name
|20140902 |0010 | NULL| NULL| NULL|NULL
I want to delete all the lines that appear before the column names. The number of lines before the column names can vary each time. Is there a way to check for the 'Extraction_Date' string and delete all the lines above it using Unix commands?
This will print all lines starting from the Extraction_Date line:
awk '/^\|Extraction_Date/ {f=1} f' file
|Extraction_Date|Agent_Cde_1|Agent_Cde_2|Agent_Cde_3|Agent_Cde_4|Agent_Name
|20140902 |0010 | NULL| NULL| NULL|NULL
Or this alone may be enough:
awk '/^\|/' file
Try the grep command
grep -F '|'
Using a sed address range:
sed -n '/^|Extraction_Date/,$p' file
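To modify the file itself rather than print to stdout, redirect to a temporary file and move it back (a sketch, assuming a POSIX shell):
sed -n '/^|Extraction_Date/,$p' file > file.tmp && mv file.tmp file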
