bash get value from long string - bash

I don't understand sed, awk or grep. I have spent the last 3 hours trying to get a result, but I can't get the right answer.
What I have:
I get this information from icloud.com:
{u'timeStamp': 1470252602355, u'locationFinished': True, u'longitude': XX.XXXXXXXXXXXXX, u'positionType': u'GPS-Course', u'locationType': None, u'latitude': XX.XXXXXXXXXXXXX, u'isOld': False, u'isInaccurate': False, u'horizontalAccuracy': 200.0}
This is the location of my iPhone.
I only need the latitude and the longitude. I have tried awk, sed and grep, but I don't get the right result.

With sed (where a.json is your file), this prints just the longitude value and the latitude value:
sed "s/.*u'longitude': \([^,]\+\).*u'latitude': \([^,]\+\).*/\1 \2/g" a.json
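As a quick check, here is the same substitution run against a sample line. This is only a sketch: the coordinates are made-up stand-ins for the XX.XXXXXXXXXXXXX placeholders, and it uses sed -E (extended regex) because the \+ form above is a GNU extension.

```shell
# Sample line in the same shape as the question; the coordinates are made-up values.
line="{u'timeStamp': 1470252602355, u'locationFinished': True, u'longitude': 13.4049999, u'positionType': u'GPS-Course', u'locationType': None, u'latitude': 52.5200001, u'isOld': False}"

# -E enables extended regexes, so + and () need no backslashes.
coords=$(printf '%s\n' "$line" |
  sed -E "s/.*u'longitude': ([^,]+).*u'latitude': ([^,]+).*/\1 \2/")

echo "$coords"   # longitude first, then latitude
```

Longitude comes out first only because it appears first in the input; the pattern depends on that order.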

This grep pipe will snag those fields:
grep -o "'l[^:]\+itude':[^,]\+"

This looks like JSON. You can parse it using jsawk; I haven't used it, but it looks like it will solve your problem easily.
echo "your json string" | jsawk "this.latitude"
Note: this will give the answer (thanks sjsam). Regex handling can become more complex if the JSON string varies, so it is better to handle this string as JSON rather than as an ordinary string.

here you go...
$ tr ',' '\n' <file | grep '\(lati\|longi\)tude'
u'longitude': XX.XXXXXXXXXXXXX
u'latitude': XX.XXXXXXXXXXXXX
or, without tr
$ grep -o '\(lati\|longi\)tude[^,]*' file
longitude': XX.XXXXXXXXXXXXX
latitude': XX.XXXXXXXXXXXXX

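Like jsawk, you can use jq to process your JSON. Note that jq only accepts valid JSON: the string as shown (u'...' keys, True, False, None) is a Python dict repr and would have to be converted to valid JSON first.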
json="{u'timeStamp': 1470252602355, u'locationFinished': True, u'longitude': XX.XXXXXXXXXXXXX, u'positionType': u'GPS-Course', u'locationType': None, u'latitude': XX.XXXXXXXXXXXXX, u'isOld': False, u'isInaccurate': False, u'horizontalAccuracy': 200.0}"
echo "$json" | jq '.latitude'
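One way to get from the Python-style string to something jq accepts is a naive sed conversion. This is a rough sketch, not a general converter: it assumes no string value contains quotes or ends in the letter u, and the coordinates are made-up values. The jq call is skipped if jq is not installed.

```shell
# Python-style dict repr as in the question (coordinates are made-up values).
py_repr="{u'timeStamp': 1470252602355, u'locationFinished': True, u'longitude': 13.4049999, u'positionType': u'GPS-Course', u'locationType': None, u'latitude': 52.5200001, u'isOld': False}"

# Naive conversion: drop the u prefix, switch to double quotes,
# and map the Python literals True/False/None to JSON true/false/null.
json=$(printf '%s\n' "$py_repr" |
  sed -e "s/u'/\"/g" -e "s/'/\"/g" \
      -e 's/: True/: true/g' -e 's/: False/: false/g' -e 's/: None/: null/g')

echo "$json"

# Only call jq when it is available.
if command -v jq >/dev/null 2>&1; then
  printf '%s\n' "$json" | jq '.latitude, .longitude'
fi
```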

$ awk -v RS=',' -F'[\n:][[:space:]]*' '{for (i=1;i<NF;i++) if ($i ~ /(lat|long)itude/) print $(i+1) }' file
XX.XXXXXXXXXXXXX
XX.XXXXXXXXXXXXX
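If you would rather not spawn sed, awk or grep at all, bash's own regex matching can pull out one value at a time. A minimal sketch, assuming bash (not plain sh); get_number is a hypothetical helper name and the coordinates are made-up values.

```shell
# Sample line in the same shape as the question; the coordinates are made-up values.
line="{u'timeStamp': 1470252602355, u'locationFinished': True, u'longitude': 13.4049999, u'positionType': u'GPS-Course', u'locationType': None, u'latitude': 52.5200001, u'isOld': False}"

# get_number KEY STRING (hypothetical helper): print the number stored under 'KEY'.
get_number() {
  local re="'$1': (-?[0-9.]+)"
  [[ $2 =~ $re ]] || return 1
  printf '%s\n' "${BASH_REMATCH[1]}"
}

lat=$(get_number latitude "$line")
lon=$(get_number longitude "$line")
echo "$lat $lon"
```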

Related

Grep and awk use

I have tried for a whole day but could not fix this. I don't know this method.
content query --uri content://com.android.contacts/contacts | grep "+9053158888" | awk -F'[,,= ]' '{cmd="content delete --uri content://com.android.contacts/contacts/"$(NF-3);system(cmd)}'
but it does not find the id.
My string:
Row: 9991 last_time_contacted=0, phonetic_name=NULL, custom_ringtone=NULL, contact_status_ts=NULL, pinned=0, photo_id=NULL, photo_file_id=NULL, contact_status_res_package=NULL, contact_chat_capability=NULL, contact_status_icon=NULL, display_name_alt=+90532555688, sort_key_alt=+90532555688, in_visible_group=1, starred=0, contact_status_label=NULL, phonebook_label=#, is_user_profile=0, has_phone_number=1, display_name_source=40, phonetic_name_style=0, send_to_voicemail=0, lookup=0r10070-24121C1814241820221C1A14.3789r10071-24121C1814241820221C1A14.0r10072-24121C1814241820221C1A14.0r10073-24121C1814241820221C1A14.0r10074-24121C1814241820221C1A14.0r10075-24121C1814241820221C1A14.0r10078-24121C1814241820221C1A14.0r10082-24121C1814241820221C1A14.0r10083-24121C1814241820221C1A14.0r10084-24121C1814241820221C1A14.0r10085-24121C1814241820221C1A14.0r10086-24121C1814241820221C1A14.0r10087-24121C1814241820221C1A14.0r10092-24121C1814241820221C1A14.0r10094-24121C1814241820221C1A14.0r10097-24121C1814241820221C1A14, phonebook_label_alt=#, contact_last_updated_timestamp=1612984348874, photo_uri=NULL, phonebook_bucket=213, contact_status=NULL, display_name=+90532555688, sort_key=+90532555688, photo_thumb_uri=NULL, contact_presence=NULL, in_default_directory=1, times_contacted=0, _id=10097, name_raw_contact_id=10070, phonebook_bucket_alt=213
I need the string "_id=10097".
You may use this grep to find the word _id followed by a = and one or more digits:
... | grep -Eo '\b_id=[0-9]+'
_id=10097
To get all occurrences of it, try the following, written and tested with the shown samples in GNU grep, where str is a shell variable holding your sample input.
echo "$str" | grep -oP ', \K_id=\d+'
Or try with awk:
echo "$str" |
awk 'match($0,/, _id=[0-9]+/){print substr($0,RSTART+2,RLENGTH-2)}'
The above will output:
_id=10097
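To tie this back to the original pipeline, here is a sketch that extracts only the number and prints the delete command instead of running it. The row is a shortened copy of the one in the question, and the content delete line is copied from the question and only echoed. With -F'[,,= ]' every comma, equals sign and space is its own separator, so on this row $(NF-3) appears to land on the value of name_raw_contact_id rather than _id.

```shell
# Shortened copy of the row from the question.
row='Row: 9991 last_time_contacted=0, display_name=+90532555688, times_contacted=0, _id=10097, name_raw_contact_id=10070, phonebook_bucket_alt=213'

# What the original awk field index picks up on this row.
wrong=$(printf '%s\n' "$row" | awk -F'[,,= ]' '{print $(NF-3)}')

# Anchor on ", _id=" so that name_raw_contact_id= cannot match.
id=$(printf '%s\n' "$row" | sed -n 's/.*, _id=\([0-9][0-9]*\).*/\1/p')

echo "NF-3 gives: $wrong"
echo "content delete --uri content://com.android.contacts/contacts/$id"   # dry run only
```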

remove first column from hexdump output

I have a hexdump output that looks like this
0101f10 64534 64943 00568 00262 01077 00721 00297 00140
0101f20 00748 00288 02211 01124 02533 01271 02451 00997
0101f30 03056 01248 02894 01026 02397 00696 00646 65114
0101f40 00943 64707 01113 64179 01135 64179 00805 64109
0101f50 00514 64045 64654 63037 63026 62014 62173 61625
I want to remove the first column, but I don't know what delimiter has been used by the hexdump command. I tried with awk and cut, but can't figure it out. Any help is appreciated.
Output I want is
64534 64943 00568 00262 01077 00721 00297 00140
00748 00288 02211 01124 02533 01271 02451 00997
03056 01248 02894 01026 02397 00696 00646 65114
00943 64707 01113 64179 01135 64179 00805 64109
00514 64045 64654 63037 63026 62014 62173 61625
With sed
sed 's/[^[:blank:]]*[[:blank:]]*//' infile
With gnu sed
sed 's/\S*\s*//' infile
input... | sed -E $'s/ +/\t/g' | cut -f2-
(Assuming Bash for $'\t', but GNU sed supports \t directly anyway.)
hexdump /path/to/file | awk '{sub(/[^ ]+ /, ""); print $0}'
This will do the job.
If the delimiter is really a run of spaces, use tr to squeeze repeats (-s) of spaces into a single tab and use cut to get rid of the first column (the remaining columns are then tab-separated):
$ cat file | tr -s ' ' '\t' | cut -f 2-
64534 64943 00568 00262 01077 00721 00297 00140
00748 00288 02211 01124 02533 01271 02451 00997
03056 01248 02894 01026 02397 00696 00646 65114
00943 64707 01113 64179 01135 64179 00805 64109
00514 64045 64654 63037 63026 62014 62173 61625
All the solutions above work fine; just adding an awk solution.
So, if you only need to omit the first column but keep the rest, you can try this:
awk '{$1=""; print $0}' /path/to/hexfile
It works, except that it leaves a space at the beginning of each line. If that bothers you, there is a workaround using the substr() function in awk itself.
awk '{$1=""; print substr($0,2)}' /path/to/hexfile
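If you don't know whether the delimiter is spaces or tabs, the shell's read builtin can do the splitting, since it treats any run of blanks as one separator. A small sketch; strip_first_column is a hypothetical helper name, and the sample lines are taken from the question.

```shell
# strip_first_column (hypothetical helper): drop the first whitespace-delimited
# column of every line on stdin and print the rest unchanged.
strip_first_column() {
  local first rest
  # "|| [ -n "$first" ]" also handles a last line without a trailing newline.
  while read -r first rest || [ -n "$first" ]; do
    printf '%s\n' "$rest"
  done
}

printf '%s\n' \
  '0101f10 64534 64943 00568 00262 01077 00721 00297 00140' \
  '0101f20 00748 00288 02211 01124 02533 01271 02451 00997' |
  strip_first_column
```

For large files a single sed or awk call will be faster than a shell loop.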

How do I extract the content of quoted strings from the output of a shell command

The following shell command returns an output with 3 items:
cred="$(aws sts assume-role --role-arn arn:aws:iam::01234567899:role/test --role-session-name s3-access-example --query '[Credentials.AccessKeyId, Credentials.SecretAccessKey, Credentials.SessionToken]')"
echo $cred returns the following output:
[ "ASRDTDRSIJGISGDT", "trttr435", "DF/////eraesr43" ]
How do I retrieve the value between double quotes? For example, trttr435
How to achieve this? Use regex? or other options?
IFS=', ' credArray=(`echo "$cred" | tr -d '"[]'`)
Simple as ... that
Testing
cred='[ "ASRDTDRSIJGISGDT", "trttr435", "DF/////eraesr43" ]'
IFS=', ' credArray=(`echo "$cred" | tr -d '"[]'`)
for i in "${credArray[@]}"; do echo "[$i]"; done
echo "2nd parameter is ${credArray[1]}"
Output
[ASRDTDRSIJGISGDT]
[trttr435]
[DF/////eraesr43]
2nd parameter is trttr435
Tested on Mac OS bash and CentOS bash
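A variation on the same idea, as a sketch assuming bash: putting IFS in front of read limits the changed separator to that one command, so IFS is not left modified for the rest of the script. cred_array is just an illustrative name.

```shell
cred='[ "ASRDTDRSIJGISGDT", "trttr435", "DF/////eraesr43" ]'

# Strip the quotes and brackets, then split on commas and spaces.
# IFS is only changed for the read command itself.
IFS=', ' read -r -a cred_array <<< "$(tr -d '"[]' <<< "$cred")"

for item in "${cred_array[@]}"; do
  echo "[$item]"
done
echo "2nd parameter is ${cred_array[1]}"
```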
I didn't quite catch if the [ and ] are in the $cred or not, or what is your expected output but this will return everything between double quotes:
$ awk '{while(match($0,/"[^"]+"/)){print substr($0,RSTART+1,RLENGTH-2);$0=substr($0,RSTART+RLENGTH)}}' file
ASRDTDRSIJGISGDT
trttr435
DF/////eraesr43
You could and probably would like to:
$ echo "$cred" | awk ... # add above script here
Edit: If you just want to get the quoted string from second field ($2):
$ awk -F, '{match($2,/"[^"]+"/);print substr($2,RSTART+1,RLENGTH-2)}' file
trttr435
or even:
$ awk -F, '{gsub(/^[^"]+"|"[^"]*$/,"",$2);print $2}' file
Or use python, because the content of cred is already a valid Python list literal:
#!/bin/bash
cred='[ "ASRDTDRSIJGISGDT", "trttr435", "DF/////eraesr43" ]'
python-script() {
local INDEX=$1
echo "arr=$cred"
echo "print(arr[$INDEX])"
}
item() {
local INDEX=$1
python-script "$INDEX" | python
}
echo "item1=$(item 1)"
echo "item2=$(item 2)"
Another crude but effective way of extracting the values you need would be to use awk with " as the split delimiter. The valid positions, in this case, would be $2, $4, $6
OUT="[ \"ASRDTDRSIJGISGDT\", \"trttr435\", \"DF/////eraesr43\" ]"
echo $OUT | awk -F '"' '{print $4}'
I would advise you to use python if you need to do a lot of string parsing.

Bash command to extract characters in a string

I want to write a small script to generate the location of a file in an NGINX cache directory.
The format of the path is:
/path/to/nginx/cache/d8/40/32/13febd65d65112badd0aa90a15d84032
Note the last 6 characters: d8 40 32, are represented in the path.
As an input I give the md5 hash (13febd65d65112badd0aa90a15d84032) and I want to generate the output: d8/40/32/13febd65d65112badd0aa90a15d84032
I'm sure sed or awk will be handy, but I don't know yet how...
This awk can make it:
awk 'BEGIN{FS=""; OFS="/"}{print $(NF-5)$(NF-4), $(NF-3)$(NF-2), $(NF-1)$NF, $0}'
Explanation
BEGIN{FS=""; OFS="/"}. FS="" sets the input field separator to be "", so that every char will be a different field. OFS="/" sets the output field separator as /, for print matters.
print ... $(NF-1)$NF, $0 prints the penultimate field and the last one all together; then, the whole string. The comma is "filled" with the OFS, which is /.
Test
$ awk 'BEGIN{FS=""; OFS="/"}{print $(NF-5)$(NF-4), $(NF-3)$(NF-2), $(NF-1)$NF, $0}' <<< "13febd65d65112badd0aa90a15d84032"
d8/40/32/13febd65d65112badd0aa90a15d84032
Or with a file:
$ cat a
13febd65d65112badd0aa90a15d84032
13febd65d65112badd0aa90a15f1f2f3
$ awk 'BEGIN{FS=""; OFS="/"}{print $(NF-5)$(NF-4), $(NF-3)$(NF-2), $(NF-1)$NF, $0}' a
d8/40/32/13febd65d65112badd0aa90a15d84032
f1/f2/f3/13febd65d65112badd0aa90a15f1f2f3
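The awk above relies on FS="" splitting the line into single characters, which gawk does but POSIX leaves unspecified. A sketch of the same result using only length() and substr(), which should behave the same in any awk:

```shell
hash=13febd65d65112badd0aa90a15d84032

# Take the last three 2-character pairs by position instead of by field.
path=$(printf '%s\n' "$hash" | awk '{
  n = length($0)
  printf "%s/%s/%s/%s\n", substr($0, n-5, 2), substr($0, n-3, 2), substr($0, n-1, 2), $0
}')

echo "$path"
```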
With sed:
echo '13febd65d65112badd0aa90a15d84032' | \
sed -n 's/\(.*\([0-9a-f]\{2\}\)\([0-9a-f]\{2\}\)\([0-9a-f]\{2\}\)\)$/\2\/\3\/\4\/\1/p;'
With GNU sed you can simplify the pattern using the -r option, so you no longer need to escape {} and (). Using ~ as the regex delimiter lets you use the path separator / without escaping it:
sed -nr 's~(.*([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2}))$~\2/\3/\4/\1~p;'
Output:
d8/40/32/13febd65d65112badd0aa90a15d84032
Put simply, the pattern captures the following groups (\1 is the whole hash, \2 to \4 are the last three character pairs):
(all (n-5 - n-4) (n-3 - n-2) (n-1 - n-0))
and replaces them with
\2/\3/\4/\1
You can use a regular expression to separate each of the last 3 bytes from the rest of the hash.
hash=13febd65d65112badd0aa90a15d84032
[[ $hash =~ (..)(..)(..)$ ]]
new_path="/path/to/nginx/cache/${BASH_REMATCH[1]}/${BASH_REMATCH[2]}/${BASH_REMATCH[3]}/$hash"
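The same match can be wrapped in a function that also rejects input that is not a 32-character lowercase hex string. This is a sketch assuming bash; cache_path and its optional second argument are hypothetical names, and the default root is the placeholder path from the question.

```shell
# cache_path HASH [ROOT] (hypothetical helper): print the cache file path,
# or return 1 if HASH is not 32 lowercase hex characters.
cache_path() {
  local hash=$1 root=${2:-/path/to/nginx/cache}
  local re='^[0-9a-f]{26}([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$'
  [[ $hash =~ $re ]] || return 1
  printf '%s/%s/%s/%s/%s\n' "$root" \
    "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}" "$hash"
}

cache_path 13febd65d65112badd0aa90a15d84032
```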
Base="/path/to/nginx/cache/"
echo '13febd65d65112badd0aa90a15d84032' | \
sed "s|\(.*\(..\)\(..\)\(..\)\)|${Base}\2/\3/\4/\1|"
# or
# sed "s|.*\(..\)\(..\)\(..\)$|${Base}\1/\2/\3/&|"
This assumes the input is a valid MD5 string and nothing else.
First of all - thanks to all of the responders - this was extremely quick!
I also did my own scripting meantime, and came up with this solution:
Run this script with a parameter of the URL you're looking for (www.example.com/article/76232?q=hello for example)
#!/bin/bash
path=$1
md5=$(echo -n "$path" | md5sum | cut -f1 -d' ')
p3=$(echo "${md5:0-2:2}")
p2=$(echo "${md5:0-4:2}")
p1=$(echo "${md5:0-6:2}")
echo "/path/to/nginx/cache/$p1/$p2/$p3/$md5"
This assumes the NGINX cache has a key structure of 2:2:2.
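The three $(echo ...) subshells are not strictly needed, since bash substring expansion can be assigned directly. A sketch with a fixed hash instead of the md5sum call; note the space before the minus sign, which keeps bash from reading it as the ${var:-default} form.

```shell
md5=13febd65d65112badd0aa90a15d84032

# ${var: -N:2} takes 2 characters starting N characters from the end.
p1=${md5: -6:2}   # d8
p2=${md5: -4:2}   # 40
p3=${md5: -2:2}   # 32

echo "/path/to/nginx/cache/$p1/$p2/$p3/$md5"
```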

extract substring from lines using grep, awk,sed or etc

I have a file with many lines like:
lily weisy
I want to extract www.youtube.com/user/airuike and lily weisy, and then I also want to separate airuike from www.youtube.com/user/
so I want to get 3 strings: www.youtube.com/user/airuike, airuike and lily weisy
how to achieve this? thanks
do this:
sed -e 's/.*href="\([^"]*\)".*>\([^<]*\)<.*/link:\1 name:\2/' < data
will give you the first part. But I'm not sure what you are doing with it after this.
Since it is html, and html should be parsed with a html parser and not with grep/sed/awk, you could use the pattern matching function of my Xidel.
xidel yourfile.html -e '<a class="yt-uix-sessionlink yt-user-name " dir="ltr">{$link := @href, $user := substring-after($link, "www.youtube.com/user/"), $name:=text()}</a>*'
Or if you want a CSV like result:
xidel yourfile.html -e '<a class="yt-uix-sessionlink yt-user-name " dir="ltr">{string-join((@href, substring-after(@href, "www.youtube.com/user/"), text()), ", ")}</a>*' --hide-variable-names
It is kind of sad, that you also want to have the airuike string, otherwise it could be as simple as
xidel /yourfile.html -e '{$name}*'
(and you were supposed to be able to use xidel '{$name}*', but it seems I haven't thought the syntax through. Just one error check and it is breaking everything. )
$ awk '{split($0,a,/(["<>]|:\/\/)/); u=a[4]; sub(/.*\//,"",a[4]); print u,a[4],a[12]}' file
www.youtube.com/user/airuike airuike lily weisy
I think something like this should work:
while IFS= read -r line
do
href=$(printf '%s\n' "$line" | grep -o 'http[^"]*')
user=$(printf '%s\n' "$href" | grep -o '[^/]*$')
text=$(printf '%s\n' "$line" | grep -o '[^>]*<\/a>$' | grep -o '^[^<]*')
echo "href: $href"
echo "user: $user"
echo "text: $text"
done < yourfile
Regular expressions basics: http://en.wikipedia.org/wiki/Regular_expression#POSIX_Basic_Regular_Expressions
Upd: checked and fixed
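For a single line, all three strings can also be taken with one bash regex match and no grep calls. This is only a sketch: the sample line is an assumed shape pieced together from the answers (an anchor with href, class and dir attributes), not the exact markup of the real file, and a proper HTML parser remains the safer choice.

```shell
# Assumed sample line; the real file may carry more attributes.
line='<a href="http://www.youtube.com/user/airuike" class="yt-uix-sessionlink yt-user-name " dir="ltr">lily weisy</a>'

# Group 1: the link without the scheme, group 2: the link text.
re='href="https?://([^"]*)"[^>]*>([^<]*)</a>'

if [[ $line =~ $re ]]; then
  link=${BASH_REMATCH[1]}
  name=${BASH_REMATCH[2]}
  user=${link##*/}   # everything after the last slash
  echo "link: $link"
  echo "user: $user"
  echo "name: $name"
fi
```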
