Find and replace URL with content from URL - bash

Background info:
I've got an XML file that my supplier uploads each night with new products, updated stock counts, etc.
But they've stitched me up: the XML file has no description, only a link to a page on their site that holds the description as raw text.
What I need is a script that loops through the document I download from them and replaces each URL with the content at that URL.
For example, if I have
<DescriptionLink>http://www.leadersystems.com.au/DataFeed/ProductDetails/AT-CHARGERSTATION-45</DescriptionLink>
I want it to end up as
<DescriptionLink>Astrotek USB Charging Station Charger Hub 3 Port 5V 4A with 1.5m Power Cable White for iPhone Samsung iPad Tablet GPS</DescriptionLink>
I've tried a few things, but I'm not very proficient with scripting or loops.
So far I've got:
#!/bin/bash
LINKGET=`awk -F '|' '{ print $2 }' products-daily.txt`
wget -O products-daily.txt http://www.suppliers-site-url.com
sed 's/<DescriptionLink>*/<DescriptionLink>$(wget -S -O- $LINKGET/g' products-daily.txt
But again, I'm not sure how this all really works, so it's been trial and error.
Any help is appreciated!!!
Updated to include example URL.

You'll want something like this (using GNU awk for the 3rd arg to match()):
$ cat tst.awk
{
    head = ""
    tail = encode($0)
    while ( match(tail,/^([^{]*[{])([^}]+)(.*)/,a) ) {
        desc = ""
        cmd = "curl -s \047" a[2] "\047"
        while ( (cmd | getline line) > 0 ) {
            desc = (desc=="" ? "" : desc ORS) line
        }
        close(cmd)
        head = head decode(a[1]) desc
        tail = a[3]
    }
    print head decode(tail)
}
function encode(str) {
    gsub(/#/,"#A",str)
    gsub(/{/,"#B",str)
    gsub(/}/,"#C",str)
    gsub(/<DescriptionLink>/,"{",str)
    gsub(/<\/DescriptionLink>/,"}",str)
    return str
}
function decode(str) {
    gsub(/}/,"</DescriptionLink>",str)
    gsub(/{/,"<DescriptionLink>",str)
    gsub(/#C/,"}",str)
    gsub(/#B/,"{",str)
    gsub(/#A/,"#",str)
    return str
}
$ awk -f tst.awk file
<DescriptionLink>Astrotek USB Charging Station Charger Hub 3 Port 5V 4A with 1.5m Power Cable White for iPhone Samsung iPad Tablet GPS</DescriptionLink>
See https://stackoverflow.com/a/40512703/1745001 for info on what the encode/decode functions are doing and why.
Note that this is one of the rare cases where use of getline is appropriate. If you're ever considering using getline in the future, make sure you read and fully understand all of the caveats and use cases discussed at http://awk.freeshell.org/AllAboutGetline first.
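If GNU awk is not available, the same idea can be sketched in POSIX awk using index() and substr() in place of the three-argument match(). This is an illustrative variant, not the script above: the function name expand_links and the FETCH_CMD override are made up for the example, and it assumes that no URL contains a single quote and that the fetched text can be dropped into the XML unescaped.

```shell
#!/bin/bash
# expand_links (hypothetical name): read XML on stdin and replace the URL inside
# each <DescriptionLink>...</DescriptionLink> pair with the text fetched from it.
# FETCH_CMD is a hypothetical override (default: curl -s) for dry runs.
# Assumption: URLs contain no single quote, since the URL is passed to a shell.
expand_links() {
    awk -v fetch="${FETCH_CMD:-curl -s}" '
    {
        otag = "<DescriptionLink>"
        ctag = "</DescriptionLink>"
        out  = ""
        rest = $0
        while ((i = index(rest, otag)) > 0) {
            # copy everything up to and including the opening tag
            out  = out substr(rest, 1, i + length(otag) - 1)
            rest = substr(rest, i + length(otag))
            j = index(rest, ctag)
            if (j == 0) break               # unterminated tag: leave the rest as-is
            url  = substr(rest, 1, j - 1)
            rest = substr(rest, j)          # rest now starts at the closing tag
            cmd  = fetch " \047" url "\047"
            desc = ""
            while ((cmd | getline line) > 0)
                desc = (desc == "" ? "" : desc ORS) line
            close(cmd)
            out = out desc
        }
        print out rest
    }'
}
```

It would be run as expand_links < products-daily.txt. Setting FETCH_CMD='echo fetched' first gives a dry run that prints each URL back instead of downloading it.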

Related

How to filter lines written to log by gnu-screen

I have a device whose output I can log using screen:
screen -L log.txt /dev/ttyUSB0 115200
and the log.txt file will have entries like this:
Seconds: 2001.609
Centigrade: 38.780
Humidity %: 29.534
Seconds: 2002.756
Centigrade: 38.950
Humidity %: 29.274
with a blank line between each block of data. I'd like to drop the blank lines and the Seconds line to get:
Centigrade: 38.780
Humidity %: 29.534
Centigrade: 38.950
Humidity %: 29.274
Is there any way to do this with screen? Or is post-processing the only option? If a grep can be run then I can also add an awk to produce:
2001.609, 38.780, 29.534
2002.756, 38.950, 29.274
Is screen the best tool for this logging? It seems not.
If you are not going to interact with the device, then you can simply read straight from it with awk </dev/ttyUSB0 '....' and extract the wanted fields. You can set the speed first with stty -F /dev/ttyUSB0 115200 and perhaps also choose other stty options like raw -echo at the same time.
stty -F /dev/ttyUSB0 115200
awk </dev/ttyUSB0 '
/^Seconds:/ { s = $2; next }
/^Centigrade:/ { c = $2; next }
/^Humidity %:/ { h = $3; printf "%s, %s, %s\n",s,c,h }
'
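To try the awk program without the device attached, it can be wrapped in a function and fed saved log lines. This is only a sketch: the name to_csv is made up, and stripping a trailing carriage return is an assumption about what the serial device sends.

```shell
#!/bin/bash
# to_csv (hypothetical name): turn Seconds/Centigrade/Humidity blocks on stdin
# into one "seconds, centigrade, humidity" line per block.
to_csv() {
    awk '
    { sub(/\r$/, "") }                      # assumption: lines may end in CR LF
    /^Seconds:/    { s = $2; next }
    /^Centigrade:/ { c = $2; next }
    /^Humidity %:/ { h = $3; printf "%s, %s, %s\n", s, c, h }
    '
}
```

After the stty call, to_csv < /dev/ttyUSB0 reads from the device, while to_csv < log.txt post-processes a log that screen has already written.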

Having SUM issues with a bash script

I'm trying to write a script that pulls the numbers out of 4 files storing temperature readings from 4 industrial freezers. It's a hobby script, and it generates the general readouts I wanted. However, when I try to generate a SUM of the temperature readings, I get the printout below appended to the file. My goal is to print only the final SUM, not the individual numbers printed vertically.
Any help would be greatly appreciated; here's my code:
grep -o "[0.00-9.99]" "/location/$value-1.txt" | awk '{ SUM += $1; print $1} END { print SUM }' >> "/location/$value-1.txt"
here is what I am getting in return
Morningtemp:17.28
Noontemp:17.01
Lowtemp:17.00 Hightemp:18.72
1
7
.
2
8
1
7
.
0
1
1
7
.
0
0
1
8
.
7
2
53
It does generate the SUM, but I don't need the individual numbers listed, just the SUM total.
Why not stick with AWK completely? Code:
$ cat > summer.awk
{
    while(match($0,/[0-9]+\.[0-9]+/)) # while matches on record
    {
        sum+=substr($0, RSTART, RLENGTH) # extract matches and sum them
        $0=substr($0, RSTART + RLENGTH) # reset to start after previous match
        count++ # count matches
    }
}
END {
    print sum"/"count"="sum/count # print stuff
}
Data:
$ cat > data.txt
Morningtemp:17.28
Noontemp:17.01
Lowtemp:17.00 Hightemp:18.72
Run:
$ awk -f summer.awk data.txt
70.01/4=17.5025
It might work in the winter too.
The regex in grep -o "[0.00-9.99]" "/location/$value-1.txt" is equivalent to [0-9.], but you're probably looking for numbers in the range 0.00 to 9.99. For that, you need a different regex:
grep -o "[0-9]\.[0-9][0-9]" "/location/$value-1.txt"
That looks for a digit, a dot, and two more digits. It was almost tempting to use [.] in place of \.; it would also work. A plain . would not; that would select entries such as 0X87.
Note that the pattern shown ([0-9]\.[0-9][0-9]) will match 192.16.24.231 twice (2.16 and 4.23). If that's not what you want, you have to be a lot more precise. OTOH, it may not matter in the slightest for the actual data you have. If you'd want it to match 192.16 and 24.231 (or .24 and .231), you have to refine your regex.
Your command structure:
grep … filename | awk '…' >> filename
is living dangerously. In the example, it is 'OK' (but there's a huge grimace on my face as I type 'OK') because the awk script doesn't write anything to the file until grep has read it all. But change the >> to > and you have an empty input, or have awk write material before the grep is complete and suddenly it gets very tricky to determine what happens (it depends, in part, on what awk writes to the end of the file).
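One way to sidestep the read-while-appending problem is to finish the whole pipeline inside a command substitution and only then append. A minimal sketch, assuming readings always have digits on both sides of the decimal point; the function name append_sum and the Sum: label are made up for illustration. Leaving print $1 out of the awk body is what stops the individual values from being printed.

```shell
#!/bin/bash
# append_sum (hypothetical name): add up every decimal number in a file and
# append a single "Sum:<total>" line to that same file.
append_sum() {
    local file=$1 total
    # The command substitution completes before anything is written back,
    # so grep never reads lines that this function appended.
    total=$(grep -Eo '[0-9]+\.[0-9]+' "$file" |
            awk '{ sum += $1 } END { printf "%.2f\n", sum }')
    printf 'Sum:%s\n' "$total" >> "$file"
}
```

For the question's layout it would be called as append_sum "/location/$value-1.txt".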

Get package name and corr. data from file

I've been banging my head lately, trying to parse dumpsys output.
Here is the output:
NotificationRecord(0x4297d448: pkg=com.android.systemui user=UserHandle{0} id=273 tag=null score=0: Notification(pri=0 icon=7f020148 contentView=com.android.systemui/0x1090069 vibrate=null sound=null defaults=0x0 flags=0x2 when=0 ledARGB=0x0 contentIntent=N deleteIntent=N contentTitle=6 contentText=15 tickerText=6 kind=[null]))
uid=10012 userId=0
icon=0x7f020148 / com.android.systemui:drawable/stat_sys_no_sim
pri=0 score=0
contentIntent=null
deleteIntent=null
tickerText=No SIM
contentView=android.widget.RemoteViews#429c1f58
defaults=0x00000000 flags=0x00000002
sound=null
vibrate=null
led=0x00000000 onMs=0 offMs=0
extras={
android.title=No SIM
android.subText=null
android.showChronometer=false
android.icon=2130837832
android.text=Insert SIM card
android.progress=0
android.progressMax=0
android.showWhen=true
android.infoText=null
android.progressIndeterminate=false
android.scoreModified=false
}
NotificationRecord(0x427e1878: pkg=jackpal.androidterm user=UserHandle{0} id=1 tag=null score=0: Notification(pri=0 icon=7f02000d contentView=jackpal.androidterm/0x1090069 vibrate=null sound=null defaults=0x0 flags=0x62 when=1456782124817 ledARGB=0x0 contentIntent=Y deleteIntent=N contentTitle=17 contentText=27 tickerText=27 kind=[null]))
uid=10094 userId=0
icon=0x7f02000d / jackpal.androidterm:drawable/ic_stat_service_notification_icon
pri=0 score=0
contentIntent=PendingIntent{42754f78: PendingIntentRecord{42802aa0 jackpal.androidterm startActivity}}
deleteIntent=null
tickerText=Terminal session is running
contentView=android.widget.RemoteViews#4279b510
defaults=0x00000000 flags=0x00000062
sound=null
vibrate=null
led=0x00000000 onMs=0 offMs=0
extras={
android.title=Terminal Emulator
android.subText=null
android.showChronometer=false
android.icon=2130837517
android.text=Terminal session is running
android.progress=0
android.progressMax=0
android.showWhen=true
android.infoText=null
android.progressIndeterminate=false
android.scoreModified=false
}
NotificationRecord(0x429381f8: pkg=com.droidsail.dsapp2sd user=UserHandle{0} id=128 tag=null score=0: Notification(pri=0 icon=7f020000 contentView=com.droidsail.dsapp2sd/0x1090069 vibrate=null sound=null defaults=0x0 flags=0x10 when=1456786729004 ledARGB=0x0 contentIntent=Y deleteIntent=N contentTitle=13 contentText=35 tickerText=35 kind=[null]))
uid=10107 userId=0
icon=0x7f020000 / com.droidsail.dsapp2sd:drawable/appicon
pri=0 score=0
contentIntent=PendingIntent{42955a60: PendingIntentRecord{4286db18 com.droidsail.dsapp2sd startActivity}}
deleteIntent=null
tickerText=Detected new app can be moved to SD
contentView=android.widget.RemoteViews#42a891a8
defaults=0x00000000 flags=0x00000010
sound=null
vibrate=null
led=0x00000000 onMs=0 offMs=0
extras={
android.title=New app to SD
android.subText=null
android.showChronometer=false
android.icon=2130837504
android.text=Detected new app can be moved to SD
android.progress=0
android.progressMax=0
android.showWhen=true
android.infoText=null
android.progressIndeterminate=false
android.scoreModified=false
}
NotificationRecord(0x423708b0: pkg=android user=UserHandle{-1} id=17041135 tag=null score=0: Notification(pri=0 icon=1080399 contentView=android/0x1090069 vibrate=null sound=null defaults=0x0 flags=0x1002 when=0 ledARGB=0x0 contentIntent=Y deleteIntent=N contentTitle=19 contentText=17 tickerText=N kind=[android.system.imeswitcher]))
uid=1000 userId=-1
icon=0x1080399 / android:drawable/ic_notification_ime_default
pri=0 score=0
contentIntent=PendingIntent{425a8960: PendingIntentRecord{426f84b0 android broadcastIntent}}
deleteIntent=null
tickerText=null
contentView=android.widget.RemoteViews#428846b8
defaults=0x00000000 flags=0x00001002
sound=null
vibrate=null
led=0x00000000 onMs=0 offMs=0
extras={
android.title=Choose input method
android.subText=null
android.showChronometer=false
android.icon=17302425
android.text=Hacker's Keyboard
android.progress=0
android.progressMax=0
android.showWhen=true
android.infoText=null
android.progressIndeterminate=false
android.scoreModified=false
}
I want to get the package name and the corresponding extras={}
for each of them.
For example:
pkg:com.android.systemui
extras={
.....
}
So far I've tried:
dumpsys notification | awk '/pkg=/,/\n}/'
But without any success.
I'm a newbie to awk, and if possible I want to do it with awk or perl. Of course, any other tool like sed or grep is fine by me too; I just want to parse it somehow.
Can anyone help me?
If you have GNU awk, try the following:
awk -v RS='(^|\n)NotificationRecord\\([^=]+=' \
'NF { print "pkg:" $1; print gensub(/^.*\n\s*(extras=\{[^}]+\}).*$/, "\\1", 1) }' file
-v RS='(^|\n)NotificationRecord\\([^=]+=' breaks the input into records by lines starting with NotificationRecord( up to and including the following = char.
In effect, that means you get records starting with the package names (com.android.systemui, ...).
NF is a condition that only executes the following block if it evaluates to nonzero; NF is the count of fields in the record, so as long as at least 1 field is present, the block is evaluated - in effect, this skips the implied empty record before the very first line.
print "pkg:" $1 prints the package name, prefixed with literal pkg:.
gensub(/^.*\n\s*(extras=\{[^}]+\}).*$/, "\\1", 1) matches the entire record and replaces it with the extras property captured via a capture group, effectively returning the extras property only.
I would suggest perl over awk, because you'll be storing whether you're inside the extras=... block in a variable:
dumpsys notification | perl -lne '
print $1 if /^Notif.*?: pkg=(\S+)/;
$in_extras = 0 if /^ \}/;
print if $in_extras;
$in_extras = 1 if /^ extras=\{/'
Oh, if you want the extra pkg: and extras= text, slight modification:
dumpsys notification | perl -lne '
print "pkg: $1" if /^Notif.*?: pkg=(\S+)/;
$in_extras = 1 if /^ extras=\{/;
print if $in_extras;
$in_extras = 0 if /^ \}/;'
Sed version:
dumpsys notification |\
sed -n 's/.*pkg=\([^ ]*\).*/pkg:\1/p;/^  extras={$/,/^  }$/s/^  //p'
I'm assuming you always have two spaces in front of extras={ and } and you also want to remove these spaces.
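For comparison, the same "remember whether we are inside extras" idea can also be written in plain awk. This is a sketch under assumptions: the function name pkg_extras is made up, and it accepts any amount of leading whitespace before extras={ and } rather than exactly two spaces.

```shell
#!/bin/bash
# pkg_extras (hypothetical name): print "pkg:<name>" for each NotificationRecord
# line, followed by that record's extras={ ... } block with indentation removed.
pkg_extras() {
    awk '
    /^NotificationRecord[(]/ {
        # "pkg=" is 4 characters long, so skip it when extracting the name
        if (match($0, /pkg=[^ ]+/))
            print "pkg:" substr($0, RSTART + 4, RLENGTH - 4)
        next
    }
    /^[ \t]*extras=[{]/ { in_extras = 1 }
    in_extras           { sub(/^[ \t]+/, ""); print }
    /^[ \t]*[}]/        { in_extras = 0 }
    '
}
```

It would be used as dumpsys notification | pkg_extras.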

Extract text and evaluate in bash

I need some help getting a script up and running. Basically, I have some data that comes from a command's output, and I want to select some of it and evaluate it.
Example data is
JSnow <jsnow#email.com> John Snow spotted 30/1/2015
BBaggins <bbaggins#email.com> Bilbo Baggins spotted 20/03/2015
Batman <batman#email.com> Batman spotted 09/09/2015
So far I have something along the lines of
# Define date to check
check=$(date -d "-90 days" "+%Y/%m/%d")
# Return user name
for user in $(command | awk '{print $1}')
do
# Return last logon date
$lastdate=(command | awk '{for(i=1;i<=NF;i++) if ($i==spotted) $(i+1)}')
# Evaluation date again current -90days
if $lastdate < $check; then
printf "$user not logged on for ages"
fi
done
I have a couple of problems, not least the fact that whilst I can get information from places I don't know how to go about getting it all together!! I'm also guessing my date evaluation will be more complicated but at this point that's another problem and just there to give a better idea of my intentions. If anyone can explain the logical steps needed to achieve my goal as well as propose a solution that would be great. Thanks
Every time you write a loop in shell just to manipulate text you have the wrong approach (see, for example, https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice). The general purpose text manipulation tool that comes on every UNIX installation is awk. This uses GNU awk for time functions:
$ cat tst.awk
BEGIN { check = systime() - (90 * 24 * 60 * 60) }
{
    user = $1
    date = gensub(/([0-9]+)\/([0-9]+)\/([0-9]+)/,"\\3 \\2 \\1 0 0 0",1,$NF)
    secs = mktime(date)
    if (secs < check) {
        printf "%s not logged in for ages\n", user
    }
}
$ cat file
JSnow <jsnow#email.com> John Snow spotted 30/1/2015
BBaggins <bbaggins#email.com> Bilbo Baggins spotted 20/03/2015
Batman <batman#email.com> Batman spotted 09/09/2015
$ cat file | awk -f tst.awk
JSnow not logged in for ages
BBaggins not logged in for ages
Batman not logged in for ages
Replace cat file with command.
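Where GNU awk's time functions are not available, the date test can be sketched by turning each d/m/yyyy date into a sortable YYYYMMDD number and comparing it with a cutoff computed by the shell. This is an illustrative alternative, not the script above: the function name stale_users is made up, and it assumes the date is always the last field.

```shell
#!/bin/bash
# stale_users (hypothetical name): print users whose last "spotted" date is
# older than the cutoff. $1 is the cutoff date written as YYYYMMDD.
stale_users() {
    awk -v cutoff="$1" '
    {
        # the last field looks like 30/1/2015 (day/month/year)
        if (split($NF, d, "/") != 3) next
        ymd = sprintf("%04d%02d%02d", d[3], d[2], d[1])
        # adding 0 forces a numeric comparison
        if (ymd + 0 < cutoff + 0)
            printf "%s not logged in for ages\n", $1
    }'
}
```

With GNU date, as used in the question, the call would be command | stale_users "$(date -d "-90 days" +%Y%m%d)".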

Query information inside the files(UNIX,AWK)

I need to query the information in about 1000 files at once.
For example
My filename is
Test_001_20150517
Test_001_20150530
Information inside the file
{
1=2015
2=8
3=4
4=98888
5=123456
}
{
1=2014
2=456
3=5588
4=95858
5=67889
}
I want to query these 2 files with the condition that 1=2015 and only show the value of 5.
cat *201505*|awk -F '=' '{if ($1=="5"){print $2}}'
This shows the result, but without the condition that 1=2015. I don't know what to do, because both 1 and 5 are read as $1.
Sorry for my poor English if anything in my question is wrong or unclear.
Is this what you want?
$ awk -F'=' '{a[$1]=$2} /}/ && (a[1] == 2015) {print a[5]}' file
123456
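The one-liner keeps values from earlier blocks in the array, so a block that lacks key 5 would print the previous block's value. A sketch that clears the array at each opening brace and takes the keys as parameters; the function name query_blocks and its argument order are made up for illustration, and keys are assumed to start at the beginning of the line.

```shell
#!/bin/bash
# query_blocks (hypothetical name): for each { ... } block where key $1 equals
# value $2, print the value of key $3. Remaining arguments are input files;
# with none given, stdin is read.
query_blocks() {
    local k=$1 v=$2 want=$3
    shift 3
    awk -F'=' -v k="$k" -v v="$v" -v want="$want" '
    /^[ \t]*[{]/ { split("", rec); next }   # new block: forget the old values
    /^[ \t]*[}]/ {
        if ((k in rec) && rec[k] == v && (want in rec))
            print rec[want]
        next
    }
    NF >= 2 { rec[$1] = $2 }                # remember key=value pairs
    ' "$@"
}
```

For the question's files it would be called as query_blocks 1 2015 5 *201505*.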
