How to process continuous stream output with grep utility? - shell

I have a requirement where my curl command receives continuous output from a streaming HTTP service. The stream never ends. I want to grep for a string in the stream and pipe the matches to another utility, such as xargs with echo for example, for further continuous processing.
This is the output of the continuous stream, which only stops when I terminate the curl command.
curl -X "POST" "http://localhost:8088/query" --header "Content-Type: application/json" -d $'{"ksql": "select * from SENSOR_S EMIT CHANGES;","streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"}}' -s -N
[{"header":{"queryId":"none","schema":"`ROWTIME` BIGINT, `ROWKEY` STRING, `SENSOR_ID` STRING, `TEMP` BIGINT, `HUM` BIGINT"}},
{"row":{"columns":[1599624891102,"S2","S2",40,20]}},
{"row":{"columns":[1599624891113,"S1","S1",90,80]}},
{"row":{"columns":[1599624909117,"S2","S2",40,20]}},
{"row":{"columns":[1599624909125,"S1","S1",90,80]}},
{"row":{"columns":[1599625090320,"S2","S2",40,20]}},
Now when I pipe the output to grep, it works as expected, and I keep receiving new events as they arrive.
curl -X "POST" "http://localhost:8088/query" --header "Content-Type: application/json" -d $'{"ksql": "select * from SENSOR_S EMIT CHANGES;","streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"}}' -s -N | grep S1
{"row":{"columns":[1599624891113,"S1","S1",90,80]}},
{"row":{"columns":[1599624909125,"S1","S1",90,80]}},
But when I pipe the grep output to xargs and echo, nothing comes through at all.
curl -X "POST" "http://localhost:8088/query" --header "Content-Type: application/json" -d $'{"ksql": "select * from SENSOR_S EMIT CHANGES;","streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"}}' -s -N | grep S1 | xargs -I {} echo {}
^C
When I remove grep from the middle, it works as expected.
curl -X "POST" "http://localhost:8088/query" --header "Content-Type: application/json" -d $'{"ksql": "select * from SENSOR_S EMIT CHANGES;","streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"}}' -s -N | xargs -I {} echo {}
[{header:{queryId:none,schema:`ROWTIME` BIGINT, `ROWKEY` STRING, `SENSOR_ID` STRING, `TEMP` BIGINT, `HUM` BIGINT}},
{row:{columns:[1599624891102,S2,S2,40,20]}},
{row:{columns:[1599624891113,S1,S1,90,80]}},
{row:{columns:[1599624909117,S2,S2,40,20]}},
{row:{columns:[1599624909125,S1,S1,90,80]}},
{row:{columns:[1599625090320,S2,S2,40,20]}},
It looks like grep waits for its input to end before passing anything further down the pipe. When I tested the same thing with finite input, it worked as expected.
ls | grep sh | xargs -I {} echo {};
abcd.sh
123.sh
pqr.sh
xyz.sh
So, the questions are: Is my understanding correct? Is there a way for grep to keep passing output to subsequent commands in real time? I want to keep the basic filtering logic out of the later scripting, hence wanting grep to do this.
Thanks in advance!
Anurag

As suggested by @larsks, grep's --line-buffered option ("flush output on every line") works fine when I test a similar requirement to yours. By default grep block-buffers its output when writing into a pipe, so matches sit in the buffer until it fills; --line-buffered makes grep flush after every line instead.
So the command would be:
curl -X "POST" "http://localhost:8088/query" --header "Content-Type: application/json" -d $'{"ksql": "select * from SENSOR_S EMIT CHANGES;","streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"}}' -s -N | grep S1 --line-buffered | xargs -I {} echo {}
I tested on the /var/log/messages file, which gets continuously updated, as follows:
[root@project1-master ~]# tail -f /var/log/messages | grep journal --line-buffered | xargs -I {} echo {}
Sep 11 11:15:47 project1-master journal: I0911 15:15:47.448254 1 node_lifecycle_controller.go:1429] Initializing eviction metric for zone:
Sep 11 11:15:52 project1-master journal: I0911 15:15:52.448704 1 node_lifecycle_controller.go:1429] Initializing eviction metric for zone:
Sep 11 11:15:54 project1-master journal: 2020-09-11 15:15:54.006 [INFO][46] felix/int_dataplane.go 1300: Applying dataplane updates
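If a filter in the middle of the pipeline has no line-buffering flag of its own, coreutils stdbuf can usually impose line buffering from the outside; this generally works for programs that use C stdio, which grep does. A sketch of the same pipeline using stdbuf instead:
curl -X "POST" "http://localhost:8088/query" --header "Content-Type: application/json" -d $'{"ksql": "select * from SENSOR_S EMIT CHANGES;","streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"}}' -s -N | stdbuf -oL grep S1 | xargs -I {} echo {}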

Related

Tail a log file and send rows to curl in 100 line batches

I have a bash script that looks like this:
tail -f -n +1 my.log | \
awk -f influx.awk | \
xargs \
-I '{}' \
curl \
-XPOST 'http://influxdb/write?db=foo' \
--data-binary '{}'
What can I change so that instead of creating a curl request for each row, it would batch them up into say 100 rows (see influx curl docs)?
The problem I'm having is that each InfluxDB "point" needs to be separated by a newline, which is also the delimiter for xargs; e.g. adding -L 100 to xargs doesn't work.
Bonus: how would I also make this terminate if no new lines have been added to the file after, say, 10s?
Rather than xargs, you want to use split, with its --filter option. For example, the following batches lines into groups of two:
$ seq 5 | split -l 2 --filter='echo begin; cat; echo end'
begin
1
2
end
begin
3
4
end
begin
5
end
In your case, you could try something like
tail -f -n +1 my.log | \
awk -f influx.awk | \
split -l 100 --filter='\
curl \
-XPOST "http://influxdb/write?db=foo" \
--data-binary @-'
The @- makes curl read the data from standard input.
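As for the bonus: one hedged option is to put a bash read loop with a timeout between awk and split. read -r -t 10 returns non-zero when no line arrives within 10 seconds, which ends the loop and lets the pipeline wind down. A sketch (untested; note that awk may itself block-buffer when writing into a pipe):
tail -f -n +1 my.log | \
awk -f influx.awk | \
while IFS= read -r -t 10 line; do
  printf '%s\n' "$line"   # forward each line until a 10s gap occurs
done | \
split -l 100 --filter='\
curl \
-XPOST "http://influxdb/write?db=foo" \
--data-binary @-'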

Bash - for looping a curl gives no output

So, I am currently trying to connect data from a Consul cluster with data from FNT. I get the data I need by curling the Consul API, and the returned server names are then checked against FNT to get the server owner.
Following is the Consul curl:
gethosts=$(curl -s -H "Authorization: Bearer <TOKEN>" <CONSUL URL> | jq -cr '.[] | select(.NodeMeta.type == "physical") | .ServiceAddress')
Following is the FNT curl:
curl -s -H "Content-Type: application/json" -k -X POST -d '{}' "<FNT URL>" | jq '.returnData[] | select(.cFqdn == "<FQDN>") | .cResponsible + "/" + .cFqdn'
Both work perfectly fine on their own. The Consul curl gets me every FQDN of every physical (hardware) host, and if I paste one of those FQDNs into the FNT curl, it gets me the FQDN again plus the responsible owner of that server.
Now I wanted to combine the two in a loop, checking every single FQDN from Consul against FNT:
 for i in $gethosts; do curl -s -H "Content-Type: application/json" -k -X POST -d '{}' "<FNT URL>" | jq '.returnData[] | select(.cFqdn == $i) | .cResponsible + " " + .cFqdn'; done
But it simply doesn't work. There is no error or anything I can work with, just no output at all.
Does anyone see the mistake in my for loop? Because I definitely can't; I'm probably code-blind after all those hours of troubleshooting :D
Thanks in advance!
P.S.:
I also tried
for i in $gethosts; do $(curl -s -H "Content-Type: application/json" -k -X POST -d '{}' "<FNT URL>" | jq '.returnData[] | select(.cFqdn == $i) | .cResponsible + " " + .cFqdn'); done
or
for i in $gethosts; do curl -s -H "Content-Type: application/json" -k -X POST -d '{}' "<FNT URL>" | jq '.returnData[] | select(.cFqdn == <FQDN>) | .cResponsible + " " + .cFqdn'; done
To my understanding, the last one should always produce the same output, repeated as many times as there are hosts in $gethosts. I did this to see whether $i in .cFqdn was the problem, but it seems it isn't.
I fixed it.
for a in $gethosts; do curl -s -H "Content-Type: application/json" -k -X POST -d '{}' "<FNTURL>" | jq "[.returnData[] | select(.cFqdn == \"$a\") | .cResponsible + \";\" + .cFqdn] | .[]"; done
Guess I had some quoting issues.
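The underlying issue is that single quotes stop the shell from expanding $i, so jq was comparing cFqdn against the literal string $i. An alternative that sidesteps the escaping entirely is jq's --arg option, which passes a shell value in as a jq variable. A sketch using the same placeholder URL:
for a in $gethosts; do curl -s -H "Content-Type: application/json" -k -X POST -d '{}' "<FNT URL>" | jq --arg fqdn "$a" '.returnData[] | select(.cFqdn == $fqdn) | .cResponsible + ";" + .cFqdn'; done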

curl sends empty json on content-type: application/json

I am sending CPU and memory usage in JSON format from Ubuntu to a Node.js API using curl with the POST method. However, the JSON data is empty at the Node.js server.
Bash script on Ubuntu
top -b -n 2 -d 0.5 > top.txt
cpu_i=$(grep Cpu top.txt | cut -d ',' -f 4 | cut -d ' ' -f 2)
cpu=$(echo $cpu_i | cut -d ' ' -f 2)
echo $cpu
mem=$(grep "KiB Mem :" top.txt | cut -d ':' -f 2)
#echo $mem
mem_used=$(echo $mem | cut -d ',' -f 3 | cut -d ' ' -f 2)
echo $mem_used
curl --header "Content-Type: application/json" -d "{\"cpu\":\"$cpu\", \"memory\":\"$mem_used\",\"device\":\"ubuntu\"}" http://192.168.10.10:4000/collector
Output at Node.js server
{}
Remote Address: ::ffff:192.168.10.5
If I remember correctly, the HTTP verb is set to GET by default. If you intend to POST, use -X POST. That will probably solve your problem, because the curl command itself is OK.
curl -X POST --header "Content-Type: application/json" -d "{\"cpu\":\"$cpu\", \"memory\":\"$mem_used\",\"device\":\"ubuntu\"}" http://192.168.10.10:4000/collector
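Independent of the verb, it can also help to validate the payload on the sending side before it goes out. A hedged debugging aid (jq is assumed to be installed):
payload="{\"cpu\":\"$cpu\", \"memory\":\"$mem_used\",\"device\":\"ubuntu\"}"
echo "$payload" | jq .   # prints a parse error if the JSON is malformed
curl -X POST --header "Content-Type: application/json" -d "$payload" http://192.168.10.10:4000/collector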

Curl command in shell script returning error {"Error":"bpapigw-300 Cannot authorize access to resource"

I am trying to execute curl using this shell script:
#!/bin/bash
curl -k -H "Content-Type:application/json" -d '{"username":"admin","password":"adminpw", "tenant":"master"}' https://localhost/tron/api/v1/tokens > /tmp/token.data
grep -Po '{\"token\":\"\K[^ ]...................' /tmp/token.data > /tmp/token
tokendetails=`cat /tmp/token`
for token in $tokendetails
do
TOKEN=`echo $token`
done
userdetails=`cat /tmp/curloutput.txt | sed 's/{"clientInactivityTime"/\n{"clientInactivityTime"/g' | sed 's/\(.*\).\("firstName":[^,]*\)\(.*\)\("lastName":[^,]*\)\(.*\)\("email":[^,]*\)\(.*\)\("username":[^,]*\)\(.*\)/\2,\4,\6,\8/g' | grep username`
for user in $userdetails
do
firstName=`echo $user | sed 's/,/\n/g' | grep firstName | sed 's/.*:"\([^"]*\).*/\1/g'`
lastName=`echo $user | sed 's/,/\n/g' | grep lastName | sed 's/.*:"\([^"]*\).*/\1/g'`
email=`echo $user | sed 's/,/\n/g' | grep email | sed 's/.*:"\([^"]*\).*/\1/g'`
username=`echo $user | sed 's/,/\n/g' | grep username | sed 's/.*:"\([^"]*\).*/\1/g'`
curl -k -X POST "https://haxsne09/tron/api/v1/users" -H "accept: application/json" -H "Authorization: Bearer =${TOKEN}" -H "Content-Type: application/x-www-form-urlencoded" -d "first_name=${firstName}\&last_name=${lastName}\&email=${email}\&password=Tata123^\&username=${username}\&is_active=true"
echo $RESPONSE
done
I am getting this error:
{"Error":"bpapigw-300 Cannot authorize access to resource: Could not authorize path for user identifier: Failed to get Roles for identifier: REST operation failed 0 times: '[GET /api/v1/current-user][401] currentUserListUnauthorized \u0026{Detail:Invalid token}'. This user is unauthenticated?"}
Do I need to add any syntax before executing curl -k -X POST?
What I see is that -H "Authorization: Bearer =${TOKEN}" contains an = sign which shouldn't be there...
It should be: -H "Authorization: Bearer ${TOKEN}"
More, in a command you use /tmp/curloutput.txt file, which is never created by your script...
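A quick way to test whether the extracted token is accepted at all is to call the current-user endpoint that shows up in the error message. Hedged: the exact path and expected response are assumptions based on that message.
curl -k -H "Authorization: Bearer ${TOKEN}" "https://haxsne09/tron/api/v1/current-user"   # hypothetical check; path taken from the error text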
The Authorization header you are using is not working. Maybe the syntax is not Bearer =aAbBcCdDeEfF0123456 but something else for the server running on haxsne09, maybe without the = as @MarcoS suggests.
Alternatively, your grep command may be returning one too many characters (a rogue quote maybe).
I rewrote your code below to be more readable. You will notice that I:
Changed your matching groups in sed to capture only the needed parts and put them in variables using read. I also used the -E flag to avoid having to use \( and \)
Removed the useless for loops
Quoted all variable expansions properly
Added some line breaks for readability
Removed some temporary files and associated useless uses of cat
Here is the updated script:
#!/bin/bash
curl -k -H 'Content-Type:application/json' -d \
'{"username":"admin","password":"adminpw", "tenant":"master"}' \
https://localhost/tron/api/v1/tokens > /tmp/token.data
token=$(grep -Po '{"token":"\K[^ ]...................' /tmp/token.data)
IFS=, read -r firstName lastName email username < <(
</tmp/curloutput.txt sed 's/{"clientInactivityTime"/\n&/' |
sed -nE 's/.*"firstName":"([^"]*)".*"lastName":"([^"]*)".*"email":"([^"]*)".*"username":"([^"]*)".*/\1,\2,\3,\4/p'
)
curl -k -X POST 'https://haxsne09/tron/api/v1/users' -H 'accept: application/json' \
-H "Authorization: Bearer $token" -H "Content-Type: application/x-www-form-urlencoded" -d \
"first_name=$firstName&last_name=$lastName&email=$email&password=Tata123^&username=$username&is_active=true"
echo
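If jq is available, extracting the token from the login response is also more robust than the fixed-width grep. A sketch, assuming the response is a JSON object with a top-level token field:
token=$(curl -sk -H 'Content-Type: application/json' \
-d '{"username":"admin","password":"adminpw", "tenant":"master"}' \
https://localhost/tron/api/v1/tokens | jq -r '.token')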

Strange behavior when parsing result from curl + awk

Using curl on Ubuntu, I am trying to fetch the Jenkins version, inspired by:
https://wiki.jenkins.io/display/JENKINS/Remote+access+API
In a bash script I do:
VERSION=$(curl -k -i -X GET --insecure --silent --header \"Authorization: Bearer $TOKEN \" $URL | grep -Fi X-Jenkins: | awk '{print $2}')
echo "__A__[${VERSION}]__B__"
But when I run the script I get:
]__B__2.89.2
So for some reason the prefix: __A__[ gets swallowed and the suffix gets turned into a prefix.
I have also tried to trim the output with:
VERSION=$(curl -k -i -X GET --insecure --silent --header \"Authorization: Bearer $TOKEN \" $URL | grep -Fi X-Jenkins: | awk '{print $2}' | sed -e 's/^[ \t]*//')
But it gives the same result.
As suggested below I have also tried with:
echo '__A__['"${VERSION}"']__B__'
But still gives the same/wrong result.
A few other things I have tried (giving the same result)
Same/wrong output
VERSION=$(curl -k -i -X GET --insecure --silent --header \"Authorization: Bearer $TOKEN \" $URL | grep -i X-Jenkins: | awk '{print $2}')
echo '__A__['"${VERSION}"']__B__'
Same/wrong output
VERSION=$(curl -k -i -X GET --insecure --silent --header \"Authorization: Bearer $TOKEN \" $URL | grep X-Jenkins: | awk '{print $2}')
echo '__A__['"${VERSION}"']__B__'
Based on the suggestion below, I have now tried:
echo $VERSION|od -ax
Which gives:
0000000 2 . 8 9 . 2 cr nl
2e32 3938 322e 0a0d
0000010
If I compare that with:
VERSION_TEMP="2.89.2"
echo $VERSION_TEMP|od -ax
I get:
0000000 2 . 8 9 . 2 nl
2e32 3938 322e 000a
0000007
So it looks like it's the cr in the VERSION var that is causing the issue (though I wasn't sure how that explains the reversed prefix/suffix described above).
SOLVED: Based on input from Romeo, I got it to work by adding | tr -d '\r':
VERSION=$(curl -k -i -X GET --insecure --silent --header \"Authorization: Bearer $TOKEN \" $URL | grep X-Jenkins: | awk '{print $2}'|tr -d '\r')
Apparently the output contains a DOS carriage return. That also explains the reversed prefix/suffix: when the value is echoed, the carriage return moves the cursor back to the start of the line, so ]__B__ is printed over __A__[.
Try adding tr -d '\015':
version=$(curl -k -i -X GET --insecure --silent --header "Authorization: Bearer $TOKEN" "$URL" |
tr -d '\015' |
awk 'tolower($0) ~ /x-jenkins:/{print $2}')
echo "__A__[$version]__B__"
Uppercase variable names are reserved for system use, so I changed yours to lower case, too, and removed the useless grep.
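An equivalent approach strips the carriage return inside awk itself, saving the extra tr process. A sketch under the same assumptions:
version=$(curl -k -i -X GET --insecure --silent --header "Authorization: Bearer $TOKEN" "$URL" |
awk 'tolower($0) ~ /^x-jenkins:/ {sub(/\r$/, ""); print $2}')   # sub() drops the trailing CR before printing
echo "__A__[$version]__B__"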
