Elasticsearch : Number of search operations per second - performance

I am looking for a way to get the number of search operations per second on a node (and/or on all nodes).
Is there a way to get this information without the Marvel plugin?
My Elasticsearch version is 0.90.11.

Marvel does it by sampling. You can write a script to repeatedly run curl http://localhost:9200/_stats/search and parse a result that looks like this:
...
"_all" : {
"primaries" : {
"search" : {
"open_contexts" : 0,
"query_total" : 51556,
"query_time_in_millis" : 2339958,
"query_current" : 0,
"fetch_total" : 8276,
"fetch_time_in_millis" : 34916,
"fetch_current" : 0
}
},
"total" : {
"search" : {
"open_contexts" : 0,
"query_total" : 73703,
"query_time_in_millis" : 2773745,
"query_current" : 0,
"fetch_total" : 10428,
"fetch_time_in_millis" : 45570,
"fetch_current" : 0
}
}
},
...
You can see the query_total values -- just repeatedly query those at some interval and then do the math.
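For example, a minimal sketch of that sampling math, assuming a node on localhost:9200, an interval of 10 seconds, and jq available to parse the stats:

# Sample the cluster-wide query_total twice, 10 seconds apart, and compute ops/sec.
t1=$(curl -s localhost:9200/_stats/search | jq '._all.total.search.query_total')
sleep 10
t2=$(curl -s localhost:9200/_stats/search | jq '._all.total.search.query_total')
echo "Search ops/sec : $(( (t2 - t1) / 10 ))"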

Thanks Alcanzar.
Here is the script I created:
#!/bin/bash
# Print search ops/sec by sampling query_total at a fixed interval.
if [[ -z $1 ]]
then
    echo "Usage : $0 <refresh_interval_in_seconds>"
    echo "Example : $0 10"
    exit 1
fi
refresh_interval=$1
while true; do
    # The second "query_total" match is the cluster-wide "total" section
    # (the first is "primaries").
    begin=$(curl --silent http://localhost:9200/_stats/search?pretty | grep '"query_total" :' | sed -n 2p | sed 's/,$//' | awk '{print $3}')
    sleep "$refresh_interval"
    end=$(curl --silent http://localhost:9200/_stats/search?pretty | grep '"query_total" :' | sed -n 2p | sed 's/,$//' | awk '{print $3}')
    total=$(( (end - begin) / refresh_interval ))
    echo "$(date +"[%T] ")Search ops/sec : $total"
done
For instance, to refresh every 10 seconds, execute: bash script.sh 10
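The same sampling works for the fetch phase; a one-off sketch against the same endpoint (the second "fetch_total" match is again the cluster-wide total):

f1=$(curl -s 'http://localhost:9200/_stats/search?pretty' | grep '"fetch_total" :' | sed -n 2p | tr -dc '0-9')
sleep 10
f2=$(curl -s 'http://localhost:9200/_stats/search?pretty' | grep '"fetch_total" :' | sed -n 2p | tr -dc '0-9')
echo "Fetch ops/sec : $(( (f2 - f1) / 10 ))"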

Related

Is there a way to extract a value from a log and use it further to extract another value using bash

I am trying to read a value from a log file and then search for other text based on this value.
Below is what my log file looks like. All I have is the customerId; the orderId is generated dynamically.
I want to capture the orderId corresponding to the customerId first and store it in a variable.
Once that succeeds, I want to check the status of this order id, which appears some 10 lines below.
Finally, print it to the console or write it to a file; it doesn't matter which.
2019-05-18 09:46:02.944 [thread-2 ] Orderprocesing: Sending order info '{
    "customerName" : "jenny",
    "customerId" : "JE19802501",
    "customerphone" : null,
    "orderId" : "8456/72530548",
    "orderInfo" : {
        "Item" : "comic series 2018",
        "count" : "10"
    }
}'
.............................................................
.............................................................
2019-05-18 09:46:02.944 [thread-2 ] Orderprocesing: Sending order info '{
    "customerName" : "jenny",
    "customerId" : "JE19802501",
    "customerphone" : null,
    "orderId" : "8456/82530548",
    "orderInfo" : {
        "Item" : "comic series 2019",
        "count" : "10"
    }
}'
.............................................................
.............................................................
2019-05-18 09:49:02.944 [thread-2 ] Orderprocesing: status for 82530548 is success
.............................................................
.............................................................
.............................................................
2019-05-18 09:50:06.872 [thread-2 ] Orderprocesing: status for 72530548 is success
I am new to bash. I managed to slice a block of lines that contains the orderId corresponding to the customerId, but I couldn't slice out the orderId and store it in a variable:
$ cat orderlog_may_18 | grep -A 15 "JE19802501"
The expected results are to print:
customerId : JE19802501
orderId : 72530548
status for 72530548 is success
customerId : JE19802501
orderId : 82530548
status for 82530548 is success
Two lines of bash, using sed.
ord=$(sed -n '/JE19802501/,/orderId/{/orderId/{s/.*\///;s/[^0-9]//g;p}}' orderlog_may_18)
sed -n "/status for $ord/s/.*: //p" orderlog_may_18
$ord stores the numerals after the slash on the orderId line subsequent to JE19802501 (the s/.*\/// strips the "8456/" prefix so the ID matches the status lines).
The tail end of the status line is then printed.
You should be able to do the formatting you want in your bash script.
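Note that with two orders for the same customer (as in the sample log), $ord will contain two IDs, so the second sed needs to run once per ID. A sketch of that loop, using the same file name:

for o in $ord; do
    echo "customerId : JE19802501"
    echo "orderId : $o"
    sed -n "/status for $o/s/.*: //p" orderlog_may_18
done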
$ awk -v trgt='JE19802501' '
    { gsub(/[",]/," "); $1=$1 }                  # strip quotes and commas, re-split
    $1 == "customerId" { cust=$NF; if (cust == trgt) print }
    ($1 == "orderId") && (cust == trgt) {        # drop the "8456/" prefix so the ID
        sub(/.*\//,"",$NF); ords[$NF]; print     # matches the status lines; remember it
    }
    /status for/ && ($(NF-2) in ords) { sub(/.*: /,""); print }
' file
customerId : JE19802501
orderId : 72530548
customerId : JE19802501
orderId : 82530548
status for 82530548 is success
status for 72530548 is success

How to extract a value from json file in unix? [duplicate]

I have the below JSON content in my sample file:
{
    "listingRequest": {
        "id": "016a1050-82dc-1262-cc9b-4baf3e0b7123",
        "uri": "http://localhost:9090/nifi-api/flowfile-queues/016a104a-82dc-1262-7d78-d84a704abfbf/listing-requests/016a1050-82dc-1262-cc9b-4baf3e0b7123",
        "submissionTime": "04/28/2019 19:40:58.593 UTC",
        "lastUpdated": "19:40:58 UTC",
        "percentCompleted": 0,
        "finished": false,
        "maxResults": 100,
        "state": "Waiting for other queue requests to complete",
        "queueSize": {
            "byteCount": 480,
            "objectCount": 20
        },
        "sourceRunning": false,
        "destinationRunning": false
    }
}
I want to retrieve the value of the byte count, i.e. byteCount. The result should be 480.
Installing other tools like jq is not allowed in our organization due to restrictions.
How do I do it via sed/grep? I tried grep -Po '"byteCount":.*?[^\\]",' but did not get any output.
$ sed -n 's/.*"byteCount": *\([0-9]*\).*/\1/p' file
480
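If your grep supports -o (GNU grep does), an equivalent sketch without sed:

$ grep -o '"byteCount": *[0-9]*' file | grep -o '[0-9]*$'
480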
More generally you could use this (using any POSIX awk) to convert your specific format of JSON to a flat file and then print whatever you want by its tag hierarchy:
$ cat tst.awk
{ gsub(/^[[:space:]]+|[[:space:]]+$/,"") }
match($0,/^"[^"]+"/) {
    subTag = substr($0,RSTART+1,RLENGTH-2)
    $0 = substr($0,RSTART+RLENGTH)
}
!NF || /^{/ { next }
/^:[[:space:]]*{/ {
    preTag = (preTag=="" ? "" : preTag ".") subTag
    next
}
/^}/ {
    sub(/\.[^.]+$/,"",preTag)
    next
}
{
    gsub(/^[[:space:]]*:[[:space:]]*|[[:space:]]*,[[:space:]]*$/,"")
    tag = preTag "." subTag
    val = $0
    printf "%s=%s\n", tag, val
}
$ awk -f tst.awk file
listingRequest.id="016a1050-82dc-1262-cc9b-4baf3e0b7123"
listingRequest.uri="http://localhost:9090/nifi-api/flowfile-queues/016a104a-82dc-1262-7d78-d84a704abfbf/listing-requests/016a1050-82dc-1262-cc9b-4baf3e0b7123"
listingRequest.submissionTime="04/28/2019 19:40:58.593 UTC"
listingRequest.lastUpdated="19:40:58 UTC"
listingRequest.percentCompleted=0
listingRequest.finished=false
listingRequest.maxResults=100
listingRequest.state="Waiting for other queue requests to complete"
listingRequest.queueSize.byteCount=480
listingRequest.queueSize.objectCount=20
listingRequest.sourceRunning=false
listingRequest.destinationRunning=false
$ awk -f tst.awk file | awk -F'=' '$1=="listingRequest.queueSize.byteCount"{print $2}'
480
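The two-step pipe can also be collapsed into a single sed over the flattened output (same tst.awk):

$ awk -f tst.awk file | sed -n 's/^listingRequest\.queueSize\.byteCount=//p'
480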
I think you could count the characters. A sketch of that pure-bash approach (fixing the assignment and substring syntax):

# Slice the number out of the JSON string by character offset.
a=$(cat file)          # the JSON text
b='byteCount'

# Offset of the first occurrence of $2 within $1, or -1 if absent.
strindex() {
    local x="${1%%"$2"*}"
    [[ "$x" == "$1" ]] && echo -1 || echo "${#x}"
}

index=$(strindex "$a" "$b")
rest="${a:index+11}"        # skip past 'byteCount": '
result="${rest%%,*}"        # keep up to the next comma -> " 480"
echo "${result// /}"        # prints 480
Sources:
https://www.tldp.org/LDP/abs/html/string-manipulation.html
Position of a string within a string using Linux shell script?

Creating a CSV file from text

Using the following text file, I would like to create a CSV file.
Input file:
time : 5/14/18 10:31:26.832 AM
dt # : 0
Shot # : 587
name : 2851
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________
time : 5/14/18 10:31:23.280 AM
dt # : 0
Shot # : 974
name : 2852
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________
time : 5/14/18 6:04:27.880 AM
dt # : 21
Shot # : 316
name : 2854
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________
time : 5/14/18 10:12:53.932 AM
dt # : 21
Shot # : 731
name : 2849
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________
I tried to use this code to transpose the rows to columns.
gawk -F'\n' -v RS= -v OFS=',' -v ORS='\n' '{$1=$1}1' file.txt
This is the output I got:
time : 5/14/18 10:31:26.832 AM,dt # : 0,Shot # : 587,name : 2851,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
time : 5/14/18 10:31:23.280 AM,dt # : 0,Shot # : 974,name : 2852,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
time : 5/14/18 6:04:27.880 AM,dt # : 21,Shot # : 316,name : 2854,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
time : 5/14/18 10:12:53.932 AM,dt # : 21,Shot # : 731,name : 2849,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
But the desired output file should look like this:
time,dt,Shot,name,cdn,cdl,Comment,
5/14/18 10:31:26.832 AM,0,587,2851,2306,C5,N/A
5/14/18 10:31:23.280 AM,0,974,2852,2306,C5,N/A
5/14/18 6:04:27.880 AM,21,316,2854,2306,C5,N/A
5/14/18 10:12:53.932 AM,21,731,2849,2306,C5,N/A
Thanks in advance.
EDIT:
awk -F" : " '!a[$1]++ && NF && !/^__/{sub(/ #/,"");heading=heading?heading OFS $1:$1} /^__/ && val{val=val ORS;next} NF{val=val?val OFS $2:$2} END{gsub(/\n,/,"\n",val);print heading ORS val}' OFS=, Input_file
The following awk may help you with the same:
awk -F" : " 'BEGIN{print "time,dt,Shot,name,cdn,cdl,Comment,"}/^__/ && val{print val;val="";next} {val=val?val OFS $2:$2}' OFS=, Input_file

Bash log file: count words and replace them with a number

I need to keep the warnings from my script log and add a "LAST" to every line after each start, so I can see at a glance when an alert occurred. So I added this to my script.
This is the first line of my script:
echo "$( cat $ALERT_LOG_FILE | grep WARNING | tail -n 2k | ts "LAST ")" > $ALERT_LOG_FILE
The script log looks like this at first run:
WARNING : ...
WARNING : ...
WARNING : ...
WARNING : ...
When the script starts/restarts, the echo line adds "LAST" to each line, making it like this:
LAST WARNING : ...
LAST WARNING : ...
LAST WARNING : ...
LAST WARNING : ...
Problem is the log file becomes like this after some restarts:
LAST LAST LAST LAST WARNING : ....
LAST LAST LAST WARNING : ....
LAST LAST WARNING : ....
LAST LAST WARNING : ....
LAST WARNING : ....
WARNING:
Any way to make it like this:
LAST 4 WARNING : ....
LAST 3 WARNING : ....
LAST 2 WARNING : ....
LAST 2 WARNING : ....
LAST 2 WARNING : ....
LAST 1 WARNING : ....
WARNING:
EDIT:
Code with @Yoda's suggestion:
echo "$(cat $LOG_FILE | grep WARNING | tail -n 2k | ts "LAST " | awk '{n=gsub("LAST ",X);if(n) print "LAST",n,$0;else print}')" > $LOG_FILE
Output log after some restarts with @Yoda's suggestion:
LAST 2 2 1 WARNING : ...
LAST 2 1 WARNING : ...
LAST 1 WARNING : ...
WARNING : ...
Based on some assumptions:-
$ awk '{n=gsub("LAST ",X);if(n) print "LAST",n,$0;else print}' file
LAST 4 WARNING : ....
LAST 3 WARNING : ....
LAST 2 WARNING : ....
LAST 2 WARNING : ....
LAST 1 WARNING : ....
WARNING:
If this is not what you are looking for, then I would suggest posting a representative sample of your log file and the expected output.
Here is something that might help:-
awk '
{
    # Count and strip every "LAST " prefix.
    n = gsub("LAST ",X)
    if ( n )
    {
        for ( i = 1; i <= NF; i++ )
        {
            if ( $i ~ /WARNING/ )
            {
                sub(/^ */,X)            # trim blanks left over from the removals
                print "LAST",n,$0
                next
            }
            if ( $i ~ /^[0-9]+$/ )      # an existing count: fold it into n
            {
                n += $i
                $i = ""
            }
        }
    }
    else
        print $0
}
'
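Wired into the script's first line, that would look something like this (a sketch; $ALERT_LOG_FILE and the moreutils ts command as in the question):

echo "$(grep WARNING "$ALERT_LOG_FILE" | tail -n 2k | ts "LAST " | awk '
    { n = gsub("LAST ",X) }
    n { for (i = 1; i <= NF; i++) {
            if ($i ~ /WARNING/) { sub(/^ */,X); print "LAST", n, $0; next }
            if ($i ~ /^[0-9]+$/) { n += $i; $i = "" }
        }
      }
    !n { print }
')" > "$ALERT_LOG_FILE"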

How can I do jq nested for-loops from bash?

I have 52 JSON files (r$i.json), each containing 25 results (0 to 24). I'd like to create a JSON file with a special name for each of these results. The name would be composed from the content of each result: YYYYMMDDHHMMSS_company_jobtitle.json.
The command generating the names works fine:
#!/bin/bash
for ((j=0;j<=24;j++))
do
    datein=$(jq <"r1.json" ".results[$j].date" | sed 's/"//g')
    dateout=$(date -d "${datein}" +"%Y%m%d%H%M%S")
    company=$(jq <"r1.json" ".results[$j].company" | sed 's/,//g;s/"//g;s/ //g')
    job=$(jq <"r1.json" ".results[$j].jobtitle" | sed 's/,//g;s/"//g;s/ //g')
    jq <"r1.json" ".results[$j]" > "${dateout}_${company}_${job}.json"
done
Now when I replace r1 by r$i and add ((i=1;i<=52;j++)) it doesn't work... So I guess my problem comes from nested loop syntax in jq...
r1.json would look like this:
{
    "radius" : 25,
    "totalResults" : 1329,
    "results" : [
        {
            "jobtitle" : "job1",
            "company" : "company1",
            "date" : "Sun, 01 Sep 2015 07:59:58 GMT"
        },
        {
            "jobtitle" : "job2",
            "company" : "company2",
            "date" : "Sun, 02 Sep 2015 07:59:58 GMT"
        },
        ...
        {
            "jobtitle" : "job25",
            "company" : "company25",
            "date" : "Sun, 25 Sep 2015 07:59:58 GMT"
        }
    ]
}
You should respect the bash syntax in your fors:
for (( i=0; i<5; i++ ))
((i=1,i<=52,j++)) won't work; use ; instead of ,.
1) You wrote that your i-loop used ((i=1;i<=52;j++)); that should be ((i=1; i<=52; i++)).
2) We can't see exactly what you did with respect to r1 and r$i, so if (1) doesn't resolve your difficulty, maybe you should double-check that what you did is actually what is needed. Should you change "> $outputname" to ">> $outputname"?
3) I suspect that rather than using s/"//g, it might be better to use the -r option of jq; you might also consider avoiding sed altogether (jq 1.5 has sub and gsub functions). See the sketch after this list.
4) As I said, it would be better to get rid of all the backticks.
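For point (3), a minimal sketch of one loop iteration using jq -r, with tr standing in for the sed cleanup of commas and spaces (same field names as the question):

datein=$(jq -r ".results[$j].date" "r$i.json")
dateout=$(date -d "$datein" +"%Y%m%d%H%M%S")
company=$(jq -r ".results[$j].company" "r$i.json" | tr -d ' ,')
job=$(jq -r ".results[$j].jobtitle" "r$i.json" | tr -d ' ,')
jq ".results[$j]" "r$i.json" > "${dateout}_${company}_${job}.json"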
Finally I found the solution; my issue didn't come from jq but from the syntax I was using for nested loops... Here it is:
# kdr holds the number of result files (52 here)
for ((i=1;i<=kdr;i++))
do
    for ((j=0;j<=24;j++))
    do
        datein=$(jq <"r$i.json" ".results[$j].date" | sed 's/"//g')
        dateout=$(date -d "${datein}" +"%Y%m%d%H%M%S")
        company=$(jq <"r$i.json" ".results[$j].company" | sed 's/,//g;s/"//g;s/ //g')
        job=$(jq <"r$i.json" ".results[$j].jobtitle" | sed 's/,//g;s/"//g;s/ //g')
        jq <"r$i.json" ".results[$j]" > "${dateout}_${company}_${job}.json"
    done
done
