I have a goal of having this query return data in less than 10 ms; currently it's just under 50 ms.
The relationships are
(l:list)<-[:IN_LIST]-(p:product)<-[:PRODUCT]-(a:a)
I have over 20 million :a nodes in the db, and this significantly slows down the query. But I need to order the :list nodes by the count of :a. Is there a fast way of getting that size/count?
GRAPH.profile g "MATCH (l:list{kind: 'Trending'})
MATCH (l)<-[:IN_LIST]-(p:product)
WITH p, size((p)<-[:PRODUCT]-()) as count, l ORDER BY count DESC
WITH count(l) as total, collect(l{.id})[0..20] as lists
RETURN *"
1) "Results | Records produced: 1, Execution time: 0.001034 ms"
2) " Project | Records produced: 1, Execution time: 0.007407 ms"
3) " Aggregate | Records produced: 1, Execution time: 0.331449 ms"
4) " Sort | Records produced: 636, Execution time: 0.155567 ms"
5) " Project | Records produced: 636, Execution time: 0.383724 ms"
6) " Apply | Records produced: 636, Execution time: 4.117716 ms"
7) " Conditional Traverse | (p:product)->(p:product) | Records produced: 636, Execution time: 0.409086 ms"
8) " Filter | Records produced: 39, Execution time: 0.050714 ms"
9) " Node By Label Scan | (p:list) | Records produced: 77, Execution time: 0.020350 ms"
10) " Aggregate | Records produced: 636, Execution time: 19.862072 ms"
11) " Conditional Traverse | (anon_0)-[anon_1:PRODUCT]->(anon_0) | Records produced: 46947, Execution time: 65.262939 ms"
12) " Argument | Records produced: 636, Execution time: 0.050361 ms"
UPDATE: written this way, it runs in ~30 ms.
GRAPH.profile g "MATCH (l:list{kind: 'Trending'})
MATCH (l)<-[:IN_LIST]-(p:product)<-[:PRODUCT]-(a)
WITH l ORDER BY count(a) ASC
RETURN count(l) as total, collect(l{.id})[0..20] as lists"
1) "Results | Records produced: 1, Execution time: 0.001405 ms"
2) " Aggregate | Records produced: 1, Execution time: 0.055607 ms"
3) " Sort | Records produced: 20, Execution time: 0.007036 ms"
4) " Aggregate | Records produced: 20, Execution time: 9.456037 ms"
5) " Conditional Traverse | (a)->(a) | Records produced: 46947, Execution time: 19.134115 ms"
6) " Filter | Records produced: 39, Execution time: 0.057806 ms"
7) " Node By Label Scan | (l:list) | Records produced: 77, Execution time: 0.019137 ms"
I have the following text file.
Account1,2h 01m 00s
Account2,4h 25m 23s
Account3,5h 43m 59s
I wish to convert the hours, minutes and seconds of each entry into a total number of minutes:
Account1 minute total = 121
Account2 minute total = 265
Account3 minute total = 343
I have the following bash file
cat data.txt | cut -f2 -d','
This isolates the time values; however, from here I don't know what steps I would take to isolate the time, convert it to integers and then convert it to minutes. I have tried using a PARAM but to no avail.
If awk is an option, you can try this:
awk -F"[, ]" '{h=60; m=1; s=0.01666667}{split($2,a,/h/); split($3,b,/m/); split($4,c,/s/); print$1, "minute total = " int(a[1] * h + b[1] * m + c[1] * s)}' input_file
$ cat awk.script
BEGIN {
FS=",| "
} {
h=60
m=1
s=0.01666667
}{
split($2,a,/h/)
split($3,b,/m/)
split($4,c,/s/)
print $1, "minute total = " int(a[1] * h + b[1] * m + c[1] * s)
}
Output
awk -f awk.script input_file
Account1 minute total = 121
Account2 minute total = 265
Account3 minute total = 343
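If you'd rather stay in pure Bash, here is a minimal sketch assuming the exact `Account,XXh YYm ZZs` layout shown above (the function name `to_minutes` is mine, not from the question):

```shell
# to_minutes: read "Account,XXh YYm ZZs" lines on stdin and print
# each account's total in minutes (leftover seconds are truncated).
to_minutes() {
    while IFS=, read -r account time; do
        # strip the h/m/s suffixes, leaving three bare numbers
        read -r h m s <<< "${time//[hms]/}"
        # 10# forces base 10 so values like "09" are not parsed as octal
        echo "$account minute total = $(( 10#$h * 60 + 10#$m + 10#$s / 60 ))"
    done
}
```

Run it as `to_minutes < data.txt`. The `10#` prefix matters because Bash otherwise treats a leading zero as an octal marker, and `08`/`09` would be errors.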
I'm trying to parse a log file in shell and want to print the first word of the first line and the last word of each line under it.
For instance:
$ grep -A3 "2015-01-22T07" Test.log | grep -A3 "Messages from Summary report is"
2015-01-22T07:36:30 | 9316 | 461 | 50 | Messages from Summary report is :[ Number of C is 1500
Total distance 10 km
Total number of A is 2
Number of B is 2
]
--
2015-01-22T07:37:30 | 9316 | 461 | 50 | Messages from Summary report is :[ Number of C is 1600
Total distance 11 km
Total number of A is 3
Number of B is 3
]
--
2015-01-22T07:38:30 | 9316 | 461 | 50 | Messages from Summary report is :[ Number of C is 1700
Total distance 12 km
Total number of A is 4
Number of B is 4
]
Expected output:
2015-01-22T07:36:30,1500,10 km,2,2
2015-01-22T07:37:30,1600,11 km,3,3
2015-01-22T07:38:30,1700,12 km,4,4
Sorry, I'm new to this site.
cat test1.log
2015-01-22T07:36:30 | 9316 | 461 | 50 | Messages from Summary report is :[
Number of C is 1500
Total distance 10 km
Total number of A is 2
Number of B is 2
]
--
2015-01-22T07:37:30 | 9316 | 461 | 50 | Messages from Summary report is :[
Number of C is 1600
Total distance 11 km
Total number of A is 3
Number of B is 3
]
--
2015-01-22T07:38:30 | 9316 | 461 | 50 | Messages from Summary report is :[
Number of C is 1700
Total distance 12 km
Total number of A is 4
Number of B is 4
]
Re-attempt:
# awk -v RS='\n' -v OFS=, '$1~/^[0-9]{4}-[0-9]{2}-[0-9]{2}T/ {if (s) print s; s=$1; next}
/Total distance/{s = s OFS $(NF-1) " " $NF;next}
NF>2{s = s OFS $NF}
END{print s
}' test1.log
Output
,:[,1500,10 km,2,2,:[,1600,11 km,3,3,:[,1700,12 km,4,4
Check:
# head -1 test.log|cat -vte
2015-01-22T07:36:30 | 9316 | 461 | 50 | Messages from Summary report is :[ $
You can use this awk on your given output:
awk -v RS='\r' -v OFS=, '$1~/^[0-9]{4}-[0-9]{2}-[0-9]{2}T/ {if (s) print s; s=$1; next}
/Total distance/{s = s OFS $(NF-1) " " $NF;next}
NF>2{s = s OFS $NF}
END{print s
}' file
2015-01-22T07:36:30,1500,10 km,2,2
2015-01-22T07:37:30,1600,11 km,3,3
2015-01-22T07:38:30,1700,12 km,4,4
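For the multi-line layout in test1.log (one field per line), a similar sketch that anchors on the timestamp lines may be easier to reason about. The wrapper function `summarize` and the regexes are my assumptions based on the sample shown, not from the original answer:

```shell
# summarize: turn the multi-line records on stdin into one CSV line each
summarize() {
    awk -v OFS=, '
    /^[0-9]+-[0-9]+-[0-9]+T/ { if (s) print s; s = $1; next }    # timestamp line starts a new record
    /Total distance/         { s = s OFS $(NF-1) " " $NF; next } # keep the unit too ("10 km")
    /Number of|Total number/ { s = s OFS $NF }                   # keep only the trailing count
    END                      { if (s) print s }'
}
```

Run it as `summarize < test1.log`. The `]` and `--` separator lines match none of the patterns and are silently dropped.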
newtext.csv looks like below:
Record 1
---------
line 1 line 2 Sample Number: 123456789 (line no. 3) | | | | | Time In: 2012-05-29T10:21:06Z (line no. 21) | | | Time Out: 2012-05-29T13:07:46Z (line no. 30)
Record 2
----------
line 1 line 2 Sample Number: 363214563 (line no. 3) | | | | | Time In: 2012-05-29T10:21:06Z (line no. 21) | | | Time Out: 2012-05-29T13:07:46Z (line no. 30)
Record 3
---------
line 1 line 2 Sample Number: 987654321 (line no. 3) | | | | | Time In: 2012-05-29T10:21:06Z (line no. 21) | | | Time Out: 2012-05-29T13:07:46Z (line no. 30)
Assume there are 100 such records in newtext.csv. Now I need the parameters for the entered input string, as shown below.
Input
Enter the search String :
123456789
Output
Sample Number is, Sample Number: 123456789
Connected Time is,Time In: 2012-05-29T10:21:06Z
Disconnected Time is, Time Out: 2012-05-29T13:07:46Z
This is exactly what I need. Can you please help me with a shell script for the above format?
OK, the input and the desired output are a bit odd, but it's still not difficult to get what you want; try the following:
var=123456789
awk -v "var=$var" --exec /dev/stdin newtext.csv <<'EOF'
($7 == var) {
printf("Sample Number is, Sample Number: %s\n", $7);
printf("Connected Time is, Time In: %s\n", $18);
printf("Disconnected Time is, Time Out: %s\n", $27);
}
EOF
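If gawk's `--exec` isn't available, a plain `grep -o` sketch works too, assuming each record really is on one line and the field labels match the sample exactly (the function name `find_record` is hypothetical):

```shell
# find_record: print the three labelled fields for a given sample number
find_record() {
    local line
    # trailing space after $1 avoids matching a longer number by prefix
    line=$(grep "Sample Number: $1 " newtext.csv) || return 1
    echo "Sample Number is, $(grep -o "Sample Number: [0-9]*" <<< "$line")"
    echo "Connected Time is, $(grep -o "Time In: [^ ]*" <<< "$line")"
    echo "Disconnected Time is, $(grep -o "Time Out: [^ ]*" <<< "$line")"
}
```

Usage: `find_record 123456789`. This avoids counting whitespace-separated fields, so it keeps working if the number of `|` padding columns changes.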
I have the script below to subtract the counts of files between two directories but the COUNT= expression does not work. What is the correct syntax?
#!/usr/bin/env bash
FIRSTV=`ls -1 | wc -l`
cd ..
SECONDV=`ls -1 | wc -l`
COUNT=expr $FIRSTV-$SECONDV ## -> gives 'command not found' error
echo $COUNT
Try this Bash syntax instead of the external program expr:
count=$((FIRSTV-SECONDV))
BTW, the correct syntax of using expr is:
count=$(expr $FIRSTV - $SECONDV)
But keep in mind that using expr is slower than the built-in Bash arithmetic shown above.
You just need a little extra whitespace around the minus sign, and backticks:
COUNT=`expr $FIRSTV - $SECONDV`
Be aware of the exit status:
The exit status is 0 if EXPRESSION is neither null nor 0, 1 if EXPRESSION is null or 0.
Keep this in mind when using expr in a bash script in combination with set -e, which exits immediately if a command exits with a non-zero status.
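This pitfall is easy to trip over; a small demonstration with hypothetical values, run in a throwaway subshell:

```shell
# Under `set -e`, a zero result from expr (exit status 1) aborts the
# script at the assignment, before the echo ever runs.
out=$(bash -c 'set -e; count=$(expr 5 - 5); echo "reached: $count"' 2>&1)
status=$?
echo "status=$status output=[$out]"   # prints: status=1 output=[]
```

The assignment inherits expr's exit status 1 (because the result is 0), so the inner script dies there and prints nothing.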
You can use:
((count = FIRSTV - SECONDV))
to avoid invoking a separate process, as per the following transcript:
pax:~$ FIRSTV=7
pax:~$ SECONDV=2
pax:~$ ((count = FIRSTV - SECONDV))
pax:~$ echo $count
5
This is how I always do maths in Bash:
count=$(echo "$FIRSTV - $SECONDV"|bc)
echo $count
White space is important, expr expects its operands and operators as separate arguments. You also have to capture the output. Like this:
COUNT=$(expr $FIRSTV - $SECONDV)
but it's more common to use the builtin arithmetic expansion:
COUNT=$((FIRSTV - SECONDV))
For simple integer arithmetic, you can also use the builtin let command.
ONE=1
TWO=2
let "THREE = $ONE + $TWO"
echo $THREE
3
For more info on let, look here.
As an alternative to the three methods suggested above, you can try let, which carries out arithmetic operations on variables, as follows:
let COUNT=$FIRSTV-$SECONDV
or
let COUNT=FIRSTV-SECONDV
Diff Real Positive Numbers
diff_real () {
echo "df=($1 - $2); if (df < 0) { df=df* -1}; print df" | bc -l;
}
Usage
var_a=10
var_b=4
output=$(diff_real $var_a $var_b)
# 6
#########
var_a=4
var_b=10
output=$(diff_real $var_a $var_b)
# 6
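For integers you can skip bc entirely; here is a sketch using Bash arithmetic and the ternary operator (the function name `diff_int` is mine):

```shell
# diff_int: absolute difference of two integers, no external process
diff_int() {
    local d=$(( $1 - $2 ))
    echo $(( d < 0 ? -d : d ))
}
```

Usage: `diff_int 4 10` and `diff_int 10 4` both print 6. Unlike the bc version, this truncates to integers, so use bc when decimals matter.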
Use Bash:
#!/bin/bash
# home/victoria/test.sh
START=$(date +"%s") ## seconds since Epoch
for i in $(seq 1 10)
do
sleep 1.5
END=$(date +"%s") ## integer
TIME=$((END - START)) ## integer
AVG_TIME=$(python -c "print(float($TIME/$i))") ## int to float
printf 'i: %i | elapsed time: %0.1f sec | avg. time: %0.3f\n' $i $TIME $AVG_TIME
    ((i++))  ## redundant: the for loop already sets $i each iteration
done
Output
$ ./test.sh
i: 1 | elapsed time: 1.0 sec | avg. time: 1.000
i: 2 | elapsed time: 3.0 sec | avg. time: 1.500
i: 3 | elapsed time: 5.0 sec | avg. time: 1.667
i: 4 | elapsed time: 6.0 sec | avg. time: 1.500
i: 5 | elapsed time: 8.0 sec | avg. time: 1.600
i: 6 | elapsed time: 9.0 sec | avg. time: 1.500
i: 7 | elapsed time: 11.0 sec | avg. time: 1.571
i: 8 | elapsed time: 12.0 sec | avg. time: 1.500
i: 9 | elapsed time: 14.0 sec | avg. time: 1.556
i: 10 | elapsed time: 15.0 sec | avg. time: 1.500
$
I have a list of entries from the logs:
15:38:52.363 1031
15:41:06.347 1259
15:41:06.597 1171
15:48:44.115 1588
15:48:44.125 1366
15:48:44.125 1132
15:53:14.525 1348
15:53:15.121 1553
15:53:15.181 1286
15:53:15.187 1293
The first one is the timestamp, the second one is the value.
Now I'm trying to group them by an interval of, say, 20 seconds, and either sum the values or take their average. What's the easiest way to do this? Preferably something I can pipe my grep output into to get the divided list. Thanks!
This gawk script completely ignores fractional seconds. It also knows nothing about spanning from one day to the next (crossing 00:00:00):
grep ... | awk -v interval=20 'function groupout() {print "----", "Timespan ending:", strftime("%T", prevtime), "Sum:", sum, "Avg:", sum/count, "----"} BEGIN {prevtime = 0} {split($1, a, "[:.]"); time = mktime(strftime("%Y %m %d") " " a[1] " " a[2] " " a[3]); if (time > prevtime + interval) {if (NR != 1) {groupout(); sum=0; count=0}}; print; sum+=$2; count++; prevtime = time} END {groupout()}'
Output:
15:38:52.363 1031
---- Timespan ending: 15:38:52 Sum: 1031 Avg: 1031 ----
15:41:06.347 1259
15:41:06.597 1171
---- Timespan ending: 15:41:06 Sum: 2430 Avg: 1215 ----
15:48:44.115 1588
15:48:44.125 1366
15:48:44.125 1132
---- Timespan ending: 15:48:44 Sum: 4086 Avg: 1362 ----
15:53:14.525 1348
15:53:15.121 1553
15:53:15.181 1286
15:53:15.187 1293
---- Timespan ending: 15:53:15 Sum: 5480 Avg: 1370 ----
Here it is again more readably:
awk -v interval=20 '
function groupout() {
print "----", "Timespan ending:", strftime("%T", prevtime), "Sum:", sum, "Avg:", sum/count, "----"
}
BEGIN {
prevtime = 0
}
{
split($1, a, "[:.]");
time = mktime(strftime("%Y %m %d") " " a[1] " " a[2] " " a[3]);
if (time > prevtime + interval) {
if (NR != 1) {groupout(); sum=0; count=0}
};
print;
sum+=$2;
count++;
prevtime = time
}
END {groupout()}'