jq : Generate UUID in field

jq : Generate UUID in field - random

I have a requirement to tag records uniquely with UUIDs (for a correlation id). I can't see a direct way to do this via the options. Is such a thing possible? If not, is there some kind of workaround that might be able to do this?
Is it even possible to generate a random number or string in jq?

It is possible to generate pseudo-random numbers in jq if you provide one initial random number (--argjson initialRandomNumber).
Passing $RANDOM$RANDOM instead of $RANDOM increases the range of the initial pseudo-random value.
I used a slightly modified version of the function nextRandomNumber from the Rosetta Code jq page on random numbers to generate random numbers, strings and UUIDs, as shown in the following code.
Each function takes a parameter $state and returns a newState in its result for use in a subsequent call.
Because you asked how to generate a random number, a random string and a UUID, the code contains six functions: for each of the three, one generates a single instance and one generates an array of them.
You can pick the function you need and use it as a filter in jq.
Note that the UUID functions produce 32 pseudo-random hex digits in the UUID layout; they do not set the version and variant bits that RFC 4122 defines for version 4 UUIDs.
#!/bin/bash
jq -c -r -n --argjson initialRandomNumber "$RANDOM$RANDOM" '
# 15-bit integers generated using the same formula as rand() from the Microsoft C Runtime.
# The random numbers are in [0 -- 32767] inclusive.
#
# Input:
# first call: $state = a random number provided to jq by parameter
# subsequent call: $state = "newState" from last response
#
# Output:
# object with pseudo-random number and "newState" for a subsequent call.
def nextRandomNumber($state):
( (214013 * $state) + 2531011) % 2147483648 # mod 2^31
| { newState: .,
randomNumber: (. / 65536 | floor) };
def nextRandomNumbers($state; $count):
[foreach range($count) as $x (nextRandomNumber($state); nextRandomNumber(.newState); .)]
| { newState: .[-1].newState,
randomNumbers: map(.randomNumber) };
# ----- random UUID ---------------
def hexByte:
[. / 256 % 16, . % 16]
| map(if . < 10 then . + 48 else . + 87 end) # ASCII: 0...9: 48-57, a...f: 97-102
| implode;
def nextRandomUUID($state):
nextRandomNumbers($state; 16)
| .newState as $newState
| .randomNumbers
| map(hexByte)
| "\(.[0:4] | join(""))-\(.[4:6] | join(""))-\(.[6:8] | join(""))-\(.[8:10] | join(""))-\(.[10:] | join(""))"
| { newState: $newState,
randomUUID: . };
def nextRandomUUIDs($state; $count):
[foreach range($count) as $x (nextRandomUUID($state); nextRandomUUID(.newState); .)]
| { newState: .[-1].newState,
randomUUIDs: map(.randomUUID) };
# ----- random String ---------------
def letter:
. % 52
| [if . < 26 then . + 65 else . + 71 end] # ASCII: A...Z: 65-90, a...z: 97-122
| implode;
def nextRandomString($state; $minLength; $maxLength):
nextRandomNumber($state)
| (try (.randomNumber % ($maxLength - $minLength + 1) + $minLength) catch $minLength) as $length
| nextRandomNumbers(.newState; $length)
| .newState as $newState
| .randomNumbers
| map(letter)
| join("")
| { newState: $newState,
randomString: . };
def nextRandomStrings($state; $count; $minLength; $maxLength):
[foreach range($count) as $x (nextRandomString($state; $minLength; $maxLength); nextRandomString(.newState; $minLength; $maxLength); .)]
| { newState: .[-1].newState,
randomStrings: map(.randomString) };
# ----- example usage ---------------
nextRandomNumber($initialRandomNumber) # see output 1
# nextRandomNumbers($initialRandomNumber; 3) # see output 2
# nextRandomUUID($initialRandomNumber) # see output 3
# nextRandomUUIDs($initialRandomNumber; 3) # see output 4
# nextRandomString($initialRandomNumber; 10; 15) # see output 5
# nextRandomStrings($initialRandomNumber; 3; 6; 10) # see output 6
# nextRandomNumber($initialRandomNumber) | nextRandomNumbers(.newState; 3) # see output 7
'
Outputs
output 1: generate pseudo-random number
{"newState":912028498,"randomNumber":13916}
output 2: generate 3 pseudo-random numbers
{"newState":677282016,"randomNumbers":[10202,20943,6980]}
output 3: generate random UUID
{"newState":1188119770,"randomUUID":"cdcda95b-af57-1303-da72-d21c6e7b1861"}
output 4: generate 3 random UUIDs
{"newState":907540185,"randomUUIDs":["855c1445-b529-4301-a535-20cb298feaff","5b685e49-8596-830e-f56a-0a22c43c4c32","35fed6d8-d72b-2833-fd6f-f99154358067"]}
output 5: generate random Strings with length 10-15
{"newState":1037126684,"randomString":"SJadqPGkERAu"}
output 6: generate 3 random Strings with length 6-10
{"newState":316121190,"randomStrings":["eNKxechu","XPkvNg","TIABHbYCxB"]}
output 7: using newState for a second call to generate 3 random numbers
{"newState":808494511,"randomNumbers":[26045,16811,12336]}
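For readers who want to check the recurrence outside jq, here is a minimal Python sketch of the same linear congruential step (the constants are the ones used by nextRandomNumber above). The function names are my own. Unlike the jq nextRandomNumbers, this sketch does not discard the first draw, so its results are not meant to reproduce the sample outputs above.

```python
def next_random_number(state):
    """One LCG step: returns (new_state, 15-bit pseudo-random number)."""
    new_state = (214013 * state + 2531011) % 2**31  # mod 2^31
    return new_state, new_state // 65536            # value in 0..32767


def next_random_numbers(state, count):
    """Draw `count` numbers, threading the state through each step."""
    numbers = []
    for _ in range(count):
        state, number = next_random_number(state)
        numbers.append(number)
    return state, numbers


if __name__ == "__main__":
    # The seed 0 is only an illustration; any non-negative integer works.
    print(next_random_numbers(0, 2))
```

As in the jq version, the caller keeps the returned state and passes it to the next call; nothing is stored globally.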

jq currently has no support for UUID generation, so your best bet would be to feed UUIDs into jq, e.g. along these lines:
ruby -e 'require "securerandom"; p SecureRandom.uuid' | jq '{uuid: .}'
{
"uuid": "5657dd65-a495-4487-9887-c7f0e01645c9"
}
The PRNG contributions for jq have unfortunately not yet made their way into an official release. For examples of PRNG generators written in jq, see e.g. rosettacode:
https://rosettacode.org/wiki/Linear_congruential_generator#jq
Reading from an unbounded stream of UUIDs
Assuming the availability of a uuid generator such as uuidgen, you could use input or inputs along the following lines:
jq -nR '[range(0;10) | input]' < <(while true; do uuidgen ; done)
(Notice that an OS pipe has been avoided here.)
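If generating the UUIDs outside jq is acceptable anyway, the tagging step itself can also be done in the generating language. The following is only a sketch in Python, assuming the records arrive as one JSON object per line; the function name tag_records and the field name "uuid" are assumptions chosen for illustration.

```python
import json
import uuid


def tag_records(json_lines):
    """Yield each JSON object with a fresh random (version 4) UUID added."""
    for line in json_lines:
        record = json.loads(line)
        record["uuid"] = str(uuid.uuid4())  # "uuid" is an assumed field name
        yield record


if __name__ == "__main__":
    for tagged in tag_records(['{"id": 1}', '{"id": 2}']):
        print(json.dumps(tagged))
```

The output of such a script can still be piped into jq for any further filtering.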

Related

sum the odd numbers in line from file.txt bash

Hello, I'd like to ask how to sum the odd numbers on every line of an input file.
The input file looks like this:
4 1 8 3 7
2 5 8 2 7
4 7 2 5 2
0 2 5 3 5
3 6 3 1 6
The output must be:
11
12
12
13
7
The code should start like this:
read -p "Enter file name:" filename
while read line
do
...
Here is my code. What's wrong with it?
#!/bin/sh
read -p "Enter file name:" filename
while read line
do
sum = 0
if ($_ % 2 -nq 0){
sum = sum + $_
}
echo $sum
sum = 0
done <$filename

The logic in your question seems correct, so I'll assume the problem is that you're not sure how to do this line by line, as you stated in your comment.
if ($_ % 2 -nq 0){
sum = sum + $_
}
I think this is a good place for a function. It takes a string containing integers as input and returns the sum of all odd numbers in that string, or -1 if the string contains no integers or only even numbers.
function Sum-OddNumbers {
[cmdletbinding()]
param(
[parameter(mandatory,ValueFromPipeline)]
[string]$Line
)
process {
[regex]::Matches($Line,'\d+').Value.ForEach{
begin {
$result = 0
}
process {
if($_ % 2) {
$result += $_
}
}
end {
if(-not $result) {
return -1
}
return $result
}
}
}
}
Usage
@'
4 1 8 3 7
2 5 8 2 7
4 7 2 5 2
0 2 5 3 5
3 6 3 1 6
2 4 6 8 10
asd asd asd
'@ -split '\r?\n' | Sum-OddNumbers
Result
11
12
12
13
7
-1
-1

If that's how your txt file is set up, you can use Get-Content and a bit of logic to accomplish this.
Get-Content will read the file line by line (unless -Raw is specified), which we can pipe to a Foreach-Object to have the current line in the iteration split by the white space.
Then, we can evaluate the newly formed array (due to splitting the white space, leaving the numbers to create an array).
Finally, just get the sum of the odd numbers.
Get-Content -Path .\input.txt | ForEach-Object {
# Split the current line into an array of just #'s
$OddNumbers = $_.Split(' ').Trim() | Foreach {
if ($_ % 2 -eq 1) { $_ } # odd number filter
}
# Add the filtered results
($OddNumbers | Measure-Object -Sum).Sum
}

"what's wrong here":
while read line
do
sum = 0
if ($_ % 2 -nq 0){
sum = sum + $_
}
echo $sum
sum = 0
done <$filename
First, in sh, spaces are not allowed around the = in an assignment: write sum=0, not sum = 0.
Next, the if syntax is wrong. See https://www.gnu.org/software/bash/manual/bash.html#index-if
See also https://www.gnu.org/software/bash/manual/bash.html#Shell-Arithmetic
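To pin down the per-line logic the shell loop is meant to implement, here is a small reference sketch. It is written in Python rather than sh, purely to show the expected result for each line; the function name sum_odd_numbers is my own.

```python
def sum_odd_numbers(line):
    """Return the sum of the odd integers on one whitespace-separated line."""
    return sum(n for n in map(int, line.split()) if n % 2 != 0)


if __name__ == "__main__":
    # Sample lines taken from the question.
    for line in ["4 1 8 3 7", "2 5 8 2 7", "4 7 2 5 2", "0 2 5 3 5", "3 6 3 1 6"]:
        print(sum_odd_numbers(line))
```

A corrected shell version needs the same two steps inside the while loop: iterate over the fields of $line, and add a field to sum only when its remainder modulo 2 is nonzero.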

Grep variable in for loop

I want to grep a specific line on each iteration of a for loop. I've already searched the internet for an answer to my problem and tried the suggestions, but they don't seem to work for me, and I can't find what I'm doing wrong.
Here is the code:
for n in 2 4 6 8 10 12 14 ; do
for U in 1 10 100 ; do
for L in 2 4 6 8 ; do
i=0
cat results/output_iteration/occ_"$L"_"$n"_"$U"_it"$i".dat
for k in $(seq 1 1 $L) ; do
${'var'.$k}=`grep " $k " results/output_iteration/occ_"$L"_"$n"_"$U"_it"$i".dat | tail -n 1`
done
which gives me:
%
%
% site density double occupancy
1 0.49791021 0.03866179
2 0.49891438 0.06077808
3 0.50426102 0.05718336
4 0.49891438 0.06077808
./run_deviation_functionL.sh: line 109: ${'var'.$k}=`grep " $k " results/output_iteration/occ_"$L"_"$n"_"$U"_it"$i".dat | tail -n 1`: bad substitution
Then, I would like to take only the density number, with something like:
${'density'.$k}=`echo "${'var'.$k:10:10}" | bc -l`
Does anyone know why it fails?

Use declare to create variable names from variables:
declare density$k="`...`"
Use the variable indirection to retrieve them:
var=var$k
echo ${!var:10:10}

Using Selenium IDE, how to get the value of dynamic variable

I'm using Selenium IDE 2.8 and I'm trying to store values. Here are my commands:
store | ayman | val1
store | 1 | n
store | val${n} | e
How can I echo the value that e refers to, which should be ayman? When I try:
echo | ${e}
I get echo | val1
What is the issue with my commands?
Thanks

From what you've done there, the value of 'e' is not ayman; you have stored ayman as the variable 'val1'. I'm not 100% sure what you're trying to do here, but it looks like you're trying to store two individual variables and then combine them into one as well. If that is the case, then what you need is this:
store | ayman | val1
store | 1 | n
store | ${val1}${n} | e
in which case:
val1 = ayman
n = 1
e = ayman1

It sounds like you are trying to force an array-type structure (val[1], val[2]), because what you want e to be is ${val${n}}, right? Except that doesn't work. You could do this in JavaScript, though, with storeEval:
storeEval | storedVars['val'+storedVars['n']] | final

Data management with several variables

Currently I am facing the following problem, which I'm working in Stata to solve. I have added the algorithm tag, because it's mainly the steps that I'm interested in rather than the Stata code.
I have some variables, say, var1 - var20 that can possibly contain a string. I am only interested in some of these strings, let us call them A,B,C,D,E,F, but other strings can occur also (all of these will be denoted X). Also I have a unique identifier ID. A part of the data could look like this:
ID | var1 | var2 | var3 | .. | var20
1 | E | | | | X
1 | | A | | | C
2 | X | F | A | |
8 | | | | | E
Now I want to create an entry for every ID and for every occurrence of one of the strings A,B,C,E,D,F in any of the variables. The above data should look like this:
ID | var1 | var2 | var3 | .. | var20
1 | E | | | .. |
1 | | A | | |
1 | | | | | C
2 | | F | | |
2 | | | A | |
8 | | | | | E
Here we ignore every time there's a string X that is NOT A,B,C,D,E or F. My attempt so far was to create a variable that for each entry counts the number, N, of occurrences of A,B,C,D,E,F. In the original data above that variable would be N=1,2,2,1. Then for each entry I create N duplicates of this. This results in the data:
ID | var1 | var2 | var3 | .. | var20
1 | E | | | | X
1 | | A | | | C
1 | | A | | | C
2 | X | F | A | |
2 | X | F | A | |
8 | | | | | E
My problem is: how do I proceed from here? And sorry for the poor title, but I couldn't word it any more specifically.

Sorry, I thought the final block was your desired output (now I understand that it's what you've accomplished so far). You can get the middle block with two calls to reshape (long, then wide).
First I'll generate data to match yours.
clear
set obs 4
* ids
generate n = _n
generate id = 1 in 1/2
replace id = 2 in 3
replace id = 8 in 4
* generate your variables
forvalues i = 1/20 {
generate var`i' = ""
}
replace var1 = "E" in 1
replace var1 = "X" in 3
replace var2 = "A" in 2
replace var2 = "F" in 3
replace var3 = "A" in 3
replace var20 = "X" in 1
replace var20 = "C" in 2
replace var20 = "E" in 4
Now the two calls to reshape.
* reshape to long, keep only desired obs, then reshape to wide
reshape long var, i(n id) string
keep if inlist(var, "A", "B", "C", "D", "E", "F")
tempvar long_id
generate int `long_id' = _n
reshape wide var, i(`long_id') string
The first reshape converts your data from wide to long. The var specifies that the variables you want to reshape to long all start with var. The i(n id) specifies that each unique combination of n and id is a unique observation. The reshape call provides one observation for each n-id combination for each of your var1 through var20 variables, so now there are 4*20=80 observations. Then I keep only the strings that you'd like to keep with inlist().
For the second reshape call var specifies that the values you're reshaping are in variable var and that you'll use this as the prefix. You wanted one row per remaining letter, so I made a new index (that has no real meaning in the end) that becomes the i index for the second reshape call (if I used n-id as the unique observation, then we'd end up back where we started, but with only the good strings). The j index remains from the first reshape call (variable _j) so the reshape already knows what suffix to give to each var.
These two reshape calls yield:
. list n id var1 var2 var3 var20
     +-------------------------------------+
     | n   id   var1   var2   var3   var20 |
     |-------------------------------------|
  1. | 1    1      E                       |
  2. | 2    1             A                |
  3. | 2    1                            C |
  4. | 3    2             F                |
  5. | 3    2                    A         |
     |-------------------------------------|
  6. | 4    8                            E |
     +-------------------------------------+
You can easily add back variables that don't survive the two reshapes.
* if you need to add back dropped variables
forvalues i =1/20 {
capture confirm variable var`i'
if _rc {
generate var`i' = ""
}
}
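Since the question asks mainly about the steps rather than the Stata code, the net effect of the two reshapes can also be sketched language-independently. The following Python sketch is not the reshape implementation; it only illustrates the resulting row-splitting logic, with rows represented as dictionaries and empty cells simply omitted. The names WANTED and split_rows are my own.

```python
WANTED = {"A", "B", "C", "D", "E", "F"}


def split_rows(rows, wanted=WANTED):
    """Emit one output row per wanted string, keeping only the ID and that cell."""
    result = []
    for row in rows:
        for name, value in row.items():
            if name != "ID" and value in wanted:
                result.append({"ID": row["ID"], name: value})
    return result


# The example data from the question; empty cells are left out.
rows = [
    {"ID": 1, "var1": "E", "var20": "X"},
    {"ID": 1, "var2": "A", "var20": "C"},
    {"ID": 2, "var1": "X", "var2": "F", "var3": "A"},
    {"ID": 8, "var20": "E"},
]

if __name__ == "__main__":
    for out in split_rows(rows):
        print(out)
```

Going long and filtering corresponds to the inner loop with the membership test; assigning a fresh index and going wide corresponds to emitting each surviving cell as its own row.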

Calculate mean, variance and range using Bash script

Given a file file.txt:
AAA 1 2 3 4 5 6 3 4 5 2 3
BBB 3 2 3 34 56 1
CCC 4 7 4 6 222 45
Does anyone have any ideas on how to calculate the mean, variance and range for each item, i.e. AAA, BBB, CCC respectively, using a Bash script? Thanks.

Here's a solution with awk, which calculates:
minimum = smallest value on each line
maximum = largest value on each line
average = μ = sum of all values on each line, divided by the count of the numbers.
variance = 1/n × [(Σx)² - Σ(x²)] where
n = number of values on the line = NF - 1 (in awk, NF = number of fields on the line)
(Σx)² = square of the sum of the values on the line
Σ(x²) = sum of the squares of the values on the line
awk '{
min = max = sum = $2; # Initialize to the first value (2nd field)
sum2 = $2 * $2 # Running sum of squares
for (n=3; n <= NF; n++) { # Process each value on the line
if ($n < min) min = $n # Current minimum
if ($n > max) max = $n # Current maximum
sum += $n; # Running sum of values
sum2 += $n * $n # Running sum of squares
}
print $1 ": min=" min ", avg=" sum/(NF-1) ", max=" max ", var=" ((sum*sum) - sum2)/(NF-1);
}' filename
Output:
AAA: min=1, avg=3.45455, max=6, var=117.273
BBB: min=1, avg=16.5, max=56, var=914.333
CCC: min=4, avg=48, max=222, var=5253
Note that you can save the awk script (everything between, but not including, the single-quotes) in a file, say called script, and execute it with awk -f script filename

You can use python:
$ AAA() { echo "$@" | python -c 'from sys import stdin; nums = [float(i) for i in stdin.read().split()]; print(sum(nums)/len(nums))'; }
$ AAA 1 2 3 4 5 6 3 4 5 2 3
3.45454545455

Part 1 (mean):
mean () {
len=$#
echo $* | tr " " "\n" | sort -n | head -n $(((len+1)/2)) | tail -n 1
}
nMean () {
echo -n "$1 "
shift
mean $*
}
mean usage:
nMean AAA 3 4 5 6 3 4 3 6 2 4
AAA 4
Part 2 (variance):
variance () {
count=$1
avg=$2
shift
shift
sum=0
for n in $*
do
diff=$((avg-n))
quad=$((diff*diff))
sum=$((sum+quad))
done
echo $((sum/count))
}
sum () {
form="$(echo $*)"
formula=${form// /+}
echo $((formula))
}
nVariance () {
echo -n "$1 "
shift
count=$#
s=$(sum $*)
avg=$((s/$count))
var=$(variance $count $avg $*)
echo $var
}
usage:
nVariance AAA 3 4 5 6 3 4 3 6 2 4
AAA 1
Part 3 (range):
range () {
min=$1
max=$1
for p in $* ; do
(( $p < $min )) && min=$p
(( $p > $max )) && max=$p
done
echo $min ":" $max
}
nRange () {
echo -n "$1 "
shift
range $*
}
usage:
nRange AAA 1 2 3 4 5 6 3 4 5 2 3
AAA 1 : 6
nX is short for named X: named mean, named variance, and so on.
Note that I use integer arithmetic, which is what the shell supports. To use floating-point arithmetic, you would use bc, for instance. Here you lose precision, which might be acceptable for big natural numbers.
Process all 3 commands for an input line:
processLine () {
nVariance $*
nMean $*
nRange $*
}
Read the data from a file, line by line:
# data:
# AAA 1 2 3 4 5 6 3 4 5 2 3
# BBB 3 2 3 34 56 1
# CCC 4 7 4 6 222 45
while read line
do
processLine $line
done < data
update:
Contrary to my expectation, it doesn't seem easy to handle an unknown number of arguments with functions in bc, for example min (3, 4, 5, 2, 6).
But the need to call bc can be reduced to 2 places, if the input are integers. I used a precision of 2 ("scale=2") - you may change this to your needs.
variance () {
count=$1
avg=$2
shift
shift
sum=0
for n in $*
do
diff="($avg-$n)"
quad="($diff*$diff)"
sum="($sum+$quad)"
done
# echo "$sum/$count"
echo "scale=2;$sum/$count" | bc
}
nVariance () {
echo -n "$1 "
shift
count=$#
s=$(sum $*)
avg=$(echo "scale=2;$s/$count" | bc)
var=$(variance $count $avg $*)
echo $var
}
The rest of the code can stay the same. Please verify that the formula for the variance is correct - I used what I had in mind:
For values (1, 5, 9), I sum up (15) divide by count (3) => 5.
Then I create the diff to the avg for each value (-4, 0, 4), build the square (16, 0, 16), sum them up (32) and divide by count (3) => 10.66
Is this correct, or do I need a square root somewhere ;) ?
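Note that I had to correct the mean calculation: it now uses sort -n (numeric) and (len+1)/2. Strictly speaking, this function returns the median (the middle element of the sorted values), not the arithmetic mean; for 1, 5, 9 both happen to be 5. For the arithmetic mean, divide the sum by the count, as nVariance does for avg.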

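There is an error in the accepted answer that causes the variance to be miscalculated. In the print statement:
", var=" ((sum*sum) - sum2)/(NF-1)
should be (for the sample variance, with NF-1 values per line because the first field is the label):
", var=" (sum2 - ((sum*sum)/(NF-1)))/(NF-2)
Also, it is better to use something like Welford's algorithm to calculate the variance; the sum-of-squares formula in the accepted answer is numerically unstable when the variance is small relative to the mean. The following example reads a line that contains only the numbers, without a label: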
foo="1 2 3 4 5 6 3 4 5 2 3";
awk '{
M = 0;
S = 0;
for (k=1; k <= NF; k++) {
x = $k;
oldM = M;
M = M + ((x - M)/k);
S = S + (x - M)*(x - oldM);
}
var = S/(NF - 1);
print " var=" var;
}' <<< $foo
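For comparison, here is a sketch of the same single-pass update in Python, extended to handle the labelled lines from the question and to report the range as well. The function names welford and summarize are my own, and the sketch computes the sample variance (dividing by n - 1), as the awk code above does.

```python
def welford(values):
    """Return (mean, sample variance) using Welford's single-pass update."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count < 2:
        raise ValueError("sample variance needs at least two values")
    return mean, m2 / (count - 1)


def summarize(line):
    """Parse 'LABEL v1 v2 ...' and return label, min, max, mean and variance."""
    label, *fields = line.split()
    values = [float(f) for f in fields]
    mean, variance = welford(values)
    return {"label": label, "min": min(values), "max": max(values),
            "mean": mean, "variance": variance}


if __name__ == "__main__":
    print(summarize("AAA 1 2 3 4 5 6 3 4 5 2 3"))
```

For the AAA line the sum is 38 and the sum of squares is 154, so the sample variance is (154 - 38²/11)/10 = 25/11, which is what both this sketch and the corrected awk expression give.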
