Using Sensors command for Lua to use in generating rings


I am trying to reverse engineer a conky/Lua script to tailor it to my machine. The problem is that hwmon does not pick up the correct value, so I resorted to using sensors to detect the CPU temperature. I was able to use it in Lua to create an if/then/else function that changes colour depending on the temperature; however, I could not get it to work with the "ring".
This is what I have currently:
local cpu_temp = io.popen("sensors -u | awk '/temp1_input:/ {print $2; exit}'"):read("*l")
local cpu0_temp = tonumber(cpu_temp:read('*a'))
settings_table = {
{
--1
name='hwmon',
arg='cpu0_temp',
max=110,
bg_colour=0x3b3b3b,
bg_alpha=0.8,
fg_colour=0x165cc4,
fg_alpha=0.8,
x=185, y=125,
radius=120,
thickness=4,
start_angle=0,
end_angle=240
},
}
function temp_watch()
warn_value=60
crit_value=80
local file= io.popen("sensors -u | awk '/temp1_input:/ {print $2; exit}'")
local temperature = tonumber(file:read('*a'))
if temperature<warn_value then
settings_table[1]['fg_colour']=normal
elseif temperature<crit_value then
settings_table[1]['fg_colour']=warn
else
settings_table[1]['fg_colour']=crit
end
file:close()
end
function conky_main()
temp_watch()
end
I am no expert in Lua; I patterned the first two lines on what I have learned and on this question.
What I get is this error:
conky: get_sysfs_info(): read from failed
conky: can't open '': No such file or directory
The original settings table looks like this:
settings_table = {
{
--1
name='hwmon',
arg='1 temp 1',
max=110,
bg_colour=0x3b3b3b,
bg_alpha=0.8,
fg_colour=0x165cc4,
fg_alpha=0.8,
x=185, y=125,
radius=120,
thickness=4,
start_angle=0,
end_angle=240
},
With that table, the ring is stuck at 50 degrees Celsius.


Parsing multiple instances of data

I am trying to parse multiple instances of data from a text file. I can grep and grab one line and the lat/lon associated with that match, but I am having issues parsing multiple instances:
... CATEGORICAL ...
SLGT 33618675 34608681 35658642 36668567 38218542 41018363
41588227 41918045 41377903 40177805 38927813 37817869
36678030 35068154 33368262 33078321 32888462 33618675
SLGT 30440169 31710202 33010185 33730148 34010037 33999962
33709892 32869871 30979883 29539912 29430025 30440169
SLGT 41788755 41698893 42069059 42639132 43889124 44438960
44438757 43988717 43278708 42398720 41788755
MRGL 42897922 41907743 40147624 38837627 37637700 35897915
35028021 34038079 33118130 31998226 31698419 32078601
32818733 33848809 34758764 36998623 38588677 39458701
40178757 40608870 41069099 43549479 44499512 44809478
45259379 44989263 45109100 45718986 46478920 46758853
46738752 46398664 44768565 44308457 43198218
MRGL 29720174 31900221 33650181 34160154 34430032 34649931
34159800 32539784 31359767 29739808 29299723 28969581
28959440 99999999 26769674 26579796 26139874
TSTM 45077438 43177245 40597113 99999999 30488085 30248563
29588926 28739072 28569092 99999999 27138160 27578139
27908100 27848061 27518032 26968006 26338005 25698017
25338025 25088048 25058071 25238109 25578128 25888157
26218171 26578170 26988163 27138160 99999999 29200399
31910374 33520340 35190229 35450147 36109944 36399709
35779395 36399167 38559059 40189373 41729594 43029985
42820283 42860489 43580863 44121062 44521135 45281179
46271166 47561286 48251548 48671765 49051814 99999999
38810245 37660271 37120322 36950398 37090559 37380662
38090741 39410791 39980777 40930695 41380598 41370510
41190353 40840299 40220263 38810245
From: https://www.spc.noaa.gov/products/outlook/archive/2019/KWNSPTSDY1_201906241300.txt
Here is my code and results:
#!/bin/sh
sed -n '/^MRGL/,/^TSTM/p;/^TSTM/q' day1_status | sed '$ d' | sed -e 's/MRGL//g' > MRGL
while read line
do
count=1
ncols=$(echo $line | wc -w)
while [ $count -le $ncols ]
do
echo $line | cut -d' ' -f$count
((count++))
done
done < MRGL > MRGL_output.txt
cat MRGL_output.txt | sed ':a;s/\B[0-9]\{2\}\>/.&/;ta'| sed 's/./, -/6' > MRGL_final
Results:
one instance of MRGL and the lat/lon associated with that polygon
more MRGL
32947889 34137855 35307825 36147735 36327622 35797468
27107968 25518232 99999999 27088303 28418215 30208125
30618064
Turn the line above into a single instance of lines
more MRGL_output.txt
32947889
34137855
35307825
36147735
36327622
35797468
27107968
25518232
99999999
27088303
28418215
30208125
30618064
Final format that I need it in
more MRGL_final
32.94, -78.89
34.13, -78.55
35.30, -78.25
36.14, -77.35
36.32, -76.22
35.79, -74.68
27.10, -79.68
25.51, -82.32
99.99, -99.99
27.08, -83.03
28.41, -82.15
30.20, -81.25
30.61, -80.64
Just need to parse multiple instances that show up.
UPDATE for better explanation.
... CATEGORICAL ...
ENH 38298326 40108202 40518094 40357974 39907953 39017948
38038052 36148202 35848297 35888367 36618371 38298326
SLGT 30440169 31710202 33010185 33730148 34010037 33999962
33709892 32869871 30979883 29539912 29430025 30440169
SLGT 33548672 34408661 35918543 36858496 38648520 41018363
41588227 41918045 41377903 40177805 38927813 37817869
36678030 35068154 33368262 33078321 32888462 33548672
SLGT 41788755 41698893 42069059 42639132 43889124 44438960
44438757 43988717 43278708 42398720 41788755
MRGL 29720174 31900221 33650181 34160154 34430032 34649931
34159800 32539784 31359767 30059748 29299723 28969581
28959440 99999999 26769674 26579796 26139874
MRGL 42897922 41907743 40147624 38837627 37637700 35897915
35028021 34038079 33118130 31938225 30758424 30678620
30988709 34128741 36208583 37738554 39508601 40628878
41069099 43549479 44499512 44809478 45259379 44989263
45109100 45718986 46478920 46758853 46738752 46398664
44768565 44308457 43198218
TSTM 30488085 29978211 29408316 29068379 99999999 27138160
27578139 27908100 27848061 27518032 26968006 26338005
25698017 25338025 25088048 25058071 25238109 25578128
25888157 26218171 26578170 26988163 27138160 99999999
45427410 43217292 40247181 99999999 28650405 31910374
33520340 35190229 35450147 36109944 36399709 35779395
36769245 38319148 40189373 41219571 41299753 39959979
38220054 37320091 36560136 36070290 36100295 35840394
36790544 37150626 37880709 39110774 40120876 41150895
41600769 41890540 43070599 43580863 43390914 43401262
44171458 45521497 46131301 47181242 47561286 48251548
48671765 49371856
I want to take the data set above, grab the lat/lon pairs for each available risk (ENH, SLGT, MRGL, TSTM), and place them into this format:
"Enhanced Risk"
38.29, -83.26
40.10, -82.02
40.51, -80.94
40.35, -79.74
39.90, -79.53
39.01, -79.48
38.03, -80.52
36.14, -82.02
35.84, -82.97
35.88, -83.67
36.61, -83.71
38.29, -83.26
End:
"Slight Risk"
30.44, -101.69
31.71, -102.02
33.01, -101.85
33.73, -101.48
34.01, -100.37
33.99, -99.62
33.70, -98.92
32.86, -98.71
30.97, -98.83
29.53, -99.12
29.43, -100.25
30.44, -101.69
End:
"Slight Risk"
33.54, -86.72
34.40, -86.61
35.91, -85.43
36.85, -84.96
38.64, -85.20
41.01, -83.63
41.58, -82.27
41.91, -80.45
41.37, -79.03
40.17, -78.05
38.92, -78.13
37.81, -78.69
36.67, -80.30
35.06, -81.54
33.36, -82.62
33.07, -83.21
32.88, -84.62
33.54, -86.72
End:
"Slight Risk"
41.78, -87.55
41.69, -88.93
42.06, -90.59
42.63, -91.32
43.88, -91.24
44.43, -89.60
44.43, -87.57
43.98, -87.17
43.27, -87.08
42.39, -87.20
41.78, -87.55
End:
"Marginal Risk"
29.72, -101.74
31.90, -102.21
33.65, -101.81
34.16, -101.54
34.43, -100.32
34.64, -99.31
34.15, -98.00
32.53, -97.84
31.35, -97.67
30.05, -97.48
29.29, -97.23
28.96, -95.81
28.95, -94.40
26.76, -96.74
26.57, -97.96
26.13, -98.74
End:

Here's a little awk program which seems to work, although I'm not certain about some of the details. In particular, I don't know what the minimum value for longitude is; evidently, a value under the minimum has 100 added to it before the longitude is negated. So you'll have to change LON_THRESHOLD to what you consider the correct value.
I've tried to avoid the usual temptation to golf awk programs into a textual minimum, in the hopes that the way this program works is less obscure. But it's entirely possible that some awkisms snuck in anyway. I added a bit of explanation at the end.
BEGIN { risk["HIGH"] = "High Risk"
risk["ENH"] = "Enhanced Risk"
risk["SLGT"] = "Slight Risk"
risk["MRGL"] = "Marginal Risk"
LON_THRESHOLD = 30
END_STRING = "End:"
}
END { if (in_risk) print END_STRING }
in_risk && substr($0, 1, 1) != " " {
print END_STRING "\n" "\n"
in_risk = 0
}
$1 in risk { printf("\"%s\"\n", risk[$1])
in_risk = 2
}
in_risk { for (i = in_risk; i <= NF; ++i) {
lat = substr($i, 1, 4) / 100
lon = substr($i, 5, 4) / 100
if (lon < LON_THRESHOLD) lon += 100
printf "%5.2f, %.2f\n", lat, -lon
}
in_risk = 1
}
Save that program as, for example, noaa.awk, and then apply it with:
awk -f noaa.awk input.txt
By way of explanation:
Awk programs consist of a series of rules. Each rule has a predicate -- that is, an expression which evaluates to a true or false value -- and an action.
Awk processes each line from its input in turn, running through all of the rules and executing the actions of the ones whose predicates evaluate to a true value. Inside the action, you can use the $ operator to access individual fields in the input (by default, fields are separated with whitespace). $0 is the entire input line, and $n is field number n. Unlike bash/sh, $ is an operator and can be applied to an expression.
BEGIN and END rules are special, in that they are not real predicates. BEGIN rules are executed exactly once, before any other processing; END rules are executed exactly once after all processing is finished. In this example, as is common, BEGIN is used to initialise reference data, while END is used for any necessary termination -- in this case, printing the final End: line.
In cases like this, where the desired action is really dependent on where we are in the file, it's necessary to build some kind of state machine, and I did that using the variable in_risk, which has three possible values:
0 or undefined: We're not currently in a block corresponding to a risk selector.
1: The current line, if it starts with a space, is part of a previously identified risk selector.
2: The current line has been detected as starting with a risk selector.
The reason for the difference between the last two values is that $1 in a line which starts with a risk selector is the risk selector, whereas in a line which starts with a space, $1 is actually the first number. So when we're iterating over the numbers in a line, we have to start with $2 for lines which start with a risk selector.
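For readers more comfortable outside awk, the same state machine can be sketched in Python. This is only an illustration under the same assumptions as the awk program: the function name parse_outlook is hypothetical, LON_THRESHOLD is still a guess, and continuation lines are assumed to start with a space, as they do in the original NOAA file.

```python
# Risk selectors and their labels, copied from the awk program above.
RISK_NAMES = {
    "HIGH": "High Risk",
    "ENH": "Enhanced Risk",
    "SLGT": "Slight Risk",
    "MRGL": "Marginal Risk",
}
LON_THRESHOLD = 30  # assumption carried over from the awk program


def parse_outlook(lines):
    """Return a list of (risk label, [(lat, lon), ...]) blocks.

    parse_outlook is a hypothetical name used for this sketch only.
    """
    blocks = []
    current = None  # None plays the role of in_risk == 0
    for line in lines:
        fields = line.split()
        if not line.startswith(" "):
            # A line that does not start with a space ends any open block.
            current = None
            if fields and fields[0] in RISK_NAMES:
                current = (RISK_NAMES[fields[0]], [])
                blocks.append(current)
                fields = fields[1:]  # skip the selector, like starting at $2
        if current is None:
            continue
        for token in fields:
            lat = int(token[:4]) / 100
            lon = int(token[4:8]) / 100
            if lon < LON_THRESHOLD:
                lon += 100
            current[1].append((lat, -lon))
    return blocks
```

Each block could then be printed with its label, one "lat, lon" pair per line, and a closing End: line. As in the awk version, TSTM lines are skipped because TSTM is not in the table of risk names.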

If you're just asking how to turn a file of lines like AABBCCDD into lines like AA.BB, -CC.DD:
perl -nE '/^(..)(..)(..)(..)$/ && say "$1.$2, -$3.$4"' MRGL_output.txt
(There are almost certainly better ways to get from your original input to those lines, but I'm not really clear on what your posted code is doing or why.)
I think this will process your original input correctly, but can't be sure because the numbers in your sample output don't match up with your sample input so I can't verify:
perl -anE 'if (/^MRGL/ .. /^TSTM/) { exit if /^TSTM/; push @nums, @F }
END { for (@nums) {
if (/^(..)(..)(..)(..)$/) { say "$1.$2, -$3.$4" }
}}' day1_status

Got GNU Awk?
awk -v RS='\\s+' '
/[A-Z]/ {p = /^MRGL$/? 1: 0; next}
p {print gensub(/(..)(..)(..)(..)/, "\\1.\\2, -\\3.\\4", "G")}
' file
-v RS='\\s+' - Use any amount of whitespace as the Record Separator
/[A-Z]/ {...} - On records with uppercase alphabetics, do
p = /^MRGL$/? 1: 0; next - Set flag if record is MRGL, else unset, but always skip any other rules.
p {print gensub(...)} - Print result of gensub if flag is set
/(...)/, "\\1", "G" - Capturing groups, Backreferences, Global substitution.

Scan each line of a file for a match (|##|) and, if found, add a double quote in front of the next line

I need your help and suggestions on the requirement below.
Read each line and look for '|##|' (this is my end-of-line marker); if it is found, insert a double quote (") at the beginning of the next line.
Input file #1:
101|*|101-C-01071|*|FIRSTLINE|##|
102|*|DOUBT and Stage go to
and admin|*|SIMCA-ON|##|
103|*|101-C-01071|*|VEEVA|##|
Output expected:
101|*|101-C-01071|*|FIRSTLINE|##|
"102|*|DOUBT and Stage go to
and admin|*|SIMCA-ON|##|
"103|*|101-C-01071|*|VEEVA|##|
Input file #2:
101|*|101-C-01071|*|Integrity Testing of Bleed Back Filter (Emflon II ) Installed on ETC Autoclave|*|PUBLIC|*||*|Y|*|Y|*||*|CFC6E8227CDB0A7AE040800AA5644B19|*|mdarling|*|2007.06.22 09:59:03|##|
102|*|102-CC-10|*|Sterilization and Leak Rate Testing of Stokes Style Freeze Drier|*|PUBLIC|*||*|Y|*|Y|*||*|CFC6E8227CDC0A7AE040800AA5644B19|*|mdarling|*|2007.06.22 10:00:05|##|
103|*|103-CC-13|*|CIP/SIP Leak Testing Hull 380sq. ft. Freeze Drier|*|PUBLIC|*||*|Y|*|Y|*||*|CFC6E8227CDD0A7AE040800AA5644B19|*|mdarling|*|2007.06.22 10:00:16|##|
104|*|104-CC-15|*|Emergency Situations & Alarms Involving the Automated Hull Freeze Drier Control|*|PUBLIC|*||*|Y|*|Y|*||*|CFC6E8227CDE0A7AE040800AA5644B19|*|mdarling|*|2007.06.22 10:00:32|##|
Results:
101|*|101-C-01071|*|Integrity Testing of Bleed Back Filter (Emflon II ) Installed on ETC Autoclave|*|PUBLIC|*||*|Y|*|Y|*||*|CFC6E8227CDB0A7AE040800AA5644B19|*|mdarling|*|2007.06.22 09:59:03|##|
"102|*|102-CC-10|*|Sterilization and Leak Rate Testing of Stokes Style Freeze Drier|*|PUBLIC|*||*|Y|*|Y|*||*|CFC6E8227CDC0A7AE040800AA5644B19|*|mdarling|*|2007.06.22 10:00:05|##|
103|*|103-CC-13|*|CIP/SIP Leak Testing Hull 380sq. ft. Freeze Drier|*|PUBLIC|*||*|Y|*|Y|*||*|CFC6E8227CDD0A7AE040800AA5644B19|*|mdarling|*|2007.06.22 10:00:16|##|
"104|*|104-CC-15|*|Emergency Situations & Alarms Involving the Automated Hull Freeze Drier Control|*|PUBLIC|*||*|Y|*|Y|*||*|CFC6E8227CDE0A7AE040800AA5644B19|*|mdarling|*|2007.06.22 10:00:32|##|

Sed works:
sed -n ':a;/|##|/{p;n;s/^/"/;ba;};p;'
Live example available at tutorialpoints.
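The sed one-liner is terse, so here is a rough Python sketch of the same idea: remember whether the previous line contained the terminator and, if so, prefix the current line with a double quote. The function name quote_after_terminator is hypothetical, and the behaviour is modelled on the expected output for input file #1, where every line that follows a |##| line is quoted.

```python
TERMINATOR = "|##|"


def quote_after_terminator(lines):
    """Prefix a double quote to each line that follows a terminator line.

    Hypothetical helper for illustration; lines are given without newlines.
    """
    result = []
    previous_had_terminator = False
    for line in lines:
        # Quote this line only if the line before it ended a record.
        result.append('"' + line if previous_had_terminator else line)
        previous_had_terminator = TERMINATOR in line
    return result
```

Lines that continue a record, such as "and admin|*|SIMCA-ON|##|", are left alone because the line before them did not contain the terminator.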

How about an awk solution:
> cat in_file
101|*|101-C-01071|*|FIRSTLINE|##|
102|*|DOUBT and Stage go to
and admin|*|SIMCA-ON|##|
103|*|101-C-01071|*|VEEVA|##|
> awk -F" " ' { if($0~/\|##\|/) { print; getline; print "\x22"$0 } else { print } } ' in_file
101|*|101-C-01071|*|FIRSTLINE|##|
"102|*|DOUBT and Stage go to
and admin|*|SIMCA-ON|##|
"103|*|101-C-01071|*|VEEVA|##|
>
EDIT1:
> cat double_quote.txt
101||101-C-01071||Integrity Testing of Bleed Back Filter (Emflon II ) Installed on ETC
Autoclave||PUBLIC||||Y||Y||||CFC6E8227CDB0A7AE040800AA5644B19||mdarling||2007.06.22
09:59:03|##| 102||102-CC-10||Sterilization and Leak Rate Testing of Stokes Style Freeze
Drier||PUBLIC||||Y||Y||||CFC6E8227CDC0A7AE040800AA5644B19||mdarling||2007.06.22
10:00:05|##| 103||103-CC-13||CIP/SIP Leak Testing Hull 380sq. ft. Freeze
Drier||PUBLIC||||Y||Y||||CFC6E8227CDD0A7AE040800AA5644B19||mdarling||2007.06.22
10:00:16|##| 104||104-CC-15||Emergency Situations & Alarms Involving the Automated Hull Freeze
Drier Control||PUBLIC||||Y||Y||||CFC6E8227CDE0A7AE040800AA5644B19||mdarling||2007.06.22
10:00:32|##| strong text
>
> awk -F" " ' { if($0~/\|##\|/) { print; pre=$0;getline; if(pre!=$0) {print "\x22"$0 } } else { print } } ' double_quote.txt
101||101-C-01071||Integrity Testing of Bleed Back Filter (Emflon II ) Installed on ETC
Autoclave||PUBLIC||||Y||Y||||CFC6E8227CDB0A7AE040800AA5644B19||mdarling||2007.06.22
09:59:03|##| 102||102-CC-10||Sterilization and Leak Rate Testing of Stokes Style Freeze
"Drier||PUBLIC||||Y||Y||||CFC6E8227CDC0A7AE040800AA5644B19||mdarling||2007.06.22
10:00:05|##| 103||103-CC-13||CIP/SIP Leak Testing Hull 380sq. ft. Freeze
"Drier||PUBLIC||||Y||Y||||CFC6E8227CDD0A7AE040800AA5644B19||mdarling||2007.06.22
10:00:16|##| 104||104-CC-15||Emergency Situations & Alarms Involving the Automated Hull Freeze
"Drier Control||PUBLIC||||Y||Y||||CFC6E8227CDE0A7AE040800AA5644B19||mdarling||2007.06.22
10:00:32|##| strong text
>

comparing 2 data sets possibly with concurrency/asynchronous/parallel approach

I am currently trying to improve upon an existing mechanism (to compare data from 2 sources, implemented in perl5) and would like to use perl6 instead.
My target data volume range is about 20-30 GB in uncompressed flat files.
In terms of lines, a file can contain anywhere from 18 million to 28 million lines.
It has around 40-50 columns per line.
I do this type of data reconciliation on a daily basis and it can take about ~10 minutes to read from a file and populate the hash. ~20 minutes spent to read both files and to populate hash.
comparison process takes about ~30-50 minutes including iterating over hash, collecting desired result(s), and writing to output file (csv,psv).
All in all it can take anywhere between 30 minutes to 60 minutes on a 32 core dual xeon cpu server with 256gb of RAM, including intermittent server load, to perform the process.
Now I am trying to bring down the total processing time even further.
Here is my current single threaded approach using perl5.
fetch data from 2 sources (let's say s1 and s2) one by one and populate my hash based on key-value pairs. Source of data could be either a flat csv or psv file OR a database query Array of Array result, via DBI client. Data is always unsorted to start with.
To be specific, I read the file line by line,split fields, and choose desired indexes for key,value pair and insert into hash.
After collecting data and populating the hash with the desired key/value pairs, I start to compare and collect results (mainly comparing what is missing or different in s2 w.r.t. s1 and vice versa).
dump output in an excel file (very costly if no. of lines is large like ~1 million or greater) or in a simple CSV (cheap operation. preferred method).
I was wondering whether I could somehow do the first step in parallel, i.e. collect data from both sources at once and populate my global hash, and then proceed to compare and dump output?
What options can perl6 provide to deal with this situation? I have read about concurrency, asynchronous and parallel operations using perl6 but I am not so certain which one can help me here.
I would really appreciate any general guidance on the matter. I hope I have explained my problem well, but sadly I don't have much to show for what I have tried so far, because I am only just beginning to tackle this one. I am unable to see past the single-threaded approach and need some help.
Thanks.
EDIT
As my existing problem statement has been deemed by the community as 'too broad' - allow me to attempt to highlight my pain points below:
I would like to do file comparison by utilizing all 32 cores if possible. I am just not able to come up with a strategy or initial idea.
What type of new techniques are available or applicable with perl6 in order to tackle this problem or type of problem.
If I spawn 2 processes to read file(s) and collect data - is it possible to get the result back as an array or hash?
Is it possible to compare the data (stored in hash) in parallel?
My current p5 comparison logic is shown below for your reference. I hope this helps and keeps the question from being closed.
package COMP;
use strict;
use Data::Dumper;
sub comp
{
my ($data,$src,$tgt) = @_;
my $result = {};
my $ms = ($result->{ms} = {});
my $mt = ($result->{mt} = {});
my $diff = ($result->{diff} = {});
foreach my $key (keys %{$data->{$src}})
{
my $src_val = $data->{$src}{$key};
my $tgt_val = $data->{$tgt}{$key};
next if ($src_val eq $tgt_val);
if (!exists $data->{$tgt}{$key}) {
push (@{$mt->{$key}}, "$src_val|NULL");
}
if (exists $data->{$tgt}{$key} && $src_val ne $tgt_val) {
push (@{$diff->{$key}}, "$src_val|$tgt_val")
}
}
foreach my $key (keys %{$data->{$tgt}})
{
my $src_val = $data->{$src}{$key};
my $tgt_val = $data->{$tgt}{$key};
next if ($src_val eq $tgt_val);
if (!exists $data->{$src}{$key}) {
push (@{$ms->{$key}},"NULL|$tgt_val");
}
}
return $result;
}
1;
If someone would like to try it out, here is the sample output and the test script used.
script output
[User@Host:]$ perl testCOMP.pl
$VAR1 = {
'mt' => {
'Source' => [
'source|NULL'
]
},
'ms' => {
'Target' => [
'NULL|target'
]
},
'diff' => {
'Sunday_isit' => [
'Yes|No'
]
}
};
Test Script
[User@Host:]$ cat testCOMP.pl
#!/usr/bin/env perl
use lib $ENV{PWD};
use COMP;
use strict;
use warnings;
use Data::Dumper;
my $data2 = {
f1 => {
Amitabh => 'Bacchan',
YellowSun => 'Yes',
Sunday_isit => 'Yes',
Source => 'source',
},
f2 => {
Amitabh => 'Bacchan',
YellowSun => 'Yes',
Sunday_isit => 'No',
Target => 'target',
},
};
my $result = COMP::comp ($data2,'f1','f2');
print Dumper $result;
[User@Host:]$

If you have an existing and working toolchain, you don't have to rewrite it all to use Perl 6. Its parallelism mechanisms work fine with external processes too. Consider
allnum.pl6
use v6;
my @processes =
[ "num1.txt", "num2.txt", "num3.txt", "num4.txt", "num5.txt" ]
.map( -> $filename {
[ $filename, run "perl", "num.pl", $filename, :out ];
})
.hyper;
say "Lazyness Here!";
my $time = time;
for @processes
{
say "<{$_[0]} : {$_[1].out.slurp}>";
}
say time - $time, "s";
num.pl
use warnings;
use strict;
my $file = shift @ARGV;
my $start = time;
my $result = 0;
open my $in, "<", $file or die $!;
while (my $thing = <$in>)
{
chomp $thing;
$thing =~ s/ //g;
$result = ($result + $thing) / 2;
}
print $result, " : ", time - $start, "s";
On my system
C:\Users\holli\tmp>perl6 allnum.pl6
Lazyness Here!
<num1.txt : 7684.16347578616 : 3s>
<num2.txt : 3307.36261498186 : 7s>
<num3.txt : 5834.32817942962 : 10s>
<num4.txt : 6575.55944995197 : 0s>
<num5.txt : 6157.63100049619 : 0s>
10s
Files were set up like so
C:\Users\holli\tmp>perl -e "for($i=0;$i<10000000;$i++) { print chr(32) ** 100, int(rand(1000)), chr(32) ** 100, qq(\n); }">num1.txt
C:\Users\holli\tmp>perl -e "for($i=0;$i<20000000;$i++) { print chr(32) ** 100, int(rand(1000)), chr(32) ** 100, qq(\n); }">num2.txt
C:\Users\holli\tmp>perl -e "for($i=0;$i<30000000;$i++) { print chr(32) ** 100, int(rand(1000)), chr(32) ** 100, qq(\n); }">num3.txt
C:\Users\holli\tmp>perl -e "for($i=0;$i<400000;$i++) { print chr(32) ** 100, int(rand(1000)), chr(32) ** 100, qq(\n); }">num4.txt
C:\Users\holli\tmp>perl -e "for($i=0;$i<5000;$i++) { print chr(32) ** 100, int(rand(1000)), chr(32) ** 100, qq(\n); }">num5.txt
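The two steps the question asks about (load both sources at once, then compare) can also be sketched in Python, which is neither the asker's language nor this answer's. Treat it as a structural illustration only: load_pairs, load_both and compare are hypothetical names, the inputs are assumed to be pipe-separated lines with the key in the first field and the value in the second, and compare is written to give the same mt/ms/diff result as COMP::comp on the sample data.

```python
from concurrent.futures import ThreadPoolExecutor


def load_pairs(lines, key_index=0, value_index=1, sep="|"):
    """Build a key -> value dict from delimited lines (hypothetical loader)."""
    table = {}
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        table[fields[key_index]] = fields[value_index]
    return table


def load_both(source_lines, target_lines):
    """Load the two inputs concurrently and return both dicts."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        src_future = pool.submit(load_pairs, source_lines)
        tgt_future = pool.submit(load_pairs, target_lines)
        return src_future.result(), tgt_future.result()


def compare(src, tgt):
    """Collect keys missing in target (mt), missing in source (ms), and differing (diff)."""
    result = {"mt": {}, "ms": {}, "diff": {}}
    for key, src_val in src.items():
        if key not in tgt:
            result["mt"][key] = [f"{src_val}|NULL"]
        elif tgt[key] != src_val:
            result["diff"][key] = [f"{src_val}|{tgt[key]}"]
    for key, tgt_val in tgt.items():
        if key not in src:
            result["ms"][key] = [f"NULL|{tgt_val}"]
    return result
```

In CPython, threads mostly overlap time spent waiting on I/O; if splitting the lines is the bottleneck, swapping in ProcessPoolExecutor from the same module is the usual alternative, at the cost of copying the resulting dicts back to the parent process.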

Reverse Geocoding in Bash using GPS Position from exiftool

I am writing a bash script that renames JPG files based on their EXIF tags. My original files are named like this:
IMG_2110.JPG
IMG_2112.JPG
IMG_2113.JPG
IMG_2114.JPG
I need to rename them like this:
2015-06-07_11-21-38_iPhone6Plus_USA-CA-Los_Angeles_IMG_2110.JPG
2015-06-07_11-22-41_iPhone6Plus_USA-CA-Los_Angeles_IMG_2112.JPG
2015-06-13_19-05-10_iPhone6Plus_Morocco-Fez_IMG_2113.JPG
2015-06-13_19-12-55_iPhone6Plus_Morocco-Fez_IMG_2114.JPG
My bash script uses exiftool to parse the EXIF header and rename the files. For those files that do not contain an EXIF create date, I am using the file modification time.
#!/bin/bash
IFS=$'\n'
for i in *.*; do
MOD=`stat -f %Sm -t %Y-%m-%d_%H-%M-%S "$i"`
model=$( exiftool -f -s3 -"Model" "${i}" )
datetime=$( exiftool -f -s3 -"DateTimeOriginal" "${i}" )
stamp=${datetime//:/-}"_"${model// /}
echo ${stamp// /_}$i
done
I am stuck on the location. I need to determine the country and city using the GPS information from the EXIF tag. exiftool provides a field called "GPS Position." Of all the fields, this seems the most useful to determine location.
GPS Position : 40 deg 44' 49.36" N, 73 deg 56' 28.18" W
Google provides a public API for geolocation, but it requires latitude/longitude coordinates in this format:
40.7470444°, -073.9411611°
The API returns quite a bit of information (click the link to see the results):
https://maps.googleapis.com/maps/api/geocode/json?latlng=40.7470444,-073.9411611
My question is:
How do I format the GPS Position to a latitude/longitude value that will provide acceptable input to a service such as Google geolocation?
How do I parse the JSON results to extract just the country and city, in a way that is consistent with many different kinds of locations? Curl, and then? Ideally, I’d like to handle USA locations one way, and non-USA locations, another. USA locations would be formatted USA-STATE-City, whereas non-USA locations would be formatted COUNTRY-City.
I need to do this all in a bash script. I've looked at pygeocoder and gpsbabel but they do not seem to do the trick. There are a few free web tools available but they don't provide an API (http://www.earthpoint.us/Convert.aspx).

Better late than never, right?
So, I just came across the same issue and I've managed to make the conversion using the EXIFTool itself. Try this:
exiftool -n -p '$GPSLatitude,$GPSLongitude' image_name.jpg
The converted coordinates are slightly longer than proposed by Google, but the API accepted it fine.
Cheers.

For #1, the awk should not be that complicated:
awk '/GPS Position/{
lat=$4; lat+=strtonum($6)/60; lat+=strtonum($7)/3600; if($8!="N,")lat=-lat;
lon=$9; lon+=strtonum($11)/60; lon+=strtonum($12)/3600; if($13!="E")lon=-lon;
printf "%.7f %.7f\n",lat,lon
}'
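The arithmetic in that awk snippet is degrees + minutes/60 + seconds/3600, negated for the southern and western hemispheres. A small Python sketch of the same conversion follows; the function name gps_position_to_decimal and the regular expression are assumptions based on the "GPS Position" format quoted in the question, not on any exiftool API.

```python
import re

# Matches one coordinate such as: 40 deg 44' 49.36" N
DMS_PATTERN = re.compile(r"""(\d+)\s+deg\s+(\d+)'\s+([\d.]+)"\s+([NSEW])""")


def gps_position_to_decimal(position):
    """Convert an exiftool-style 'GPS Position' string to (lat, lon).

    Hypothetical helper; expects exactly two coordinates in the string.
    """
    values = []
    for deg, minutes, seconds, hemisphere in DMS_PATTERN.findall(position):
        value = int(deg) + int(minutes) / 60 + float(seconds) / 3600
        if hemisphere in "SW":
            value = -value  # south and west are negative
        values.append(value)
    if len(values) != 2:
        raise ValueError(f"unrecognised GPS position: {position!r}")
    lat, lon = values
    return lat, lon
```

For the position in the question this gives roughly 40.7470444 and -73.9411611, which agrees with the coordinates the asker wants to send to the geocoding service.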

I ended up doing it in PHP, but thanks for the tip Marco I'll check it out!
function get_gps($gps_pos) {
$parts = explode(" ",str_replace(array("deg ",",","'","\""),"",$gps_pos));
$lat_deg = $parts[0];
$lat_min = $parts[1];
$lat_sec = $parts[2];
$lat_dir = $parts[3];
$lon_deg = $parts[4];
$lon_min = $parts[5];
$lon_sec = $parts[6];
$lon_dir = $parts[7];
if ($lat_dir == "N") {
$lat_sin = "+";
} else {
$lat_sin = "-";
}
if ($lon_dir == "E") {
$lon_sin = "+";
} else {
$lon_sin = "-";
}
$latitude = $lat_sin.($lat_deg+($lat_min/60)+($lat_sec/3600));
$longitude = $lon_sin.($lon_deg+($lon_min/60)+($lon_sec/3600));
return $latitude.",".$longitude;
}

From man exiftool (note the last line):
-c FMT (-coordFormat)
Set the print format for GPS coordinates. FMT uses the same syntax
as a "printf" format string. The specifiers correspond to degrees,
minutes and seconds in that order, but minutes and seconds are
optional. For example, the following table gives the output for
the same coordinate using various formats:
FMT Output
------------------- ------------------
"%d deg %d' %.2f"\" 54 deg 59' 22.80" (default for reading)
"%d %d %.8f" 54 59 22.80000000 (default for copying)
"%d deg %.4f min" 54 deg 59.3800 min
"%.6f degrees" 54.989667 degrees
And regarding "There are a few free web tools available but they don't provide an API"—geoapify.com offers a free web tool but also an API. Their API is free for up to three thousand requests per day. Their web service does five hundred at a time.

Comparing many files in Bash

I'm trying to automate a task at work that I normally do by hand, that is taking database output from the permissions of multiple users and comparing them to see what they have in common. I have a script right now that uses comm and paste, but it's not giving me all the output I'd like.
Part of the problem comes in comm only dealing with two files at once, and I need to compare at least three to find a trend. I also need to determine if two out of the three have something in common, but the third one doesn't have it (so comparing the output of two comm commands doesn't work). I need these in comma separated values so it can be imported into Excel. Each user has a column, and at the end is a listing of everything they have in common. comm would work perfectly if it could compare more than two files (and show two-out-of-three comparisons).
In addition to the code I have to clean all the extra cruft off the raw csv file, here's what I have so far in comparing four users. It's highly inefficient, but it's what I know.
cat foo1 | sort > foo5
cat foo2 | sort > foo6
cat foo3 | sort > foo7
cat foo4 | sort > foo8
comm foo5 foo6 > foomp
comm foo7 foo8 > foomp2
paste foomp foomp2 > output2
sed 's/[\t]/,/g' output2 > output4.csv
cat output4.csv
Right now this outputs two users, their similarities and differences, then does the same for another two users and pastes it together. This works better than doing it by hand, but I know I could be doing more.
An example input file would be something like:
User1
Active Directory
Internet
S: Drive
Sales Records
User2
Active Directory
Internet
Pricing Lookup
S: Drive
User3
Active Directory
Internet
Novell
Sales Records
where they have AD and Internet in common, two out of three have sales records access and S: drive permission, only one of each has Novell and Pricing access.
Can someone give me a hand in what I'm missing?

Using GNU AWK (gawk) you can print a table that shows how multiple users' permissions correlate. You could also do the same thing in any language that supports associative arrays (hashes), such as Bash 4, Python, Perl, etc.
#!/usr/bin/awk -f
{
array[FILENAME, $0] = $0
perms[$0] = $0
if (length($0) > maxplen) {
maxplen = length($0)
}
users[FILENAME] = FILENAME
}
END {
pcount = asort(perms)
ucount = asort(users)
maxplen += 2
colwidth = 8
printf("%*s", maxplen, "")
for (u = 1; u <= ucount; u++) {
printf("%-*s", colwidth, users[u])
}
printf("\n")
for (p = 1; p <= pcount; p++) {
printf("%-*s", maxplen, perms[p])
for (u = 1; u <= ucount; u++) {
if (array[users[u], perms[p]]) {
printf("%-*s", colwidth, " X")
} else {
printf("%-*s", colwidth, "")
}
}
printf("\n")
}
}
Save this file, perhaps calling it "correlate", then set it to be executable:
$ chmod u+x correlate
Then, assuming that the filenames correspond to the usernames or are otherwise meaningful (your examples are "user1" through "user3" so that works well), you can run it like this:
$ ./correlate user*
and you would get the following output based on your sample input:
user1 user2 user3
Active Directory X X X
Internet X X X
Novell X
Pricing Lookup X
S: Drive X X
Sales Records X X
Edit:
This version doesn't use asort() and so it should work on non-GNU versions of AWK. The disadvantage is that the order of rows and columns is unpredictable.
#!/usr/bin/awk -f
{
array[FILENAME, $0] = $0
perms[$0] = $0
if (length($0) > maxplen) {
maxplen = length($0)
}
users[FILENAME] = FILENAME
}
END {
maxplen += 2
colwidth = 8
printf("%*s", maxplen, "")
for (u in users) {
printf("%-*s", colwidth, u)
}
printf("\n")
for (p in perms) {
printf("%-*s", maxplen, p)
for (u in users) {
if (array[u, p]) {
printf("%-*s", colwidth, " X")
} else {
printf("%-*s", colwidth, "")
}
}
printf("\n")
}
}
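Since the question asks for comma-separated output that Excel can import, here is a rough Python sketch of the same correlation table written as CSV. The names correlate and to_csv are hypothetical, the input is assumed to be a dict mapping each user to a set of permission strings, and the trailing "Users" column (how many users hold each permission) is an addition that the awk scripts do not produce.

```python
import csv
import io


def correlate(permissions_by_user):
    """Return (header, rows) marking which user holds which permission."""
    users = sorted(permissions_by_user)
    all_perms = sorted(set().union(*permissions_by_user.values()))
    rows = []
    for perm in all_perms:
        marks = ["X" if perm in permissions_by_user[user] else "" for user in users]
        # Last column counts how many users hold this permission.
        rows.append([perm, *marks, marks.count("X")])
    return ["Permission", *users, "Users"], rows


def to_csv(header, rows):
    """Render the table as CSV text; the csv module quotes fields as needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()
```

With the three sample users, a count of 3 marks a permission everyone shares and a count of 2 marks the two-out-of-three cases the question mentions.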

You can use the diff3 program. From the man page:
diff3 - compare three files line by line
Given your sample inputs, above, running diff3 results in:
====
1:3,4c
S: Drive
Sales Records
2:3,4c
Pricing Lookup
S: Drive
3:3,4c
Novell
Sales Records
Does this get you any closer to what you're looking for?

I would use the strings command to strip any binary data from the files, cat them together, sort the result, and then use uniq -c on it to get a count of occurrences of each string. (uniq only collapses adjacent duplicate lines, so the sort is needed.)
