Finding nearest timestamp in Mathematica

In Mathematica I have to find the timestamp closest to a given one. I have:
alltrafotstamps = (DateList[#1]) & @@@ reddata[[All, 1]]
which gives something that looks like a list of timestamps:
{"2017-11-10 21:36:12.135", "2017-11-10 21:36:50.535",
"2017-11-10 21:37:28.935", "2017-11-10 21:38:07.335", ...}
So now I do:
Nearest[alltrafotstamps, DateList["2017-11-10 22:56:50.535"]]
and I get this message:
Nearest::neard: The default distance function does not give a real numeric distance when applied to the point pair 2017 and 2017-11-10 21:36:12.135.
Can it be that Nearest cannot do this for timestamps, but only for times?

alltrafotstamps = {
"2017-11-10 21:36:12.135",
"2017-11-10 21:36:50.535",
"2017-11-10 21:37:28.935",
"2017-11-10 21:38:07.335"};
target = "2017-11-10 21:37:00";
nearest = Nearest[
AbsoluteTime /@ alltrafotstamps,
AbsoluteTime[target]];
DateObject @@ nearest
DateList @@ nearest
{2017, 11, 10, 21, 36, 50.535}

Related

Date and Time Objects in Mathematica

I want to use the DateObject in Mathematica to calculate the difference in times between two cities.
How do I convert the DateObject and TimeObject output to numerical values so I can manipulate them and plot them?
You can obtain numerical values using DateList e.g.
d = Today
t = TimeObject[Now]
o = DateObject[d, t]
{dt, tm} = TakeDrop[DateList[o], 3]
DateString[o, {"DayName", ", ", "DayShort", " ", "MonthName"}]
By putting your code into comments you've made it very difficult to be certain what you have done, for example, it's no surprise that the expression
GeoPosition[Toronto]
is unevaluated and that the enclosing expressions do not do what you want them to do. Nevertheless, guessing at what you might be trying to do ...
If I execute the following:
sList = Table[Sunrise[Entity["City", {"Toronto", "Ontario", "Canada"}],
DateObject[{2022, 1, 9 + i, 0, 0, 0}]], {i, 0, 9}]
my Mathematica (v12.something on Mac) returns a list of 10 DateObjects, starting with
DateObject[{2022, 1, 9, 12, 50}, "Minute", "Gregorian", 0.]
And if I execute
AbsoluteTime /@ sList
MMA returns a list of 10 absolute times.
Now, set
t1 = DateObject[{2022, 1, 9, 12, 50}, "Minute", "Gregorian", 0.]
and, using another city ...
t2 = Sunrise[Entity["City", {"Liverpool", "Liverpool", "UnitedKingdom"}],
DateObject[{2022, 1, 9, 0, 0, 0}]]
then
DateDifference[t1, t2]
returns
Quantity[-0.18472222222222223, "Days"]
You'll notice that I have not bothered with the time zone for the DateObject (irrelevant for calculating differences between cities) and I have not wrapped Toronto in GeoPosition; MMA is smart enough not to need that assistance.
The objective of the file is to calculate the numbers of hours of daylight in Toronto and Edmonton. First, create a counter called "daysElapsed"
daysElapsed=365;
Next, create tables of sunrises and sunsets. Sunrise and Sunset return DateObjects.
sunriseToList =
Table[Sunrise[Entity["City", {"Toronto", "Ontario", "Canada"}],
DateObject[{2022, 1, 1 + i, 0, 0, 0}]], {i, 0, daysElapsed}];
sunsetToList =
Table[Sunset[Entity["City", {"Toronto", "Ontario", "Canada"}],
DateObject[{2022, 1, 1 + i, 0, 0, 0}]], {i, 0, daysElapsed}];
Using the function AbsoluteTime, convert the two lists of DateObjects into absolute times in seconds. This allows you to manipulate the data easily.
sunrisenumTo = AbsoluteTime /@ sunriseToList;
sunsetnumTo = AbsoluteTime /@ sunsetToList;
Subtracting the time of sunrise from the time of sunset gives the total time of daylight.
hoursoflightTo = N[(sunsetnumTo - sunrisenumTo)/60/60];
Repeat the above process for the next city: Edmonton
sunriseEdList =
Table[Sunrise[Entity["City", {"Edmonton", "Alberta", "Canada"}],
DateObject[{2022, 1, 1 + i, 0, 0, 0}]], {i, 0, daysElapsed}];
sunsetEdList =
Table[Sunset[Entity["City", {"Edmonton", "Alberta", "Canada"}],
DateObject[{2022, 1, 1 + i, 0, 0, 0}]], {i, 0, daysElapsed}];
sunrisenumEd = AbsoluteTime /@ sunriseEdList;
sunsetnumEd = AbsoluteTime /@ sunsetEdList;
hoursoflightEd = N[(sunsetnumEd - sunrisenumEd)/60/60];
t = hoursoflightTo - hoursoflightEd;
The first plot below shows the difference in hours of sunlight. As Edmonton is farther north than Toronto, it gets less light in the winter and considerably more in the summer.
plotT = ListLinePlot[hoursoflightTo - hoursoflightEd]
Difference in hours of light between cities
This plots the hours of light for each city over 365 days.
ListPlot[{hoursoflightEd, hoursoflightTo},
ColorFunctionScaling -> True]
Hours of Light

Reorder columns and rows of Holoviews Heatmap based on similarity measure (e.g. cosine similarity etc.)

I was surprised that no one seems to have asked this before.
Assuming I have a pandas dataframe (random example), I can get a heatmap with Holoviews and Bokeh renderer:
rownames = 'ABCDEFGHIJKLMNO'
df = pd.DataFrame(np.random.randint(0,20,size=(20, len(rownames))), columns=list(rownames))
hv.HeatMap({'x': df.columns, 'y': df.index, 'z': df},
kdims=[('x', 'Col Categories'), ('y', 'Row Categories')],
vdims='z').opts(cmap="viridis", width=520, height=520)
The data (x and y) is categorical, therefore the initial order of rows or columns is unimportant. I wanted to sort rows/columns based on some similarity measure.
One way is to use seaborn clustermap:
heatmap_sns = sns.clustermap(df, metric="cosine", standard_scale=1, method="ward", cmap="viridis")
The output looks like this:
Columns and rows have been ordered according to similarity (in this case, cosine based on dot product; others are available such as 'correlation' etc.).
However, I want to display the clustermap in Holoviews. How do I update ordering of the original dataframe from the seaborn matrix?
A much cleaner approach than Alex's answer (the previously accepted answer) is to use the data2d property of the object returned by sns.clustermap(). This property contains the reordered data (i.e. the data after clustering). So:
df_ro = heatmap_sns.data2d
replaces all the following lines:
# get col and row names by ID
colname_list = [df.columns[col_id] for col_id in
heatmap_sns.dendrogram_col.reordered_ind]
rowname_list = [df.index[row_id] for row_id in
heatmap_sns.dendrogram_row.reordered_ind]
# update dataframe
df_ro = df.reindex(rowname_list)
df_ro = df_ro[colname_list]
It is possible to access the indices of reordered columns/rows from the seaborn clustermap using:
> print(f'rows: {heatmap_sns.dendrogram_row.reordered_ind}')
> print(f'columns: {heatmap_sns.dendrogram_col.reordered_ind}')
rows: [5, 0, 13, 2, 18, 7, 4, 16, 12, 19, 14, 15, 10, 3, 8, 6, 17, 11, 1, 9]
columns: [7, 1, 10, 5, 9, 0, 8, 13, 2, 6, 14, 3, 4, 11, 12]
To update row/column order of the original dataframe:
# get col and row names by ID
colname_list = [df.columns[col_id] for col_id in heatmap_sns.dendrogram_col.reordered_ind]
rowname_list = [df.index[row_id] for row_id in heatmap_sns.dendrogram_row.reordered_ind]
# update dataframe
df_ro = df.reindex(rowname_list)
df_ro = df_ro[colname_list]
I've done it here by first getting the names, perhaps there's even a direct way to update columns/rows by indices.
hv.HeatMap({'x': df_ro.columns, 'y': df_ro.index, 'z': df_ro},
kdims=[('x', 'Col Categories'), ('y', 'Row Categories')],
vdims='z').opts(cmap="viridis", width=520, height=520)
Since I have used random data, there's little order in the categories, but still the picture looks a little less noisy. Note that holoviews/df y axis is simply inverse compared to the seaborn clustermap-matrix, that's why the graphic looks flipped.
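On the remark above about updating rows and columns directly by indices: a minimal sketch, assuming the two index lists hold integer positions (as `reordered_ind` does). `DataFrame.iloc` accepts a list of row positions and a list of column positions in one call. The small dataframe and the two orderings below are hypothetical stand-ins, not output from the clustermap above.

```python
import numpy as np
import pandas as pd

# Small stand-in for the random dataframe in the question (fixed seed for repeatability).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 20, size=(4, 3)), columns=list("ABC"))

# Hypothetical orderings; in practice these would be
# heatmap_sns.dendrogram_row.reordered_ind and heatmap_sns.dendrogram_col.reordered_ind.
row_order = [2, 0, 3, 1]
col_order = [1, 2, 0]

# Reorder rows and columns by position in a single step.
df_ro = df.iloc[row_order, col_order]

# The name-based route from the answer, for comparison.
df_by_name = df.reindex([df.index[i] for i in row_order])[
    [df.columns[j] for j in col_order]
]
```

Both routes should yield the same dataframe; the positional form just skips the intermediate name lists.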

How to convert mapinfo projection to proj4

I have some projection strings exported by MapInfo, but I can't find a way to convert them into proj4 strings. Could anyone help me with this?
Here are the strings:
"CoordSys Earth Projection 8, 104, "m", 29, 0, 1, 0, 0"
"CoordSys Earth Projection 8, 150, "m", 27, 0, 1, 0, 0"
Thanks,
Edgar
With a little googling:
+proj=tmerc +ellps=WGS84 +lon_0=29
+proj=tmerc +ellps=WGS84 +lon_0=27
Link:
MapInfo projection and datum types: http://reference1.mapinfo.com/software/mapinfo_pro/english/16.0/MapInfoProUserGuide.pdf
Notes:
WGS84 and Hartebeesthoek datums are coincident.
No need to specify default proj4 parameters.

Mathematica, efficient way to compare dates

I have a list like this:
{{2002, 4, 10}, 9.61}, {{2002, 4, 11}, 9.53}, {{2002, 4, 12}, 9.58},
I need to look up this list to find the exact match of a date; if there is no match, I want the next available date in the list. Here is my code:
Select[history, DateDifference[#[[1]], {2012, 3, 17}] <= 0 &, 1]
This works, but it is a lot slower than just looking for an exact match. Is there a faster way to do this? Thank you very much!
It is true that DateDifference is rather slow. This can be worked around by converting all dates to "absolute times", which in Mathematica means the number of seconds elapsed since 1900 January 1.
Here's an example. This is the data:
data = {AbsoluteTime[#1], #2} & @@@
FinancialData["GOOG", {{2010, 1, 1}, {2011, 1, 1}}];
We're looking for this date or the next one if this is not available:
date = AbsoluteTime[{2010, 8, 1}]
One way to retrieve it is:
data[[1 + LengthWhile[data[[All, 1]], # < date &]]]
You'll find other methods, including an already implemented binary search, in the answers to this question.
finddate[data:{{{_Integer, _Integer, _Integer}, _}..},
date:{_Integer, _Integer, _Integer}] :=
First[Extract[data, (Position[#1, First[Nearest[#1, AbsoluteTime[date]]]] & )[
AbsoluteTime /@ data[[All,1]]]]]
will do what you want.
E.g.,
finddate[{{{2002, 4, 10}, 9.61}, {{2002, 4, 11}, 9.53}, {{2002, 4, 12}, 9.58}},
{2012, 3, 17}]
gives {{2002, 4, 12}, 9.58}
It seems to be reasonably fast ( half a second for 10^5 dates ).
Could you / would it be faster for you to write a binary search, assuming that your history is ordered?
That should give you the date in log(n) comparisons, which is way better than the linear filter you appear to be using now.
It will give you the date if it exists; if the date does not exist, it will give you the point where you would insert the new date.
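The binary-search idea is independent of the language, so here is a minimal sketch in Python rather than Mathematica, assuming the history is already sorted by date. The function name `find_on_or_after` and the sample data are hypothetical; `bisect_left` from the standard library returns exactly the insertion point described above.

```python
from bisect import bisect_left

def find_on_or_after(dates, history, date):
    """Return the entry for `date`, or the next later entry, or None.

    `dates` must be the sorted list of keys of `history`, built once
    so that repeated lookups stay at O(log n) each.
    """
    i = bisect_left(dates, date)  # index of first key >= date
    return history[i] if i < len(history) else None

# Hypothetical sample data; date tuples compare lexicographically.
history = [((2002, 4, 10), 9.61), ((2002, 4, 11), 9.53), ((2002, 4, 15), 9.58)]
dates = [d for d, _ in history]

exact = find_on_or_after(dates, history, (2002, 4, 11))
following = find_on_or_after(dates, history, (2002, 4, 12))
missing = find_on_or_after(dates, history, (2012, 3, 17))
```

Building `dates` once and reusing it is what makes many lookups cheap; rebuilding it per query would bring back the linear cost.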
Fastest thing for many accesses into the same dataset is to create an interpolation function based on the AbsoluteTime[] of the date and the value. If the default swings the wrong way, you can negate all the "seconds" and it'll swing that way.

Mathematica 8.0 efficient way to parse a DateList from a string in the format of yyyyMMddHHmmssSSS

I have the following string: "20110103224832494" it is in the following format: yyyyMMddHHmmssSSS. Where yyyy is the year, MM is the month, dd is the day, HH is the hour, mm is the minutes, ss is the seconds, SSS is the milliseconds.
I thought that the following would have worked:
DateList[{"20110103224832494",
{"Year", "Month", "Day", "Hour", "Minute", "Second", "Millisecond"}}
]
but returns:
DateString::str: String 20110103224832494 cannot be interpreted as a date in
format {Year,Month,Day,Hour,Minute,Second,Millisecond}. >>
And if that worked, would it have been efficient?
Use:
DateList[{"20110103224832494",
Riffle[{"Year",
"Month",
"Day",
"Hour",
"Minute",
"Second",
"Millisecond"}, ""]}]
Corrected to combine milliseconds with seconds.
You did specify "efficient" and I believe this is two orders of magnitude faster than DateList:
stringDynP[s_String, p_] :=
StringTake[s, Thread@{{0}~Join~Most@# + 1, #} &@Accumulate@p]
toDateList[string_String] :=
MapAt[#/1000` &, #, -1] &[
FromDigits /@ stringDynP[string, {4, 2, 2, 2, 2, 5}]
]
toDateList["20110103224832494"]
{2011, 1, 3, 22, 48, 32.494}
stringDynP is a string adaptation of my "Dynamic Partition" function.
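For readers who find the operator-heavy definition hard to follow, the same idea (accumulate the field widths to get end offsets, slice, convert to integers, then divide the final five-digit field by 1000) can be sketched in Python. This is an illustration of the technique only, not a translation tested against Mathematica; the name `to_date_list` simply mirrors the one above.

```python
from itertools import accumulate

def to_date_list(stamp, widths=(4, 2, 2, 2, 2, 5)):
    """Split a yyyyMMddHHmmssSSS string into [y, m, d, h, min, s.mmm]."""
    ends = list(accumulate(widths))   # cumulative end offsets: 4, 6, 8, ...
    starts = [0] + ends[:-1]          # each field starts where the previous one ended
    fields = [int(stamp[a:b]) for a, b in zip(starts, ends)]
    fields[-1] = fields[-1] / 1000    # last field is seconds plus milliseconds
    return fields

result = to_date_list("20110103224832494")
```

The last width is 5 because seconds and milliseconds are read as one integer and then scaled, which matches the {4, 2, 2, 2, 2, 5} partition used above.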
Warning for Mathematica 7 users: the DateList method produces a spurious result:
{2011, 1, 12, 9, 23, 24.094}
Presumably in version 8 the following method can be used:
DateList[
{"20110103224832494",
{"Year", "Month", "Day", "Hour","Minute", "Second", "Millisecond"}},
DateDelimiters -> None
]
