Is there a drawback in using rxjs for readonly collection manipulation - linq

I need to run Min and Max operations on an array I get from the server side.
I am new to the rxjs extensions, but that library is really meant to observe changes on a collection. In my case it is just a ONE-time calculation on a collection that is not changed any further until I do a server-side refresh of the data.
I just want to use the right tool for the right job, so I ask: is it correct to use rxjs here, or is that shooting at flies with bombs?
Or should I rather use a library like https://github.com/ENikS/LINQ to get the Min/Max value of a collection?

There is a LINQ implementation, IxJS, that is developed and maintained by the same team that develops RxJS. This might be the right tool for you.
However, you could go with RxJS as well. When using Rx.Observable.from([1, 2, ...]), the execution is synchronous upon subscription.
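As a rough sketch of that approach (assuming RxJS 4/5-style operators, where min() and max() are available on Observable):
// Sketch only: min/max over a finite, synchronous observable (RxJS 4/5-style API assumed).
const source = Rx.Observable.from([2, 4, 23, 1, 0, 34]);
source.min().subscribe(v => console.log('min:', v)); // fires once the source completes
source.max().subscribe(v => console.log('max:', v));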
I would use IxJS however:
// An array of values.. (just creating some random ones here)
const values = [2, 4, 23, 1, 0, 34, 56, 2, 3, 45, 98, 6, 3];
// Create an enumerable from the array
const valEnum = Ix.Enumerable.fromArray(values);
const min = valEnum.min();
const max = valEnum.max();
Working example on jsfiddle.

https://github.com/ENikS/LINQ uses all the latest language features and is theoretically much faster than IxJS. The last edit on IxJS is three years old, and ECMAScript 2015 (ECMA-262, 6th edition) introduced a few very important advancements and speed improvements.
It also has better compliance with the standard LINQ API and can operate on any collection implementing the iterable protocol, including strings, maps, typed arrays, and so on. IxJS can only query array types.
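A minimal sketch of using it for this task (the package name linq-es2015 and the PascalCase Min/Max operators are assumptions based on that repository's README):
// Sketch: Min/Max via ENikS/LINQ (package name and API assumed from the repo README).
const linq = require('linq-es2015');
const values = [2, 4, 23, 1, 0, 34, 56, 2, 3, 45, 98, 6, 3];
const min = linq.asEnumerable(values).Min();
const max = linq.asEnumerable(values).Max();
console.log(min, max); // 0 98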

Parameters for dlib::find_min_bobyqa

I'm working on the C++ version of Matt Zucker's Page dewarping. So far everything works fine, but I have a problem with optimization. In line 748 of the GitHub repo, Matt uses the optimize function from SciPy. My C++ equivalent is find_min_bobyqa from dlib.net. The code is:
auto f = [&](const column_vector& ppts) { return objective(dstpoints, ppts, keypoint_index); };
dlib::find_min_bobyqa(f,
    params,
    2 * params.nr() + 1,                              // npt - number of interpolation points: x.size() + 2 <= npt && npt <= (x.size()+1)*(x.size()+2)/2
    dlib::uniform_matrix<double>(params.nr(), 1, -2), // lower bound constraint
    dlib::uniform_matrix<double>(params.nr(), 1, 2),  // upper bound constraint
    1,                                                // initial trust region radius
    1e-5,                                             // stopping trust region radius
    4000                                              // max number of objective function evaluations
);
In my concrete example, params is a dlib::column_vector with double values and length = 189. Every element of params is less than 2.0 and greater than -2.0. The function objective() returns a double value, and on its own it works properly, because I get the same value as in the Python version. But after running the find_min_bobyqa function I usually get the message:
terminate called after throwing an instance of 'dlib::bobyqa_failure': return from BOBYQA because the objective function has been called max_f_evals times.
I set max_f_evals to quite a big value to see if it optimizes at all, but it doesn't. I did some tweaking with the parameters, but without good results. How should I set the parameters of find_min_bobyqa to get the right solution?
I am very interested in this issue as well. Zucker's work, with very minor tweaks, is ideal for straightening sheet music images, and I was looking for ways to implement it on a mobile platform when I came across your question.
My research so far suggests that BOBYQA is not the equivalent of Powell's method in scipy. BOBYQA is constrained, and the one in scipy is not.
See these links for more information, and a possible way to compile the right supporting library - I would try UOBYQA or NEWUOA.
https://github.com/jacobwilliams/PowellOpt
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#rdd2e1855725e-3
(See the Notes section)
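For reference, the SciPy call being ported is the unconstrained form, roughly like this (a sketch; the toy objective and zero start vector below just stand in for the real ones):
# Sketch: the unconstrained Powell call in SciPy that the dlib port is trying to match.
import numpy as np
from scipy.optimize import minimize

x0 = np.zeros(189)  # same length as params in the question
result = minimize(lambda x: np.sum((x - 0.5) ** 2), x0, method='Powell')  # toy objective
print(result.x[:3])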
EDIT: see C version here:
https://github.com/emmt/Algorithms/tree/master/newuoa
I wanted to post this as a comment, but I don't have enough points for that.
I am very interested in your progress. If you're willing, please keep me posted.
I finally solved this problem. I used the PRAXIS library, because it doesn't need derivative information and is fast.
I modified the code a little to my needs, and now it is around a few seconds faster than the original version written in Python.

Iteratively populate dataframes using a for loop in Julia

I am looking for a way to iteratively populate a dataframe in Julia.
I have a working function that creates multiple points along a line:
# function to draw QMD lines
using DataFrames

function make_lines(qmd)
    BA = Float64[]
    TPA = Float64[]
    QMD = Int[]
    for i in stk_percent
        tpa = 1 * (i * 10) / (a[1] + a[2] * (-0.259 + 0.973 * qmd) + a[3] * qmd^2)
        ba = pi * (qmd / 24)^2 * tpa
        push!(TPA, tpa)
        push!(BA, ba)
        push!(QMD, qmd)
    end
    return DataFrame(TPA=TPA, BA=BA, QMD=QMD)
end
The next step I am trying to accomplish is to run the make_lines function in a loop, using a pre-defined set of inputs, with all the outputs in one single dataframe, but I cannot get it to work:
dia = [7, 8, 10, 12, 14, 16, 18, 20, 22]
# can't get for loop to append all the data frames?
for i in dia
    df = DataFrame(TPA=Float64[], BA=Float64[], QMD=Int[])
    append!(df, make_lines(i))
    return df
end
At first I thought it was how I was using DataFrames; I had never used push! etc. before, but I got this code chunk to work:
# this works to combine dataframes
test = make_lines(22)
test2 = make_lines(8)
test[:]
append!(test, test2)
So why, when I run the for loop, do I end up with only the last dataframe it produces?
Am I misinterpreting something? From what I have read, DataFrames in Julia work differently than data frames in R, but I cannot wrap my head around how to get this working.
You are pretty close, but there are a couple of places where you are getting tripped up in your code. You currently have:
dia = [7, 8, 10, 12, 14, 16, 18, 20, 22]
# can't get for loop to append all the data frames?
for i in dia
    df = DataFrame(TPA=Float64[], BA=Float64[], QMD=Int[])
    append!(df, make_lines(i))
    return df
end
This isn't quite what you want for two reasons:
One: This snippet isn't a function, so it doesn't make sense (and will cause problems) to have return in it.
Two: At each step in your loop, you are re-creating your dataframe df from scratch, erasing everything that you put in it before. This is why, as you say, you only end up with the last data frame it produces. Instead, you would want something like:
dia = [7, 8, 10, 12, 14, 16, 18, 20, 22]
df = DataFrame(TPA=Float64[], BA=Float64[], QMD=Int[])
for i in dia
    append!(df, make_lines(i))
end
Note: I couldn't get a completely working version of your code going - the objects stk_percent and a in your main function never get defined, so I didn't really know what to put in for those. But I believe that if you fix these issues you'll likely be in a better spot (I made up some values for them and it worked fine).
Performance tip: when you do fix those, my recommendation would be to pass them as explicit arguments to your function, as sketched below. Although it will still work if they are just variables in the global scope, that leads to suboptimal performance of your code, both now and in the future, and potentially to worse things, like confusing the scope of variables or having their values change when you don't want them to. Best to start off from the beginning of your journey with Julia adopting as many best practices as is practicable.
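A minimal sketch of that change, keeping the structure of the original function (the example values for stk_percent and a at the bottom are made up):
# Sketch: the former globals become explicit arguments (example values below are made up).
using DataFrames

function make_lines(qmd, stk_percent, a)
    TPA = Float64[]
    BA = Float64[]
    QMD = Int[]
    for i in stk_percent
        tpa = 1 * (i * 10) / (a[1] + a[2] * (-0.259 + 0.973 * qmd) + a[3] * qmd^2)
        push!(TPA, tpa)
        push!(BA, pi * (qmd / 24)^2 * tpa)
        push!(QMD, qmd)
    end
    return DataFrame(TPA=TPA, BA=BA, QMD=QMD)
end

df = make_lines(10, [30, 60, 90], [0.1, 0.2, 0.3])  # hypothetical inputs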
I managed to create a blank dataframe by providing the type of each variable and the column names:
df = DataFrame([DateTime; fill(Float64, 2); String; fill(Float64, 2)],
               ["Date", "A", "B", "Letter", "C", "D"])
Then I can append the results to populate the new dataframe by using the rename! and append! functions inside the for loop, roughly as sketched below.
This is very useful for large datasets with numerous columns.
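Something like this (a sketch; compute_chunk is a hypothetical stand-in for whatever produces each iteration's DataFrame):
# Sketch: align each chunk's column names with df, then append (compute_chunk is made up).
for i in 1:10
    chunk = compute_chunk(i)      # hypothetical: returns a DataFrame with matching column types
    rename!(chunk, names(df))     # make the column names match df's
    append!(df, chunk)
end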

SuperCollider patterns library: how to get a reference to the synths' nodeIDs?

Patterns library question:
How can I get a reference to the Synth that is created by a Pbind?
For instance,
Pbind(
    \type, myCustomSynthDef,
    \midinote, Pseq([60, 62, 64], inf),
    \dur, 0.5
).play
gets me a repeating do-re-mi sequence. If I'd like to change some modulation parameter on the synth that plays 're', how can I get that synth's nodeID into a variable?
To control the "re" synth, you would normally put some extra parameters into the Pbind and then simply use them in the synth. E.g. add
\craziness, Pseq([0, 100, 0], inf)
to your Pbind, and add something in your SynthDef that uses it.
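For instance (a sketch: the SynthDef body and the way craziness is used here are invented; the point is just that the argument name matches the Pbind key):
(
SynthDef(\myCustomSynthDef, { |out = 0, freq = 440, gate = 1, amp = 0.2, craziness = 0|
    var env = EnvGen.kr(Env.asr(0.01, 1, 0.2), gate, doneAction: 2);
    var mod = SinOsc.kr(craziness).range(0.5, 1);  // made-up use of the parameter
    Out.ar(out, SinOsc.ar(freq * mod) * env * amp ! 2);
}).add;
)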
If you really really want to know the nodeID (bleh, not pleasant) then you don't use Pattern.play. I guess you could iterate the pattern manually (e.g. using .next) and manually call .play on each Event in that iteration. When you call the Event's .play it returns an Event that has the node ID inside, stored in the id key.
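A rough, untested sketch of that manual iteration (note it plays all the events immediately, ignoring the \dur scheduling that Pattern.play would do):
(
var stream = Pbind(
    \midinote, Pseq([60, 62, 64], 1),
    \dur, 0.5
).asStream;
var event, played;
while({ (event = stream.next(Event.default)).notNil }, {
    played = event.play;   // play each Event ourselves instead of Pattern.play
    played[\id].postln;    // the node ID(s) are stored under the \id key
});
)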

Python3 Make tie-breaking lambda sort more pythonic?

As an exercise in Python lambdas (just so I can learn how to use them more properly), I gave myself an assignment: sort some strings based on something other than their natural string order.
I scraped Apache for version-number strings and then came up with a lambda to sort them based on numbers I extracted with regexes. It works, but I think it can be better; I just don't know how to improve it so it's more robust.
from lxml import html
import requests
import re

# Send GET request to the page and parse it into a list of html links
jmeter_archive_url = 'https://archive.apache.org/dist/jmeter/binaries/'
jmeter_archive_get = requests.get(url=jmeter_archive_url)
page_tree = html.fromstring(jmeter_archive_get.text)
list_of_links = page_tree.xpath('//a[@href]/text()')

# Filter out all the non-md5s. There are a lot of links, and ultimately
# it's more data than needed for this exercise.
jmeter_md5_list = list(filter(lambda x: x.endswith('.tgz.md5'), list_of_links))

# Here's where the 'magic' happens. We use two different regexes to rip the first
# and then the second number out of the string and turn them into integers. We
# then return them in the order we grabbed them, allowing us to tie-break.
jmeter_md5_list.sort(key=lambda val: (int(re.search(r'(\d+)\.\d+', val).group(1)),
                                      int(re.search(r'\d+\.(\d+)', val).group(1))))
print(jmeter_md5_list)
This does have the desired effect. The output is:
['jakarta-jmeter-2.5.1.tgz.md5', 'apache-jmeter-2.6.tgz.md5', 'apache-jmeter-2.7.tgz.md5', 'apache-jmeter-2.8.tgz.md5', 'apache-jmeter-2.9.tgz.md5', 'apache-jmeter-2.10.tgz.md5', 'apache-jmeter-2.11.tgz.md5', 'apache-jmeter-2.12.tgz.md5', 'apache-jmeter-2.13.tgz.md5']
So we can see that the strings are sorted into an order that makes sense: lowest version first and highest version last. The immediate problems I see with my solution are two-fold.
First, we have to create two different regexes to get the numbers we want instead of just capturing groups 1 and 2, mainly because I know there are no multiline lambdas and I don't know how to reuse a single regex object instead of creating a second one.
Secondly, this only works as long as the version numbers are two numbers separated by a single period. The first element is 2.5.1, which is sorted into the correct place, but the current method wouldn't know how to tie-break for 2.5.2 or 2.5.3, or for any string with an arbitrary number of version points.
So it works, but there's got to be a better way to do it. How can I improve this?
This is not a full answer, but it will get you far along the road to one.
The return value of the key function can be a tuple, and tuples sort naturally. You want the output from the key function to be:
((2, 5, 1), 'jakarta-jmeter')
((2, 6), 'apache-jmeter')
etc.
Do note that this is a poor use case for a lambda regardless.
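To see why a tuple works as the key, note that Python compares tuples element-wise, left to right, with a shorter tuple sorting first when it is a prefix of a longer one:
# Tuples compare element-wise, so version tuples of any length order correctly:
print((2, 5) < (2, 5, 1) < (2, 6) < (2, 10))  # True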
Originally, I came up with this:
jmeter_md5_list.sort(key=lambda val: list(map(int, re.compile(r'(\d+(?!$))').findall(val))))
However, based on Ignacio Vazquez-Abrams's answer, I made the following changes.
def sortable_key_from_string(value):
    version_tuple = tuple(map(int, re.compile(r'(\d+(?!$))').findall(value)))
    match = re.match(r'^(\D+)', value)
    version_name = ''
    if match:
        version_name = match.group(1)
    return (version_tuple, version_name)
and this:
jmeter_md5_list.sort(key=sortable_key_from_string)

Problem converting a Matrix to Data Frame in R (R thinks all numeric types are factors)

I am passing data from C# to R over a COM interface. When the data arrives in R, it is housed in a matrix. Some of the functions that I use require the data to be inside a data frame instead. I convert the data structure using
newDataFrame <- as.data.frame(oldMatrix)
The table of data reaches R just fine; once I make the conversion to the data frame, however, it assumes all of my numeric data are factors!
So it turns {34, 46, 90, 54, 69, 54} into {1, 2, 3, 4, 5, 4}.
My data table DOES have factors in it, though, so I can't just force the whole thing to be numeric. Is there any way around this? Note: I can't export the data as a CSV onto the filesystem and read it into R manually.
On a side note, the function I am using that requires a data frame is in the 'Hmisc' package:
hist.data.frame(dataFrame)
This produces a frequency histogram for every column of data in the data frame and arranges them all in a grid pattern (quite nifty)!
Thanks!
-Dave
I think you have mis-diagnosed the problem - all columns in a matrix must be of the same type, so this is likely to be where the problem arises, not the conversion to a data frame.
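A quick way to see this in plain base R (cbind coerces everything to character as soon as one column is character):
# Everything in a matrix shares one storage type, so mixing kinds coerces silently:
m <- cbind(c(34, 46, 90), c("a", "b", "c"))
str(m)          # chr [1:3, 1:2] "34" "46" "90" "a" "b" "c"
class(m[1, 1])  # "character"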
I've had this problem before. You need to set stringsAsFactors=FALSE when you create the data frame (or when you read the data).
Then you can convert individual variables/columns to whatever type they should be (e.g. with as.numeric(), as.factor() and the like), without worrying about how the numbers are treated.
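A minimal sketch of that fix applied to the conversion above (the column names weight and group are made up for illustration):
# Sketch: suppress the automatic factor conversion, then coerce columns one by one.
newDataFrame <- as.data.frame(oldMatrix, stringsAsFactors = FALSE)
newDataFrame$weight <- as.numeric(newDataFrame$weight)  # made-up numeric column
newDataFrame$group  <- as.factor(newDataFrame$group)    # made-up factor column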
