I'm trying to build a regression model with the XGBoost regressor.
Right now inference takes approximately 0.025 seconds, but I'd like to speed things up.
Can someone explain what can influence inference speed? For example, max_depth, the number of trees, the number of features... I don't know much about this topic and I didn't find a satisfying answer on the internet. Thank you!
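For what it's worth, predict time usually scales with the number of trees, their depth, and the number of rows and features per call. Below is a rough benchmark sketch, assuming the scikit-learn-style xgboost.XGBRegressor API and purely synthetic data (all shapes and parameter values are illustrative), that times predict() for a few n_estimators / max_depth combinations:

    import time
    import numpy as np
    import xgboost as xgb

    # Synthetic stand-in data; the shapes are illustrative only.
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(10_000, 50))
    y_train = rng.normal(size=10_000)
    X_test = rng.normal(size=(1_000, 50))

    for n_estimators in (100, 300, 1000):
        for max_depth in (3, 6, 10):
            model = xgb.XGBRegressor(n_estimators=n_estimators, max_depth=max_depth)
            model.fit(X_train, y_train)

            start = time.perf_counter()
            model.predict(X_test)
            elapsed = time.perf_counter() - start
            print(f"n_estimators={n_estimators:5d} max_depth={max_depth:2d} "
                  f"predict time: {elapsed * 1000:.2f} ms")

Predicting many rows per call, rather than one row at a time, usually amortizes the per-call overhead as well.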
I am currently using the CONOPT4 solver to solve a nonlinear programming problem. The nonlinearity is of the form z = x*y and z = x/y, and all variables are continuous. I specified some scaling factors and solving performance improved a lot. However, when I further refined some scaling factors to project the values into the range 0.01 to 100, the solving time became longer, which is really weird. I cannot provide my code here, and I know it's impossible to give a specific reason without the code. Could you share your experience with generally tuning scaling factors when using the CONOPT solver? Thanks a lot.
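For general context only (not tied to the asker's model): scaling a variable usually means substituting a scaled copy whose typical magnitude is near 1 and rescaling the equations that use it. An illustrative substitution, assuming x typically takes values around 10^4:

    % Illustrative factors only; assume x is typically of order 10^4.
    x = 10^{4}\,\hat{x} \;\Rightarrow\; \hat{x} \in [0.01,\,100]
    z = x\,y = 10^{4}\,\hat{x}\,y \;\Rightarrow\; \hat{z} := z / 10^{4} = \hat{x}\,y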
I have some data about programmer actions inside an IDE. From this data I am trying to make a good algorithm to calculate a programmer's efficiency.
If we consider
efficiency = useful energy out / energy in
I made this rough equation:
energy in = active time (run events x code editing time)
Basically, it's the time where stuff is actually being done by the programmer, multiplied by run events (debugging, builds, etc.) times the time where the programmer is actually editing code.
useful energy out = energy in - (#unsuccessful builds + aborted test runs + debugger use time)
Useful energy out is basically energy in minus things that I consider to be inefficient.
Can anyone see how to improve this, particularly from a mathematical point of view? Maths isn't my strong point, and I'm not sure whether I should use some sort of weighting in the equations, or how to do that correctly. Also, I'm thinking about how to make sure that what gets subtracted from energy in in the useful energy out equation can't push the result below 0. Can anyone give a hand with these questions?
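To make the weighting and non-negativity questions concrete, here is a minimal Python sketch; the field names, weights, and the max(0, ...) clamp are illustrative choices, not a recommendation of the metric itself:

    def efficiency_score(active_time, run_events, editing_time,
                         unsuccessful_builds, aborted_test_runs, debugger_use_time,
                         w_builds=1.0, w_aborted=1.0, w_debug=1.0):
        """Toy scoring sketch; all weights and field names are illustrative."""
        energy_in = active_time * run_events * editing_time

        # Weighted penalties; the weights let you tune how much each item costs.
        penalty = (w_builds * unsuccessful_builds
                   + w_aborted * aborted_test_runs
                   + w_debug * debugger_use_time)

        # Clamp at zero so "useful energy out" can never go negative.
        useful_energy_out = max(0.0, energy_in - penalty)

        return useful_energy_out / energy_in if energy_in > 0 else 0.0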
Your "algorithm" is completely arbitrary, making judgement of value over things that are inocuous to whatever you called "efficient/inneficient", and will endup with a completely incoherent final value after being calculated. Compilation time? So the first compilation of a C++ plugin that takes 30+ minutes is good? Debugging time is both efficient and inefficient in your proposal.
A programmer that codes for 10 minutes and make 6 consecutive builds with close to no changes will have the same output as the guy that code for 60 minutes.
I suggest you look firts to what is a good use of a programmers time, how other programs contabilize programmers efficiency. Etc.
Just on a side note, to create a model of efficiency of work of a highly technical and creative field, you must understand quite well math, statistics and project management. Thats why good scrum masters are so sought after.
Anyway, what you propose is not an algorithm, but a scoring system, usually algorithms do make use of scoring systems to help their internal rules work out the best solution based on the scoring. The scoring is just a value, while the algorithm is a process to an end.
For example, I'm running the k-means algorithm on 1 million data points. Each point is 128-dimensional, and I want 1000 clusters. Wikipedia tells me that its complexity is n^(dk+1) log(n), where d is the number of dimensions, k is the number of clusters, and n is the number of instances.
Knowing that, how can I get an estimate of how long it will run on my 8 GB RAM, 2.6 GHz Intel Core i5 MacBook Pro? What is the best way to calculate this estimate? Is there a way to calculate it theoretically, or should I do a few experiments on smaller sets and see how long it takes?
I would really like to have rough estimate before I spend hours or days without knowing when it might stop.
Thanks so much for your help! I really appreciate it :).
PS: I'm using Python's SciPy kmeans.
Do some experiments. k-means is popular only because it usually runs much, much faster than that asymptotic bound would suggest.
There are so many machine-related and algorithm-related specifics hidden in the big-O constant that it won't be possible to estimate it (for your machine and your SciPy) theoretically.
However, nothing prevents you from finding the constant experimentally, as you said: "do a few experiments on smaller sets".
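A minimal sketch of such an experiment, assuming scipy.cluster.vq.kmeans and synthetic stand-in data (the subset sizes, dimensionality, and k below are illustrative): time a few subset sizes and extrapolate the trend to the full data.

    import time
    import numpy as np
    from scipy.cluster.vq import kmeans, whiten

    rng = np.random.default_rng(0)
    data = whiten(rng.normal(size=(100_000, 128)))  # stand-in for the real data

    k = 50  # smaller than the target 1000 clusters, just to observe the trend
    for n in (1_000, 5_000, 10_000, 50_000):
        start = time.perf_counter()
        kmeans(data[:n], k, iter=10)
        elapsed = time.perf_counter() - start
        print(f"n={n:6d}  k={k}  time={elapsed:.2f}s")

    # Fit a simple curve to these timings and extrapolate to n = 1,000,000, k = 1000.

Each Lloyd iteration costs roughly n * k * d distance computations, so per-iteration timings usually extrapolate fairly well; the hard-to-predict part is how many iterations are needed to converge.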
My problem statement is "How to find correlation between fields".
Let me explain it by an example:
Suppose I have a dataset which contains room temperature and CPU speed recorded at regular intervals of time, i.e. two fields: one is room temperature and the other is CPU speed. As we know, CPU speed increases with a rise in external temperature, so there is a relationship between room temperature and CPU speed, and as a result the computer's performance decreases.
I want an algorithm that can tell me the relationship between two fields, i.e. whether they are directly or inversely proportional to each other, and what happens to a third parameter (computer's performance) when the other two parameters (room temperature and CPU speed) change. Please tell me if you know some sort of algorithm to solve this problem.
I'm not sure I fully understand your question but a simple linear regression would work.
Wikipedia article on linear regressions
For example, in R you would use the lm function:
lm(formula = cpuSpeed ~ roomTemp)
Linear regression is a common approach, but you should also take the time to plot the two variables against each other. This visualization will help you discover nonlinear relationships as well.
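If you end up in Python rather than R, a rough equivalent sketch using scipy.stats.linregress plus a scatter plot (the arrays below are made-up stand-ins for the real columns):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    # Illustrative stand-in data; replace with the real roomTemp / cpuSpeed columns.
    room_temp = np.array([18.0, 20.0, 22.0, 24.0, 26.0, 28.0])
    cpu_speed = np.array([3.0, 3.1, 3.1, 3.2, 3.3, 3.3])

    result = stats.linregress(room_temp, cpu_speed)
    print(f"slope={result.slope:.3f}  r={result.rvalue:.3f}  p={result.pvalue:.3f}")
    # A positive slope means the fields move together (direct relation),
    # a negative slope means they move in opposite directions (inverse relation).

    # Plot the points to spot any nonlinear pattern a straight line would miss.
    plt.scatter(room_temp, cpu_speed)
    plt.xlabel("room temperature")
    plt.ylabel("CPU speed")
    plt.show()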
I'm building a single-layer perceptron that has a reasonably long feature vector (30k-200k features), all normalised.
Let's say I have 30k features which are somewhat useful at predicting a class but then add 100 more features which are excellent predictors. The accuracy of the predictions only goes up a negligible amount. However, if I manually increase the weights on the 100 excellent features (say by 5x), the accuracy goes up several percent.
It was my impression that the nature of the training process should give better features a higher weight naturally. However, it seems like the best features are being 'drowned out' by the worse ones.
I tried running it with a larger number of iterations, but that didn't help.
How can I adapt the algorithm to weight features better, in a reasonably simple way? It also needs to be reasonably fast: if I had fewer features it would be easy to just run the algorithm leaving one feature out at a time, but that's not really feasible with 30k.
My experience with implementing perceptron-based networks is that it takes a lot of iterations to learn something. I believe I used each sample about 1k times to learn the XOR function (with only 4 inputs). So if you have 200k inputs, it will take a lot of samples and a lot of time to train your network.
I have a few suggestions for you:
try to reduce the size of the input (aggregate several inputs into a single one, or remove redundant ones).
try to use each sample more times; as I said, it may take a lot of iterations to learn even a simple function (a minimal multi-epoch training sketch follows below).
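As a rough illustration of the second suggestion (reusing the same samples over many epochs), here is a minimal NumPy perceptron sketch; the data shapes, learning rate, and epoch count are made up:

    import numpy as np

    rng = np.random.default_rng(0)
    n_samples, n_features = 1_000, 5_000          # stand-ins for the real sizes
    X = rng.normal(size=(n_samples, n_features))  # assumed already normalised
    y = rng.choice([-1, 1], size=n_samples)       # labels in {-1, +1}

    w = np.zeros(n_features)
    b = 0.0
    learning_rate = 1.0

    # Reuse every sample many times (epochs) instead of a single pass.
    for epoch in range(50):
        mistakes = 0
        for i in rng.permutation(n_samples):
            if y[i] * (X[i] @ w + b) <= 0:        # misclassified -> update
                w += learning_rate * y[i] * X[i]
                b += learning_rate * y[i]
                mistakes += 1
        if mistakes == 0:                          # converged on the training set
            break
        print(f"epoch {epoch}: {mistakes} mistakes")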
Hope this helps.