What's the acceptable range for reg_alpha and reg_lambda in H2O XGBoost? - h2o

The H2O documentation doesn't go into detail on these two hyperparameters. It only says they are the L1 and L2 regularization parameters, with default values of 0 and 1 respectively. I can't find more info by googling either. Can someone provide any insight? TIA!

Alpha and lambda have no fixed acceptable range. Suitable values depend on the problem you're trying to solve and on the other parameters you're using, such as max_depth. Both typically range from 0 to 5, but they are not limited to that range.
I recommend you take a look at this link: https://medium.com/data-design/xgboost-hi-im-gamma-what-can-i-do-for-you-and-the-tuning-of-regularization-a42ea17e6ab6
The idea is explained very clearly there.
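If it helps, here is a minimal sketch of trying a few values with H2O's Python API; the training frame, column names, and grid settings are hypothetical, and you should check H2OXGBoostEstimator's parameter list for your H2O version:

# A minimal sketch (not official guidance) of tuning reg_alpha / reg_lambda
# with H2O's Python API. The training frame and column names are hypothetical.
import h2o
from h2o.estimators.xgboost import H2OXGBoostEstimator
from h2o.grid.grid_search import H2OGridSearch

h2o.init()
train = h2o.import_file("train.csv")                  # hypothetical training frame
predictors = [c for c in train.columns if c != "y"]

# Try a handful of values; 0-5 is a common starting range, but nothing
# stops you from going higher if the model still overfits.
hyper_params = {"reg_alpha": [0, 0.1, 1, 5],
                "reg_lambda": [0.1, 1, 5]}

grid = H2OGridSearch(model=H2OXGBoostEstimator(ntrees=100, max_depth=6, seed=1),
                     hyper_params=hyper_params)
grid.train(x=predictors, y="y", training_frame=train)
print(grid.get_grid(sort_by="rmse", decreasing=False))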

Related

Castalia S-MAC parameters

I need the values of some of the S-MAC protocol parameters used in Castalia, such as the DIFS duration, contention window size, slot time, backoff time, and SYNC duration (is the duration of a sync cycle 6 ms?). I would be very grateful if anyone could give me these values.
Is the CBR value obtained as Application.packet_rate * Application.constantDataPayload?
What is the unit of consumed energy in the ResourceManager module? mW, nJ, or something else?
Is SN.numNodes the number of nodes in a virtual cluster? If not, how do I calculate the number of nodes in a virtual cluster?
Almost all of your questions are answered in the manual and in the source code comments already. It would be good to provide some evidence that you actually tried to find the answers on your own (looking in the obvious places) and to explain why you were uncertain of the answers, or why you were unable to find them.
Most of these parameters are covered in section 4.3.2 of the manual. To understand some of them you might need to read the relevant published article.
If CBR means Constant Bit Rate, then you need to multiply by 8, since constantDataPayload is expressed in bytes (found in the code, in iApplication.ned); see the small sketch below.
Energy is measured in Joules (found in the manual and the source code).
SN.numNodes is the total number of nodes in your simulation (found in the manual).
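To make the bit-rate point concrete, here is a tiny sketch of the conversion with made-up example values (the variable names just mirror the Castalia parameters from the question):

# Hypothetical example values; the point is only the byte-to-bit conversion.
packet_rate = 5               # packets per second (Application.packet_rate)
constant_data_payload = 100   # bytes per packet (Application.constantDataPayload)

# constantDataPayload is in bytes, so multiply by 8 to get bits per second.
cbr_bps = packet_rate * constant_data_payload * 8
print(cbr_bps)                # 4000 bits per second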

Is it possible to configure the h2o.ai automl with a vector response

I've been working with the h2o.ai automl function on a few problems with quite a bit of success, but have come across a bit of a roadblock.
I've got a problem that uses 500-odd predictors (all float) to map onto 6 responses (again, all float).
Required Data Parameters
y: This argument is the name (or index) of the response column.
3.16 docs
It seems that the automl library only handles a single response. Am I missing something? Perhaps in the terminology even?
In the case that I'm not, my plan is to build 6 separate leaderboards, one for each response, and use the results to kick-start a manual network search.
In theory I guess I could actually run the 6 automl models individually to get the vector response, but that feels like an odd approach.
Any insight would be appreciated,
Cheers.
Not just AutoML, but H2O generally, will only let you predict a single response column.
Without more information about what those 6 outputs represent, and their relationship to each other, I can think of 3 approaches.
Approach 1: 6 different models, as you suggest.
Approach 2: Train an auto-encoder to compress the 6 dimensions down to 1, train your model to predict that single value, and then expand it back out. For example, use a lookup table built from the training data: if your model predicts 1.123, and [1,2,3,4,5,6] was represented by 1.122 while [3.14,0,0,3.14,0,0] was represented by 1.125, you could return [1,2,3,4,5,6], or a weighted average of those two closest matches. (Other dimension-reduction approaches, such as PCA, are the same idea.)
Approach 3: If the possible combinations of your 6 floats form a (relatively small) finite set, you could use an explicit lookup table mapping to N categories.
I assume each output is a continuous variable, which is why they are floats, so I expect approach 3 to be inferior to approach 2. If there is very little correlation/relationship between the 6 outputs, approach 1 is going to be best.
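A rough sketch of approach 1 with H2O's Python API (the frame, response names, and AutoML settings are hypothetical) could look like this:

# Approach 1 sketch: one AutoML run per response column.
import h2o
from h2o.automl import H2OAutoML

h2o.init()
train = h2o.import_file("train.csv")                   # hypothetical frame

responses = ["y1", "y2", "y3", "y4", "y5", "y6"]
predictors = [c for c in train.columns if c not in responses]

leaders = {}
for y in responses:
    aml = H2OAutoML(max_models=20, seed=1)
    aml.train(x=predictors, y=y, training_frame=train)
    leaders[y] = aml.leader                            # best model for this response

# Predicting the 6-dimensional response is then 6 separate predict() calls:
# preds = [leaders[y].predict(test) for y in responses]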

Good algorithm for maximum likelihood estimation

I have a problem: I need to estimate some statistics with a GARCH/ARCH model. In Matlab I use something like this:
spec = garchset('P', 1, 'Q', 1);
[fit01, ~, LogL01] = garchfit(spec, STAT);
This returns the parameters of a GARCH(1,1) model fitted by maximum likelihood.
But I really need to know which algorithm garchfit uses, because I need to write a program that does the same parameter estimation automatically.
My program is currently very slow and sometimes incorrect.
So the questions are:
How do I get the code of garchfit or of the MLE routine in Matlab?
Does anyone know some good and fast algorithm on MLE?
(MLE = maximum likelihood estimation)
To see the code (if possible) you can type edit garchfit.
From the documentation of garchfit I have found some recommendations:
garchfit will be removed in a future release. Use estimate instead.
(In the documentation, each of the four "estimate" links points to the estimate method of a different model object: arima, garch, egarch, or gjr.)
My guess is that you want to look into garch.estimate.
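For the algorithm part of the question: one common way to do this yourself (not necessarily what garchfit uses internally) is to write down the Gaussian log-likelihood of a GARCH(1,1) model and hand it to a general-purpose numerical optimiser. A minimal NumPy/SciPy sketch, assuming zero-mean returns r:

# Gaussian quasi-maximum-likelihood estimation of GARCH(1,1), as a sketch.
# This is NOT necessarily the algorithm garchfit uses; it is just the
# standard "minimise the negative log-likelihood numerically" approach.
import numpy as np
from scipy.optimize import minimize

def garch11_neg_loglik(params, r):
    omega, alpha, beta = params
    n = len(r)
    sigma2 = np.empty(n)
    sigma2[0] = np.var(r)                  # common choice for the initial variance
    for t in range(1, n):
        sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]
    # Gaussian negative log-likelihood (constants kept for completeness)
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(sigma2) + r**2 / sigma2)

def fit_garch11(r):
    x0 = np.array([0.1 * np.var(r), 0.05, 0.90])       # rough starting point
    bounds = [(1e-8, None), (0.0, 1.0), (0.0, 1.0)]     # positivity constraints
    # Note: the stationarity constraint alpha + beta < 1 is not enforced here.
    res = minimize(garch11_neg_loglik, x0, args=(r,),
                   bounds=bounds, method="L-BFGS-B")
    return res.x, -res.fun                              # estimates and log-likelihood

# Example usage on simulated data:
# r = 0.01 * np.random.standard_normal(2000)
# (omega, alpha, beta), logL = fit_garch11(r)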

Principal Component Analysis with very high-dimensional data

I have a set of samples (vectors), each of dimension about M (10000), and the size of the set is also about N (10000). I want to find the first 10 PCs (those with the biggest eigenvalues) of this set. Due to the large dimension of the samples I cannot compute the covariance matrix in reasonable time. Are there any methods to find the PCs without computing the full covariance matrix, or methods that can efficiently handle high-dimensional data, or something like this? These methods should require fewer than O(M*M*N) operations.
NIPALS -- Nonlinear Iterative Partial Least Squares.
See for example here: http://en.wikipedia.org/wiki/NIPALS
Maybe this helps someone: I have found a solution in the family of EM-PCA methods (see, for example, http://www.cmlab.csie.ntu.edu.tw/~cyy/learning/papers/PCA_RoweisEMPCA.pdf).
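For completeness, a minimal NumPy sketch of a NIPALS-style iteration, which extracts the leading components one at a time and never forms the M x M covariance matrix (each sweep costs O(N*M), which is the whole point); the function and variable names are my own:

# NIPALS-style extraction of the top principal components without the
# covariance matrix. X is an (N, M) data matrix.
import numpy as np

def nipals_pca(X, n_components=10, max_iter=500, tol=1e-8):
    X = X - X.mean(axis=0)                # centre the data
    scores, loadings = [], []
    for _ in range(n_components):
        t = X[:, 0].copy()                # initial score vector
        for _ in range(max_iter):
            p = X.T @ t
            p /= np.linalg.norm(p)        # loading vector (unit length)
            t_new = X @ p
            if np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new):
                t = t_new
                break
            t = t_new
        X = X - np.outer(t, p)            # deflate: remove the found component
        scores.append(t)
        loadings.append(p)
    return np.array(scores).T, np.array(loadings).T

# Example: the 10 leading PCs of a 10000 x 10000 matrix
# T, P = nipals_pca(X, n_components=10)   # columns of P are the PC directions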

Equation for "importance" value of twitter user according to #followers #following

I am trying to find an equation which calculates the "importance" of a Twitter user as a function of #followers and #following.
Things I want to consider:
1. The bigger #followers / #following is, the more important the user is.
2. Distinguish between 20/20 and 10k/10k (10k/10k is more important even though the ratio is the same).
Considering these two points, I expect to get a similar importance value for these two inputs:
#followers=1000 #following=100
#followers=30k #following=30k
I'm having trouble incorporating the second point. I believe it should be quite simple. Help?
Thanks
One possibility is (#followers / #following) * [log(#followers) - CONST], where CONST is some predefined value, tuned as appropriate. This keeps the ratio important, but the scale matters too.
For your last example, you would need to set CONST ≈ 9.4 (using a base-2 logarithm) to get similar results.
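A quick numeric check of that formula (my own sketch; CONST ≈ 9.42 together with a base-2 log reproduces the similarity asked for in the question):

import math

def importance(followers, following, const=9.42):
    # (#followers / #following) * (log2(#followers) - CONST)
    return (followers / following) * (math.log2(followers) - const)

print(importance(1000, 100))       # ~5.46
print(importance(30000, 30000))    # ~5.45  -> similar, as requested
print(importance(20, 20))          # negative -> far less "important" than 30k/30k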
There are many possible answers to this question; you need to decide how much weight the number of followers gets relative to the ratio, so that you end up with a single number relating the two. For example, the first idea that comes to mind is to multiply the ratio by the log of #followers, something like this:
Importance = (#Followers / #Following) * Log(#Followers)
Based on what you said there, you could do 3*followers^2/following.
But you've described a system where users can increase their importance by following fewer other users. Doesn't seem too awesome.
You could normalize it by the total number of users.
I'd suggest taking logarithms of all the values to get a less dramatic increase at higher values.
(log(#followers)/log(#TotalNumberOfPeopleInTwitter))*(log(#followers)/log(#following))
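A small sketch of that last suggestion (the total-user count is a made-up placeholder, not a real figure):

import math

TOTAL_TWITTER_USERS = 300_000_000   # made-up placeholder used only for normalisation

def log_importance(followers, following):
    # (log(#followers) / log(#total users)) * (log(#followers) / log(#following))
    return (math.log(followers) / math.log(TOTAL_TWITTER_USERS)) * \
           (math.log(followers) / math.log(following))

print(log_importance(1000, 100))      # ratio still rewarded, but growth is damped
print(log_importance(30000, 30000))   # scale still counts, just less dramatically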
