I'm adding Automatic Mixed Precision to a TensorFlow 2 model. What's the interaction between
tf.keras.mixed_precision.experimental.set_policy('mixed_float16') and tf.config.optimizer.set_experimental_options({"auto_mixed_precision": True}) ?
The former is documented at the Keras mixed precision guide, and the latter doesn't seem to be mentioned anywhere except the API docs. Using the experimental options call seems to work, while setting the policy gives me datatype mismatches in my model.
I'm using GradientTape, so the simplistic solutions relating to model.fit() don't apply to me. Any insight on the similarities, differences, and interactions between these two functions?
Related
As far as I understand, the LightGBM integration in Optuna uses a step-wise algorithm to optimize the hyper-parameters such as lambda_l1, lambda_l2 etc.
Although it is great, I would very much want to add additional parameters such as learning_rates.
I know I can just use Optuna the "regular" way but since the integrated lgbm part should be way faster, I would prefer using that.
Is there a way to add additional parameters to optimize, or are we forced to use all of (and only) the specified parameters? I can see theres e.g a parameter called learning_rates but in the docs it is not specified what that does and how to use it (I think it's the learning-rate for each tree). Setting it in the lgb.train like
model = lgb.train(
params,
dtrain,
valid_sets=[dtrain, dval],
)
With Gensim < 4.0, we can retrain a word2vec model using the following code:
model = Word2Vec.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)
model.train(my_corpus, total_examples=len(my_corpus), epochs=model.epochs)
However, what I understand is that Gensim 4.0 is no longer supporting Word2Vec.load_word2vec_format. Instead, I can only load the keyedVectors.
How to fine-tune a pre-trained word2vec model (such as the model trained on GoogleNews) with my domain-specific corpus using Gensim 4.0?
I don't think that code would've ever have worked in Gensim versions before 4.0. A plain list-of-word-vectors, like GoogleNews-vectors-negative300.bin, does not (& never has) had enough info to continue training.
It's missing the hidden-to-output layer weights & word-frequency info essential for training.
Looking at past source code, as of release 1.0.0 (February 2017), that code wouldn've already given a deprecation-error with a pointer to the method for loading a plain set-of-word-vectors - to address people with the mistaken notion that could work – and raised other errors on any attempts to train() such a model. (Pre-1.0.0, docs also warned that this would not work, & would have failed with a less-helpful error.)
As one of those errors mentioned, there has at times been experimental support for loading some of a prior set-of-word-vectors to clobber any words in an existing model's already-initialized vocabulary, via .intersect_word2vec_format(). But by default that both (1) locks the imported vectors against further change; (2) brings in no new words. That's unlike what people most often want from "fine-tuning", so it's not a ready-made help for that goal.
I believe some people have cobbled together custom code to achieve various kinds of fine-tuning in their projects – but I don't know of anyone who's published a reliable recipe or strong results. (And I suspect some of the people who think they're doing this well just haven't rigorously evaluated the steps they are taking.)
If you have any recipe you know worked pre-Gensim-4.0.0, it should be adaptable - 4.0 changes to the Word2Vec-related classes were mainly refactorings, optimizations, & new options (with little-to-none removal of functionality). But a reliable description of what used-to-work, or which particular fine-tuning strategy is being pursued for what specific benefits, to make more specific recommendations.
I have started to in PyFMI use parameter estimation with the procedure model.estimate() and works well.
From the documentation (Andersson et al 2016) as well as practical use I understand that model parameters are taken from the compiled FMU-model if not estimated. It would have been very practical to have an option to provide a dictionary with a set of the fixed parameter values different from the default of the model. Is there any way to provide that?
The current workflow is that for a larger model built up of parts from libraries, then you need to make a copy of these models and set parameters to the proper value in the code, and then compile it. It is a somewhat tedious procedure. Perhaps I have misunderstood something?
Andersson et al (2016): "PyFMI: A Python package for…”
https://portal.research.lu.se/portal/files/7201641/pyfmi_tech.pdf
From my contact Christian Winther at Modelon I learn that I understand the workflow right. He see also the advantage to have a possibility to have a list (or dictionary) of parameters that is changed from the default parameters and remain constant during parameter estimation. It may come in a future update.
EDIT to try to address #user2864740's edit and comment: I am wondering if there is any information particularly relevant to 0.4rc1/rc2 or in particular a strategy or suggestion from one of the Julia developers more recent than those cited below (particularly #StefanKarpinski's Jan 2014 answer in #6 below). Thx
Please see e.g.
https://groups.google.com/forum/#!topic/julia-users/pCuDx6jNJzU
https://groups.google.com/forum/#!topic/julia-users/2kLNdQTGZcA
https://groups.google.com/forum/#!msg/julia-dev/JEiH96ofclY/_amm9Cah6YAJ
https://github.com/JuliaLang/julia/pull/10269
https://github.com/JuliaLang/julia/issues/1090
Can I add type information to arguments that are functions in Julia?
Performance penalty using anonymous function in Julia
(As a fairly inexperienced Julia user) my best synthesis of this information, some of which seems to be dated, is that the best practice is either "avoid doing this" or "use FastAnonymous.jl."
I'm wondering what the bleeding edge latest and greatest way to handle this is.
[Longer version:]
In particular, suppose I have a big hierarchy of functions. I would like to be able to do something like
function transform(function_one::Function{from A to B},
function_two::Function{from B to C},
function_three::Function{from A to D})
function::Function{from Set{A} to Dict{C,D}}(set_of_As::Set{A})
Dict{C,D}([function_two(function_one(a)) => function_three(a)
for a in set_of_As])
end
end
Please don't take the code too literally. This is a narrow example of a more general form of transformation I'd like to be able to do regardless of the actual specifics of the transformation, BUT I'd like to do it in such a way that I don't have to worry (too much) about checking the performance (that is, beyond the normal worries I'd apply in any non-function-with-function-as-parameter case) each time I write a function that behaves this way.
For example, in my ideal world, the correct answer would be "so long as you annotate each input function with #anon before you call this function with those functions as arguments, then you're going to do as well as you can without tuning to the specific case of the concrete arguments you're passing."
If that's true, great--I'm just wondering if that's the right interpretation, or if not, if there is some resource I could read on this topic that is closer to a "logically" presented synthesis than the collection of links here (which are more a stream of collective consciousness or history of thought on this issue).
The answer is still "use FastAnonymous.jl," or create "functor types" manually (see NumericFuns.jl).
If you're using julia 0.4, FastAnonymous.jl works essentially the same way that official "fast closures" will eventually work in base julia. See https://github.com/JuliaLang/julia/issues/11452#issuecomment-125854499.
(FastAnonymous is implemented in a very different way on julia 0.3, and has many more weaknesses.)
(Mathematica version: 8.0.4)
lst = Names["Internal`*"];
Length[lst]
Pick[lst, StringMatchQ[lst, "*Bag*"]]
gives
293
{"Internal`Bag", "Internal`BagLength", "Internal`BagPart", "Internal`StuffBag"}
The Mathematica guidebook for programming By Michael Trott, page 494 says on the Internal context
"But similar to Experimental` context, no guarantee exists that the behavior and syntax of the functions will still be available in later versions of Mathematica"
Also, here is a mention of Bag functions:
Implementing a Quadtree in Mathematica
But since I've seen number of Mathematica experts here suggest Internal`Bag functions and use them themselves, I am assuming it would be sort of safe to use them in actual code? and if so, I have the following question:
Where can I find a more official description of these functions (the API, etc..) like one finds in documenation center? There is nothing now about them now
??Internal`Bag
Internal`Bag
Attributes[Internal`Bag]={Protected}
If I am to start using them, I find it hard to learn about new functions by just looking at some examples and trial and error to see what they do. I wonder if someone here might have a more complete and self contained document on the use of these, describe the API and such more than what is out there already or a link to such place.
The Internal context is exactly what its name says: Meant for internal use by Wolfram developers.
This means, among other things, the following things hold about anything you might find in there:
You most likely won't be able to find any official documentation on it, as it's not meant to be used by the public.
It's not necessarily as robust about invalid arguments. (Crashing the kernel can easily happen on some of them.)
The API may change without notice.
The function may disappear completely without notice.
Now, in practice some of them may be reasonably stable, but I would strongly advise you to steer away from them. Using undocumented APIs can easily leave you in for a lot of pain and a nasty surprise in the future.