Set up early stopping for Vertex AutoML for text classification - google-cloud-vertex-ai

I'm running an AutoML job using the Google Cloud SDK like this:
job = aiplatform.AutoMLTextTrainingJob(
    display_name=training_job_display_name,
    prediction_type="classification",
    multi_label=False,
)
model = job.run(
    dataset=text_dataset,
    model_display_name=model_display_name,
    training_fraction_split=0.1,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
    sync=True,
)
I couldn't see any parameters to set up early stopping. Is this a missing feature, or am I missing something here?

Related

How to determine whether a Windows process has restarted using PromQL and Grafana

I am using the 'windows_exporter' metric exporter on a Windows server... I am trying to use the metric it exposes called "windows_process_start_time" to visualize how many times a process has restarted, based on the process and process_id.
This is my PromQL used within the Grafana dashboard:
changes(windows_process_start_time{process="Foo.*", process_id=~".*"}[$__interval])
However, when a process has stopped and a new one starts with a new process_id, 'changes' is not detecting it, so the Grafana dashboard still shows 0 for the 'Foo' process.
What would be an effective PromQL query to show when a process restarts?
Updated with the solution suggested in a comment, but this does not work:
avg by (process, process_id) (windows_process_start_time{process="Foo.*", process_id=~".*"}) unless avg by (process, process_id) (windows_process_start_time{process="Foo.*", process_id=~".*"})

Where to find the $PEERID used in Chainlink Offchain Reporting Jobs .toml file --> P2P BootstrapPeers =[...]

Where can the $PEERID be found which is used in the Chainlink Offchain Reporting Jobs .toml file --> P2P BootstrapPeers = [...]?
This is the link to Offchain Reporting Jobs in Chainlink:
https://docs.chain.link/docs/jobs/types/offchain-reporting/
Question concerning the OCR toml job specification:
The syntax of p2pBootstrapPeer is the following:
/dns4/$DNS/tcp/$PORT/p2p/$PEERID
---> Where in my specifications can I find the $PEERID?
In the Chainlink node operator GUI? If yes, where exactly?
If not, where else?
A couple of ways to set this:
You can manually set it via an env var called P2P_PEER_ID. If you don't override it, it will try to use this value by default.
As per the docs you linked, set the p2pPeerID value in your job spec, which will become the $PEERID, e.g.:
p2pPeerID = "12D3KooWPjceQrSwdWXPyLLeABRXmuqt69Rg3sBYbU1Nft9HyQ6X"

Access the leaderboard in H2O AutoML while training is running and extract metrics of models that are completed?

In H2O AutoML, can we find which models are completed and see their metrics while training is going on?
E.g. with max_models=5, can we extract the information of model 1 while the others (models 2, 3) are still being trained?
If you're using the Python or R client to run H2OAutoML, you can see the progression (e.g. which model is being built or has completed) using the verbosity='info' parameter.
For example:
aml = H2OAutoML(project_name="my_aml",
                ...,
                verbosity='info')
aml.train(...)
If you're using the Flow web UI, this progress is shown automatically when you click on the running job.
You can then obtain the model ids from those console logs, and retrieve the models as usual in a separate client:
m = h2o.get_model('the_model_id')
or you can even access the current state of the leaderboard with the model metrics:
h2o.automl.get_automl('my_aml').leaderboard
The same logic applies to the R client.
An alternative, maybe simpler if the goal is just to monitor progress, is to use Flow (usually available at http://localhost:54321) > 'Admin' > 'Jobs' > search for the AutoML job.
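Putting the pieces above together, here is a minimal sketch of monitoring from a second Python client; the cluster address and the placeholder model id are assumptions, and the project name "my_aml" comes from the example above:
import h2o
from h2o.automl import get_automl

# Attach to the already-running H2O cluster from a separate client
# (localhost:54321 is an assumed default; adjust to your setup).
h2o.connect(ip="localhost", port=54321)

# Re-attach to the in-progress AutoML run by its project name.
aml = get_automl("my_aml")

# Models completed so far, with their metrics.
print(aml.leaderboard)

# Retrieve any finished model by the id shown in the leaderboard or console logs.
m = h2o.get_model("the_model_id")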

How can I handle TensorFlow sessions to train multiple Keras models at the same time?

I need to train multiple Keras models at the same time. I'm using TensorFlow backend. Problem is, when I try to train, say, two models at the same time, I get Attempting to use uninitialized value.
The error is not really relevant, the main problem seems to be that Keras is forcing the two models to be created in the same session with the same graph so it conflicts.
I am a newbie in TensorFlow, but my gut feeling is that the answer is pretty straightforward: you would have to create a different session for each Keras model and train each one in its own session. Could someone explain how this would be done?
I really hope it is possible to solve this problem while still using Keras and not coding everything in pure TensorFlow. Any workaround would be appreciated too.
You are right, Keras automatically works with the default session.
You could use tf.compat.v1.keras.backend.get_session() or tf.compat.v1.keras.backend.set_session(sess) to manually set the global Keras session (see documentation).
For instance:
sess1 = tf.compat.v1.Session()
tf.compat.v1.keras.backend.set_session(sess1)
# Train your first Keras model here ...

sess2 = tf.compat.v1.Session()
tf.compat.v1.keras.backend.set_session(sess2)
# Train your second Keras model here ...
I train multiple models in parallel by using Python's multiprocessing module: https://docs.python.org/3.4/library/multiprocessing.html.
I have a function that takes two parameters, an input queue and an output queue; this function runs in each process. The function has the following structure:
def worker(in_queue, out_queue):
    # Importing Keras inside the worker gives each process its own TensorFlow state.
    import keras
    while True:
        parameters = in_queue.get()
        network_parameters = parameters[0]
        train_inputs = parameters[1]
        train_outputs = parameters[2]
        test_inputs = parameters[3]
        test_outputs = parameters[4]
        # Build the network based on the given parameters,
        # train it, and test it if required.
        out_queue.put(result)  # result produced by the training/testing steps above
From the main Python script, start as many processes (and create as many in and out queues) as required. Add jobs to a worker by calling put on its in queue and get the results by calling get on its out queue, as sketched below.
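A minimal sketch of that main-script side, assuming the worker function from the block above; the job payload names (network_parameters, train_inputs, ...) are placeholders for whatever your models need:
import multiprocessing as mp

if __name__ == "__main__":
    workers = []
    for _ in range(2):  # one process (and queue pair) per model trained in parallel
        in_queue, out_queue = mp.Queue(), mp.Queue()
        p = mp.Process(target=worker, args=(in_queue, out_queue))
        p.start()
        workers.append((p, in_queue, out_queue))

    # Submit one job per worker; the tuple layout matches what worker() unpacks.
    for _, in_queue, _ in workers:
        in_queue.put((network_parameters, train_inputs, train_outputs,
                      test_inputs, test_outputs))

    # Collect one result from each worker's out queue.
    results = [out_queue.get() for _, _, out_queue in workers]

    # The workers loop forever, so stop them once the results are in.
    for p, _, _ in workers:
        p.terminate()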

Hive execution hook

I need to add a custom execution hook in Apache Hive. Please let me know if somebody knows how to do it.
The current environment I am using is given below:
Hadoop: Cloudera version 4.1.2
Operating system: CentOS
Thanks,
Arun
There are several types of hooks depending on at which stage you want to inject your custom code:
Driver run hooks (Pre/Post)
Semantic analyzer hooks (Pre/Post)
Execution hooks (Pre/Failure/Post)
Client statistics publisher
If you run a script, the processing flow looks as follows:
1. Driver.run() takes the command
2. HiveDriverRunHook.preDriverRun() (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
3. Driver.compile() starts processing the command: creates the abstract syntax tree
4. AbstractSemanticAnalyzerHook.preAnalyze() (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
5. Semantic analysis
6. AbstractSemanticAnalyzerHook.postAnalyze() (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
7. Create and validate the query plan (physical plan)
8. Driver.execute(): ready to run the jobs
9. ExecuteWithHookContext.run() (HiveConf.ConfVars.PREEXECHOOKS)
10. ExecDriver.execute() runs all the jobs
    - For each job, at every HiveConf.ConfVars.HIVECOUNTERSPULLINTERVAL interval, ClientStatsPublisher.run() is called to publish statistics (HiveConf.ConfVars.CLIENTSTATSPUBLISHERS)
    - If a task fails: ExecuteWithHookContext.run() (HiveConf.ConfVars.ONFAILUREHOOKS)
11. Finish all the tasks
12. ExecuteWithHookContext.run() (HiveConf.ConfVars.POSTEXECHOOKS)
13. Before returning the result: HiveDriverRunHook.postDriverRun() (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
14. Return the result.
For each of the hooks I indicated the interfaces you have to implement. In the brackets there's the corresponding configuration property key you have to set in order to register the class at the beginning of the script.
E.g. setting the PreExecution hook (stage 9 of the workflow):
HiveConf.ConfVars.PREEXECHOOKS -> hive.exec.pre.hooks :
set hive.exec.pre.hooks=com.example.MyPreHook;
Unfortunately these features aren't really documented, but you can always look into the Driver class to see the evaluation order of the hooks.
Remark: I assumed Hive 0.11.0 here; I don't think the Cloudera distribution differs (too much).
A good start: http://dharmeshkakadia.github.io/hive-hook/
There are examples there.
Note: the Hive CLI shows the hook's messages on the console; if you execute from Hue, add a logger and you can see the results in the HiveServer2 role log.
