Aborting AutoML after too many consecutive model failures - h2o

I am getting the following error using H2O AutoML. I couldn't figure out the reason, and I'm wondering whether you know of anything to read about possible causes for this error in H2O.
Aborting AutoML after too many consecutive model failures

Related

Can't serialize due to concurrent operations: memgraph

I am performing a mix of queries (reads/writes/updates/deletes) against a single Memgraph instance.
To do this I am using the Java client by Neo4j; all the APIs I am currently using are the sync APIs from the driver.
The nature of my queries is such that I can execute them concurrently with no side effects, so for better performance I am firing the queries in parallel. The error I am getting is for a CREATE operation where I am creating an edge between two nodes. This is consistent: I ran the same setup multiple times, and every time all queries go through except that it crashes when it reaches this create-edge stage.
Query for reference:
OPTIONAL MATCH (node1) WHERE id(node1) = $nodeId1
OPTIONAL MATCH (node2) WHERE id(node2) = $nodeId2
CREATE (node1)-[:KNOWS]->(node2)
I am not able to find any documentation about this error. Please point me to a relevant document, or to a workaround I can use to ask Memgraph to put a query on hold when the same objects are being operated on by another query.
One approach I am considering is simply to retry any such failed queries, but I am looking for a cleaner approach.
P.S. I was running the same setup on Neo4j earlier and did not encounter any problems with it.
Yep, in the case of this error, the code should retry the query. I think an equivalent issue can happen in Neo4j, but since Memgraph is more optimistic about locking, the error may happen more often. In general, the correct approach is to implement error handling for this case.
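For illustration, here is a minimal retry sketch. It is written in Python against the Bolt-compatible neo4j driver rather than the Java client from the question, and it assumes the conflict surfaces as a TransientError; adjust the exception type, connection details, and backoff to whatever your driver and setup actually use.

import time

from neo4j import GraphDatabase
from neo4j.exceptions import TransientError  # assumption: the serialization conflict is reported as a TransientError

CREATE_EDGE = """
OPTIONAL MATCH (node1) WHERE id(node1) = $nodeId1
OPTIONAL MATCH (node2) WHERE id(node2) = $nodeId2
CREATE (node1)-[:KNOWS]->(node2)
"""

def create_edge_with_retry(driver, node_id1, node_id2, max_retries=5, backoff_s=0.1):
    # Run the CREATE query, retrying when the database reports a serialization conflict.
    for attempt in range(1, max_retries + 1):
        try:
            with driver.session() as session:
                session.run(CREATE_EDGE, nodeId1=node_id1, nodeId2=node_id2)
            return
        except TransientError:
            if attempt == max_retries:
                raise
            time.sleep(backoff_s * 2 ** (attempt - 1))  # simple exponential backoff before retrying

# usage sketch; the Bolt URI and credentials are placeholders
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("", ""))
create_edge_with_retry(driver, 0, 1)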

QueryDNS stuck, how to get out of UnexpectedNamingException?

Using QueryDNS, some of my incoming flowfiles carry an "invalid" fully qualified domain name.
In this case the QueryDNS processor displays an ugly error message:
Failed to process session due to Unexpected NamingException while processing records. Please review your configuration.: org.apache.nifi.processor.exception.ProcessException: Unexpected NamingException while processing records. Please review your configuration.
It returns the flowfile to the incoming queue and loops indefinitely, yielding and retrying that flowfile. Meanwhile, other incoming flowfiles are stuck in the incoming queue and will never get processed, since the processor only offers "found" and "not found" relationships.
How is it possible to get rid of these flowfiles (in NiFi 1.9.2), for example by passing them to a LogAttribute processor?
The only way I found to get around this issue was to thoroughly clean/validate the hostnames/IPs I was looking up before they reached the processor.
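To illustrate the kind of pre-validation I mean, here is a rough Python sketch; the regex and the notion of "valid" are assumptions on my part, and you could run an equivalent check upstream of QueryDNS or before the data ever reaches NiFi.

import re

# Rough FQDN pattern: dot-separated labels of letters/digits/hyphens, 1-63 chars each,
# not starting or ending with a hyphen. This is an assumption about what "valid" means
# for your data; tighten or loosen it as needed.
FQDN_RE = re.compile(
    r"^(?=.{1,253}$)(?!-)[A-Za-z0-9-]{1,63}(?<!-)"
    r"(\.(?!-)[A-Za-z0-9-]{1,63}(?<!-))+$"
)

def is_probably_valid_fqdn(name):
    # True if the hostname looks like a plausible FQDN, False otherwise.
    return bool(FQDN_RE.match(name.strip()))

print(is_probably_valid_fqdn("example.com"))      # True
print(is_probably_valid_fqdn("not a hostname!"))  # False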
If I'm honest, the processor isn't really fit for purpose for any significant quantity of data. The problem you mention, coupled with the lack of caching, makes it practically useless in production.
In the end we switched to using Logstash for our enrichment rather than NiFi, although, depending on your use case, this may not be possible.

How to identify multiple entities in RASA

I want to extract multiple entities from a user input.
Example- "Service httpd is not responding because of high CPU usage and DNS Error"
So here I want to identify below:
Httpd
High CPU usage
DNS Error
And I will be using these keywords to get a response from a database.
Just annotate them accordingly, e.g.
## intent: query_error
- Service [httpd](keyword) is not responding because of [high CPU usage](keyword) and [DNS Error](keyword)
Given the sentence above, Rasa NLU would extract three entities of type keyword. You can then access these entities in a custom action and query your database.
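As an illustration, a custom action along these lines could collect the entities and run the lookup. This is a minimal sketch using rasa_sdk; the action name and the query_database helper are placeholders for your own code.

from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher

class ActionQueryErrors(Action):
    # Looks up the extracted keywords in a database and replies with the result.

    def name(self) -> Text:
        return "action_query_errors"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        # All entities of type "keyword" extracted from the latest user message
        keywords = list(tracker.get_latest_entity_values("keyword"))
        results = query_database(keywords)  # placeholder for your own database lookup
        dispatcher.utter_message(text="Found: {}".format(results))
        return []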
Regarding the number of examples required, this depends on:
the NLU pipeline you are using. Typically tensorflow_embedding requires more training examples than spacy_sklearn, since it does not use pretrained language models.
the number of different values your entities can have. If it is only httpd, high CPU usage, and DNS Error, then you don't need a lot of examples. However, if you have a thousand different values for your entity, then you need more training examples.
One intent is enough if you always want to trigger the same custom action. However, if you want to classify different types of problems, e.g. server problems and client problems, and query different databases depending on the type of problem, you might consider having multiple intents.
Sorry for the vague answers, but in machine learning most things are highly dependent on the use case and the dataset.

Best practices to handle errors in NIFI

I'm using NiFi, and I have data flows where I use the following processors:
ExecuteScript
RouteOnAttribute
FetchMapDistribuedCache
InvokeHTTPRequest
EvaluateJSONPath
I also have two levels of process groups, like NIFI FLOW >>> Process group 1 >>> Process group 2. My question is how to handle errors in this case. I have created an output port for each processor to route errors outside the process group, and in the NiFi flow I have added a funnel for each error type and then put all of the caught errors in HBase so I can do some reporting later on. As you can imagine, this adds multiple relationships, and my simple dataflow starts to become less readable.
My questions are: what are the best practices for handling errors in processors, and what is the best approach for error reporting with NiFi (email or PDF)?
It depends on the errors you routinely encounter. Some processors may fail to perform a task (an expected but not desired outcome), and route the failed flowfile to REL_FAILURE, a specific relationship which can be connected to a processor to handle these failures, or back to the same processor to be retried. Others (or the same processors in different scenarios) may encounter exceptions, which are unexpected occurrences which cannot be resolved by the processor.
An example of this is PutKafka vs. EncryptContent. If the remote Kafka system is temporarily unavailable, the processor would fail to send the flowfile content; however, retrying after some delay period could be successful if the remote system is once again available. By contrast, decrypting cipher text with the wrong key will always throw an exception, no matter how many times it is attempted or how long the retry delay is.
Many users route the errors to PutEmail processor and report them to a specific user/group who can evaluate the errors and monitor the data flow if necessary. You can also use "Reporting Tasks" to monitor metrics or ingest provenance data as operational data and route that to email/offline storage, etc. to run analytics on it.

Calculating similarites between sentences

I have a database with thousands of rows of error logs and their descriptions. This error log is for an application that runs 24/7. I want to create a dashboard/UI to view the common errors currently happening, for production support.
The problem I am having is that even though there are a lot of common errors, the error descriptions differ by the transaction ID or user ID or other things that are unique to that single process.
e.g. 1: Error transaction XYZ failed for user 233
e.g. 2: Error transaction XYZ failed for user 567
I consider these two errors to be the same, so I want a program that will go through the new error logs and classify them into groups. I am trying to use "edit distance", but it's very slow. Since I already have old error logs, I am trying to think of solutions that use that information too. Any thoughts?
I'm assuming that the error messages are generated by a program, and so they probably fall into very specific patterns.
That means you don't have to do anything particularly complex. Just parse the error messages: use regular expressions (or maybe something more powerful) to split the messages into tuples. Then group or count or do something with the individual fields. For example, you could do a regex like "Error transaction ([A-Z]*) failed for user ([0-9]*)". You could then make a histogram of the error codes (first capture group) or users (second capture group).
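A minimal sketch of that approach (the pattern and the sample log lines are just illustrations based on the examples in the question):

import re
from collections import Counter

# Assumed message shape, based on the examples in the question
PATTERN = re.compile(r"Error transaction ([A-Z]+) failed for user ([0-9]+)")

logs = [
    "Error transaction XYZ failed for user 233",
    "Error transaction XYZ failed for user 567",
    "Error transaction ABC failed for user 233",
]

transaction_counts = Counter()
user_counts = Counter()
for line in logs:
    match = PATTERN.search(line)
    if match:
        transaction_counts[match.group(1)] += 1  # first capture group: transaction code
        user_counts[match.group(2)] += 1         # second capture group: user id

print(transaction_counts)  # Counter({'XYZ': 2, 'ABC': 1})
print(user_counts)         # Counter({'233': 2, '567': 1})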
There are other metrics (apart from Levenshtein) which might be more appropriate. Have you considered Cosine Similarity?
SimMetrics is an F/OSS library that offers an extensive collection of similarity algorithms and their corresponding cost functions.
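To illustrate the cosine-similarity idea in Python (using scikit-learn, which is an assumption on my part; SimMetrics itself is a Java library):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

logs = [
    "Error transaction XYZ failed for user 233",
    "Error transaction XYZ failed for user 567",
    "Connection timeout while calling payment service",
]

# Character n-grams make the comparison tolerant of IDs and other small variations
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))
vectors = vectorizer.fit_transform(logs)

similarity = cosine_similarity(vectors)
print(similarity.round(2))
# The first two messages score close to 1.0 and the third much lower,
# so a simple threshold can group near-duplicate errors.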
