Using MOJOs in H2O Steam Prediction Service Builder

In H2O's documentation for the Steam Prediction Service Builder, it says that the service builder can compile both H2O POJOs (.java files) and MOJOs (downloaded from H2O Flow, in my case as a .zip, version 3.10.5.2, which I have been using in the manner shown in the docs). However, submitting the MOJO .zip to the service builder gives this error:
Problem accessing /makewar. Reason:
Compilation of pojo failed exit value 1 warning: [options] bootstrap class path not set in conjunction with -source 1.6
error: Class names, 'drf_denials_v4.zip', are only accepted if annotation processing is explicitly requested
1 error
1 warning
So how can I use MOJO files in the service builder? Do I need to use the "exported" model file from H2O Flow rather than the "downloaded" zip file? The reason I need to use MOJOs rather than the .java POJOs is that my model is too large to fit in the POJO downloadable from H2O Flow.
UPDATE:
Trying to use the CLI with the command:
$ curl -X POST --form mojo=@drf_denials_v4.zip --form jar=@h2o-genmodel.jar localhost:55000/makewar > drf_denials_v4.war
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 106M 100 53.6M 100 52.7M 6748k 6632k 0:00:08 0:00:08 --:--:-- 229k
in the directory containing the relevant files, then running the command:
prediction-service-builder git:(master)$ java -jar jetty-runner-9.3.9.M1.jar --port 55001 ~/Documents/h2o_production/mojos/drf_denials_v4/drf_denials_v4.war
gives the output:
2017-09-21 12:33:58.226:INFO::main: Logging initialized @232ms
2017-09-21 12:33:58.234:INFO:oejr.Runner:main: Runner
2017-09-21 12:33:58.558:INFO:oejs.Server:main: jetty-9.3.9.M1
2017-09-21 12:33:59.557:WARN:oeja.AnnotationConfiguration:main: ServletContainerInitializers: detected. Class hierarchy: empty
2017-09-21 12:34:00.068 -1000 [main] INFO ServletUtil - modelNames size 1
2017-09-21 12:34:01.285 -1000 [main] INFO ServletUtil - added model drf_denials_v4 new size 1
2017-09-21 12:34:01.290 -1000 [main] INFO ServletUtil - added 1 models
2017-09-21 12:34:01.291:INFO:oejsh.ContextHandler:main: Started o.e.j.w.WebAppContext@4c75cab9{/,file:///tmp/jetty-0.0.0.0-55001-drf_denials_v4.war-_-any-39945022624149883.dir/webapp/,AVAILABLE}{file:///home/reedv/Documents/h2o_production/mojos/drf_denials_v4/drf_denials_v4.war}
2017-09-21 12:34:01.321:INFO:oejs.AbstractConnector:main: Started ServerConnector@176c9571{HTTP/1.1,[http/1.1]}{0.0.0.0:55001}
2017-09-21 12:34:01.322:INFO:oejs.Server:main: Started @3329ms
Going to localhost:55001 and trying to make a prediction, a prediction is given with a label, but there are no parameter input fields present, and I get this CLI error message:
2017-09-21 12:35:11.270:WARN:oejs.ServletHandler:qtp1531448569-12: Error for /info
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuffer.append(StringBuffer.java:270)
at java.io.StringWriter.write(StringWriter.java:101)
at java.io.StringWriter.append(StringWriter.java:143)
at java.io.StringWriter.append(StringWriter.java:41)
at com.google.gson.stream.JsonWriter.value(JsonWriter.java:519)
at com.google.gson.internal.bind.TypeAdapters$5.write(TypeAdapters.java:210)
at com.google.gson.internal.bind.TypeAdapters$5.write(TypeAdapters.java:194)
at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.write(TypeAdapterRuntimeTypeWrapper.java:68)
at com.google.gson.internal.bind.ArrayTypeAdapter.write(ArrayTypeAdapter.java:93)
at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.write(TypeAdapterRuntimeTypeWrapper.java:68)
at com.google.gson.internal.bind.ArrayTypeAdapter.write(ArrayTypeAdapter.java:93)
at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.write(TypeAdapterRuntimeTypeWrapper.java:68)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.write(ReflectiveTypeAdapterFactory.java:112)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.write(ReflectiveTypeAdapterFactory.java:239)
at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.write(TypeAdapterRuntimeTypeWrapper.java:68)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.write(ReflectiveTypeAdapterFactory.java:112)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.write(ReflectiveTypeAdapterFactory.java:239)
at com.google.gson.Gson.toJson(Gson.java:661)
at com.google.gson.Gson.toJson(Gson.java:640)
at com.google.gson.Gson.toJson(Gson.java:595)
at com.google.gson.Gson.toJson(Gson.java:575)
at InfoServlet.doGet(InfoServlet.java:59)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:837)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
The CLI POJO example works, but trying to use my MOJO zip does not.

Unfortunately, the UI has not been updated for MOJO functionality yet. You can, however, use the command line to build war files with MOJOs.
Run this from your command line:
curl -X POST --form mojo=@drf_denials_v4.zip --form jar=@h2o-genmodel.jar localhost:55000/makewar > example.war
Then run the war file in the normal way.
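For example, reusing the jetty-runner invocation from the question (the -Xmx4g heap flag is just an illustrative addition, since the /info request above died with a Java heap OutOfMemoryError; pick a size that suits your machine):
java -Xmx4g -jar jetty-runner-9.3.9.M1.jar --port 55001 example.war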
For more information see: https://github.com/h2oai/steam/tree/master/prediction-service-builder

Related

No tuples are emitted or transferred by topology in Storm UI

I am using StormCrawler 1.16 with Elasticsearch 7.2.0. I have built the project with the archetype.
The command I run to submit the topology:
storm jar target/stormcrawler-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --remote es-crawler.flux
I am getting this in the output:
Parsing file: /home/ubuntu/stormcrawler/es-crawler.flux
835 [main] INFO o.a.s.f.p.FluxParser - loading YAML from input stream...
841 [main] INFO o.a.s.f.p.FluxParser - Not performing property substitution.
841 [main] INFO o.a.s.f.p.FluxParser - Not performing environment variable substitution.
900 [main] INFO o.a.s.f.p.FluxParser - Loading includes from resource: /crawler-default.yaml
901 [main] INFO o.a.s.f.p.FluxParser - loading YAML from input stream...
903 [main] INFO o.a.s.f.p.FluxParser - Not performing property substitution.
903 [main] INFO o.a.s.f.p.FluxParser - Not performing environment variable substitution.
Configuration (interpreted):
Then the last output lines:
2014 [main] WARN o.a.s.u.Utils - STORM-VERSION new 1.2.3 old 1.2.3
2376 [main] INFO o.a.s.StormSubmitter - Finished submitting topology: crawler
But when I check this crawler topology in the Storm UI, the topology stats show that no tuples are emitted or transferred by this crawler topology.
I have attached a snapshot of the Storm UI at the link below. How can I solve this issue?
Your POM file is probably missing the storm-crawler-elasticsearch dependency.
You could compare your code with what is generated by the storm-crawler-elasticsearch-archetype, which should give you a working configuration.
Use the archetype for Elasticsearch with:
mvn archetype:generate -DarchetypeGroupId=com.digitalpebble.stormcrawler -DarchetypeArtifactId=storm-crawler-elasticsearch-archetype -DarchetypeVersion=LATEST
You'll be asked to enter a groupId (e.g. com.mycompany.crawler), an artefactId (e.g. stormcrawler), a version and a package name.
This will not only create a fully formed project containing a POM with the dependency above but also a set of resources, configuration files and a topology class. Enter the directory you just created (it should be the same as the artefactId you specified earlier) and follow the instructions in the README file.
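As a rough end-to-end sketch based on the commands already shown (the directory name and jar name are assumptions; use whatever artefactId and version you enter at the prompts):
mvn archetype:generate -DarchetypeGroupId=com.digitalpebble.stormcrawler -DarchetypeArtifactId=storm-crawler-elasticsearch-archetype -DarchetypeVersion=LATEST
cd stormcrawler                     # the artefactId entered at the prompt
mvn clean package                   # rebuilds the jar with the storm-crawler-elasticsearch dependency included
storm jar target/stormcrawler-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --remote es-crawler.flux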

Adding more memory to h2o.ai Steam Prediction Service Builder

I am trying to build a .war file in H2O Steam's Prediction Service Builder using a (~800MB) POJO file (a similar POJO of size ~200MB also produced these same problems). However, when trying this, an error appears after clicking 'build':
Problem accessing /makewar. Reason:
Compilation of pojo failed exit value 3 warning: [options] bootstrap class path not set in conjunction with -source 1.6
The system is out of resources.
Consult the following stack trace for details.
java.lang.OutOfMemoryError
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:255)
at com.sun.tools.javac.util.BaseFileManager.makeByteBuffer(BaseFileManager.java:302)
at com.sun.tools.javac.file.RegularFileObject.getCharContent(RegularFileObject.java:114)
at com.sun.tools.javac.file.RegularFileObject.getCharContent(RegularFileObject.java:53)
at com.sun.tools.javac.main.JavaCompiler.readSource(JavaCompiler.java:602)
at com.sun.tools.javac.main.JavaCompiler.parse(JavaCompiler.java:665)
at com.sun.tools.javac.main.JavaCompiler.parseFiles(JavaCompiler.java:950)
at com.sun.tools.javac.main.JavaCompiler.compile(JavaCompiler.java:857)
at com.sun.tools.javac.main.Main.compile(Main.java:523)
at com.sun.tools.javac.main.Main.compile(Main.java:381)
at com.sun.tools.javac.main.Main.compile(Main.java:370)
at com.sun.tools.javac.main.Main.compile(Main.java:361)
at com.sun.tools.javac.Main.compile(Main.java:56)
at com.sun.tools.javac.Main.main(Main.java:42)
I am launching the Prediction Service Builder from the command line following the instructions in the documentation. Is there a way to launch the Service Builder with more memory?
UPDATE
Using the command:
$ GRADLE_OPTS=-Xmx4g ./gradlew jettyRunWar
Trying to build a .war from the POJO returns this CLI error:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/reedv/.gradle/wrapper/dists/gradle-2.7-all/2glqtbnmvcq45bfjvhghri39p6/gradle-2.7/lib/gradle-core-2.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/reedv/Documents/h2o_production/h2o-steam/steam/prediction-service-builder/build/tmp/jettyRunWar/webapp/WEB-INF/lib/slf4j-simple-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
2017-09-21 15:22:48.084 -1000 [1222676357@qtp-1014435252-3] INFO MakeWarServlet - servletPath = /home/reedv/Documents/h2o_production/h2o-steam/steam/prediction-service-builder/build/tmp/jettyRunWar/webapp
2017-09-21 15:22:48.086 -1000 [1222676357@qtp-1014435252-3] INFO MakeWarServlet - tmpDir /tmp/makeWar316567921053262563022859149567148
2017-09-21 15:22:57.175 -1000 [1222676357@qtp-1014435252-3] INFO MakeWarServlet - added pojo model drf_denials_v4_v3-10-5-2.java
2017-09-21 15:22:57.190 -1000 [1222676357@qtp-1014435252-3] INFO MakeWarServlet - prejar null preclass null
2017-09-21 15:22:58.047 -1000 [1222676357@qtp-1014435252-3] INFO Util - warning: [options] bootstrap class path not set in conjunction with -source 1.6
2017-09-21 15:23:25.017 -1000 [1190941229@qtp-1014435252-0] INFO MakeWarServlet - tmpDir /tmp/makeWar432278342000106527922896081353600
2017-09-21 15:23:39.448 -1000 [1190941229@qtp-1014435252-0] INFO MakeWarServlet - added pojo model drf_denials_v4_v3-10-5-2.java
2017-09-21 15:23:39.569 -1000 [1190941229@qtp-1014435252-0] INFO MakeWarServlet - prejar null preclass null
2017-09-21 15:23:40.651 -1000 [1190941229@qtp-1014435252-0] INFO Util - warning: [options] bootstrap class path not set in conjunction with -source 1.6
2017-09-21 15:23:57.124 -1000 [1190941229@qtp-1014435252-0] INFO Util - OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000006efd00000, 1592786944, 0) failed; error='Cannot allocate memory' (errno=12)
2017-09-21 15:23:57.604 -1000 [1190941229@qtp-1014435252-0] INFO Util - #
2017-09-21 15:23:57.605 -1000 [1190941229@qtp-1014435252-0] INFO Util - # There is insufficient memory for the Java Runtime Environment to continue.
2017-09-21 15:23:57.616 -1000 [1190941229@qtp-1014435252-0] INFO Util - # Native memory allocation (mmap) failed to map 1592786944 bytes for committing reserved memory.
2017-09-21 15:23:57.619 -1000 [1190941229@qtp-1014435252-0] INFO Util - # An error report file with more information is saved as:
2017-09-21 15:23:57.622 -1000 [1190941229@qtp-1014435252-0] INFO Util - # /tmp/makeWar432278342000106527922896081353600/hs_err_pid32313.log
2017-09-21 15:23:57.747 -1000 [1190941229@qtp-1014435252-0] ERROR MakeWarServlet - doPost failed
java.lang.Exception: Compilation of pojo failed exit value 1 warning: [options] bootstrap class path not set in conjunction with -source 1.6
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000006efd00000, 1592786944, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 1592786944 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/makeWar432278342000106527922896081353600/hs_err_pid32313.log
at ai.h2o.servicebuilder.Util.runCmd(Util.java:162)
at ai.h2o.servicebuilder.MakeWarServlet.doPost(MakeWarServlet.java:151)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:440)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
2017-09-21 15:23:58.039 -1000 [1190941229@qtp-1014435252-0] ERROR MakeWarServlet - Compilation of pojo failed exit value 1 warning: [options] bootstrap class path not set in conjunction with -source 1.6
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000006efd00000, 1592786944, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 1592786944 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/makeWar432278342000106527922896081353600/hs_err_pid32313.log
Increasing the memory allocation value to Xmx6g or Xmx7g still gives this same error.
Furthermore, looking for the file /tmp/makeWar432278342000106527922896081353600/hs_err_pid32313.log that was supposedly created by this error, there does not seem to be a directory named "makeWar432278342000106527922896081353600" in my /tmp directory, so I'm not really sure where to look for it.
I'm guessing you started it with Gradle. In that case you can do GRADLE_OPTS=-Xmx4g ./gradlew jettyRunWar to start it with 4 GB of memory.
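Note that the failure in the update is a native allocation error (errno=12), so the machine also needs enough free RAM/swap for whatever heap you request; a hedged pre-flight check, with 4g only as an example value:
free -m                              # confirm enough free physical memory/swap is actually available
GRADLE_OPTS=-Xmx4g ./gradlew jettyRunWar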

GC overhead limit exceeded running background task in version 5.5

I am running SonarQube 5.5 with the following wrapper config settings.
wrapper.java.initmemory=3
wrapper.java.maxmemory=4096
I am still getting the following stack trace, this project has run successfully with sonarqube 5.3.
2016.05.09 11:14:09 INFO [o.s.s.c.s.ComputationStepExecutor] Compute coverage measures | time=105ms
2016.05.09 11:14:09 INFO [o.s.s.c.s.ComputationStepExecutor] Compute comment measures | time=120ms
2016.05.09 11:14:14 INFO [o.s.s.c.s.ComputationStepExecutor] Copy custom measures | time=5667ms
2016.05.09 11:14:15 INFO [o.s.s.c.s.ComputationStepExecutor] Compute duplication measures | time=424ms
2016.05.09 11:14:26 ERROR [o.s.s.c.c.ComputeEngineContainerImpl] Cleanup of container failed
java.lang.OutOfMemoryError: GC overhead limit exceeded
2016.05.09 11:14:26 ERROR [o.s.s.c.t.CeWorkerCallableImpl] Failed to execute task AVSWNiXkOySW07vtMalp
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOfRange(Arrays.java:3664) ~[na:1.8.0_45]
at java.lang.StringBuffer.toString(StringBuffer.java:671) ~[na:1.8.0_45]
at java.io.StringWriter.toString(StringWriter.java:210) ~[na:1.8.0_45]
at org.apache.commons.lang.Entities.escape(Entities.java:838) ~[commons-lang-2.6.jar:2.6]
at org.apache.commons.lang.StringEscapeUtils.escapeXml(StringEscapeUtils.java:620) ~[commons-lang-2.6.jar:2.6]
at org.sonar.server.computation.step.DuplicationDataMeasuresStep$DuplicationVisitor.appendDuplication(DuplicationDataMeasuresStep.java:129) ~[sonar-server-5.5.jar:na]
Memory adjustments must be made in sonar.properties:
sonar.web.javaOpts (for Web Server JVM)
sonar.ce.javaOpts (for Compute Engine JVM)
sonar.search.javaOpts (for JVM running ElasticSearch).
In your case the memory exception occurs in a background task so it relates to Compute Engine (see SonarQube architecture for more insight).
Settings in wrapper.conf are not relevant here and should be left untouched (hence the # DO NOT EDIT THE FOLLOWING SECTIONS warning in the file).
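For example, a hedged sketch of the Compute Engine line in conf/sonar.properties (the heap size is only an illustration; tune it to your workload):
sonar.ce.javaOpts=-Xmx2g -Xms512m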

Spark shell throwing exception after trying to integrate s3 / hadoop

I'm working on a Windows machine trying to set up a Spark test stack - the aim is to read/write files to an S3 bucket.
I'm running Spark 1.6.1. When I run spark-shell I now receive an error:
16/03/22 15:19:48 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/03/22 15:19:48 INFO HiveMetaStore.audit: ugi=Administrator ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/03/22 15:19:48 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
java.lang.RuntimeException: java.io.IOException: No FileSystem for scheme: s3n
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:204)
Doing some reading led me to believe that I need to add the AWS jars as an argument - the jars are included in the Hadoop directory structure.
I then run C:\Spark\hadoop\share\hadoop\tools\lib>spark-shell --jars aws-java-sdk-1.7.4.jar, hadoop-aws-2.7.1.jar
thinking that I'm now including the jars and so it must be OK... how foolish of me - I get the exact same error.
I then tried to include just the hadoop-aws jar, and all kinds of exceptions were thrown, including not being able to instantiate Hive, s3a couldn't be instantiated, AWSCredentials wasn't happy, and so on.
I'm at a bit of a loss, if anyone can shed some light on what I might be doing wrong I'll happily buy them a pint :)
EDIT:
I've since updated the core-site.xml file by removing the fs.defaultFS property with a value of s3n://mybucketname; Spark will now load.
In its stead I have hdfs://0.0.0.0:19000, which is working fine.
So I guess my question changes from 'gaaaaah' to 'gaaaaah, how does one include s3 correctly as a filesystem?'
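For what it's worth, a hedged sketch of what pointing spark-shell at s3n might look like (note there is no space after the comma in --jars; the credential properties are the standard Hadoop s3n ones, and the key values are placeholders):
spark-shell --jars aws-java-sdk-1.7.4.jar,hadoop-aws-2.7.1.jar --conf spark.hadoop.fs.s3n.awsAccessKeyId=YOUR_ACCESS_KEY --conf spark.hadoop.fs.s3n.awsSecretAccessKey=YOUR_SECRET_KEY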

Unable to initialize any output collector in CDH5.3

15/05/24 06:11:40 INFO mapreduce.Job: Task Id : attempt_1432456238397_0004_m_000000_0, Status : FAILED
Error: java.io.IOException: Unable to initialize any output collector
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:412)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:439)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
I am using the CDH 5.3 Cloudera quickstart, and I wrote a MapReduce program. When I run it from the shell I get the above exception.
Can anyone please help me with how to resolve this?
The error "Unable to initialize any output collector" indicates that the job failed to start the container's, there can be multiple reasons for the same. However, one must review the container logs at hdfs to identify the cause the error.
In this specific instance, the value of mapreduce.task.io.sort.mb value was entered greater than 2047 MB, however the maximum value which it allows is 2047 MB, thus anything above its causes the jobs to fail marking the value provided as Invalid.
Solution:
Set the value of mapreduce.task.io.sort.mb to less than 2048 MB.
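For instance, if the job's driver uses ToolRunner/GenericOptionsParser, the value can be lowered per job from the command line (the jar, class and path names below are placeholders):
hadoop jar my-job.jar com.example.MyDriver -D mapreduce.task.io.sort.mb=512 /input /output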
Reference:
https://support.pivotal.io/hc/en-us/articles/205649987-Map-Reduce-job-failed-with-Unable-to-initialize-any-output-collector-
CDH5.2: MR, Unable to initialize any output collector
https://community.cloudera.com/t5/Storage-Random-Access-HDFS/HBase-MapReduce-Job-Error-java-io-IOException-Unable-to/td-p/23786
