Error starting exported model from AutoML Vision - google-cloud-automl

I trained an AutoML Vision Edge model and exported it as a TensorFlow package. I then tried to run it using the 'gcr.io/automl-vision-ondevice/gcloud-container-1.12.0' image:
docker run --rm --name ${CONTAINER_NAME} -p ${PORT}:8501 -v ${MODEL_PATH}:/tmp/mounted_model/0001 -t ${CPU_DOCKER_GCR_PATH}
This is the output:
2020-03-24 18:49:11.574773: I tensorflow_serving/model_servers/server.cc:82] Building single TensorFlow model file config: model_name: default model_base_path: /tmp/mounted_model/
2020-03-24 18:49:11.576100: I tensorflow_serving/model_servers/server_core.cc:462] Adding/updating models.
2020-03-24 18:49:11.576174: I tensorflow_serving/model_servers/server_core.cc:559] (Re-)adding model: default
2020-03-24 18:49:11.676338: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: default version: 1}
2020-03-24 18:49:11.676387: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: default version: 1}
2020-03-24 18:49:11.676457: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: default version: 1}
2020-03-24 18:49:11.676491: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /tmp/mounted_model/0001
2020-03-24 18:49:11.676551: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /tmp/mounted_model/0001
2020-03-24 18:49:11.713626: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2020-03-24 18:49:11.748933: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-03-24 18:49:11.821336: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:310] SavedModel load for tags { serve }; Status: fail. Took 144731 microseconds.
2020-03-24 18:49:11.821400: E tensorflow_serving/util/retrier.cc:37] Loading servable: {name: default version: 1} failed: Not found: Op type not registered 'FusedBatchNormV3' in binary running on 2f729ee881b6. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
It seems that the error is "failed: Not found: Op type not registered 'FusedBatchNormV3'"
The model is a standard exported AutoML Vision model that I never touched. It works fine when served by the Google AutoML Vision deployment, but I want to run it myself. Any help?
Best
André

The error message "failed: Not found: Op type not registered 'FusedBatchNormV3'" is indeed symptomatic of a conflict in runtime versions used for model training and deployment.
The issue lies with the (not configurable) runtime version used by the Console when creating a model version.
The workaround is to train and deploy your model exclusively through the CLI.
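For reference, once a compatible servable does load in the local container, it can be queried over the standard TensorFlow Serving REST API on the port you mapped (8501 inside the container). A rough sketch of such a request follows; the image_bytes/key payload shape is an assumption based on the Edge container examples and may need adjusting to your model's signature:
# Encode an image and POST it to the TF Serving predict endpoint of the "default" model
# (hypothetical file name test_image.jpg; assumes the container was started with -p ${PORT}:8501)
curl -s -X POST "http://localhost:${PORT}/v1/models/default:predict" \
  -H "Content-Type: application/json" \
  -d "{\"instances\": [{\"image_bytes\": {\"b64\": \"$(base64 -w0 test_image.jpg)\"}, \"key\": \"1\"}]}"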

Related

Unknown processors type "resourcedetection" for "resourcedetection"

Running the OpenTelemetry Collector with image ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector:0.58.0
In config.yaml I have,
processors:
  batch:
  resourcedetection:
    detectors: [ env ]
    timeout: 2s
    override: false
The collector is deployed as a sidecar but it keeps failing with
collector server run finished with error: failed to get config: cannot unmarshal the configuration: unknown processors type "resourcedetection" for "resourcedetection" (valid values: [resource span probabilistic_sampler filter batch memory_limiter attributes])
Any idea as to what is causing this? I haven't found any relevant documentation/question
The Resource Detection Processor is part of the upstream otelcol-contrib distribution, so you would need to use otel/opentelemetry-collector-contrib:0.58.0 (or the equivalent on your container registry of choice) for this processor to be available in your collector.
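For completeness, a minimal sketch of a config.yaml that would run on the contrib image; the otlp receiver and logging exporter here are placeholders, and the key point is that resourcedetection must also be referenced in a pipeline under service.pipelines:
receivers:
  otlp:
    protocols:
      grpc:
processors:
  batch:
  resourcedetection:
    detectors: [ env ]
    timeout: 2s
    override: false
exporters:
  logging:
service:
  pipelines:
    traces:
      receivers: [ otlp ]
      processors: [ resourcedetection, batch ]
      exporters: [ logging ]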

BOOST ERROR: Failed to retrieve archive table name

I have installed Cacti 1.2.17 on Ubuntu 20.04 using the Docker container, following the instructions, and have given Cacti some additional resources. I am seeing the following two problems:
Loading any leaf section is very slow; it takes almost 20 seconds to load the graphs when clicked.
I am getting the boost error "BOOST ERROR: Failed to retrieve archive table name".
I have attached a screenshot of the error along with the Cacti configuration:
innodb_file_format=Barracuda
innodb_large_prefix=1
collation-server=utf8mb4_unicode_ci
character-set-server=utf8mb4
innodb_doublewrite=ON
max_heap_table_size=1G
tmp_table_size=1G
join_buffer_size=1G
innodb_buffer_pool_size=3G
innodb_flush_log_at_timeout=3
innodb_read_io_threads=32
innodb_write_io_threads=16
innodb_io_capacity=5000
innodb_io_capacity_max=10000
innodb_buffer_pool_instances=9
Screenshot of the error: https://i.stack.imgur.com/ekdak.jpg

How can I create several executors for a job in Circle CI orb?

NOTE: The actual problem I am trying to solve is run testcontainers in Circle CI.
To make it reusable, I decided to extend the existing orb in my organisation.
The question is: how can I create several executors for a job? I was able to create the executor itself.
Executor ubuntu.yml:
description: >
  The executor to run testcontainers without extra setup in Circle CI builds.
parameters:
  # https://circleci.com/docs/2.0/configuration-reference/#resource_class
  resource-class:
    type: enum
    default: medium
    enum: [medium, large, xlarge, 2xlarge]
  tag:
    type: string
    default: ubuntu-2004:202010-01
resource_class: <<parameters.resource-class>>
machine:
  image: <<parameters.tag>>
One of the jobs:
parameters:
  executor:
    type: executor
    default: openjdk
  resource-class:
    type: enum
    default: medium
    enum: [small, medium, medium+, large, xlarge]
executor: << parameters.executor >>
resource_class: << parameters.resource-class >>
environment:
  # Customize the JVM maximum heap limit
  MAVEN_OPTS: -Xmx3200m
steps:
  # Instead of checking out code, just grab it the way it is
  - attach_workspace:
      at: .
  # Guessing this is still necessary (we only attach the project folder)
  - configure-maven-settings
  - cloudwheel/fetch-and-update-maven-cache
  - run:
      name: "Deploy to Nexus without running tests"
      command: mvn clean deploy -DskipTests
I couldn't find a good example of adding several executors, and I assume that I will need to add ubuntu and openjdk for every job. Am I right?
I keep looking into other orbs and documentation but cannot find a case similar to mine.
As stated in the CircleCI documentation, executors can be defined like this:
executors:
  my-executor:
    machine: true
  my-openjdk:
    docker:
      - image: openjdk:11
Side note: there can be many executors of any type, such as docker, machine (Linux), macos, or win.
See the Stack Overflow question on how to invoke executors from CircleCI orbs.
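To illustrate, here is a rough sketch of an orb that declares both executors and lets each job pick one via an executor parameter; the names (ubuntu, openjdk, deploy-to-nexus, my-orb) are assumptions taken from the question, not a verified orb:
executors:
  openjdk:
    docker:
      - image: openjdk:11
  ubuntu:
    machine:
      image: ubuntu-2004:202010-01
jobs:
  deploy-to-nexus:
    parameters:
      executor:
        type: executor
        default: openjdk
    executor: << parameters.executor >>
    steps:
      - run: mvn clean deploy -DskipTests
A consuming config can then select the machine executor only where testcontainers needs it, e.g. by passing executor: my-orb/ubuntu to my-orb/deploy-to-nexus in the workflow, which avoids duplicating the job per executor.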

Is anyone having problems deploying a Lambda function with Serverless?

I am getting this error when trying to deploy a simple function.
Serverless Error ---------------------------------------
ServerlessError: Inaccessible host: `cloudformation.us-west-2.amazonaws.com'. This service may not be available in the `us-west-2' region.
Get Support --------------------------------------------
Docs: docs.serverless.com
Bugs: github.com/serverless/serverless/issues
Issues: forum.serverless.com
Your Environment Information -----------------------------
OS: linux
Node Version: 10.15.3
Serverless Version: 1.38.0
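For reference, the region the framework is trying to reach comes from the provider block of serverless.yml (or its default). A minimal sketch, with hypothetical values rather than your actual config:
service: my-function          # hypothetical service name
provider:
  name: aws
  runtime: nodejs10.x         # matches the Node 10 environment above
  region: us-west-2           # the region from the error message
functions:
  hello:
    handler: handler.hello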

Flaky Assert "java.lang.AssertionError: Can't unlock: Not locked!"

While trying to build an H2O Random Forest via the Python API, I got a flaky error on two of the three times I tried:
"java.lang.AssertionError: Can't unlock: Not locked!"
One failure showed progress up to 86% and the other up to 90%. On the third try, I got all the way through. I restarted the H2O server after the first failure. (I've saved the .err file, which is empty, and the .out file in case somebody wants to look at them.)
Running on x86_64 GNU/Linux, Linux 4.4.0-101-generic (Ubuntu), H2O 3.18.0.4, Python 2.7.12
There are about 33K training-set examples for a multinomial classifier with about 86 classes and 137 numeric input features. Previously, we had no problems (other than some timeout issues) running the system on similar data with much older versions of H2O.
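For context, the training call is roughly of this shape; this is a minimal sketch with hypothetical file and column names and default parameters, not the actual ProvisionClassifier code:
import h2o
from h2o.estimators.random_forest import H2ORandomForestEstimator

h2o.init()  # starts or connects to a local H2O server, as in the log below

# ~33K rows, 137 numeric features, one multinomial target with ~86 classes (names are hypothetical)
train = h2o.import_file("training_data.csv")
train["label"] = train["label"].asfactor()   # make the target categorical so DRF trains a classifier
features = [c for c in train.columns if c != "label"]

drf = H2ORandomForestEstimator()             # default settings; the real run may differ
drf.train(x=features, y="label", training_frame=train)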
Here's the output to stdout from running my program.
Attempting to start a local H2O server...
Java Version: java version "1.8.0_144"; Java(TM) SE Runtime Environment (build 1.8.0_144-b01); Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
Starting server from /home/ubuntu/django-env/lib/python2.7/site-packages/h2o/backend/bin/h2o.jar
Ice root: /tmp/tmpViqbs4
JVM stdout: /tmp/tmpViqbs4/h2o_ubuntu_started_from_python.out
JVM stderr: /tmp/tmpViqbs4/h2o_ubuntu_started_from_python.err
Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321... successful.
-------------------------- ----------------------------------------
H2O cluster uptime: 08 secs
H2O cluster timezone: Etc/UTC
H2O data parsing timezone: UTC
H2O cluster version: 3.18.0.4
H2O cluster version age: 24 days
H2O cluster name: H2O_from_python_ubuntu_x4p9wv
H2O cluster total nodes: 1
H2O cluster free memory: 6.545 Gb
H2O cluster total cores: 8
H2O cluster allowed cores: 8
H2O cluster status: accepting new members, healthy
H2O connection url: http://127.0.0.1:54321
H2O connection proxy:
H2O internal security: False
H2O API Extensions: XGBoost, Algos, AutoML, Core V3, Core V4
Python version: 2.7.12 final
-------------------------- ----------------------------------------
[stuff omitted]
Parse progress: |█████████████████████████████████████████████████████████████████████████████| 100%
Running trainh2o()..
drf Model Build progress: |█████████████████████████████████████████████████████████▉ (failed)| 86%
Traceback (most recent call last):
File "/home/ubuntu/XXX/webapp/XXX/classify_unified/buildpc.py", line 130, in <module>
print("\nTrain Error: {}".format(pc.train()))
File "/home/ubuntu/XXX/webapp/XXX/classify_unified/ProvisionClassifier.py", line 2130, in train
self._trainh2o()
File "/home/ubuntu/XXX/webapp/XXX/classify_unified/ProvisionClassifier.py", line 2090, in _trainh2o
training_frame=self._combo_h2odf)
File "/home/ubuntu/django-env/local/lib/python2.7/site-packages/h2o/estimators/estimator_base.py", line 232, in train
model.poll(verbose_model_scoring_history=verbose)
File "/home/ubuntu/django-env/local/lib/python2.7/site-packages/h2o/job.py", line 77, in poll
"\n{}".format(self.job_key, self.exception, self.job["stacktrace"]))
EnvironmentError: Job with key $03017f00000132d4ffffffff$_8f9d9edcff82420eefea1d6cff0f4396 failed with an exception: java.lang.AssertionError: Can't unlock: Not locked!
stacktrace:
java.lang.AssertionError: Can't unlock: Not locked!
at water.Lockable$Unlock.atomic(Lockable.java:197)
at water.Lockable$Unlock.atomic(Lockable.java:187)
at water.TAtomic.atomic(TAtomic.java:17)
at water.Atomic.compute2(Atomic.java:56)
at water.Atomic.fork(Atomic.java:39)
at water.Atomic.invoke(Atomic.java:31)
at water.Lockable.unlock(Lockable.java:181)
at water.Lockable.unlock(Lockable.java:176)
at hex.tree.SharedTree$Driver.computeImpl(SharedTree.java:358)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:206)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1263)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
H2O session _sid_9f4c closed.
