Ruby Stack failed to deploy on Google Developers Console - ruby

I tried to deploy Ruby stack using Google Developers Console, but no success. I tried several times at other project, error was always the same (below).
Do you have any idea why it keeps failing?
2014/10/23 15:59:44
rubyStackBox: PENDING
2014/10/23 15:59:55~2014/10/23 16:06:01
rubyStackBox: DEPLOYING
2014/10/23 16:06:11
rubyStackBox: DEPLOYMENT_FAILED
Replica rubystackbox-eaeo failed with status PERMANENTLY_FAILING: Replica State changed to PERMANENTLY_FAILING. Replica was unhealthy 2 consecutive times.

I replicated the issue you experienced several times and it also failed. What finally worked was playing with the zones/regions when deploying the ruby stack :
Developers console > Click-to-deploy > Set MySQL password > Advanced Options, choose a different zone and click Deploy.
Another useful tool when investigating this is Console Output. Even if the deployment fails, you can go to the VM instance and check View Output towards the bottom of the page. It will list all the packages and any errors encountered. The following command will achieve the same thing:
$ gcloud compute instances get-serial-port-output <INSTANCE_NAME> --project <PROJECT_ID> --zone <ZONE_NAME>
Please advise if still seeing issues.

Related

How to troubleshoot DDEV DB container healthcheck timeout

When i want to start my DDEV Project an Container stucks at creating
Container ddev-oszimt-lf12a-v2-db Started
Error Message:
Failed waiting for web/db containers to become ready: db container failed: log=, err=health check timed out after 2m0s: labels map[com.ddev.site-name:oszimt-lf12a-v2 com.docker.compose.service:db] timed out without becoming healthy, status=
Its an Error i also had with some other projects.
In the Error Log is no information about this.
What could the Problem be and how do i fix it?
This isn't a very good place to study problems with specific projects, our Discord Channel is much better, or the DDEV issue queue.
But I'll try to give you some ideas about how to study and debug this.
Go to the Troubleshooting section of the docs. Work through it step-by-step.
As it says there, try the simplest possible project and see what the results are.
If the problem is particular to one particular project, see if you can remove customizations like .ddev/docker-compose.*.yaml files and config.*.yaml and non-standard things in the config.yaml file.
To find out what the causes the healthcheck timeout, see the docs on this exact problem, in your case the db container is timing out. So first, ddev logs -s db to see if something happened, and second docker inspect --format "{{json .State.Health }}" ddev-<projectname>-db.
For more help, you'll need to provide more information with things like your OS, Docker Provider, etc, and the easiest way to do that is to run ddev debug test and capture the output and put it in a gist on gist.github.com, then come over to discord with a link to that.

Updating service [default] (this may take several minutes)...failed

This used to be working perfectly until a couple of days back exactly 4 days back. When i run gcloud app deploy now it complete the build and then straight after completing the build it hangs on Updating Service
Here is the output:
Updating service [default] (this may take several minutes)...failed.
ERROR: (gcloud.app.deploy) Error Response: [13] Flex operation projects/just-sleek/regions/us-central1/operations/8260bef8-b882-4313-bf97-efff8d603c5f error [INTERNAL]: An internal error occurred while processing task /appengine-flex-v1/insert_flex_deployment/flex_create_resources>2020-05-26T05:20:44.032Z4316.jc.11: Deployment Manager operation just-sleek/operation-1590470444486-5a68641de8da1-5dfcfe5c-b041c398 errors: [
code: "RESOURCE_ERROR"
location: "/deployments/aef-default-20200526t070946/resources/aef-default-20200526t070946"
message: {
\"ResourceType\":\"compute.beta.regionAutoscaler\",
\"ResourceErrorCode\":\"403\",
\"ResourceErrorMessage\":{
\"code\":403,
\"errors\":[{
\"domain\":\"usageLimits\",
\"message\":\"Exceeded limit \'QUOTA_FOR_INSTANCES\' on resource \'aef-default-20200526t070946\'. Limit: 8.0\",
\"reason\":\"limitExceeded\"
}],
\"message\":\"Exceeded limit \'QUOTA_FOR_INSTANCES\' on resource \'aef-default-20200526t070946\'. Limit: 8.0\",
\"statusMessage\":\"Forbidden\",
\"requestPath\":\"https://compute.googleapis.com/compute/beta/projects/just-sleek/regions/us-central1/autoscalers\",
\"httpMethod\":\"POST\"
}
}"]
I tried the following the ways to resolve the error:
I deleted all my previous version and left the running version
I ran gcloud components update still fails.
I create a new project, changed the region from [REGION1] to [REGION2] and deployed and m still getting the same error.
Also ran gcloud app deploy --verbosity=debug, does not give me any different result
I have no clue what is causing this issue and how to solve it please assist.
Google is already aware of this issue and it is currently being investigated.
There is a Public Issue Tracker, you may 'star' and follow so that you can receive any further updates on this. In addition, you may see some workarounds posted that could be performed temporarily if agreed with your preferences.
Currently, there is no ETA yet for the resolution but there will be an update provided as soon as the team progresses on the issue.
I resolved this by adding this into my app.yaml
automatic_scaling:
min_num_instances: 1
max_num_instances: 7
I found the solution here:
https://issuetracker.google.com/issues/157449521
And I was also redirected to:
gcloud app deploy - updating service default fails with code 13 Quota for instances limit exceeded, and 401 unathorizeed

Google Cloud Data flow jobs failing with error 'Failed to retrieve staged files: failed to retrieve worker in 3 attempts: bad MD5...'

SDK: Apache Beam SDK for Go 0.5.0
We are running Apache Beam Go SDK jobs in Google Cloud Data Flow. They had been working fine until recently when they intermittently stopped working (no changes made to code or config). The error that occurs is:
Failed to retrieve staged files: failed to retrieve worker in 3 attempts: bad MD5 for /var/opt/google/staged/worker: ..., want ; bad MD5 for /var/opt/google/staged/worker: ..., want ;
(Note: It seems as if it's missing a second hash value in the error message message.)
As best I can guess there's something wrong with the worker - It seems to be trying to compare md5 hashes of the worker and missing one of the values? I don't know exactly what it's comparing to though.
Does anybody know what could be causing this issue?
The fix to this issue seems to have been to rebuild the worker_harness_container_image with the latest changes. I had tried this but I didn't have the latest release when I built it locally. After I pulled the latest from the Beam repo, and rebuilt the image (As per the notes here https://github.com/apache/beam/blob/master/sdks/CONTAINERS.md) and reran it seemed to work again.
I'm seeing the same thing. If I look into the Stackdriver logging I see this:
Handler for GET /v1.27/images/apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515/json returned error: No such image: apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
However, I can pull the image just fine locally. Any ideas why Dataflow cannot pull.

H2O Steam deploy can't connect to Prediction Service Builder

I am trying to use h2o steam (running on localhost) to deploy a model. After importing the model from h2o flow, clicking the "deploy model" option in the "models" section of the project, filling out the resulting dialog box, and clicking the "deploy" button, the following messages are displayed:
At first I thought that it was because maybe I needed to start up the service builder on my own, so I started it up following the docs here, but still got the same error. Any suggestions would be appreciated. Thanks :)
Just make sure jetty HTTP server is running locally by executing the following in your shell:
java -jar var/master/assets/jetty-runner.jar var/master/assets/ROOT.war
Looking here, it seems like I would need to "override" some kind of default browser restriction for accessing localhost:8080 (which is what I assume steam is trying to do to launch the service builder (I don't know much about networking related stuff)). I got around this by launching steam with the command:
$ ./steam serve master --prediction-service-host=localhost --prediction-service-port-range=12345:22345
where the ports are some arbitrary range between (1025, 65535) which I got by word-searching the a page of the steam source code (line 182 as of the date of this posting).
Doing this lets me deploy the models through the steam dialog without any error messages. Again, I don't know much about networking related stuff, so if anyone has a better way to solve this problem (ie. allow access of localhost:8080) please post or comment. Thanks.

Mesos framework stays inactive due to "Authentication failed: EOF"

I'm currently trying to deploy Eremetic (version 0.28.0) on top of Marathon using the configuration provided as an example. I actually have been able to deploy it once, but suddenly, after trying to redeploy it, the framework stays inactive.
By inspecting the logs I noticed a constant attempt to connect to some service that apparently never succeeds because of some authentication problem.
2017/08/14 12:30:45 Connected to [REDACTED_MESOS_MASTER_ADDRESS]
2017/08/14 12:30:45 Authentication failed: EOF
It looks like the service returning an error is ZooKeeper and more precisely it looks like the error can be traced back to this line in the Go ZooKeeper library. ZooKeeper however seems to work: I've tried to query it directly with zkCli and to run a small Spark job (where the Mesos master is given with zk:// URL) and everything seems to work.
Unfortunately I'm not able to diagnose the problem further, what could it be?
It turned out to be a configuration problem. The master URL was simply wrong and this is how the error was reported.

Resources