I am working on clustering Consul for service discovery. I have 3 physical machines to form the cluster, and I followed this tutorial to design it. At the end of the day, this is the result of consul members:
Strangely, I don't see a leader election or anything like it.
I also get an error when connecting to the Consul UI, which I enabled with ui in the configuration. The error says:
The Backend responded with an error, Error 500
You may have visited a URL that is loading an unknown resource, so you can try going back to the root or try re-submitting your ACL Token/SecretID by going back to ACLs.
Then I realized it may be important to configure ACLs, so I added these lines to my configuration file:
acl {
  enabled = true
  default_policy = "deny"
  down_policy = "extend-cache"
}
After this, the consul members command said:
Error retrieving members: Unexpected response code: 403 (ACL not found)
and the consul acl bootstrap command said:
Failed ACL bootstrapping: Unexpected response code: 500 (The ACL system is currently in legacy mode.)
It is worth mentioning that all of the server configuration comes from this tutorial.
So what's going on here?
I have set up a Laravel application hosted in Docker, which in turn runs on an AWS ECS cluster behind an ALB.
So far the application is up and running as expected (e.g. sessions are stored in memcached and working, static assets are in an S3 bucket, etc.).
Right now I have just one problem, with stability, and I am not quite sure where exactly the problem is. When I hit my URL / website, it sometimes (randomly) returns a 502/503 HTTP error. When this happens I have to wait a minute or two before the app returns HTTP 200 again.
Here's the result of running tail on my Docker container (i.e. the nginx log):
At this point I am totally lost and not sure where else I should check. I've tried the following:
Run it locally, with the same docker / nginx >> works just fine.
Run it without ALB (i.e. Using just 1 EC2) >> having similar problem.
Run it using ALB on 2 different EC2 types (i.e. t2.small and micro) >> both having similar problem.
Run it using ALB on just 1 EC2 >> having similar problem.
According to your logs, nginx is answering 401 Unauthorized to the ALB health check request. Either the / endpoint has to answer 200 OK, or you have to configure a different health check path, such as /ping, in your ALB target group.
To check the health of your targets using the console
Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
On the navigation pane, under LOAD BALANCING, choose Target Groups.
Select the target group.
On the Targets tab, the Status column indicates the status of each target.
If the status is any value other than Healthy, view the tooltip for more information.
More info: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/target-group-health-checks.html
I have had a similar issue in the past, for one of a couple of possible reasons:
Health checks configured for the ALB, e.g. the ALB is waiting for the configured number of checks to go green (e.g. hit an endpoint every 30 seconds and wait for a 200 on 4 out of 5 attempts). During the "unhealthy phase" the instance may be designated offline. This happens most often immediately after a restart or deployment, or if an instance goes unhealthy.
DNS within NGINX. If the DNS records of the downstream service that NGINX is proxying have changed, NGINX may have cached the old record (either according to the TTL or for much longer, depending on your configuration) and is therefore unable to connect to the downstream.
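The health-check behaviour described in the first point can be sketched as a small state machine. This is only an illustration of consecutive-result thresholds, not the ALB's actual implementation; the class name TargetHealth and the default threshold values are assumptions made for the example.

```python
class TargetHealth:
    """Illustrative model of a target's health based on consecutive check results."""

    def __init__(self, healthy_threshold=5, unhealthy_threshold=2):
        # Assumed thresholds; the real values come from the target group settings.
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = False  # a freshly registered target starts out of service
        self._successes = 0
        self._failures = 0

    def record(self, status_code, expected=200):
        """Feed one health-check response code and return the resulting state."""
        if status_code == expected:
            self._successes += 1
            self._failures = 0
            if self._successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0
            if self._failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy
```

In this model a target that answers 401 to every check never becomes healthy, and a target that has just restarted stays out of service until it has produced enough consecutive successes, which would match the "wait a minute or two" symptom.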
To help fully debug, it might be worth determining whether the 502/503 is coming from the ALB or from NGINX. You might be able to determine this from the access log of the ALB or the /var/log/nginx/access|error.log in the container.
It may also help to check whether the response had a body.
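As a rough sketch of that check, the Server response header is one thing to look at. The assumption here is that responses generated by the ALB itself carry a Server value starting with "awselb", while responses passed through from the container carry one starting with "nginx"; verify this against your own responses before relying on it. The function name guess_error_origin is hypothetical.

```python
def guess_error_origin(headers):
    """Guess which component produced an error response from its headers.

    Assumption: ALB-generated responses have 'Server: awselb/...' and
    nginx-generated ones have 'Server: nginx/...'.
    """
    server = ""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "server":
            server = value.lower()
    if server.startswith("awselb"):
        return "alb"
    if server.startswith("nginx"):
        return "nginx"
    return "unknown"
```

For example, guess_error_origin({"Server": "awselb/2.0"}) returns "alb", which would point at the load balancer (e.g. no healthy targets) rather than at nginx.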
We have a setup with Spring Cloud Gateway running with consul service discovery and proxying requests to services in a cluster.
When one of these services responds with a Location: / header, the header gets rewritten on the way out through the gateway.
The problem is that the gateway seems to add the service's local hostname and port as found in Consul. This URL is of course not available (or desirable) for the client.
I have verified that the upstream server sends:
Location: /
(Generated by the "redirect: /" Spring MVC shorthand)
But by the time it reaches the end client, it has been rewritten to:
Location: https://10.0.0.10:34567/
(https://10.0.0.10:34567/ being the upstream location of the service in consul)
This is of course incorrect.
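The rewritten value is what you get when the relative Location is resolved against the upstream request URL instead of the public one. The sketch below only illustrates that resolution step; it is an assumption about the mechanism, not a description of what Spring Cloud Gateway actually does internally. The path /login and the host app.example.com are hypothetical.

```python
from urllib.parse import urljoin


def resolve_location(request_url, location):
    """Resolve a (possibly relative) Location header against a request URL."""
    return urljoin(request_url, location)


# Resolved against the upstream address registered in Consul:
upstream = resolve_location("https://10.0.0.10:34567/login", "/")
# Resolved against the address the client actually used (hypothetical host):
public = resolve_location("https://app.example.com/login", "/")
```

Here upstream is "https://10.0.0.10:34567/" and public is "https://app.example.com/", so whichever component makes the header absolute would need to use the client-facing URL as its base, or leave the relative value untouched.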
My problem is that I can't find any documentation on how to configure this and no indication of what classes are used (to debug) so I just don't know where to start looking for the fix.
The desired behaviour is of course to just leave the redirect unchanged.
In this particular case we use a host based routing setup:
.route("app", r -> r.host("app.**").uri("lb://app"))
Any help or hint appreciated.
I'm trying to install Oracle on an AWS Red Hat instance, following the steps given at this URL: http://www.davidghedini.com/pg/entry/install_oracle_11g_xe_on
When I run the configure command as follows:
/etc/init.d/oracle-xe configure
It gives the following error:
Database Configuration failed. Look into
/u01/app/oracle/product/11.2.0/xe/config/log for details
When I check the log files, they show the following error:
ORA-01034: ORACLE not available Process ID: 0 Session ID: 0 Serial
number: 0
It seems to be an issue specific to the AWS cloud instance.
Is it because of swap memory?
Or is it because of a port issue?
I'm using a micro instance.
How can I get past this?
This might be an EC2 security group issue: the installer may need outbound network access on some port (a license check, maybe?).
If your EC2 instance is very tightly locked down, you could test whether it's a security group issue by adding a new outbound security group rule that allows all TCP traffic out to anywhere on the internet (0.0.0.0/0).
For example, the install might be trying to hit a remote licensing server endpoint via HTTP or HTTPS while your security group doesn't allow that traffic out.
Perhaps there's a 'verbose' flag you can run the installer with that gives more info about what it's failing on? HTH
https://github.com/GoogleCloudPlatform/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch
When I use these configs to create the cluster, it gets deleted automatically.
According to https://github.com/kubernetes/kubernetes/issues/11435 the solution is to remove
kubernetes.io/cluster-service: "true"
However, without this label Elasticsearch is not available through the Kubernetes master.
Should I create a pull request to remove the line from the files in the repo so people don't get confused?
Firstly, I'd recommend formatting future questions so they adhere to the Stack Overflow guidelines: https://stackoverflow.com/help/how-to-ask.
I'd recommend making Elasticsearch a normal Kubernetes Service. You can expose it in one of the following ways:
1. Set service.Type = NodePort and access it via any public ip of node:nodePort
2. Set service.Type = LoadBalancer; this will only work on cloud providers that have load balancers
3. Expose the RC directly through a host port (not recommended)
Those are just the common options for accessing a Service. Please see the following thread for a more detailed discussion: https://groups.google.com/forum/#!topic/kubernetes-sig-network/B-A_RuqpFWk
It's generally not a good idea to send all external traffic meant for a Kubernetes service through the apiserver. However, if you must do so, you can via an endpoint such as:
/api/v1/proxy/namespaces/default/services/nginx:80/
Where default is the namespace, nginx is the name of your service and 80 is the service port (needed to disambiguate multiport services).
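As a small illustration, the path above can be assembled from its three parts. This is just string construction following the pattern shown in this answer; the helper name service_proxy_path is hypothetical, and the path layout may differ in other Kubernetes versions.

```python
from urllib.parse import quote


def service_proxy_path(namespace, service, port):
    """Build the apiserver proxy path for a service, following the pattern above."""
    # quote() is a no-op for ordinary Kubernetes names; it only guards odd input.
    return "/api/v1/proxy/namespaces/{}/services/{}:{}/".format(
        quote(namespace, safe=""), quote(service, safe=""), int(port)
    )
```

Calling service_proxy_path("default", "nginx", 80) gives back the example path shown above.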
I am running a performance test in an AWS environment using JMeter. We have a cluster with auto scaling enabled and memcache session failover JARs. We are using JMeter in master/slave mode, so we don't get the response data in the JTL file. The response code returned after 45 minutes of test duration:
Response code: 403
Response message: Forbidden
How can I resolve the issue?
After researching more, I found that the cause can be the memcache session failover JARs. I have upgraded the JARs to version 1.6.5 but am still facing the same problem.
Are you using an ELB? If so, read here: http://community.blazemeter.com/knowledgebase/articles/94060-testing-amazon-elbs
It looks like you are using an ELB. An ELB has a CNAME attached to it. AWS changes the IP attached to the CNAME. This happens quite often.
When your test starts, JMeter does a DNS lookup for the ELB CNAME. The response is then cached. From this point onwards, the test sends traffic to the IP address that was in the response that is now cached.
The result is that at some point (after the IP changed) you are testing an old IP that can now belong to a different server or belong to NO server. This is probably why you are getting the 403.
To resolve this, you need to set the cache TTL to 0 (zero). This instructs JMeter to NOT cache the DNS lookup response and to always do the lookup again (which is more realistic in any case). You should add the following to your JMeter command line: -Dsun.net.inetaddr.ttl=0.
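The effect of the TTL can be sketched with a toy resolver cache. This simulates the idea only and does not reproduce the JVM's resolver; the class CachingResolver, the injected lookup function and the hostname elb.example.com are all hypothetical.

```python
import time


class CachingResolver:
    """Toy DNS cache: reuses an answer until its TTL has elapsed."""

    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup      # function hostname -> address (injected for the demo)
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}           # hostname -> (address, expiry time)

    def resolve(self, hostname):
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is not None and now < cached[1]:
            return cached[0]       # still fresh: no new lookup is made
        address = self._lookup(hostname)
        self._cache[hostname] = (address, now + self._ttl)
        return address


# Simulate an ELB whose address changes on every lookup.
answers = iter(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
lookup = lambda host: next(answers)

cached_forever = CachingResolver(lookup, float("inf"), clock=lambda: 0.0)
first = cached_forever.resolve("elb.example.com")
second = cached_forever.resolve("elb.example.com")   # stale answer is reused

no_cache = CachingResolver(lookup, 0, clock=lambda: 0.0)
third = no_cache.resolve("elb.example.com")
fourth = no_cache.resolve("elb.example.com")         # looked up again
```

With an unbounded TTL the second call keeps returning the first address even though the record has moved on; with a TTL of 0 every call performs a fresh lookup.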
More info here: http://community.blazemeter.com/knowledgebase/articles/94060-testing-amazon-elbs