GKE Ingress Timeout Values - websocket

I'd like to use websockets in my web application. Right now my websocket disconnects and reconnects every 30 seconds, which is the default timeout in GKE Ingress. I tried the following to change timeout values:
metadata:
  name: my-ingress
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.org/proxy-connect-timeout: "300"
    nginx.org/proxy-read-timeout: "3600"
    nginx.org/proxy-send-timeout: "3600"
After recreating the ingress through kubectl, the timeout value remains 30 seconds.
I also tried to create a backend configuration as described here: https://cloud.google.com/kubernetes-engine/docs/how-to/configure-backend-service
The timeout value still remained unchanged at 30 seconds.
Is there a way to increase the timeout value through annotations in the .yml file? I could edit the timeout value through the web interface, but I'd rather use .yml files.

Fixed. I upgraded my master and its nodes to version 1.14 and then the backend config approach worked.

This doesn't seem like a version issue.
As long as the GKE version is 1.11.3-gke.18 or above, as mentioned here, you should be able to update the timeoutSec value by configuring a 'BackendConfig' as explained in the help center article.
I changed the timeoutSec value by editing the example manifest and then updating the BackendConfig (in my GKE 1.13.11-gke.14 cluster) with the "kubectl apply -f my-bsc-backendconfig.yaml" command.

Related

Datadog skip ingestion of Spring actuator health endpoint

I was trying to configure my application to not report my health endpoint in Datadog APM. I checked the documentation here: https://docs.datadoghq.com/tracing/guide/ignoring_apm_resources/?tab=kuberneteshelm&code-lang=java
And tried adding the config in my helm deployment.yaml file:
- name: DD_APM_IGNORE_RESOURCES
  value: GET /actuator/health
This had no effect; traces were still showing up in Datadog. The method and path are correct. I changed the value a few times with different combinations (tried a few regex options). No go.
Then I tried the DD_APM_FILTER_TAGS_REJECT environment variable, trying to ignore http.route:/actuator/health. Also without success.
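For reference, that attempt corresponds to an env entry along these lines (a sketch using the tag value mentioned above; the exact value format is part of what I was testing):
- name: DD_APM_FILTER_TAGS_REJECT
  value: http.route:/actuator/health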
I even ran the agent and application locally to see if it had anything to do with the environment, but the configs were still not applied.
What other options can I try in this scenario?
This is the span detail:

Elastic APM different index name

A few weeks ago we added Filebeat, Metricbeat, and APM to our .NET Core application running on our Kubernetes cluster.
It all works nicely, and recently we discovered that Filebeat and Metricbeat are able to write to different indices based on a set of rules.
We wanted to do the same for APM, but searching the documentation we can't find any option to set the name of the index to write to.
Is this even possible, and if yes, how is it configured?
I also tried searching for the current apm-* index name within the codebase, but couldn't find any matches for where it is configured.
The problem we'd like to fix is that every space in Kibana gets to see the APM metrics of every application. Certain applications shouldn't be visible within this space, so I thought a new apm-application-* index would do the trick...
Edit
Since this shouldn't be configured on the agent but in the cloud service console, I'm having trouble getting the 'user override' settings to do what I want.
The rules I want to have:
When an application does not live inside the Kubernetes namespace default OR kube-system, write to an index called apm-7.8.0-application-type-2020-07
All other applications in other namespaces should remain in the default indices
I see you can add output.elasticsearch.indices to make this happen: an array of index selector rules supporting conditionals and formatted strings.
I tried this by copying what I had for Metricbeat, updating it to the APM syntax, and came to the following 'user override':
output.elasticsearch.indices:
  - index: 'apm-%{[observer.version]}-%{[kubernetes.labels.app]}-%{[processor.event]}-%{+yyyy.MM}'
    when:
      not:
        or:
          - equals:
              kubernetes.namespace: default
          - equals:
              kubernetes.namespace: kube-system
but when I use this setup it tells me:
Your changes cannot be applied
'output.elasticsearch.indices.when': is not allowed
Set output.elasticsearch.indices.0.index to apm-%{[observer.version]}-%{[kubernetes.labels.app]}-%{[processor.event]}-%{+yyyy.MM}
Set output.elasticsearch.indices.0.when.not.or.0.equals.kubernetes.namespace to default
Set output.elasticsearch.indices.0.when.not.or.1.equals.kubernetes.namespace to kube-system
Then I updated the example accordingly, but came to the same conclusion: it was not valid either.
In your ES Cloud console, you need to Edit the cluster configuration, scroll to the APM section and then click "User override settings". In there you can override the target index by adding the following property:
output.elasticsearch.index: "apm-application-%{[observer.version]}-{type}-%{+yyyy.MM.dd}"
Note that if you change this setting, you also need to modify the corresponding index template to match the new index name.

How to compare a Kubernetes custom resource spec with expected spec in a GO controller?

I am trying to implement my first Kubernetes operator. I want the operator controller to be able to compare the config in a running pod vs the expected config defined in a custom resource definition.
E.g., custom resource:
apiVersion: test.com/v1alpha1
kind: TEST
metadata:
  name: example-test
spec:
  replicas: 3
  version: "20:03"
  config:
    valueA: true
    valueB: 123
The above custom resource is deployed and 3 pods are running. A change is made such that the config "valueA" is changed to false.
In the Go controller reconcile function I can get the TEST instance and see the "new" version of the config:
instance := &testv1alpha1.TEST{}
// assuming the instance has been fetched in Reconcile, e.g. r.Get(ctx, request.NamespacedName, instance)
log.Info("New value", "valueA", instance.Spec.Config.ValueA)
I am wondering how I can access what the value of "valueA" is in my running pods so that I can compare and recreate the pods if it has changed?
Also a secondary question, do I need to loop through all running pods in the reconcile function to check each or can I do this as a single operation?
What is this config exactly? If it's the Pod spec config, I would suggest updating the spec in the Deployment rather than individual Pods; the Deployment will restart its Pods automatically. If it's environment variables for the apps in this pod, I would recommend storing them in a ConfigMap and updating that. Answering your second question: in both cases it will be a single operation.
To get the Deployment or ConfigMap you need its name and namespace; with a custom resource these are usually derived from the custom resource's name. Here is an example of how you can get the deployment instance and update it.
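As a complement, a minimal sketch of the ConfigMap approach suggested above, reusing the config values from the example CR. All names and the image are illustrative; the controller would keep the ConfigMap (or the Deployment spec) in sync with the CR's .spec.config:
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-test-config        # illustrative: derived from the CR name
data:
  valueA: "true"
  valueB: "123"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-test               # illustrative: derived from the CR name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-test
  template:
    metadata:
      labels:
        app: example-test
    spec:
      containers:
        - name: app
          image: example/app:latest        # illustrative
          envFrom:
            - configMapRef:
                name: example-test-config
Note that env vars sourced from a ConfigMap are read at container start, so a changed value also needs the Pods to be restarted (for example by updating the Deployment's pod template).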

Spring Cloud Dataflow with Kubernetes: BackoffLimit

Kubernetes Pod backoff failure policy
From the k8s documentation:
There are situations where you want to fail a Job after some amount of retries due to a logical error in configuration etc. To do so, set .spec.backoffLimit to specify the number of retries before considering a Job as failed. The back-off limit is set by default to 6.
Spring cloud dataflow:
When a job has failed, we actually don't want a retry. In other words, we want to set backoffLimit: 1 in our Spring Cloud Dataflow config file.
We have tried to set it like the following:
deployer.kubernetes.spec.backoffLimit: 1
or even
deployer.kubernetes.backoffLimit: 1
But neither is transmitted to our Kubernetes cluster.
After 6 tries, we see the following message:
status:
  conditions:
    - lastProbeTime: '2019-10-22T17:45:46Z'
      lastTransitionTime: '2019-10-22T17:45:46Z'
      message: Job has reached the specified backoff limit
      reason: BackoffLimitExceeded
      status: 'True'
      type: Failed
  failed: 6
  startTime: '2019-10-22T17:33:01Z'
Actually we want to fail fast (1 or 2 tries maximum).
Question: How can we properly set this property so that all tasks triggered by SCDF fail at most once on Kubernetes?
Update (23.10.2019)
We have also tried the property:
deployer:
  kubernetes:
    maxCrashLoopBackOffRestarts: Never # No retry for failed tasks
But the jobs are still failing 6 times instead of 1.
Update (26.10.2019)
For completeness' sake:
I am scheduling a task in SCDF
The task is triggered on Kubernetes (more specifically Openshift)
When I check the configuration on the K8s-platform, I see that it still has a backoffLimit of 6, instead of 1:
YAML config snippet taken from the running pod:
spec:
  backoffLimit: 6
  completions: 1
  parallelism: 1
In the official documentation, it says:
`maxCrashLoopBackOffRestarts` - Maximum allowed restarts for app that is in a CrashLoopBackOff. Values are `Always`, `IfNotPresent`, `Never`
But maxCrashLoopBackOffRestarts takes an integer. So I guess the documentation is not accurate.
The pod is then restarted 6 times.
I have tried to set those properties unsuccessfully:
spring.cloud.dataflow.task.platform.kubernetes.accounts.defaults.maxCrashLoopBackOffRestarts: 0
spring.cloud.deployer.kubernetes.maxCrashLoopBackOffRestarts: 0
spring.cloud.scheduler.kubernetes.maxCrashLoopBackOffRestarts: 0
None of those has worked.
Any idea?
To override the default restart limit, you'd have to use SCDF's maxCrashLoopBackOffRestarts deployer property. All of the supported properties are documented in the ref. guide.
You can override this property "globally" in SCDF, or individually at each stream/task deployment level. More info here.
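For reference, such a global override would be expressed in the SCDF server configuration roughly like this (a sketch; the property path and account name follow the attempts shown above, and the follow-up below explains why this property does not control the Job's backoffLimit):
# SCDF server application.yaml (sketch)
spring:
  cloud:
    dataflow:
      task:
        platform:
          kubernetes:
            accounts:
              defaults:                          # account name as used above
                maxCrashLoopBackOffRestarts: 1   # limit restarts for tasks in CrashLoopBackOff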
Thanks to ilayaperumalg it's much clearer why it's not working:
It looks like the property maxCrashLoopBackOffRestarts is applicable
for determining the status of the runtime application instance while
the property you refer to as backoffLimit is applicable to the JobSpec
which is currently not being supported. We can add this as a feature
to support your case.
Github Link

Spring cloud data flow on kubernetes doesnt show the streams section

Installed spring cloud data flow on kubernetes following the procedure here: https://docs.spring.io/spring-cloud-dataflow-server-kubernetes/docs/1.7.1.BUILD-SNAPSHOT/reference/htmlsingle/#_installation
After the install the console is up, but it only shows apps and audit records on the dashboard; the stream and task designers are missing. Are there additional steps?
Enabling Skipper in server-deployment.yaml by uncommenting the following lines seems to have done the trick:
- name: SPRING_CLOUD_SKIPPER_CLIENT_SERVER_URI
  value: 'http://${SKIPPER_SERVICE_HOST}/api'
- name: SPRING_CLOUD_DATAFLOW_FEATURES_SKIPPER_ENABLED
  value: 'true'
I also changed some services to NodePort instead of ClusterIP to connect from the local machine.
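That service type change amounts to something like the following fragment in the relevant Service manifest (a sketch; which services to expose depends on the install):
# e.g. in the data flow server Service manifest
spec:
  type: NodePort   # was ClusterIP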
