Promtail deployment config on EC2: error in DescribeInstances - amazon-ec2

I'm configuring Promtail and testing it out on my EC2 instance. Running this:
./promtail-linux-amd64 -config.file=./ec2-promtail.yaml --dry-run
I get the following error:
<ErrorResponse "http://webservices.amazon.com/AWSFault/200. Sender. InvalidAction. <Message>Could not find operation DescribeInstances for version 2. . | 2bbe..caused by: expected element type <DescribeInstancesResponse> but have <ErrorResponse>"
I'm checking if my config is wrong and if anyone has faced this issue.
I'm configuring Promtail on an Amazon Linux 2 instance. The instance has no internet connection (for security reasons), so I am using the STS endpoint in my region to authenticate the role.
Promtail version: 2.7.1
AWS role: has the ec2:DescribeInstances and ec2:DescribeAvailabilityZones permissions
The following is my ec2_sd_config
http_listen_port: 3100
grpc_listen_port: 0

clients:
  - url: https://loki.dev.fdp.internal/loki/api/v1/push

positions:
  filename: /opt/promtail/positions.yaml

scrape_configs:
  - job_name: ec2-logs
    ec2_sd_configs:
      - region: ap-southeast-1
        role_arn: arn:aws:iam::xxxxxx:role/promtail_role
        endpoint: sts.ap-southeast-1.amazonaws.com # define to use regional endpoint instead of the default global
    relabel_configs:
      - source_labels: [__meta_ec2_tag_Name]
        target_label: name
        action: replace
      - source_labels: [__meta_ec2_instance_id]
        target_label: instance
        action: replace
      - source_labels: [__meta_ec2_availability_zone]
        target_label: zone
        action: replace
      - action: replace
        replacement: /var/log/**.log
        target_label: __path__
      - source_labels: [__meta_ec2_private_dns_name]
        regex: "(.*)"
        target_label: __host__
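No accepted answer appears here, but the error text suggests one thing worth verifying (this is an assumption, not a confirmed fix): in ec2_sd_configs, the endpoint field overrides the EC2 API endpoint, not the STS endpoint. Pointing it at sts.ap-southeast-1.amazonaws.com would send DescribeInstances calls to STS, which does not implement that operation, consistent with the "Could not find operation DescribeInstances" message. A sketch with the endpoint aimed at the regional EC2 API instead (assuming a reachable EC2 VPC endpoint in the private subnet):

```yaml
scrape_configs:
  - job_name: ec2-logs
    ec2_sd_configs:
      - region: ap-southeast-1
        role_arn: arn:aws:iam::xxxxxx:role/promtail_role
        # `endpoint` overrides the EC2 API endpoint; point it at EC2
        # (or omit it entirely and rely on the VPC endpoint's DNS):
        endpoint: ec2.ap-southeast-1.amazonaws.com
```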

Kubernetes API library support for apiVersion - "authentication.gke.io/v2alpha1" vs "v1"

v1 - v1 was the first stable release of the Kubernetes API. It contains many of the core Kubernetes objects.
authentication.gke.io/v2alpha1 - API versions with 'alpha' in their name are early candidates for new functionality coming into Kubernetes. They are not stable for production use and may contain bugs.
For this alpha version of authentication with the API server (the Kubernetes control plane), we are stuck on Kubernetes API library support for kind: ClientConfig.
In our company's scenario, we authenticate to the Kubernetes API server using kind: ClientConfig with the authentication.gke.io/v2alpha1 version.
Below are sample templates for both versions:
kind: ClientConfig
apiVersion: authentication.gke.io/v2alpha1
spec:
  name: dev-corp
  server: https://10.x.x.x:443
  certificateAuthorityData: ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
  authentication:
    - name: oidc
      oidc:
        clientID: aaaaad3-9aa1-33c8-dd0-ddddd6b5bf5
        clientSecret: ccccccccccccccccc-
        issuerURI: https://login.microsoftonline.com/aaaa92-aab7-bbfa-cccf-ddaaaaaaaa/v2.0
        kubectlRedirectURI: http://localhost:12345/callback
        cloudConsoleRedirectURI: http://console.cloud.google.com/kubernetes/oidc
        scopes: offline_access,profile
        userClaim: upn
        userPrefix: '-'
        groupsClaim: groups
  preferredAuthentication: oidc
kind: Config
apiVersion: v1
clusters:
  - cluster:
      server: https://192.168.10.190:6443
    name: cluster-1
  - cluster:
      server: https://192.168.99.101:8443
    name: cluster-2
contexts:
  - context:
      cluster: cluster-1
      user: kubernetes-admin-1
    name: cluster-1
  - context:
      cluster: cluster-2
      user: kubernetes-admin-2
    name: cluster-2
preferences: {}
users:
  - name: kubernetes-admin-1
    user:
      client-certificate: /home/user/.minikube/credential-for-cluster-1.crt
      client-key: /home/user/.minikube/credential-for-cluster-1.key
  - name: kubernetes-admin-2
    user:
      client-certificate: /home/user/.minikube/credential-for-cluster-2.crt
      client-key: /home/user/.minikube/credential-for-cluster-2.key
The kubernetes/client-go library does not support kind: ClientConfig and gives the error: no kind "ClientConfig" is registered for version "authentication.gke.io/v2alpha1"
https://github.com/kubernetes/client-go/issues/1151
Is there Kubernetes API library support to load a kind: ClientConfig configuration?
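One workaround, sketched here under the assumption that the OIDC values from the ClientConfig template above can be reused as-is, is to hand-map them onto a standard apiVersion: v1 kubeconfig user, which client-go does understand, using kubectl's oidc auth-provider (note this auth-provider is deprecated in recent Kubernetes releases):

```yaml
apiVersion: v1
kind: Config
users:
  - name: dev-corp-oidc
    user:
      auth-provider:
        name: oidc
        config:
          # Values copied from spec.authentication[0].oidc in the ClientConfig:
          client-id: aaaaad3-9aa1-33c8-dd0-ddddd6b5bf5
          client-secret: ccccccccccccccccc-
          idp-issuer-url: https://login.microsoftonline.com/aaaa92-aab7-bbfa-cccf-ddaaaaaaaa/v2.0
          extra-scopes: offline_access,profile
```

The cluster and context entries would be filled in with the server and certificateAuthorityData fields from the same ClientConfig.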

Cube.js timing out in serverless environment

I've been following the guide on https://cube.dev/docs/deployment#express-with-basic-passport-authentication to deploy Cube.js to Lambda. I got it working against an Athena db such that the /meta endpoint works successfully and returns schemas.
When trying to query Athena data in Lambda, however, all requests result in 504 Gateway Timeouts. Checking the CloudWatch logs, I see one consistent error:
/bin/sh: hostname: command not found
Any idea what this could be?
Here's my server.yml:
service: tw-cubejs

provider:
  name: aws
  runtime: nodejs12.x
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "sns:*"
        # Athena permissions
        - "athena:*"
        - "s3:*"
        - "glue:*"
      Resource:
        - "*"
  # When you uncomment vpc please make sure lambda has access to internet: https://medium.com/@philippholly/aws-lambda-enable-outgoing-internet-access-within-vpc-8dd250e11e12
  vpc:
    securityGroupIds:
      # Your DB and Redis security groups here
      - ########
    subnetIds:
      # Put here subnet with access to your DB, Redis and internet. For internet access 0.0.0.0/0 should be routed through NAT only for this subnet!
      - ########
      - ########
      - ########
      - ########
  environment:
    CUBEJS_AWS_KEY: ########
    CUBEJS_AWS_SECRET: ########
    CUBEJS_AWS_REGION: ########
    CUBEJS_DB_TYPE: athena
    CUBEJS_AWS_S3_OUTPUT_LOCATION: ########
    CUBEJS_JDBC_DRIVER: athena
    REDIS_URL: ########
    CUBEJS_API_SECRET: ########
    CUBEJS_APP: "${self:service.name}-${self:provider.stage}"
    NODE_ENV: production
    AWS_ACCOUNT_ID:
      Fn::Join:
        - ""
        - - Ref: "AWS::AccountId"

functions:
  cubejs:
    handler: cube.api
    timeout: 30
    events:
      - http:
          path: /
          method: GET
      - http:
          path: /{proxy+}
          method: ANY
  cubejsProcess:
    handler: cube.process
    timeout: 630
    events:
      - sns: "${self:service.name}-${self:provider.stage}-process"

plugins:
  - serverless-express
Even though this hostname error message appears in the logs, it isn't the cause of the issue.
Most probably you're experiencing the issue described here.
@cubejs-backend/serverless uses an internet connection to access the messaging API, as well as Redis inside the VPC for managing the queue and cache.
One of those doesn't work in your environment.
Such timeouts usually mean there's a problem with either the internet connection or the Redis connection. If it's Redis, you'll usually see timeouts after about 5 minutes in both the cubejs and cubejsProcess functions. If it's the internet connection, you'll never see any query-processing logs in the cubejsProcess function.
Check the version of Cube.js you are using; according to the changelog, this issue should have been fixed in 0.10.59.
It's most likely down to a dependency of Cube.js assuming that every environment it runs in can execute the hostname shell command (it looks like it's using node-machine-id).

CloudFormation change set User not authorized

I am trying to publish an AWS Lambda function to my client's AWS account; however, I keep getting this error message.
Error creating CloudFormation change set: User: arn:aws:iam::xxxxxx:user/testuser is not authorized to perform: cloudformation:CreateChangeSet on resource: arn:aws:cloudformation:eu-west-1:xxxx:stack/test-Stack/*
When I tested on my own account, I had given my IAM user the "AdministratorAccess" policy, which basically allows everything.
I checked the policies in the client account; there is only "CloudFormationReadonlyAccess", but that does not allow write/delete. What policy should I ask my client to assign to the IAM user?
I have also tried adding the following to my role:
"cloudformation:CreateStack",
"cloudformation:CreateChangeSet",
"cloudformation:ListStacks",
"cloudformation:UpdateStack",
"cloudformation:DescribeStacks",
"cloudformation:DescribeStackResource",
"cloudformation:DescribeStackEvents",
"cloudformation:ValidateTemplate",
"cloudformation:DescribeChangeSet",
"cloudformation:ExecuteChangeSet"
but the same error occurs.
You need to specify the resource on which these actions are allowed. To be specific:
- Action:
    - cloudformation:CreateStack
    - cloudformation:DeleteStack
    - cloudformation:UpdateStack
    - cloudformation:DescribeStacks
    - cloudformation:DescribeChangeSet
    - cloudformation:CreateChangeSet
    - cloudformation:DeleteChangeSet
    - cloudformation:ExecuteChangeSet
  Effect: Allow
  Resource:
    - !Join
      - ':'
      - - arn
        - aws
        - cloudformation
        - !Ref 'AWS::Region'
        - !Ref 'AWS::AccountId'
        - !Join
          - /
          - - stack
            - test-stack
            - '*'
Also check that the service allowed to perform sts:AssumeRole in the role's trust policy is cloudformation.amazonaws.com.
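For readability, the nested !Join above, which assembles arn:aws:cloudformation:<region>:<account>:stack/test-stack/*, can equivalently be written with !Sub (same resulting ARN, assuming the same stack name):

```yaml
Resource:
  - !Sub 'arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/test-stack/*'
```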

Only summary data in grafana for blackbox_exporter, not hosts separately

blackbox problem
I added blackbox_exporter in my docker-compose.yml:
blackbox_exporter:
  container_name: blackbox_exporter
  image: prom/blackbox-exporter
  restart: always
  ports:
    - "9115:3115"
  networks:
    - monitor-net
  labels:
    org.label-schema.group: "monitoring"
I added job into prometheus.yml:
- job_name: 'blackbox'
  metrics_path: /probe
  params:
    module: [http_2xx]  # Look for a HTTP 200 response.
  static_configs:
    - targets: ['google.com', 'amazon.com']  # Targets to probe.
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: blackbox_exporter:9115  # The blackbox exporter's real hostname:port.
I added this dashboard in Grafana: https://grafana.com/dashboards/5345, because the screenshot on that page was exactly what I need.
Alas, I only get summary graphs, with no legend and no per-site sections.
You can see a screenshot here:
Where did I go wrong? What can I do about it?
In the config you posted, you relabel the blackbox exporter's __param_target label to instance, but the dashboard uses target for all its filters and for the templating variable.
Either change your config to
- source_labels: [__param_target]
  target_label: target
or adjust the queries and settings in the dashboard to use instance.
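For context, here is what the full relabel_configs block from the question would look like with the first option applied (a sketch; only the target label on the second rule changes):

```yaml
relabel_configs:
  - source_labels: [__address__]
    target_label: __param_target
  - source_labels: [__param_target]
    target_label: target          # was `instance`; dashboard 5345 filters on `target`
  - target_label: __address__
    replacement: blackbox_exporter:9115
```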

Prometheus: how to drop a target based on Consul tags

My Prometheus server gets its list of targets (or "services", in Consul's lingo) from Consul. I only want to monitor a subset of these targets. This should be possible via Prometheus's regex mechanism, but the correct configuration eludes me. How is this done?
I've scoured the web and there is not a single example showing how it's done, so for posterity: the following configuration will drop all Consul services marked with the 'ignore-at-prometheus' tag.
# ignore consul services with the 'ignore-at-prometheus' tag
# https://www.robustperception.io/little-things-matter/
relabel_configs:
  - source_labels: ['__meta_consul_tags']
    regex: '(.*),ignore-at-prometheus,(.*)'
    action: drop
I've used a very similar solution to this problem with the following config. It allows scraping only the services with a specific tag, rather than excluding services with a given tag.
Here's the scrape_configs section of my config:
scrape_configs:
  - job_name: 'consul_registered_services'
    scrape_interval: 5s
    metrics_path: '/prometheus'
    consul_sd_configs:
      - server: 'my-consul-server:8500'
        token: 'xyz'
    relabel_configs:
      - source_labels: ['__meta_consul_tags']
        regex: '^.*,metrics_method=prometheus-servlet,.*$'
        action: keep
      - source_labels: ['__meta_consul_node']
        target_label: instance
      - source_labels: ['__meta_consul_service']
        target_label: service
      - source_labels: ['__meta_consul_tags']
        target_label: tags
I then make sure to register all relevant services with the metrics_method=prometheus-servlet tag, and the rest will be ignored.
The documentation for the relabeling configuration is available here: https://prometheus.io/docs/operating/configuration/#relabel_config.
The documentation for the Consul service discovery configuration is available here: https://prometheus.io/docs/operating/configuration/#consul_sd_config.
