Cube.js timing out in serverless environment - aws-lambda

I've been following the guide on https://cube.dev/docs/deployment#express-with-basic-passport-authentication to deploy Cube.js to Lambda. I got it working against an Athena db such that the /meta endpoint works successfully and returns schemas.
When trying to query Athena data in Lambda however, all requests are resulting in 504 Gateway Timeouts. Checking the CloudWatch logs I see one consistent error:
/bin/sh: hostname: command not found
Any idea what this could be?
Here's my server.yml:
service: tw-cubejs
provider:
name: aws
runtime: nodejs12.x
iamRoleStatements:
- Effect: "Allow"
Action:
- "sns:*"
# Athena permissions
- "athena:*"
- "s3:*"
- "glue:*"
Resource:
- "*"
# When you uncomment vpc please make sure lambda has access to internet: https://medium.com/#philippholly/aws-lambda-enable-outgoing-internet-access-within-vpc-8dd250e11e12
vpc:
securityGroupIds:
# Your DB and Redis security groups here
- ########
subnetIds:
# Put here subnet with access to your DB, Redis and internet. For internet access 0.0.0.0/0 should be routed through NAT only for this subnet!
- ########
- ########
- ########
- ########
environment:
CUBEJS_AWS_KEY: ########
CUBEJS_AWS_SECRET: ########
CUBEJS_AWS_REGION: ########
CUBEJS_DB_TYPE: athena
CUBEJS_AWS_S3_OUTPUT_LOCATION: ########
CUBEJS_JDBC_DRIVER: athena
REDIS_URL: ########
CUBEJS_API_SECRET: ########
CUBEJS_APP: "${self:service.name}-${self:provider.stage}"
NODE_ENV: production
AWS_ACCOUNT_ID:
Fn::Join:
- ""
- - Ref: "AWS::AccountId"
functions:
cubejs:
handler: cube.api
timeout: 30
events:
- http:
path: /
method: GET
- http:
path: /{proxy+}
method: ANY
cubejsProcess:
handler: cube.process
timeout: 630
events:
- sns: "${self:service.name}-${self:provider.stage}-process"
plugins:
- serverless-express

Even this hostname error message is in logs however it isn't an issue cause.
Most probably you experiencing issue described here.
#cubejs-backend/serverless uses internet connection to access messaging API as well as Redis inside VPC for managing queue and cache.
One of those doesn't work in your environment.
Such timeouts usually mean that there's a problem with internet connection or with Redis connection. If it's Redis you'll usually see timeouts after 5 minutes or so in both cubejs and cubejsProcess functions. If it's internet connection you will never see any logs of query processing in cubejsProcess function.

Check the version of cube.js you are using, according to the changelog this issue should have been fixed in 0.10.59.
It's most likely down to a dependency of cube.js assuming that all environments where it will run will be able to run the hostname shell command (looks like it's using node-machine-id.

Related

Services empty in Opensearch Trace Analytics

I'm using Amazon OpenSearch with the engine OpenSearch 1.2.
I was working on setting up APM with the following details
Service 1 - Tomcat Application running on an EC2 server that accesses an RDS database. The server is behind a load balancer with a sub-domain mapped to it.
I added a file - setenv.sh file in the tomcat/bin folder with the following content
#!/bin/sh
export CATALINA_OPTS="$CATALINA_OPTS -javaagent:<PATH_TO_JAVA_AGENT>"
export OTEL_METRICS_EXPORTER=none
export OTEL_EXPORTER_OTLP_ENDPOINT=http://<OTEL_COLLECTOR_SERVER_IP>:4317
export OTEL_RESOURCE_ATTRIBUTES=service.name=<SERVICE_NAME>
export OTEL_INSTRUMENTATION_COMMON_PEER_SERVICE_MAPPING=<RDS_HOST_ENDPOINT>=Database-Service
OTEL Java Agent for collecting traces from the application
OTEL Collector and Data Prepper running on another server with the following configuration
docker-compose.yml
version: "3.7"
services:
data-prepper:
restart: unless-stopped
image: opensearchproject/data-prepper:1
volumes:
- ./pipelines.yaml:/usr/share/data-prepper/pipelines.yaml
- ./data-prepper-config.yaml:/usr/share/data-prepper/data-prepper-config.yaml
networks:
- apm_net
otel-collector:
restart: unless-stopped
image: otel/opentelemetry-collector:0.55.0
command: [ "--config=/etc/otel-collector-config.yml" ]
volumes:
- ./otel-collector-config.yml:/etc/otel-collector-config.yml
ports:
- "4317:4317"
depends_on:
- data-prepper
networks:
- apm_net
data-prepper-config.yaml
ssl: false
otel-collector-config.yml
receivers:
otlp:
protocols:
grpc:
exporters:
otlp/data-prepper:
endpoint: http://data-prepper:21890
tls:
insecure: true
service:
pipelines:
traces:
receivers: [otlp]
exporters: [otlp/data-prepper]
pipelines.yaml
entry-pipeline:
delay: "100"
source:
otel_trace_source:
ssl: false
sink:
- pipeline:
name: "raw-pipeline"
- pipeline:
name: "service-map-pipeline"
raw-pipeline:
source:
pipeline:
name: "entry-pipeline"
prepper:
- otel_trace_raw_prepper:
sink:
- opensearch:
hosts:
[
<AWS OPENSEARCH HOST>,
]
# IAM signing
aws_sigv4: true
aws_region: <AWS_REGION>
index_type: "trace-analytics-raw"
service-map-pipeline:
delay: "100"
source:
pipeline:
name: "entry-pipeline"
prepper:
- service_map_stateful:
sink:
- opensearch:
hosts:
[
<AWS OPENSEARCH HOST>,
]
# IAM signing
aws_sigv4: true
aws_region: <AWS_REGION>
index_type: "trace-analytics-service-map"
The data-prepper is getting authenticated via Fine access based control with all_access role and I'm able to see the otel resources like indexes, index templates generated when running it.
On running the above setup, I'm able to see traces from the application in the Trace Analytics Dashboard of OpenSearch, and upon clicking on the individual traces, I'm able to see a pie chart with one service. I also don't see any errors in the otel-collector as well as in data-prepper. Also, in the logs of data prepper, I see records being sent to otel service map.
However, the services tab of Trace Analytics remains empty and the otel service map index also remains empty.
I have been unable to figure out the reason behind this even after going through the documentation and any help is appreciated!

"No logs found" in grafana

I installed Loki, grafana and promtail and all three runing. on http://localhost:9080/targets Ready is True, but the logs are not displayed in Grafana and show in the explore section "No logs found"
promtail-local-config-yaml:
server:
http_listen_port: 9080
grpc_listen_port: 0
positions:
filename: /tmp/positions.yaml
clients:
- url: http://localhost:3100/loki/api/v1/push
scrape_configs:
- job_name: system
static_configs:
- targets:
- localhost
labels:
job: varlogs
host: ward_workstation
agent: promtail
__path__: D:/LOGs/*log
loki-local-config.yaml:
auth_enabled: false
server:
http_listen_port: 3100
grpc_listen_port: 9096
common:
path_prefix: /tmp/loki
storage:
filesystem:
chunks_directory: /tmp/loki/chunks
rules_directory: /tmp/loki/rules
replication_factor: 1
ring:
instance_addr: 127.0.0.1
kvstore:
store: inmemory
schema_config:
configs:
- from: 2020-10-24
store: boltdb-shipper
object_store: filesystem
schema: v11
index:
prefix: index_
period: 24h
ruler:
alertmanager_url: http://localhost:9093
How can i solve this problem?
Perhaps you are using Loki in Windows ?
In your promtail varlogs job ,the Path "D:/LOGs/*log" is obviously wrong, you cannot access the windows file from your docker directly.
You shoud mount your windows file to your docker like this:
promtail:
image: grafana/promtail:2.5.0
volumes:
- D:/LOGs:/var/log
command: -config.file=/etc/promtail/config.yml
networks:
- loki
Then everything will be ok.
Note that, in your promtail docker the config is like this:
you can adjust both to make a match...
Here's a general advice how to debug Loki according to the question's title:
(1) Check promtail logs
If you discover such as error sending batch you need to fix your Promtail configuration.
level=warn ts=2022-10-12T16:26:20.667560426Z caller=client.go:369 component=client host=monitor:3100 msg="error sending batch, will retry" status=-1 error="Post \"http://loki:3100/loki/api/v1/push\": dial tcp: lookup *Loki* on 10.96.0.10:53: no such host"
(2) Open the Promtail config page and check, if Promtail has read your given configuration: http://localhost:3101/config
(3) Open the Promtail targets page http://localhost:3101/targets and check
if your service is listed as Ready
if the log file contains the wanted contents and is readable by Promtail. If you're using docker or kubernetes I would log into the Promtail Container and would try to read the logfile manually.
To the specific problem of the questioner:
The questioner said, that the services are shown as READY in the targets page. So I recommend to check (1) Promtail configuration and (3b) access to log files (as Frank).

Connect to dynamically create new cluster on GKE

I am using the cloud.google.com/go SDK to programmatically provision the GKE clusters with the required configuration.
I set the ClientCertificateConfig.IssueClientCertificate = true (see https://pkg.go.dev/google.golang.org/genproto/googleapis/container/v1?tab=doc#ClientCertificateConfig).
After the cluster is provisioned, I use the ca_certificate, client_key, client_secret returned for the same cluster (see https://pkg.go.dev/google.golang.org/genproto/googleapis/container/v1?tab=doc#MasterAuth). Now that I have the above 3 attributes, I try to generate the kubeconfig for this cluster (to be later used by helm)
Roughly, my kubeconfig looks something like this:
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: <base64_encoded_data>
server: https://X.X.X.X
name: gke_<project>_<location>_<name>
contexts:
- context:
cluster: gke_<project>_<location>_<name>
user: gke_<project>_<location>_<name>
name: gke_<project>_<location>_<name>
current-context: gke_<project>_<location>_<name>
kind: Config
preferences: {}
users:
- name: gke_<project>_<location>_<name>
user:
client-certificate-data: <base64_encoded_data>
client-key-data: <base64_encoded_data>
On running kubectl get nodes with above config I get the error:
Error from server (Forbidden): serviceaccounts is forbidden: User "client" cannot list resource "serviceaccounts" in API group "" at the cluster scope
Interestingly if I use the config generated by gcloud, the only change is in the user section:
user:
auth-provider:
config:
cmd-args: config config-helper --format=json
cmd-path: /Users/ishankhare/google-cloud-sdk/bin/gcloud
expiry-key: '{.credential.token_expiry}'
token-key: '{.credential.access_token}'
name: gcp
This configuration seems to work just fine. But as soon as I add client cert and client key data to it, it breaks:
user:
auth-provider:
config:
cmd-args: config config-helper --format=json
cmd-path: /Users/ishankhare/google-cloud-sdk/bin/gcloud
expiry-key: '{.credential.token_expiry}'
token-key: '{.credential.access_token}'
name: gcp
client-certificate-data: <base64_encoded_data>
client-key-data: <base64_encoded_data>
I believe I'm missing some details related to RBAC but I'm not sure what. Will you be able to provide me with some info here?
Also reffering to this question I've tried to only rely on Username - Password combination first, using that to apply a new clusterrolebinding in the cluster. But I'm unable to use just the username password approach. I get the following error:
error: You must be logged in to the server (Unauthorized)

How to access private AWS resources in AWS SAM LOCAL when start-api testing

I've been working with AWS SAM Local to create and test a lambda / api gateway stack before shipping it to production. I have recently ran into a brick wall when trying to access private resources (RDS) when testing locally (sam local start-api --profile [profile]). I'm able to connect to some of these private resources if I do some ssh tunneling, but was wondering if I am able to test locally without tunneling using VPC.
Below is an example sam template:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Example Stack
Globals:
Function:
Timeout: 3
Resources:
ExampleFunction:
Type: 'AWS::Serverless::Function'
Properties:
Handler: index.example
Runtime: nodejs8.10
CodeUri: .
Description: 'Just an example'
MemorySize: 128
Role: 'arn:aws:iam::[arn-role]'
VpcConfig:
SecurityGroupIds:
- sg-[12345]
SubnetIds:
- subnet-[12345]
- subnet-[23456]
- subnet-[34567]
Events:
Api1:
Type: Api
Properties:
Path: /example
Method: GET
After reading through a lot of documentation and searching stackoverflow for anything that would help... I ended up joining the #samdev slack channel and asked for help. I was provided some guidance and a great guide on setting up OpenVPN on an EC2 instance.
The set up was super easy (completed in under 30 minutes) and the EC2 instance uses a pre-baked AMI image. Make sure you assign the new EC2 instance to the appropriate VPC containing the resources you need access to.
Here is a link to the OpenVPN guide: https://openvpn.net/index.php/access-server/on-amazon-cloud.html
You can request an invite to the #samdev slack channel here: https://awssamopensource.splashthat.com/

ReadOnly Rest plugin giving Authentication Exception

I am using readonlyRest plugin to secure elastic and kibana but once I added the following in my readonlyrest.yml, the kibana starts giving me "Authentication Exception", what could be the reason for that?
kibana.yml
elasticsearch.username: "kibana"
elasticsearch.password: "kibana123"
readonlyrest.yml
readonlyrest:
enable: true
response_if_req_forbidden: Access denied!!!
access_control_rules:
- name: "Accept all requests from localhost"
type: allow
hosts: [XXX.XX.XXX.XXX]
- name: "::Kibana server::"
auth_key: kibana:kibana123
type: allow
- name: "::Kibana user::"
auth_key: kibana:kibana123
type: allow
kibana_access: rw
indices: [".kibana*","log-*"]
My kibana and elastic are hosted on same server, is that the reason?
Another question: If I want to make my elastic server accessible only through a particular host then can I write that host in the first section of access_control_rules as mentioned in readonlyrest.yml?
Elastic version: 6.2.3
Log error: I didn't remember exactly but it was [ACL] Forbidden and showing false in all the three control rules.

Resources