Problem connecting Apache Superset running inside a Docker container to Kylin - hadoop

I have apache-superset running inside a Docker container, and I want to connect it to a running apache-kylin (not inside Docker).
I am receiving the following error whenever I test the connection with this SQLAlchemy URI: 'kylin://ADMIN#KYLIN#local:7070/test':
[SupersetError(message='(builtins.NoneType) None\n(Background on this error at: http://sqlalche.me/e/13/dbapi)', error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': 'Apache Kylin', 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]
"POST /api/v1/database/test_connection HTTP/1.1" 422 -
superset_app | 2021-07-02 18:44:17,224:INFO:werkzeug:172.28.0.1 - - [02/Jul/2021 18:44:17] "POST /api/v1/database/test_connection HTTP/1.1" 422 -

First, check which network your superset_app container is attached to.
Use docker inspect [container name], e.g.
docker inspect superset_app
In my case, it is running in the superset_default network:
"Networks": {
"superset_default": {
.....
}
}
Next, connect your kylin Docker container to this network, e.g.
docker network connect superset_default kylin
where kylin is your container name.
Now your superset_app and kylin containers are attached to the same network. You can docker inspect your kylin container
docker inspect kylin
and find its IPAddress on that network:
"Networks": {
"bridge": {
....
},
"superset_default": {
...
"IPAddress": "172.18.0.5",
...
}
}
In Superset you can now connect to your kylin Docker container, using this IP address as the host in the SQLAlchemy URI.
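The lookup described above can also be scripted. The following is a minimal sketch, assuming the `docker inspect` output has the usual `NetworkSettings.Networks` layout and that the URI follows the `kylin://user:password@host:port/project` pattern. The function names, the credentials, and the bridge address in the sample are placeholders chosen for illustration, not values from a real deployment.

```python
import json


def container_ip(inspect_output: str, network: str) -> str:
    """Return the container's IPAddress on `network` from `docker inspect` JSON."""
    containers = json.loads(inspect_output)  # docker inspect prints a JSON array
    networks = containers[0]["NetworkSettings"]["Networks"]
    return networks[network]["IPAddress"]


def kylin_uri(user: str, password: str, host: str, project: str, port: int = 7070) -> str:
    """Assemble a SQLAlchemy URI for Kylin (placeholder credentials only)."""
    return f"kylin://{user}:{password}@{host}:{port}/{project}"


# Trimmed-down stand-in for the output of `docker inspect kylin`.
sample = json.dumps([{"NetworkSettings": {"Networks": {
    "bridge": {"IPAddress": "172.17.0.3"},            # placeholder address
    "superset_default": {"IPAddress": "172.18.0.5"},  # address from the answer
}}}])

ip = container_ip(sample, "superset_default")
print(kylin_uri("ADMIN", "KYLIN", ip, "test"))
```

In practice the sample string would be replaced by the real output of docker inspect, for example captured with subprocess.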

We have hosted Dremio and Superset on an AKS cluster in Azure, and we are trying to connect Superset to the Dremio database (Lakehouse) to fetch data for some dashboards. We have installed all the required drivers (arrowflight, sqlalchemy_dremio and unixodbc/dev) to establish the connection.
Strangely, we are not able to connect to Dremio from the Superset UI using these connection strings:
dremio+flight://admin:password@dremiohostname.westeurope.cloudapp.azure.com:32010/dremio
dremio://admin:adminpass@dremiohostname.westeurope.cloudapp.azure.com:31010/databaseschema.dataset/dremio?SSL=0
Here’s the error:
(builtins.NoneType) None\n(Background on this error at: https://sqlalche.me/e/14/dbapi)", "error_type": "GENERIC_DB_ENGINE_ERROR", "level": "error", "extra": {"engine_name": "Dremio", "issue_codes": [{"code": 1002, "message": "Issue 1002 - The database returned an unexpected error."}]}}]
However, when we try from inside the Superset pod using a Python script, the connection goes through without any issues.
PS: note that we have not enabled SSL certificates for our hostnames.


Run time issues with Ballerina Integrator

I am trying to run the sample File Integration with FTP that is provided with Ballerina Integrator.
While running the service I face the same issue every time.
I have installed only Ballerina Integrator. I uninstalled it and did a fresh installation, but the issue remains.
Please help me.
I could successfully run the sample with the following configuration (sample data are given). Here I have used a secured FTP server.
listener ftp:Listener dataFileListener = new({
    protocol: ftp:SFTP,
    host: "18.156.78.137",
    port: 22,
    secureSocket: {
        basicAuth: {
            username: "cloudloc",
            password: "fsf#$#213"
        }
    },
    path: "/clouddir/"
});
ftp:ClientEndpointConfig ftpConfig = {
    protocol: ftp:SFTP,
    host: "18.156.78.137",
    port: 22,
    secureSocket: {
        basicAuth: {
            username: "cloudloc",
            password: "fsf#$#213"
        }
    }
};
Make sure you set the path parameter correctly in the dataFileListener. Without this parameter I could reproduce the error you attached.
Once this is configured correctly, you should see log output like the following.
2020-01-24 15:13:23,758 INFO [wso2/ftp] - Listening to remote server at 18.156.78.137...
2020-01-24 15:13:24,333 INFO [wso2/file_integration_using_ftp] - Added file path: /clouddir/a1.txt
2020-01-24 15:13:24,415 INFO [wso2/file_integration_using_ftp] - Added file: /clouddir/a1.txt - 12
Just install Ballerina Integrator on its own. It is packaged with Ballerina 1.0.2, so there is no need to install Ballerina again or separately. The reason no output appears when running from VSCode is that the extensions in the VSCode marketplace have all been upgraded to the latest version.
The locally installed "BI with Ballerina" is a lower version, while the one in VSCode is the latest. This version mismatch was the main problem I faced.

kafka.common.KafkaException: Failed to parse the broker info from zookeeper from EC2 to elastic search

I have AWS MSK set up and I am trying to sink records from MSK to Elasticsearch.
I am able to push data into MSK in JSON format.
I want to sink it to Elasticsearch.
As far as I can tell, I have done the setup correctly.
This is what I have done on the EC2 instance:
wget http://packages.confluent.io/archive/3.1/confluent-oss-3.1.2-2.11.tar.gz -P ~/Downloads/
tar -zxvf ~/Downloads/confluent-oss-3.1.2-2.11.tar.gz -C ~/Downloads/
sudo mv ~/Downloads/confluent-3.1.2 /usr/local/confluent
/usr/local/confluent/etc/kafka-connect-elasticsearch
After that I modified the kafka-connect-elasticsearch properties and set my Elasticsearch URL:
name=elasticsearch-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=AWSKafkaTutorialTopic
key.ignore=true
connection.url=https://search-abcdefg-risdfgdfgk-es-ex675zav7k6mmmqodfgdxxipg5cfsi.us-east-1.es.amazonaws.com
type.name=kafka-connect
The producer sends messages in the format below:
{
"data": {
"RequestID": 517082653,
"ContentTypeID": 9,
"OrgID": 16145,
"UserID": 4,
"PromotionStartDateTime": "2019-12-14T16:06:21Z",
"PromotionEndDateTime": "2019-12-14T16:16:04Z",
"SystemStartDatetime": "2019-12-14T16:17:45.507000000Z"
},
"metadata": {
"timestamp": "2019-12-29T10:37:31.502042Z",
"record-type": "data",
"operation": "insert",
"partition-key-type": "schema-table",
"schema-name": "dbo",
"table-name": "TRFSDIQueue"
}
}
I am a little confused about how Kafka Connect gets started here.
Do I need to start it myself, and if so, how can I start it?
I have also started Schema Registry as shown below, which gave me an error.
/usr/local/confluent/bin/schema-registry-start /usr/local/confluent/etc/schema-registry/schema-registry.properties
When I do that I get the error below:
[2019-12-29 13:49:17,861] ERROR Server died unexpectedly: (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain:51)
kafka.common.KafkaException: Failed to parse the broker info from zookeeper: {"listener_security_protocol_map":{"CLIENT":"PLAINTEXT","CLIENT_SECURE":"SSL","REPLICATION":"PLAINTEXT","REPLICATION_SECURE":"SSL"},"endpoints":["CLIENT:/
Please help.
As suggested in the answer, I upgraded the Kafka Connect version, but then I started getting the error below:
ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication:63)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: Error initializing kafka store while initializing schema registry
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:210)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:61)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:72)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:39)
at io.confluent.rest.Application.createServer(Application.java:201)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:41)
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: Timed out trying to create or validate schema topic configuration
at io.confluent.kafka.schemaregistry.storage.KafkaStore.createOrVerifySchemaTopic(KafkaStore.java:168)
at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:111)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:208)
... 5 more
Caused by: java.util.concurrent.TimeoutException
at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:108)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:274)
at io.confluent.kafka.schemaregistry.storage.KafkaStore.createOrVerifySchemaTopic(KafkaStore.java:161)
... 7 more
First, Confluent Platform 3.1.2 is fairly old. I suggest you get the version that aligns with your Kafka version.
You start Kafka Connect using the appropriate connect-* scripts and properties files located under the bin and etc/kafka folders.
For example,
/usr/local/confluent/bin/connect-standalone \
  /usr/local/confluent/etc/kafka/connect-standalone.properties \
  /usr/local/confluent/etc/kafka-connect-elasticsearch/quickstart.properties
If that works, you can move on to using the connect-distributed command instead.
Regarding Schema Registry, you can search its GitHub issues for multiple people trying to get MSK to work, but the root issue is that MSK does not expose a PLAINTEXT listener and the Schema Registry does not support named listeners. (This may have changed since versions 5.x.)
You could also try running Connect and Schema Registry as containers in ECS / EKS rather than extracting the archive on an EC2 machine.
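The answer above attributes the Schema Registry failure to named listeners. As a rough illustration, and assuming the older client expects the scheme of each endpoint to be one of the standard security protocol names, the sketch below lists the listener names in a broker registration that would not match. The sample registration is hypothetical: it reuses the listener map from the truncated error message, but the host names and ports are placeholders, and the function name is made up for this example.

```python
import json
from typing import List

# Security protocol names that a client without named-listener support would accept.
SECURITY_PROTOCOLS = {"PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"}


def named_listeners(broker_info: str) -> List[str]:
    """Return endpoint listener names that are not plain security protocol names."""
    info = json.loads(broker_info)
    names = [endpoint.split("://", 1)[0] for endpoint in info["endpoints"]]
    return [name for name in names if name not in SECURITY_PROTOCOLS]


# Hypothetical broker registration; hosts and ports are placeholders.
sample = json.dumps({
    "listener_security_protocol_map": {
        "CLIENT": "PLAINTEXT", "CLIENT_SECURE": "SSL",
        "REPLICATION": "PLAINTEXT", "REPLICATION_SECURE": "SSL",
    },
    "endpoints": [
        "CLIENT://broker-1.example.internal:9092",
        "CLIENT_SECURE://broker-1.example.internal:9094",
    ],
})

print(named_listeners(sample))
```

If the list is non-empty, the registration uses listener names that such a client could not interpret, which would be consistent with the "Failed to parse the broker info from zookeeper" error.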

Putting to local DynamoDB table with Python boto3 times out

I am attempting to programmatically put data into a locally running DynamoDB container by triggering a Python Lambda function.
I'm trying to follow the template provided here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.Python.03.html
I am using the amazon/dynamodb-local image, which you can download here: https://hub.docker.com/r/amazon/dynamodb-local
Using Ubuntu 18.04.2 LTS to run the container and lambda server
AWS SAM CLI to run my Lambda API
Docker Version 18.09.4
Python 3.6 (You can see this in sam logs below)
Startup command for the Python Lambda is just "sam local start-api"
First, my Lambda code:
import json
import boto3
def lambda_handler(event, context):
    print("before grabbing dynamodb")
    # dynamodb = boto3.resource('dynamodb', endpoint_url="http://localhost:8000",region_name='us-west-2',AWS_ACCESS_KEY_ID='RANDOM',AWS_SECRET_ACCESS_KEY='RANDOM')
    dynamodb = boto3.resource('dynamodb', endpoint_url="http://localhost:8000")
    table = dynamodb.Table('ContactRequests')
    try:
        response = table.put_item(
            Item={
                'id': "1234",
                'name': "test user",
                'email': "testEmail@gmail.com"
            }
        )
        print("response: " + str(response))
    except Exception as e:
        # Report the failure instead of letting the handler crash
        print("put_item failed: " + str(e))
        return {
            "statusCode": 500,
            "body": json.dumps({"message": str(e)}),
        }
    return {
        "statusCode": 200,
        "body": json.dumps({
            "message": "hello world"
        }),
    }
I have tested this with a variety of values in the boto3.resource call, including access keys, region names, and secret keys, with no improvement.
I know that the ContactRequests table should be available at localhost:8000, because I can run this command to list the tables in my DynamoDB Docker container:
dev@ubuntu:~/Projects$ aws dynamodb list-tables --endpoint-url http://localhost:8000
{
"TableNames": [
"ContactRequests"
]
}
I am also able to successfully hit the localhost:8000/shell that DynamoDB Local offers.
Unfortunately, when I hit the endpoint that triggers this function, I get a timeout that is logged like so:
Fetching lambci/lambda:python3.6 Docker container image......
2019-04-09 15:52:08 Mounting /home/dev/Projects/sam-app/.aws-sam/build/HelloWorldFunction as /var/task:ro inside runtime container
2019-04-09 15:52:12 Function 'HelloWorldFunction' timed out after 3 seconds
2019-04-09 15:52:13 Function returned an invalid response (must include one of: body, headers or statusCode in the response object). Response received:
2019-04-09 15:52:13 127.0.0.1 - - [09/Apr/2019 15:52:13] "GET /hello HTTP/1.1" 502 -
Notice that none of my print statements are being triggered. If I remove the call to table.put_item, the print statements are called successfully.
I've seen similar questions on Stack Overflow, such as "lambda python dynamodb write gets timeout error", which state that the problem is that I am using a local db. But shouldn't I still be able to write to a local db with boto3 if I point it at my locally running DynamoDB instance?
Your Docker container running the Lambda function can't reach DynamoDB at 127.0.0.1. Inside that container, 127.0.0.1 (localhost) refers to the container itself, not to your host machine. Try instead using the name of your DynamoDB Local Docker container as the host name for the endpoint:
dynamodb = boto3.resource('dynamodb', endpoint_url="http://<DynamoDB_LOCAL_NAME>:8000")
You can use docker ps to find the <DynamoDB_LOCAL_NAME> or give it a name:
docker run --name dynamodb amazon/dynamodb-local
and then connect:
dynamodb = boto3.resource('dynamodb', endpoint_url="http://dynamodb:8000")
Found the solution to the problem here: "connecting AWS SAM Local with dynamodb in docker"
The question asker noted that he saw online that he may need to connect to the same Docker network, created using:
docker network create local-lambda
So I created this network, then updated my sam command and my docker command to use it, like so:
docker run --name dynamodb -p 8000:8000 --network=local-lambda amazon/dynamodb-local
sam local start-api --docker-network local-lambda
After that I no longer experienced the timeout issue.
I'm still working on understanding exactly why this was the issue.
To be fair though, it was also important that I use the DynamoDB container name as the host in my boto3 resource call.
So in the end, it was a combination of the solution above and the answer provided by "Reto Aebersold" that produced the final solution:
dynamodb = boto3.resource('dynamodb', endpoint_url="http://<DynamoDB_LOCAL_NAME>:8000")
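One way to avoid hard-coding the container name is to read it from the environment, so the same handler can point at a different host outside the Docker network. This is only a sketch: DYNAMODB_HOST and DYNAMODB_PORT are hypothetical variable names invented here (neither boto3 nor SAM reads them), and the default host "dynamodb" assumes the container was started with --name dynamodb as in the answer above.

```python
import os


def dynamodb_resource_kwargs(env=None):
    """Build keyword arguments for boto3.resource('dynamodb', ...).

    DYNAMODB_HOST and DYNAMODB_PORT are hypothetical names used only in
    this sketch; set them however your environment allows.
    """
    env = os.environ if env is None else env
    host = env.get("DYNAMODB_HOST", "dynamodb")  # container name on the shared network
    port = env.get("DYNAMODB_PORT", "8000")
    return {"endpoint_url": f"http://{host}:{port}"}


def get_table(name):
    """Return a Table handle that targets the configured local endpoint."""
    import boto3  # imported lazily so the helper above has no dependencies
    dynamodb = boto3.resource("dynamodb", **dynamodb_resource_kwargs())
    return dynamodb.Table(name)


print(dynamodb_resource_kwargs({"DYNAMODB_HOST": "dynamodb"}))
```

Inside the handler, table = get_table('ContactRequests') would then replace the two lines that create the resource and the table.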

Nexus 3.6 OSS Docker Hub Proxy - Can docker search but not docker pull

I've deployed Nexus OSS 3.6 and it's being served on http://server:8082/nexus
I have configured a docker-hub proxy using the instructions in http://www.sonatype.org/nexus/2017/02/16/using-nexus-3-as-your-repository-part-3-docker-images/ and have configured the docker-group to serve under port 18000
I can perform the following:
docker login server:18000
docker search server:18000/jenkins
but when I run:
docker pull server:18000/jenkins
I get the following error:
Error response from daemon: Get http://10.105.139.17:18000/v2/jenkins/manifests/latest:
error parsing HTTP 400 response body: invalid character '<'
looking for beginning of value:
"<html>\n<head>\n<meta http-equiv=\"Content-Type\"
content=\"text/html;charset=ISO-8859-1\"/>\n<title>
Error 400 </title>\n</head>\n<body>\n<h2>HTTP ERROR: 400</h2>\n
<p>Problem accessing /nexus/v2/token.
Reason:\n<pre> Not a Docker request</pre></p>\n<hr />
Powered by Jetty:// 9.3.20.v20170531<hr/>\n
</body>\n</html>\n"
My jetty nexus.properties config file is:
# Jetty section
application-port=8082
application-host=0.0.0.0
# nexus-args=${jetty.etc}/jetty.xml,${jetty.etc}/jetty-http.xml,${jetty.etc}/jetty-requestlog.xml
nexus-context-path=/nexus
# Nexus section
# nexus-edition=nexus-pro-edition
# nexus-features=\
# nexus-pro-feature
Could anyone offer any suggestions on how to fix this please?
I had the same problem when I enabled anonymous read on a Docker repository.
Repositories -> Docker hosted -> check the checkbox (Disable to allow anonymous pull) on the repository.
It seems you need to upgrade Nexus to 3.6.1, according to:
https://issues.sonatype.org/browse/NEXUS-14488
in order to allow anonymous read again.

Recovering from Consul "No Cluster leader" state

I have:
one mesos-master in which I configured a consul server;
one mesos-slave in which I configured a consul client, and;
one bootstrap server for consul.
When I hit start I am seeing the following error:
2016/04/21 19:31:31 [ERR] agent: failed to sync remote state: rpc error: No cluster leader
2016/04/21 19:31:44 [ERR] agent: coordinate update error: rpc error: No cluster leader
How do I recover from this state?
Did you look at the Consul docs?
It looks like you have performed an ungraceful stop and now need to clean your raft/peers.json file by removing all entries there to perform an outage recovery. See the Consul docs for more details.
As of Consul 0.7 things work differently from Keyan P's answer. raft/peers.json (in the Consul data dir) has become a manual recovery mechanism. It doesn't exist unless you create it, and then when Consul starts it loads the file and deletes it from the filesystem so it won't be read on future starts. There are instructions in raft/peers.info. Note that if you delete raft/peers.info it won't read raft/peers.json but it will delete it anyway, and it will recreate raft/peers.info. The log will indicate when it's reading and deleting the file separately.
Assuming you've already tried the bootstrap or bootstrap_expect settings, that file might help. The Outage Recovery guide in Keyan P's answer is a helpful link. You create raft/peers.json in the data dir and start Consul, and the log should indicate that it's reading/deleting the file and then it should say something like "cluster leadership acquired". The file contents are:
[ { "id": "<node-id>", "address": "<node-ip>:8300", "non_voter": false } ]
where <node-id> can be found in the node-id file in the data dir.
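If there are several servers, the file can be generated rather than typed by hand. The sketch below is one possible way to do that, assuming the object format shown above and the default server RPC port 8300. The function name is made up for illustration, and the node IDs and IP addresses are sample values, not ones to copy.

```python
import json


def build_peers_json(servers, port=8300):
    """Render raft/peers.json content from (node_id, ip) pairs."""
    peers = [
        {"id": node_id, "address": f"{ip}:{port}", "non_voter": False}
        for node_id, ip in servers
    ]
    return json.dumps(peers, indent=2)


# Sample values only; read each real ID from the node-id file in the data dir.
servers = [
    ("e3a30829-9849-bad7-32bc-11be85a49200", "10.88.0.59"),
    ("326d7d5c-1c78-7d38-a306-e65988d5e9a3", "10.88.0.45"),
]
print(build_peers_json(servers))
```

The printed text can be saved as raft/peers.json in the data dir before Consul is started.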
If you are running a Raft protocol version higher than 2, the file looks like this:
[
{
"id": "e3a30829-9849-bad7-32bc-11be85a49200",
"address": "10.88.0.59:8300",
"non_voter": false
},
{
"id": "326d7d5c-1c78-7d38-a306-e65988d5e9a3",
"address": "10.88.0.45:8300",
"non_voter": false
},
{
"id": "a8d60750-4b33-99d7-1185-b3c6d7458d4f",
"address": "10.233.103.119:8300",
"non_voter": false
}
]
In my case I had 2 worker nodes in the k8s cluster. After adding another node, the Consul servers could elect a leader and everything is up and running.
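The effect of adding a third node can be seen from the majority rule that Raft uses for leader election: a cluster of N servers needs N/2 + 1 of them (integer division) to agree. A small sketch of that arithmetic, with function names chosen for illustration:

```python
def quorum(servers: int) -> int:
    """Number of servers that must agree for a Raft majority."""
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can be lost while a majority is still reachable."""
    return servers - quorum(servers)


for n in range(1, 6):
    print(f"servers={n} quorum={quorum(n)} tolerates={failure_tolerance(n)}")
```

With 2 servers the quorum is 2, so losing either one leaves the cluster without a leader. With 3 servers the quorum is still 2, so one server can fail.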
Here is what I did.
A little background: we scaled down the AWS Auto Scaling group and so lost the leader. One server was still running, but without any leader.
What I did was:
Scaled up to 3 servers (don't use 2 or 4).
Stopped Consul on all 3 servers: sudo service consul stop (you can use status/stop/start).
Created the peers.json file and put it on the old server (/opt/consul/data/raft).
Started the 3 servers (peers.json should be placed on 1 server only).
Joined the other 2 servers to the leader using consul join 10.201.8.XXX
Checked that the peers are connected to the leader using consul operator raft list-peers
Sample peers.json file
[
{
"id": "306efa34-1c9c-acff-1226-538vvvvvv",
"address": "10.201.n.vvv:8300",
"non_voter": false
},
{
"id": "dbeeffce-c93e-8678-de97-b7",
"address": "10.201.X.XXX:8300",
"non_voter": false
},
{
"id": "62d77513-e016-946b-e9bf-0149",
"address": "10.201.X.XXX:8300",
"non_voter": false
}
]
You can get these IDs from each server in /opt/consul/data/
[root@ip-10-20 data]# ls
checkpoint-signature node-id raft serf
[root@ip-10-1 data]# cat node-id
Some useful commands:
consul members
curl http://ip:8500/v1/status/peers
curl http://ip:8500/v1/status/leader
consul operator raft list-peers
cd /opt/consul/data/raft/
consul info
sudo service consul status
consul catalog services
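The status endpoint in the list above can also be checked from a script. This is a minimal sketch, assuming /v1/status/leader returns a JSON-encoded string with the leader's address and an empty string when there is no leader. The function names and the sample address are placeholders.

```python
import json
from urllib.request import urlopen


def parse_leader(body: str) -> str:
    """Decode the JSON string returned by /v1/status/leader."""
    return json.loads(body)


def has_leader(body: str) -> bool:
    """True when the response names a leader, False for an empty string."""
    return parse_leader(body) != ""


def fetch_leader(base_url: str, timeout: float = 5.0) -> str:
    """Query a running agent, e.g. base_url='http://ip:8500'."""
    with urlopen(f"{base_url}/v1/status/leader", timeout=timeout) as resp:
        return parse_leader(resp.read().decode("utf-8"))


# Offline examples with placeholder responses; no agent is contacted here.
print(has_leader('"10.201.8.10:8300"'))
print(has_leader('""'))
```

Calling fetch_leader in a loop after restarting the servers gives a simple way to wait until a leader has been elected.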
You should also make sure that the bootstrap parameter is set in your Consul configuration file config.json on the first node:
# /etc/consul/config.json
{
"bootstrap": true,
...
}
or start the consul agent with the -bootstrap=1 option as described in the official Failure of a single server cluster Consul documentation.
