How to get access to Spark shell from Kubernetes? - bash

I've used the Helm chart to deploy Spark to Kubernetes on GCE. According to the default configuration in values.yaml, Spark is deployed to the path /opt/spark. I've checked that Spark deployed successfully by running kubectl --namespace=my-namespace get pods -l "release=spark". One master and 3 workers are running.
However, when I tried to check the Spark version by executing spark-submit --version from the Google Cloud console, it returned -bash: spark-submit: command not found.
I've navigated to the /opt directory and the /spark folder is missing. What should I do to be able to open a Spark shell terminal and execute Spark commands?

You can verify by checking the services:
kubectl get services -n <namespace>
You can port-forward a particular service and try running it locally to check:
kubectl port-forward svc/<service name> <external port>:<internal port or spark running port>
Locally you can try running a Spark terminal; it will be connected to Spark running on the GCE instance.
If you check the Helm chart documentation, there are also options for the UI; you can do the same to access the UI via port-forward.
Access via a shell inside the pod
kubectl exec -it <spark pod name> -- /bin/bash
Here you can directly run Spark commands, e.g. spark-submit --version
Access the UI
Access the UI via port-forwarding if you have enabled the UI in the Helm chart.
kubectl port-forward svc/<spark service name> <external port>:<internal port or spark running port>
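For example, a minimal sketch assuming the chart exposes a web UI service (the service name spark-webui and namespace my-namespace below are assumptions; check kubectl get svc for the real name):
# Forward the Spark web UI service to a local port (service name is hypothetical)
kubectl port-forward svc/spark-webui 8080:8080 -n my-namespace
# Then open http://localhost:8080 in a browser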
External load balancer
This particular Helm chart also creates an external load balancer; you can get the external IP using:
kubectl get svc -n <namespace>
Access the shell
If you want to connect via the load balancer IP & port:
./bin/spark-shell --conf spark.cassandra.connection.host=<Load balancer IP> --conf spark.cassandra.connection.native.port=<Port>
Creating a connection using port-forward:
kubectl port-forward svc/<spark service name> <external(local) port>:<internal port or spark running port>
./bin/spark-shell --conf spark.cassandra.connection.host=localhost --conf spark.cassandra.connection.native.port=<local Port>
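If you just want an interactive shell against the chart's standalone master, a minimal sketch (assuming the master listens on the standard standalone port 7077; the service name spark-master-svc and namespace my-namespace are assumptions) could look like this:
# Forward the standalone master port locally (service name is hypothetical)
kubectl port-forward svc/spark-master-svc 7077:7077 -n my-namespace
# From a local Spark distribution, point spark-shell at the forwarded master
./bin/spark-shell --master spark://localhost:7077
The Cassandra settings above are only needed if your jobs read from Cassandra; a plain spark-shell session works without them.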

One way would be to log in to the pod and then run Spark commands.
List the pods:
kubectl --namespace=my-namespace get pods -l "release=spark"
Now, log in to the pod using the following command:
kubectl exec -it <pod-id> -- /bin/bash
Now you should be inside the pod and can run Spark commands:
spark-submit --version
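For example, to pick the pod name automatically and run spark-submit from its install path, a minimal sketch (assuming Spark lives under /opt/spark as in the chart's values.yaml mentioned in the question):
# Grab the first pod matching the release label
POD=$(kubectl --namespace=my-namespace get pods -l "release=spark" -o jsonpath='{.items[0].metadata.name}')
# Run spark-submit inside it using the chart's install path
kubectl --namespace=my-namespace exec -it "$POD" -- /opt/spark/bin/spark-submit --version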
Ref: https://kubernetes.io/docs/tasks/debug-application-cluster/get-shell-running-container/#getting-a-shell-to-a-container
Hope this helps.

This worked for me.
spark-shell --master k8s://localhost:32217
My spark master is a LoadBalancer exposed at localhost:32217
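If you need to build that k8s:// URL yourself, you can read the API server address from kubectl first; a minimal sketch (spark-shell in client mode against Kubernetes also needs an executor image, and the image name below is an assumption):
# Find the Kubernetes API server address
kubectl cluster-info
# Use it as the master URL; the container image is hypothetical
spark-shell --master k8s://https://<api-server-host>:<port> --conf spark.kubernetes.container.image=<your-spark-image>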

Related

What happens to systemd services after migrating GCE VM to GKE?

I'm following this doc to test migrating a GCE VM to GKE, but it is unclear to me what happens to my systemd services after the migration. Usually containers are used to run a single application instead of lots of daemons.
I tried to see if systemd services are running in the Pod, but failed:
$ kubectl exec -it my-app-0 -- systemctl status
System has not been booted with systemd as init system (PID 1). Can't operate.
command terminated with exit code 1
I think the doc needs to be improved to include more details about what's going on with the Pod after the migration. In addition to systemd services, what is the entrypoint of the container in the Pod?
For migrated containers, this should give you the desired result:
kubectl exec -it my-app-0 -- bash -c "systemctl status"
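To answer the entrypoint part of the question, you can inspect the Pod spec and what actually runs as PID 1; a minimal sketch (standard Pod spec fields; ps may not exist in a very minimal image):
# Show any command/args overrides on the first container (empty output means the image's own ENTRYPOINT/CMD is used)
kubectl get pod my-app-0 -o jsonpath='{.spec.containers[0].command} {.spec.containers[0].args}{"\n"}'
# Check what runs as PID 1 inside the container
kubectl exec -it my-app-0 -- ps -p 1 -o comm=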

How to connect to kubernetes cluster locally and open dashboard?

I have a new laptop and a Kubernetes cluster running on Google Cloud Platform. How can I access that cluster from my local machine to execute kubectl commands, open the dashboard, etc.?
That is not clearly stated in the documentation.
From your local workstation, you need to have the gcloud tool installed and properly configured to connect to the correct Google Cloud account. Then you can run:
gcloud container clusters get-credentials [CLUSTER_NAME]
This will set up kubectl to connect to your Kubernetes cluster.
Of course you'll need to install kubectl either using gcloud with:
gcloud components install kubectl
Or using specific instructions for your operating system.
Please check the following link for more details: https://cloud.google.com/kubernetes-engine/docs/quickstart
Once you have kubectl access you can deploy and access the kubernetes dashboard as described here: https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/
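A common way to reach the dashboard once kubectl works is through kubectl proxy; a minimal sketch (the exact proxy path depends on the dashboard version and the namespace it was deployed into):
# Start a local proxy to the API server (listens on localhost:8001 by default)
kubectl proxy
# Then open the dashboard through the proxy, e.g.:
# http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/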
The first thing you need to do once you've installed the Cloud SDK is to ensure it is authenticated to your Google Cloud Platform account/project. To do this, run:
gcloud auth login
Then follow the on-screen instructions.
You will also need to install kubectl to access/control aspects of your cluster:
gcloud components install kubectl
You can also install it through native package management by following the instructions here.
Once your gcloud is authenticated to your project you can run this to ensure kubectl is pointing at your cluster and authenticated:
gcloud container clusters get-credentials CLUSTER_NAME --zone ZONE
You'll now be able to issue commands with kubectl that target the cluster you defined in the previous step.
You can access the dashboard following the instructions here.
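A quick sanity check after get-credentials, using standard kubectl commands:
# Show which context kubectl is currently using
kubectl config current-context
# List the cluster's nodes as a connectivity check
kubectl get nodes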

How can I configure the kubectl CLI without logging into the web UI?

How can I configure the kubectl CLI without logging into the web UI?
My environment isn't set up yet and I want to run some kubectl commands.
Install the kubectl CLI first.
After that, you can issue the command.
e.g. kubectl -s 127.0.0.1:8888 -n kube-system get pods
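If you'd rather not pass -s on every call, you can persist the server address in your kubeconfig; a minimal sketch (the cluster and context names "local" are arbitrary examples):
# Record the API server address and a context that uses it
kubectl config set-cluster local --server=http://127.0.0.1:8888
kubectl config set-context local --cluster=local
kubectl config use-context local
# Plain kubectl calls now target that server
kubectl -n kube-system get pods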
Kudos to Arjun.

Worker nodes not available

I have set up and installed IBM Cloud Private CE with two Ubuntu images in VirtualBox. I can ssh into both images and from there ssh into the other. The ICP dashboard shows only one active node; I was expecting two.
I explicitly ran the command (from a root user on master node):
docker run -e LICENSE=accept --net=host \
-v "$(pwd)":/installer/cluster \
ibmcom/cfc-installer install -l \
192.168.27.101
The result of this command seemed to be a successful addition of the worker node:
PLAY RECAP *********************************************************************
192.168.27.101 : ok=45 changed=11 unreachable=0 failed=0
But still the worker node isn't showing in the dashboard.
What should I be checking to ensure the worker node will work with the master node?
If you're using Vagrant to configure IBM Cloud Private, I'd highly recommend trying https://github.com/IBM/deploy-ibm-cloud-private
The project will use a Vagrantfile to configure a master/proxy and then provision 2 workers within the image using LXD. You'll get better density and performance on your laptop with this configuration over running two full Virtual Box images (1 for master/proxy, 1 for the worker).
You can check on your worker node with the following steps:
Check the cluster node status:
kubectl get nodes to check the status of the newly added worker node (see the example below).
If it's NotReady, check the kubelet log to see whether there is an error message about why kubelet is not running properly:
ICP 2.1
systemctl status kubelet
ICP 1.2
docker ps -a | grep kubelet to get the kubelet container ID, then
docker logs <kubelet container id>
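A quick way to see why a node shows NotReady, using standard kubectl (replace the node name with the one reported by kubectl get nodes):
# Show the node's conditions and recent events
kubectl describe node <node-name>
# Wide output also shows internal IPs and kubelet versions
kubectl get nodes -o wide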
Run this to get kubectl working:
ln -sf /opt/kubernetes/hyperkube /usr/local/bin/kubectl
Run the command below on the master node to get details of the pods running in the environment and to identify any failed pods:
kubectl -n kube-system get pods -o wide
To restart any failed ICP pods:
txt="0/";ns="kube-system";type="pods"; kubectl -n $ns get $type | grep "$txt" | awk '{ print $1 }' | xargs kubectl -n $ns delete $type
Now run:
kubectl cluster-info
kubectl get nodes
Then check the output of kubectl cluster-info: it should show https://localhost:8080 or https://masternodeip:8001.
If you don't get that output, log in to https://masternodeip:8443 with the admin login, copy the Configure Client CLI settings by clicking on admin in the panel, paste them on your master node, and run kubectl cluster-info again.

Cannot connect to Minikube on macOS

I installed minikube as instructed here https://github.com/kubernetes/minikube/releases
and started it with a simple minikube start command.
But the next step, which is as simple as kubectl get pods --all-namespaces fails with
Unable to connect to the server: dial tcp 192.168.99.100:8443: i/o timeout
What did I miss?
I ran into the same issue using my Mac. Basically I uninstalled both minikube and kubectl and reinstalled them as follows:
Install Minikube:
curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.8.0/minikube-darwin-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
Install kubectl:
curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/v1.3.0/bin/darwin/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
Start a cluster by running the command:
minikube start
Minikube will also create a “minikube” context, and set it to default in kubectl. To switch back to this context later, run this command:
kubectl config use-context minikube
Now, to get the list of all pods, run the command:
kubectl get pods --all-namespaces
Now you should be able to get the list of pods. Also make sure that you don't have a firewall within your network that blocks the connections.
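You can then confirm the VM is reachable and that kubectl points at it, using standard minikube/kubectl commands:
# Show the IP of the minikube VM (should match the address kubectl dials, e.g. 192.168.99.100)
minikube ip
# Confirm the API server responds
kubectl cluster-info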
I faced a similar issue on Windows 7 when I changed work environments. As you said, it works fine at home but not at the office; there is a high chance it is caused by a firewall policy that blocks TLS verification.
Instead of wasting time on troubleshooting (sometimes there is nothing you can do if you cannot turn off the firewall), if you just want to test a local minikube cluster, I would suggest disabling TLS verification.
This is what I have done:
# How to disable minikube TLS verification
## disable TLS verification
$ VBoxManage controlvm minikube natpf1 k8s-apiserver,tcp,127.0.0.1,8443,,8443
$ VBoxManage controlvm minikube natpf1 k8s-dashboard,tcp,127.0.0.1,30000,,30000
$ kubectl config set-cluster minikube-vpn --server=https://127.0.0.1:8443 --insecure-skip-tls-verify
$ kubectl config set-context minikube-vpn --cluster=minikube-vpn --user=minikube
$ kubectl config use-context minikube-vpn
## test kubectl
$ kubectl get pods
## enable local docker client
$ VBoxManage controlvm minikube natpf1 k8s-docker,tcp,127.0.0.1,2374,,2376
$ eval $(minikube docker-env)
$ unset DOCKER_TLS_VERIFY
$ export DOCKER_HOST="tcp://127.0.0.1:2374"
$ alias docker='docker --tls'
## test local docker client
$ docker ps
## test minikube dashboard
$ curl http://127.0.0.1:30000
I also made a small script for this for your reference.
Hope it is helpful for you.
You just need to restart minikube. Sometimes I have this problem when my computer has been off for a while. I don't think you need to reinstall anything.
First verify you are in the correct context
$ kubectl config current-context
minikube
Check Minikube status (status should show "Running", mine below showed "Saved")
$ minikube status
minikube: Saved
cluster:
kubectl:
Restart minikube
$ minikube start
Starting local Kubernetes v1.8.0 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.
Verify it is running (This is what you should see)
$ minikube status
minikube: Running
cluster: Running
kubectl: Correctly Configured: pointing to minikube-vm at 192.168.99.100
I had this issue when connected to Cisco AnyConnect VPN. Once I disconnected, minikube ran fine. Discussion on github here: https://github.com/kubernetes/minikube/issues/4540
