Local Deployment/Installation of Kubeflow on Windows

I am facing a major problem in installing Kubeflow locally on my Windows 10 Machine.
Machine Specs - OS: Windows 10, RAM: 16GB
Approaches Tried To Install
Microk8s - Not Successful
MicroK8s does not install properly: I get a “MicroK8s not found” error, and the instance crashes.
MiniKube + Vagrant with VirtualBox - Partially Successful
“Vagrant up” is very slow. Sometimes, when I manage to open the Kubeflow console locally, it crashes after running a couple of experiments. Errors arise from time to time, and it is hard to pinpoint why they occur.
Kind - Not Successful
Out-dated docs and tutorials. Old commands don’t work and the manifests have been moved to another repo.
K3s - Not Successful
I cannot install the manifests, as they have been moved to another repo, and there are no updated docs explaining how to install them.
Resources Referred:
https://kirenz.github.io/codelabs/codelabs/kubeflow-install/#4
https://www.kubeflow.org/docs/components/pipelines/installation/localcluster-deployment/
https://github.com/kubeflow/manifests
And the official docs for each of the approaches taken.
Digging deeper, I found that some guides say Kubeflow 1.5.0 is not compatible with Kubernetes 1.22 and later. As of now, there are no older Kubernetes releases (lower than 1.22) on the official site. Is this the root cause of the issues I am facing?
Is there any other way to install and set up Kubeflow locally on Windows? It is hard to find a guide, tutorial, or video that is not outdated.

The current https://www.kubeflow.org/docs/started/installing-kubeflow/ page suggests using a package.
None of the packages are expressly for Windows. The "Charmed Kubeflow" looked promising for an install on a Windows machine, so I went with it. Here are the steps I figured out after much trial and error.
Enable Hyper-V for Windows. (If you have Windows Home, see https://www.makeuseof.com/install-hyper-v-windows-11-home/.)
Install Multipass from https://multipass.run/install. Choose Hyper-V over VirtualBox if you can. If you cannot, finish the install while ignoring any VirtualBox-related errors, and then run multipass set local.driver=hyperv in a command prompt. (You may need to restart your computer here.)
Command prompt: multipass shell
Shell: exit
Command prompt: multipass stop
In the Windows program "Hyper-V Manager", select the VM. Settings:
Memory, RAM: 4096, Enable Dynamic Memory.
Processor, Number of Virtual Processors = 2.
SCSI Controller, Hard Drive, Virtual hard disk, Edit, Action: Expand, New size 50 GB.
Command prompt:
multipass start
multipass shell
Shell: (These steps are mostly from https://charmed-kubeflow.io/docs/quickstart.)
sudo snap install microk8s --classic --channel=1.21/stable
sudo usermod -a -G microk8s $USER
newgrp microk8s
sudo chown -f -R $USER ~/.kube
microk8s enable dns
microk8s enable storage
microk8s enable ingress
microk8s enable metallb:10.64.140.43-10.64.140.49
microk8s enable dashboard
Check the status until those items are enabled. Shell: microk8s status --wait-ready
Shell:
(Note: When I tried to enable istio before doing juju bootstrap microk8s, the juju bootstrap command consistently failed with the following error regardless of how much memory I allocated to the VM: failed to bootstrap model: creating controller stack: creating statefulset for controller: timed out waiting for controller pod: unschedulable: 0/1 nodes are available: 1 Insufficient memory.)
sudo snap install juju --classic
juju bootstrap microk8s
microk8s enable istio
juju add-model kubeflow
juju deploy kubeflow-lite --trust
juju config dex-auth public-url=http://10.64.140.43.nip.io
juju config oidc-gatekeeper public-url=http://10.64.140.43.nip.io
juju config dex-auth static-username=admin
juju config dex-auth static-password=admin
watch -c juju status --color
To access the Kubernetes dashboard: The Charmed Kubeflow Quickstart instructions for this will not work as-is for your Windows web browser. Try this:
Shell: microk8s dashboard-proxy. (This will keep running to serve the dashboard until you Ctrl-C cancel, close the window, or shutdown the VM.)
Command prompt: multipass list
Windows web browser: https://<ip address for the VM from multipass list>:<port number from dashboard-proxy>, copy-paste the token from dashboard-proxy
To access the Kubeflow dashboard:
Shell: microk8s kubectl port-forward -n istio-system service/istio-ingressgateway 8080:80 --address=0.0.0.0. (This will keep running to serve the dashboard until you Ctrl-C cancel, close the window, or shutdown the VM.)
Command prompt: multipass list
Windows web browser: http://<ip address for the VM from multipass list>:8080 (the port-forward targets port 80 of the ingress gateway, which serves plain HTTP)
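On the version question: the MicroK8s install step above pins --channel=1.21/stable, which is consistent with the guides saying Kubeflow 1.5.0 does not work on Kubernetes 1.22 and later. Below is a minimal sketch of a check you could run in the VM shell before installing. The k8s_below helper and its default limit of 1.22 are my own illustration, not part of MicroK8s or Kubeflow.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a Kubernetes version or snap channel
# such as "1.21/stable" or "v1.21.9" is below the given release (default 1.22).
k8s_below() {
    version=${1#v}            # drop a leading "v" if present
    limit=${2:-1.22}
    major=${version%%.*}      # text before the first dot
    rest=${version#*.}        # text after the first dot
    minor=${rest%%[!0-9]*}    # keep only the leading digits of the minor part
    limit_major=${limit%%.*}
    limit_minor=${limit#*.}
    [ "$major" -lt "$limit_major" ] ||
        { [ "$major" -eq "$limit_major" ] && [ "$minor" -lt "$limit_minor" ]; }
}

# Example: validate the channel used in the install command.
channel="1.21/stable"
if k8s_below "$channel"; then
    echo "channel $channel is below 1.22"
else
    echo "channel $channel is 1.22 or newer"
fi
```

If you later try a newer Kubeflow release, pass its documented limit as the second argument instead of relying on the 1.22 default.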

Related

Why docker desktop is unable to start docker-backend?

Problem:
I started my system as usual, but Docker Desktop doesn't work, WSL doesn't respond to commands, and there is a process called "Vmmem" using 25% of my memory. I have tried a bunch of things but nothing seems to work.
System Attributes:
Windows 10 Pro (10.0.19045.2486)
docker: 4.15
WSL: 1.0.3.0
More context:
Recently I was having trouble with my Docker setup. I have one particular container that was "crashing" Docker. It was not throwing any exception, but after some event (that I couldn't identify) all the other containers were unreachable, and any attempt to stop/start another container would result in "Error: 500 failed to respond...". When this happens I usually just restart the system and everything works fine, but today that wasn't the case. I restarted and noticed that the "Vmmem" process was already running at 25% (it usually only reaches this point at the end of the day), Docker Desktop could not start the Docker backend, and when I tried running wsl -l -v I got no response. I can use some docker commands like docker -v, but docker compose up doesn't work at all.
What I've tried:
restart the system again (nothing changed, still starting with 25% mem usage)
deactivating Hyper-V (nothing happened)
stop/start docker service using net start/stop <service> (it gives a response but didn't solve the problem)
Uninstall docker-desktop (it crashes before even starting the uninstall process)
Terminate WSL wsl -t Ubuntu (got no response from wsl)
Overwrite installation with Docker 4.16 (it gets stuck on "Preparing for update... / Stopping VM and preparing for update")
Forcefully kill the "Vmmem" (I've got Access denied error)
Edit 1:
I finally managed to install Docker Desktop 4.16, but the problem continues: the system starts with 25% Vmmem memory usage and Docker Desktop is not able to start the backend.
The Vmmem process represents the memory and CPU consumed by all the virtual machines running on your Windows PC, so it is possible that some processes are still running. I recommend you try running these commands from the console:
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
This will stop all containers and delete them.
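One caveat with the two commands above: when there are no containers, docker ps -a -q prints nothing and docker stop then complains about missing arguments. Below is a minimal sketch of a guarded version for a POSIX shell. The stop_and_remove_all name is only for illustration.

```shell
#!/bin/sh
# Stop and remove every container, but do nothing when none exist.
stop_and_remove_all() {
    # Returns early with an error if the docker CLI cannot list containers,
    # for example because the daemon is not responding.
    ids=$(docker ps -a -q) || return 1
    if [ -z "$ids" ]; then
        echo "no containers to stop"
        return 0
    fi
    # $ids is intentionally unquoted so each ID becomes its own argument.
    docker stop $ids && docker rm $ids
}

# Usage:
#   stop_and_remove_all
```

If the function returns an error right away, the daemon itself is unreachable and stopping containers will not help. That matches the symptoms in the question.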
If this doesn't work, I recommend you enter your BIOS settings and disable virtualization; that way those processes will stop. Then you can enable it again and retry. I wish you luck and I hope this resolves it.
Steps that I did to be able to stop "Vmmem" process and install docker desktop again:
disable Hyper-V
disable virtualization (BIOS)
restart system
(at this point the "Vmmem" problem was gone)
uninstall Docker Desktop
remove all WSL instances
enable Hyper-V
enable hypervisorlaunchtype
restart system
enable virtualization (BIOS)
install wsl Ubuntu instance
install Docker Desktop
Maybe some of the steps listed here are redundant, but that is what I did. I hope it helps anyone else going through the same problem.

Why is the Docker service stopping?

I'm running Ubuntu as a subsystem on Windows 10.
I have just followed the steps to install Docker on Linux:
https://docs.docker.com/install/linux/docker-ce/ubuntu/
I am now at the step to test the hello-world app:
$ sudo docker run hello-world
Where I get this error:
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.
I have narrowed it down to the service itself not running, despite the many solutions online that more or less fix this type of error.
When I check the status:
$ sudo service docker status
* Docker is not running
It says it's not running so I start it successfully:
$ sudo service docker start
* Starting Docker: docker [ OK ]
If I check the status immediately, it says it's running. But when I check it again a few seconds later, it's not running:
$ sudo service docker status
* Docker is running
$ sudo service docker status
* Docker is not running
Why is the Docker service stopping and how can I keep it running?
What you got is as expected.
Microsoft does not support running the Docker daemon (also known as the service) within the WSL instance. You can refer to this discussion.
What you can do is use the Docker client in WSL to connect to a remote Docker engine, which means the Docker daemon runs on another machine.
However, if you use WSL 2, which was announced on May 6th, 2019, then according to Microsoft's announcement it should be possible (the announcement also includes a demo you can look at):
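Pointing the client at a remote engine comes down to setting DOCKER_HOST before running docker commands. Below is a minimal sketch. The docker_host_value helper, the 192.168.1.50 address and the default port 2375 (the conventional unencrypted port) are all assumptions for illustration. Use whatever your remote daemon actually listens on.

```shell
#!/bin/sh
# Hypothetical helper: build a DOCKER_HOST value for a remote engine.
docker_host_value() {
    host=$1
    port=${2:-2375}   # assumption: daemon listens unencrypted on 2375
    echo "tcp://$host:$port"
}

# Usage inside WSL (192.168.1.50 is a placeholder address):
#   export DOCKER_HOST=$(docker_host_value 192.168.1.50)
#   docker info
```

With DOCKER_HOST exported, the client in WSL no longer looks for /var/run/docker.sock, so the local service does not need to be running.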
Today we’re unveiling the newest architecture for the Windows Subsystem for Linux: WSL 2! Changes in this new architecture will allow for: dramatic file system performance increases, and full system call compatibility, meaning you can run more Linux apps in WSL 2 such as Docker.
You need either Docker on Windows:
https://medium.com/#sebagomez/installing-the-docker-client-on-ubuntus-windows-subsystem-for-linux-612b392a44c4

Connecting Docker Windows WSL Ubuntu to VMware Ubuntu

I am trying to set up my Windows 10 Home system so that it can run full Linux OS Docker containers. I have installed Docker on both WSL Ubuntu 18.04 and a VMware Ubuntu 18.04.
I was trying to follow this guide.
However, I get stuck trying to configure the Daemon as per the instructions.
Can’t use Docker for Windows?
This is only necessary if you are NOT running Docker for Windows!
No problem, just configure your Docker daemon to use -H tcp://0.0.0.0:2375 and --tlsverify=false. Then you can follow along with the rest of this guide exactly.
If you go down this route, I highly recommend rolling your own VM with VMware Player instead of using the Docker Toolbox because VirtualBox has crazy edge case shared folder bugs that will ruin your life at some point. Don’t worry, VMware Player is free. Just Google how to set up Ubuntu 16 server on VMware Player.
When I try to change the Docker Daemon by making a daemon.json file I get errors. I've also tried editing the .profile files and the .bashrc as per other guides (another guide), with no luck.
I am unable to check the DOCKER_HOST variable on the VM Ubuntu.
Don't make things complicated. In your case, why WSL if you just want to connect to a remote daemon? Why not simply use the windows docker client?
Set up your favorite local VM with Docker.
Example: I have installed a CentOS distro running on local VMware Workstation. Hyper-V is, of course, uninstalled/deactivated.
In this VM, enable TCP access for the daemon.
If you have a systemd Linux distro (like my CentOS), execute this:
sudo mkdir -p /etc/systemd/system/docker.service.d
# sudo does not apply to a shell redirect, so pipe through sudo tee instead
echo '[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H unix:// -H tcp://0.0.0.0:2375' | sudo tee -a /etc/systemd/system/docker.service.d/options.conf
sudo systemctl daemon-reload
sudo systemctl restart docker
Test if the port is open with docker info. You should get an API access warning at the bottom of the output.
Download the Windows docker cli zip from here: https://download.docker.com/win/static/stable/x86_64/
Move the docker.exe to any folder, for ex. your Documents folder.
Then put this folder path into your Windows PATH variable.
Set the docker host: Open PowerShell, execute setx DOCKER_HOST <VM-IP>:2375 and close it.
Open a new PowerShell and call docker info.
You should see both the client and the daemon info.
Do whatever you like now... :-)
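If docker info on the Windows side hangs, it helps to check the daemon's TCP endpoint separately from the client. Below is a minimal sketch using the Engine API's /_ping endpoint, which answers OK when the daemon is reachable. The helper names and the example address are illustrative, and the sketch assumes curl is available, for example inside the VM itself or in Git Bash.

```shell
#!/bin/sh
# Hypothetical helper: build the ping URL for a daemon exposed over plain TCP.
engine_ping_url() {
    echo "http://$1:${2:-2375}/_ping"
}

# Succeeds only when the daemon replies with the body "OK".
engine_reachable() {
    # -f: fail on HTTP errors, -s: silent, --max-time: give up after 5 seconds
    [ "$(curl -fs --max-time 5 "$(engine_ping_url "$@")")" = "OK" ]
}

# Usage (192.168.56.10 is a placeholder for <VM-IP>):
#   engine_reachable 192.168.56.10 && echo "daemon is reachable"
```

If this fails from Windows but works inside the VM, look at the VM's firewall or network mode rather than at the Docker configuration.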

Docker can not run on Windows 10 linux child system

I just installed the Windows 10 Anniversary Update, which has a new feature: a Linux child system (the Windows Subsystem for Linux). So I tried to run Docker in the Windows 10 Ubuntu bash (Linux child system). The reasons I want to install Docker in the Linux child system are:
Windows 10 native Docker 1.12 needs Hyper-V, but VMware cannot run if Hyper-V is enabled. I have a lot of images created with VMware, so it isn't easy to switch to Hyper-V.
I don't want to use Docker Toolbox; it needs VirtualBox installed, which is just redundant.
apt-get works fine and Docker installs successfully, but it fails to start.
$ sudo service docker start
initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused
* Starting Docker: docker [ OK ]
$ docker ps
Cannot connect to the Docker daemon. Is the docker daemon running on this host?
I have seen the post can-you-run-docker-natively-on-the-new-windows-10-ubuntu-bash-userspace; some people say that it is not possible to run Docker in such a Linux child system, but there are also some contrary opinions.
So, I want to ask: is there any way to work around this? Or do I have to wait for MS to update this child system (since it is still in beta)?
You have two problems there:
The Linux child system does not provide the Upstart service the way e.g. Ubuntu does. You can work around this by running the Docker daemon directly in the foreground with docker daemon ...
This will almost surely not work, because Docker requires Linux kernel features like namespaces and capabilities. I don't think the NT kernel implements such exotic features.
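You can get a rough idea of whether the environment exposes namespaces by looking at /proc/self/ns, where a Linux kernel lists the namespaces of the current process. Below is a minimal sketch. The missing_namespaces name and the list of five namespace types are my choice for illustration. Finding the entries is necessary but not sufficient for Docker to work.

```shell
#!/bin/sh
# Print each namespace type that has no entry in the given directory.
# $1 = directory to inspect (defaults to /proc/self/ns)
missing_namespaces() {
    ns_dir=${1:-/proc/self/ns}
    for ns in mnt pid net ipc uts; do
        # Entries in /proc/self/ns are special symlinks, so accept -e or -L.
        [ -e "$ns_dir/$ns" ] || [ -L "$ns_dir/$ns" ] || echo "$ns"
    done
}

missing=$(missing_namespaces)
if [ -z "$missing" ]; then
    echo "all checked namespace types are present"
else
    echo "missing namespace types:" $missing
fi
```

If anything is reported missing, running the daemon in the foreground will not help, because the kernel interface underneath is not there.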

Azure VM with Docker failing to connect

I'm trying to write a Powershell script to create a VM in Azure with Docker installed. From everything I've read, I should be able to do the following:
$image = "b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-14_04_3-LTS-amd64-server-20150908-en-us-30GB"
azure vm docker create -e 22 -l 'North Europe' --docker-cert-dir dockercert --vm-size Small <myvmname> $image $userName $password
docker --tls -H tcp://<myvmname>.cloudapp.net:4243 info
The vm creation works, however the docker command fails with the following error:
An error occurred trying to connect: Get https://myvmname.cloudapp.net:4243/v1.20/info: dial tcp 40.127.169.184:4243: ConnectEx tcp: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
Some articles I've found refer to port 2376 - but that doesn't work either.
Logging on to the Azure portal and viewing the created VM, the Docker VM Extension doesn't seem to have been added and there are no endpoints other than the default SSH one. I was expecting these to have been created by the azure vm docker create command, although I could be wrong about that.
A couple of example articles I've looked at are here:
https://azure.microsoft.com/en-gb/documentation/articles/virtual-machines-docker-with-xplat-cli/
http://blog.siliconvalve.com/2015/01/07/get-started-with-docker-on-azure/
However, there's plenty of other articles saying the same thing.
Does anyone know what I'm doing wrong?
I know you are doing nothing wrong. My azurecli-dockerhost connection had been working for months and failed recently. I re-created my docker host using "azure vm docker create" but it does not work any more.
I believe it is a bug that the azure-docker team has to fix.
For the time being, my solution is to:
1) Launch a Ubuntu VM WITHOUT using the Azure docker extension
2) SSH into the VM and install docker with these lines:
sudo su; apt-get -y update
apt-get install linux-image-extra-$(uname -r)
modprobe aufs
curl -sSL https://get.docker.com/ | sh
3) Run docker within this VM directly without relying on a "client" and in particular the azure cli.
If you insist on using the docker client approach, my alternative suggestion would be to update your azure-cli and try 'azure vm docker create' again. Let me know how it goes.
sudo su
apt-get update; apt-get -y install nodejs-legacy; apt-get -y install npm; npm install azure-cli --global
To add an additional answer to my question, it turns out you can do the same using the docker-machine create command ...
docker-machine create $vmname --driver azure --azure-publish-settings-file MySubscription.publishsettings
This method works for me.
