How to auto-launch a new task instance when mesos-slave is stopped? - mesos

Version info and command-line args
mesos-master & mesos-slave version 1.1.0
marathon version 1.4.3
docker server version 1.28
mesos-master's command line args:
--zk=zk://ip1:2181,ip2:2181,ip3:2181/mesos \
--port=5050 \
--log_dir=/var/log/mesos \
--hostname=ip1 \
--quorum=2 \
--work_dir=/var/lib/mesosmaster
mesos-slave's command line args:
--master=zk://ip1:2181,ip2:2181,ip3:2181/mesos \
--log_dir=/var/log/mesos --containerizers=docker,mesos \
--executor_registration_timeout=10mins --hostname=ip1 \
--recovery_timeout=1mins \
--resources=ports:[25000-65000] \
--work_dir=/var/lib/mesos
Operation
In the Marathon web UI, I ran an app via a Docker image, and the task state is "Unknown" (because I didn't add a health check).
Then I rebooted the machine that runs the task.
The result I expected was that the task would be killed and Marathon would create a new task, but instead (see the picture below):
The state of the task became "Unscheduled", and the task could only be killed after about 15 minutes.
Finally, what I want is for a new task to launch automatically and the old task to be killed (or removed or expunged) automatically.

It seems like you need to configure unreachableStrategy for your tasks. The guide here explains it. I will be playing with it soon, so I will post an example here as well.
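As a starting point, here is a minimal sketch of a Marathon app definition with an `unreachableStrategy` block (the field names match Marathon 1.4; the image name and the timeout values are illustrative assumptions, not values from the original question):

```json
{
  "id": "/my-docker-app",
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": { "image": "nginx:latest" }
  },
  "unreachableStrategy": {
    "inactiveAfterSeconds": 60,
    "expungeAfterSeconds": 120
  }
}
```

With values like these, Marathon considers the task inactive 60 seconds after the agent becomes unreachable and launches a replacement, then expunges the old task after 120 seconds, which should give the "kill old, launch new" behaviour asked about above.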

Related

mc: <error> while trying to run bitnami/minio-client, the container exits within seconds

docker run -it --name mc3 dockerhub:5000/bitnami/minio-client
08:05:31.13
08:05:31.14 Welcome to the Bitnami minio-client container
08:05:31.14 Subscribe to project updates by watching https://github.com/bitnami/containers
08:05:31.14 Submit issues and feature requests at https://github.com/bitnami/containers/issues
08:05:31.15
08:05:31.15 INFO  ==> ** Starting MinIO Client setup **
08:05:31.16 INFO  ==> ** MinIO Client setup finished! **
mc: Configuration written to /.mc/config.json. Please update your access credentials.
mc: Successfully created /.mc/share.
mc: Initialized share uploads /.mc/share/uploads.json file.
mc: Initialized share downloads /.mc/share/downloads.json file.
mc: /opt/bitnami/scripts/minio-client/run.sh is not a recognized command. Get help using --help flag.
dockerhub:5000/bitnami/minio-client - name of the image
It would be great if someone could help me solve this issue, as I've been stuck here for more than 2 days.
MinIO has two components:
Server
Client
The Server runs continuously, as it should, so it can serve the data.
On the other hand, the client, which you are trying to run, is used to perform operations on a running server. So it's expected for it to run and then immediately exit, as it's not a daemon and isn't meant to run forever.
What you want to do is first launch the server container in the background (using the -d flag):
$ docker run -d --name minio-server \
--env MINIO_ROOT_USER="minio-root-user" \
--env MINIO_ROOT_PASSWORD="minio-root-password" \
minio/minio:latest
Then launch the client container to perform some operation, for example making/creating a bucket, which it will perform on the server and exit immediately, after which the client container is cleaned up (using the --rm flag):
$ docker run --rm --name minio-client \
--env MINIO_SERVER_HOST="minio-server" \
--env MINIO_SERVER_ACCESS_KEY="minio-root-user" \
--env MINIO_SERVER_SECRET_KEY="minio-root-password" \
minio/mc \
mb minio/my-bucket
For more information, please check out the docs:
Server: https://min.io/docs/minio/container/operations/installation.html
Client: https://min.io/docs/minio/linux/reference/minio-mc.html
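One caveat worth adding: for MINIO_SERVER_HOST="minio-server" to resolve, both containers need to be on the same user-defined Docker network, since containers on the default bridge don't get name-based DNS. A sketch, assuming a network named minio-net (the network name is my assumption, not from the original answer):

```shell
# The network name "minio-net" is an assumed placeholder.
NET="minio-net"

run_minio_demo() {
  # A user-defined bridge network gives containers DNS by container name.
  docker network create "$NET"

  # Server, as in the answer above, now attached to the shared network.
  docker run -d --name minio-server --network "$NET" \
    --env MINIO_ROOT_USER="minio-root-user" \
    --env MINIO_ROOT_PASSWORD="minio-root-password" \
    minio/minio:latest

  # The client can now resolve "minio-server" via the shared network.
  docker run --rm --name minio-client --network "$NET" \
    --env MINIO_SERVER_HOST="minio-server" \
    --env MINIO_SERVER_ACCESS_KEY="minio-root-user" \
    --env MINIO_SERVER_SECRET_KEY="minio-root-password" \
    minio/mc mb minio/my-bucket
}
# Call run_minio_demo on a host where the Docker daemon is running.
```

The snippet only defines the steps; run the function on a machine with Docker installed.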

How to run components in AWS Greengrass?

In AWS Greengrass Documentation it says you can test components like this
sudo /greengrass/v2/bin/greengrass-cli deployment create \
--recipeDir ~/greengrassv2/recipes \
--artifactDir ~/greengrassv2/artifacts \
--merge "com.example.HelloWorld=1.0.0"
But if I want to run a component from another script, should I use the same command? For example, I have a component that publishes some data to MQTT, and right now I am using os.system like this:
os.system("sudo /greengrass/v2/bin/greengrass-cli deployment create \
--recipeDir ~/greengrassv2/recipes \
--artifactDir ~/greengrassv2/artifacts \
--merge 'com.example.HelloWorld=1.0.0'")
But I am not sure if it's the right solution. It does not seem like a nice solution.
I wouldn't recommend using the greengrass-cli deployment create command to run the component:
It's for local development only.
The command runs through all the lifecycle steps defined in the component recipe file before running the component, which can be a big overhead.
If "another script" is also a Greengrass component, you can use the AWS IoT Device SDK for interprocess communication (IPC).
If "another script" is not a Greengrass component, you can use the restart command to trigger a run of the component. It has less overhead than the create command:
sudo /greengrass/v2/bin/greengrass-cli component restart --names "HelloWorld"

Cannot create Azure container instance (Windows)

I have created an Azure DevOps pipeline to create new instances of Azure container instances (Windows) using an Azure CLI task with the following script:
az container create \
-g $(BuildAgent.ResourceGroup) \
--name $(BuildAgent.ContainerName) \
--image $(BuildAgent.DockerImage):$(BuildAgent.DockerImageVersion) \
--cpu $(BuildAgent.Cpu) \
--memory $(BuildAgent.Memory) \
--os-type $(BuildAgent.OsType) \
--restart-policy OnFailure \
--vnet $(BuildAgent.VNet) \
--subnet $(BuildAgent.VNetSubnet) \
--registry-username $(BuildAgent.RepositoryUserName) \
--registry-password $(BuildAgent.RepositoryPassword) \
-e \
VSTS_ACCOUNT=$(BuildAgent.VstsAccount) \
VSTS_POOL=$(BuildAgent.AgentPool) \
VSTS_AGENT='$(BuildAgent.ContainerName)' \
--secure-environment-variables \
VSTS_TOKEN='$(BuildAgent.AccessToken)'
Task fails with the following error:
The requested resource is not available in the location 'westeurope' at this moment. Please retry with a different resource request or in another location. Resource requested: '4' CPU '8' GB memory 'Windows' OS virtual network
The base image in the Dockerfile is supported (I think):
FROM mcr.microsoft.com/dotnet/framework/sdk:4.8-windowsservercore-ltsc2016
Some notes:
Resource group already exists
I've tried with different numbers of cores/memory (e.g. 2 cores/8 GB or 4 cores/16 GB)
I have a similar pipeline that creates a Linux container that is working correctly, using the same resource group and the same Azure container registry
VNet and subnet are the same used in the pipeline that creates a Linux container
What am I missing here?
After taking a good look at Resource availability for Azure Container Instances in Azure regions, I realised that it's not possible to create Windows container instances with a vnet/subnet.
According to the 1st table on that page, it's possible to create Windows container instances in West Europe.
But, scrolling down to the Availability - Virtual network deployment section, I can see that there is no virtual network support for Windows containers in any region.
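Given that limitation, one workaround sketch is to issue the same create call without the vnet/subnet parameters, so the Windows container group is deployed outside a virtual network. All resource names below are placeholders of mine, not values from the pipeline:

```shell
# Placeholder values; substitute your own pipeline variables.
RESOURCE_GROUP="my-resource-group"
CONTAINER_NAME="my-build-agent"

create_windows_aci() {
  # Same az container create call as the pipeline, minus --vnet/--subnet,
  # which Windows container groups do not support.
  az container create \
    -g "$RESOURCE_GROUP" \
    --name "$CONTAINER_NAME" \
    --image myregistry.azurecr.io/agent:latest \
    --os-type Windows \
    --cpu 4 \
    --memory 8 \
    --restart-policy OnFailure
}
# Run create_windows_aci from a shell where the Azure CLI is logged in.
```

The trade-off is that the agent is then reachable over a public address rather than inside the vnet, so this is only suitable if that is acceptable for the build agents.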

Storing a value from a file on another VM as a variable in Bash

I have a setup that needs to be bootstrapped off the values of some files in another VM.
Here is the command I am using to invoke run-command:
BOOT_VM="${VM_NAME}1"
BOOT_ENODE=$(az vm run-command invoke --name ${BOOT_VM} \
--command-id RunShellScript \
--resource-group ${RSC_GRP_NAME} \
--query "value[].message" \
--output tsv \
--scripts "cat /etc/parity/enode.pub")
echo ${BOOT_ENODE}
The result I get is :
Enable succeeded: [stdout] [stderr]
As far as I know, this could mean 2 things:
There is no file there
I am handling the response wrongly.
I'm really hoping it isn't 1 and would like advice on how to approach this.
For your issue, there is another possible reason: the agent in the VM is not working, or something has gone wrong with it. The Azure VM agent manages interactions between an Azure VM and the Azure fabric controller, so you should check that it is working properly.
Update
You can check the agent in the portal:
Also, you can check the agent inside the vm:
For example, I want to get the config of vim in the VM, and the VM OS is Red Hat 7.2. Then the result of the command az vm run-command invoke will look like the one below:
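To rule out the asker's first possibility directly, the remote script can be made to print an explicit marker when the file is absent, so an empty [stdout] is no longer ambiguous. A sketch, reusing the variable names from the question (BOOT_VM and RSC_GRP_NAME must already be set; the NO_SUCH_FILE marker is my own convention):

```shell
fetch_enode() {
  # Remote script reports explicitly when the file is missing,
  # instead of producing empty stdout either way.
  az vm run-command invoke --name "${BOOT_VM}" \
    --command-id RunShellScript \
    --resource-group "${RSC_GRP_NAME}" \
    --query "value[].message" \
    --output tsv \
    --scripts 'if [ -f /etc/parity/enode.pub ]; then cat /etc/parity/enode.pub; else echo "NO_SUCH_FILE"; fi'
}
# Usage: BOOT_ENODE=$(fetch_enode)
```

If the output contains NO_SUCH_FILE, the file really is absent on the VM; if it stays empty, the problem is in how the response is handled (or in the VM agent, as suggested above).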

Could not delete a DC/OS service that failed to deploy

I deployed a service in DC/OS (the service is Cassandra). The deployment failed and it kept retrying. Under DC/OS > Services > Tasks I could see a new task being created every few minutes, but they all had the status "Failed". Under the Debug tab I could see the TASK_FAILED state with an error message about how I had misconfigured the service (I picked a user that does not exist).
So I wanted to destroy the service and start over again.
Under Services, I clicked the menu on the service and selected "Delete". The command was accepted, and the status changed to "Deleting", but then it stayed there forever.
If I checked the Tasks tab, I could see that DC/OS was still attempting to start the service every few minutes.
Now how do I delete the service? Thanks!
As per the latest DC/OS Cassandra service docs, you should uninstall it using the dcos CLI:
dcos package uninstall --app-id=<service-name> cassandra
If you are using DC/OS 1.9 or an older version, then follow the steps below to uninstall the service:
$ MY_SERVICE_NAME=<service-name>
$ dcos package uninstall --app-id=$MY_SERVICE_NAME cassandra
$ dcos node ssh --master-proxy --leader "docker run mesosphere/janitor /janitor.py \
-r $MY_SERVICE_NAME-role \
-p $MY_SERVICE_NAME-principal \
-z dcos-service-$MY_SERVICE_NAME"