Chronos can't run a private Docker container

I'm experimenting on localhost with a DC/OS installation. Everything else works fine, but I can't seem to run a Docker image located in a private repository. I'm using Python to communicate with Chronos:
@celery.task(name='add-job', soft_time_limit=5)
def add_job(job_id):
    # Look up the job and the worker it should run on.
    job_document = mongo.jobs.find_one({
        '_id': job_id
    })
    if job_document:
        worker_document = mongo.workers.find_one({
            '_id': job_document['workerId']
        })
        if worker_document:
            # Chronos job definition; resources come from the worker document.
            job = {
                'async': True,
                'name': job_document['_id'],
                'owner': 'owner@gmail.com',
                'command': "python /code/run.py",
                "disabled": False,
                "shell": True,
                "cpus": worker_document['cpus'],
                "disk": worker_document['disk'],
                "mem": worker_document['memory'],
                'schedule': 'R1//PT300S',  # start now
                "epsilon": "PT60M",
                "container": {
                    "type": "DOCKER",
                    "forcePullImage": True,
                    "image": "quay.io/username/container",
                    "network": "HOST",
                    "volumes": [{
                        "containerPath": "/images/",
                        "hostPath": "/images/",
                        "mode": "RW"
                    }]
                },
                # Archive containing the Docker credentials (config.json).
                "uris": [
                    "file:///images/docker.tar.gz"
                ]
            }
            return chronos_client.add(job)
        else:
            return 'worker not found'
    else:
        return 'job not found'
The job runs fine with a public image (alpine:latest), but with the private image it fails without any visible error inside the DC/OS installation.
The job gets executed but fails immediately. The error log of the job inside Chronos looks like this:
I1212 12:39:11.141639 25058 fetcher.cpp:498] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/61d6d037-c9f5-482b-a441-11d85554461b-S1\/root","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":false,"value":"file:\/\/\/images\/docker.tar.gz"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/61d6d037-c9f5-482b-a441-11d85554461b-S1\/docker\/links\/7029bbea-4c3d-439a-8720-411f6fe40eb9","user":"root"}
I1212 12:39:11.143575 25058 fetcher.cpp:409] Fetching URI 'file:///images/docker.tar.gz'
I1212 12:39:11.143587 25058 fetcher.cpp:250] Fetching directly into the sandbox directory
I1212 12:39:11.143602 25058 fetcher.cpp:187] Fetching URI 'file:///images/docker.tar.gz'
I1212 12:39:11.143612 25058 fetcher.cpp:167] Copying resource with command:cp '/images/docker.tar.gz' '/var/lib/mesos/slave/slaves/61d6d037-c9f5-482b-a441-11d85554461b-S1/docker/links/7029bbea-4c3d-439a-8720-411f6fe40eb9/docker.tar.gz'
I1212 12:39:11.146726 25058 fetcher.cpp:547] Fetched 'file:///images/docker.tar.gz' to '/var/lib/mesos/slave/slaves/61d6d037-c9f5-482b-a441-11d85554461b-S1/docker/links/7029bbea-4c3d-439a-8720-411f6fe40eb9/docker.tar.gz'
Stdout is empty. When the same settings are run directly in Marathon as an application, authentication works and my image is downloaded and executed. Is this something that Chronos does not support? It should; after all, it has options for Docker.
Update: digging deeper into the agent logs, I found this:
Failed to run 'docker -H unix:///var/run/docker.sock pull quay.io/username/container': exited with status 1; stderr='Error: Status 403 trying to pull repository username/container: "{\"error\": \"Permission Denied\"}"
I tried the archive with its config.json file on the agent itself, and the image can be pulled when triggered from the command line. I just can't understand why Chronos is not using it properly. I can't find any other reference on how to supply my credentials other than this.

As it turns out, the uris parameter is deprecated in favor of fetch. I started from scratch with a Marathon config applied to Chronos and watched the logs carefully, which is when I saw this: {'message': 'Tried to add both uri (deprecated) and fetch parameters on aBPepwhG5z33e4teG', 'status': 'Bad Request'}. Then I changed my uris parameter into:
"fetch": [{
    "uri": "/images/docker.tar.gz",
    "extract": true,
    "executable": false,
    "cache": false
}]
...and it worked.
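For the Python client code in the question, the same change can be sketched as a small helper that rewrites a job dict before it is submitted. This is only an illustration: the function names uris_to_fetch and migrate_job are hypothetical, and the default flag values simply mirror the fetch entry shown above.

```python
import json


def uris_to_fetch(uris, extract=True, executable=False, cache=False):
    """Turn a list of legacy URI strings into fetch-style entries."""
    return [
        {"uri": uri, "extract": extract, "executable": executable, "cache": cache}
        for uri in uris
    ]


def migrate_job(job):
    """Return a copy of the job dict that uses 'fetch' instead of 'uris'."""
    # Copy every key except the deprecated one.
    migrated = {key: value for key, value in job.items() if key != "uris"}
    legacy = job.get("uris", [])
    if legacy:
        # Keep any fetch entries that already exist and append the converted ones.
        migrated["fetch"] = list(job.get("fetch", [])) + uris_to_fetch(legacy)
    return migrated


# Hypothetical minimal job; a real job carries the other fields from the question.
job = {"name": "demo-job", "uris": ["/images/docker.tar.gz"]}
print(json.dumps(migrate_job(job), indent=4))
```

Serializing with json.dumps turns the Python booleans into the lowercase true/false seen in the JSON snippet, and sending only fetch avoids the "both uri (deprecated) and fetch" Bad Request.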

Your post looks a little like this one, which turned out to be a problem with volumes.

Related

unable to control swarm ingress network with ansible

I'm deploying Docker Swarm with Ansible and I would like to ensure the ingress network has been created. To that end, I configured the following task:
- name: Ensure ingress network exists
  docker_network:
    state: present
    name: ingress
    driver: overlay
    driver_options:
      ingress: true
I'm getting the following error:
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.41/networks/ingress/disconnect: Not Found ("No such container: ingress-endpoint")
fatal: [swarm-srv-1]: FAILED! => {"changed": false, "msg": "An unexpected docker error occurred: 404 Client Error for http+docker://localhost/v1.41/networks/ingress/disconnect: Not Found (\"No such container: ingress-endpoint\")"}
I've tried adding some arguments like:
scope: swarm
force: yes
But nothing changed. I've also tried to delete the ingress network with Ansible (state: absent), but I always get the same error.
Note that I don't face any issue when deleting and recreating the ingress network manually on the swarm: docker network rm ingress
I don't know how to resolve this issue. Any help would be appreciated. Thanks!
Here is some information that may help:
# docker version
Version: 20.10.6
API version: 1.41
Go version: go1.13.15
Git commit: 370c289
Built: Fri Apr 9 22:47:35 2021
OS/Arch: linux/amd64
# docker inspect ingress
[
    {
        "Name": "ingress",
        "Id": "yb2tkhep8vtaj9q7w3mssc9lx",
        "Created": "2021-05-19T05:53:27.524446929-04:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": true,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "ingress-sbox": {
                "Name": "ingress-endpoint",
                "EndpointID": "dfdc0f123d21a196c7a815c7e0a886924d0799ae5f3be2d38b64d527ed4620b1",
                "MacAddress": "02:42:0a:00:00:02",
                "IPv4Address": "10.0.0.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4096"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "8f8932d6f99f",
                "IP": "(ip address here)"
            },
            {
                "Name": "28b9ca95dcf0",
                "IP": "(ip address here)"
            },
            {
                "Name": "f7c48c8af2f5",
                "IP": "(ip address here)"
            }
        ]
    }
]

I had the exact same issue when trying to customize the IP range of the ingress network. It looks like the docker_network module does not support modifying swarm-specific networks: there is an open GitHub issue for this.
I went for the ugly workaround of removing the network through a shell command (docker network rm ingress) and adding it again. When adding it with the docker_network module, I found that adding does not seem to work either (it fails to set the ingress property of the network). So I ended up doing both the remove and create operations through shell commands.
Since the removal triggers a confirmation dialogue:
WARNING! Before removing the routing-mesh network, make sure all the nodes in your swarm run the same docker engine version. Otherwise, removal may not be effective and functionality of newly create ingress networks will be impaired.
Are you sure you want to continue? [y/N]
I used the expect module to confirm the dialogue:
- name: remove default ingress network
  ansible.builtin.expect:
    command: docker network rm ingress
    responses:
      "[y/N]": "y"
- name: create customized ingress network
  shell: "docker network create --ingress --subnet {{ docker_ingress_network }} --driver overlay ingress"
It is not perfect but it works.
There was one last problem I experienced: when running this on an existing swarm, I ended up with network issues on the node where I ran it (somehow the docker_gwbridge network on that node could not handle the change). The fix was to fully remove the node and re-join it to the swarm, which regenerates docker_gwbridge.
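For anyone scripting the same workaround outside Ansible, here is a rough Python sketch of the two steps. It rests on an assumption I have not verified on every Docker version: that the confirmation prompt of docker network rm ingress accepts its answer from standard input when no terminal is attached. The helper names and the example subnet are hypothetical.

```python
import subprocess


def build_commands(subnet):
    """Return the argv lists for removing and recreating the ingress network."""
    remove = ["docker", "network", "rm", "ingress"]
    create = [
        "docker", "network", "create", "--ingress",
        "--subnet", subnet, "--driver", "overlay", "ingress",
    ]
    return remove, create


def recreate_ingress(subnet, runner=subprocess.run):
    """Remove the default ingress network, then create a customized one."""
    remove, create = build_commands(subnet)
    # Assumption: the "[y/N]" prompt reads its answer from stdin.
    runner(remove, input="y\n", text=True, check=True)
    runner(create, check=True)


# Example call on a swarm manager (subnet is a placeholder):
# recreate_ingress("10.10.0.0/24")
```

The runner parameter only exists so the command sequence can be checked without a Docker daemon; the caveat about docker_gwbridge on existing swarms applies here just as it does to the Ansible tasks.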

Strapi deployment to Heroku

I am totally new to Strapi and Heroku. I am trying to deploy my app, which works well locally, to Heroku, but I am getting the following error:
2020-06-15T09:56:29.114780+00:00 app[web.1]: [2020-06-15T09:56:29.114Z] error Impossible to register the 'menus.menus' model.
2020-06-15T09:56:29.115672+00:00 app[web.1]: [2020-06-15T09:56:29.115Z] error TimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
At the beginning I thought it was a problem connecting to the database, but in my local environment it works perfectly and connects with no issues.
I even upgraded my database to a paid version in case the connection was timing out.
I also followed some answers I found online about modifying my config/environment/production/database.json as follows:
{
    "defaultConnection": "default",
    "connections": {
        "default": {
            "connector": "bookshelf",
            "settings": {
                "client": "postgres",
                "host": "***.compute-1.amazonaws.com",
                "port": "5432",
                "database": "***",
                "username": "***",
                "password": "***",
                "ssl": { "rejectUnauthorized": false }
            },
            "options": {
                "debug": false,
                "acquireConnectionTimeout": 100000,
                "pool": {
                    "min": 0,
                    "max": 10,
                    "createTimeoutMillis": 30000,
                    "acquireTimeoutMillis": 600000,
                    "idleTimeoutMillis": 20000,
                    "reapIntervalMillis": 20000,
                    "createRetryIntervalMillis": 200
                }
            }
        }
    }
}
Any other ideas about what it could be?
When I run develop locally I get a warning (but even so, the app runs anyway afterwards):
[2020-06-15T10:36:41.261Z] warn The bootstrap function is taking unusually long to execute (3500 miliseconds).
[2020-06-15T10:36:41.261Z] warn Make sure you call it?
[2020-06-15T10:36:42.476Z] warn The bootstrap function is taking unusually long to execute (3500 miliseconds).
[2020-06-15T10:36:42.476Z] warn Make sure you call it?

One simple first step is to launch a Strapi quickstart application on Heroku. You can find it here: https://github.com/strapi/strapi
Relaunching with this method will provide you with a working, secure instance to begin development on.
Also note that Heroku deploys Strapi in production mode, so you are not able to use the content-types editor. It is therefore recommended that you develop and test your app locally and use the Heroku CLI to update your deployment.

how to proxy API requests? (Angular-CLI)

I'm working on a Java project with Spring 4 and Angular 5. The session is generated on the Spring side.
I'm not able to generate this session from an Angular service. The request works in Postman, and I'm able to get a response there.
But it's not working with an Angular POST call.
So I thought it may be a proxy issue (correct me if I'm wrong).
My local URL is: http://localhost:8080/MacromWeb/ws/login
How can I make a proxy.conf.json file?
For that, I added this line to my package.json file:
"start": "ng serve --proxy-config proxy.conf.json",
I created a new file called proxy.conf.json
and put this code in it:
{
    "/": {
        "target": "http://localhost:8080/MacromWeb/ws",
        "secure": false
    }
}
Then I tried with both ng serve and npm start.
Postman Screenshot.

You can achieve this through a proxy; you need to provide proper values in the proxy config.
/* should work too, but if MacromWeb is common to your API URLs, then provide /MacromWeb/* instead of /.
proxy.conf.json looks something like this:
{
    "/MacromWeb/*": {
        "target": {
            "host": "localhost",
            "protocol": "http:",
            "port": 8080
        },
        "secure": false,
        "changeOrigin": true,
        "logLevel": "debug"
    }
}
Hope it helps.

Say we have a server running on http://localhost:3000 and we want all calls to http://localhost:4200/api to go to that server.
In our proxy.conf.json file, we add the following content
{
    "/api": {
        "target": "http://localhost:3000",
        "secure": false,
        "pathRewrite": {
            "^/api": ""
        }
    }
}
More on this: here
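To make the effect of the pathRewrite entry concrete, here is a small Python sketch that mimics the rewrite step: each key is treated as a regular expression and the first matching rule replaces its match in the request path. The function names are hypothetical and this is only a model of the behaviour, not the dev server's actual code.

```python
import re


def rewrite_path(path, rules):
    """Apply the first matching rewrite rule to the request path."""
    for pattern, replacement in rules.items():
        if re.search(pattern, path):
            # Replace only the first match, then stop checking further rules.
            return re.sub(pattern, replacement, path, count=1)
    return path


def forward_url(path, target, rules):
    """Build the URL the proxied request would be sent to."""
    return target.rstrip("/") + rewrite_path(path, rules)


rules = {"^/api": ""}
print(forward_url("/api/users", "http://localhost:3000", rules))
# prints http://localhost:3000/users
```

So a browser call to http://localhost:4200/api/users ends up at http://localhost:3000/users; without the pathRewrite entry the /api prefix would be kept in the forwarded path.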

how to use secure docker registry(by CA) for mesos container?

In DC/OS, I want to deploy a Mesos container with a custom image which is stored in a local secure Docker registry. The registry is secured by a CA (not username and password!).
The JSON is:
{
    "id": "/gpu-tflinker",
    "cmd": "while [ true ] ; do nvidia-smi; sleep 5; done",
    "cpus": 0.1,
    "mem": 1024,
    "gpus": 1,
    "instances": 1,
    "constraints": [
        [
            "hostname",
            "CLUSTER",
            "10.140.0.22"
        ]
    ],
    "container": {
        "type": "MESOS",
        "docker": {
            "image": "tflinker:test-gpu",
            "credential": null
        }
    }
}
The above JSON failed to run on Marathon, and there is no content in Mesos's stderr and stdout files. In the mesos-agent log, the error message is:
E0721 05:01:57.726367 22498 slave.cpp:3976] Container 'e2c68720-0fb7-41bc-9d3b-a2b5e4793816' for executor 'gpu-tflinker.b6f96725-6dd1-11e7-ba5d-0242b2c758c0' of framework 1079aaea-6dde-4dc1-8990-d926a895de78-0000 failed to start: Unexpected HTTP response '401 Unauthorized' when trying to get the manifest
W0721 05:01:57.726478 22497 composing.cpp:541] Container 'e2c68720-0fb7-41bc-9d3b-a2b5e4793816' is already destroyed
I0721 05:01:57.726583 22497 slave.cpp:4082] Executor 'gpu-tflinker.b6f96725-6dd1-11e7-ba5d-0242b2c758c0' of framework 1079aaea-6dde-4dc1-8990-d926a895de78-0000 has terminated with unknown status
I0721 05:01:57.726603 22497 slave.cpp:4193] Cleaning up executor 'gpu-tflinker.b6f96725-6dd1-11e7-ba5d-0242b2c758c0' of framework 1079aaea-6dde-4dc1-8990-d926a895de78-0000
I0721 05:01:57.726794 22497 slave.cpp:4281] Cleaning up framework 1079aaea-6dde-4dc1-8990-d926a895de78-0000
So it seems Mesos failed to fetch the Docker image. I've configured the CA file for dockerd (moved the CA files to /etc/docker/certs.d/), so I can 'docker pull' the image to the local machine, but I am not sure how to configure the CA file for Mesos.
In the mesos-agent configuration there is an option --docker_config=VALUE, but it seems this option can only be used for a registry secured by username/password. I don't know how to configure it for a CA-secured registry.
Can anybody help me out? Thanks!

I think the CA file is just for encryption; I think you will still need a username and password when using a CA file.
In my case, I put an auth file into the container to authenticate against my private registry.
I wrote a web service for downloading the auth file xxx.tar.gz (format: .docker/config.json inside the tar.gz).
In config.json, use {"auths": {"test.com:6999": {"auth": "(username:password) [base64 encoded]"}}}, for example {"auths": {"test.com:6999": {"auth": "Y2NjOjEyMw=="}}}
Use Mesos uris to download the prepared auth file into the container; the pull will then be authorized.
"uris": [
    "http:your download url"
]
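As a rough illustration of how such an auth archive could be produced, the following Python sketch builds the config.json content and packs it as .docker/config.json into a gzip-compressed tar held in memory. The function names are hypothetical; the credentials ccc / 123 are just what the example token above decodes to.

```python
import base64
import io
import json
import tarfile


def build_docker_config(registry, username, password):
    """Build the config.json structure with a base64 'username:password' token."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return {"auths": {registry: {"auth": token}}}


def build_auth_archive(config):
    """Pack the config as .docker/config.json into tar.gz bytes."""
    payload = json.dumps(config).encode()
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=".docker/config.json")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


config = build_docker_config("test.com:6999", "ccc", "123")
archive = build_auth_archive(config)
print(json.dumps(config))
# prints {"auths": {"test.com:6999": {"auth": "Y2NjOjEyMw=="}}}
```

The resulting bytes would be written to a file such as xxx.tar.gz and served from the download URL referenced in uris.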

Prevent Octopus from Running a Deployment Script

I am deploying a package that contains a deploy.ps1 file. By default, Octopus runs this script during deployment. I want to prevent that and run a custom script instead.

If you have a requirement like this, then it's better to move the PowerShell that starts the services to a separate build step and then tag the Tentacles you want that script to run on.
In your deployment step for the service, set the start mode to "Manual".
Then have a step that starts the service, and scope that script to the environments/servers that you want to auto-start.
The code for the step template I use here is:
{
    "Id": "ActionTemplates-1",
    "Name": "Enable and start service",
    "Description": null,
    "ActionType": "Octopus.Script",
    "Version": 8,
    "Properties": {
        "Octopus.Action.Package.NuGetFeedId": "feeds-builtin",
        "Octopus.Action.Script.Syntax": "PowerShell",
        "Octopus.Action.Script.ScriptSource": "Inline",
        "Octopus.Action.RunOnServer": "false",
        "Octopus.Action.Script.ScriptBody": "$serviceName = $OctopusParameters[\"ServiceName\"]\n\nwrite-host \"the service is: \" $serviceName\n\n& \"sc.exe\" config $serviceName start= delayed-auto\n& \"sc.exe\" start $serviceName\n\n"
    },
    "Parameters": [
        {
            "Name": "ServiceName",
            "Label": "Service Name",
            "HelpText": null,
            "DefaultValue": null,
            "DisplaySettings": {
                "Octopus.ControlType": "SingleLineText"
            }
        }
    ],
    "$Meta": {
        "ExportedAt": "2016-10-10T10:21:21.980Z",
        "OctopusVersion": "3.3.2",
        "Type": "ActionTemplate"
    }
}
You may want to modify the step template as it will set the service to "Automatic - Delayed" and then start the service.

Are you able to move the script to a sub folder?
These scripts must be located in the root of your package
http://docs.octopusdeploy.com/display/OD/Custom+scripts
Alternatively - don't include your deploy.ps1 script in the deployment package if it should never be deployed.
