Azure custom script extension stuck: unable to rerun or delete the extension - azure-vm-scale-set

I was trying to run a custom script on my scale set VM. Because the location of the .sh file was wrong, the execution failed. Since then, whenever I try to remove the extension (az vmss extension delete) or rerun the custom script with the correct URL (az vmss extension set), I keep getting the same error. It is stuck. How do I fix it?
Deployment failed. Correlation ID:
249a034f-76e2-4b0d-beb2-e9c6577623d1. VM has reported a failure when
processing extension 'customScript'. Error message: "Enable failed:
processing file downloads failed: failed to download file[0]: failed
to download file: http request failed: Get
https://wrongurl.blob.core.windows.net/script/deploytemp.sh: dial tcp:
lookup wrongurl.blob.core.windows.net on 164.33.122.16:53: no such
host".

Delete the instance and rebuild it!
Deleting and rebuilding the instance might not be the answer you were hoping for, but applications on VMSS should by nature be resilient enough to let you do so.
Also, I'm curious whether auto-healing/remediation helps you with this; I know that it does not reinstall the extension, though.
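For illustration, the "delete the instance and rebuild it" suggestion could be sketched roughly as below. This is only a sketch under assumptions: the resource group, scale set name, instance ID and capacity are hypothetical placeholders, the Azure CLI is assumed to be installed and logged in, and the scale set model is assumed to already point at the corrected script URL. By default it only prints the commands instead of running them.

```python
import subprocess


def rebuild_instance_commands(resource_group, vmss_name, instance_id, capacity):
    """Build the az CLI calls that delete one instance and scale back up.

    All argument values are placeholders supplied by the caller.
    """
    delete_cmd = [
        "az", "vmss", "delete-instances",
        "--resource-group", resource_group,
        "--name", vmss_name,
        "--instance-ids", str(instance_id),
    ]
    # Deleting an instance lowers the capacity, so scale back to the target.
    scale_cmd = [
        "az", "vmss", "scale",
        "--resource-group", resource_group,
        "--name", vmss_name,
        "--new-capacity", str(capacity),
    ]
    return [delete_cmd, scale_cmd]


def rebuild_instance(resource_group, vmss_name, instance_id, capacity, dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    commands = rebuild_instance_commands(resource_group, vmss_name, instance_id, capacity)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing az call.
            subprocess.run(cmd, check=True)
    return commands
```

Calling rebuild_instance("my-rg", "my-vmss", 3, 2) prints the two az commands for review; passing dry_run=False would actually run them.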

Related

Detect network error and retry running task in Azure

I am running a bash script in an Azure pipeline. I am trying to rerun the npm publish step x times if a network error is detected.
Is there any way to detect a network error specifically and rerun the whole task again?
I've found this document (https://learn.microsoft.com/en-us/azure/devops/release-notes/2021/sprint-195-update#automatic-retries-for-a-task) but I believe this reruns the process regardless of the error type.
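One possible approach, sketched here in Python rather than bash, is to wrap the command, inspect its output, and retry only when the output looks like a network failure. The marker list below is an assumption (common Node.js network error codes, not an exhaustive or official list), and the function names, attempt count and delay are made up for the example.

```python
import subprocess
import time

# Assumed, non-exhaustive markers of a network failure in npm/Node.js output.
NETWORK_ERROR_MARKERS = (
    "ECONNRESET", "ETIMEDOUT", "ENOTFOUND", "EAI_AGAIN", "ECONNREFUSED",
)


def is_network_error(output):
    """Return True if the command output contains a known network marker."""
    return any(marker in output for marker in NETWORK_ERROR_MARKERS)


def run_with_network_retry(cmd, max_attempts=3, delay_seconds=5.0,
                           runner=subprocess.run):
    """Run cmd; retry only when it fails with a network-looking error.

    Returns the last result object. Any other failure is returned
    immediately without retrying.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result
        output = (result.stdout or "") + (result.stderr or "")
        # Give up on non-network errors or when attempts are exhausted.
        if not is_network_error(output) or attempt == max_attempts:
            return result
        time.sleep(delay_seconds)
```

In a pipeline step this could be called as run_with_network_retry(["npm", "publish"]), with the step exiting with the returned returncode so that non-network failures still fail the task.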

Docker commit or build fails with - hcsshim::ImportLayer - failed. (Windows)

Been stuck with this failure a few days.
This happens when I build an image or try to commit after installing a particular application. I'm using mcr.microsoft.com/windows/servercore:ltsc2019 as base image.
"Error response from daemon: re-exec error: exit status 1: output: hcsshim::ImportLayer - failed failed in Win32: The system cannot find the path specified. (0x3)"
If I do not install my application to the image, I do not get this error. The application installs fine without any failures. I'm able to run the container fine with this application installed, but it fails when I commit it to an image.
I came across a few existing posts with this error, but I couldn't get this to work. Some existing posts mention a possible size limit of the image, but here I don't see size being an issue. This error is too vague for me to do anything about it. Where can I look for detailed logging from the Docker daemon to try to understand what in my application is causing the docker commit to fail?
I tried looking at the logs under the following folder, but I didn't find anything useful for understanding this failure:
AppData\Local\Docker
Appreciate any help or pointers to find what in my application can cause this commit failure.

Artifactory: Failed to persist file; Status code: 404 / Gradle

We are facing the same issue as described in "Artifactory: java.io.IOException: Failed to deploy file. Status code: 404 Response message" when running our deployment via Bitbucket Pipelines.
This started happening on Artifactory Cloud for all pipelines from one day to the next.
Execution failed for task ':artifactoryDeploy'.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.IOException: Failed to deploy file. Status code: 404 Response message: Artifactory returned the following errors:
Failed to persist file with sha1: 0fexxxxxxxxxxxxxxxx Status code: 404
In the Artifactory system logs I get the following warning all the time, but I'm not sure whether it is connected to this issue. Besides the following message there are no errors in the logs:
2020-08-25T16:26:43.889Z [jfrt ] [WARN ] [c19ba246224f712c] [ntuallyPersistedAddFileTask:96] [al-binary-provider-2] - Failed to delete 'add file' after completing eventually persisted task '/storage/eventual/_add/a3/a396fb897aXXXXXXXXXXXXXXXXXXXXXXXX'
Errors in request.log:
2020-08-26T07:05:43.041Z|1765ac2ce37a6ffc|34.232.119.183|gradle-build|PUT|/gradle-dev-local/app/app-front/1.0.1.418_dev/app-front-1.0.1.418_dev.war;build.timestamp=1598425011065;build.name=app;build.number=1598425011337|404|0|0|9|ArtifactoryBuildClient/2.18.0
2020-08-26T07:05:44.014Z|e62cf9a7063d3fff|34.232.119.183|gradle-build|PUT|/gradle-dev-local/com/customer/app/app-core/1.0.1.418_dev/app-core-1.0.1.418_dev.pom;build.timestamp=1598425011065;build.name=app;build.number=1598425011337|404|4474|0|184|ArtifactoryBuildClient/2.18.0
Does anyone have an idea what the reason could be and what else could be checked?
We are deploying via the Artifactory plugin and Gradle. (https://bintray.com/jfrog/jfrog-jars/build-info-extractor-gradle#release)
We use a fixed version, but I also updated the plugin to 4.17.1 (previously we used 4.9.8).
Thanks in advance!
That sounds like more of an internal issue than something with your client.
It sounds like you may be using some sort of cloud storage, which in turn is using eventual storage. I can imagine a situation like this arising from using a mounted eventual directory over a sharded one in an HA setup.
I'd recommend checking whether that file still exists in the filestore, or whether it has odd permissions that prevented it from being removed. If it is indeed a mounted eventual directory, it'd also be worth checking whether the request to upload that artifact came in multiple times; perhaps it was a collision of some sort.
Along those lines, since it's a 404 (not found) and it couldn't delete that file, I'm wondering whether it just couldn't write it to _add in the first place.
To summarize, with the information so far it could be one of two things in my opinion:
(1) You are using a mounted eventual directory, which may be causing issues
(2) The permissions on the filestore are not correct, affecting the filestore operations

Docker Installation Error on Windows behind Firewall

I'm trying to install Docker on a Windows computer but I get this message:
Running pre-create checks...
(default) No default Boot2Docker ISO found locally, downloading the latest release...
Error with pre-create check: "Get https://api.github.com/repos/boot2docker/boot2docker/releases/latest: dial tcp 192.30.252.124:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond."
Looks like something went wrong in step 'Checking if machine default exists'...
Press any key to continue...
Any suggestions on how to resolve this?
Editing the start.sh file may cause other errors.
Instead of that, just put your boot2docker.iso in the location below:
C:\Users\USERNAME\.docker\machine\cache
and restart your Docker terminal.
You may be behind a firewall. If so, you will need to configure an HTTP proxy.
According to https://github.com/boot2docker/boot2docker-cli/issues/230 you can do this in one of a couple of ways:
(1) Edit start.sh and add the following before boot2docker.exe is called
export HTTP_PROXY=<proxy>
export HTTPS_PROXY=<proxy>
(2) Add HTTP_PROXY and HTTPS_PROXY (and their values) to your System Variables or User Variables in your Windows config.
The proxy value should be of the form http://hostname:port
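As a small illustration of the expected proxy format, the following sketch checks that a value looks like http://hostname:port and builds an environment with both variables set. The function name and the example proxy address are hypothetical, and accepting an https scheme as well is an assumption that goes slightly beyond the form stated above.

```python
import os
from urllib.parse import urlsplit


def proxy_environment(proxy, base_env=None):
    """Return a copy of the environment with HTTP_PROXY/HTTPS_PROXY set.

    Raises ValueError if proxy is not of the form scheme://hostname:port.
    """
    parts = urlsplit(proxy)
    if parts.scheme not in ("http", "https") or not parts.hostname or parts.port is None:
        raise ValueError("proxy must look like http://hostname:port")
    # Copy so the caller's environment is not modified in place.
    env = dict(os.environ if base_env is None else base_env)
    env["HTTP_PROXY"] = proxy
    env["HTTPS_PROXY"] = proxy
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when launching a tool that needs the proxy, which avoids changing the system-wide variables.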

Run web app stored on network drive in Visual Studio

I'm going crazy! I hope you can help me with some advice.
Inside VirtualBox I run Windows Server 2008 R2. On this machine I'm trying to debug a web application in Visual Studio 2010. The web application is stored on a network drive.
The network drive is actually a partition of my HDD. Because it is a shared folder, the virtual machine recognizes it as a network drive, but in my opinion that doesn't matter for the actual problem.
When I try to start the application I get an error like this:
Server Error in '/' Application.
Configuration Error
Description: An error occurred during the processing of a
configuration file required to service this request. Please review the
specific error details below and modify your configuration file
appropriately.
Parser Error Message: An error occurred loading a configuration file:
Failed to start monitoring changes to 'E:\Testing\In Work\5.1
SP1\SitefinityWebApp'.
Source Error:
[No relevant source lines]
Source File: E:\Testing\In Work\5.1
SP1\SitefinityWebApp\web.config Line: 0
Version Information: Microsoft .NET Framework Version:4.0.30319;
ASP.NET Version:4.0.30319.18034
I already tried granting full trust to the application via caspol using the following:
C:\Windows\Microsoft.NET\Framework\v4.0.30319\CasPol.exe -m -ag 1 -url "file:////\\VBOXSVR\d_drive\Testing\In Work\5.1 SP1\SitefinityWebApp*" FullTrust -exclusive on
or this
C:\Windows\Microsoft.NET\Framework\v4.0.30319\CasPol.exe -machine -addgroup All_Code -url "E:\Testing\In Work\*" FullTrust
but nothing helped!
Does anybody have a solution?
P.S.
If I run the same project on the virtual HDD it works fine!
It seems that Windows can't monitor network drives for file changes the way it can local drives, so it does in fact matter where you put your source code if your web app is to reload automatically on changes. A possible but maybe outdated solution is given here.
