Getting production readiness errors for IoT Edge on Raspberry Pi 3

I installed IoT Edge on my Raspberry Pi two weeks ago with no problems by following the steps here: https://learn.microsoft.com/en-us/azure/iot-edge/how-to-install-iot-edge-linux
This week I turned my Raspberry Pi on and I am now getting the following error when I run iotedge check. I am also getting a 406 error when I check its status in the IoT Hub.
Error: "production readiness: Edge Agent's storage directory is persisted on the host filesystem-Error
Could not check current state of edgeAgent container
production readiness: Edge Hub's storage directory is persisted on the host filesystem-Error
Could not check current state of edgeHub container"
When i run it with --verbose, i get:
"production readiness: Edge Agent's storage directory is persisted on the host filesystem-Error
The edgeAgent module is not configured to persist its /tmp/edgeAgent directory on the host filesystem.
Data might be lost if the module is deleted or updated.
Please see https://aka.ms/iotedge-storage-host for best practices"
If anyone can help me out, I'd really appreciate it.
Updated errors after I linked the module storage to the device storage (see screenshot):

Would you share the edgeAgent and edgeHub module twin properties defined in the Azure portal? Remember to remove any secrets/passwords/connection strings before sharing.

The production readiness logs are warnings, and you can ignore them if you are just running tests in an experimental scenario. These warnings will not impact the outcome of the tutorial you followed.
When you're ready to take your IoT Edge solution from development into production, make sure it's configured for ongoing performance as described here: Prepare to deploy your IoT Edge solution in production
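If you do want to clear those warnings before moving to production, the usual approach from the linked guidance (https://aka.ms/iotedge-storage-host) is to bind a host directory into each system module and point a storageFolder environment variable at it. A minimal sketch, assuming a host directory such as /srv/iotedge/edgeAgent that you have created and made writable for the module's user; the paths below are illustrative, not required names. In the deployment, set the edgeAgent module's "Container Create Options" (and likewise edgeHub, with its own directory) to:

    {
      "HostConfig": {
        "Binds": [ "/srv/iotedge/edgeAgent:/iotedge/storage" ]
      }
    }

and add an environment variable storageFolder = /iotedge/storage on the same module.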

Related

Windows Docker Image keeps hanging up randomly on Azure (web-app)

I won't be able to provide the docker file so I'll try to provide as much context as I can to the issue. I keep running into issues with Azure and Windows based Docker containers randomly. The app will run fine for weeks with no issues and then suddenly bug out (using the same exact image) and go into an endless cycle of "Waiting for container to be start." followed by "Container failed to reach the container's http endpoint".
I have been able to resolve the issue in the past by re-creating the service (again using the same exact image), but it seems this time it's not working.
Various Tests:
The same exact docker image runs locally no problem
As I mentioned, re-creating the service before did the trick (using the same exact image)
Below are the exact steps I have in place:
Build a Windows-based image using a Docker Compose file. I specify in the compose file to map ports 2000:2000 (a trimmed-down sketch of the compose file is below these steps).
Push the Docker image to a private repository on Docker Hub.
Create a web app service in Azure using the image.
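Since I can't share the real file, here is a minimal sketch of the service in question; the service name, image name and registry are placeholders:

    # docker-compose.yml (illustrative only)
    version: "3.8"
    services:
      myapp:                              # placeholder service name
        image: myregistry/myapp:latest    # Windows-based image pushed to Docker Hub
        ports:
          - "2000:2000"                   # the same port the app listens on in Azure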
Any thoughts on why this randomly happens? My only next idea is to re-create the Docker image as a Linux-based image.
Does your app need to access more than a single port? Please note that, as of right now, we only allow a single port. More information on that here.
Lastly, please see if turning off the container check, mentioned here, helps to resolve the matter.
Let us know the outcome of these two steps and we can assist you further if needed.
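If it helps, both of those suggestions are applied as app settings on the web app. A rough sketch with the Azure CLI; the resource names are placeholders, and the availability-check setting name should be confirmed against the article linked above:

    # Tell App Service which single container port to forward traffic to (placeholder names)
    az webapp config appsettings set \
      --resource-group my-rg --name my-webapp \
      --settings WEBSITES_PORT=2000

    # Relax/disable the container availability check; verify the exact setting name and values
    # in the linked Windows-container documentation before relying on this
    az webapp config appsettings set \
      --resource-group my-rg --name my-webapp \
      --settings CONTAINER_AVAILABILITY_CHECK_MODE=Off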

Cloud Automation Manager on IBM Cloud Private - deployment not available and pods pending

I installed the Helm release of CAM from the catalog; however, the individual components of CAM are not being deployed. There don't seem to be any deployments available, and all of the CAM pods are pending.
Screenshot of Deployments of CAM on ICP
With NFS installed, once I deploy CAM from the catalog, the PVs are now bound to the PV claims from the start. However, the same problem persists: there are no available deployments for CAM, and the pods are stuck at either Init:0/1, Pending or ContainerCreating without any change.
Edit:
When I checked the pods, the Pending error was due to insufficient memory, so I added another worker node and I no longer have any pending deployments. However, I still have the issue of pods being stuck at Init:0/1, ContainerCreating and ImagePullBackOff.
Here are some of the errors (screenshots): Init:0/1 Error, ImagePullBackOff, ContainerCreating
It could be a handful of issues, depending on how many tries/retries of the CAM deploy you have attempted:
- delete the PVCs (a kubectl sketch of this and the next step is below this list)
- edit the PVs to remove any PV claim (or delete the PVs and recreate them)
- ensure the NFS exports are defined correctly (see the bottom of this page):
https://www.ibm.com/support/knowledgecenter/SS2L37_3.1.0.0/cam_create_pv.html
- remove any prior files/data from the PV locations on disk
- delete the failed CAM chart deploy if it is still there
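For the PVC/PV part, something along these lines usually works; the namespace and resource names are assumptions based on a default CAM install, so substitute whatever kubectl get pvc --all-namespaces shows in your environment:

    # List the CAM claims and the PVs they are bound to ('services' is the usual CAM namespace on ICP)
    kubectl get pvc -n services
    kubectl get pv

    # Delete a stuck claim, then clear the stale claimRef from the released PV so it can bind again
    kubectl delete pvc cam-mongo-pv-claim -n services      # claim name is illustrative
    kubectl patch pv cam-mongo-pv --type json \
      -p '[{"op":"remove","path":"/spec/claimRef"}]'       # or delete and recreate the PV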
If you're facing more trouble, we suggest opening a support case so we can help you right away!
ibm.biz/icpsupport
Thanks.
This problem may be related to the CAM PVs not being created or not being bound. Could you please check that the 4 CAM PVs were created before deploying CAM?
From the ICP console > Platform > Storage, you should see the 4 CAM PVs.
Please check https://www.ibm.com/support/knowledgecenter/SS2L37_3.1.0.0/cam_create_pv.html regarding how to create the CAM PVs.
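You can also check them from the command line; a quick sketch (the PV names below follow the knowledge-center topic and may differ in your environment):

    # All four CAM PVs should show STATUS=Bound once the chart is deployed
    kubectl get pv cam-mongo-pv cam-logs-pv cam-terraform-pv cam-bpd-appdata-pv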
Please review the options for offline installation of IBM Cloud Automation Manager. With an offline installation, it can take hours until the pod cam-iaas is running.
https://developer.ibm.com/cloudautomation/2018/10/18/ibm-cloud-automation-manager-3-1-delivers-improved-offline-installation-experience/

Cloud Automation Manager Pods on CrashLoopBackOff

I'm having an issue where some of my pods are in CrashLoopBackOff when I try to deploy CAM from the Catalog. I also followed the instructions in the IBM documentation to clear the data from the PVs (by doing rm -Rf /export/CAM_db/*) and to purge the previous installations of CAM.
Here are the pods that are on CrashLoopBackOff:
Cam Pods
Here's the specific error when I describe the pod:
MongoDB Pod
Ro-
It is almost always the case that if the cam-mongo pod does not come up properly, the issue is with the PV being unable to mount/read/access the actual disk location, or with the data itself that is on the PV.
Since your pod events indicate the container image already exists and is scoped to the store, it seems like you have already tried to install CAM before and it is using the CE version from the Docker store, correct?
If a prior deploy did not go well, do clean up the disk locations as per the doc:
https://www.ibm.com/support/knowledgecenter/SS2L37_3.1.0.0/cam_uninstalling.html
As you showed, you have already tried this by cleaning CAM_db, so do the same for the CAM_logs, CAM_bpd and CAM_terraform locations (a quick sketch follows).
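For completeness, a sketch of that cleanup, assuming the same /export layout you used for CAM_db; adjust the paths to match how your PVs are actually exported:

    # Clear the data behind each CAM PV before retrying the deploy
    rm -Rf /export/CAM_db/*
    rm -Rf /export/CAM_logs/*
    rm -Rf /export/CAM_bpd/*
    rm -Rf /export/CAM_terraform/*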
Make a note of our install troubleshooting section as it describes a few scenarios in which CAM mongo can be impacted:
https://www.ibm.com/support/knowledgecenter/SS2L37_3.1.0.0/ts_cam_install.html
At the bottom of the PV creation topic we provide some guidance around the NFS mount options that work best, please review it:
https://www.ibm.com/support/knowledgecenter/SS2L37_3.1.0.0/cam_create_pv.html
Hope this helps you make some forward progress!
You can effectively ignore the postStart error; it means the mongo container probably failed to start, so it kills a post-start script.
This issue is usually due to an NFS configuration problem.
I would recommend trying the troubleshooting steps in the "cam-mongo pod is in CrashLoopBackoff" section here:
https://www.ibm.com/support/knowledgecenter/SS2L37_3.1.0.0/ts_cam_install.html
If it's NFS, typically it's things like the following (an /etc/exports sketch is after this list):
- no_root_squash is missing on the base directory
- fsid=0 needs to be removed from the base directory for that setup
- folder permissions
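To make that concrete, a sketch of an /etc/exports entry along those lines; the export path and subnet are placeholders, and the exact option set should follow the cam_create_pv topic linked above:

    # /etc/exports (illustrative): keep root access (no_root_squash) and no fsid=0 on the base directory
    /export  192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)

    # reload the exports after editing
    exportfs -ra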
Note: I have seen another customer experiencing this issue where the problem was caused by NFS: there was a .snapshot file there already, and they had to remove it first.

Undeploying Business Network

Using Hyperledger Composer 0.19.1, I can't find a way to undeploy my business network. I don't necessarily want to upgrade to a newer version each time, but rather replace the one deployed with a fix in the JS code, for instance. Is there any replacement for the undeploy command that existed before?
There is no replacement for the old undeploy command, and in fact it was not really an undeploy - it merely hid the old network.
Be aware that every time you upgrade a network it creates a new Docker image and container, so you may want to tidy these up periodically (see the sketch below). (You could also try to delete the BNA from the peer servers, but these are very small in comparison to the Docker images.)
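A rough sketch of that periodic tidy-up; the dev- naming pattern is the Fabric default for chaincode containers and images, but check what docker ps and docker images actually show in your environment before deleting anything:

    # List the chaincode containers and images left behind by old network versions
    docker ps -a --filter "name=dev-"
    docker images | grep "^dev-"

    # Remove the ones belonging to versions you no longer need (IDs are placeholders)
    docker rm <old-container-id>
    docker rmi <old-image-id>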
It might not help your situation, but if you are rapidly developing and iterating, you could try this in the online Playground or a local Playground with the Web profile - this is fast and does not create any new images/containers.

Windows Azure - Persistence of OS Settings when using WebRoles

I've been watching some videos from the Build conference re: Inside Windows Azure, etc.
My takeaway from one of them was that unless I loaded a preconfigured VHD into a Virtual Machine role, I would lose any system settings that I might have made should the instance be brought down or recycled.
So for instance, I have a single account with 2 web roles running multiple (small) websites. To make that happen I had to adjust the settings in the hosts file. I know my websites will be carried over in the event of failure because they are defined in the ServiceConfiguration.cscfg, but will my hosts file settings also carry over to a fresh instance in the event of a failure?
i.e. how deep/comprehensive is my "template" with a web role?
The hosts file will be reconstructed on any full redeployment or reimage.
In general, you should avoid relying on changes to any file that is created by the operating system. If your application is migrated to another server it will be running on a new virtual machine with its own new copy of Windows, and so the changes will suddenly appear to have vanished.
The same will happen if you perform a deployment to the Azure "staging" environment and then perform a "swap VIP": the "staging" environment will not have the changes made to the operating system file.
Microsoft intentionally don't publish the inner details of what Azure images look like, as they will most likely change in future, but currently:
- drive C: holds the boot partition, logs and temporary data, and is small
- drive D: holds a Windows image
- drive E: or F: holds your application
On a full deployment, or a re-image, you receive a new virtual machine, so all three drives are re-created. On an upgrade, the virtual machine continues to run but the load balancer migrates traffic away while the new version of the application is deployed to drive F:. Drive E: is then removed.
So, answering your question directly, the "template" is for drive E: -- anything else is subject to change without your knowledge, and can't be relied on.
Azure provides Startup Scripts so that you can make configuration changes on instance startup. Often these are used to install additional OS components or make IIS-configuration changes (like disabling idle timeouts).
See http://blogs.msdn.com/b/lucascan/archive/2011/09/30/using-a-windows-azure-startup-script-to-prevent-your-site-from-being-shutdown.aspx for an example.
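For reference, a startup script is wired up as a startup task in ServiceDefinition.csdef; a minimal sketch, where Startup.cmd is a placeholder script shipped with the role's content:

    <!-- ServiceDefinition.csdef, inside the <WebRole> element -->
    <Startup>
      <Task commandLine="Startup.cmd" executionContext="elevated" taskType="simple" />
    </Startup>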
The existing answers are technically correct and answer the question, but hosting multiple websites in a single web role doesn't require editing the hosts file at all. Just define multiple sites (with different host headers) in your ServiceDefinition.csdef. See http://msdn.microsoft.com/en-us/library/gg433110.aspx
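A sketch of what that looks like in the web role's ServiceDefinition.csdef; the site names, physical directories and host headers are placeholders:

    <!-- Two sites sharing one HTTP endpoint, distinguished by host header -->
    <Sites>
      <Site name="SiteA" physicalDirectory="..\SiteA">
        <Bindings>
          <Binding name="HttpIn" endpointName="HttpIn" hostHeader="www.example-a.com" />
        </Bindings>
      </Site>
      <Site name="SiteB" physicalDirectory="..\SiteB">
        <Bindings>
          <Binding name="HttpIn" endpointName="HttpIn" hostHeader="www.example-b.com" />
        </Bindings>
      </Site>
    </Sites>
    <Endpoints>
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </Endpoints>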
