VMSS Autoscaling: WADPerformanceCounters - azure-vm-scale-set

I’ve added the autoscaling settings to my Service Fabric template, and after deploying it the portal shows that autoscale is configured. But what I am not able to see is the table WADPerformanceCounters, mentioned in the documentation, in my storage account. So how is the autoscaling executed without the information about the counters?
Thanks.

If autoscale cannot find the data it's configured to look at, it will set your capacity equal to the "default" configured in the autoscale rule.
As for what could explain the behavior you're seeing, here are a couple hypotheses:
1) There are two types of metrics in Azure today: host and guest. Host metrics live in Azure-internal data stores and as such don't require a storage account to store data in. Guest metrics, however, do live in a storage account. So depending on how you added autoscale, you might have configured rules on host metrics instead of guest metrics. For more info, see this doc: https://learn.microsoft.com/en-us/azure/monitoring-and-diagnostics/insights-autoscale-common-metrics
2) As you can see in this template using guest metrics, for guest metrics the scale set must have the WAD extension configured to point to the storage account; it's probably worth checking that the storage account specified in the WAD extension config is the same storage account you looked for the table in.

For host metrics, you can find the list of supported metrics here:
https://learn.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-supported-metrics#microsoftcomputevirtualmachinescalesets
For guest metrics, as mentioned above, you need to configure the Windows Azure Diagnostics (WAD) extension correctly on your VMSS. Specifically, the autoscale engine will query the WAD{value}PT1M{value} tables in your configured diagnostic storage account. These tables contain the local 1-minute aggregation of the performance counter data.
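As a quick sanity check, you can list the tables in the storage account that the WAD extension writes to, for example with the Azure CLI (the account name and key below are placeholders):
# List the tables in the diagnostics storage account; the metric tables have PT1M in their names
az storage table list --account-name mydiagstorage --account-key "<storage-key>" --output table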

Related

Can a Databricks cluster be shared across workspaces?

My ultimate goal is to differentiate/manage costs on Databricks (Azure) based on different teams/projects, and I was wondering whether I could use workspaces to achieve this.
I read the passage below; it sounds like a workspace can access a cluster, but it does not say whether multiple workspaces can access the same cluster or not.
A Databricks workspace is an environment for accessing all of your Databricks assets. The workspace organizes objects (notebooks, libraries, and experiments) into folders, and provides access to data and computational resources such as clusters and jobs.
In other words, can I create a cluster and somehow ensure it can only be accessed by a certain project, team, or workspace?
To manage who can access a particular cluster, you can make use of cluster access control. With cluster access control, you can determine what users can do on the cluster, e.g. attach to it, restart it, or fully manage it. You can do this at a user level but also at a user-group level. Note that you have to be on the Azure Databricks Premium Plan to make use of cluster access control.
You also mentioned that your ultimate goal is to differentiate/manage costs on Azure Databricks. For this you can make use of tags. You can tag workspaces, clusters, and pools, and the tags are then propagated to cost analysis reports in the Azure portal (see here).
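For example, cluster-level tags can be set in the cluster's JSON definition via custom_tags (the tag names and values below are just placeholders), and they then show up in the cost analysis reports:
...
"custom_tags": {
    "Team": "data-platform",
    "Project": "cost-attribution"
},
...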

How to use Azure Spot instances on Databricks

Spot instances bring the possibility of using spare resources in the cloud at a lower price; however, if cloud demand increases, your resources will be deallocated. This is very useful for non-critical workloads, whenever you can afford to lose some of the work done.
Databricks can run spot instances on AWS, but there is no documentation about how to do it on Azure.
Is it possible to run Databricks clusters on Azure Spot instances?
Yes, it is possible, but not through the Databricks UI. To use Azure spot instances on Databricks you need to use the Databricks CLI.
Note
With the CLI tool it is possible to administer (create, edit, delete) clusters and instance pools. However, to simplify the process, I'll focus on editing an existing cluster.
You can install the Databricks CLI with pip install databricks-cli and configure your credentials with databricks configure --token. For more information, visit the Databricks documentation.
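For example:
$ pip install databricks-cli
$ databricks configure --token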
Run the command databricks clusters list to find the ID of the cluster you want to modify:
$ databricks clusters list
0422-112415-fifes919 Big Spark3 TERMINATED
0612-341234-jails230 Normal Spark3 TERMINATED
0212-623261-mopes727 Small 7.6 TERMINATED
In my case, I have 3 clusters. The first column is the cluster ID, the second is the cluster name, and the last is the state.
The command databricks clusters get returns the cluster config in JSON format. Let's dump it to a file so we can modify it:
databricks clusters get --cluster-id 0422-112415-fifes919 > /tmp/my_cluster.json
This file contains all the configuration related to the cluster, like name, instance type, owner... In our case we are looking for the azure_attributes section. You will see something similar to:
...
"azure_attributes": {
"first_on_demand": 1,
"availability": "ON_DEMAND_AZURE",
"spot_bid_max_price": -1.0
},
...
We need to change availability to SPOT_WITH_FALLBACK_AZURE and set spot_bid_max_price to our bid price. Edit the file with your favorite tool. The result should be something like:
...
"azure_attributes": {
"first_on_demand": 1,
"availability": "SPOT_WITH_FALLBACK_AZURE",
"spot_bid_max_price": 0.4566
},
...
Once modified, just update the cluster with the new configuration file using databricks clusters edit:
databricks clusters edit --json-file /tmp/my_cluster.json
Now, every time you start the cluster, the workers will be spot instances. To confirm this, you can go to the Configuration tab of a worker VM allocated in the resource group managed by Databricks. You will see that Azure Spot is active, with the configured maximum price.
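You can also double-check from the CLI by re-fetching the cluster config and looking at the azure_attributes block (same cluster ID as in the earlier example):
$ databricks clusters get --cluster-id 0422-112415-fifes919 | grep -A 3 azure_attributes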
Databricks on AWS has more configuration options, such as SPOT for the availability field. However, until the documentation is released, we'll have to wait or configure things with a trial-and-error approach.

Azure AKS - splitting node pool over multiple Availability Zones

I'm new to Azure so please bear with me! I'm looking to create an HA (99.99%) node pool for AKS. I am more familiar with AWS and availability zones, whereby I'd split the auto scaling group over 3 AZs and that would be that.
It appears that Azure has picked up on AZs and does offer them (https://azure.microsoft.com/en-gb/blog/azure-availability-zones-now-available-for-the-most-comprehensive-resiliency-strategy/); however, I don't see any way to specify these parameters when creating an AKS cluster: https://learn.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest#az-aks-create
Am I missing something here? If I use an availability set, there is only a 99.95% availability target, which doesn't fulfill what I need. Basically, I want to architect things so that if an AZ fails in Azure, my app keeps running...
Thanks!
Update: AKS with Availability Zone Support is now generally available: https://learn.microsoft.com/en-us/azure/aks/availability-zones
But note that the availability zone configuration can only be set during cluster creation!
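A minimal sketch of creating a zone-spread cluster with the Azure CLI (resource group, cluster name, and zone numbers are placeholders; make sure the region supports availability zones):
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-count 3 \
  --zones 1 2 3 \
  --generate-ssh-keys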
Unfortunately, Azure Availability Zones do not support AKS currently. They are only available for some regions and services. For details, see the supported regions and services.

Connect Hadoop cluster to multiple Google Cloud Storage buckets in multiple Google projects

Is it possible to connect my Hadoop cluster to multiple Google Cloud projects at once?
I can easily use any Google Cloud Storage bucket in a single Google project via the Google Cloud Storage connector, as explained in this thread: Migrating 50TB data from local Hadoop cluster to Google Cloud Storage. But I can't find any documentation or example of how to connect to two or more Google Cloud projects from a single map-reduce job. Do you have any suggestions/tricks?
Thanks a lot.
Indeed, it is possible to connect your cluster to buckets from multiple different projects at once. Ultimately, if you're using the instructions for using a service-account keyfile, the GCS requests are performed on behalf of that service account, which can be treated more or less like any other user. You can either add the service account email your-service-account-email@developer.gserviceaccount.com to all the different cloud projects owning the buckets you want to process, using the permissions section of cloud.google.com/console and simply adding that email address like any other member, or you can set GCS-level access to add that service account like any other user.
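As a concrete (hedged) example of the second option with today's tooling, you can grant the service account access to a bucket owned by another project with gsutil; the email and bucket name below are placeholders:
# Give the connector's service account object-level access to a bucket in another project
gsutil iam ch serviceAccount:your-service-account-email@developer.gserviceaccount.com:objectAdmin gs://bucket-in-other-project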

How does Amazon EC2 Auto Scaling work?

I am trying to understand how Amazon implements the auto scaling feature. I can understand how it is triggered, but I don't know what exactly happens during the auto scaling. How does it expand? For instance:
Say I set the triggering condition as CPU > 90. Once the VM's CPU usage increases above 90:
Does it have a template image which will be copied to the new machine and started?
How long will it take to start servicing the new requests ?
Will the old vm have any downtime ?
I understand that it has the capability to provide load balancing between the VMs, but I cannot find any links/papers which explain how Amazon auto scaling works. It would be great if you could provide me some information about it. Thank you.
Essentially, in the setup you register an AMI and a set of EC2 start parameters, i.e. a launch configuration (instance size, userdata, security group, region, availability zone, etc.). You also set up scaling policies.
Your scaling trigger fires
Policies are examined to determine which launch configuration applies.
EC2 RunInstances is called with the registered AMI and the launch configuration parameters.
At this point, an instance is started which is a combination of the AMI and the launch configuration. It registers itself with an IP address into the AWS environment.
As part of the initial startup (done by ec2config or ec2run, going from memory here), the newly started instance can connect to the instance metadata service and run the script stored in "userdata". This script can bootstrap software installation, operating system configuration, settings, anything really that you can do with a script.
Once it's completed, you've got a newly created instance.
Now - if this process was kicked off by auto-scale and elastic-load-balancing, at the point that the instance is "Windows is ready" (Check ec2config.log), the load balancer will add the instance to itself. Once it's responding to requests, it will be marked healthy, and the ELB will start routing traffic.
The gold standard is to have a generic AMI, and use your bootstrap script to install all the packages / msi's / gems or whatever you need onto the server. But what often happens is that people build a golden image, and register that AMI for scaling.
The downside to the latter method is that every release requires a new AMI to be created, and the launch configurations to be updated.
Hope that gives you a bit more info.
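To make the moving parts concrete, here's a rough sketch of that flow with the AWS CLI; every name, the AMI ID, and the policy ARN are placeholders, and the classic launch-configuration setup shown here is just one way to wire it up:
# 1. Register the launch configuration (AMI + instance parameters + userdata bootstrap)
aws autoscaling create-launch-configuration \
  --launch-configuration-name my-lc \
  --image-id ami-12345678 \
  --instance-type t2.micro \
  --security-groups my-sg \
  --user-data file://bootstrap.sh
# 2. Create the auto scaling group behind the load balancer
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --launch-configuration-name my-lc \
  --min-size 1 --max-size 4 --desired-capacity 1 \
  --availability-zones us-east-1a us-east-1b \
  --load-balancer-names my-elb
# 3. A scaling policy that adds one instance when triggered
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-asg \
  --policy-name scale-out \
  --adjustment-type ChangeInCapacity \
  --scaling-adjustment 1
# 4. The CloudWatch alarm that fires the policy when average CPU > 90%
aws cloudwatch put-metric-alarm \
  --alarm-name cpu-high \
  --namespace AWS/EC2 --metric-name CPUUtilization \
  --statistic Average --period 300 \
  --threshold 90 --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2 \
  --dimensions Name=AutoScalingGroupName,Value=my-asg \
  --alarm-actions <scaling-policy-arn>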
Maybe this can help you:
http://www.cardinalpath.com/autoscaling-your-website-with-amazon-web-services-part-2/
http://www.cardinalpath.com/autoscaling-your-website-with-amazon-web-services-part-1/
This post helped me achieve this.
Have a read of this chap's blog; it helped me when I was doing some research on the subject.
http://www.codebelay.com/blog/2009/08/02/how-to-load-balance-and-auto-scale-with-amazons-ec2/