Azure AAD pod identity with Azure event hub - spark-streaming

I have a requirement to use the managed identity mechanism to access Azure Event Hubs from a Spark streaming application running in Kubernetes.
I have been going through AAD pod-managed identity to connect to Azure Event Hubs, but didn't find any documentation specific to Event Hubs.
Does AAD pod identity support accessing an Event Hubs resource securely using Azure Active Directory?
Can anyone provide steps/code to use Event Hubs with AAD pod identity?
Thanks in advance.

Yes, AAD pod identity supports connecting to Azure Event Hubs. Here are the steps:
First, configure your cluster to enable managed identity. Note that this scenario applies to RBAC-disabled clusters.
az aks update -g <rg-name> -n <cluster-name> --enable-managed-identity
az aks update -g <rg-name> -n <cluster-name> --enable-pod-identity --enable-pod-identity-with-kubenet
After this configuration, you can deploy AAD pod identity:
kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/v1.8.13/deploy/infra/deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/v1.8.13/deploy/infra/mic-exception.yaml
Check that the aad-pod-identity pods in the default namespace are up and running: kubectl get po
Create the AAD pod identity with the CLI:
az aks pod-identity add --resource-group <rg-name> \
  --cluster-name <cluster-name> --namespace <your-ns> --name <name> \
  --identity-resource-id <resource-id> --binding-selector <name_that_use_in_aks>
Make sure the pods that run your application carry the label aadpodidbinding: <name_that_use_in_aks> (the binding selector), otherwise token requests from those pods will not be intercepted.
Check whether the identity is assigned:
az aks show -g <rg-name> -n <cluster-name> | grep -i <user-assigned-managed-identity-name>
If your configuration is valid, here is a Java code sample. The clientId is the client ID of the user-assigned managed identity, and that identity needs an Event Hubs data role (for example Azure Event Hubs Data Sender) on the namespace:
import com.azure.identity.ManagedIdentityCredential;
import com.azure.identity.ManagedIdentityCredentialBuilder;
import com.azure.messaging.eventhubs.EventData;
import com.azure.messaging.eventhubs.EventHubClientBuilder;
import com.azure.messaging.eventhubs.EventHubProducerAsyncClient;
import com.azure.messaging.eventhubs.models.CreateBatchOptions;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Credential backed by the user-assigned managed identity bound to the pod.
ManagedIdentityCredential managedIdentityCredential = new ManagedIdentityCredentialBuilder()
        .clientId("<client-id-of-user-assigned-identity>")
        .maxRetry(1)
        .retryTimeout(duration -> Duration.ofMinutes(1))
        .build();

// The fully qualified namespace has the form "<namespace>.servicebus.windows.net".
EventHubProducerAsyncClient eventHubProducerAsyncClient = new EventHubClientBuilder()
        .credential("<namespace>.servicebus.windows.net", "<eventhub-name>", managedIdentityCredential)
        .buildAsyncProducerClient();

// message and LOGGER are assumed to be defined elsewhere in your application.
EventData eventData = new EventData(message.getBytes(StandardCharsets.UTF_8));
eventData.setContentType("application/json");

CreateBatchOptions options = new CreateBatchOptions()
        .setPartitionKey("1");

eventHubProducerAsyncClient.createBatch(options)
        .flatMap(batch -> {
            batch.tryAdd(eventData);
            return eventHubProducerAsyncClient.send(batch);
        })
        .subscribe(unused -> { },
                error -> LOGGER.error("Error occurred while sending message: " + error),
                () -> LOGGER.debug("Message sent successfully."));
For more details, see the related Microsoft documentation and the aad-pod-identity documentation.

Related

Unable to create Azure-keyvault-backed secret scope on Azure Databricks

I am not able to create a secret scope on Azure Databricks from the Databricks CLI. I run a command like this:
databricks secrets "create-scope" --scope "edap-dev-kv" --scope-backend-type AZURE_KEYVAULT --resource-id "/subscriptions/ba426b6f-65cb-xxxx-xxxx-9a1e1656xxxx/resourceGroups/edap-dev-rg/providers/Microsoft.KeyVault/vaults/edap-dev-kv" --profile profile_edap_dev2_dbx --dns-name "https://edap-dev-kv.vault.azure.net/"
I get error msg:
Error: HTTP ERROR 400 - Problem accessing /api/2.0/secrets/scopes/create.
io.jsonwebtoken.IncorrectClaimException: Expected aud claim to be: 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d, but was: https://management.core.windows.net/.
I have tried doing it with both a user (personal) AAD token and a service principal's AAD token. (I've found somewhere that it should be an AAD token of a user account.)
I am able to do it with GUI using same parameters.
In your case, the AAD token was issued for the wrong service: it was issued for https://management.core.windows.net/, but it must be issued for the resource ID of Azure Databricks, 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d.
The simplest way to get such a token is to use az-cli with the following command:
az account get-access-token -o tsv --query accessToken \
--resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d
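You can then pass that token to the Secrets API directly. Below is a minimal sketch (assuming the Python requests package; <workspace-url> and <key-vault-resource-id> are placeholders) that calls the same /api/2.0/secrets/scopes/create endpoint shown in the error message, with the fields mirroring the CLI flags used above:

import subprocess
import requests

# AAD token issued for the Azure Databricks resource ID, not for ARM.
aad_token = subprocess.check_output(
    ["az", "account", "get-access-token", "-o", "tsv", "--query", "accessToken",
     "--resource", "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"],
    text=True,
).strip()

resp = requests.post(
    "https://<workspace-url>/api/2.0/secrets/scopes/create",
    headers={"Authorization": f"Bearer {aad_token}"},
    json={
        "scope": "edap-dev-kv",
        "scope_backend_type": "AZURE_KEYVAULT",
        "backend_azure_keyvault": {
            "resource_id": "<key-vault-resource-id>",
            "dns_name": "https://edap-dev-kv.vault.azure.net/",
        },
    },
)
resp.raise_for_status()

Alternatively, you should be able to export the token as DATABRICKS_AAD_TOKEN and run databricks configure --aad-token so the CLI command from the question picks it up.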

How to load an AzureML model in an Azure Databricks compute?

I am trying to run a DatabricksStep. I have used ServicePrincipalAuthentication to authenticate the run:
appId = dbutils.secrets.get(<secret-scope-name>, <client-id>)
tenant = dbutils.secrets.get(<secret-scope-name>, <directory-id>)
clientSecret = dbutils.secrets.get(<secret-scope-name>, <client-secret>)
subscription_id = dbutils.secrets.get(<secret-scope-name>, <subscription-id>)
resource_group = <aml-rgp-name>
workspace_name = <aml-ws-name>

svc_pr = ServicePrincipalAuthentication(
    tenant_id=tenant,
    service_principal_id=appId,
    service_principal_password=clientSecret)

ws = Workspace(
    subscription_id=subscription_id,
    resource_group=resource_group,
    workspace_name=workspace_name,
    auth=svc_pr
)
The authentication is successful since running the following block of code gives the desired output:
subscription_id = ws.subscription_id
resource_group = ws.resource_group
workspace_name = ws.name
workspace_region = ws.location
print(subscription_id, resource_group, workspace_name, workspace_region, sep='\n')
However, the following block of code fails:
model_name=<registered-model-name>
model_path = Model.get_model_path(model_name=model_name, _workspace=ws)
loaded_model = joblib.load(model_path)
print('model loaded!')
This is giving an error:
UserErrorException:
Message:
Operation returned an invalid status code 'Forbidden'. The possible reason could be:
1. You are not authorized to access this resource, or directory listing denied.
2. you may not login your azure service, or use other subscription, you can check your
default account by running azure cli commend:
'az account list -o table'.
3. You have multiple objects/login session opened, please close all session and try again.
InnerException None
ErrorResponse
{
"error": {
"message": "\nOperation returned an invalid status code 'Forbidden'. The possible reason could be:\n1. You are not authorized to access this resource, or directory listing denied.\n2. you may not login your azure service, or use other subscription, you can check your\ndefault account by running azure cli commend:\n'az account list -o table'.\n3. You have multiple objects/login session opened, please close all session and try again.\n ",
"code": "UserError"
}
}
The error is a Forbidden error even though I have authenticated using ServicePrincipalAuthentication.
How can I resolve this error to run inference using an AML-registered model in ADB?
The Databricks workspace needs to be present in the same subscription as your AML workspace.
This notebook demonstrates the use of DatabricksStep in an Azure Machine Learning pipeline.
Here is the Model class reference for model registration.
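Once the workspace object points at the right subscription, the registered model can also be fetched and loaded directly with the Model class. A minimal sketch (assuming the model was registered as a single joblib-serialized file; the model name is a placeholder):

import joblib
from azureml.core.model import Model

# ws is the Workspace object created with ServicePrincipalAuthentication above.
model = Model(ws, name="<registered-model-name>")
model_path = model.download(target_dir=".", exist_ok=True)  # local path to the downloaded artifact
loaded_model = joblib.load(model_path)
print("model loaded!")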

Using CDK deploy cannot read temporary S3 bucket that holds Lambda Code

When I deploy a Lambda "code" using CDK, the deploy process (CloudFormation, presumably running under my user) does not seem to have access to the bucket that holds the Lambda code.
I followed this tutorial: https://intro-to-cdk.workshop.aws/what-is-cdk.html and see this error when I run cdk deploy:
Lambda8C48573D) Your access has been denied by S3, please make sure your request credentials have permission to GetObject for cdktoolkit-stagingbucket-19kn1ypcmzq2q/assets/5327df
Lambda Code:
const handler = new lambda.Function(this, "TimestreamLambda", {
  runtime: lambda.Runtime.NODEJS_10_X,
  code: lambda.Code.fromAsset(path.join(__dirname, '../resources')),
  handler: "index.hello_world",
  ...
The cdk and @aws-cdk version is 1.73.0, but I also tried 1.71.0.
Notes:
I see the bucket under my account (in my region).
When logged into this account I can see and download the asset file; the downloaded zip file has the correct contents.
More error details:
12/24 | 9:15:19 PM | CREATE_FAILED | AWS::Lambda::Function | TimestreamLambda (TimestreamLambda8C48573D) Your access has been denied by S3, please make sure your request credentials have permission to GetObject for cdktoolkit-stagingbucket-28hiljazvaim/assets/5327df740bdc9c380ff567xxxxxxxxxxx7a68a.zip. S3 Error Code: AccessDenied. S3 Error Message: Access Denied (Service: AWSLambdaInternal; Status Code: 403; Error Code: AccessDeniedException; Request ID: 1b813776-7647-4767-89bc-XXXXXXXXX; Proxy: null)
new Function (/Users/<user>/dev/cdk/cdk-workshop/node_modules/@aws-cdk/aws-lambda/lib/function.ts:593:35)
\_ new CdkWorkshopStack (/Users/<user>/dev/cdk/cdk-workshop/lib/cdk-workshop-stack.ts:33:21)
I also see this (using the -v option) during deploy:
env: {
  CDK_DEFAULT_REGION: 'us-west-2',
  CDK_DEFAULT_ACCOUNT: '94646XXXXX',
  CDK_CONTEXT_JSON: '{"@aws-cdk/core:enableStackNameDuplicates":"true","aws-cdk:enableDiffNoFail":"true","@aws-cdk/core:stackRelativeExports":"true","aws:cdk:enable-path-metadata":true,"aws:cdk:enable-asset-metadata":true,"aws:cdk:version-reporting":true,"aws:cdk:bundling-stacks":["*"]}',
  CDK_OUTDIR: 'cdk.out',
  CDK_CLI_ASM_VERSION: '7.0.0',
  CDK_CLI_VERSION: '1.73.0'
}
As it turns out, this was an issue with the internal authentication system my company uses to access AWS. Instead of using my regular AWS account for access, I had to create a temporary account (which also sets a temporary token).

Azure devops terraform pipeline generate client id and secret

I am using this Terraform manifest to deploy AKS on Azure. I can do this fine via the command line and it works, as I have the Azure CLI configured on my machine to generate the client ID and secret:
https://github.com/anubhavmishra/terraform-azurerm-aks
However, I am now building this in an Azure DevOps pipeline.
So far I have managed to run terraform init and plan with backend storage on Azure, from Azure DevOps, using this extension:
https://marketplace.visualstudio.com/items?itemName=charleszipp.azure-pipelines-tasks-terraform
Question: How do I get the client ID and secret in the Azure DevOps pipeline and set them as environment variables for Terraform? I tried running an az command from a bash task in the pipeline:
> az ad sp create-for-rbac --role="Contributor"
> --scopes="/subscriptions/YOUR_SUBSCRIPTION_ID"
but it failed with this error:
> 2019-03-27T10:41:58.1042923Z
2019-03-27T10:41:58.1055624Z Setting AZURE_CONFIG_DIR env variable to: /home/vsts/work/_temp/.azclitask
2019-03-27T10:41:58.1060006Z Setting active cloud to: AzureCloud
2019-03-27T10:41:58.1069887Z [command]/usr/bin/az cloud set -n AzureCloud
2019-03-27T10:41:58.9004429Z [command]/usr/bin/az login --service-principal -u *** -p *** --tenant ***
2019-03-27T10:42:00.0695154Z [
2019-03-27T10:42:00.0696915Z {
2019-03-27T10:42:00.0697522Z "cloudName": "AzureCloud",
2019-03-27T10:42:00.0698958Z "id": "88bfee03-551c-4ed3-98b0-be68aee330bb",
2019-03-27T10:42:00.0704752Z "isDefault": true,
2019-03-27T10:42:00.0705381Z "name": "Visual Studio Enterprise",
2019-03-27T10:42:00.0706362Z "state": "Enabled",
2019-03-27T10:42:00.0707434Z "tenantId": "***",
2019-03-27T10:42:00.0716107Z "user": {
2019-03-27T10:42:00.0717485Z "name": "***",
2019-03-27T10:42:00.0718161Z "type": "servicePrincipal"
2019-03-27T10:42:00.0718675Z }
2019-03-27T10:42:00.0719185Z }
2019-03-27T10:42:00.0719831Z ]
2019-03-27T10:42:00.0728173Z [command]/usr/bin/az account set --subscription 88bfee03-551c-4ed3-98b0-be68aee330bb
2019-03-27T10:42:00.8569816Z [command]/bin/bash /home/vsts/work/_temp/azureclitaskscript1553683312219.sh
2019-03-27T10:42:02.4431342Z ERROR: Directory permission is needed for the current user to register the application. For how to configure, please refer 'https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-group-create-service-principal-portal'. Original error: Insufficient privileges to complete the operation.
2019-03-27T10:42:02.5271752Z [command]/usr/bin/az account clear
2019-03-27T10:42:03.3092558Z ##[error]Script failed with error: Error: /bin/bash failed with return code: 1
2019-03-27T10:42:03.3108490Z ##[section]Finishing: Azure CLI
Here is how I do it with Azure Pipelines.
Create a Service Principal for Terraform.
Create the following variables in your pipeline
ARM_CLIENT_ID
ARM_CLIENT_SECRET
ARM_SUBSCRIPTION_ID
ARM_TENANT_ID
If you choose to store ARM_CLIENT_SECRET as a secret variable in Azure DevOps, it is not exposed to scripts automatically: you will need to map it explicitly under the Environment Variables section of the task (for example, set ARM_CLIENT_SECRET to $(ARM_CLIENT_SECRET)) so Terraform can read it.
You just need to grant your service connection rights to create service principals, but I'd generally advise against that: just pre-create a service principal and use it in your pipeline, since creating a new service principal on each run seems excessive.
You can use build/release variables and populate those with the client ID/secret.
The approach defined in the post https://medium.com/@maninder.bindra/creating-a-single-azure-devops-yaml-pipeline-to-provision-multiple-environments-using-terraform-e6d05343cae2 can be considered as well. Here the Key Vault task is used to fetch the secrets from Azure Key Vault (these include the Terraform backend access secrets as well as the AKS service principal secrets):
# KEY VAULT TASK
- task: AzureKeyVault@1
  inputs:
    azureSubscription: '$(environment)-sp'
    KeyVaultName: '$(environment)-pipeline-secrets-kv'
    SecretsFilter: 'tf-sp-id,tf-sp-secret,tf-tenant-id,tf-subscription-id,tf-backend-sa-access-key,aks-sp-id,aks-sp-secret'
  displayName: 'Get key vault secrets as pipeline variables'
You can then use the secrets as variables in the rest of the pipeline. For instance, aks-sp-id can be referred to as $(aks-sp-id), so the bash/azure-cli task can be something like:
# AZ LOGIN USING TERRAFORM SERVICE PRINCIPAL
- script: |
    az login --service-principal -u $(tf-sp-id) -p $(tf-sp-secret) --tenant $(tf-tenant-id)
    cd $(System.DefaultWorkingDirectory)/tf-infra-provision
This is followed by terraform init and plan (the plan step is shown below; see the post for the complete pipeline details):
# TERRAFORM PLAN
echo '#######Terraform Plan########'
terraform plan -var-file=./tf-vars/$(tfvarsFile) -var="client_id=$(tf-sp-id)" -var="client_secret=$(tf-sp-secret)" -var="tenant_id=$(tf-tenant-id)" -var="subscription_id=$(tf-subscription-id)" -var="aks_sp_id=$(aks-sp-id)" -var="aks_sp_secret=$(aks-sp-secret)" -out="out.plan"
Hope this helps.

Executing maven unit tests on a Google Cloud SQL environment

I have a Jenkins pod running in GCP's Kubernetes Engine and I'm trying to run a Maven unit test that connects to a Google Cloud SQL database. My application.yaml for the project looks like this:
spring:
  cloud:
    gcp:
      project-id: <my_project_id>
      sql:
        database-name: <my_database_name>
        instance-connection-name: <my_instance_connection_name>
  jpa:
    database-platform: org.hibernate.dialect.MySQL55Dialect
    hibernate:
      ddl-auto: create-drop
  datasource:
    continue-on-error: true
    driver-class-name: com.mysql.cj.jdbc.Driver
    username: <my_cloud_sql_username>
    password: <my_cloud_sql_password>
The current Jenkinsfile associated with this project is:
pipeline {
    agent any
    tools {
        maven 'Maven 3.5.2'
        jdk 'jdk8'
    }
    environment {
        IMAGE = readMavenPom().getArtifactId()
        VERSION = readMavenPom().getVersion()
        DEV_DB_USER = "${env.DEV_DB_USER}"
        DEV_DB_PASSWORD = "${env.DEV_DB_PASSWORD}"
    }
    stages {
        stage('Build docker image') {
            steps {
                sh 'mvn -Dmaven.test.skip=true clean package'
                script {
                    docker.build '$IMAGE:$VERSION'
                }
            }
        }
        stage('Run unit tests') {
            steps {
                withEnv(['GCLOUD_PATH=/var/jenkins_home/google-cloud-sdk/bin']) {
                    withCredentials([file(credentialsId: 'key-sa', variable: 'GC_KEY')]) {
                        sh("gcloud auth activate-service-account --key-file=${GC_KEY}")
                        sh("gcloud container clusters get-credentials <cluster_name> --zone northamerica-northeast1-a --project <project_id>")
                        sh 'mvn test'
                    }
                }
            }
        }
    }
}
My problem is that when the pipeline actually tries to run mvn test with the above configuration (in my application.yaml), I get this error:
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
{
  "code" : 403,
  "errors" : [ {
    "domain" : "global",
    "message" : "Insufficient Permission: Request had insufficient authentication scopes.",
    "reason" : "insufficientPermissions"
  } ],
  "message" : "Insufficient Permission: Request had insufficient authentication scopes."
}
I have two Google Cloud projects:
One that has the Kubernetes Cluster where the Jenkins pod is running.
Another project where the K8s Cluster contains my actual Spring Boot Application and the Cloud SQL database that I'm trying to access.
I also created the service account only in my Spring Boot Project for Jenkins to use with three roles: Cloud SQL Editor, Kubernetes Engine Cluster Admin and Project owner (to verify that the service account is not at fault).
I enabled the Cloud SQL, Cloud SQL Admin, and Kubernetes APIs in both projects, and I double-checked my Cloud SQL credentials; they are OK. In addition, I authenticated the Jenkins pipeline using the JSON key file generated when I created the service account, following the recommendations discussed here:
Jenkinsfile (extract):
...
withCredentials([file(credentialsId: 'key-sa', variable: 'GC_KEY')]) {
    sh("gcloud auth activate-service-account --key-file=${GC_KEY}")
    sh("gcloud container clusters get-credentials <cluster_name> --zone northamerica-northeast1-a --project <project_id>")
    sh 'mvn test'
}
...
I don't believe the GCP Java SDK relies on the gcloud CLI at all. Instead, it looks for the GOOGLE_APPLICATION_CREDENTIALS environment variable, which should point to your service account key file, and GCLOUD_PROJECT (see https://cloud.google.com/docs/authentication/getting-started).
Try adding the following, in the same sh step that runs mvn test (each Jenkins sh step starts a fresh shell, so separate export calls would not carry over to the test step):
sh("export GOOGLE_APPLICATION_CREDENTIALS=${GC_KEY} && export GCLOUD_PROJECT=<project_id> && mvn test")
There are a couple of different things you should verify to get this working. I'm assuming you are using the Cloud SQL JDBC SocketFactory for Cloud SQL.
You should create a testing service account and give it whatever permissions are needed to execute the tests. To connect to Cloud SQL, it needs at a minimum the "Cloud SQL Client" role for the same project as the Cloud SQL instance.
The Cloud SQL socket factory uses the Application Default Credentials (ADC) strategy to determine what authentication to use. This means the first place it looks for credentials is the GOOGLE_APPLICATION_CREDENTIALS env var, which should be a path to the key for the testing service account.
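If it is unclear which identity ADC actually picks up on the Jenkins agent, a quick sanity check can help before running the Maven tests. A minimal sketch, assuming the google-auth Python package is available on the agent and that GOOGLE_APPLICATION_CREDENTIALS points at the testing service account key:

import google.auth

# Resolve credentials via the same ADC strategy the socket factory uses:
# GOOGLE_APPLICATION_CREDENTIALS first, then the other default sources.
credentials, project = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
print("project:", project)
# Service account keys expose the resolved account as service_account_email.
print("account:", getattr(credentials, "service_account_email", "<not a service account>"))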
