Databricks - Reading BLOB from mounted file storage - azure-databricks

I am using Azure Databricks and I ran the following Python code:
sas_token = "<my sas key>"
dbutils.fs.mount(
  source = "wasbs://<container>@<storageaccount>.blob.core.windows.net",
  mount_point = "/mnt/gl",
  extra_configs = {"fs.azure.sas.<container>.<storageaccount>.blob.core.windows.net": sas_token})
This seemed to run fine. So I then ran:
df = spark.read.text("/mnt/gl/glAgg_LE.csv")
Which gave me the error:
shaded.databricks.org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
Not sure what I'm doing wrong, though. I'm pretty sure my SAS key is correct.

OK, if you are getting this error, double-check both the SAS key and the container name.
It turned out I had pointed the mount at the wrong container!
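As a rough sanity check (a sketch only; the /mnt/gl mount point and sas_token come from the question, and the container/account names are placeholders), you can list the existing mounts to see what /mnt/gl actually points at, and remount it if the source is wrong:
# List existing mounts and check which source "/mnt/gl" points to
for m in dbutils.fs.mounts():
    print(m.mountPoint, "->", m.source)

# If it points at the wrong container, unmount and mount again
dbutils.fs.unmount("/mnt/gl")
dbutils.fs.mount(
    source = "wasbs://<correct-container>@<storageaccount>.blob.core.windows.net",
    mount_point = "/mnt/gl",
    extra_configs = {"fs.azure.sas.<correct-container>.<storageaccount>.blob.core.windows.net": sas_token})

# A simple listing fails fast if the SAS key or container name is still wrong
display(dbutils.fs.ls("/mnt/gl"))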

Related

How to load an AzureML model in Azure Databricks compute?

I am trying to run a DatabricksStep. I have used ServicePrincipalAuthentication to authenticate the run:
from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication

appId = dbutils.secrets.get(<secret-scope-name>, <client-id>)
tenant = dbutils.secrets.get(<secret-scope-name>, <directory-id>)
clientSecret = dbutils.secrets.get(<secret-scope-name>, <client-secret>)
subscription_id = dbutils.secrets.get(<secret-scope-name>, <subscription-id>)
resource_group = <aml-rgp-name>
workspace_name = <aml-ws-name>

svc_pr = ServicePrincipalAuthentication(
    tenant_id=tenant,
    service_principal_id=appId,
    service_principal_password=clientSecret)

ws = Workspace(
    subscription_id=subscription_id,
    resource_group=resource_group,
    workspace_name=workspace_name,
    auth=svc_pr)
The authentication is successful since running the following block of code gives the desired output:
subscription_id = ws.subscription_id
resource_group = ws.resource_group
workspace_name = ws.name
workspace_region = ws.location
print(subscription_id, resource_group, workspace_name, workspace_region, sep='\n')
However, the following block of code gives an error:
import joblib
from azureml.core.model import Model

model_name = <registered-model-name>
model_path = Model.get_model_path(model_name=model_name, _workspace=ws)
loaded_model = joblib.load(model_path)
print('model loaded!')
This is giving an error:
UserErrorException:
Message:
Operation returned an invalid status code 'Forbidden'. The possible reason could be:
1. You are not authorized to access this resource, or directory listing denied.
2. you may not login your azure service, or use other subscription, you can check your
default account by running azure cli commend:
'az account list -o table'.
3. You have multiple objects/login session opened, please close all session and try again.
InnerException None
ErrorResponse
{
  "error": {
    "message": "\nOperation returned an invalid status code 'Forbidden'. The possible reason could be:\n1. You are not authorized to access this resource, or directory listing denied.\n2. you may not login your azure service, or use other subscription, you can check your\ndefault account by running azure cli commend:\n'az account list -o table'.\n3. You have multiple objects/login session opened, please close all session and try again.\n ",
    "code": "UserError"
  }
}
The error is a Forbidden error even though I have authenticated using ServicePrincipalAuthentication.
How can I resolve this error to run inference using an AML-registered model in ADB?
The Databricks workspace needs to be in the same subscription as your AML workspace.
This notebook demonstrates the use of DatabricksStep in an Azure Machine Learning pipeline.
Here is the reference for the Model class's register method.
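A minimal sketch of the loading step, assuming the workspaces are in the same subscription, the service principal has access to the AML workspace, and the model was registered as a single joblib file (the model name below is a placeholder):
import joblib
from azureml.core.model import Model

# ws is the Workspace object built above with ServicePrincipalAuthentication
model_name = "<registered-model-name>"  # placeholder

# Look up the registered model's metadata in the AML workspace
model = Model(workspace=ws, name=model_name)

# Download the model file to the Databricks driver's local disk and load it
model_path = model.download(target_dir=".", exist_ok=True)
loaded_model = joblib.load(model_path)
print("model loaded!")
If this still returns 'Forbidden', the subscription mismatch mentioned above is the usual cause.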

Google Cloud Monitoring Ruby client permission issue

I am following the Ruby code sample to add a custom metric to Stackdriver; however, I keep getting a permission-denied error.
client = Google::Cloud::Monitoring::Metric.new
project_name = Google::Cloud::Monitoring::V3::MetricServiceClient.project_path project_id
descriptor = Google::Api::MetricDescriptor.new(
  type: "custom.googleapis.com/my_metric#{random_suffix}",
  metric_kind: Google::Api::MetricDescriptor::MetricKind::GAUGE,
  value_type: Google::Api::MetricDescriptor::ValueType::DOUBLE,
  description: "This is a simple example of a custom metric."
)
result = client.create_metric_descriptor project_name, descriptor
The error I get is "Google::Gax::PermissionDeniedError (GaxError RPC failed, caused by 7:Permission monitoring.metricDescriptors.create denied (or the resource may not exist).)"
The environment variable GOOGLE_APPLICATION_CREDENTIALS is set, and it works fine for the Google Cloud Storage code below:
storage = Google::Cloud::Storage.new project: project_id
# Make an authenticated API request
storage.buckets.each do |bucket|
  puts bucket.name
end
At this point, I don't know what the problem is. Do I need to set up a different credential for Cloud Monitoring?

Reading a SAS file from blob storage in R

I am trying to read a .sas7bdat file from the default container. I have tried the following so far:
sas_file <- RxSasData("wasbs://container@storageaccount.blob.core.windows.net/abc/xyz.sas7bdat")
sas_df <- rxImport(sas_file)
but I get the following error:
The file 'wasbs://container@storageaccount.blob.core.windows.net/abc/xyz.sas7bdat' does not exist.
Could not open data source.
Error in doTryCatch(return(expr), name, parentenv, handler) :
Could not open data source.
The file exists at the location mentioned in the code, yet it still throws an error. Can someone please help me with this?
According to your code, I think you want to load a SAS data file from HDFS on Azure HDInsight via RxSasData. However, RxSasData does not appear to be supported in the Hadoop environment; please see here.
Please try copying the file to the local filesystem on the HDInsight node, then reading it from there.

Error while connecting to the Elastic MapReduce Ruby client

I am following the steps mentioned on AWS to use an interactive Hive session over SSH.
I used the following resources
https://github.com/ucbtwitter/getting-started/wiki/Using-Elastic-Map-Reduce-via-Command-Line
http://docs.amazonwebservices.com/ElasticMapReduce/latest/GettingStartedGuide/SignUp.html
Initially I was getting the error "Error: Missing key access-id", and then I fixed my JSON file. The JSON file is in the same format as mentioned in the links above.
When I run this command
./elastic-mapreduce
I am getting the following error :-
Error: Unable to parse credentials.json: can't convert String into Integer.
I checked the values required in the JSON on the AWS side as well.
Does anyone have an idea why I am getting this error?
The region value in the credentials.json must be of int type.
{
  ......
  ......
  "region": 1
}

LDAP config on Amazon Linux

I am trying to install OpenLDAP on Amazon Linux and got the following error:
olcRootPW: value #0: <olcRootPW> can only be set when rootdn is under suffix
config error processing olcDatabase={1}monitor,cn=config: <olcRootPW> can only be set when rootdn is under suffix
slaptest: bad configuration file!
I also tried putting the olcRootPW in the olcDatabase={2}bdb.ldif file, but that just gives the same error. Any advice?
The message is quite clear: you can only set a password on the monitor database if the rootDN is under the suffix of that database. In other words, the rootDN has to end with 'cn=monitor,cn=config'.
Try adding it as below in the file olcDatabase={2}hdb.ldif:
olcSuffix: dc=my-domain,dc=com
olcRootDN: cn=manager,dc=my-domain,dc=com
olcRootPW: {SSHA}---password---
