How to list azure Databricks workspaces along with properties like workspaceId? - azure-databricks

My objective is to create a csv file that lists all azure databricks workspaces and in particular has the workspace id.
I have been able to retrieve all details as json using the CLI:
az rest -m get --header "Accept=application/json" -u 'https://management.azure.com/subscriptions/{subscriptionId}/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01' > workspaces.json
How can I retrieve the same information using azure resource graph?

If you prefer to work with the workspace list API that returns JSON, here is one approach for post-processing the data (in my case I ran this from a Jupyter notebook):
import json
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
# json from https://learn.microsoft.com/en-us/rest/api/databricks/workspaces/list-by-subscription?tabs=HTTP&tryIt=true&source=docs#code-try-0
# E.g.
# az rest -m get --header "Accept=application/json" -u 'https://management.azure.com/subscriptions/{subscriptionId}/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01' > workspaces.json
pdf = pd.read_json('./workspaces.json')
# flatten the nested json
pdf_flat = pd.json_normalize(json.loads(pdf.to_json(orient="records")))
# drop columns with name '*.type'
pdf_flat.drop(pdf_flat.columns[pdf_flat.columns.str.endswith('.type')], axis=1, inplace=True)
# drop rows without a workspaceId
pdf_flat = pdf_flat[ ~pdf_flat['value.properties.workspaceId'].isna() ]
# drop unwanted columns
pdf_flat.drop(columns=[
'value.properties.parameters.enableFedRampCertification.value',
'value.properties.parameters.enableNoPublicIp.value',
'value.properties.parameters.natGatewayName.value',
'value.properties.parameters.prepareEncryption.value',
'value.properties.parameters.publicIpName.value',
'value.properties.parameters.relayNamespaceName.value',
'value.properties.parameters.requireInfrastructureEncryption.value',
'value.properties.parameters.resourceTags.value.databricks-environment',
'value.properties.parameters.storageAccountName.value',
'value.properties.parameters.storageAccountSkuName.value',
'value.properties.parameters.vnetAddressPrefix.value',
], inplace=True)
pdf_flat
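Since the stated goal was a CSV file, the flattened frame can be written out with to_csv. A minimal self-contained sketch of the same flatten-and-filter pattern, using a made-up inline sample instead of workspaces.json (the sample ids and values are hypothetical):

```python
import io

import pandas as pd

# Hypothetical sample mimicking the shape of the workspaces list response
sample = {
    "value": [
        {
            "id": "/subscriptions/s1/resourceGroups/rg1/providers/Microsoft.Databricks/workspaces/ws1",
            "name": "ws1",
            "type": "Microsoft.Databricks/workspaces",
            "properties": {"workspaceId": "1234567890123456"},
        },
        {
            "id": "/subscriptions/s1/otherResource",
            "name": "not-a-workspace",
            "type": "Other/type",
            "properties": {},
        },
    ]
}

# json_normalize flattens nested keys into dotted column names
pdf_flat = pd.json_normalize(sample["value"])

# keep only rows that actually have a workspaceId
pdf_flat = pdf_flat[~pdf_flat["properties.workspaceId"].isna()]

# write the CSV the question asked for
buf = io.StringIO()
pdf_flat.to_csv(buf, index=False)
print(buf.getvalue())
```

Replace the buffer with a real path (e.g. pdf_flat.to_csv('workspaces.csv', index=False)) to produce the file.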

I was able to retrieve the information I needed by searching for databricks resources in the Azure portal. From there I could click Open Query to use the Azure Resource Graph Explorer and write a query to extract the information I needed.
I ended up using the following query:
// Run query to see results.
resources
| where type == "microsoft.databricks/workspaces"
| project id, properties.workspaceId, name, tenantId, type, resourceGroup, location, subscriptionId, kind, tags
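The same query can also be run headlessly via the CLI's resource-graph extension (az graph query), which prints JSON with a data array of rows. A hedged sketch of turning such a response into CSV using only the standard library; the response below only mimics the shape of the output, the values are made up:

```python
import csv
import io
import json

# Hypothetical sample of what a resource-graph query returns (shape only, values made up)
response = json.loads("""
{
  "count": 1,
  "data": [
    {"id": "/subscriptions/s1/resourceGroups/rg1/providers/Microsoft.Databricks/workspaces/ws1",
     "name": "ws1",
     "workspaceId": "1234567890123456",
     "location": "westeurope"}
  ]
}
""")

rows = response["data"]
# union of keys across rows, so ragged records still produce a full header
fieldnames = sorted({key for row in rows for key in row})

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```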


How to edit existing Azure storage account CORS Rule through Bash Script

I've been working as a DevOps engineer for the past 3 months, so I am learning to create scripts for automation. But I am stuck here.
I want to edit the CORS rule through a bash script. I have successfully written a script for adding a rule, but every time I run the script it creates a new rule. I want to edit the existing rule instead.
Here's the script line, I am using to add CORS rule.
Add_Rule=$(az storage cors add --account-name testingscriptcors --origins 'http://google321.comh, http://www.google123.com' --methods GET PUT --allowed-headers '*' --exposed-headers '*' --max-age 200 --services b)
I thought that providing --rule 1 in this script would work it out, but it doesn't.
You can add and edit the existing rules using Set-AzStorageCORSRule. You can approach it this way, as I have mentioned here.
Cmdlet to set or add the CORS rules:
$ctx=New-AzStorageContext -StorageAccountName "clouXXXXXXXman" -StorageAccountKey "ExXXXXXXXXXzUgkXi80HHrXXXXXXXXXXXXXXXXXXXXX"
$CorsRules = (@{
    AllowedHeaders=@("x-ms-blob-content-type","x-ms-blob-content-disposition");
    AllowedOrigins=@("*");
    MaxAgeInSeconds=30;
    AllowedMethods=@("Get","Connect")},
    @{
    AllowedOrigins=@("http://www.fabrikam.com","http://www.contoso.com");
    ExposedHeaders=@("x-ms-meta-data*","x-ms-meta-customheader");
    AllowedHeaders=@("x-ms-meta-target*","x-ms-meta-customheader");
    MaxAgeInSeconds=30;
    AllowedMethods=@("Put")})
Set-AzStorageCORSRule -ServiceType Blob -CorsRules $CorsRules -Context $ctx
Once set, you can check the list of rules using the command below:
az storage cors list --account-name cloudXXXXXXn --account-key ExlYLcXXXXXXXXXXXXXXXXXX
Change properties of a CORS rule for blob service
$ctx1=New-AzStorageContext -StorageAccountName "cloXXXXXuman" -StorageAccountKey "ExlYLc5XXXXXXXXXXXXXXXXXXXXXXXXXXX"
$CorsRules = Get-AzStorageCORSRule -ServiceType Blob -Context $ctx1
$CorsRules[0].AllowedHeaders = @("x-ms-blob-content-type", "x-ms-blob-content-disposition")
$CorsRules[0].AllowedMethods = @("Get", "Connect", "Merge")
Set-AzStorageCORSRule -ServiceType Blob -CorsRules $CorsRules -Context $ctx1
You can refer to this MS document for more information.
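The "edit instead of add" idea above boils down to a read-modify-write: fetch the current rule list, change the matching rule, and write the whole list back. A language-agnostic sketch of just that step, with the service's get/set calls left out and the rules represented as plain dicts (the rule values below are hypothetical):

```python
def update_cors_rule(rules, index, **changes):
    """Return a copy of the CORS rule list with rules[index] updated in place.

    rules:   list of dicts, each mimicking one CORS rule
    index:   which rule to edit (picked by position, since the add command has no --rule flag)
    changes: fields to overwrite on that rule
    """
    updated = [dict(r) for r in rules]  # copy so the caller's list is untouched
    updated[index].update(changes)
    return updated

# Hypothetical current rules, as a "get rules" call might return them
current = [
    {"AllowedOrigins": ["http://google321.com"],
     "AllowedMethods": ["GET", "PUT"],
     "MaxAgeInSeconds": 200},
]

# Edit rule 0 instead of appending a new one, then "set" the whole list back
new_rules = update_cors_rule(current, 0, AllowedOrigins=["http://www.google123.com"])
```

The key point is that the set operation replaces the full rule list, which is why editing one rule still means sending all of them back.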

How to list public IPs of all compute instances in OCI?

I need to get the public IPs of all instances of my OCI tenant.
Here i saw a python scripts to do this from OCI Console Cloud Shell : https://medium.com/oracledevs/writing-python-scripts-to-run-from-the-oci-console-cloud-shell-a0be1091384c
But I want to create a bash script, that uses OCI CLI commands to fetch the required data.
How can I achieve this using OCI CLI commands?
The OCI CLI structured-search and query features can be used to fetch the OCIDs of instances, and the instance command can be used to fetch the instance details.
The output is in JSON format by default.
You can use jq to filter the needed data from the output JSON and create an array with it.
(The OCI CLI supports JMESPath queries.)
Here is a snippet from a bash script that uses OCI CLI commands to get the public IPs of all compute instances in the compartment:
Pre-requisites: the OCI CLI should be installed and configured to authenticate with the correct tenant and compartment.
# Fetch the OCID of all the running instances in OCI and store to an array
instance_ocids=$(oci search resource structured-search --query-text "QUERY instance resources where lifeCycleState='RUNNING'" --query 'data.items[*].identifier' --raw-output | jq -r '.[]' )
# Iterate through the array to fetch details of each instance one by one
for val in ${instance_ocids[@]} ; do
echo $val
# Get name of the instance
instance_name=$(oci compute instance get --instance-id $val --raw-output --query 'data."display-name"')
echo $instance_name
# Get Public Ip of the instance
public_ip=$(oci compute instance list-vnics --instance-id $val --raw-output --query 'data[0]."public-ip"')
echo $public_ip
done
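If you prefer to do the extraction in one place rather than chaining --query and jq, the same fields can be pulled out of the raw CLI JSON. A hedged sketch; the payloads below only mimic the shape of the oci compute instance get / list-vnics output, and the values are made up:

```python
import json

# Shape-only samples of the CLI's JSON output (values made up)
instance_json = '{"data": {"display-name": "web-1", "lifecycle-state": "RUNNING"}}'
vnics_json = '{"data": [{"public-ip": "203.0.113.10", "is-primary": true}]}'

def instance_name(raw):
    """Mirror of --query 'data."display-name"'."""
    return json.loads(raw)["data"]["display-name"]

def primary_public_ip(raw):
    """Mirror of --query 'data[0]."public-ip"'; the primary VNIC is normally first."""
    vnics = json.loads(raw)["data"]
    return vnics[0].get("public-ip") if vnics else None

print(instance_name(instance_json), primary_public_ip(vnics_json))  # → web-1 203.0.113.10
```

Returning None for an empty VNIC list also covers instances that have no public IP assigned.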
References :
https://blogs.oracle.com/cloud-infrastructure/post/exploring-the-search-and-query-features-of-oracle-cloud-infrastructure-command-line-interface
https://docs.oracle.com/en-us/iaas/tools/oci-cli/2.9.9/oci_cli_docs/cmdref/compute/instance.html

ADO CLI can't remove tags

I'm using the Azure CLI with the az boards work-item update command (docs are here). This is part of a larger system that reads the tags on a ticket (amongst other things) and then removes the Ready tag from that list and tries to set the tags back to remove it.
az boards work-item update --organization $ORG --output json --id 12345 --fields System.Tags=Android
When updating the tags field using --fields System.Tags=Android argument, this used to replace existing tags with the tag specified e.g. if a ticket had the Android and Ready tags, this would remove the Ready tag. However recently this seems to only be able to add tags and not remove them.
I've tried various other properties and formats, but nothing seems to work. Does anyone know how I can replace the tags on the ticket with the ones I'm specifying using the CLI?
EDIT: ADO community ticket raised here
I can confirm that the az boards work-item update command is no longer removing/replacing tags with the ones that are passed, and this certainly is a bug. Please report it on Developer Community so it can be fixed.
Meanwhile, as @Fairy Xu mentioned, this behavior leaves you with the only option of making a REST call to update the work item. However, you do not have to swap your entire current setup to REST to work around the issue. The same REST call can be made via the Azure CLI as well, using the az rest command!
Here is how it can be achieved:
# Get current tags on the work item; Sample response: Tag1; Tag2; Tag3
$Tags = az boards work-item show --id 456 --query 'fields.\"System.Tags\"' -o tsv
# Sample headers.json
# {
# "Authorization":"Basic OmZpYWxreG9xYnBwiZ2IyeDRyZm90d3psNmE=",
# "Content-Type":"application/json-patch+json"
# }
#
# Sample body.json
# [
# {
# "op": "replace",
# "path": "/fields/System.Tags",
# "value": "Tag1; Tag3"
# }
# ]
# Use az rest command to make the Update work item REST call
# In the response you'd see the System.Tags field showing only Tag1; Tag3
az rest --method patch --url https://dev.azure.com/{organization}/{project}/_apis/wit/workitems/456?api-version=5.1 --headers '@headers.json' --body '@body.json'
EDIT:
You can deal with authentication in two ways:
Using AAD Bearer access token
It appears that one can use 499b84ac-1321-427f-aa17-267ca6975798 as the value of --resource to call az rest as follows:
az rest --url 'https://dev.azure.com/{organization}/{project}/_apis/wit/workitems/456?api-version=5.1' --resource 499b84ac-1321-427f-aa17-267ca6975798
Using Basic Authentication with a PAT
For populating the Authorization header, you first have to generate a Personal Access Token (PAT) for your organization with the appropriate scope. Once you have it, you have to convert it to a Base64 string as follows:
$Username=""
$Password="<PAT>"
$Token = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(("{0}:{1}" -f $Username,$Password)))
Also, be sure to include and set Content-Type to application/json-patch+json as one of the headers as az rest defaults it to application/json.
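The headers.json and body.json files above can also be generated programmatically. A hedged Python sketch of the same Base64 PAT encoding and JSON-patch body (the PAT value is a placeholder, and the file names just match the ones used in the az rest call):

```python
import base64
import json

pat = "<PAT>"  # placeholder personal access token

# ADO Basic auth uses an empty username, i.e. ":<PAT>" Base64-encoded
auth = base64.b64encode(f":{pat}".encode("ascii")).decode("ascii")

headers = {
    "Authorization": f"Basic {auth}",
    # az rest defaults to application/json; work-item updates need json-patch
    "Content-Type": "application/json-patch+json",
}
body = [{"op": "replace", "path": "/fields/System.Tags", "value": "Tag1; Tag3"}]

# write the two files the az rest call reads via '@headers.json' / '@body.json'
with open("headers.json", "w") as f:
    json.dump(headers, f)
with open("body.json", "w") as f:
    json.dump(body, f)
```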
References:
az rest - Azure CLI
Work Items - Update REST API
When using PowerShell, pay attention to quoting rules
az curl or az rest for custom REST support (coverage) #7618
As far as I know, there is currently no method to delete a specific work item tag with the REST API or Azure CLI.
As a workaround, we need to use the Azure CLI / REST API to get the work item tags list first. Then we can modify the tags field and update the field.
Here is my PowerShell example to run the REST APIs.
$token = "PAT"
$url="https://dev.azure.com/{organization}/{project}/_apis/wit/workitems/{workitemid}?api-version=6.0"
$token = [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes(":$($token)"))
$response = Invoke-RestMethod -Uri $url -Headers @{Authorization = "Basic $token"} -Method Get -ContentType application/json
$tags = $response.fields.'System.Tags'
echo $tags
$New = $tags -replace "tagname", ""
echo $new
$url1 ="https://dev.azure.com/{organization}/{project}/_apis/wit/workitems/{workitemid}?api-version=6.0"
$body = "[
{
`"From`" : null,
`"op`": `"replace`",
`"path`": `"/fields/System.Tags`",
`"value`" : `"$new`"
}
]"
$response = Invoke-RestMethod -Uri $url1 -Headers @{Authorization = "Basic $token"} -Method PATCH -Body $body -ContentType application/json-patch+json
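Note that a plain string replace of the tag name can leave stray separators behind (e.g. "Tag1; ; Tag3"). A small sketch of a cleaner removal that splits the ADO tag string on its "; " separator first:

```python
def remove_tag(tags, tag):
    """Remove one tag from an ADO 'Tag1; Tag2; Tag3' string, keeping separators clean."""
    parts = [t.strip() for t in tags.split(";") if t.strip()]
    return "; ".join(t for t in parts if t != tag)

print(remove_tag("Android; Ready; Bug", "Ready"))  # → Android; Bug
```

The result can then be dropped into the "value" of the json-patch body as-is.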
Here is the docs about Rest APIs: Work Items - Get Work Item and Work Items - Update

How to bulk load data into a dgraph/standalone:graphql container?

Assuming I have a db like the quick-start of https://graphql.dgraph.io/docs/quick-start/
i.e.
type Product {
    productID: ID!
    name: String @search(by: [term])
    reviews: [Review] @hasInverse(field: about)
}
type Customer {
    custID: ID!
    name: String @search(by: [hash, regexp])
    reviews: [Review] @hasInverse(field: by)
}
type Review {
    id: ID!
    about: Product! @hasInverse(field: reviews)
    by: Customer! @hasInverse(field: reviews)
    comment: String @search(by: [fulltext])
    rating: Int @search
}
Now I would like to import millions of entries and would therefore like to use the bulk loader. My dataset is a big folder full of .json files.
From what I've seen, I should be able to run a command like
dgraph bulk -f folderOfJsonFiles -s goldendata.schema --map_shards=4 --reduce_shards=2 --http localhost:8000 --zero=localhost:5080
But to run my server, I am using the dgraph/standalone:graphql image, started with docker run -v $(pwd):/dgraph -p 9000:9000 -it dgraph/standalone:graphql
Now how do I start the bulk import?
1: Should I run the command within the docker container itself (and share the volume (folder) containing all my .json files), or install dgraph on my host and run the dgraph bulk command from the host?
2: What should be the format of the .json files?
3: Would the bulk loader support blank nodes (ids which are not _:0x1234)?
[edit]
The bulk loader seems not to support a GraphQL schema; the schema should be converted to RDF first. To achieve this, I exported the schema and data right after importing the GraphQL schema: curl 'localhost:8080/admin/export?format=json'
Here are a few things to understand:
the bulk loader is not an offline version of the live loader. It is a tool whose purpose is to prepare the data for the Dgraph Alpha server(s).
the bulk loader seems to be only able to load triples
the bulk loader can load a schema and files, but this is not the GraphQL schema; the GraphQL schema must be loaded separately later.
So to answer the question:
Start the dgraph graphql server using docker run -v $(pwd)/dgraph:/dgraph -p 8000:8000 -p 9000:9000 -p 8080:8080 -p 9080:9080 -p 5080:5080 -it dgraph/standalone:graphql. For your information, this image launches the /tmp/run.sh script, which will itself run dgraph-ratel & dgraph zero & dgraph alpha --lru_mb $lru_mb & dgraph graphql (where lru_mb is the memory you give to Dgraph Alpha). Keep the container's id for later; find it using docker ps if you lose it.
Unless you have 5+ million entries (or no time), try using the live loader. If you have trouble with the live loader, e.g. it becomes very slow after a few hundred thousand entries (300k in my case), this is very likely because your Alpha does not have sufficient memory. In my case, I had to tune docker to provide 16Gb of memory to the engine; the script gives the $lru_mb variable a third of the host memory.
Once you have imported your full set of data using the live loader, you can export the data using docker exec -it yourDockerContainerId curl localhost:8080/admin/export?format=json. The export will generate 2 files, for instance g01.json.gz and g01.schema.gz, which correspond to your entries and their schema (which is not the GraphQL schema).
To import those 2 files g01.json.gz and g01.schema.gz back into your dgraph graphql instance, you need to convert them to a group's "p" directory output. To what I understood, the "p" directory holds all the data for the Dgraph Alpha. If you delete it, you lose your data; if you replace it with another set, you will replace / restore the data with the one you just copied. The bulk loader is not an instance of dgraph; it is only the tool which generates those "p" directory outputs. I have been successful running it within the container: just run docker exec -it yourDockerContainerId dgraph bulk -f export/pathTo/g01.json.gz -s export/pathTo/g01.schema.gz --map_shards=1 --reduce_shards=1 --http localhost:8001 --zero=localhost:5080. (I will be honest, I do not understand the purpose of the --http localhost:8001 argument in this command.) If the bulk loader ran successfully, it created an out/0/p folder containing the data you can use in your Dgraph Alpha. Stop your docker container with docker stop yourDockerContainerId, then replace your current Dgraph Alpha's p folder with the one generated by the bulk loader. (Re)start your docker container and you should have your imported data. (Perhaps trash the w and zw folders as well; I have no clue about their use.)
The data is imported, but you will get a warning saying something like "there is no graphql schema". Okay, let's import our schema (assuming you have it at path dgraph/schemas/schema.graphql): schema=$(cat dgraph/schemas/schema.graphql | tr '\n' ' '); jq -n --arg schema "$schema" '{ query: "mutation addSchema($sch: String!) { addSchema(input: { schema: $sch }) { schema { schema } } }", variables: { sch: $schema }}' | curl -X POST -H "Content-Type: application/json" http://localhost:9000/admin -d @- This might take a few minutes, as dgraph will likely have to index your data according to your GraphQL schema's indexing rules (typically related to the @search directive).
You're done…
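The jq pipeline used above to upload the GraphQL schema just builds a small JSON document for the /admin endpoint. If you find the shell quoting fragile, the same payload can be built in Python and posted with any HTTP client; a sketch (the mutation text mirrors the one in the jq command, the sample schema is a stub):

```python
import json

def add_schema_payload(schema_text):
    """Build the body POSTed to Dgraph's /admin endpoint to install a GraphQL schema."""
    mutation = (
        "mutation addSchema($sch: String!) "
        "{ addSchema(input: { schema: $sch }) { schema { schema } } }"
    )
    # newlines are flattened, like the `tr '\n' ' '` step in the shell pipeline
    return json.dumps({"query": mutation,
                       "variables": {"sch": schema_text.replace("\n", " ")}})

payload = add_schema_payload("type Product {\n productID: ID!\n}")
print(payload)
```

POST the payload string to http://localhost:9000/admin with a Content-Type of application/json, exactly as the curl command does.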
Now, I am still not completely answering the question, because the data we are importing back is the data we just exported (and the data we actually imported using the live loader). So unfortunately, the bulk loader cannot import nice data like the live loader; you have to feed it triples. Therefore you have to prepare the data to load with the bulk loader in that format. To help you with this task, I suggest to:
Run the dgraph graphql server: docker run -v $(pwd)/dgraph:/dgraph -p 8000:8000 -p 9000:9000 -p 8080:8080 -p 9080:9080 -p 5080:5080 -it dgraph/standalone:graphql
Import a GraphQL schema (assuming the schema is at path dgraph/schemas/schema.graphql): schema=$(cat dgraph/schemas/schema.graphql | tr '\n' ' '); jq -n --arg schema "$schema" '{ query: "mutation addSchema($sch: String!) { addSchema(input: { schema: $sch }) { schema { schema } } }", variables: { sch: $schema }}' | curl -X POST -H "Content-Type: application/json" http://localhost:9000/admin -d @-
Create one or two basic / template entries using a GraphQL client. You can install the Altair chrome extension, connect to http://localhost:9000/graphql, then add some data, something like:
mutation {
addCustomer(input:{name:"Toto"}){
name
}
}
You can also use a file and the live loader.
Then export your small template data: docker exec -it yourDockerContainerId curl localhost:8080/admin/export?format=json
Open g01.json.gz and you will find an example of the data the bulk loader expects to be fed with.
What about blank ids? I am not sure, but as the bulk loader does a 2-level mapping on ids, I imagine you can provide your own ids and they will be converted to dgraph ids later.

Error creating DB in bigcouch with database_dir moved

I have a cluster setup and I've moved the data dir from /opt/bigcouch/var/lib to /bigcouch
I changed the following lines in /opt/bigcouch/etc/default.ini
database_dir = /bigcouch
view_index_dir = /bigcouch
I'm having an issue where, if I try to create a new DB, it returns JSON saying the DB was created, but the actual DB is not created.
I have the owner of the dir set to bigcouch
If I create the DB
curl -X PUT localhost:5984/testDB
and then
curl localhost:5986/dbs/_all_docs
I get zero records back
If you edited the .ini file by hand, you'll need to restart bigcouch. In general I suggest using the _config endpoint instead.
