Download and store GRIB2 files in Azure - azure-blob-storage

Currently, on my Linux Docker container, I have a bash script that downloads a large number of GRIB2 weather forecast files from a specific URL, using a cookie-based login.
Once those files are downloaded, I use an executable from the ecCodes library, installed in the same Docker container, to filter out the unneeded data and reduce the file size.
My company has access to the Azure platform, and I would like to download and filter those GRIB2 files directly in Azure, so that I no longer have to run the script manually and then upload the resulting files to Azure storage.
However, I have never worked with Azure before, so what I would like to know is:
would it be possible to run this script in, say, an Azure VM that downloads the filtered GRIB2 files and stores them directly in Azure storage (Blob storage seems to be the best option based on what I've read so far)?
Thanks!
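
For context, here is a minimal sketch of the download-and-filter step (the URL, the cookie file, and the GRIB key used for filtering are placeholders; grib_copy is just one of the ecCodes command-line tools that can subset a file):

# Download one forecast file using the stored login cookie (placeholder URL and cookie file)
curl --cookie cookies.txt -o gfs.0p25.2019061300.f006.grib2 "https://example.com/path/to/gfs.0p25.2019061300.f006.grib2"
# Keep only the fields that are needed (example: 2 m temperature) to shrink the file
grib_copy -w shortName=2t gfs.0p25.2019061300.f006.grib2 gfs.0p25.2019061300.f006.filtered.grib2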

#!/usr/bin/env bash
export AZURE_STORAGE_ACCOUNT=your_azure_storage_account
export AZURE_STORAGE_ACCESS_KEY=your_azure_storage_access_key

# Retrieve the current date to upload only new files
date=$(date +%Y-%m-%dT%H:%MZ)

az login -u xxx@yyy.com -p password --output none

containerName=your_container_name
containerExists=$(az storage container exists --account-name $AZURE_STORAGE_ACCOUNT --account-key $AZURE_STORAGE_ACCESS_KEY --name $containerName --output tsv)
if [[ $containerExists == "False" ]]; then
    az storage container create --name $containerName  # Create the container
fi

# Upload GRIB2 files to the container
fileExists=$(az storage blob exists --account-name $AZURE_STORAGE_ACCOUNT --account-key $AZURE_STORAGE_ACCESS_KEY --container-name $containerName --name "gfs.0p25.2019061300.f006.grib2" --output tsv)
if [[ $fileExists == "False" ]]; then
    az storage blob upload --container-name $containerName --file ../Resources/Weather/Historical_Data/gfs.0p25.2019061300.f006.grib2 --name gfs.0p25.2019061300.f006.grib2
fi
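
On an Azure VM the script above can simply be scheduled, and newer Azure CLI versions also offer az storage blob upload-batch to push a whole directory of filtered files in one call; a small sketch (the script path and log file below are placeholders):

# Example crontab entry: run the download/filter/upload script every 6 hours (placeholder path)
# 0 */6 * * * /home/azureuser/scripts/fetch_and_upload_grib.sh >> /var/log/grib_fetch.log 2>&1

# Upload every filtered GRIB2 file from a local directory in a single call
az storage blob upload-batch --destination $containerName --source ../Resources/Weather/Historical_Data --pattern "*.grib2"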

Related

SAS URI gets broken when trying to upload the VHD to a blob container in a storage account

I am using a bash script to upload a VHD to a blob container in a storage account.
Steps:
I created a VM, a storage account, a container with blob access, and a snapshot. From the VM, I generate a SAS URI by taking a snapshot of the VM disk.
After the SAS URI is generated, I try to upload it to the storage account container using the CLI command below:
sas=$(az snapshot grant-access --resource-group $resourceGroupName --name $snapshotName --duration-in-seconds $sasExpiryDuration --query [accessSas] -o tsv)
I verified the value of $sas in the terminal and it prints the correct value.
But when I try the command below:
az storage blob copy start --destination-blob $destinationVHDFileName --destination-container $storageContainerName --account-name $accountname --account-key $key --source-uri $sas
The SAS URI is a very long string, and wherever there is an '&' in it the string after it gets split off.
Let's say the string is like `abc&sr63663&si74883&sig74848`
I'm getting the errors:
'sr' is not recognized as an internal or external command.
'si' is not recognized as an internal or external command.
'sig' is not recognized as an internal or external command.
Please help me with how I can pass the SAS URI properly in the last command through the bash script.
I tried to reproduce the same issue in my environment and got the same error.
To resolve it, try wrapping the SAS token in double quotes nested inside single quotes, like below:
sastoken='"your_sas_token"'
az storage blob copy start --destination-blob $destinationVHDFileName --destination-container $storageContainerName --account-name $accountname --account-key $key --source-uri $sastoken
After executing the above script, the VHD file is uploaded to the blob container successfully.
To confirm, go to the Azure Portal and check the blob container.
Reference:
How to provide a SAS token to Azure CLI by Jon Tirjan
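
As an alternative to the nested-quote trick, in a plain bash script it is usually enough to keep the SAS URI double-quoted wherever it is expanded, so special characters such as ? and & reach the az command intact; a sketch reusing the variable names from the question (whether this alone is sufficient depends on which shell actually runs the script):

# Capture the SAS URI and keep it quoted at every expansion
sas=$(az snapshot grant-access --resource-group "$resourceGroupName" --name "$snapshotName" --duration-in-seconds "$sasExpiryDuration" --query accessSas -o tsv)
az storage blob copy start --destination-blob "$destinationVHDFileName" --destination-container "$storageContainerName" --account-name "$accountname" --account-key "$key" --source-uri "$sas"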

Uploading file to S3 bucket

I am trying to upload a file from my local machine to an S3 bucket, but I am getting the error "The user-provided path ~Downloads/index.png does not exist."
aws s3 cp ~Downloads/index.png s3://asdfbucketasdf/Temp/index_temp.png
A file named index.png does exist in my Downloads folder.
This answer might be helpful to users who are new to the AWS CLI on different platforms.
If you are on Linux or Linux-like systems, you can type:
aws s3 cp ~/Downloads/index.png s3://asdfbucketasdf/Temp/index_temp.png
Note that ~Downloads refers to the home directory of a user named Downloads. What you want is ~/Downloads, which means the Downloads directory under the current user's home directory.
You can type out your path fully like so (assuming your home directory was /home/matt):
aws s3 cp /home/matt/Downloads/index.png s3://asdfbucketasdf/Temp/index_temp.png
If you are on Windows, you can type:
aws s3 cp C:\Users\matt\Downloads\index.png s3://asdfbucketasdf/Temp/index_temp.png
or you can use the Windows equivalent of ~:
aws s3 cp %USERPROFILE%\Downloads\index.png s3://asdfbucketasdf/Temp/index_temp.png
If you are using Windows and AWS CLI version 2:
aws s3 cp "helloworld.txt" s3://testbucket

Fails to create azure container right after storage account was created

I am trying to create a storage account from Terraform, and to use one of its access keys to create a blob container.
My Terraform configuration is driven from a bash file, so some of the most important steps are the following:
customer_prefix=4pd
customer_environment=dev
RESOURCE_GROUP_NAME=$customer_prefix
az group create --name $RESOURCE_GROUP_NAME --location westeurope
# Creating Storage account
STORAGE_ACCOUNT_NAME=4pdterraformstates
az storage account create --resource-group $RESOURCE_GROUP_NAME --name $STORAGE_ACCOUNT_NAME --sku Standard_LRS --encryption-services blob
# We are getting the storage account key to access to it when we need to store the terraform .tf production state file
ACCOUNT_KEY=$(az storage account keys list --resource-group $RESOURCE_GROUP_NAME --account-name $STORAGE_ACCOUNT_NAME --query [0].value -o tsv)
# Creating a blob container
CONTAINER_NAME=4pd-tfstate
az storage container create --name $CONTAINER_NAME --account-name $STORAGE_ACCOUNT_NAME --account-key $ACCOUNT_KEY
source ../terraform_apply_template.sh
I am running those az CLI commands from a .bash file, where I also export other TF_VAR_azurerm ... variables.
Finally, when I execute the first bash file, it calls terraform_apply_template.sh, which creates the plan and applies it. It is the following:
#!/bin/bash
# Terminate script execution after the first failed command (non-zero exit code)
# and treat unset variables as errors.
set -ue

terraform init -backend-config=$statefile -backend-config=$storage_access_key -backend-config=$storage_account_name

# TF:
function tf_plan {
    terraform plan -out=$outfile
}

case "${1-}" in
    apply)
        tf_plan && terraform apply $outfile
        ;;
    apply-saved)
        if [ ! -f $outfile ]; then tf_plan; fi
        terraform apply $outfile
        ;;
    *)
        tf_plan
        echo ""
        echo "Not applying changes. Call one of the following to apply changes:"
        echo "  - '$0 apply': prepares and applies a new plan"
        echo "  - '$0 apply-saved': applies the saved plan ($outfile)"
        ;;
esac
The azurerm backend is initialized, the storage account named 4pdterraformstates is created, and so is the 4pd-tfstate blob container,
but in practice this does not seem to take effect, and I get the following output:
Initializing the backend...
Successfully configured the backend "azurerm"! Terraform will automatically
use this backend unless the backend configuration changes.
Error: Failed to get existing workspaces: storage: service returned error: StatusCode=404, ErrorCode=ContainerNotFound, ErrorMessage=The specified container does not exist.
RequestId:2db5df4e-f01e-014c-369d-272246000000
Time:2019-06-20T19:21:01.6931098Z, RequestInitiated=Thu, 20 Jun 2019 19:21:01 GMT, RequestId=2db5df4e-f01e-014c-369d-272246000000, API Version=2016-05-31, QueryParameterName=, QueryParameterValue=
Looking for similar behavior, I found this issue in the azurerm provider Terraform repository, and also, according to this issue created directly in the Terraform repository, it looks like a network/operational error ...
But the strange thing is that it was already fixed ..
I am using Terraform v0.12:
⟩ terraform version
Terraform v0.12.2
According to Gaurav Mantri's answer below, it is necessary to wait until the storage account is provisioned before continuing with the other tasks related to the storage account itself.
How can I include a wait after the storage account creation?
It seems that just using the bash sleep command, or pausing the shell with the read command for some time, is not enough.
Creation of a storage account is an asynchronous process. When you execute az storage account create to create a storage account, the request is sent to Azure and you get an accepted response back (if everything went well).
The whole process of creating (provisioning) a storage account takes some time (at most a minute, in my experience), and until the storage account is provisioned, no operations are allowed on it.
This is the reason you're getting the error: you're trying to perform an operation on the storage account immediately after sending the creation request.
Not sure exactly how you would script it, but what you need to do is wait for the storage account to be provisioned. You can get the properties of the storage account periodically (say, once every 5 seconds) and check its provisioning state. Once the provisioning state is Succeeded, you can try to create the container; at that point your request should succeed.
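
A minimal polling sketch along those lines, assuming the az CLI and the variable names from the question (provisioningState is reported by az storage account show and reaches Succeeded when provisioning is done):

# Poll every 5 seconds until the storage account reports provisioningState == Succeeded
until [ "$(az storage account show --resource-group $RESOURCE_GROUP_NAME --name $STORAGE_ACCOUNT_NAME --query provisioningState -o tsv)" = "Succeeded" ]; do
    echo "Waiting for storage account $STORAGE_ACCOUNT_NAME to be provisioned..."
    sleep 5
done
# Only now create the blob container
az storage container create --name $CONTAINER_NAME --account-name $STORAGE_ACCOUNT_NAME --account-key $ACCOUNT_KEY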

Need to get the filename from a blob storage in azure using shell script

I am trying to fetch the name of a file in a blob storage container so I can use it further in my script. I tried using az storage blob list to list the blobs present there, but was unsuccessful.
Here's the command that I used:
az storage blob list --connection-string connstr --container-name "vinny/input/"
It threw the error: The requested URI does not represent any resource on the server. ErrorCode: InvalidUri
It seems the command only accepts the container name, not a folder inside it. But when I tried:
az storage blob list --connection-string connstr --container-name "vinny"
It doesn't list the file but keeps on executing.
I need to get the filename that's inside vinny/input/
Anyone got any solution for it?
I just added a --prefix option to it and was able to list the file the way I wanted. Here it goes:
az storage blob list --connection-string connstr --container-name "vinny" --prefix "Input/" --output table
az storage blob list -c container_name --account-name storage_account_name --output table --num-results "*"
Then you can parse the output with awk, cut, etc.
Good luck!
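
If you only need the blob names (rather than the whole table), a JMESPath --query can be combined with --prefix; a small sketch assuming the same container and connection string as above:

# List only blob names under vinny/input/ ([].name keeps just the name field of each blob)
az storage blob list --connection-string connstr --container-name "vinny" --prefix "input/" --query "[].name" --output tsv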

Azure CLI : bash script to upload files in parallel

Is there an Azure CLI option to upload files to blob storage in parallel? There is a folder with lots of files. Currently the only option I have is to run the command below in a for loop, and the upload is sequential.
az storage blob upload --file $f --container-name $CONTAINERNAME --name $FILEINFO
For now, it is not possible: with Azure CLI 2.0 there is no option or argument to upload the contents of a specified directory to Blob storage recursively, so Azure CLI 2.0 does not support uploading files in parallel.
If you want to upload multiple files in parallel, you could use AzCopy.
AzCopy /Source:C:\myfolder /Dest:https://myaccount.blob.core.windows.net/mycontainer /DestKey:key /S
Specifying the /S option uploads the contents of the specified directory to Blob storage recursively, meaning that all subfolders and their files will be uploaded as well.
As you mentioned, you could use a loop to upload the files, but that does not upload them in parallel. Try the following script.
export AZURE_STORAGE_ACCOUNT='PUT_YOUR_STORAGE_ACCOUNT_HERE'
export AZURE_STORAGE_ACCESS_KEY='PUT_YOUR_ACCESS_KEY_HERE'
export container_name='nyc-tlc-sf'
export source_folder='/Volumes/MacintoshDisk02/Data/Misc/NYC_TLC/yellow/2012/*'
export destination_folder='yellow/2012/'
#echo "Creating container..."
#azure storage container create $container_name
for f in $source_folder
do
    echo "Uploading $f file..."
    azure storage blob upload $f $container_name $destination_folder$(basename $f)
done
echo "List all blobs in container..."
azure storage blob list $container_name
echo "Completed"
