AppAssure & Azure Blob storage - azure-blob-storage

I'm trying to use AppAssure with Azure Blob storage; however, Blob storage does not work with AppAssure, and if I want to get ZRS replication, only Blob storage is available.
Is there a way to configure it to work?
Thanks,

You probably need to check with Dell to confirm whether AppAssure supports Azure Blob Storage as a backup destination.
You can also look at the Azure Backup service to see if it fits your replication requirements:
https://azure.microsoft.com/en-us/services/backup/

Related

How to move and copy files/read files from Azure Blob into Databricks, transform the files, and send them to a target blob container

I am a beginner with Azure Databricks. I wanted to know how we can copy and read files from an Azure Blob source container into Databricks, transform them as needed, and send them back to a target container in Blob storage.
Can someone provide Python code here?
It's not recommended to copy files to DBFS. I would suggest you mount the Blob storage account; then you can read/write files to the storage account.
You can mount a Blob storage container or a folder inside a container to Databricks File System (DBFS). The mount is a pointer to a Blob storage container, so the data is never synced locally.
Reference: https://learn.microsoft.com/en-us/azure/databricks/data/data-sources/azure/azure-storage
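For illustration, a rough Python sketch of what that mount could look like; the container name, storage account name, and secret scope/key names below are placeholders, not values from the question:

# Mount a Blob storage container to DBFS (all names are placeholders).
dbutils.fs.mount(
    source="wasbs://my-container@mystorageaccount.blob.core.windows.net",
    mount_point="/mnt/my-container",
    extra_configs={
        "fs.azure.account.key.mystorageaccount.blob.core.windows.net":
            dbutils.secrets.get(scope="my-scope", key="storage-account-key")
    }
)

# Once mounted, read from a source path and write the transformed result
# back to a target path through the mount point.
df = spark.read.csv("/mnt/my-container/source/", header=True)
df_transformed = df.dropna()  # stand-in for whatever transformation you need
df_transformed.write.mode("overwrite").parquet("/mnt/my-container/target/")

In practice you would mount the source and target containers separately if they are different containers.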
Do not store any production data in default DBFS folders.
Reference: Azure Databricks - best practices.

Azure Databricks cluster local storage maximum size

I have a Databricks cluster on Azure, and there is local storage under /mnt, /tmp, /user, etc.
May I know whether there is any size limitation for each of these folders?
And how long is the data retained?
Databricks File System (DBFS) is a distributed file system mounted into an Azure Databricks workspace and available on Azure Databricks clusters. DBFS is an abstraction on top of scalable object storage, i.e. ADLS Gen2.
There is no restriction on the amount of data you can store in Azure Data Lake Storage Gen2.
Note: Azure Data Lake Storage Gen2 is able to store and serve many exabytes of data.
For the Azure Databricks File System (DBFS) - only files less than 2 GB in size are supported.
Note: If you use local file I/O APIs to read or write files larger than 2 GB, you might see corrupted files. Instead, access files larger than 2 GB using the DBFS CLI, dbutils.fs, or Spark APIs, or use the /dbfs/ml folder.
For Azure Storage - the maximum storage account capacity is 5 PiB.
For more details, refer to What is the Data size limit of DBFS in Azure Databricks.
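As a small illustration of the 2 GB note above (the paths are placeholders), list a DBFS directory with dbutils.fs and read a large dataset with Spark APIs instead of local file I/O:

# List a DBFS directory with dbutils.fs (works regardless of object size).
display(dbutils.fs.ls("dbfs:/mnt/my-container/large-files/"))

# Read a multi-gigabyte dataset with Spark APIs rather than local /dbfs file I/O.
df = spark.read.parquet("dbfs:/mnt/my-container/large-files/events.parquet")
print(df.count())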

Access azure file from hadoop

I am able to access Azure Storage blobs from Hadoop by using the following command:
wasb[s]://#.blob.core.windows.net/
But I am not able to access Azure Files. Can anyone suggest how to access Azure Storage files from Hadoop, just like blobs?
HDInsight can use a blob container in Azure Storage as the default file system for the cluster. Through a Hadoop distributed file system (HDFS) interface, the full set of components in HDInsight can operate directly on structured or unstructured data stored as blobs.
From the official document, HDInsight only supports Azure Blob Storage; File Storage is not currently supported.
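For reference, Blob access over wasb[s] (which is supported) can also be exercised from PySpark on top of the same hadoop-azure (WASB) driver; a minimal sketch, assuming that driver is on the classpath, with a placeholder account, container, and key:

# Put the storage account key on the Hadoop configuration (placeholders).
spark.sparkContext._jsc.hadoopConfiguration().set(
    "fs.azure.account.key.mystorageaccount.blob.core.windows.net",
    "<storage-account-access-key>")

# Read from the Blob container like any other Hadoop-compatible path.
logs = spark.read.text("wasbs://my-container@mystorageaccount.blob.core.windows.net/logs/")
logs.show(5, truncate=False)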

What does Azure Recovery Services actually backup?

I have some questions about Azure Recovery Services that I can't find answers to on the Azure website:
If I have a Windows VM with SQL and IIS installed and a network drive (Azure File service account), what will actually be backed up? Do all files from all drives get backed up?
Is it possible to download the backed-up files, or at least see where they live?
Can you set your own storage account for Azure Recovery Services?
Does Site Replication have a purpose for Azure VMs, or only for on-premises servers? I can't really figure out what site replication does.
How do I delete a backup after I have created it? The delete-backup button always seems disabled.
What happens when I do a restore? Does it basically just write back a copy of the VHD to my storage account and reboot the VM?
If I have a Windows VM with SQL and IIS installed and a network drive (Azure File service account), what will actually be backed up? Do all files from all drives get backed up?
Based on my experience, Azure Recovery Services backs up your VM's data disks (no more than 16), storage, and system environment at the scheduled time; the backup extension takes a point-in-time snapshot and transfers that data into the backup vault. Please refer to this document.
Is it possible to download the backed-up files, or at least see where they live?
You can check their status under Backup Items in the Azure portal.
Can you set your own storage account for Azure Recovery Services?
No. So far, the backup items are stored in storage, but the data is encrypted.
See the comment post on this document.
Does Site Replication have a purpose for Azure VMs, or only for on-premises servers? I can't really figure out what site replication does.
How do I delete a backup after I have created it? The delete-backup button always seems disabled.
From your description, I think you may need to understand the concepts of Azure Site Recovery and the Azure Backup service. Please refer to these documents for more details: site-recovery-overview and Azure Backup. Then you can follow the documents to manage your backup items; you can also delete them.
What happens when I do a restore? Does it basically just write back a copy of the VHD to my storage account and reboot the VM?
The data is retrieved from the Azure Recovery Services vault.
Please refer to this document about how to restore the backup item to the same server or to other servers.

Downloading files from Google Cloud Storage straight into HDFS and Hive tables

I'm working on Windows command line as problems with Unix and firewalls prevent gsutil from working. I can read my Google Cloud Storage files and copy them over to other buckets (which I don't need to do). What I'm wondering is how to download them directly into HDFS (which I'm 'ssh'ing into)? Has anyone done this? Ideally this is part one, part two is creating Hive tables for the Google Cloud Storage data so we can use HiveQL and Pig.
You can use the Google Cloud Storage connector, which provides an HDFS-API-compatible interface to your data already in Google Cloud Storage, so you don't even need to copy it anywhere; just read from and write directly to your Google Cloud Storage buckets/objects.
Once you set up the connector, you can also copy data between HDFS and Google Cloud Storage with the hdfs tool, if necessary.
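For the Hive part of the question, once the connector makes gs:// paths resolvable, one option is to define an external table directly over the bucket; a rough sketch via Spark SQL, where the bucket, table name, and schema are placeholders:

from pyspark.sql import SparkSession

# Assumes the GCS connector is installed (so gs:// paths resolve) and Hive support is enabled.
spark = SparkSession.builder.appName("gcs-to-hive").enableHiveSupport().getOrCreate()

# Expose the Cloud Storage data to Hive without copying it into HDFS first.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS raw_events (
        event_id   STRING,
        event_time TIMESTAMP,
        payload    STRING
    )
    STORED AS PARQUET
    LOCATION 'gs://my-bucket/events/'
""")

# The table is now queryable with HiveQL / Spark SQL.
spark.sql("SELECT COUNT(*) FROM raw_events").show()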
