I have a large array with around 1 million data objects. I have found lots of samples for uploading files to Azure Blob Storage, and I guess it could be done with a memory stream, but I haven't found samples for doing it from objects. With data of that size, I am not sure whether I should write line by line or what other options I have. All input is welcome; some samples would be perfect. The goal is to write the data objects to a CSV file in Azure Blob Storage.
I assume you are writing the code in C# with the latest version of the Azure Storage SDK for .NET (9.3.3).
Here is my code for writing a large array of data objects directly to a CSV file in Azure Blob Storage.
using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

namespace WriteCSVDataToBlob
{
    class Record
    {
        private readonly string[] cols;

        public Record(string[] cols)
        {
            this.cols = cols;
        }

        public override string ToString()
        {
            return string.Join(",", cols);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var arr = new List<Record>
            {
                new Record(new string[] { "A", "B", "one" }),
                new Record(new string[] { "C", "D", "two" })
            };

            string storageConnectionString = "<your storage connection string>";
            var storageAccount = CloudStorageAccount.Parse(storageConnectionString);
            var client = storageAccount.CreateCloudBlobClient();
            var container = client.GetContainerReference("test");
            var blob = container.GetBlockBlobReference("data.txt");

            using (CloudBlobStream x = blob.OpenWriteAsync().Result)
            {
                foreach (var rec in arr)
                {
                    // Append each record as one CSV line
                    var bytes = Encoding.UTF8.GetBytes(rec.ToString() + "\n");
                    x.Write(bytes, 0, bytes.Length);
                }
                x.Flush();
                x.Close();
            }
        }
    }
}
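For the million-record case in the question, one small hedged refinement on the same SDK: wrap the blob stream in a StreamWriter so encoding and buffering are handled for you, and rows are flushed incrementally rather than collected in memory first:

using (var blobStream = blob.OpenWriteAsync().Result)
using (var writer = new StreamWriter(blobStream, Encoding.UTF8))
{
    foreach (var rec in arr)
    {
        writer.WriteLine(rec.ToString()); // one CSV row at a time
    }
}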
@Peter Pan's solution works for Microsoft.Azure.Storage.Blob v11.1.0.
In newer versions of the Azure Blob Storage SDK you can use something like this:

// 'client' is an AppendBlobClient; 'arr' and 'userInputModel' come from the surrounding context.
// Note: in recent CsvHelper versions the delimiter is passed via a CsvConfiguration instead.
using (var writer = new StreamWriter(new MemoryStream()))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
    csv.Configuration.Delimiter = userInputModel.Separator;
    csv.WriteRecords(arr);
    writer.Flush();
    writer.BaseStream.Seek(0, SeekOrigin.Begin);
    client.AppendBlock(writer.BaseStream);
}
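Note that this buffers the entire CSV in a MemoryStream before uploading, which defeats the purpose for a million records. A minimal streamed sketch, assuming Azure.Storage.Blobs v12 and CsvHelper (the connection string, container and blob names, and the 'records' variable are placeholders):

using System.Globalization;
using System.IO;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized;
using CsvHelper;

var serviceClient = new BlobServiceClient("<your storage connection string>");
var containerClient = serviceClient.GetBlobContainerClient("test");
await containerClient.CreateIfNotExistsAsync();
var blockBlob = containerClient.GetBlockBlobClient("data.csv");

// OpenWriteAsync uploads blocks as the writer's buffer fills,
// so the full CSV never has to fit in memory.
using (var blobStream = await blockBlob.OpenWriteAsync(overwrite: true))
using (var writer = new StreamWriter(blobStream))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
    csv.WriteRecords(records); // 'records' is your enumerable of data objects
}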
I need to lowercase all filenames during UploadDirectoryAsync - is this possible to control or set via 'SetAttributesCallback'?
I cannot control the local physical files or rename them locally before uploading them to Azure via the Azure Storage Data Movement Library.
The end result should be that the source and destination filenames always end up lowercase.
Any solution out there?
I need to lowercase all filenames during UploadDirectoryAsync - is this possible to control or set via 'SetAttributesCallback'
Yes, you can do that in the SetAttributesCallback. There is currently no rename-blob API in Azure, so in the SetAttributesCallback we can upload a renamed copy of the file and delete the blob that UploadDirectoryAsync uploaded. I also tested this on my side, and it works correctly.
The following is my demo code.
using System;
using System.IO;
using System.Linq;
using System.Threading;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.DataMovement;

namespace DataMovementTest
{
    class Program
    {
        static void Main(string[] args)
        {
            string storageConnectionString = "storage connection string";
            CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
            CloudBlobClient blobClient = account.CreateCloudBlobClient();
            CloudBlobContainer blobContainer = blobClient.GetContainerReference("container name");
            blobContainer.CreateIfNotExists();
            var destDir = blobContainer.GetDirectoryReference("directory name");
            string sourceDirPath = @"local path";

            var options = new UploadDirectoryOptions
            {
                Recursive = false,
                BlobType = BlobType.BlockBlob
            };

            using (MemoryStream journalStream = new MemoryStream())
            {
                // Store the transfer context in a streamed journal.
                DirectoryTransferContext context = new DirectoryTransferContext(journalStream)
                {
                    SetAttributesCallback = (destination) =>
                    {
                        CloudBlob destBlob = destination as CloudBlob;
                        // Check whether the blob name contains uppercase characters
                        if (System.Text.RegularExpressions.Regex.IsMatch(destBlob.Uri.Segments.Last(), "[A-Z]"))
                        {
                            var path = sourceDirPath + $"/{destBlob.Uri.Segments.Last()}";
                            Console.WriteLine(path);
                            // Upload a lowercase-named copy, then delete the original blob
                            var renameBlob = destDir.GetBlockBlobReference(destBlob.Uri.Segments.Last().ToLower());
                            using (var fileStream = File.OpenRead(path))
                            {
                                renameBlob.UploadFromStream(fileStream);
                            }
                            destBlob.DeleteIfExists();
                        }
                    },
                    ShouldTransferCallback = (source, destination) => true
                };

                CancellationTokenSource cancellationSource = new CancellationTokenSource();
                try
                {
                    // Start the upload
                    var uploadResult = TransferManager.UploadDirectoryAsync(sourceDirPath, destDir, options, context, cancellationSource.Token).Result;
                }
                catch (Exception e)
                {
                    Console.WriteLine("The transfer is cancelled: {0}", e.Message);
                }

                Console.WriteLine("Files in directory {0} uploading to {1} is finished.", sourceDirPath, destDir.Uri.ToString());
            }
        }
    }
}
I'm developing a service within the ASP.NET Boilerplate engine and getting the error from the subject. The nature of the error is not clear, as I am inheriting from ApplicationService, as the documentation suggests. The code:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using Abp.Application.Services;
using Abp.Domain.Repositories;

namespace MyAbilities.Api.Blob
{
    public class BlobService : ApplicationService, IBlobService
    {
        public readonly IRepository<UserMedia, int> _blobRepository;

        public BlobService(IRepository<UserMedia, int> blobRepository)
        {
            _blobRepository = blobRepository;
        }

        public async Task<List<BlobDto>> UploadBlobs(HttpContent httpContent)
        {
            var blobUploadProvider = new BlobStorageUploadProvider();
            var list = await httpContent.ReadAsMultipartAsync(blobUploadProvider)
                .ContinueWith(task =>
                {
                    if (task.IsFaulted || task.IsCanceled)
                    {
                        if (task.Exception != null) throw task.Exception;
                    }
                    var provider = task.Result;
                    return provider.Uploads.ToList();
                });

            // Store blob info in the database
            foreach (var blobDto in list)
            {
                SaveBlobData(blobDto);
            }

            return list;
        }

        public void SaveBlobData(BlobDto blobData)
        {
            UserMedia um = blobData.MapTo<UserMedia>();
            _blobRepository.InsertOrUpdateAndGetId(um);
            CurrentUnitOfWork.SaveChanges();
        }

        public async Task<BlobDto> DownloadBlob(int blobId)
        {
            // TODO: Implement this helper method. It should retrieve blob info
            // from the database, based on the blobId. The record should contain the
            // blobName, which should be returned as the result of this helper method.
            var blobName = GetBlobName(blobId);

            if (!String.IsNullOrEmpty(blobName))
            {
                var container = BlobHelper.GetBlobContainer();
                var blob = container.GetBlockBlobReference(blobName);

                // Download the blob into a memory stream. Notice that we're not putting the memory
                // stream in a using statement. This is because we need the stream to be open for the
                // API controller in order for the file to actually be downloadable. The closing and
                // disposing of the stream is handled by the Web API framework.
                var ms = new MemoryStream();
                await blob.DownloadToStreamAsync(ms);

                // Strip off any folder structure so the file name is just the file name
                var lastPos = blob.Name.LastIndexOf('/');
                var fileName = blob.Name.Substring(lastPos + 1, blob.Name.Length - lastPos - 1);

                // Build and return the download model with the blob stream and its relevant info
                var download = new BlobDto
                {
                    FileName = fileName,
                    FileUrl = Convert.ToString(blob.Uri),
                    FileSizeInBytes = blob.Properties.Length,
                    ContentType = blob.Properties.ContentType
                };

                return download;
            }

            // Otherwise
            return null;
        }

        // Retrieve blob info from the database
        private string GetBlobName(int blobId)
        {
            throw new NotImplementedException();
        }
    }
}
The error appears even before the app flow reaches the 'SaveBlobData' method. Have I missed something?
I hate to answer my own question, but here it is... After a while, I found out that if UnitOfWorkManager is not available for some reason, I can create the unit of work myself by injecting IUnitOfWorkManager in the constructor. Then you can simply use the following construct in your save method:
using (var unitOfWork = _unitOfWorkManager.Begin())
{
    // Save logic...
    unitOfWork.Complete();
}
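For completeness, a minimal sketch of the save method from the question with the injected manager wired in (all names come from the question's code; only the _unitOfWorkManager field is new):

private readonly IUnitOfWorkManager _unitOfWorkManager;

public BlobService(IRepository<UserMedia, int> blobRepository, IUnitOfWorkManager unitOfWorkManager)
{
    _blobRepository = blobRepository;
    _unitOfWorkManager = unitOfWorkManager;
}

public void SaveBlobData(BlobDto blobData)
{
    // Begin an explicit unit of work so the insert runs inside an active scope
    using (var unitOfWork = _unitOfWorkManager.Begin())
    {
        UserMedia um = blobData.MapTo<UserMedia>();
        _blobRepository.InsertOrUpdateAndGetId(um);
        unitOfWork.Complete();
    }
}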
I use only Microsoft Azure Storage and no other Azure products/services. I upload files to my storage blob via an FTP-type client (GoodSync), and I need to change the content type of all the files, based on their file extensions, after they are already in the blob. I have looked around and have not found a way to do this without running one of their VPSes with PowerShell. What are my options, and how do I accomplish this? I really need step-by-step here.
I recently had the same issue, so I created a simple utility class to "fix" content types based on file extensions. You can read the details here.
What you need to do is go through each file in your Azure Storage containers and update its ContentType according to a dictionary that maps each file extension to the appropriate MIME type.
// Connect to your storage account (a field on the utility class)
private readonly CloudStorageAccount _storageAccount = CloudStorageAccount.Parse(storageConnectionString);

// Load the container with the specified name
private CloudBlobContainer GetCloudBlobContainer(string name)
{
    CloudBlobClient cloudBlobClient = _storageAccount.CreateCloudBlobClient();
    return cloudBlobClient.GetContainerReference(name.ToLowerInvariant());
}

// Parse all files in the container and apply the proper ContentType
private void ResetContainer(CloudBlobContainer container)
{
    if (!container.Exists()) return;

    Trace.WriteLine($"Ready to parse {container.Name} container");
    Trace.WriteLine("------------------------------------------------");

    var blobs = container.ListBlobs().ToList();
    var total = blobs.Count;
    var counter = 1;

    foreach (var blob in blobs)
    {
        if (blob is CloudBlobDirectory) continue;

        var cloudBlob = (CloudBlob)blob;
        var extension = Path.GetExtension(cloudBlob.Uri.AbsoluteUri);

        string contentType;
        _contentTypes.TryGetValue(extension, out contentType);
        if (string.IsNullOrEmpty(contentType)) continue;

        Trace.Write($"{counter++} of {total} : {cloudBlob.Name}");

        if (cloudBlob.Properties.ContentType == contentType)
        {
            Trace.WriteLine($" ({cloudBlob.Properties.ContentType}) (skipped)");
            continue;
        }

        cloudBlob.Properties.ContentType = contentType;
        cloudBlob.SetProperties();
        Trace.WriteLine($" ({cloudBlob.Properties.ContentType}) (reset)");
    }
}
_contentTypes is a dictionary that contains the appropriate MIME type for each file extension:
private readonly Dictionary<string, string> _contentTypes = new Dictionary<string, string>()
{
    {".jpeg", "image/jpeg"},
    {".jpg", "image/jpeg" }
};
Full list of content types and source code can be found here.
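A hedged usage sketch tying the two helpers together (the container name is hypothetical):

// Fix content types for every blob in the "assets" container
var container = GetCloudBlobContainer("assets");
ResetContainer(container);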
Here is a refreshed version for the latest Azure.Storage.Blobs SDK. I'm using .NET 5 and a console app.
using System;
using System.Collections.Generic;
using System.IO;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

var contentTypes = new Dictionary<string, string>()
{
    {".woff", "font/woff"},
    {".woff2", "font/woff2" }
};

var cloudBlobClient = new BlobServiceClient("connectionstring");
var cloudBlobContainerClient = cloudBlobClient.GetBlobContainerClient("fonts");
await cloudBlobContainerClient.CreateIfNotExistsAsync();

var blobs = cloudBlobContainerClient.GetBlobsAsync();
await foreach (var blob in blobs)
{
    var extension = Path.GetExtension(blob.Name);
    contentTypes.TryGetValue(extension, out var contentType);
    if (string.IsNullOrEmpty(contentType)) continue;

    if (blob.Properties.ContentType == contentType)
    {
        continue;
    }

    try
    {
        // Get the existing properties so the other headers are preserved
        var blobClient = cloudBlobContainerClient.GetBlobClient(blob.Name);
        BlobProperties properties = await blobClient.GetPropertiesAsync();

        var headers = new BlobHttpHeaders
        {
            ContentType = contentType,
            CacheControl = properties.CacheControl,
            ContentDisposition = properties.ContentDisposition,
            ContentEncoding = properties.ContentEncoding,
            ContentHash = properties.ContentHash,
            ContentLanguage = properties.ContentLanguage
        };

        // Set the blob's properties.
        await blobClient.SetHttpHeadersAsync(headers);
    }
    catch (RequestFailedException e)
    {
        Console.WriteLine($"HTTP error code {e.Status}: {e.ErrorCode}");
        Console.WriteLine(e.Message);
        Console.ReadLine();
    }
}
I'm working with large data in WCF (gigabytes). I need to return this to the client; the previous code wrote a memory stream and then returned it to the client as a property on a streamed data contract.
The problem is that the data has grown, and the memory stream now uses all available memory and eventually fails before all the data is written to the stream.
Is it possible to write the data to the stream on demand, so the server is writing it as the client is requesting it?
Here's our code:
internal StreamedFileResponse CreateStream()
{
    var stream = new MemoryStream();
    using (var writer = new CsvWriter(stream, Encoding.UTF8))
    {
        writer.Write(rowFactory.CreateHeader());
        foreach (var source in sources)
        {
            var row = rowFactory.Create(source);
            if (row != null)
            {
                writer.Write(row);
            }
        }
    }
    stream.Seek(0, SeekOrigin.Begin);
    return new StreamedFileResponse { Data = stream, Length = stream.Length };
}

[MessageContract(WrapperNamespace = "http://www.contoso.com/services/contract/reports/streamed")]
public class StreamedFileResponse : IDisposable
{
    [MessageBodyMember]
    public Stream Data { get; set; }

    [MessageHeader(MustUnderstand = true)]
    public long Length { get; set; }

    public void Dispose()
    {
        Data.Dispose();
    }
}
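One hedged sketch of the on-demand idea: with the binding's TransferMode set to Streamed (or StreamedResponse), WCF pulls from the returned Stream as the client reads, so you can hand back a custom Stream that generates CSV rows lazily instead of a pre-filled MemoryStream. The class below is hypothetical, and the Length header would have to be dropped because the total size isn't known up front:

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical: produces CSV bytes only as the client reads them.
internal sealed class LazyCsvStream : Stream
{
    private readonly IEnumerator<string> _rows;
    private byte[] _current = new byte[0];
    private int _offset;

    public LazyCsvStream(IEnumerable<string> rows)
    {
        _rows = rows.GetEnumerator();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (_offset == _current.Length)
        {
            if (!_rows.MoveNext()) return 0; // no more rows: end of stream
            _current = Encoding.UTF8.GetBytes(_rows.Current + "\r\n");
            _offset = 0;
        }
        int n = Math.Min(count, _current.Length - _offset);
        Array.Copy(_current, _offset, buffer, offset, n);
        _offset += n;
        return n;
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

CreateStream would then build the row sequence from rowFactory and sources as an iterator (yield return) and return new LazyCsvStream(rows) as Data, so rows are produced only as the client consumes them.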
I want to encrypt and decrypt the isolated storage file.
The Microsoft site took me here.
While using isolated storage on the emulator, data persists only while the emulator is running.
There is no way to get the physical location of the isolated storage.
I hope the above statements of mine are correct.
Now, I want to know how I can encrypt the isolated storage file.
Taking the example provided by Microsoft (the application name is GasMileage),
here is the code:
using System;
using System.IO;
using System.IO.IsolatedStorage;
using System.Linq;
using System.Xml.Serialization;

namespace CodeBadger.GasMileage.Persistence
{
    public class IsolatedStorageGateway
    {
        private const string StorageFile = "data.txt";
        private readonly XmlSerializer _serializer;

        public IsolatedStorageGateway()
        {
            _serializer = new XmlSerializer(typeof(Notebook));
        }

        public Notebook LoadNotebook()
        {
            using (var store = IsolatedStorageFile.GetUserStoreForApplication())
            using (var stream = GetStorageStreamForReading(store))
            using (var reader = new StreamReader(stream))
            {
                return reader.EndOfStream
                    ? new Notebook()
                    : (Notebook)_serializer.Deserialize(reader);
            }
        }

        public NotebookEntry LoadEntry(Guid guid)
        {
            var notebook = LoadNotebook();
            return notebook.Where(x => x.Id == guid).FirstOrDefault();
        }

        public void StoreEntry(NotebookEntry entry)
        {
            var notebook = LoadNotebook();
            AssignId(entry);
            RemoveExistingEntryFromNotebook(notebook, entry);
            Console.WriteLine(entry);
            notebook.Add(entry);
            WriteNotebookToStorage(notebook);
        }

        public void DeleteEntry(NotebookEntry entry)
        {
            var notebook = LoadNotebook();
            RemoveExistingEntryFromNotebook(notebook, entry);
            WriteNotebookToStorage(notebook);
        }

        private void WriteNotebookToStorage(Notebook notebook)
        {
            using (var store = IsolatedStorageFile.GetUserStoreForApplication())
            using (var stream = GetStorageStreamForWriting(store))
            {
                _serializer.Serialize(stream, notebook);
            }
        }

        private static void AssignId(NotebookEntry entry)
        {
            if (entry.Id == Guid.Empty) entry.Id = Guid.NewGuid();
        }

        private static void RemoveExistingEntryFromNotebook(Notebook notebook, NotebookEntry entry)
        {
            var toRemove = notebook.Where(x => x.Id == entry.Id).FirstOrDefault();
            if (toRemove == null) return;
            notebook.Remove(toRemove);
        }

        private static IsolatedStorageFileStream GetStorageStreamForWriting(IsolatedStorageFile store)
        {
            return new IsolatedStorageFileStream(StorageFile, FileMode.Create, FileAccess.Write, store);
        }

        private static IsolatedStorageFileStream GetStorageStreamForReading(IsolatedStorageFile store)
        {
            return new IsolatedStorageFileStream(StorageFile, FileMode.OpenOrCreate, FileAccess.Read, store);
        }
    }
}
Now I want to know how to encrypt the data.txt file given in this context.
On application load, it should decrypt the file, and on application termination, it should encrypt it.
Can someone help me with this?
The ProtectedData class will encrypt/decrypt a byte array for storing on isolated storage. You can supply your own additional entropy, but by default:
In Silverlight for Windows Phone, both the user and machine credentials are used to encrypt or decrypt data
For more information, see How to: Encrypt Data in a Windows Phone Application
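A minimal sketch of how this could look around the gateway above, assuming the Windows Phone ProtectedData API (Protect/Unprotect are the real calls; the helper class and method names are hypothetical):

using System.IO;
using System.IO.IsolatedStorage;
using System.Security.Cryptography;
using System.Text;

public static class ProtectedStorage
{
    // Encrypt the serialized notebook XML before it hits data.txt
    public static void SaveEncrypted(string fileName, string plainText)
    {
        byte[] encrypted = ProtectedData.Protect(Encoding.UTF8.GetBytes(plainText), null);
        using (var store = IsolatedStorageFile.GetUserStoreForApplication())
        using (var stream = new IsolatedStorageFileStream(fileName, FileMode.Create, FileAccess.Write, store))
        {
            stream.Write(encrypted, 0, encrypted.Length);
        }
    }

    // Decrypt the file contents on load; returns an empty string when nothing is stored yet
    public static string LoadDecrypted(string fileName)
    {
        using (var store = IsolatedStorageFile.GetUserStoreForApplication())
        using (var stream = new IsolatedStorageFileStream(fileName, FileMode.OpenOrCreate, FileAccess.Read, store))
        using (var ms = new MemoryStream())
        {
            stream.CopyTo(ms);
            if (ms.Length == 0) return string.Empty;
            byte[] decrypted = ProtectedData.Unprotect(ms.ToArray(), null);
            return Encoding.UTF8.GetString(decrypted, 0, decrypted.Length);
        }
    }
}

WriteNotebookToStorage and LoadNotebook would then serialize the Notebook to a string first (for example via a StringWriter) and route it through these helpers instead of writing the file stream directly.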