Receive partial file (sometimes) when reading from Google Storage using HTTP Response

I am trying to read files from Google Storage and write them to files in our filesystem (HDFS). When I run this over a period of time (say 7 days), sometimes I get the full file, with the line count matching the source, and sometimes I get a partial file (the discrepancy is quite large). Below is the method that takes a response and writes it to a file.
Any help or suggestions on how I can troubleshoot this further would be much appreciated.
Thanks,
Before calling this method I do a simple check on the response status code and content type:
if (response.getStatusCode() == 200 &&
        StringUtils.equals(response.getContentType(), "application/zip")) {
    writeHdfsFile(response, path);
}

private void writeHdfsFile(HttpResponse response, String path) throws IOException {
    final GZIPInputStream inputStream = new GZIPInputStream(response.getContent());
    Path filePath = new Path(path);
    final FSDataOutputStream outputStream = fileSystem.create(filePath, true);
    final byte[] buffer = new byte[1024];
    int length;
    try {
        while ((length = inputStream.read(buffer)) > 0) {
            outputStream.write(buffer, 0, length);
        }
        outputStream.flush();
    } finally {
        inputStream.close();
        outputStream.close();
    }
}

We solved it by downloading the file first and then unzipping and writing it. Splitting the work into two steps resolved the issue. Posting this in case someone else runs into the same problem.
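The two-step approach described above might look roughly like the following sketch. It uses only the JDK. The class and method names (TwoStepDownload, saveRaw, gunzipTo) and the optional expected-length check are assumptions made for illustration and are not part of the original code. In the real job, the input would presumably be response.getContent() and the output the FSDataOutputStream returned by fileSystem.create(...).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;

public class TwoStepDownload {

    // Step 1: copy the raw (still compressed) body to a local temp file.
    // expectedLength is a hypothetical parameter: pass the Content-Length
    // if it is known, or -1 to skip the check.
    public static Path saveRaw(InputStream body, long expectedLength) throws IOException {
        Path tmp = Files.createTempFile("download-", ".gz");
        long written;
        try (InputStream in = body) {
            written = Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        if (expectedLength >= 0 && written != expectedLength) {
            Files.deleteIfExists(tmp);
            throw new IOException("Truncated download: got " + written
                    + " of " + expectedLength + " bytes");
        }
        return tmp;
    }

    // Step 2: decompress the complete local file into the destination stream
    // and return the number of decompressed bytes written.
    public static long gunzipTo(Path compressed, OutputStream out) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = new GZIPInputStream(Files.newInputStream(compressed))) {
            int n;
            while ((n = in.read(buffer)) != -1) { // -1 signals end of stream
                out.write(buffer, 0, n);
                total += n;
            }
        }
        out.flush();
        return total;
    }
}
```

With this split, a short download shows up as an error in step 1 before anything is written to the destination, which makes it easier to tell network truncation apart from a decompression problem.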


Xamarin Cam2 IOnImageAvailableListener's OnImageAvailable called twice causing

UPDATE: The initial question, why the crashes happen, has been answered, but a problem remains: why is the 'OnImageAvailable' callback called so many times? When it is called, I want to do something with the image, but whatever method I run at that point is called many times. Is this the wrong place to use the resulting image?
I am using the sample code found here for a Xamarin Android implementation of the Android Camera2 API. My issue is that when the capture button is pressed a single time, the IOnImageAvailableListener's OnImageAvailable callback gets called multiple times.
This causes a problem because the image from AcquireNextImage needs to be closed before another can be acquired, but Close is not called until the Run method of the ImageSaver class, as seen below.
This causes these 2 errors:
Unable to acquire a buffer item, very likely client tried to acquire
more than maxImages buffers
AND
maxImages (2) has already been acquired, call #close before acquiring
more.
The maximum number of images (maxImages) is set to 2 by default, but setting it to 1 does not help. How do I prevent the callback from being called twice?
public void OnImageAvailable(ImageReader reader)
{
var image = reader.AcquireNextImage();
owner.mBackgroundHandler.Post(new ImageSaver(image, file));
}
// Saves a JPEG {@link Image} into the specified {@link File}.
private class ImageSaver : Java.Lang.Object, IRunnable
{
// The JPEG image
private Image mImage;
// The file we save the image into.
private File mFile;
public ImageSaver(Image image, File file)
{
if (image == null)
throw new System.ArgumentNullException("image");
if (file == null)
throw new System.ArgumentNullException("file");
mImage = image;
mFile = file;
}
public void Run()
{
ByteBuffer buffer = mImage.GetPlanes()[0].Buffer;
byte[] bytes = new byte[buffer.Remaining()];
buffer.Get(bytes);
using (var output = new FileOutputStream(mFile))
{
try
{
output.Write(bytes);
}
catch (IOException e)
{
e.PrintStackTrace();
}
finally
{
mImage.Close();
}
}
}
}
The method OnImageAvailable can be called again as soon as you leave it if there is another picture in the pipeline.
I would recommend calling Close in the same method you are calling AcquireNextImage. So, if you choose to get the image directly from that callback, then you have to call Close in there as well.
One solution is to grab the image in that method and close it right away.
public void OnImageAvailable(ImageReader reader)
{
    var image = reader.AcquireNextImage();
    try
    {
        // Copy the pixel data out of the local image before it is closed.
        ByteBuffer buffer = image.GetPlanes()[0].Buffer;
        byte[] bytes = new byte[buffer.Remaining()];
        buffer.Get(bytes);
        // I am not sure where you get the file instance but it is not important.
        owner.mBackgroundHandler.Post(new ImageSaver(bytes, file));
    }
    finally
    {
        image.Close();
    }
}
The ImageSaver would be modified to accept the byte array as first parameter in the constructor:
public ImageSaver(byte[] bytes, File file)
{
if (bytes == null)
throw new System.ArgumentNullException("bytes");
if (file == null)
throw new System.ArgumentNullException("file");
mBytes = bytes;
mFile = file;
}
The major downside of this solution is memory pressure: the images are held in memory until they are processed, one after another.
Another solution is to acquire the image on the background thread instead.
public void OnImageAvailable(ImageReader reader)
{
// Again, I am not sure where you get the file instance but it is not important.
owner.mBackgroundHandler.Post(new ImageSaver(reader, file));
}
This solution is less intensive on the memory; but you might have to increase the maximum number of images from 2 to something higher depending on your needs. Again, the ImageSaver's constructor needs to be modified to accept an ImageReader as a parameter:
public ImageSaver(ImageReader imageReader, File file)
{
if (imageReader == null)
throw new System.ArgumentNullException("imageReader");
if (file == null)
throw new System.ArgumentNullException("file");
mImageReader = imageReader;
mFile = file;
}
Now the Run method would have the responsibility of acquiring and releasing the Image:
public void Run()
{
Image image = mImageReader.AcquireNextImage();
try
{
ByteBuffer buffer = image.GetPlanes()[0].Buffer;
byte[] bytes = new byte[buffer.Remaining()];
buffer.Get(bytes);
using (var output = new FileOutputStream(mFile))
{
try
{
output.Write(bytes);
}
catch (IOException e)
{
e.PrintStackTrace();
}
}
}
finally
{
image?.Close();
}
}
I also faced this issue for a long time. I tried implementing @kzrytof's solution, but it did not help as much as expected. I did, however, find a way to get onImageAvailable to execute only once.
Scenario: when an image is available, the onImageAvailable method is called, right?
So after closing the image with image.close(), I called imageReader.setOnImageAvailableListener() with a null listener. That stopped the callback from running a second time.
I know your question is about Xamarin and my code below is native Android Java, but the methods and functionality are the same, so give it a try:
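@Override
public void onImageAvailable(ImageReader reader) {
    final Image image = imageReader.acquireLatestImage();
    try {
        if (image != null) {
            Image.Plane[] planes = image.getPlanes();
            ByteBuffer buffer = planes[0].getBuffer();
            int pixelStride = planes[0].getPixelStride();
            int rowStride = planes[0].getRowStride();
            int rowPadding = rowStride - pixelStride * width;
            int bitmapWidth = width + rowPadding / pixelStride;
            if (latestBitmap == null ||
                    latestBitmap.getWidth() != bitmapWidth ||
                    latestBitmap.getHeight() != height) {
                if (latestBitmap != null) {
                    latestBitmap.recycle();
                }
                // Assumption: the original snippet omits the allocation;
                // ARGB_8888 is a guess at the intended pixel format.
                latestBitmap = Bitmap.createBitmap(bitmapWidth, height, Bitmap.Config.ARGB_8888);
            }
            latestBitmap.copyPixelsFromBuffer(buffer);
        }
    } catch (Exception e) {
        // exceptions are silently ignored, as in the original snippet
    } finally {
        if (image != null) {
            image.close(); // acquireLatestImage() may return null
        }
        // Unregister the listener so the callback does not fire again.
        imageReader.setOnImageAvailableListener(null, svc.getHandler());
    }
    // next steps to save the image
}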

Spark on Windows - What exactly is winutils and why do we need it?

I'm curious! To my knowledge, HDFS needs datanode processes to run, and this is why it's only working on servers. Spark can run locally though, but needs winutils.exe which is a component of Hadoop. But what exactly does it do? How is it, that I cannot run Hadoop on Windows, but I can run Spark, which is built on Hadoop?
I know of at least one use: running shell commands on Windows. You can find it in org.apache.hadoop.util.Shell; other modules depend on this class and use its methods, for example the getGetPermissionCommand() method:
static final String WINUTILS_EXE = "winutils.exe";
...
static {
IOException ioe = null;
String path = null;
File file = null;
// invariant: either there's a valid file and path,
// or there is a cached IO exception.
if (WINDOWS) {
try {
file = getQualifiedBin(WINUTILS_EXE);
path = file.getCanonicalPath();
ioe = null;
} catch (IOException e) {
LOG.warn("Did not find {}: {}", WINUTILS_EXE, e);
// stack trace comes at debug level
LOG.debug("Failed to find " + WINUTILS_EXE, e);
file = null;
path = null;
ioe = e;
}
} else {
// on a non-windows system, the invariant is kept
// by adding an explicit exception.
ioe = new FileNotFoundException(E_NOT_A_WINDOWS_SYSTEM);
}
WINUTILS_PATH = path;
WINUTILS_FILE = file;
WINUTILS = path;
WINUTILS_FAILURE = ioe;
}
...
public static String getWinUtilsPath() {
if (WINUTILS_FAILURE == null) {
return WINUTILS_PATH;
} else {
throw new RuntimeException(WINUTILS_FAILURE.toString(),
WINUTILS_FAILURE);
}
}
...
public static String[] getGetPermissionCommand() {
return (WINDOWS) ? new String[] { getWinUtilsPath(), "ls", "-F" }
: new String[] { "/bin/ls", "-ld" };
}
Max's answer covers the actual place where it is referenced, so let me give some brief background on why it is needed on Windows.
From Hadoop's Confluence page itself:
Hadoop requires native libraries on Windows to work properly -that
includes accessing the file:// filesystem, where Hadoop uses some
Windows APIs to implement posix-like file access permissions.
This is implemented in HADOOP.DLL and WINUTILS.EXE.
In particular, %HADOOP_HOME%\BIN\WINUTILS.EXE must be locatable
Also, I think you should be able to run both Spark and Hadoop on Windows.
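As a quick way to see whether the helper is "locatable" in the sense of the quote above, here is a minimal sketch that builds the expected path and checks whether the file exists. The class and method names (WinutilsCheck, expectedWinutils) are made up for this example. It assumes the Hadoop home is supplied through the hadoop.home.dir system property or the HADOOP_HOME environment variable.

```java
import java.io.File;

public class WinutilsCheck {

    // Builds the path where the helper is expected: <hadoop home>/bin/winutils.exe
    public static File expectedWinutils(String hadoopHome) {
        return new File(new File(hadoopHome, "bin"), "winutils.exe");
    }

    public static void main(String[] args) {
        // Prefer the system property, fall back to the environment variable.
        String home = System.getProperty("hadoop.home.dir", System.getenv("HADOOP_HOME"));
        if (home == null) {
            System.out.println("Neither hadoop.home.dir nor HADOOP_HOME is set");
            return;
        }
        File exe = expectedWinutils(home);
        System.out.println(exe + (exe.isFile() ? " found" : " is missing"));
    }
}
```

Running this on the machine that launches Spark shows which directory is being consulted, which helps when the "Did not find winutils.exe" warning appears.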

How to speed up time when using Java Mail to save attachments?

I cast the content of Message msg to a Multipart: Multipart multi1 = (Multipart) msg.getContent().
Each mail attachment is in one BodyPart: Part part = multi1.getBodyPart(i);
Then I want to save the attachment.
private void saveFile(String fileName, InputStream in) throws IOException {
File file = new File(fileName);
if (!file.exists()) {
OutputStream out = null;
try {
out = new BufferedOutputStream(new FileOutputStream(file));
in = new BufferedInputStream(in);
byte[] buf = new byte[BUFFSIZE];
int len;
while ((len = in.read(buf)) > 0) {
out.write(buf, 0, len);
}
} catch (FileNotFoundException e) {
LOG.error(e.toString());
} finally {
// close streams
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
But reading from the IO stream takes too much time. For example, a 2.7 MB file takes almost 160 seconds to save to disk. I have already tried Channel and some other IO streams, but nothing changed. Is there any solution for saving attachments faster with Java Mail?
For the full code, see https://github.com/cainzhong/java-mail-demo/blob/master/src/main/java/com/java/mail/impl/ReceiveMailImpl.java
Actually, mail.imaps.partialfetch does take effect and speeds things up a lot. There was a mistake in my previous code. The properties should be set like this:
props.put("mail.imap.partialfetch","false");
props.put("mail.imap.fetchsize", "1048576");
props.put("mail.imaps.partialfetch", "false");
props.put("mail.imaps.fetchsize", "1048576");
instead of
props.put("mail.imap.partialfetch",false);
props.put("mail.imap.fetchsize", "1048576");
props.put("mail.imaps.partialfetch", false);
props.put("mail.imaps.fetchsize", "1048576");
It is important to put quotation marks around "false" so that the value is a String. Otherwise, the parameters will not take effect.
Anyway, thanks to Bill Shannon.
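A likely explanation for the behaviour above is that java.util.Properties.getProperty only returns values that were stored as String objects. The following sketch shows that with the JDK alone. It assumes that JavaMail reads these settings through getProperty, and the class and method names (StringPropertyDemo, lookup) are invented for the example.

```java
import java.util.Properties;

public class StringPropertyDemo {

    // Stores the value with put() and returns what a getProperty-based reader sees.
    public static String lookup(Object value) {
        Properties props = new Properties();
        props.put("mail.imaps.partialfetch", value);
        return props.getProperty("mail.imaps.partialfetch");
    }

    public static void main(String[] args) {
        System.out.println(lookup("false")); // prints "false": the String value is visible
        System.out.println(lookup(false));   // prints "null": a Boolean is not a String
    }
}
```

Using props.setProperty(key, value) instead of put avoids the problem at compile time, because setProperty only accepts String arguments.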
There are two key parts to this operation: reading the data from your mail server and writing the data to your filesystem. Most likely it's the speed of the server and the network connection to the server that's controlling the overall speed of the operation. You can try setting the mail.imap.fetchsize and mail.imap.partialfetch properties to see if that improves performance.
You can also try using something like NullOutputStream instead of FileOutputStream to measure only the speed of reading the data.
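To measure only the read side as suggested, a sketch along these lines could be used. It drains a stream into a sink that discards everything. ReadTimer and DiscardingOutputStream are hypothetical names; the small sink class stands in for a NullOutputStream so that the example needs nothing beyond the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ReadTimer {

    // Minimal sink that throws every byte away (stand-in for a NullOutputStream).
    static final class DiscardingOutputStream extends OutputStream {
        @Override public void write(int b) { }
        @Override public void write(byte[] b, int off, int len) { }
    }

    // Reads the stream to the end, discarding the data, and returns the byte count.
    public static long drain(InputStream in) throws IOException {
        OutputStream sink = new DiscardingOutputStream();
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            sink.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the attachment stream; in the mail code this would be part.getInputStream().
        InputStream in = new ByteArrayInputStream(new byte[1 << 20]);
        long start = System.nanoTime();
        long bytes = drain(in);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bytes + " bytes read in " + millis + " ms");
    }
}
```

If draining the attachment stream this way still takes about as long as saving the file, the time is being spent fetching from the server rather than writing to disk.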

Cannot read two consecutive files with a Windows Service using StreamReader object

I need to be able to read lines of a file with a StreamReader processed by a FileSystemWatcher in a Windows service.
I've read and tried everything that made sense online, but it still doesn't work. When I'm attached to my Windows service process (local machine, using Visual Studio 2010), the whole thing works flawlessly!
When I run it (on my local machine) without attaching to it and debugging it, the second file never makes it through and I get the following message:
"The process cannot access the file 'C:\Projects\Data\VendingStats\20121213_AZM_Journey_MIS.txt' because it is being used by another process." I do not have this file open anywhere else on my machine. It is just sitting in a directory. I then copy it into a directory and the FSW takes over (along with the code below).
Can someone please tell me what I need to do to get this to work? I don't know why it works fine when I'm attached and debugging, but fails when I send the files through without being attached. I feel it's definitely something on my local box that I need to disable, but I don't know what.
I noticed that the error occurs even before execution reaches the "using" statement, because the second file is never copied to the temp directory for processing.
I noticed the following in my stack trace:
system.io.__error.winioerror(int32 errorcode string maybefullpath)
Here is my code:
protected override void OnStart(string[] args)
{
FileSystemWatcher Watcher = new FileSystemWatcher(@"C:\Projects\Data\VendingStats");
Watcher.EnableRaisingEvents = true;
Watcher.Created += new FileSystemEventHandler(Watcher_Created);
Watcher.Filter = "*.txt";
Watcher.IncludeSubdirectories = false;
}
private void Watcher_Created(object sender, FileSystemEventArgs e)
{
try
{
string targetPath = @"C:\Temp\VendorStats";
// Use Path class to manipulate file and directory paths.
FileInfo fi = new FileInfo(e.FullPath); // full name of path & file in the FSW directory
string destFile = Path.Combine(targetPath, fi.Name);
// To copy a folder's contents to a new location:
// Create a new target folder, if necessary.
if (!Directory.Exists(targetPath))
Directory.CreateDirectory(targetPath);
// To copy a file to another location and
File.Copy(e.FullPath, destFile, true);
// Set attribute to READONLY
if (fi.IsReadOnly == false)
fi.Attributes = FileAttributes.ReadOnly;
GetCruiseLineShipName(destFile, ref cruiseLine, ref shipName);
using (StreamReader sr = new StreamReader(File.Open(destFile, FileMode.Open, FileAccess.Read, FileShare.Read)))
{
filename = e.FullPath;
//How many lines should be loaded?
int NumberOfLines = 39;
//Read the number of lines and put them in the array
for (int i = 1; i < NumberOfLines; i++)
{
ListLines[i] = sr.ReadLine();
switch (i)
{
case 3:
int idx = ListLines[i].IndexOf(":");
string timeLine = ListLines[i].Substring(idx + 1);
dt = GetDate(Convert.ToDateTime(timeLine.Substring(1)));
break;
//more code here of the same
}
}
//InsertData into database
}
}
catch (Exception ex)
{
EventLog.WriteEntry("VendorStats", "Error in the Main:" + "\r\n\r\n" + ex.Message + "\r\n\r\n" + ex.InnerException);
return;
}
}
The bottom line to solving this was to put the method spawned by the FileSystemWatcher to sleep for "X" seconds, until Windows completely releases the resources for the previous and present files as well as the folder.
It was the FileSystemWatcher that actually had a hold on the resources.
Here is some sample code:
private static void Watcher_Created(object sender, FileSystemEventArgs e)
{
try
{
Thread.Sleep(10000);
GetCruiseLineShipName(e.FullPath, ref cruiseLine, ref shipName);
using (StreamReader sr = new StreamReader(File.Open(e.FullPath, FileMode.Open, FileAccess.Read, FileShare.Read)))
{

Images take so much time to load in j2me mobile application

I have used the code below to fetch an image from a server. I have 60 different images on the server, and I have the URLs of all of them. I fetch all of these images in a while loop, but it takes a long time to load each image from the server.
What can I do to get these images as fast as possible?
public Image getImagefromURL(String imageURL) {
DataInputStream is = null;
StringBuffer sb = new StringBuffer();
Image img = null;
try {
HttpConnection c = (HttpConnection) Connector.open(imageURL);
int len = (int) c.getLength();
if (len > 0) {
is = c.openDataInputStream();
byte[] data = new byte[len];
is.readFully(data);
img = Image.createImage(data, 0, len);
}
} catch (Exception e) {
e.printStackTrace();
}
return img;
}
One more thing: when I fetch the first image, the application asks for confirmation, like "Application wants to connect to [URL of image location] using airtime. Is it OK to use airtime?" Here I want to hide the path of the image location. How can I do that?
The bigger the image, the longer it will take to load it. Make sure your PNG images are compressed with a tool, for example, PNGGauntlet.
You can also add a local cache on the application side using RMS.
And a last tip: you should not rely on HttpConnection.getLength; it might come back as zero even when there is data to be read.
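One way to avoid depending on the reported length is to read the stream until it ends and size the array from what actually arrived. The sketch below is plain Java using only classes that should also be available on J2ME; the names StreamBytes and readAll are assumptions for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamBytes {

    // Reads until end of stream instead of trusting a length reported up front.
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int n;
        while ((n = in.read(chunk)) != -1) { // -1 signals end of stream
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

In the method from the question, the result would be used roughly as byte[] data = readAll(c.openInputStream()); followed by Image.createImage(data, 0, data.length), with the stream and connection closed afterwards.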
