HDFS append in 0.23.5 using libhdfs - hadoop

I am trying to append to an HDFS file on version 0.23.5. I have set the property dfs.support.append to true in hdfs-site.xml, but calling hdfsWrite() fails with the following error saying append is not supported.
Exception in thread "main" java.io.IOException: Not supported
at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:345)
at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1046)
Call to org.apache.hadoop.conf.FileSystem::append((Lorg/apache/hadoop/fs/Path;)Lorg/apache/hadoop/fs/FSDataOutputStream;) failed!
I checked past literature on append in hdfs. Looks like append should work in 0.23.5.
I am able to insert and read. The problem occurs when I open the file with O_APPEND and write to it. Here is the sample code:
int append(char *filepath, char *data, int size)
{
hdfsFS fs = hdfsConnect("default", 0);
int openFlags = O_WRONLY | O_APPEND;
hdfsFile fdData = hdfsOpenFile(fs, filepath, openFlags, 0, 0, 0);
if (!fdData)
return -1;
if (hdfsWrite(fs, fdData, data, size) == -1)
return -1;
hdfsCloseFile(fs, fdData);
return 0;
}
Am I missing something?
thanks.

ChecksumFileSystem, the class at the top of your stack trace, doesn't support the append method. Here's the relevant source:
public FSDataOutputStream append(Path f, int bufferSize, Progressable progress) throws IOException {
throw new IOException("Not supported");
}
You'll notice that its super class FileSystem notes that append is an optional method.


Why must the result of FileReader.read() be stored in an int before casting it to char? Why can't r.read() be cast directly?

The code below gives wrong output when I use f1.read() directly. Why is it necessary to assign the result first (i = f1.read()) and then cast i when writing to the file? Why can't f1.read() be used directly?
//why this code is giving wrong output
// why is it compulsory to first store the value of f1.read in i
import static java.lang.System.out;
import java.io.*;
import java.util.*;
public class third{
public static void main(String [] args) {
try{
FileReader f1 = new FileReader("C:\\Users\\Akshita Agarwal\\Desktop\\a.txt");
FileWriter f2 = new FileWriter("C:\\Users\\Akshita Agarwal\\Desktop\\b.txt");
try{
while(f1.read() !=-1){
f2.write((char)f1.read());
}
}
finally {
f2.close();
}
}
catch(IOException e){
out.println(e);
}
}
}
char is two bytes unsigned. int -1 is outside this range.
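Your code does not close f1, and because read() is called twice per iteration (once in the loop condition, once in the write), it writes only half of the file: every second char.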
It could be written as:
for (;;) {
int c = f1.read();
if (c == -1) {
break;
}
f2.write((char)c);
}
Your suggestion
for (;;) {
char c = (char) f1.read();
if (c == (char)-1) {
break;
}
f2.write(c);
}
has one drawback.
If the file contains a ((char) -1), that is '\uFFFF', somewhere in the middle, reading stops there and only the part of the file before it is copied.
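Both effects can be reproduced without touching any files. The following is a minimal sketch, assuming a StringReader as a stand-in for the FileReader; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class ReadSentinelDemo {

    // The loop from the question: read() is called twice per iteration,
    // so the character consumed by the loop condition is thrown away.
    static String copyDoubleRead(String text) {
        StringBuilder out = new StringBuilder();
        try (Reader in = new StringReader(text)) {
            while (in.read() != -1) {
                out.append((char) in.read());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Casting first: a real U+FFFF character in the input is
    // indistinguishable from end-of-stream, so copying stops early.
    static String copyCharSentinel(String text) {
        StringBuilder out = new StringBuilder();
        try (Reader in = new StringReader(text)) {
            for (;;) {
                char c = (char) in.read();
                if (c == (char) -1) {
                    break;
                }
                out.append(c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Keeping the int: -1 lies outside the char range, so it can only
    // mean end-of-stream and every character is copied.
    static String copyIntSentinel(String text) {
        StringBuilder out = new StringBuilder();
        try (Reader in = new StringReader(text)) {
            for (int c; (c = in.read()) != -1;) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(copyDoubleRead("abcdef"));                 // bdf
        System.out.println(copyCharSentinel("ab\uFFFFcd").length());  // 2
        System.out.println(copyIntSentinel("ab\uFFFFcd").length());   // 5
    }
}
```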
The shortest code is:
for (int c; (c = f1.read()) != -1;) {
f2.write((char)c);
}
However, it:
reads and writes one character at a time;
converts the binary data to (Unicode) text on reading, using the platform encoding;
converts the (Unicode) text back to binary data on writing, using the platform encoding.
Here you could do:
Path path1 = Paths.get("C:\\Users\\Akshita Agarwal\\Desktop\\a.txt");
Path path2 = Paths.get(System.getProperty("user.home"), "Desktop\\b.txt");
Files.copy(path1, path2);
or read by lines:
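Charset charset = Charset.defaultCharset();
try (BufferedWriter writer = Files.newBufferedWriter(path2, charset);
Stream<String> reader = Files.lines(path1, charset)) {
// A plain loop is used because write() throws IOException, which a forEach lambda cannot propagate.
for (String line : (Iterable<String>) reader::iterator) {
writer.write(line + "\r\n");
}
} // Automatically closes reader and writer.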
Passing the charset explicitly is needed to use the platform encoding. By default these methods use UTF-8, a Unicode encoding that covers the full range of characters of all languages.
Try-with-resources is a weird syntax try (<DECLARATIONS>) { ... } that ensures closing of the (auto-)closeable variables in the declarations. Even on return, break or exception.
The lines read are stripped of their line endings: CR-LF = "\r\n" (Windows), LF (Unix), CR, NEL (AS/400 e.a.).

Xamarin Cam2 IOnImageAvailableListener's OnImageAvailable called twice causing

UPDATE: The initial question has been answered as to why the crashes happen, but the lingering problem remains: why is the 'OnImageAvailable' callback called so many times? When it is called, I want to do stuff with the image, but whatever method I run at that time is called many times. Is this the wrong place to be using the resulting image?
I am using the sample code found here for a Xamarin Android implementation of the Android Camera2 API. My issue is that when the capture button is pressed a single time, the OnCameraAvalibleListener's OnImageAvailable callback gets called multiple times.
This is causing a problem because the image from AcquireNextImage needs to be closed before another can be used, but close is not called until the Run method of the ImageSaver class as seen below.
This causes these 2 errors:
Unable to acquire a buffer item, very likely client tried to acquire
more than maxImages buffers
AND
maxImages (2) has already been acquired, call #close before acquiring
more.
The max image is set to 2 by default, but setting it to 1 does not help. How do I prevent the callback from being called twice?
public void OnImageAvailable(ImageReader reader)
{
var image = reader.AcquireNextImage();
owner.mBackgroundHandler.Post(new ImageSaver(image, file));
}
// Saves a JPEG {@link Image} into the specified {@link File}.
private class ImageSaver : Java.Lang.Object, IRunnable
{
// The JPEG image
private Image mImage;
// The file we save the image into.
private File mFile;
public ImageSaver(Image image, File file)
{
if (image == null)
throw new System.ArgumentNullException("image");
if (file == null)
throw new System.ArgumentNullException("file");
mImage = image;
mFile = file;
}
public void Run()
{
ByteBuffer buffer = mImage.GetPlanes()[0].Buffer;
byte[] bytes = new byte[buffer.Remaining()];
buffer.Get(bytes);
using (var output = new FileOutputStream(mFile))
{
try
{
output.Write(bytes);
}
catch (IOException e)
{
e.PrintStackTrace();
}
finally
{
mImage.Close();
}
}
}
}
The method OnImageAvailable can be called again as soon as you leave it if there is another picture in the pipeline.
I would recommend calling Close in the same method you are calling AcquireNextImage. So, if you choose to get the image directly from that callback, then you have to call Close in there as well.
One solution is to grab the image data in that method and close the image right away.
public void OnImageAvailable(ImageReader reader)
{
var image = reader.AcquireNextImage();
try
{
ByteBuffer buffer = image.GetPlanes()[0].Buffer;
byte[] bytes = new byte[buffer.Remaining()];
buffer.Get(bytes);
// I am not sure where you get the file instance but it is not important.
owner.mBackgroundHandler.Post(new ImageSaver(bytes, file));
}
finally
{
image.Close();
}
}
The ImageSaver would be modified to accept the byte array as first parameter in the constructor:
public ImageSaver(byte[] bytes, File file)
{
if (bytes == null)
throw new System.ArgumentNullException("bytes");
if (file == null)
throw new System.ArgumentNullException("file");
mBytes = bytes;
mFile = file;
}
The major downside of this solution is the risk of putting a lot of pressure on the memory as you basically save the images in memory until they are processed, one after another.
Another solution consists in acquiring the image on the background thread instead.
public void OnImageAvailable(ImageReader reader)
{
// Again, I am not sure where you get the file instance but it is not important.
owner.mBackgroundHandler.Post(new ImageSaver(reader, file));
}
This solution is less intensive on the memory; but you might have to increase the maximum number of images from 2 to something higher depending on your needs. Again, the ImageSaver's constructor needs to be modified to accept an ImageReader as a parameter:
public ImageSaver(ImageReader imageReader, File file)
{
if (imageReader == null)
throw new System.ArgumentNullException("imageReader");
if (file == null)
throw new System.ArgumentNullException("file");
mImageReader = imageReader;
mFile = file;
}
Now the Run method would have the responsibility of acquiring and releasing the Image:
public void Run()
{
Image image = mImageReader.AcquireNextImage();
try
{
ByteBuffer buffer = image.GetPlanes()[0].Buffer;
byte[] bytes = new byte[buffer.Remaining()];
buffer.Get(bytes);
using (var output = new FileOutputStream(mFile))
{
try
{
output.Write(bytes);
}
catch (IOException e)
{
e.PrintStackTrace();
}
}
}
finally
{
image?.Close();
}
}
I have been facing this issue for a long time too. I tried implementing @kzrytof's solution, but it didn't help as much as expected. I did find a way to get onImageAvailable to execute only once.
Scenario: when an image is available, the onImageAvailable method is called, right?
So what I did was, after closing the image with image.close(), call imageReader.setOnImageAvailableListener() with a null listener. This way I stopped it from executing a second time.
I know your question is about Xamarin and my code below is native Android Java, but the methods and functionality are the same, so give it a try:
@Override
public void onImageAvailable(ImageReader reader) {
final Image image=imageReader.acquireLatestImage();
try {
if (image != null) {
Image.Plane[] planes = image.getPlanes();
ByteBuffer buffer = planes[0].getBuffer();
int pixelStride = planes[0].getPixelStride();
int rowStride = planes[0].getRowStride();
int rowPadding = rowStride - pixelStride * width;
int bitmapWidth = width + rowPadding / pixelStride;
if (latestBitmap == null ||
latestBitmap.getWidth() != bitmapWidth ||
latestBitmap.getHeight() != height) {
if (latestBitmap != null) {
latestBitmap.recycle();
}
}
latestBitmap.copyPixelsFromBuffer(buffer);
}
}
catch(Exception e){
}
finally{
// acquireLatestImage() may return null, so guard the close.
if (image != null) {
image.close();
}
imageReader.setOnImageAvailableListener(null, svc.getHandler());
}
// next steps to save the image
}

How to speed up time when using Java Mail to save attachments?

I separate Message msg into Multipart multi1 = (Multipart) msg.getContent().
And a mail attachment is in one BodyPart, Part part = multi1.getBodyPart(i);
Then I want to save the attachment.
private void saveFile(String fileName, InputStream in) throws IOException {
File file = new File(fileName);
if (!file.exists()) {
OutputStream out = null;
try {
out = new BufferedOutputStream(new FileOutputStream(file));
in = new BufferedInputStream(in);
byte[] buf = new byte[BUFFSIZE];
int len;
while ((len = in.read(buf)) > 0) {
out.write(buf, 0, len);
}
} catch (FileNotFoundException e) {
LOG.error(e.toString());
} finally {
// close streams
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
But reading the IO stream takes too much time. For example, a 2.7M file needs almost 160 seconds to save to disk. I have already tried Channel and some other IO streams, but nothing changed. Is there a faster way to save attachments using Java Mail?
For more code information https://github.com/cainzhong/java-mail-demo/blob/master/src/main/java/com/java/mail/impl/ReceiveMailImpl.java
Actually, mail.imaps.partialfetch does take effect and speeds things up a lot. There was a mistake in my previous code. The correct settings are:
props.put("mail.imap.partialfetch","false");
props.put("mail.imap.fetchsize", "1048576");
props.put("mail.imaps.partialfetch", "false");
props.put("mail.imaps.fetchsize", "1048576");
instead of
props.put("mail.imap.partialfetch",false);
props.put("mail.imap.fetchsize", "1048576");
props.put("mail.imaps.partialfetch", false);
props.put("mail.imaps.fetchsize", "1048576");
It is important to put quotation marks around "false" so that the value is a String. Otherwise the parameters will not take effect.
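One plausible explanation, assuming the library looks the setting up through Properties.getProperty: that method only returns values stored as String and yields null for anything else, so a Boolean stored with put() looks like a missing property. A small sketch with plain java.util.Properties (the class and method names here are illustrative only):

```java
import java.util.Properties;

final class MailPropsDemo {

    static final String KEY = "mail.imaps.partialfetch";

    // Stores the value as a Boolean object (autoboxed), as in the broken code.
    static String lookupAfterBooleanPut() {
        Properties props = new Properties();
        props.put(KEY, false);
        return props.getProperty(KEY); // non-String values are reported as null
    }

    // Stores the value as a String, as in the working code.
    static String lookupAfterStringPut() {
        Properties props = new Properties();
        props.put(KEY, "false");
        return props.getProperty(KEY);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterBooleanPut()); // null
        System.out.println(lookupAfterStringPut());  // false
    }
}
```

Using setProperty(key, value) instead of put() makes this mistake a compile-time error, because it only accepts String arguments.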
Anyway, thanks to Bill Shannon.
There are two key parts to this operation: reading the data from your mail server and writing the data to your filesystem. Most likely it's the speed of the server and the network connection to the server that's controlling the overall speed of the operation. You can try setting the mail.imap.fetchsize and mail.imap.partialfetch properties to see if that improves performance.
You can also try using something like NullOutputStream instead of FileOutputStream to measure only the speed of reading the data.
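A rough sketch of that measurement, assuming a hand-written discarding stream in place of NullOutputStream and a ByteArrayInputStream standing in for the attachment's InputStream (all names below are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class ReadTimingDemo {

    // Throws away everything written to it, so only the read side costs time.
    static final class DiscardingOutputStream extends OutputStream {
        @Override
        public void write(int b) {
        }

        @Override
        public void write(byte[] b, int off, int len) {
        }
    }

    // Copies the whole stream and returns the number of bytes read.
    static long drain(InputStream in, OutputStream out) {
        byte[] buf = new byte[8192];
        long total = 0;
        try {
            int len;
            while ((len = in.read(buf)) != -1) {
                out.write(buf, 0, len);
                total += len;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Replace this stand-in with part.getInputStream() to time the real download.
        InputStream in = new ByteArrayInputStream(new byte[2_700_000]);
        long start = System.nanoTime();
        long bytes = drain(in, new DiscardingOutputStream());
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bytes + " bytes read in " + millis + " ms");
    }
}
```

If this takes about as long as saving to disk, the time is being spent on the server or network side rather than in the file writing.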

Receive partial file(sometimes) when reading from Google Storage using HTTP Response

I am trying to read files from Google Storage and write them to files in our filesystem (HDFS). When I run this over a period of time (say 7 days), sometimes I get the full file, with the lines matching what is on the source, and sometimes I get a partial file (the discrepancy is quite large). Below is the method that takes a response and writes it to a file.
Any help or suggestions as to how i can troubleshoot this further would be much appreciated.
Thanks,
Before calling this method i do a simple check on the response status code -
if(response.getStatusCode() == 200 &&
StringUtils.equals(response.getContentType(), "application/zip")) {
writeHdfsFile(response, path);
}
private void writeHdfsFile(HttpResponse response, String path) throws IOException {
final GZIPInputStream inputStream = new GZIPInputStream(response.getContent());
Path filePath = new Path(path);
final FSDataOutputStream outputStream = fileSystem.create(filePath, true);
final byte[] buffer = new byte[1024];
int length;
try {
while((length = inputStream.read(buffer)) > 0) {
outputStream.write(buffer, 0, length);
}
outputStream.flush();
} finally {
inputStream.close();
outputStream.close();
}
}
We solved it by downloading the file first and then unzipping and writing it. Splitting the work into two steps fixed the issue, in case someone else runs into the same problem.
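The two-step approach might look roughly like the sketch below. It is not the code we used: it writes to a local Path instead of HDFS and takes a plain InputStream instead of the HTTP response, and all names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class TwoStepDownloadDemo {

    // Step 1: store the raw, still compressed body in a temporary file.
    // Step 2: decompress from that file into the target.
    // Returns the number of decompressed bytes written.
    static long downloadThenGunzip(InputStream body, Path target) {
        try {
            Path tmp = Files.createTempFile("download", ".gz");
            try {
                Files.copy(body, tmp, StandardCopyOption.REPLACE_EXISTING);
                try (InputStream gz = new GZIPInputStream(Files.newInputStream(tmp))) {
                    return Files.copy(gz, target, StandardCopyOption.REPLACE_EXISTING);
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: gzip a byte array in memory to simulate a response body.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Demo helper: compress, run both steps, and read the result back.
    static byte[] roundTrip(byte[] data) {
        try {
            Path target = Files.createTempFile("result", ".txt");
            try {
                downloadThenGunzip(new ByteArrayInputStream(gzip(data)), target);
                return Files.readAllBytes(target);
            } finally {
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A side benefit of this split is that the size of the temporary file can be checked before decompressing, which makes a truncated download easier to spot.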

Cannot read two consecutive files with a Windows Service using StreamReader object

I need to be able to read lines of a file with a StreamReader processed by a FileSystemWatcher in a Windows service.
I've read and tried everything that made sense online, but it still doesn't work. When I'm attached to my Windows service process (local machine using Visual Studio 2010), the whole thing works flawlessly!
When I try to run it (on my local machine) without attaching to it and debugging it, the second file never makes it through and I get the following msg:
"The process cannot access the file 'C:\Projects\Data\VendingStats\20121213_AZM_Journey_MIS.txt' because it is being used by another process." I do not have this file open anywhere else on my machine. It is just sitting in a directory. I then copy it in a directory and the FSW takes over (and the code below).
Can someone please tell me what I need to do to get this to work? I don't know why it works fine when I'm attached to the process and debugging it, but doesn't work when I send the files through without being attached. I feel it's definitely something on my local box that I need to disable, but I don't know what.
I noticed that the error occurs even before it gets into the "using" statement, because the second file is never copied to the temp directory for it to be processed.
I noticed in my StackTrace, I'm getting the following error:
system.io.__error.winioerror(int32 errorcode string maybefullpath)
Here is my code:
protected override void OnStart(string[] args)
{
FileSystemWatcher Watcher = new FileSystemWatcher(@"C:\Projects\Data\VendingStats");
Watcher.EnableRaisingEvents = true;
Watcher.Created += new FileSystemEventHandler(Watcher_Created);
Watcher.Filter = "*.txt";
Watcher.IncludeSubdirectories = false;
}
private void Watcher_Created(object sender, FileSystemEventArgs e)
{
try
{
string targetPath = @"C:\Temp\VendorStats";
// Use Path class to manipulate file and directory paths.
FileInfo fi = new FileInfo(e.FullPath); // full name of path & file in the FSW directory
string destFile = Path.Combine(targetPath, fi.Name);
// To copy a folder's contents to a new location:
// Create a new target folder, if necessary.
if (!Directory.Exists(targetPath))
Directory.CreateDirectory(targetPath);
// Copy the file to another location, overwriting the destination file if it already exists.
File.Copy(e.FullPath, destFile, true);
// Set attribute to READONLY
if (fi.IsReadOnly == false)
fi.Attributes = FileAttributes.ReadOnly;
GetCruiseLineShipName(destFile, ref cruiseLine, ref shipName);
using (StreamReader sr = new StreamReader(File.Open(destFile, FileMode.Open, FileAccess.Read, FileShare.Read)))
{
filename = e.FullPath;
//How many lines should be loaded?
int NumberOfLines = 39;
//Read the number of lines and put them in the array
for (int i = 1; i < NumberOfLines; i++)
{
ListLines[i] = sr.ReadLine();
switch (i)
{
case 3:
int idx = ListLines[i].IndexOf(":");
string timeLine = ListLines[i].Substring(idx + 1);
dt = GetDate(Convert.ToDateTime(timeLine.Substring(1)));
break;
//more code here of the same
}
}
//InsertData into database
}
}
catch (Exception ex)
{
EventLog.WriteEntry("VendorStats", "Error in the Main:" + "\r\n\r\n" + ex.Message + "\r\n\r\n" + ex.InnerException);
return;
}
}
In the end, the fix was to put the method (the one invoked by the FileSystemWatcher) to sleep for "X" seconds until Windows completely releases the resources for the previous and present files as well as the folder.
It was the FileSystemWatcher that actually had a hold on the resources.
Here is some sample code:
private static void Watcher_Created(object sender, FileSystemEventArgs e)
{
try
{
Thread.Sleep(10000);
GetCruiseLineShipName(e.FullPath, ref cruiseLine, ref shipName);
using (StreamReader sr = new StreamReader(File.Open(e.FullPath, FileMode.Open, FileAccess.Read, FileShare.Read)))
{
