Is there any way I can write a file from my Java jar to an S3 folder where my reduce output files are written? I have tried something like:
FileSystem fs = FileSystem.get(conf);
FSDataOutputStream out = fs.create(new Path("S3 folder output path" + "/Result.txt"));
PrintWriter writer = new PrintWriter(out);
writer.write(averageDelay.toString());
writer.close();
out.close();
Here Result.txt is the new file that I want to write.
Answering my own question:
I found my mistake. I should be passing the URI of the S3 folder path to the FileSystem object, like below:
FileSystem fileSystem = FileSystem.get(URI.create(otherArgs[1]), conf);
FSDataOutputStream fsDataOutputStream = fileSystem.create(new Path(otherArgs[1] + "/Result.txt"));
PrintWriter writer = new PrintWriter(fsDataOutputStream);
writer.write("\n Average Delay:" + averageDelay);
writer.close();
fsDataOutputStream.close();
FileSystem fileSystem = FileSystem.get(URI.create(otherArgs[1]), new JobConf(<Your_Class_Name_here>.class));
FSDataOutputStream fsDataOutputStream = fileSystem.create(new Path(otherArgs[1] + "/Result.txt"));
PrintWriter writer = new PrintWriter(fsDataOutputStream);
writer.write("\n Average Delay:" + averageDelay);
writer.close();
fsDataOutputStream.close();
The second block shows how I handled the conf variable when one was not already available (by building a JobConf from the job class), and it worked like a charm.
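Note that the output argument must be a full filesystem URI so that FileSystem.get can pick the S3 implementation rather than the default HDFS one. A hypothetical invocation (bucket name and paths are made up for illustration) might look like:
hadoop jar myjob.jar my.package.MainClass hdfs:///input s3n://my-bucket/output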
Here's another way to do it in Java, using the AWS S3 putObject call directly with a string buffer.
... AmazonS3 s3Client;

public void reduce(Text key, java.lang.Iterable<Text> values, Reducer<Text, Text, Text, Text>.Context context) throws Exception {
    UUID fileUUID = UUID.randomUUID();
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
    sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
    String fileName = String.format("nightly-dump/%s/%s-%s", sdf.format(new Date()), key, fileUUID);
    log.info("Filename = [{}]", fileName);
    // StringBuilder avoids the quadratic cost of repeated string concatenation.
    StringBuilder content = new StringBuilder();
    int count = 0;
    for (Text value : values) {
        count++;
        content.append(value.toString()).append("\n");
    }
    log.info("Count = {}, S3Lines = \n{}", count, content);
    PutObjectResult putObjectResult = s3Client.putObject(S3_BUCKETNAME, fileName, content.toString());
    log.info("Put versionId = {}", putObjectResult.getVersionId());
    reduceWriteContext("1", "1");
    context.setStatus("COMPLETED");
}
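For completeness, here is one way the s3Client field above might be constructed with the AWS SDK for Java v1; the region and credentials provider are illustrative assumptions, not part of the original answer:
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

// Hypothetical client setup; adjust region and credentials to your environment.
AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
        .withRegion(Regions.US_EAST_1) // assumed region
        .withCredentials(new DefaultAWSCredentialsProviderChain())
        .build();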
I get "System can't find the path specified." when I try to upload an image to project folder inside resources.
Here is my project structure:
|Project
  |src
    |main
      |resources
        |META-INF.resources
          |images
I have defined the path as:
String path = "\\resources\\images\\" + imageName; File file = new File(path );
try {
    InputStream is = event.getFile().getInputstream();
    OutputStream out = new FileOutputStream(path);
    byte buf[] = new byte[1024];
    int len;
    while ((len = is.read(buf)) > 0)
        out.write(buf, 0, len);
    is.close();
    out.close();
} catch (Exception e) {
    System.out.println(e);
}
What can be the exact path to the images folder under the META-INF.resources directory?
The following should work fine:
//App.java
String path = "\\META-INF.resources\\images\\" + imageName;
InputStream is = App.class.getClassLoader().getResourceAsStream(
path);
In a nutshell, use getClassLoader().getResourceAsStream() instead of FileInputStream or FileOutputStream. Classpath resource paths use forward slashes on every operating system, and resources packaged inside a jar are read-only.
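As a minimal usage sketch (the image name here is a made-up example), reading a bundled resource and guarding against a missing file could look like:
// Classpath resource paths always use forward slashes, on every OS.
InputStream is = App.class.getClassLoader()
        .getResourceAsStream("META-INF/resources/images/logo.png"); // hypothetical image
if (is == null) {
    throw new FileNotFoundException("image not found on classpath");
}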
The following worked for me.
In application.properties I defined my path string as:
image.path = src/main/resources/META-INF/resources/images/uploads/
Here is my uploadImg function in a class file.
@Autowired
private Environment environment;

public String uploadImg(FileUploadEvent event, String imagePrefix) {
    String path = environment.getProperty("image.path");
    SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmss");
    String name = fmt.format(new Date())
            + event.getFile().getFileName().substring(
                    event.getFile().getFileName().lastIndexOf('.'));
    name = String.format("%s%s", imagePrefix, name);
    String finalPath = String.format("%s%s", path, name);
    System.out.println("image is going to save @ " + finalPath);
    File file = new File(finalPath).getAbsoluteFile();
    try {
        InputStream is = event.getFile().getInputstream();
        OutputStream out = new FileOutputStream(file);
        byte buf[] = new byte[1024];
        int len;
        while ((len = is.read(buf)) > 0)
            out.write(buf, 0, len);
        is.close();
        out.close();
    } catch (Exception e) {
        System.out.println(e);
    }
    return name;
}
I am not sure whether it will work in production or not. Basically I get the absolute path of the project and then append my relatively defined path to it. Correct me if I am wrong.
I am using org.apache.commons.imaging to access the metadata of a JPEG file and write it to a text file. But instead of writing the metadata to the file, the program writes random characters to the text file. Can someone please help me resolve this issue? Following is the code:
//Method to access image metadata and write to text file
public void removeExifMetadata(final File jpegImageFile, final File dst)
        throws IOException, ImageReadException, ImageWriteException {
    OutputStream os = null;
    boolean canThrow = false;
    try {
        os = new FileOutputStream(dst);
        os = new BufferedOutputStream(os);
        new ExifRewriter().removeExifMetadata(jpegImageFile, os);
        canThrow = true;
    } finally {
        IoUtils.closeQuietly(canThrow, os);
        File metadata = new File("metadata.txt");
        if (!metadata.exists()) {
            metadata.createNewFile();
        }
        System.out.printf("in try block\n");
        FileOutputStream fos = new FileOutputStream(metadata);
        TeeOutputStream myOut = new TeeOutputStream(System.out, fos);
        PrintStream ps = new PrintStream(myOut);
        System.setOut(ps);
        System.out.printf("in final block\n");
    }
}
//call to removeExifMetadata from main
File imageFile = new File("toddler.jpg");
File out = new File("exif.txt");
e.removeExifMetadata(imageFile, out);
I'm trying to read from an HDFS file line by line, then create an HDFS file and write to it line by line. The code that I use looks like this:
Path FileToRead = new Path(inputPath);
FileSystem hdfs = FileToRead.getFileSystem(new Configuration());
FSDataInputStream fis = hdfs.open(FileToRead);
BufferedReader reader = new BufferedReader(new InputStreamReader(fis));

String line;
line = reader.readLine();
while (line != null) {
    String[] lineElem = line.split(",");
    for (int i = 0; i < 10; i++) {
        MyMatrix[i][Integer.valueOf(lineElem[0]) - 1] = Double.valueOf(lineElem[i + 1]);
    }
    line = reader.readLine();
}
reader.close();
fis.close();

Path FileToWrite = new Path(outputPath + "/V");
FileSystem fs = FileSystem.get(new Configuration());
FSDataOutputStream fileOut = fs.create(FileToWrite);
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(fileOut));
writer.write("check");
writer.close();
fileOut.close();
When I run this code, the file V is not created in my outputPath. But if I swap the reading part for the writing part, the file is created and "check" is written into it.
Can anyone please help me understand how to use these correctly, so I can read the whole file first and then write to a file line by line?
I have also tried another piece of code for reading from one file and writing to another, but the file is created and nothing is written into it!
I use something like this:
hadoop jar main.jar program2.Main input output
Then in my first job I read from args[0] and write to a file in args[1]+"/NewV" using map reduce classes, and it works.
In my other (non map reduce) class I use args[1]+"/NewV" as the input path and output+"/V_0" as the output path (I pass these strings to the constructor). Here is the code for the class:
public class Init_V {
    String inputPath, outputPath;

    public Init_V(String inputPath, String outputPath) throws Exception {
        this.inputPath = inputPath;
        this.outputPath = outputPath;
        try {
            FileSystem fs = FileSystem.get(new Configuration());
            Path FileToWrite = new Path(outputPath + "/V.txt");
            Path FileToRead = new Path(inputPath);
            BufferedWriter output = new BufferedWriter(new OutputStreamWriter(fs.create(FileToWrite, true)));
            BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(FileToRead)));

            String data;
            data = reader.readLine();
            while (data != null) {
                output.write(data);
                data = reader.readLine();
            }
            reader.close();
            output.close();
        } catch (Exception e) {
            // Note: this empty catch silently swallows any I/O error.
        }
    }
}
I think you need to understand how Hadoop works properly. In Hadoop, many things are done by the system; you just give the input and output paths, and they are opened and created by Hadoop if the paths are valid. Check the following example:
public int run(String[] args) throws Exception {
    if (args.length != 2) {
        System.err.println("Usage: MapReduce <input path> <output path>");
        ToolRunner.printGenericCommandUsage(System.err);
        return -1; // bail out on wrong usage
    }
    Job job = new Job();
    job.setJarByClass(MyClass.class);
    job.setNumReduceTasks(5);
    job.setJobName("myclass");
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.setMapperClass(MyMapper.class);
    job.setReducerClass(MyReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    return job.waitForCompletion(true) ? 0 : 1;
}

/* ----------------------main---------------------*/
public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new MyClass(), args);
    System.exit(exitCode);
}
As you see here, you only initialize the necessary variables; reading and writing are done by Hadoop.
Also, in your Mapper class you call context.write(key, value) inside map, and you do the same in your Reducer class; it writes for you.
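To make that concrete, a minimal Mapper matching the job configuration above might look like this (the class is illustrative, not from the original answer):
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public static class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Hadoop does the file I/O for you; you only emit key/value pairs.
        context.write(new Text("line"), value);
    }
}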
If you use a plain BufferedWriter/BufferedReader, it will write to your local file system, not to HDFS. To see files in HDFS you should run hadoop fs -ls <path>; the files you see with the ordinary ls command are on your local file system.
EDIT: In order to read/write yourself, you should know the following: say you have N machines in your Hadoop cluster. When you want to read, you will not know which mapper is doing the reading, and similarly for writing. So all mappers and reducers should have access to those paths, to avoid exceptions.
I don't know if you can use any other class, but there are two methods you can use for this specific purpose: setup and cleanup. These methods are called only once per map or reduce worker, so you can open and close your files there. Reading and writing works the same as in normal Java code. For example, if you want to record something for each key and write it to a text file, you can do the following:
//in reducer
BufferedWriter bw;

void setup(...) {
    bw = new ...; // open the file once per reduce worker
}

void reduce(...) {
    while (iter.hasNext()) {
        ...;
    }
    bw.write(key + ...); // plain Java I/O, once per key
}

void cleanup(...) {
    bw.close(); // close once, after all keys have been processed
}
I have a requirement to show PDF files in a browser. I use Spring MVC. Is there a way I can do this without using AbstractPdfView? I do not want to render the PDF at runtime; all the PDF files are stored on my web server.
This is the code I am using, but it downloads the file directly instead of displaying it in the browser.
@RequestMapping(value = "/download", method = RequestMethod.GET)
public void doDownload(HttpServletRequest request,
        HttpServletResponse response) throws IOException {
    // get absolute path of the application
    ServletContext context = request.getSession().getServletContext();
    String appPath = context.getRealPath("");
    String filename = request.getParameter("filename");
    filePath = getDownloadFilePath(lessonName);
    // construct the complete absolute path of the file
    String fullPath = appPath + filePath;
    File downloadFile = new File(fullPath);
    FileInputStream inputStream = new FileInputStream(downloadFile);
    // get MIME type of the file
    String mimeType = context.getMimeType(fullPath);
    if (mimeType == null) {
        // fall back to PDF if no MIME mapping is found
        mimeType = "application/pdf";
    }
    System.out.println("MIME type: " + mimeType);
    response.addHeader("Content-Disposition", "attachment;filename=report.pdf");
    response.setContentType("application/pdf");
    // get output stream of the response
    OutputStream outStream = response.getOutputStream();
    byte[] buffer = new byte[BUFFER_SIZE];
    int bytesRead = -1;
    // write bytes read from the input stream into the output stream
    while ((bytesRead = inputStream.read(buffer)) != -1) {
        outStream.write(buffer, 0, bytesRead);
    }
    inputStream.close();
    outStream.close();
}
Remove the line
response.addHeader("Content-Disposition", "attachment;filename=report.pdf");
This line precisely tells the browser to display a download/save dialog rather than displaying the PDF directly.
Oh, and make sure to close the input stream in a finally block.
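If you want in-browser display but still want to suggest a filename, a hedged alternative (the header value here is an assumption, not from the original answer) is an inline disposition, combined with the finally block mentioned above:
response.setContentType("application/pdf");
// "inline" asks the browser to render the PDF itself; "attachment" forces a download.
response.setHeader("Content-Disposition", "inline; filename=report.pdf");
try {
    // ... copy the file to response.getOutputStream() as in the question ...
} finally {
    inputStream.close(); // always release the file handle
}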
I want to upload a file to my project folder. My code is as follows:
protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    File savedFile;
    String destination;
    List<FileItem> items = null;
    try {
        items = new ServletFileUpload(new DiskFileItemFactory()).parseRequest(request);
    } catch (FileUploadException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    for (FileItem item : items) {
        if (item.isFormField()) {
            // Process regular form field (input type="text|radio|checkbox|etc", select, etc).
        } else {
            // Process form file field (input type="file").
            String fieldName = item.getFieldName();
            String fileName = FilenameUtils.getName(item.getName());
            InputStream fileContent = item.getInputStream();
            String userName = (String) session.getAttribute("newUser");
            destination = getServletConfig().getServletContext().getContextPath() + "\\" + userName + ".jpeg";
            savedFile = new File(destination);
            // Check if file exists
            if (!savedFile.exists())
                savedFile.createNewFile();
            BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(savedFile));
            byte[] buffer = new byte[1024];
            int len;
            // Read from file and write to new file destination
            while ((len = fileContent.read(buffer)) >= 0) {
                bos.write(buffer, 0, len);
            }
            // Closing the streams
            fileContent.close();
            bos.close();
        }
    }
}
When I run the JSP file, browse to and select the required image, and submit the form, the servlet runs but throws an IOException. The exception is thrown by the line where I create the new file with savedFile.createNewFile(). Before I used that code, it threw a FileNotFoundException instead. I am not sure whether the path I have provided is correct.
Try using the getRealPath() method:
String fileName="/" + userName + ".jpeg";
destination = getServletContext().getRealPath(fileName);
savedFile = new File(destination);
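Putting it together, a minimal sketch of the write step (reusing the question's variables; error handling kept short) might be:
String destination = getServletContext().getRealPath("/" + userName + ".jpeg");
File savedFile = new File(destination);
try (InputStream in = item.getInputStream();
        OutputStream out = new BufferedOutputStream(new FileOutputStream(savedFile))) {
    byte[] buffer = new byte[1024];
    int len;
    while ((len = in.read(buffer)) > 0) {
        out.write(buffer, 0, len);
    }
}
Keep in mind that getRealPath() resolves against the deployed (exploded) webapp directory, so files written there may be lost on redeploy.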