Hadoop: How to save Map object in configuration

Any idea how I can set a Map object into org.apache.hadoop.conf.Configuration?

Serialize your map into JSON and then put it as a string in your configuration.
There is no way to put a whole object into it, because the whole configuration will be written out as an XML file.
GSON is quite good at it: http://code.google.com/p/google-gson/
Here is the tutorial about how to serialize collections: http://sites.google.com/site/gson/gson-user-guide#TOC-Collections-Examples
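For example, here is a minimal sketch of the round trip ("my.map" is a hypothetical key name chosen for illustration):

import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;

Gson gson = new Gson();
Map<String, String> map = new HashMap<String, String>();
map.put("someKey", "someValue");

// store the map as a JSON string in the configuration
Configuration conf = new Configuration();
conf.set("my.map", gson.toJson(map));

// later, e.g. inside a mapper or reducer, read it back
Map<String, String> restored = gson.fromJson(
        conf.get("my.map"),
        new TypeToken<Map<String, String>>() {}.getType());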

Related

Assign ArrayList from the data in properties file

This is my properties file:
REDCA_IF_00001=com.sds.redca.biz.svc.RedCAIF00001SVC
REDCA_IF_00002=com.sds.redca.biz.svc.RedCAIF00002SVC
REDCA_IF_00003=com.sds.redca.biz.svc.RedCAIF00003SVC
REDCA_IF_00004=com.sds.redca.biz.svc.RedCAIF00004SVC
and I want to load these values into a HashMap in my Spring context file.
How can I achieve this?
Does it have to be a HashMap, or would any kind of Map be fine?
Because you can define it as a java.util.Properties instance (Spring has great support for loading properties), and Properties already implements Map (it actually extends Hashtable).
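For example, a minimal sketch using Spring's PropertiesLoaderUtils (assuming your file is on the classpath as redca.properties, a hypothetical name):

import java.util.Properties;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.support.PropertiesLoaderUtils;

// Properties extends Hashtable<Object, Object>, so it can be used
// wherever a Map is expected
Properties svcMappings = PropertiesLoaderUtils.loadProperties(
        new ClassPathResource("redca.properties"));
String svcClass = svcMappings.getProperty("REDCA_IF_00001");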

Convert properties file to HashMap in Spring

I need to load all of the content of my properties file which looks something like this:
some.properties
key.1=this item needs to be loaded onto a hashmap
key.2=this item also needs to be loaded onto a hashmap
key.3=this item also needs to be loaded onto a hashmap
key.4=this item also needs to be loaded onto a hashmap
I want to know a way in which I can load all of the content from my properties file into a HashMap. The value against each key is very lengthy, so I cannot structure my properties file as abc=aa,bb,cc and then load it into my Java class using the @Value annotation.
Also, I have around 40 keys in my properties file, so I'm trying to use this approach because I don't want to add a separate @Value annotation for every value in my Java class.
On my HashMap I will put in certain checks for which keys to load, then set those parameters one by one into my variables and pass them on for further processing.
I tried a lot of things to load the properties file and convert it into a HashMap through Spring. All I know now is that I can make use of a PropertyPlaceholderConfigurer, which can load the whole properties file. However, how do I access the content of the properties file in my Java class and convert it into a HashMap?
Any help will be highly appreciated.
Thanks!
Either I don't understand your question, or it is very simple:
Properties props = new Properties();
props.load(new FileInputStream([PROPERTIES_PATH]));
Map<String, String> map = new HashMap<String, String>();
for (String name : props.stringPropertyNames()) map.put(name, props.getProperty(name));
Let's start from here! Maybe I can help you more if you give more concrete info.

Hadoop - How to switch from implementing the writable interface to use an Avro object?

I’m using Hadoop to convert JSONs into CSV files to access them with Hive.
At the moment the mapper fills its own data structure, parsing the JSONs with JSON-Smart. Then the reducer reads that object out and writes it to a file, separated by commas.
To make this faster I already implemented the Writable interface on the data structure...
Now I want to use Avro for the data structure object to have more flexibility and performance. How could I change my classes to make them exchange an Avro object instead of a writable?
Hadoop offers a pluggable serialization mechanism via the SerializationFactory.
By default, Hadoop uses the WritableSerialization class to handle classes which implement the Writable interface, but you can register custom serializers by setting the Hadoop configuration property io.serializations to a comma-separated list of classes that implement the Serialization interface.
Avro has an implementation of the Serialization interface in the AvroSerialization class - so this would be the class you configure in the io.serializations property.
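For example, a hedged sketch of registering it (assuming the avro-mapred artifact is on the classpath, which is where org.apache.avro.hadoop.io.AvroSerialization lives):

import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// keep WritableSerialization so existing Writables still work,
// and add Avro's implementation of the Serialization interface
conf.setStrings("io.serializations",
        "org.apache.hadoop.io.serializer.WritableSerialization",
        "org.apache.avro.hadoop.io.AvroSerialization");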
Avro actually has a whole bunch of helper classes which help you write Map/Reduce jobs that use Avro as input/output - there are some examples in the source (Git copy).
I can't seem to find any good documentation for Avro & Map Reduce at the moment, but I'm sure there are some other good examples out there.
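As a rough sketch of what those helpers look like with the old mapred API (MyRecord, MyAvroMapper and MyAvroReducer are hypothetical generated/user classes):

import org.apache.avro.mapred.AvroJob;
import org.apache.hadoop.mapred.JobConf;

JobConf job = new JobConf();
// AvroJob wires up the Avro input/output formats and serialization
// for the job based on the schemas you give it
AvroJob.setInputSchema(job, MyRecord.getClassSchema());
AvroJob.setOutputSchema(job, MyRecord.getClassSchema());
AvroJob.setMapperClass(job, MyAvroMapper.class);
AvroJob.setReducerClass(job, MyAvroReducer.class);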

Is it possible to Serialize to XML using the same format of the XSD?

I have generated a Class from an XSD using XSD2Code.
I now need to deserialize a conformant XML file for this XSD into an object of this class.
I have tried a number of XML serializers, but they seem to use their own XML format, so I am unable to externally edit a conformant XML file for deserializing into an object.
Is it possible to deserialize into an object while maintaining the original format, i.e. one can generate an XML file which conforms to the XSD, not the serializer's specific XML format?
Many thanks in advance.
Ed
Sorted. I have found XSD2Code does this.

Jersey JSONConfiguration FEATURE_POJO_MAPPING - how to skip unwanted entries during deserialization

I'm defining POJOs for Facebook objects, which can be consumed by clients who don't have the capacity to parse JSON. Some of the FB objects' data structures are loosely defined, like work:
"work":[
{"employer":{"id":"xxxxxxxxx","name":"ABC"},
"location":{"id":"xxxxxxxxx","name":"Philadelphia, Pennsylvania"}
"position":{"id":"198376496853401","name":"Manager"}
"with":[{"id":"xxxxxxxxxxxx","name":"Dogbert Smith"}]}
]
My question is how to skip these objects while deserializing. I'm using
ClientConfig config = new DefaultClientConfig();
config.getFeatures().put(JSONConfiguration.FEATURE_POJO_MAPPING, Boolean.TRUE);
Is there a way I can customize what to deserialize?
Thanks for any pointers.
I found the solution: I added @JsonIgnoreProperties to the POJO for which I need to ignore some properties. Works great.
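For reference, a minimal sketch (Jersey 1.x's POJO mapping uses Jackson 1.x, hence the org.codehaus.jackson package; FacebookUser is a hypothetical POJO name):

import org.codehaus.jackson.annotate.JsonIgnoreProperties;

// ignoreUnknown = true makes Jackson silently skip any JSON fields
// this POJO does not declare; alternatively, list specific fields,
// e.g. @JsonIgnoreProperties({"work"})
@JsonIgnoreProperties(ignoreUnknown = true)
public class FacebookUser {
    private String id;
    private String name;
    // getters and setters omitted
}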
