I have one hive table. I'm using JSON data for the hive table. When I select the whole table it works for me. If I select a particular column it prints null values.
The data looks like this
{"page_1":"{\"city\":\"Bangalore\",\"locality\":\"Battarahalli\",\"Name_of_Person\":\"xxx\",\"User_email_address\":\"test#gmail.com\",\"user_phone_number\":\"\",\"sub_locality\":\"\",\"street_name\":\"7th Cross Road, Near Reliance Fresh, T.c Palya,\",\"home_plot_no\":\"45\",\"pin_code\":\"560049\",\"project_society_build_name\":\"Sunshine Layout\",\"landmark_reference_1\":\"\",\"landmark_reference_2\":\"\",\"No_of_Schools\":20,\"No_of_Hospitals\":20,\"No_of_Metro\":0,\"No_of_Mall\":11,\"No_of_Park\":10,\"Distance_of_schools\":1.55,\"Distance_of_Hospitals\":2.29,\"Distance_of_Metro\":0,\"Distance_of_Mall\":1.55,\"Distance_of_Park\":2.01,\"lat\":13.0243273,\"lng\":77.7077906,\"ipinfo\":{\"ip\":\"113.193.30.130\",\"hostname\":\"No Hostname\",\"city\":\"\",\"region\":\"\",\"country\":\"IN\",\"loc\":\"20.0000,77.0000\",\"org\":\"AS45528 Tikona Digital Networks Pvt Ltd.\"}}","page_2":"{\"home_type\":\"Flat\",\"area\":\"1350\",\"beds\":\"3 BHK\",\"bath_rooms\":2,\"building_age\":\"1\",\"floors\":2,\"balcony\":2,\"amenities\":\"premium\",\"amenities_options\":{\"gated_security\":\"\",\"physical_security\":\"\",\"cctv_camera\":\"\",\"controll_access\":\"\",\"elevator\":true,\"power_back_up\":\"\",\"parking\":true,\"partial_parking\":\"\",\"onsite_maintenance_store\":\"\",\"open_garden\":\"\",\"party_lawn\":\"\",\"amenities_balcony\":\"\",\"club_house\":\"\",\"fitness_center\":\"\",\"swimming_pool\":\"\",\"party_hall\":\"\",\"tennis_court\":\"\",\"basket_ball_court\":\"\",\"squash_coutry\":\"\",\"amphi_theatre\":\"\",\"business_center\":\"\",\"jogging_track\":\"\",\"convinience_store\":\"\",\"guest_rooms\":\"\"},\"interior\":\"regular\",\"interior_options\":{\"tiles\":true,\"marble\":\"\",\"wooden\":\"\",\"modular_kitchen\":\"\",\"partial_modular_kitchen\":\"\",\"gas_pipe\":\"\",\"intercom_system\":\"\",\"air_conditioning\":\"\",\"partial_air_conditioning\":\"\",\"wardrobe\":\"\",\"sanitation_fixtures\":\"\",\"false_ceiling\":\"\",\"partial_false_ceiling\":\"\",\"recessed_lighting\":\"\"},\"location\":\"regular\",\"location_options\":{\"good_view\":true,\"transporation_hub\":true,\"shopping_center\":\"\",\"hospital\":\"\",\"school\":\"\",\"ample_parking\":\"\",\"park\":\"\",\"temple\":\"\",\"bank\":\"\",\"less_congestion\":\"\",\"less_pollution\":\"\"},\"maintenance\":\"\",\"maintenance_value\":\"\",\"near_by\":{\"school\":\"\",\"hospital\":\"\",\"mall\":\"\",\"park\":\"\",\"metro\":\"\",\"Near_by_school\":\"Little Champ Gurukulam Pre School \\\/ 1.52 km\",\"Near_by_hospital\":\"Suresh Hospital \\\/ 2.16 km\",\"Near_by_mall\":\"LORVEN LEO \\\/ 2.13 km\",\"Near_by_park\":\"SURYA ENCLAIVE \\\/ 2.09 km\"},\"city\":\"Bangalore\",\"locality\":\"Battarahalli\",\"token\":\"344bd4f0fab99b460873cfff6befb12f\"}"}
I created the table like this
CREATE EXTERNAL TABLE orc_test (json string)
LOCATION '/user/ec2-user/test_orc';
IF I use this query it works for me.
select * from orc_test;
If I try to select one column it prints null
select get_json_object(orc_test.json,'$.locality') as loc
from orc_test;
It prints
NULL NULL NULL
any help will be appreciated.
In your case, I think the back slashes in your data are causing the problem and also the quotes surrounding your page data. I have listed below the updated data, you could save it to a file and load it to the table, then your query should work.
{"page_1":{"city":"Bangalore","locality":"Battarahalli","Name_of_Person":"xxx","User_email_address":"test#gmail.com","user_phone_number":"","sub_locality":"","street_name":"7th Cross Road, Near Reliance Fresh, T.c Palya,","home_plot_no":"45","pin_code":"560049","project_society_build_name":"Sunshine Layout","landmark_reference_1":"","landmark_reference_2":"","No_of_Schools":20,"No_of_Hospitals":20,"No_of_Metro":0,"No_of_Mall":11,"No_of_Park":10,"Distance_of_schools":1.55,"Distance_of_Hospitals":2.29,"Distance_of_Metro":0,"Distance_of_Mall":1.55,"Distance_of_Park":2.01,"lat":13.0243273,"lng":77.7077906,"ipinfo":{"ip":"113.193.30.130","hostname":"No Hostname","city":"","region":"","country":"IN","loc":"20.0000,77.0000","org":"AS45528 Tikona Digital Networks Pvt Ltd."}},"page_2":{"home_type":"Flat","area":"1350","beds":"3 BHK","bath_rooms":2,"building_age":"1","floors":2,"balcony":2,"amenities":"premium","amenities_options":{"gated_security":"","physical_security":"","cctv_camera":"","controll_access":"","elevator":true,"power_back_up":"","parking":true,"partial_parking":"","onsite_maintenance_store":"","open_garden":"","party_lawn":"","amenities_balcony":"","club_house":"","fitness_center":"","swimming_pool":"","party_hall":"","tennis_court":"","basket_ball_court":"","squash_coutry":"","amphi_theatre":"","business_center":"","jogging_track":"","convinience_store":"","guest_rooms":""},"interior":"regular","interior_options":{"tiles":true,"marble":"","wooden":"","modular_kitchen":"","partial_modular_kitchen":"","gas_pipe":"","intercom_system":"","air_conditioning":"","partial_air_conditioning":"","wardrobe":"","sanitation_fixtures":"","false_ceiling":"","partial_false_ceiling":"","recessed_lighting":""},"location":"regular","location_options":{"good_view":true,"transporation_hub":true,"shopping_center":"","hospital":"","school":"","ample_parking":"","park":"","temple":"","bank":"","less_congestion":"","less_pollution":""},"maintenance":"","maintenance_value":"","near_by":{"school":"","hospital":"","mall":"","park":"","metro":"","Near_by_school":"Little Champ Gurukulam Pre School / 1.52 km","Near_by_hospital":"Suresh Hospital / 2.16 km","Near_by_mall":"LORVEN LEO / 2.13 km","Near_by_park":"SURYA ENCLAIVE / 2.09 km"},"city":"Bangalore","locality":"Battarahalli","token":"344bd4f0fab99b460873cfff6befb12f"}}
I tried this and it works for me.
hive> select get_json_object(orc_test.json,'$.page_1.locality') as loc from orc_test;
OK
Battarahalli
Time taken: 0.091 seconds, Fetched: 1 row(s)
hive> select get_json_object(orc_test.json,'$.page_1.city') as loc from orc_test;
OK
Bangalore
Time taken: 0.097 seconds, Fetched: 1 row(s)
hive> select get_json_object(orc_test.json,'$.page_2.home_type') as loc from orc_test;
OK
Flat
Time taken: 0.091 seconds, Fetched: 1 row(s)
It seems that you have not created table with many columns. only one column in hive table.
In hive the whole data of json value has been taken single value for a column. hence it shows null values for the columns.
use a JSON serde in order for Hive to map your JSON to the columns in your table.
Beside vmachan answer, which I think is right, the problem I encountered in similar situation was that json records weren't placed in separate lines. Also it didn't worked when it was an array. So, for example, this worked ok with Hive 3.1.0 using lateral view/json_tuple:
{"color":"black","category":"hue","type":"primary","code":{"rgba":[255,255,255,1],"hex":"#000"}}
{"color":"white","category":"value","code":{"rgba":[0,0,0,1],"hex":"#FFF"}}
{"color":"red","category":"hue","type":"primary","code":{"rgba":[255,0,0,1],"hex":"#FF0"}}
{"color":"blue","category":"hue","type":"primary","code":{"rgba":[0,0,255,1],"hex":"#00F"}}
{"color":"yellow","category":"hue","type":"primary","code":{"rgba":[255,255,0,1],"hex":"#FF0"}}
{"color":"green","category":"hue","type":"secondary","code":{"rgba":[0,255,0,1],"hex":"#0F0"}}
and that was NOT working well:
[{"color":"black","category":"hue","type":"primary","code":{"rgba":[255,255,255,1],"hex":"#000"}},
{"color":"white","category":"value","code":{"rgba":[0,0,0,1],"hex":"#FFF"}},
{"color":"red","category":"hue","type":"primary","code":{"rgba":[255,0,0,1],"hex":"#FF0"}},
{"color":"blue","category":"hue","type":"primary","code":{"rgba":[0,0,255,1],"hex":"#00F"}},
{"color":"yellow","category":"hue","type":"primary","code":{"rgba":[255,255,0,1],"hex":"#FF0"}},
{"color":"green","category":"hue","type":"secondary","code":{"rgba":[0,255,0,1],"hex":"#0F0"}}]