I am using HBase 0.98.1-cdh5.1.3. I am trying to ingest a CSV file located in HDFS at /user/hdfs/exp into HBase. My file has data in the following format:
1,abc,xyz
2,def,uvw
3,ghi,rst
I am using the following command:
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv '-Dimporttsv.separator=,' -Dimporttsv.columns=HBASE_ROW_KEY,CF:firstname,CF:lastname tablename /user/hdfs/exp
I have also tried different argument orders, such as
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,CF:firstname,CF:lastname tablename /user/hdfs/exp '-Dimporttsv.separator=,'
and
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,CF:firstname,CF:lastname '-Dimporttsv.separator=,' tablename /user/hdfs/exp
but nothing works. The separator (a comma in my case) is not detected, so the lines are not parsed properly. Can anybody help me figure out where I am going wrong?
This is just one line of the data set:
10000064202896309897,1000006420,2896309897,10180,hdfs://btc5x015:8020/user/mr_test/logsJan/log_jan20_29/10180_log201501260000.log,3.2.3.1,9,2015-01-26,15:46:12.12,REF SHOULDER 4,n,n,SHOULDER,60,17.0,M,487093458,[study_16004_16004_],exam_16004_16004,[patient_16004_1_],Schulter std,SCHULTERGELENK RECHTS,8.10,NOT_EXIST,-8.1,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,n,NOT_EXIST,NOT_EXIST,y,n,HF,NOT_EXIST,NOT_EXIST,NOT_EXIST,,,NOT_EXIST,NOT_EXIST,N,NOT_EXIST,1,NOT_EXIST,IMAG,FFE,T1FFE,4.0,0.72,34,NOT_EXIST,NOT_EXIST,NOT_EXIST,4.0,cor,,NOT_EXIST,no,0,n,NOT_EXIST,1,0.0,,,,,,,NOT_EXIST,102,NOT_EXIST,NOT_EXIST,NOT_EXIST,15:45:29.28,15:46:12.12,15:46:12.12,9.5,9.4,1.002,NOT_EXIST,TRUE,no,NOT_EXIST,0.0,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,0.0,NOT_EXIST,NOT_EXIST,0.0,0.09,0.3,3.3,0,false,false,n,n,26.1,LT,0.33,0.03,NOT_EXIST,null,1,Dres.GrafKernHausmann,E:\Export\DataMonitoring\p_i_20150126_154530.frame,hdfs://btc5x015.code1.emi.philips.com:8020/user/mr_test/logsJan/log_jan20_29/10180_log201501260000.log,317774,883,0,0,1,8,6,2,0,0,0,0,0,0,6014,0,15:44:08.15,15:44:59.93,15:45:23.14,15:45:29.28,00:00:00.00,00:00:00.00,15:45:30.57,15:45:38.45,15:45:29.28,15:46:12.12,15:45:38.45,15:46:12.12,42984,33967,0,7988,6014,00:00:00.00,00:00:00.00,0,00:00:00.00,00:00:00.00,0,169,102,SENSE-SHOULDER8,,SENSE-SHOULDER8/(19) BODY-QUAD,190,190,CLINICAL,0,0,94166709,Radiologische Gemeinschaftspraxis,Dr. med. Michael Graf,Dr. med. Andreas Kern,Dr. med. Hausmann,Wetzlar,35578,Hausertorstr. 
47,6,NOT_EXIST,1,NOT_EXIST,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,1,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,SHORTEST,NOT_EXIST,SHORTEST,1,NO,YES,DEFAULT,FFE,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,3,NOT_EXIST,CARTESIAN,YES,NO,NOT_EXIST,NO,NOT_EXIST,low,FULL,NOT_EXIST,NOT_EXIST,NOT_EXIST,3D,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,YES,NOT_EXIST,no,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,USER_DEF,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,DEF,H,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,MPU_MTC_MODE_NO,NO,NOT_EXIST,T1,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,450,405,YES,Supine,HF,NOT_EXIST,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,DEFAULT,NOT_EXIST,2,SENSE-SHOULDER8 BODY-QUAD,F,400 400,,,100 100,5.625,7.23214293,4,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,NO,NOT_EXIST,NOT_EXIST,80,PARALLEL,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NO,NOT_EXIST,0,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,OFF,NOT_EXIST,NO,NO,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,NOT_EXIST,0,5.625,NOT_EXIST,NOT_EXIST,NOT_EXIST,10180,15:45:29.29,15:46:13.76,null,,,96,,1,,,0,10180,PATTERN_SRN,SHOULDER,SCHULTER,MATCHED_SHOULDER,SHOULDER,UPPER EXTREMITIES,ANATOMY_GROUP_MAPPED,10180,10180,10180,3.2.3,Achieva 3.0T,Achieva,T30,3.0T,NO,F2000,Watercooled2,274-D,Master,NONE,,,0,16,0,1,S26_128,NONE,8,null,CDAS,LOGFOLDER_SYSFOLDER_MATCHED_RELEASE_NOT_CHECKED,FALSE,null,null,null,null,12.15,SENSE-SHOULDER8/(19) Q-BODY,0,SCAN_PARSE_SUCCESS,SHOULDER,-4.64757729 5.37445641,
I just loaded the single line given in the question into an HBase table with the ImportTsv command by providing 414 columns, and it worked perfectly for me. Here is the command that I used:
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,CF:c1,CF:c2,CF:c3,CF:c4,CF:c5,CF:c6,CF:c7,CF:c8,CF:c9,CF:c10,CF:c11,CF:c12,CF:c13,CF:c14,CF:c15,CF:c16,CF:c17,CF:c18,CF:c19,CF:c20,CF:c21,CF:c22,CF:c23,CF:c24,CF:c25,CF:c26,CF:c27,CF:c28,CF:c29,CF:c30,CF:c31,CF:c32,CF:c33,CF:c34,CF:c35,CF:c36,CF:c37,CF:c38,CF:c39,CF:c40,CF:c41,CF:c42,CF:c43,CF:c44,CF:c45,CF:c46,CF:c47,CF:c48,CF:c49,CF:c50,CF:c51,CF:c52,CF:c53,CF:c54,CF:c55,CF:c56,CF:c57,CF:c58,CF:c59,CF:c60,CF:c61,CF:c62,CF:c63,CF:c64,CF:c65,CF:c66,CF:c67,CF:c68,CF:c69,CF:c70,CF:c71,CF:c72,CF:c73,CF:c74,CF:c75,CF:c76,CF:c77,CF:c78,CF:c79,CF:c80,CF:c81,CF:c82,CF:c83,CF:c84,CF:c85,CF:c86,CF:c87,CF:c88,CF:c89,CF:c90,CF:c91,CF:c92,CF:c93,CF:c94,CF:c95,CF:c96,CF:c97,CF:c98,CF:c99,CF:c100,CF:c101,CF:c102,CF:c103,CF:c104,CF:c105,CF:c106,CF:c107,CF:c108,CF:c109,CF:c110,CF:c111,CF:c112,CF:c113,CF:c114,CF:c115,CF:c116,CF:c117,CF:c118,CF:c119,CF:c120,CF:c121,CF:c122,CF:c123,CF:c124,CF:c125,CF:c126,CF:c127,CF:c128,CF:c129,CF:c130,CF:c131,CF:c132,CF:c133,CF:c134,CF:c135,CF:c136,CF:c137,CF:c138,CF:c139,CF:c140,CF:c141,CF:c142,CF:c143,CF:c144,CF:c145,CF:c146,CF:c147,CF:c148,CF:c149,CF:c150,CF:c151,CF:c152,CF:c153,CF:c154,CF:c155,CF:c156,CF:c157,CF:c158,CF:c159,CF:c160,CF:c161,CF:c162,CF:c163,CF:c164,CF:c165,CF:c166,CF:c167,CF:c168,CF:c169,CF:c170,CF:c171,CF:c172,CF:c173,CF:c174,CF:c175,CF:c176,CF:c177,CF:c178,CF:c179,CF:c180,CF:c181,CF:c182,CF:c183,CF:c184,CF:c185,CF:c186,CF:c187,CF:c188,CF:c189,CF:c190,CF:c191,CF:c192,CF:c193,CF:c194,CF:c195,CF:c196,CF:c197,CF:c198,CF:c199,CF:c200,CF:c201,CF:c202,CF:c203,CF:c204,CF:c205,CF:c206,CF:c207,CF:c208,CF:c209,CF:c210,CF:c211,CF:c212,CF:c213,CF:c214,CF:c215,CF:c216,CF:c217,CF:c218,CF:c219,CF:c220,CF:c221,CF:c222,CF:c223,CF:c224,CF:c225,CF:c226,CF:c227,CF:c228,CF:c229,CF:c230,CF:c231,CF:c232,CF:c233,CF:c234,CF:c235,CF:c236,CF:c237,CF:c238,CF:c239,CF:c240,CF:c241,CF:c242,CF:c243,CF:c244,CF:c245,CF:c246,CF:c247,CF:c248,CF:c249,CF:c250,CF:c251,CF:c252,CF:c253,
CF:c254,CF:c255,CF:c256,CF:c257,CF:c258,CF:c259,CF:c260,CF:c261,CF:c262,CF:c263,CF:c264,CF:c265,CF:c266,CF:c267,CF:c268,CF:c269,CF:c270,CF:c271,CF:c272,CF:c273,CF:c274,CF:c275,CF:c276,CF:c277,CF:c278,CF:c279,CF:c280,CF:c281,CF:c282,CF:c283,CF:c284,CF:c285,CF:c286,CF:c287,CF:c288,CF:c289,CF:c290,CF:c291,CF:c292,CF:c293,CF:c294,CF:c295,CF:c296,CF:c297,CF:c298,CF:c299,CF:c300,CF:c301,CF:c302,CF:c303,CF:c304,CF:c305,CF:c306,CF:c307,CF:c308,CF:c309,CF:c310,CF:c311,CF:c312,CF:c313,CF:c314,CF:c315,CF:c316,CF:c317,CF:c318,CF:c319,CF:c320,CF:c321,CF:c322,CF:c323,CF:c324,CF:c325,CF:c326,CF:c327,CF:c328,CF:c329,CF:c330,CF:c331,CF:c332,CF:c333,CF:c334,CF:c335,CF:c336,CF:c337,CF:c338,CF:c339,CF:c340,CF:c341,CF:c342,CF:c343,CF:c344,CF:c345,CF:c346,CF:c347,CF:c348,CF:c349,CF:c350,CF:c351,CF:c352,CF:c353,CF:c354,CF:c355,CF:c356,CF:c357,CF:c358,CF:c359,CF:c360,CF:c361,CF:c362,CF:c363,CF:c364,CF:c365,CF:c366,CF:c367,CF:c368,CF:c369,CF:c370,CF:c371,CF:c372,CF:c373,CF:c374,CF:c375,CF:c376,CF:c377,CF:c378,CF:c379,CF:c380,CF:c381,CF:c382,CF:c383,CF:c384,CF:c385,CF:c386,CF:c387,CF:c388,CF:c389,CF:c390,CF:c391,CF:c392,CF:c393,CF:c394,CF:c395,CF:c396,CF:c397,CF:c398,CF:c399,CF:c400,CF:c401,CF:c402,CF:c403,CF:c404,CF:c405,CF:c406,CF:c407,CF:c408,CF:c409,CF:c410,CF:c411,CF:c412,CF:c413,CF:c414 '-Dimporttsv.separator=,' tablename /user/hdfs/exp
I have used arbitrary column names; you can change them as needed.
Note: Make sure that the number of columns you pass in the command matches your input data. I got a Bad Line error myself when I passed 412 columns instead of 414.
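A quick way to check the note above before running the import is to compare the number of entries in -Dimporttsv.columns with the number of fields on each input line. The following is only a rough sketch: the function names column_count and check_field_count are made up for illustration, it assumes a local sample of the data (for example one saved with hdfs dfs -cat /user/hdfs/exp | head -n 100 > sample.csv), and it counts fields the way awk does, which may not match ImportTsv's parser in every edge case.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of HBase) for sanity-checking input
# against the column list passed to ImportTsv.

# Print the number of comma-separated entries in an importtsv.columns value.
column_count() {
  printf '%s\n' "$1" | awk -F',' '{ print NF }'
}

# Print every line of a local file whose field count differs from the
# expected one. Exit status is 0 if all lines match, 1 otherwise.
#   $1 = local file, $2 = single-character separator, $3 = expected count
check_field_count() {
  local file=$1 sep=$2 expected=$3
  awk -F"$sep" -v want="$expected" '
    NF != want { printf "line %d: %d fields\n", NR, NF; bad = 1 }
    END        { exit bad + 0 }
  ' "$file"
}

# Example usage (sample.csv is an assumed local copy of part of the input):
#   cols='HBASE_ROW_KEY,CF:firstname,CF:lastname'
#   check_field_count sample.csv , "$(column_count "$cols")"
```

Any line it reports is worth inspecting by hand before starting the MapReduce job.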
Hope this will help.:)
It looks like the single quotes may be misplaced when specifying the separator. Try using -Dimporttsv.separator=',' instead of '-Dimporttsv.separator=,'
If any column value in the input file contains the field delimiter (a comma in this case), the import will fail. It is better to use a different delimiter (such as |) when preparing the CSV file.
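Before settling on an alternative delimiter, it may be worth confirming that it never occurs in the data. Here is a minimal sketch, assuming a local sample of the file; delimiter_is_free is a hypothetical helper name, not an HBase tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) only if the candidate delimiter
# does not occur anywhere in the given local file.
#   $1 = local file, $2 = candidate delimiter
delimiter_is_free() {
  local file=$1 sep=$2
  ! grep -F -q -- "$sep" "$file"
}

# Example usage (sample.csv is an assumed local copy of part of the input):
#   if delimiter_is_free sample.csv '|'; then
#     echo "'|' does not appear in the sample"
#   fi
#
# Once the file has been regenerated with | as the delimiter, the import
# from the question would change only in the separator argument:
#   hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
#     '-Dimporttsv.separator=|' \
#     -Dimporttsv.columns=HBASE_ROW_KEY,CF:firstname,CF:lastname \
#     tablename /user/hdfs/exp
```

The quotes around the separator argument matter here, because an unquoted | would be interpreted by the shell as a pipe.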