Facing issue with OpenCsv : Number of data fields does not match number of headers - spring-boot

I am using OpenCSV 4.2 in a Spring Boot project and am trying to parse a CSV file with one data row:
col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16,col17
"1234","VSHRT","TTRYE","PLRTY","1165","NOW","","Collection","store/WEZXB6Z2CC_1.jpg","500","ABC","false","0","[{""name"":""fdtty"",""id"":""242541"",""value"":10}]","400","ABC","dummycol"
There is no newline character after the last data column.
This is my function which returns an Iterator over the data:
public static <T> Iterator<T> csvToBeanIterator(String csv, Class<T> clazz) {
    CsvToBean<T> cb = new CsvToBeanBuilder<T>(new StringReader(csv))
            .withType(clazz)
            .withSeparator(',')
            .build();
    return cb.iterator();
}
I am getting the error:
Caused by: com.opencsv.exceptions.CsvRequiredFieldEmptyException: Number of data fields does not match number of headers.
at com.opencsv.bean.HeaderColumnNameMappingStrategy.verifyLineLength(HeaderColumnNameMappingStrategy.java:105) ~[opencsv-4.2.jar:?]
at com.opencsv.bean.AbstractMappingStrategy.populateNewBean(AbstractMappingStrategy.java:313) ~[opencsv-4.2.jar:?]
at com.opencsv.bean.concurrent.ProcessCsvLine.processLine(ProcessCsvLine.java:116) ~[opencsv-4.2.jar:?]
at com.opencsv.bean.concurrent.ProcessCsvLine.run(ProcessCsvLine.java:77) ~[opencsv-4.2.jar:?]
I have tried the suggestions from multiple posts on the internet, but no luck.
Could someone please point out the issue here?

I had the same issue and deleting some of the columns sorted it for me.
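For what it's worth, the target class is not shown in the question. Since the stack trace goes through HeaderColumnNameMappingStrategy, the bean is being bound by header names; a minimal sketch of such a bean (Record and its fields are hypothetical, not taken from the question) could look like this:

import com.opencsv.bean.CsvBindByName;

// Hypothetical target bean; the real class is not shown in the question.
public class Record {

    @CsvBindByName(column = "col1")
    private String col1;

    @CsvBindByName(column = "col14")
    private String col14;   // the embedded JSON arrives as a single quoted string

    // ... remaining columns omitted for brevity
}

// Usage with the method from the question:
// Iterator<Record> it = csvToBeanIterator(csv, Record.class);

The exception itself only says that a parsed data line produced more or fewer fields than the header row declared, so it is also worth checking the file for stray delimiters or unbalanced quotes.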

Related

Read Full Query parameter String in Spring Boot

Is there a way for me to read the entire query string in a GET API? Since there can be a variable number of parameters, I am already looking at using this:
public void createUser(@RequestParam(required = false) Map<String, String> qparams) {
}
But I want to read the entire query string as well.
The reason is that one of the parameters is an HMAC calculated over the entire string, and we are using that HMAC for cross-verification.
We have a deep integration with third-party software, and the issue is that the third party can change their API at any point in time.
Here's how you can do it.
#GetMapping("/test1")
void endpoint1(HttpServletRequest req) {
var qs = req.getQueryString() //returns the entire string
qs.split("&") //split to get the individual parameters
}
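Since the HMAC in the question is computed over the entire query string, the raw value from getQueryString() can be fed straight into a MAC. A minimal sketch, assuming HmacSHA256 and a shared secret (both assumptions, not from the question):

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public final class QueryStringHmac {

    // Recomputes an HMAC over the raw query string exactly as it was received.
    // HmacSHA256 and Base64 output are assumptions; use whatever algorithm and
    // encoding the third party actually specifies.
    static String hmacOf(String rawQueryString, byte[] secret) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret, "HmacSHA256"));
        byte[] digest = mac.doFinal(rawQueryString.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(digest);
    }
}

How the hmac parameter itself is excluded from, or appended to, the signed string is up to the third party's convention, so that part is left out here.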

Google Cloud Search - db.blobColumns

I'm trying to understand the db.blobColumns property in the database connector. I've got essentially a massive string of 500,000 characters and I want to use db.blobColumns to upload this text. Going by the name "blob", I assume it expects a binary large object? If anyone has used this property before for large text content, please help! I'm at a loss with this particular situation.
Here are the docs: https://developers.google.com/cloud-search/docs/guides/database-connector#content-fields
I have tried using the db.blobColumn field with database BLOB (binary) content, and it works well: it extracts text from the file and does OCR if it's an image. But yes, it also accepts text content in the form of the database's CLOB type.
I suggest you take a look at the code of the database connector here. The two main files that matter here are DatabaseAccess.java and DatabaseRepository.java.
private ByteArrayContent createBlobContent(Map<String, Object> allColumnValues) {
    byte[] bytes;
    Object value = allColumnValues.get(columnManager.getBlobColumn());
    if (value == null) {
        return null;
    } else if (value instanceof String) {
        bytes = ((String) value).getBytes(UTF_8);
    } else if (value instanceof byte[]) {
        bytes = (byte[]) value;
    } else {
        throw new InvalidConfigurationException( // allow SDK to send dashboard notification
            "Invalid Blob column type. Column: " + columnManager.getBlobColumn()
                + "; object type: " + value.getClass().getSimpleName());
    }
    return new ByteArrayContent(null, bytes);
}
The code snippet above, from the DatabaseRepository.java file, is responsible for generating the blob (binary) content that is pushed to Cloud Search. The content of CLOB and BLOB columns reaches this function as a String or byte[] and is pushed as-is to Cloud Search.
Note from here:
Google Cloud Search will only index the first 10 MB of your content,
regardless of whether it's a text file or binary content.
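Since createBlobContent() simply turns a String value into UTF-8 bytes, a rough way to check whether a 500,000-character value stays under that 10 MB indexing limit (a standalone sketch, not part of the connector) is:

import java.nio.charset.StandardCharsets;

public class ContentSizeCheck {
    public static void main(String[] args) {
        String content = "...";    // the large text value to be pushed
        int bytes = content.getBytes(StandardCharsets.UTF_8).length;
        long limit = 10L * 1024 * 1024;    // 10 MB indexing limit from the docs
        System.out.printf("UTF-8 size: %d bytes (limit %d) -> %s%n",
                bytes, limit, bytes <= limit ? "fully indexed" : "only the first 10 MB indexed");
    }
}

A plain-ASCII string of 500,000 characters is roughly 0.5 MB of UTF-8, so it is well within the limit.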

Parsing multi-format & multi line data file in spring batch job

I am writing a Spring Batch job to process the data file described below and write it into a DB.
The sample data file has this format: there are multiple headers, and each header has a bunch of rows associated with it. There can be millions of records for each header, and there can be n headers in the flat file I am processing. My requirement is to pick only the few headers I am concerned with and, for each of those headers, pick up all of its data rows. Each header and its data format is also different; my processor can receive either of these record types and needs to write them into my DB.
HDR01
A|41|57|Data1|S|62|Data2|9|N|2017-02-01 18:01:05|2017-02-01 00:00:00
A|41|57|Data1|S|62|Data2|9|N|2017-02-01 18:01:05|2017-02-01 00:00:00
HDR02
A|41|57|Data1|S|62|Data2|9|N|
A|41|57|Data1|S|62|Data2|9|N|
I tried exploring PatternMatchingCompositeLineMapper, where I can map each header pattern I have to a tokenizer and a corresponding FieldSetMapper, but here I need to read the body rows, not the headers.
I don't have any footer to create an end-of-line policy of my own, either.
I also tried AggregateItemReader, but I don't want to club all the records of a header together before I process them.
The rows corresponding to a header should be processed in parallel.
@Bean
public LineMapper<Domain> myLineMapper() {
    PatternMatchingCompositeLineMapper<Domain> mapper = new PatternMatchingCompositeLineMapper<>();

    Map<String, LineTokenizer> tokenizers = new HashMap<String, LineTokenizer>();
    tokenizers.put("* HDR01*", new DelimitedLineTokenizer());
    tokenizers.put("*HDR02*", new DelimitedLineTokenizer());
    tokenizers.put("*", new DelimitedLineTokenizer("|"));
    mapper.setTokenizers(tokenizers);

    Map<String, FieldSetMapper<Domain>> mappers = new HashMap<String, FieldSetMapper<Domain>>();
    try {
        mappers.put("* HDR01*", customMapper());
        mappers.put("*HDR02*", customMapper());
        mappers.put("*", customMapper());
    } catch (Exception e) {
        e.printStackTrace();
    }
    mapper.setFieldSetMappers(mappers);
    return mapper;
}
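For reference, the tokenizer/mapper pair for the pipe-delimited data rows might be fleshed out roughly like this (a sketch; the column names and the Domain fields are hypothetical, and BeanWrapperFieldSetMapper merely stands in for customMapper()):

// Sketch of the entries that would go inside myLineMapper(); names are hypothetical.
DelimitedLineTokenizer dataTokenizer = new DelimitedLineTokenizer("|");
dataTokenizer.setNames("recordType", "code1", "code2", "data1", "status",
        "code3", "data2", "count", "flag", "createdTs", "businessDate");

BeanWrapperFieldSetMapper<Domain> dataMapper = new BeanWrapperFieldSetMapper<>();
dataMapper.setTargetType(Domain.class);

tokenizers.put("A|*", dataTokenizer);   // data rows in the sample start with "A|"
mappers.put("A|*", dataMapper);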
Can somebody provide some inputs on how I should achieve this?

How to remove table names in JDBC response in SOAPUI

In the SOAPUI tool, the response for any DB step contains tableName.column node names; please refer to the image below.
How can I remove the tableName prefix from the response?
I mean to ask, is there any setting in SOAPUI, or a properties file I need to update?
This doesn't depend on SOAPUI settings; it depends on the DB drivers.
I followed the SOAPUI code on GitHub, and I finally found that internally the JDBC test step constructs the XML node names of the response based on the following code fragment:
...
public static Document addResultSetXmlPart(Element resultsElement, ResultSet rs, Document xmlDocumentResult)
        throws SQLException {
    ResultSetMetaData rsmd = rs.getMetaData();
    ...
    ...
    String columnName = "";
    if (!StringUtils.isNullOrEmpty(rsmd.getTableName(ii))) {
        columnName += (rsmd.getTableName(ii)).toUpperCase() + ".";
    }
    columnName += (rsmd.getColumnName(ii)).toUpperCase();
    String value = rs.getString(ii);
    Element node = xmlDocumentResult.createElement(StringUtils.createXmlName(columnName));
    ...
(You can see the whole addResultSetXmlPart method of the XmlUtils class here.)
So, as you can see, the node name in the XML depends on the ResultSetMetaData getTableName and getColumnName methods. ResultSetMetaData is an interface, and the implementation of these methods depends on the specific DB driver version.
So to get the same behavior as your client, simply check that both of you have the same DB drivers in SOAPUI_HOME\bin\ext.
REMARK: Once you or your client change the .jar files in SOAPUI_HOME\bin\ext, restart SOAPUI in order to load the new ones.
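If you want to double-check what a particular driver reports before swapping jars, a small standalone JDBC check of getTableName is enough (a sketch; the URL, credentials and query are placeholders):

import java.sql.*;

public class TableNameCheck {
    public static void main(String[] args) throws SQLException {
        try (Connection con = DriverManager.getConnection("jdbc:your-db://host/db", "user", "pass");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT * FROM some_table")) {
            ResultSetMetaData md = rs.getMetaData();
            for (int i = 1; i <= md.getColumnCount(); i++) {
                // An empty table name here means the node will be just COLUMNNAME.
                System.out.println(md.getTableName(i) + " . " + md.getColumnName(i));
            }
        }
    }
}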
Hope this helps,
"postgresql-9.1-903.jdbc4" should return the resultset without the table names. I got it working without placing the db driver to the SOAPUI_HOME\bin\ext.

Hbase Issue | google protobuf tag mismatch error while deserialising SCAN string

Context: I am in the process of migrating my MR jobs on HBase from CDH 2.0.0-cdh4.5.0 (Hadoop1) to HDP 2.2.0.0-2041 (YARN). After minor changes the code was compiled against HDP 2.2.0.0-2041.
Problem: I am trying to run an Oozie workflow that executes a series of MR jobs after creating a scan on HBase. The scan is created programmatically and then serialised and deserialised before handing it to the mapper, which fetches batches from HBase.
Issue: When TableInputFormat internally tries to deserialise the scan string, it throws an error indicating that, under the hood, Google protobuf was not able to deserialise the string. The stack trace looks as follows.
Exception in thread "main" java.io.IOException: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.
at com.flipkart.yarn.test.TestScanSerialiseDeserialise.convertStringToScan(TestScanSerialiseDeserialise.java:37)
at com.flipkart.yarn.test.TestScanSerialiseDeserialise.main(TestScanSerialiseDeserialise.java:25)
Caused by: ......
Reproducible: I am able to reproduce this with the sample code I am pasting below.
Sample code:
Scan scan1 = constructScanObjectForUsers("A");
String json = scan1.toJSON();
Scan scan2 = convertStringToScan(Base64.encodeBytes(json.getBytes()));
.......

private static Scan convertStringToScan(String base64) throws IOException {
    byte[] decoded = Base64.decode(base64);
    // System.out.println(new String(decoded));
    ClientProtos.Scan scan;
    try {
        scan = ClientProtos.Scan.parseFrom(decoded);
    } catch (InvalidProtocolBufferException ipbe) {
        throw new IOException(ipbe);
    }
    return ProtobufUtil.toScan(scan);
}
Possible causes: I suspect that I missed supplying some dependency, or that there is a dependency mismatch in the underlying jars.
I would appreciate any help in solving this.
Scan scan1 = constructScanObjectForUsers("A");
String json = scan1.toJSON();
Scan scan2 = convertStringToScan(Base64.encodeBytes(json.getBytes()));
Here you appear to be encoding the message as JSON and then applying base64 to the JSON text. Usually base64 is applied to binary data, but JSON is text.
byte[] decoded = Base64.decode(base64);
// System.out.println(new String(decoded));
ClientProtos.Scan scan;
try {
scan = ClientProtos.Scan.parseFrom(decoded);
Here you are un-base64'ing some text and then decoding it as a protobuf. Is this the same data from above? Because if so, this won't work: JSON and Protobuf are different formats. If you want to decode as Protobuf, you need to encode as Protobuf, not JSON.
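To illustrate, the symmetric round trip would serialise the Scan as a protobuf message and base64 the resulting bytes, so that ClientProtos.Scan.parseFrom() on the other side sees what it expects. A sketch built on the same HBase helpers the question already uses:

// Sketch: keep encode and decode symmetric (protobuf on both ends).
private static String convertScanToString(Scan scan) throws IOException {
    ClientProtos.Scan proto = ProtobufUtil.toScan(scan);    // Scan -> protobuf message
    return Base64.encodeBytes(proto.toByteArray());         // protobuf bytes -> base64 text
}

// The existing convertStringToScan(...) then decodes this string unchanged:
// Scan scan2 = convertStringToScan(convertScanToString(scan1));

This mirrors how the HBase MapReduce utilities (TableMapReduceUtil.convertScanToString) build the scan string that TableInputFormat later decodes.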
