Load multiple CSV files into a database using Spring Batch

I want to load multiple CSV files into a single MySQL table using
Spring Batch. The paths of the files are derived from the following method:
public List<String> getFilePath() {
    String inputPath = "E:\\input";
    List<String> inputCSVPaths = new ArrayList<String>();
    Map<String, List<String>> inputInfo = new HashMap<String, List<String>>();
    File inputFolder = new File(inputPath);
    File[] inputFiles = inputFolder.listFiles();
    for (File file : inputFiles) {
        inputCSVPaths.add(file.getAbsolutePath());
    }
    inputInfo.put("Introduction", inputCSVPaths);
    List<String> inputFile = inputInfo.get("Introduction");
    System.out.println("Input File :" + inputFile);
    return inputFile;
}
There are three CSV files in total, but the job reads only one file and inserts the data from that file alone. Is there something wrong with how I am getting the resources?
@Autowired
private FilePathDemo filePathDemo;

@Bean
public MultiResourceItemReader<Introduction> multiResourceItemReader() throws IOException {
    MultiResourceItemReader<Introduction> multiReader = new MultiResourceItemReader<Introduction>();
    ResourcePatternResolver patternResolver = new PathMatchingResourcePatternResolver();
    Resource[] resources;
    String filePath = "file:";
    List<String> path = filePathDemo.getFilePath();
    for (String introPath : path) {
        System.out.println("File Path of the Introduction CSV :" + introPath);
        resources = patternResolver.getResources(filePath + introPath);
        multiReader.setResources(resources);
    }
    FlatFileItemReader<Introduction> flatReader = new FlatFileItemReader<Introduction>();
    multiReader.setDelegate(flatReader);
    flatReader.setLinesToSkip(1);
    flatReader.setLineMapper(new DefaultLineMapper<Introduction>() {
        {
            setLineTokenizer(new DelimitedLineTokenizer() {
                {
                    setNames(new String[] { "id", "name", "age", "phoneNo" });
                }
            });
            setFieldSetMapper(new BeanWrapperFieldSetMapper<Introduction>() {
                {
                    setTargetType(Introduction.class);
                }
            });
        }
    });
    flatReader.close();
    multiReader.close();
    return multiReader;
}

There are two issues with your configuration:
You are reassigning the resources array on every iteration of the for loop, so each pass overwrites the previous one and the MultiResourceItemReader ends up configured with only one file (the last one assigned).
You are calling the close method on the MultiResourceItemReader and on the delegate FlatFileItemReader, but you should not: Spring Batch calls those methods itself when the step is complete.
You can find an example of how to configure the MultiResourceItemReader here: https://docs.spring.io/spring-batch/4.0.x/reference/html/readersAndWriters.html#multiFileInput
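For reference, a minimal corrected sketch of the reader bean (same class and field names as in the question): collect all the resources first, set them once, and let Spring Batch manage the lifecycle.

@Bean
public MultiResourceItemReader<Introduction> multiResourceItemReader() {
    // Collect every CSV path into one list instead of overwriting the array per iteration
    List<Resource> resources = new ArrayList<>();
    for (String introPath : filePathDemo.getFilePath()) {
        resources.add(new FileSystemResource(introPath));
    }

    FlatFileItemReader<Introduction> flatReader = new FlatFileItemReader<>();
    flatReader.setLinesToSkip(1);
    DefaultLineMapper<Introduction> lineMapper = new DefaultLineMapper<>();
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setNames(new String[] { "id", "name", "age", "phoneNo" });
    lineMapper.setLineTokenizer(tokenizer);
    BeanWrapperFieldSetMapper<Introduction> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
    fieldSetMapper.setTargetType(Introduction.class);
    lineMapper.setFieldSetMapper(fieldSetMapper);
    flatReader.setLineMapper(lineMapper);

    MultiResourceItemReader<Introduction> multiReader = new MultiResourceItemReader<>();
    multiReader.setResources(resources.toArray(new Resource[0]));
    multiReader.setDelegate(flatReader);
    // No close() calls: Spring Batch closes the readers when the step completes
    return multiReader;
}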

Related

Spring Batch FlatFileItemWriter for writing large data to multiple files

I use chunk-oriented processing to write the files. I have two tables: files and datas.
config.java

public ListItemReader<> reader(String fileName) {
    listItemReader = selectDataOfFileFromDB(fileName);
    ....
    return listItemReader;
}

public FlatFileItemWriter<> writer(String fileName) {
    FlatFileItemWriter<> delegate = new FlatFileItemWriterBuilder<>()
            .name(fileName + XXX)
            .resource(new FileSystemResource("/xxx/xxx/xxx/" + fileName))
            .build();
    return delegate;
}

public Step xxxxStep(String fileName) {
    return stepBuilderFactory.get("xxxxstep" + XXXX)
            .reader(reader(fileName))
            .writer(writer(fileName))
            .build();
}

@Bean
public Job xxxJob() {
    List<String> list = selectFileNameFromDB();
    JobBuilder xx = jobBuilderFactory.get("XXXXjob");
    SimpleJobBuilder a = xx.start(xxxxStep(list.get(0)));
    a.next(xxxxStep(list.get(1)));
    a.next(xxxxStep(list.get(2)));
    a.next(xxxxStep(list.get(3)));
    .....
    a.next(xxxxStep(list.get(n)));
    return a.build();
}
I can write the data to each file this way, but it is not smart. Is there any other solution?
I tried the ClassifierCompositeItemWriter, but it is not suitable!
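For illustration only, not part of the original question: since the step factory already takes the file name as a parameter, the chain of next() calls can be built in a loop. A minimal sketch, assuming xxxxStep(String fileName) as above and that selectFileNameFromDB() returns a List<String>:

@Bean
public Job xxxJob() {
    List<String> fileNames = selectFileNameFromDB();
    // Start with the first file, then chain one step per remaining file
    SimpleJobBuilder builder = jobBuilderFactory.get("XXXXjob")
            .start(xxxxStep(fileNames.get(0)));
    for (int i = 1; i < fileNames.size(); i++) {
        builder = builder.next(xxxxStep(fileNames.get(i)));
    }
    return builder.build();
}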

How to read a CSV with unknown column names and an unknown column count in Spring Batch?

I have the following FlatFileItemReader configuration for my step:
@Bean
@StepScope
public FlatFileItemReader<RawInput> reader(FieldSetMapper<RawInput> fieldSetMapper, @Value("#{jobParameters['files.location']}") Resource resource) {
    var reader = new FlatFileItemReader<RawInput>();
    reader.setName("my-reader");
    reader.setResource(resource);
    var mapper = new DefaultLineMapper<RawInput>();
    mapper.setLineTokenizer(crmCsvLineTokenizer());
    mapper.setFieldSetMapper(fieldSetMapper);
    mapper.afterPropertiesSet();
    reader.setLineMapper(mapper);
    return reader;
}
RawInput contains one field, so it lets me read a CSV with a single column. Now the requirements have changed and I have to be able to read any CSV file with any number of columns, so instead of RawInput I need to pass an array somehow. Is that possible with FlatFileItemReader, or should I change the implementation?
It works:
var reader = new FlatFileItemReader<List<String>>();
reader.setName("reader");
reader.setResource(resource);
// line mapper
var lineMapper = new DefaultLineMapper<List<String>>();
lineMapper.setLineTokenizer(new DelimitedLineTokenizer());
lineMapper.setFieldSetMapper(myFieldSetMapper); // see implementation below
lineMapper.afterPropertiesSet();
reader.setLineMapper(lineMapper);
return reader;
@Component
public class MyFieldSetMapper implements FieldSetMapper<List<String>> {

    @NonNull
    @Override
    public List<String> mapFieldSet(@NonNull FieldSet fieldSet) {
        return Arrays.stream(fieldSet.getValues())
                .map(StringUtils::lowerCase) // optional
                .map(StringUtils::trimToNull) // optional
                .collect(Collectors.toList());
    }
}

How can I put the port and host in a properties file in Spring?

I have this URL:
private static final String PRODUCTS_URL = "http://localhost:3007/catalog/products/";
And these methods:
public JSONObject getProductByIdFromMicroservice(String id) throws IOException, JSONException {
    return getProductsFromProductMicroservice(PRODUCTS_URL + id);
}

public JSONObject getProductsFromProductMicroservice(String url) throws IOException, JSONException {
    CloseableHttpClient productClient = HttpClients.createDefault();
    HttpGet getProducts = new HttpGet(url);
    CloseableHttpResponse microserviceResponse = productClient.execute(getProducts);
    HttpEntity entity = microserviceResponse.getEntity();
    BufferedReader br = new BufferedReader(new InputStreamReader(entity.getContent()));
    StringBuilder sb = new StringBuilder();
    String line;
    while ((line = br.readLine()) != null) {
        sb.append(line);
    }
    br.close();
    System.out.println(sb.toString());
    JSONObject obj = new JSONObject(sb.toString());
    System.out.println(obj);
    return obj;
}
I want to put the port and host in a separate properties file. I have already seen examples using properties and YAML files, but I do not understand how my methods will then pick up the host and port from the properties file when an instance of the class is created. Can you explain?
You can put your properties in a properties file in the resources directory, for example:
PRODUCTS_URL=http://localhost:3007/catalog/products/
and add @PropertySource("YOUR_RESOURCE_FILE_HERE.properties") to your main class (Application.java):
@SpringBootApplication
@PropertySource("products.properties")
public class Application {...}
and then use @Value("${YOUR_PROPERTY_NAME}") to load it:
@Value("${PRODUCTS_URL}")
private String PRODUCTS_URL;
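To tie this back to the methods in the question: once the value is injected, drop the static constant and use the field instead. A minimal sketch (the ProductClient class name is made up for the example; the other method stays as in the question):

@Service
public class ProductClient {

    // Resolved from the properties file at startup
    @Value("${PRODUCTS_URL}")
    private String productsUrl;

    public JSONObject getProductByIdFromMicroservice(String id) throws IOException, JSONException {
        // The injected value replaces the hardcoded PRODUCTS_URL constant
        return getProductsFromProductMicroservice(productsUrl + id);
    }

    // getProductsFromProductMicroservice(...) unchanged from the question
}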
This is how I do it:
CONFIG FILE
#Database Server Properties
dbUrl=jdbc:sqlserver://localhost:1433;database=Something;
dbUser=sa
dbPassword=SomePassword
Then I annotate a config class with this:
@PropertySource("file:${ENV_VARIABLE_TO_PATH}/config.properties")
Then autowire this field:
@Autowired
private Environment environment;
Then create the data source:
@Bean
public DataSource dataSource()
{
    HikariDataSource dataSource = new HikariDataSource();
    try
    {
        dataSource.setDriverClassName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
        dataSource.setConnectionTestQuery("SELECT 1");
        dataSource.setMaximumPoolSize(100);
        String dbUrl = environment.getProperty("dbUrl");
        if (dbUrl != null)
        {
            dataSource.setJdbcUrl(dbUrl);
        }
        else
        {
            throw new PropertyNotFoundException("The dbUrl property is missing from the config file!");
        }
        String dbUser = environment.getProperty("dbUser");
        if (dbUser != null)
        {
            dataSource.setUsername(dbUser);
        }
        else
        {
            throw new PropertyNotFoundException("The dbUser property is missing from the config file!");
        }
        String dbPassword = environment.getProperty("dbPassword");
        if (dbPassword != null)
        {
            dataSource.setPassword(dbPassword);
        }
        else
        {
            throw new PropertyNotFoundException("The dbPassword property is missing from the config file!");
        }
        logger.debug("Successfully initialized datasource");
    }
    catch (PropertyNotFoundException ex)
    {
        logger.fatal("Error initializing datasource : " + ex.getMessage());
    }
    return dataSource;
}
I know this is not exactly your scenario but perhaps you can find inspiration from this code to suit your specific needs?
Other answers here mention using the @PropertySource annotation to specify the path of config files. If this is test code (unit or integration tests), you can also make use of another annotation, @TestPropertySource.
With this, we can define configuration sources that take precedence over any other property source used in the project.
See here: https://www.baeldung.com/spring-test-property-source
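For illustration, a minimal sketch of @TestPropertySource overriding a property in a test (the test class and the products.url key are made up for the example):

@SpringBootTest
@TestPropertySource(properties = "products.url=http://localhost:9999/catalog/products/")
class ProductUrlTest {

    @Value("${products.url}")
    private String productsUrl;

    @Test
    void testPropertyOverridesDefault() {
        // The inline value above wins over application.properties
        assertEquals("http://localhost:9999/catalog/products/", productsUrl);
    }
}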

Fat Jar throwing File not found exception when trying to access text file within the jar

I have built a Spring Boot MVC application with a tree data structure in place of an actual database. The program reads from a text file and stores words in the tree. Originally I used the CommandLineRunner interface to populate the tree, which works... but after creating a fat jar and running it, I get a FileNotFoundException. How can I build a fat jar with Maven that includes the text file?
The file is currently in the project root.
Here is the logic that generates the tree:
@Component
@Order(value = Ordered.HIGHEST_PRECEDENCE)
public class GenerateTree implements CommandLineRunner {

    @Autowired
    TreeRepository trie = new TreeRepository();

    @Autowired
    FileReader fileReader = new FileReader();

    @Override
    public void run(String... args) throws Exception {
        for (String s : fileReader.readFile("wordList1.txt")) {
            trie.add(s);
        }
    }
}
Here is the logic that reads the file:
@Component
public class FileReader {

    List<String> readFile(String filename) {
        List<String> list = new ArrayList<>();
        try (Stream<String> stream = Files.lines(Paths.get(filename))) {
            list = stream
                    .filter(line -> line.matches("[a-zA-Z]+"))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            e.printStackTrace();
        }
        return list;
    }
}
You cannot access a File inside a jar (see https://stackoverflow.com/a/8258308/4516887).
Put wordList1.txt into the src/main/resources directory and read its contents using a ClassPathResource:
ClassPathResource resource = new ClassPathResource("wordList1.txt");
try (InputStream in = resource.getInputStream();
     BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
    String line;
    while ((line = reader.readLine()) != null) {
        ...
    }
}
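Applied to the FileReader from the question, a minimal sketch, assuming wordList1.txt has been moved to src/main/resources:

@Component
public class FileReader {

    List<String> readFile(String filename) {
        List<String> list = new ArrayList<>();
        // ClassPathResource works both from the IDE and from inside the fat jar
        ClassPathResource resource = new ClassPathResource(filename);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(resource.getInputStream(), StandardCharsets.UTF_8))) {
            list = reader.lines()
                    .filter(line -> line.matches("[a-zA-Z]+"))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            e.printStackTrace();
        }
        return list;
    }
}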

How to change my job configuration to add file name dynamically

I have a Spring Batch job which reads from a DB and then outputs to multiple CSVs. Inside my DB I have a special column named divisionId. A CSV file should exist for every distinct value of divisionId. I split the data out using a ClassifierCompositeItemWriter.
At the moment I have an ItemWriter bean defined for every distinct value of divisionId. The beans are the same; it is only the file name that differs.
How can I change the configuration below to create a file with the divisionId automatically prepended to the file name, without having to register a new ItemWriter for each divisionId?
I've been playing around with the @JobScope and @StepScope annotations but can't get it right.
Thanks in advance.
@Bean
public Step readStgDbAndExportMasterListStep() {
    return commonJobConfig.stepBuilderFactory
            .get("readStgDbAndExportMasterListStep")
            .<MasterList, MasterList>chunk(commonJobConfig.chunkSize)
            .reader(commonJobConfig.queryStagingDbReader())
            .processor(masterListOutputProcessor())
            .writer(masterListFileWriter())
            .stream((ItemStream) divisionMasterListFileWriter45())
            .stream((ItemStream) divisionMasterListFileWriter90())
            .build();
}

@Bean
public ItemWriter<MasterList> masterListFileWriter() {
    BackToBackPatternClassifier classifier = new BackToBackPatternClassifier();
    classifier.setRouterDelegate(new DivisionClassifier());
    classifier.setMatcherMap(new HashMap<String, ItemWriter<? extends MasterList>>() {{
        put("45", divisionMasterListFileWriter45());
        put("90", divisionMasterListFileWriter90());
    }});
    ClassifierCompositeItemWriter<MasterList> writer = new ClassifierCompositeItemWriter<MasterList>();
    writer.setClassifier(classifier);
    return writer;
}

@Bean
public ItemWriter<MasterList> divisionMasterListFileWriter45() {
    FlatFileItemWriter<MasterList> writer = new FlatFileItemWriter<>();
    writer.setResource(new FileSystemResource(new File(commonJobConfig.outDir, "45_masterList.csv")));
    writer.setHeaderCallback(masterListFlatFileHeaderCallback());
    writer.setLineAggregator(masterListFormatterLineAggregator());
    return writer;
}

@Bean
public ItemWriter<MasterList> divisionMasterListFileWriter90() {
    FlatFileItemWriter<MasterList> writer = new FlatFileItemWriter<>();
    writer.setResource(new FileSystemResource(new File(commonJobConfig.outDir, "90_masterList.csv")));
    writer.setHeaderCallback(masterListFlatFileHeaderCallback());
    writer.setLineAggregator(masterListFormatterLineAggregator());
    return writer;
}
I came up with a fairly involved way of doing this by following the tutorial at https://github.com/langmi/spring-batch-examples/wiki/Rename-Files.
The premise is to place the file name in the step execution context and let the writer pick it up from there.
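For illustration, a minimal sketch of that idea: one step-scoped writer bean whose file name is resolved from the step execution context at runtime (the divisionId key and the bean name are assumptions, not the original configuration):

@Bean
@StepScope
public FlatFileItemWriter<MasterList> divisionMasterListFileWriter(
        @Value("#{stepExecutionContext['divisionId']}") String divisionId) {
    FlatFileItemWriter<MasterList> writer = new FlatFileItemWriter<>();
    // A single bean definition covers every division; the id is injected per step execution
    writer.setResource(new FileSystemResource(new File(commonJobConfig.outDir, divisionId + "_masterList.csv")));
    writer.setHeaderCallback(masterListFlatFileHeaderCallback());
    writer.setLineAggregator(masterListFormatterLineAggregator());
    return writer;
}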
