Spring XD to read multiple website data - spring

I need to create a website that reads the contents of different websites and helps to compare them.
One example of a similar website:
http://www.mysmartprice.com/mobile/samsung-galaxy-grand-2-msp3633
It helps compare the prices of Samsung mobiles across different online stores.
Now I need to know:
1. How to read data from different websites.
Using Java, I can read and fetch HTML data. But the question is: what is the best way to parse the HTML content to get the desired information?
I want to use Spring XD. Please suggest the best strategy.
Regards,
Jubin

I think you need to develop a Java application for each data source, then develop a custom "source" module, and use Spring XD to ingest the data.
Another solution is to have your applications write the required data to CSV files and transfer them automatically into a path like /tmp/xd/input whenever the program runs, and then use Spring XD to ingest the data from the CSV files into whatever destination you need.
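A minimal sketch of the CSV hand-off described above, using only the Java standard library. The class name CsvDropWriter, the staging directory, and the sample rows are hypothetical; only the idea of dropping CSV files into an input directory such as /tmp/xd/input comes from the answer. The file is written in a separate staging directory first and then moved, on the assumption that whatever watches the input directory should never see a half-written file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class CsvDropWriter {

    /** Quotes a field if it contains a comma, a double quote, or a line break. */
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    /** Joins the fields of one record into a single CSV line. */
    static String toCsvLine(List<String> fields) {
        List<String> escaped = new ArrayList<>();
        for (String field : fields) {
            escaped.add(escape(field));
        }
        return String.join(",", escaped);
    }

    /**
     * Writes the rows to stagingDir/fileName, then moves the finished file
     * into inputDir. Returns the final path of the file.
     */
    static Path writeCsv(Path stagingDir, Path inputDir, String fileName,
                         List<List<String>> rows) throws IOException {
        Files.createDirectories(stagingDir);
        Files.createDirectories(inputDir);
        List<String> lines = new ArrayList<>();
        for (List<String> row : rows) {
            lines.add(toCsvLine(row));
        }
        Path staged = stagingDir.resolve(fileName);
        Files.write(staged, lines, StandardCharsets.UTF_8);
        return Files.move(staged, inputDir.resolve(fileName),
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // A temp directory stands in for /tmp/xd/input; the rows are made-up sample data.
        Path base = Files.createTempDirectory("xd-demo");
        List<List<String>> rows = List.of(
                List.of("store", "product", "price"),
                List.of("Example Store, Inc.", "Galaxy Grand 2", "199.00"));
        Path written = writeCsv(base.resolve("staging"), base.resolve("input"),
                "prices.csv", rows);
        System.out.println("Wrote " + written);
    }
}
```

Each per-site scraper would call writeCsv once it has extracted its rows, so the ingestion side only ever has to deal with one uniform file format.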

Related

How to fetch BigQuery data into a springboot application?

I have a use case where I need to fetch data from a GCP BigQuery database into my Spring Boot application and then perform some operations on it. I can't work out how to go about it, for example how the application properties need to be configured to use BigQuery, and I couldn't find any good resource on this.
Could you please guide me a bit? It would be great even if you could just point me to a relevant resource!
Indeed, there are no examples in the Spring Cloud documentation. However, there is a nice sample in the spring-cloud-gcp GitHub repository.
It comes with a short tutorial on how to run it, so I think it is a good starting point.

How to make a text file to be the "database" in a Spring Rest Application?

I'm developing a Jokenpo game using React with Spring REST, but I can't have a database to store all the information needed (create and delete moves, create and delete players).
I don't know the best practice here, or whether there is a design pattern for storing that kind of information. I know there is the folder src/main/resources, where maybe I could store a text file. I thought the API could load that file at startup with the initial state of the game and then change it during the game.
To be clearer: I would just like to know the simplest way of storing information, without a database, inside a Spring REST application. I really appreciate any help. Thanks.
Take a look at SQLite. It's a very light database library that you can include as a dependency of your Spring application. It doesn't require a separate database server to run, and the entire database is stored in a single file whose location you choose in the connection string.
It offers the flexibility of a standard database, so you can use Spring Data / JPA to access the data. It has some limitations compared with robust databases like MySQL, especially around concurrent writes, which you should investigate and be aware of. It usually works very well for small or embedded applications.
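If even an embedded database is ruled out, the plain text file idea from the question could look roughly like the sketch below. This is a different technique from the SQLite suggestion, not an illustration of it, and the class name PlayerFileStore and the one-name-per-line format are assumptions made for the example. One thing worth noting: files under src/main/resources are packaged into the application JAR and cannot be rewritten at runtime, so the sketch assumes a file path outside the JAR.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical file-backed store: one player name per line. */
public class PlayerFileStore {

    private final Path file;
    private final List<String> players = new ArrayList<>();

    /** Loads existing players from the file if it is already there. */
    public PlayerFileStore(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            players.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
    }

    /** Adds a player and rewrites the file. */
    public synchronized void add(String name) throws IOException {
        players.add(name);
        save();
    }

    /** Removes a player; rewrites the file only if something changed. */
    public synchronized boolean remove(String name) throws IOException {
        boolean removed = players.remove(name);
        if (removed) {
            save();
        }
        return removed;
    }

    /** Returns a copy so callers cannot modify the internal list. */
    public synchronized List<String> all() {
        return new ArrayList<>(players);
    }

    // Overwrites the whole file with the current list (the parent directory must exist).
    private void save() throws IOException {
        Files.write(file, players, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("jokenpo").resolve("players.txt");
        PlayerFileStore store = new PlayerFileStore(file);
        store.add("alice");
        store.add("bob");
        store.remove("alice");
        System.out.println(new PlayerFileStore(file).all());
    }
}
```

Rewriting the whole file on every change is only reasonable for small amounts of data, and the synchronized methods protect a single application instance only. Once several instances or heavier write loads are involved, the SQLite route above is probably the safer choice.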

Convert Reusable ErrorHandling flow into connector/component in Mule4

I'm using the Mule 4.2.2 runtime. We use the error handling generated by APIkit, customized according to the customer's requirements, and it is quite standard across all the upcoming APIs.
I'm thinking of converting this into a connector so that it appears as a component/connector in the palette and can be reused across all the APIs instead of being copied and pasted every time.
Something like REST Connect for API specifications, which automatically converts a specification into a connector as soon as it is published in Exchange (https://help.mulesoft.com/s/article/How-to-generate-a-connector-for-a-REST-API-for-Mule-3-x-and-4-x).
Is there a similar option for publishing a common Mule flow so that it is converted into a component/connector?
If not, which of the following best suits my scenario?
1) using SDK
https://dzone.com/articles/mulesoft-custom-connector-using-mule-sdk-for-mule (or)
2) creating a JAR as mentioned on this page
https://www.linkedin.com/pulse/flow-reusability-mule-4-nagaraju-kshathriya
Please suggest which one is the best and easiest way in this case. Thanks in advance.
Using the Mule SDK (1) is useful for creating a connector or module in Java. Your question wasn't fully clear about what you want to encapsulate in a connector. As I understand it, you want to share parts of a flow as a connector in the palette, which is different. The XML SDK seems more in line with that. You will need to make some changes to encapsulate the flow elements, as described in the documentation. That's actually very similar to how REST Connect works.
The method described in (2) is for importing XML flows from a JAR file, but the approach described at that link is actually incorrect for Mule 4. The right way to share flows through a library is the one described at https://help.mulesoft.com/s/article/How-to-add-a-call-to-an-external-flow-in-Mule-4. Note that this method doesn't create a connector that can be used from the Anypoint Studio palette.
From personal experience: use a common flow, put it in a repository, and include it as a dependency in the pom file. An even better solution is to include it as a flow in the Domain app and use it along with your shared HTTPS connector.
I wrote a lot of Java-based custom components. I liked them a lot and was proud of them. But the transition from Mule 3 to Mule 4 killed most of them. Even within Mule 4, MuleSoft periodically makes changes that make components incompatible with the runtime.

Feasibility analysis of data transformation using any ETL tool

I don't have any experience with ETL tools. However, I want to know whether the following can be done with an ETL tool, or whether we need to write a Java (or other) batch job for it:
Scenario 1:
The source system has different REST APIs. I need to get the data, transform it, then store the data in a MongoDB.
The hardest part is the transformation. There can be situations where I need to call one REST API of the source and, based on its data, call several other REST APIs. After that we need to reshape the combined data into a different format and store it in Mongo.
Scenario 2:
The source system has a DB. I need to transform the data using my custom logic and store it in MongoDB.
Here the custom logic can include things like this:
From table1 of source I created collection1. After that I need to consult table2 and previously created collection1, process the data and then create collection2.
Is this possible using any ETL tool? If so, which tool? If possible, please mention as briefly as you can how it can be done and which terminology is involved, so that I can search the internet, learn, and implement it.
Briefly speaking: yes, that is exactly what ETL tools are for. You can Extract data from REST sources, Transform it using sophisticated logic, and Load it to a target like MongoDB.
The exact implementation depends on the tool. While I guess you will get help if you run into problems implementing the solution in any of the tools, I don't think anyone will prepare a complete, detailed solution for you.
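To make the extract/transform/load split concrete for Scenario 1, here is a tool-independent sketch in plain Java. Everything in it is a stand-in: the class ChainedEtlSketch, the listIds and fetchDetail parameters (which take the place of the real REST calls), and the document shape are assumptions for illustration, not the API of any particular ETL product.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

public class ChainedEtlSketch {

    /**
     * Runs a two-stage extraction: the first call returns ids, and each id
     * drives a dependent lookup. The results are reshaped into one document
     * per id, ready to be handed to whatever loads them into the target.
     */
    static List<Map<String, Object>> run(
            Supplier<List<String>> listIds,
            Function<String, Map<String, Object>> fetchDetail) {
        List<Map<String, Object>> documents = new ArrayList<>();
        for (String id : listIds.get()) {                        // extract: first call
            Map<String, Object> detail = fetchDetail.apply(id);  // extract: dependent call
            Map<String, Object> doc = new LinkedHashMap<>();     // transform: reshape
            doc.put("_id", id);
            doc.put("detail", detail);
            documents.add(doc);
        }
        return documents;                                        // load: done by the caller
    }

    public static void main(String[] args) {
        // In-memory stand-ins for the two REST calls; the values are made up.
        Supplier<List<String>> listIds = () -> List.of("a1", "b2");
        Function<String, Map<String, Object>> fetchDetail = id -> {
            Map<String, Object> detail = new LinkedHashMap<>();
            detail.put("name", "item-" + id);
            return detail;
        };
        for (Map<String, Object> doc : run(listIds, fetchDetail)) {
            System.out.println(doc);
        }
    }
}
```

In an ETL tool, each of these steps would typically be a separate configurable stage rather than hand-written code, but the data flow, where the output of one extraction drives the next, is the same.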

Best way to store uploaded files in a Spring MVC environment

The question is quite simple: what is the best way to store uploaded files in a clustered Spring MVC environment?
Example: let's say I'm coding a social network where users can upload a personal profile picture. Each user will then have at most one image associated with their profile.
One solution is to add a blob column to the users table in the DB. I read that this is good in a clustered environment (easier to scale the DB than a folder containing lots of images). Is that true? How can I achieve this in a JPA (Hibernate and PostgreSQL) environment? I saw there is the @Lob annotation, but what type of variable should I use? byte[]?
Another solution is to store them on the hard drive. What is the best path to store these images? In the webapp folder? In the classpath? In another folder outside the Spring context?
Thank you very much for your help.
Update: an important detail that I forgot to mention. The administration dashboard (CMS/back end) of this website will be coded in PHP (I know, I know...) while the front end will be coded in Java Spring (MVC). The database is managed entirely by the Java part (JPA). Whatever the final choice is, it has to be compatible with this requirement.
I'd rather not store it in the DB. The best place is a server for static files (or a CDN).
If you really need to, you can store it as a Lob, but I think it's a bad idea for performance and scalability reasons.
More importantly, database storage tends to be more expensive than a simple Content Delivery Network.

Resources