Does Spring have any solution for Web Scraping? - spring

I need to build a web application that crawls and scrapes some websites to extract data, with the crawlers running on a scheduler. I know there are plenty of tools for parsing and extracting data, such as jsoup, but I want to know whether I can implement this with Spring's own tools.

Spring does not offer a single template project that fits your case, but you can definitely build what you want with Spring. Spring Boot lets you get started quickly with a web application, and the Spring Framework's built-in task scheduling support (for example the @Scheduled annotation) covers the scheduling. I would suggest combining the two to develop your application.
For crawling and scraping, I believe Spring provides nothing out of the box, but you can do that part with JSpider and jsoup.
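To make the "scheduler plus extractor" shape concrete, here is a minimal plain-JDK sketch, not a Spring implementation. It uses ScheduledExecutorService where a Spring Boot application would more typically put @Scheduled on a method (with @EnableScheduling on a configuration class), and a naive regular expression where a real crawler should use an HTML parser such as jsoup. The class name, the fetch stand-in, and the example URL are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScheduledScraperSketch {

    // Naive pattern for double-quoted href values inside <a ...> tags.
    // Assumption for the sketch only: real HTML needs a proper parser such as jsoup.
    private static final Pattern HREF =
            Pattern.compile("<a\\s+[^>]*href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Returns the href value of every anchor tag found in the given HTML. */
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    /** Hypothetical stand-in for an HTTP fetch; returns canned HTML so the sketch runs offline. */
    static String fetch(String url) {
        return "<html><body><a href=\"/page1\">One</a> "
                + "<a class=\"x\" href=\"/page2\">Two</a></body></html>";
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Run the crawl task immediately, then once per second.
        scheduler.scheduleAtFixedRate(
                () -> System.out.println(extractLinks(fetch("https://example.com"))),
                0, 1, TimeUnit.SECONDS);
        Thread.sleep(2500);      // let a few runs happen
        scheduler.shutdownNow(); // stop the scheduler so the JVM can exit
    }
}
```

Keeping the extraction in a pure method such as extractLinks means it can be unit-tested without network access, whichever scheduler ends up calling it.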

Related

Microservice project structure using Spring Boot and Spring Cloud

I am trying to convert a monolithic web application into a microservices architecture using Spring Boot and Spring Cloud. The plan is to create an Angular 2 front-end application that calls the microservices I develop in the cloud. I have already started breaking the modules into independent processes for the microservice architecture.
My doubt is this: when designing the flow of control and the microservice structure, can I use one single Spring Boot project, with a different controller per service, for the entire back end of this web application?
I read somewhere that the microservices should be developed in different Spring Boot projects. I am new to Spring and Spring Cloud. Is it possible to create all the services in a single project by using different modules?
Technically, nothing stops you from packaging all those services into one project. From a microservices point of view, though, you should separate them into independent projects. There are several questions you can ask yourself before transforming the original architecture:
Is your application critical? Can users tolerate downtime while you redeploy the whole package to update only one service?
If there are no dependencies between the services, why put them together? Doesn't that make them harder to develop and maintain?
Is the usage rate of each service the same? If not, you can isolate the services that are invoked most often and deploy them on a more powerful server.
To understand the best practices for designing a microservices architecture, read the article Adopting Microservices at Netflix: Lessons for Architectural Design. For developing with Spring Cloud, the post Spring Cloud Netflix explains which components you should use in your architecture.
I am currently working on microservices too. Based on my experience, we designed our microservices as follows.
Maven
You should create each service as a separate project, but you can split a project into submodules. That makes the project easier to manage, and the submodules can be reused in other projects too.
Build shared functionality as a JAR library and put it in your local repository. This saves time: when several projects need the same component or functionality, you build the JAR once and publish it to the local repository, and every project that uses it pulls it from there, so you don't have to write the same code in many projects.
So in the end I would suggest creating separate Spring Boot projects, organized with submodules and backed by a local repository.
Creating your modules in separate projects gives you a more flexible solution.
You could even use different languages and technologies for a particular service. For example, one of your services could be Node.js and the rest Java/Spring.

Spring Boot/Thymeleaf based large/mid scale application

We are starting out with Spring Boot and are looking for best practices for implementing a large application. Links to any large or mid-scale open source applications implemented with Spring Boot would be helpful.
We also looked at the code generated by the JHipster project (jhipster.github.io/), which definitely helps by generating a lot of boilerplate code, such as user management, transaction management, REST services, and an AngularJS-based application.
The only problem is that JHipster is AngularJS based, whereas we would like to go with a Thymeleaf-based UI.
A link to a framework or sample application similar to JHipster but with a Thymeleaf-based UI would also be very helpful.
Thanks
JHipster also supports Thymeleaf. By default it generates an AngularJS front end, which is its main goal, but you can also use Thymeleaf if you don't want a single-page web application.
If you look at the error pages, for example, they are done with Thymeleaf (the 404 page can't be part of the single-page application, for obvious reasons).

Spring 4 vs Grails - Open Source Plugins

I have used Spring 3, but I am not sure what the equivalent of a Grails plugin is, and I now need to suggest a stack for a new app. Grails seems great for building database models and has a lot of plugins, but it also seems to be more expensive at runtime.
So my question is: is there an equal or better repository for Spring that covers every little thing you might need, like Facebook login or other social actions, Ajax upload, Joda, etc.? Or is that simply what we call a dependency plus some code from a blog or Stack Overflow?
Is there any repository of small reusable code for regular Spring MVC projects, like the plugins we have for Grails?
I know that your question is about pure Spring alternatives, but I would honestly recommend just using Grails. I've done projects in both stacks. If you want to get rid of the configuration headaches and get started quickly on a new project while staying within the Spring stack, it is the way to go. It is a great framework, and some of my employers have had many production Grails applications supporting thousands of customers.
You can also upgrade to Grails 3 when it comes out next year and take advantage of the leaner code they provide in it due to Spring Boot!
You may want to look into Spring Boot. It is not a full-stack framework, but it hides much of the extra coding you would otherwise need for a Spring application. There are also some newer projects that build on Spring Boot. Check the following:
1- http://jhipster.github.io/ : use it if you need to build an SPA with AngularJS; it also has commands that generate entities for you using Yeoman.
2- http://lightadmin.org/ : use it if you want to create CRUD pages based on Spring Data entities.
For both, you may have to use Spring Data and maybe even Spring Data REST. These may be helpful too.

Document Management System

At the company I work for, we are developing a billing web application with Spring and Vaadin. The trouble is that the number of files to manage is becoming too large: bills, offers, contracts, etc. We currently store each document as a file on the server, but that makes them too hard to manage. It is tedious and error-prone, and it also means we lack any sort of access control for these documents.
Now I'm looking for a document management system to manage these documents. I saw Alfresco Document Management, but I don't know how to integrate it with my application.
Any suggestions?
Alfresco has a REST API, so you can use it from your Spring + Vaadin application. Spring's RestTemplate, which can use Jackson to convert JSON, will help you implement the REST client.
There are several ways to integrate with Alfresco. My two favorites are:
Using the CMIS API
http://wiki.alfresco.com/wiki/CMIS
Using your own custom web scripts or Java-backed web scripts. These let you quickly develop your own REST API on top of Alfresco.
http://docs.alfresco.com/4.0/index.jsp?topic=%2Fcom.alfresco.enterprise.doc%2Fconcepts%2Fws-architecture.html
There are many other ways to integrate as well: WebDAV, CIFS, FTP, and several others. Here is some documentation from Alfresco about them:
http://docs.alfresco.com/4.2/index.jsp?topic=%2Fcom.alfresco.enterprise.doc%2Fconcepts%2Fintegration-options.html

Modular Java Web Application

We have a rich web application based on Java and the Spring Framework, with many features and classes. Recently it occurred to me: why not add modularity to make it even better?
By modularity I mean providing a section inside the web application where authenticated users can contribute plugins or extensions, exactly like Joomla, WordPress, and the other CMSs around.
I want each part to be separated from the others, so that when a user uploads a plugin it does not break the entire system and core. I also want to provide a plugin/extension tester in the back end so that the system won't accept malicious plugins.
The system should also be able to uninstall each plugin and extension without harming the core.
How do I build this functionality, and where should we start?
I'd say this depends on a couple of things.
One way to achieve this is to use a modular framework such as Wicket or Vaadin together with OSGi mechanisms, such as services provided through Blueprint or DS (Declarative Services); that should give you a fine modular web application. For example, take a look at the Pax Wicket project, which has a sample application that does exactly this.
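As a rough illustration of the install/uninstall requirement, here is a plain-Java sketch of a plugin registry whose core keeps working when a plugin throws. It is not OSGi and not Pax Wicket: there is no class-loader isolation and no vetting of malicious code, which is exactly what an OSGi container adds on top of this idea. The Plugin interface and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

class PluginRegistrySketch {

    /** Hypothetical plugin contract; the names are illustrative only. */
    interface Plugin {
        String name();
        String handle(String input);
    }

    private final Map<String, Plugin> plugins = new ConcurrentHashMap<>();

    /** Registers a plugin; returns false if the name is already taken. */
    boolean install(Plugin plugin) {
        return plugins.putIfAbsent(plugin.name(), plugin) == null;
    }

    /** Removes a plugin by name; returns false if it was not installed. */
    boolean uninstall(String name) {
        return plugins.remove(name) != null;
    }

    /** Calls every plugin and skips any that throws, so one bad plugin cannot break the core. */
    List<String> dispatch(String input) {
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, Plugin> entry : plugins.entrySet()) {
            try {
                results.add(entry.getKey() + ":" + entry.getValue().handle(input));
            } catch (RuntimeException e) {
                System.err.println("Plugin " + entry.getKey() + " failed: " + e.getMessage());
            }
        }
        return results;
    }

    /** Small helper that builds a plugin from a name and a function. */
    static Plugin plugin(final String name, final UnaryOperator<String> fn) {
        return new Plugin() {
            public String name() { return name; }
            public String handle(String input) { return fn.apply(input); }
        };
    }

    public static void main(String[] args) {
        PluginRegistrySketch registry = new PluginRegistrySketch();
        registry.install(plugin("upper", s -> s.toUpperCase()));
        registry.install(plugin("broken", s -> { throw new IllegalStateException("boom"); }));
        System.out.println(registry.dispatch("hello")); // the broken plugin is skipped
        registry.uninstall("broken");
        System.out.println(registry.dispatch("hello"));
    }
}
```

In an OSGi setup the registry role is played by the service registry, and installing or uninstalling a bundle adds or removes its services at runtime.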
