Change the path of MirrorWriterProcessor in Heritrix 3.1.0 - spring

I am crawling using Heritrix 3.1.0. I am trying to save the files using the MirrorWriterProcessor. However, this option is not available in the crawler-beans.cxml.
What I did was to replace the "warcWriter" "org.archive.modules.writer.WARCWriterProcessor"
to
"org.archive.modules.writer.MirrorWriterProcessor"
However, this processor write the mirror content to
$HERITRIX_HOME/mirror
I configured the "path" to "${launchId}/mirror", hoping Heritrix to write the mirror directory to under the job directory.
What shall I do to change the path of MirrorWriterProcessor to under the job directory?

You cannot, at the moment, use tags like the ones warcWritter accepts. You can, however, write some spring magic to create your own stamped folders. This creates a factory for the format function of SimpleDateFormat and spits out a string you can use to create a stamped folder.
<bean id="dateFormat" class="java.text.SimpleDateFormat">
<constructor-arg value="ddMMyyyy" />
</bean>
<bean id="formatedDate" factory-bean="dateFormat" factory-method="format">
<constructor-arg>
<bean class="java.util.Date" />
</constructor-arg>
</bean>
<bean id="mirrorWriter" class="org.archive.modules.writer.MirrorWriterProcessor">
<property name="path">
<bean class="java.lang.String">
<constructor-arg value="#{formatedDate + '/mirror'}" />
</bean>
</property>
...

Related

FileReadingMessageSource.WatchServiceDirectoryScanner: turn off recursive descent into sub-directories?

Versions:
Spring: 5.2.16.RELEASE
Spring Integrations: 5.3.9.RELEASE
macOS Big Sur: 11.6
I am using XML to set up the directory scanner FileReadingMessageSource.WatchServiceDirectoryScanner like so:
<int-file:inbound-channel-adapter id="channelIn" directory="${channel.dir}" auto-create-directory="false" use-watch-service="true" filter="channelFilter" watch-events="CREATE,MODIFY">
<int-file:nio-locker ref="channelLocker"/>
<int:poller fixed-delay="${channel.polling.delay}" max-messages-per-poll="${channel.polling.maxmsgs}"></int:poller>
</int-file:inbound-channel-adapter>
with the following bean definitions:
<bean id="channelLocker" class="org.springframework.integration.file.locking.NioFileLocker"/>
<bean id="channelFilter" class="org.springframework.integration.file.filters.ChainFileListFilter">
<constructor-arg>
<list>
<bean class="org.springframework.integration.file.filters.SimplePatternFileListFilter">
<constructor-arg value="SpreadSheets*.xls" />
</bean>
<bean id="filter" class="org.springframework.integration.file.filters.LastModifiedFileListFilter">
<property name="age" value="${channel.filter.age}" />
</bean>
<ref bean="persistentFilter" />
</list>
</constructor-arg>
</bean>
<bean id="persistentFilter" class="org.springframework.integration.file.filters.FileSystemPersistentAcceptOnceFileListFilter">
<constructor-arg index="0" ref="metadataStore" />
<constructor-arg index="1" name="prefix" value="" />
<property name="flushOnUpdate" value="false" />
</bean>
If I look at logs for org.springframework.integration.file.FileReadingMessageSource, I notice that we register both the specified directory (ie, ${channel.dir}) as well as any of its sub-directories. That is, I see logs like this:
15:44:45.706 [main] DEBUG org.springframework.integration.file.FileReadingMessageSource - registering: /Users/kc/scan.here for file events
15:44:45.711 [main] DEBUG org.springframework.integration.file.FileReadingMessageSource - registering: /Users/kc/scan.here/and.here for file events
I've looked Spring docs as well as API docs for the relevant software modules (Eg, FileReadingMessageSource), but I don't see any property or configuration option for turning off recursive descent into sub-directories.
What is the recommended practice here for scanning only files within the specified directory, but not recursing any deeper than that?
If you don't a recursion and scan the whole file tree, just don't use that watch service!
For the create and modify events you can configure a FileSystemPersistentAcceptOnceFileListFilter which checks for the file.lastModified(). I see you do that anyway, therefore it is not clear why do you need a watch service at all?
See some related discussed here: https://github.com/spring-projects/spring-integration/issues/3557.
If you still have some reasonable argument to use watch service for only a root dir, please add a comment into that issue and we will revise it respectively.

Apache Ignite not resolving properties in configuration XML

I am looking to load a number of values into my server configuration.xml from a properties file.
However, on adding the placeholders I start getting, property cannot be resolved errors. Preferably I would like to use Jasypt, which has loaded up fine, but has the same issue, property cannot be resolved.
Sample placeholder:
<bean id="placeholderConfig" class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
<property name="location" value="ignite.properties"/>
</bean>
Sample Bean:
<property name="sslContextFactory">
<bean class="org.apache.ignite.ssl.SslContextFactory">
<property name="keyStoreFilePath" value="ignite.jks"/>
<property name="keyStorePassword" value="${some.password}"/>
<property name="keyStoreType" value="JKS"/>
<property name="protocol" value="TLSv1.2"/>
<property name="trustManagers">
<bean class="org.apache.ignite.ssl.SslContextFactory" factory-method="getDisabledTrustManager"/>
</property>
</bean>
</property>
Is it possible, is there a library I should have added, it otherwise runs fine if I do not use properties.
The configuration is parsed by Spring and Ignite has nothing to do with it. I believe there are two possible reasons:
Incorrect file path. Note that if the file is on the classpath, the location should be classpath:ignite.properties.
Incorrect property name.

Is it possible to dynamically create a resource name?

I'm writing to a CSV file and using FlatFileItemWriter to do it. I have a bean with that as my class, and I also have a property for the resource, where I provide the file name to use to write out the item.
Is it possible to append the date and time to the file name?
Right now I am telling it to write out to a file called report.csv, instead I want it to write out to a file called report-7-2-2014-16-03.csv
Here's the XML Config for the writer
<bean id="csvWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource">
<bean class="org.springframework.core.io.FileSystemResource">
<constructor-arg value="${REPORT_FILENAME}" />
</bean>
</property>
<property name="shouldDeleteIfExists" value="true" />
<property name="lineAggregator">
<bean class="com.example.CSVLineAggregator" />
</property>
<property name="headerCallback">
<bean class="com.example.CSVHeaderWriter" >
<constructor-arg value="${REPORT_HEADER}" />
</bean>
</property>
</bean>
You need to configure your step with late binding to solve your problem.
Late binding allow you to evaluate an expression during job/step execution and not in a static way (during xml pasring, to be clear) using values stored in jobParameters and job or step execution context; thise contexts are available only during job/step execution!
To enable this feature:
mark your artifacts with special step="scope"
Add to jobParameters with name 'REPORT_FILENAME' and value 'report-7-2-2014-16-03.csv' (of course this value will change every time you launch the job and is your reponsibility to make it unique and create with right date/time)
use spEL with special #{jobParameters['REPORT_FILENAME']} expression to let SB extract the report name from job parameters dinamically
<bean id="csvWriter" class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
<property name="resource" value="file://report-dir/#{jobParameters['REPORT_FILENAME']}" />
<property name="shouldDeleteIfExists" value="true" />
<property name="lineAggregator">
<bean class="com.example.CSVLineAggregator" />
</property>
<property name="headerCallback">
<bean class="com.example.CSVHeaderWriter" >
<constructor-arg value="${REPORT_HEADER}" />
</bean>
</property>
</bean>
This is just a small example but you can read more about Late binding

<jee:jndi-lookup default-value and the use of classpath

I am really stuck on this one... Help! :)
I am using j2ee:jndi lookup for a property file. The following works fine:
<bean class="org.springframework.core.io.FileSystemResource">
<constructor-arg>
<jee:jndi-lookup id="myProps" jndi-name="myProps" resource-ref="true" />
</constructor-arg>
</bean>
However, I want to handle the case where the jndi lookup fails but will fall back on a default file located in WEB-INF/classes folder. If I use the default-value as below, the webapp throws an exception complaining that it cannot find the file "classpath:myprops.properties"
<bean class="org.springframework.core.io.FileSystemResource">
<constructor-arg>
<jee:jndi-lookup id="myProps" jndi-name="myProps" resource-ref="true"
default-value="classpath:myprops.properties" />
</constructor-arg>
</bean>
However, if I hard-code a specific path for default-value, then it works fine, but that is unacceptable as a final solution.
Thus, my issue is how to use "classpath:" so that it gets properly resolved?
This is the overall usage I'm employing:
<bean id="authServerProperties"
class="org.springframework.beans.factory.config.PropertiesFactoryBean">
<property name="ignoreResourceNotFound" value="true"/>
<property name="location">
<bean class="org.springframework.core.io.FileSystemResource">
<constructor-arg>
<jee:jndi-lookup id="myProps" jndi-name="myProps" resource-ref="true"
default-value="classpath:myprops.properties" />
</constructor-arg>
</bean>
</property>
.....
</bean>
Let Spring use its built-in PropertyEditor support to decide on the type of resource, rather than supplying an explicit FileSystemResource bean as this won't work with classpath resources (it needs to be configured with a path on the file system). Instead you should use something like
<bean id="authServerProperties"
class="org.springframework.beans.factory.config.PropertiesFactoryBean">
<property name="ignoreResourceNotFound" value="true"/>
<property name="location" ref="myProps" />
</bean>
<jee:jndi-lookup id="myProps" jndi-name="myProps" resource-ref="true"
default-value="classpath:myprops.properties"/>
Here we are setting the location to be a string value and allowing Spring to convert that to the appropriate resource type, so if you have
<env-entry>
<env-entry-name>myProps</env-entry-name>
<env-entry-type>java.lang.String</env-entry-type>
<env-entry-value>file:///Users/something/myProps.properties</env-entry-value>
</env-entry>
in your web.xml, it will use a UrlResource with the given file URL, otherwise it will create a ClasspathResource to look for the file myprops.properties.

Setting properties in Spring

Is there a way to change the properties of a file? I'm trying to run selenium tests in parallel, with Spring and Jetty, so I'm trying to configure the url of the database, the port of the jettyserver and the port of the selenium server. So that I'm able to initialize two or more servers where the tests can run on.
My server.properties file contains this:
jdbc.url=jdbc:hsqldb:hsql://localhost/bibliothouris_scenario
jetty.port=8081
seleniumServer.port=4444
I can read those properties with a PropertyPlaceholderConfigurer, and I need the database URL, jettyport and seleniumserver port to be flexible.
I have declared them like this:
In my applicationContext.xml:
<bean class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
<property name="location">
<value>classpath:server.properties</value>
</property>
</bean>
<bean id="dataSource" destroy-method="close" class="org.apache.commons.dbcp.BasicDataSource">
<property name="driverClassName" value="org.hsqldb.jdbcDriver" />
<property name="url" value="${jdbc.url}" />
<property name="username" value="sa" />
<property name="password" value="" />
</bean>
In the serverContext.xml file:
<bean class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
<property name="location">
<value>classpath:server.properties</value>
</property>
</bean>
<bean class="com.~companyName~.bibliothouris.jetty.JettyServer" init-method="start" destroy-method="stop">
<constructor-arg value="${jetty.port}" />
<constructor-arg ref="dataSource" />
</bean>
<bean class="org.openqa.selenium.server.SeleniumServer" init-method="start" destroy-method="stop">
<constructor-arg>
<bean class="org.openqa.selenium.server.RemoteControlConfiguration">
<property name="port" value="${seleniumServer.port}" />
<property name="singleWindow" value="true" />
<property name="timeoutInSeconds" value="10" />
</bean>
</constructor-arg>
</bean>
<bean class="com.thoughtworks.selenium.DefaultSelenium" init-method="start" destroy-method="stop" lazy-init="true">
<constructor-arg>
<bean class="com.thoughtworks.selenium.HttpCommandProcessor">
<constructor-arg value="localhost" />
<constructor-arg value="${seleniumServer.port}" />
<constructor-arg value="*firefox c:/~companyname~/firefox/firefox.exe" />
<constructor-arg value="http://localhost:${jetty.port}" />
</bean>
</constructor-arg>
</bean>
When I change the data in server.properties the selenium tests run on the right servers with the right ports, without failures.
So now I'm looking for a method to change the properties in the server.properties file.
Kind regards and thanks in advance
I solved this by having a flag in my build process (I'm using Maven) that chose which property file to include in the final war. This way you can include different artifacts (different property files) with different properties without having to mess with the low level property support of Spring.
If you do need to do this is Spring only, I would recommend going for a Java based configuration, where you can get and set the properties by in code not in XML.
Is there a way to change the
properties of a file?
No, but you could solve this in the following ways.
Split the properties into jdbc.properties (for applicationContext.xml) and test.properties (for serverContext.xml)
override server.properties via a src/test/resources resource
use system properties in addition to server.properties (use PropertyPlaceholderConfigurer.setSystemPropertiesMode for this)
Thanks for the help guys, without your info, I couldn't find my own solution. Here it is:
try {
Properties props = new Properties();
FileInputStream fileInputStream = new FileInputStream(
"C:\\~CompanyName~\\workspace\\bibliothouris\\infrastructure\\src\\main\\resources\\server.properties");
props.load(fileInputStream);
fileInputStream.close();
props.setProperty("seleniumServer.port", "4445");
FileOutputStream fileOutputStream = new FileOutputStream(
"C:\\~CompanyName~\\workspace\\bibliothouris\\infrastructure\\src\\main\\resources\\server.properties");
props.store(fileOutputStream, "");
fileOutputStream.close();
} catch (Exception e) {
e.printStackTrace();
}
I've wrote this piece of code in a testclass, now I have to create a method of it, which takes a few arguments (the URL, jettyport and seleniumport). And I have to change the path to a relative one.
Thanks for the help!

Resources