Yaml properties as a Map in Spring Boot - spring

We have a spring-boot project and are using application.yml files. This works exactly as described in the spring-boot documentation. spring-boot automatically looks in several locations for the files, and obeys any environment overrides we use for the location of those files.
Now we want to also expose those yaml properties as a Map. According to the documentation this can be done with YamlMapFactoryBean. However YamlMapFactoryBean wants me to specify which yaml files to use via the resources property. I want it to use the same yaml files and processing hierarchy that it used when creating properties, so that I can take still take advantage of "magical" features such as placeholder resolution in property values.
I didn't see any documentation on if this was possible.
I was thinking of writing a MapFactoryBean that looked at the environment and simply reversed the "flattening" performed by the YamlProcessor when creating the properties representation of the file.
Any other ideas?

The ConfigFileApplicationContextListener contains the logic for searching for files in various locations. And PropertySourcesLoader loads a file (Resource) into property sources. Neither is really designed for standalone use, but you could easily duplicate them if you want more control. The PropertySourcesLoader delegates to a collection of PropertySourceLoaders so you could add one of the latter that delegates to your YamlMapFactoryBean.
A slightly awkward but workable solution would be to use the existing machinery to collect the YAML on startup. Add a new PropertySourceLoader to your META-INF/spring.factories and let it create new property sources, then post process the Environment to extract the source map(s).
Beware, though: creating a single Map from multiple YAML files, or even a single one with multiple documents (let alone multiple files with multiple documents) isn't as easy as you might think. You have a map-merge problem, and someone is going to have to define the algorithm. The flattening done in YamlMapPropertiesBean and the merge in YamlMapFactoryBean are just two choices out of (probably) a larger set of possibilities.

Related

spring-boot profiles : resources/profiles/application-*.properties

I have a lot of profiles, I want to put it to /profiles folder, I do not want to put it to classpath:/,classpath:/config/,file:./,file:./config/, I saw the Class ConfigFileApplicationListener, Is there any other way?
Hard to tell exactly what you are asking, but it sounds like you want to put your profile-specific properties files in a /profiles directory, rather than one of the default search locations.
In the ConfigFileApplicationListener, you can specify /profiles using setSearchLocations()
Alternative search locations and names can be specified using
setSearchLocations(String) and setSearchNames(String).
From the Javadocs for ConfigFileApplicationListener.setSearchLocations():
Set the search locations that will be considered as a comma-separated
list. Each search location should be a directory path (ending in
"/") and it will be prefixed by the file names constructed from
setSearchNames(String) search names and profiles (if any)
plus file extensions supported by the properties loaders. Locations
are considered in the order specified, with later items taking
precedence (like a map merge).
Add the custom ConfigFileApplicationListener to the SpringApplication via the addListeners() method before calling run().
Alternative Method
As a somewhat easier workaround, I find #masted's solution here this method to be much easier. The only downside is you need to add an environment variable,
-Dext.properties.dir=classpath:/profiles/
followed by whatever environment you want to point to.

application.yml vs application.properties for Spring Boot

In my project i'm currently using application.yml for configuration. Spring Initializr generate application.properties? What are the Pro/Cons for each one?
Well, they are just different data formats. Which one's nicer and easier to read? That's obviously subjective. Here's a useful blog post.
As far as spring-boot configuration is concerned, note that there's only one documented shortcoming of using YAML. Per the documentation:
YAML files can’t be loaded via the #PropertySource annotation. So in the case that you need to load values that way, you need to use a properties file.
As per my knowledge, these are at least some of the differences:
.properties stores data in sequential format, whereas
.yml stores data in hierarchical format.
.properties supports only key-value pairs (basically string values), whereas
.yml supports key-value pair, as well as map, list & scalar type values.
.properties is specifically used by Java, whereas
.yml can be used by other languages (eg Java, Python, ROR, etc).
When managing multiple configuration profiles, then:
.properties requires you to create .properties file per every profile, whereas in
.yml you can create a section for each specific profile inside a single .yml file.
In Spring projects, #PropertySource annotation can only be used with .properties.
One notable difference is how the properties are represented in each file. YAML files may use consistent spaces to denote hierarchy whereas properties file may use = to denote property values.
For ex.
Lists are represented hierarchically in YAML:
headers:
- user-agent
- x-wag-diagonalsize
Lists may be represented as inline list (separated by commas) in a properties file:
headers = user-agent, x-wag-diagonalsize
Another difference is we can add multiple configuration files into single yaml file.
For ex., we can add application.yaml(application specific properties) and bootstrap.yaml(server specific properties) into single config.yaml file

Using Page Objects vs Config Files in Selenium

I've been using Ruby Selenium-Webdriver for one of the automation scripts I'm developing and I'm being asked to use Page Objects, we use page objects a lot however for this application I am using CSV file instead, I have defined all the xpaths that I'm using in my application in a CSV file and I'm parsing that CSV file in my script to refer to those objects, I would like to know is there much of a difference in using a class for defining Page Objects or using a CSV file instead apart from performance concern? I believe using a CSV file will be an addon for us from configuration standpoint and will make it much easier to maintain, any suggestions on this?
Edit - In our use case, we're actually automating applications built on a cloud based tool, so basically all the applications share same design structure from HTML standpoint so we define xpath patterns in CSV and then we pass certain parameters to some custom methods that we've developed to generate xpath's automatically using the CSV instead of finding those manually as its overhead for us because we already know that all the applications will share similar xpath pattern for all elements.
Thanks
I think, POM is better than CSV approach. In POM, you put elements for a page in a separate class file. So, if any change is to make then it's easier to find where to change/maintain. Moreover, it won't get too messy as CSV file and you don't need to use extra utility function to parse those.
There is also a pageobjects gem that provides a set of libraries over and above webdriver/watir, simplifying the code.
Plus, why xpaths? Its one of the last recommended ways to identify an element.
As for the frameork aspect, csv should be more of a maintenance problem than PageObjects. Its the basic difference between text and code. You enforce Object oriented approach on your elements in PageObjects but that is not possible with csv.
In the best case scenario, you have created a column/separate sheets defining which page that element xpath belongs to. That sounds like an overhead. As your application / suite grows there can be thousands of elements. Imagine parsing/ manually updating a csv with that kind of data.
Instead in PageObjects, your elements will be restricted to the Page. Any changes to the app will also specify which elements may get impacted. Now, when define your element as an object in PageObject, rather than css, you also dont need to explicitly create your elements by reading the csv.
It completely depends on the application and the type of test you might perform.
Since it is an automated test script, you do not have to really worry about the performance of the script (it might take few more milli seconds to parse, which should be OK).
Maintaining all the elements identification properties & corresponding actions in a CSV file will make the maintenance easier and make the framework application independent which are nice. But maintaining your framework is bit difficult to make it more robust. Both approaches have its own pros and cons.
Refer to below posts [examples are in java - but you will get the idea]:
Keyword driven framework
Advanced Page Objects
Update:
If you like both, you can comeup with your implementation to easily integrate these too.
#ObjectRepository(src="/login.csv")
public class LoginPage{
private Map<String, WebElement> elements;
public void login(){
elements.get("username").sendKeys('');
elements.get("password").sendKeys('');
elements.get("signin").click();
}
}
Ie, define all the elements in a config file like csv/json etc. Let the page object refer to the class for the page elements. All the methods will be part of the page class.

Issue with setting multiple projectionSchemas for AvroParquetInputFormat

I use AvroParquetInputFormat. The usecase requires scanning of multiple input directories and each directory will have files with one schema. Since AvroParquetInputFormat class could not handle multiple input schemas, I created a workaround by statically creating multiple dummy classes like MyAvroParquetInputFormat1, MyAvroParquetInputFormat2 etc where each class just inherits from AvroParquetInputFormat. And for each directory, I set a different MyAvroParquetInputFormat and that worked (please let me know if there is a cleaner way to achieve this).
My current problem is as follows:
Each file has a few hundred columns and based on meta-data I construct a projectionSchema for each directory, to reduce unnecessary disk & network IO. I use the static setRequestedProjection() method on each of my MyAvroParquetInputFormat classes. But, being static, the last call’s projectionSchema is used for reading data from all directories, which is not the required behavior.
Any pointers to workarounds/solutions would is highly appreciated.
Thanks & Regards
MK
Keep in mind that if your avro schemas are compatible (see avro doc for definition of schema compatibility) you can access all the data with a single schema. Extending on this, it is also possible to construct a parquet friendly schema (no unions) that is compatible with all your schemas so you can use just that one.
As for the approach you took, there is no easy way of doing this that I know of. You have to extend MultipleInputs functionality somehow to assign a different schema for each of your input formats. MultipleInputs works by setting two configuration properties in your job configuration:
mapreduce.input.multipleinputs.dir.formats //contains a comma separated list of InputFormat classes
mapreduce.input.multipleinputs.dir.mappers //contains a comma separated list of Mapper classes.
These two lists must be the same length. And this is where it gets tricky. This information is used deep within hadoop code to initialize mappers and input formats, so that's where you should add your own code.
As an alternative, I would suggest that you do the projection using one of the tools already available, such as hive. If there are not too many different schemas, you can write a set of simple hive queries to do the projection for each of the schemas, and after that you can use a single mapper to process the data or whatever the hell you want.

How do I get CsvDozerBeanWriter to pull column headers from Dozer XML mapping files

I'm writing a feature to produce CSV snapshots of screen data.
I need this to be data-driven. Thus I need to avoid hard-coding each snapshot in Java, but rather load it from a data source such as an XML file or a database. The data is contained in Java beans.
I'm using SuperCSV with the Dozer extension both at 2.1.0.
This combination seems perfect since I can code the mappings from the beans to the columns in Dozer XML mapping files.
This works well for the data, but I have not found a way to specify the strings to use for the CSV's column headers other than to hard-code them in Java as is done in all of the examples and test cases I've looked at. That is not data-driven.
Is there a way for me to code the column headers in the mapper file. Or even to extract them from the mapper file, construct a List and pass them to the writerHeader() method?
I think it would be OK to just use the bean property names as the headers, although ideal situation is that I am provided some additional meta-data notation in the XML's <Field> tag that specifies the header.
I'd have posted this on SourceForge, but I'm getting a 500 error there.
I'm a Super CSV developer. You're the first person I've heard of who's using CsvDozerBeanWriter with their own DozerBeanMapper - great to hear that feature is useful :)
So what's the goal of being 'data driven'? It sounds like you want your code to be really generic, so you can alter the CSV just by changing the XML. Is that right? Of course, you can't configure the cell processors dynamically...or are you trying to do that too!!??
I'd take a look at the MappingMetadata API of Dozer, which you can access by calling getMappingMetadata() on the DozerBeanMapper. I've never used it, but it looks like you could derive the column names this way (though you'd probably be limited to the field names).
Otherwise, you'll have to parse the XML file yourself (I'd probably use XPath). You'd have to do it this way if you want to use some other metadata in the XML for the column name.

Resources