application.yml vs application.properties for Spring Boot

In my project I'm currently using application.yml for configuration, but Spring Initializr generates application.properties. What are the pros and cons of each one?

Well, they are just different data formats. Which one's nicer and easier to read? That's obviously subjective. Here's a useful blog post.
As far as spring-boot configuration is concerned, note that there's only one documented shortcoming of using YAML. Per the documentation:
YAML files can't be loaded via the @PropertySource annotation. So in the case that you need to load values that way, you need to use a properties file.
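For instance, a configuration class like the sketch below works when pointed at a .properties file but not at a YAML file; the file name and property key here are hypothetical:

import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.PropertySource;

// Loads key=value pairs from a classic properties file; pointing this at a
// .yml file would not give you the nested keys you might expect.
@Configuration
@PropertySource("classpath:custom.properties")
public class CustomConfig {

    // Resolves the hypothetical key custom.greeting from custom.properties.
    @Value("${custom.greeting}")
    private String greeting;
}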

As far as I know, these are at least some of the differences:
.properties stores data in a flat, sequential format, whereas
.yml stores data in a hierarchical format.
.properties supports only key-value pairs (basically string values), whereas
.yml supports key-value pairs as well as maps, lists and scalar values.
.properties is specific to Java, whereas
.yml can be used by other languages as well (e.g. Java, Python, Ruby on Rails, etc.).
When managing multiple configuration profiles:
.properties requires a separate .properties file per profile, whereas with
.yml you can create a section for each profile inside a single .yml file (see the sketch after this list).
In Spring projects, the @PropertySource annotation can only be used with .properties.
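To illustrate the profile point above, a sketch (the port values and the prod profile name are made up): with properties files you would typically keep one file per profile, such as application-dev.properties and application-prod.properties, each containing its own server.port=... line. With YAML the same configuration can live in a single application.yml, using --- to separate the per-profile documents:

server:
  port: 8080
---
spring:
  profiles: prod   # spring.config.activate.on-profile in Spring Boot 2.4+
server:
  port: 80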

One notable difference is how values are represented in each file. YAML uses consistent indentation to express hierarchy, whereas a properties file uses flat keys with = between each key and its value.
For example, lists are represented hierarchically in YAML:
headers:
- user-agent
- x-wag-diagonalsize
In a properties file, lists may be represented as an inline, comma-separated list:
headers = user-agent, x-wag-diagonalsize
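For comparison, Spring Boot's binding also accepts indexed keys in a properties file, which keeps the list structure explicit (same hypothetical property as above):

headers[0]=user-agent
headers[1]=x-wag-diagonalsize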
Another difference is that multiple configuration documents can be combined into a single YAML file.
For example, the contents of application.yaml (application-specific properties) and bootstrap.yaml (server-specific properties) can be merged into a single config.yaml file, with --- separating the documents.
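A rough sketch of what such a combined file could look like (the keys below are only placeholders):

# formerly application.yaml
spring:
  application:
    name: demo-app
---
# formerly bootstrap.yaml
server:
  port: 8081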

Related

spring-boot profiles : resources/profiles/application-*.properties

I have a lot of profiles and I want to put them in a /profiles folder. I do not want to put them in classpath:/, classpath:/config/, file:./, or file:./config/. I saw the class ConfigFileApplicationListener. Is there any other way?
Hard to tell exactly what you are asking, but it sounds like you want to put your profile-specific properties files in a /profiles directory, rather than one of the default search locations.
In the ConfigFileApplicationListener, you can specify /profiles using setSearchLocations()
Alternative search locations and names can be specified using
setSearchLocations(String) and setSearchNames(String).
From the Javadocs for ConfigFileApplicationListener.setSearchLocations():
Set the search locations that will be considered as a comma-separated list. Each search location should be a directory path (ending in "/") and it will be prefixed by the file names constructed from setSearchNames(String) search names and profiles (if any) plus file extensions supported by the properties loaders. Locations are considered in the order specified, with later items taking precedence (like a map merge).
Add the custom ConfigFileApplicationListener to the SpringApplication via the addListeners() method before calling run().
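A minimal sketch of that wiring (the application class name is hypothetical; ConfigFileApplicationListener was deprecated in Spring Boot 2.4 and later removed, so this applies to the older versions this question targets):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.context.config.ConfigFileApplicationListener;

@SpringBootApplication
public class MyApplication {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(MyApplication.class);

        // Look for application[-profile].properties/.yml under /profiles
        // instead of the default search locations.
        ConfigFileApplicationListener listener = new ConfigFileApplicationListener();
        listener.setSearchLocations("classpath:/profiles/");

        app.addListeners(listener);
        app.run(args);
    }
}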
Alternative Method
As a somewhat easier workaround, I find the solution #masted describes here to be much simpler. The only downside is that you need to add a system property,
-Dext.properties.dir=classpath:/profiles/
followed by whatever environment you want to point to.

Nested YAML for hiera in puppet

I am working on designing a puppet architecture for our company. I really like the idea of hiera and YAML files to classify my nodes. However, I would really like to be able to either apply YAML files that aren't based on facts or import YAML files into another YAML file.
For example, NodeA.yaml:
---
include webserver.yaml
include public.yaml
classes:
etc. . .
This would allow me to reuse my code as much as possible. That way when I make a change to my web server configs, I only have to do it in one file instead of every node's YAML file.
I'm open to other solutions as well.
YAML does not support import or include statements.
(Not recommended) You can use loadyaml from the stdlib module to achieve the desired functionality. Check this example of using the loadyaml function.
You can easily achieve the expected functionality just by designing a proper hiera hierarchy. I do not understand why you do not want to use Facter facts. For example, define a custom Facter fact, location, on each node. Next, define a hiera hierarchy:
:hierarchy:
- "%{::location}/%{::fqdn}"
- "%{::location}/common"
- common
Next, in the file location_1/node1.yaml you define data specific only to node1. In location_1/common.yaml you define data common to all nodes in location_1. In common.yaml you define data common to all nodes.
If some data is common to all nodes, you define it once in common.yaml and that's all. You do not have to redundantly define it in every node's YAML file.

Issue with setting multiple projectionSchemas for AvroParquetInputFormat

I use AvroParquetInputFormat. The use case requires scanning multiple input directories, and each directory will have files with one schema. Since the AvroParquetInputFormat class cannot handle multiple input schemas, I created a workaround by statically creating multiple dummy classes like MyAvroParquetInputFormat1, MyAvroParquetInputFormat2, etc., where each class just inherits from AvroParquetInputFormat. For each directory I set a different MyAvroParquetInputFormat, and that worked (please let me know if there is a cleaner way to achieve this).
My current problem is as follows:
Each file has a few hundred columns and based on meta-data I construct a projectionSchema for each directory, to reduce unnecessary disk & network IO. I use the static setRequestedProjection() method on each of my MyAvroParquetInputFormat classes. But, being static, the last call’s projectionSchema is used for reading data from all directories, which is not the required behavior.
Any pointers to workarounds or solutions would be highly appreciated.
Keep in mind that if your Avro schemas are compatible (see the Avro documentation for the definition of schema compatibility), you can access all the data with a single schema. Extending on this, it is also possible to construct a Parquet-friendly schema (no unions) that is compatible with all your schemas, so you can use just that one.
As for the approach you took, there is no easy way of doing this that I know of. You have to extend MultipleInputs functionality somehow to assign a different schema for each of your input formats. MultipleInputs works by setting two configuration properties in your job configuration:
mapreduce.input.multipleinputs.dir.formats // contains a comma-separated list of InputFormat classes
mapreduce.input.multipleinputs.dir.mappers // contains a comma-separated list of Mapper classes
These two lists must be the same length, and this is where it gets tricky. This information is used deep within Hadoop code to initialize mappers and input formats, so that's where you would add your own code.
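For reference, this is roughly how those two lists get populated in an ordinary job setup; the mapper classes below are placeholders, and plain TextInputFormat stands in for the per-directory Parquet input formats:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultiInputJobSetup {

    // Placeholder mappers; in the scenario above these would be the mappers
    // paired with the per-directory input format subclasses.
    public static class DirOneMapper extends Mapper<LongWritable, Text, LongWritable, Text> { }
    public static class DirTwoMapper extends Mapper<LongWritable, Text, LongWritable, Text> { }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "multi-input-demo");
        job.setJarByClass(MultiInputJobSetup.class);

        // Each call appends one entry to each of the two comma-separated lists above.
        MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, DirOneMapper.class);
        MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, DirTwoMapper.class);

        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setNumReduceTasks(0);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}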
As an alternative, I would suggest that you do the projection using one of the tools already available, such as Hive. If there are not too many different schemas, you can write a set of simple Hive queries to do the projection for each of the schemas, and after that you can use a single mapper to process the data or whatever the hell you want.

Yaml properties as a Map in Spring Boot

We have a spring-boot project and are using application.yml files. This works exactly as described in the spring-boot documentation. spring-boot automatically looks in several locations for the files, and obeys any environment overrides we use for the location of those files.
Now we want to also expose those YAML properties as a Map. According to the documentation this can be done with YamlMapFactoryBean. However, YamlMapFactoryBean wants me to specify which YAML files to use via the resources property. I want it to use the same YAML files and processing hierarchy that it used when creating the properties, so that I can still take advantage of "magical" features such as placeholder resolution in property values.
I didn't see any documentation on if this was possible.
I was thinking of writing a MapFactoryBean that looked at the environment and simply reversed the "flattening" performed by the YamlProcessor when creating the properties representation of the file.
Any other ideas?
The ConfigFileApplicationListener contains the logic for searching for files in various locations, and PropertySourcesLoader loads a file (Resource) into property sources. Neither is really designed for standalone use, but you could easily duplicate them if you want more control. The PropertySourcesLoader delegates to a collection of PropertySourceLoaders, so you could add one of the latter that delegates to your YamlMapFactoryBean.
A slightly awkward but workable solution would be to use the existing machinery to collect the YAML on startup. Add a new PropertySourceLoader to your META-INF/spring.factories and let it create new property sources, then post process the Environment to extract the source map(s).
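A sketch of that idea, assuming the Spring Boot 2.x PropertySourceLoader signature (the class name and behaviour are hypothetical, and a real implementation would need to coexist with the built-in YAML loader); it would be registered in META-INF/spring.factories under the org.springframework.boot.env.PropertySourceLoader key:

import java.io.IOException;
import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.springframework.beans.factory.config.YamlMapFactoryBean;
import org.springframework.boot.env.PropertySourceLoader;
import org.springframework.core.env.MapPropertySource;
import org.springframework.core.env.PropertySource;
import org.springframework.core.io.Resource;

// Keeps the raw (un-flattened) YAML map around as a property source so it can
// be pulled out of the Environment later.
public class YamlMapPropertySourceLoader implements PropertySourceLoader {

    @Override
    public String[] getFileExtensions() {
        return new String[] { "yml", "yaml" };
    }

    @Override
    public List<PropertySource<?>> load(String name, Resource resource) throws IOException {
        YamlMapFactoryBean factory = new YamlMapFactoryBean();
        factory.setResources(resource);
        Map<String, Object> yamlAsMap = factory.getObject();
        return Collections.singletonList(new MapPropertySource(name, yamlAsMap));
    }
}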
Beware, though: creating a single Map from multiple YAML files, or even from a single file with multiple documents (let alone multiple files with multiple documents), isn't as easy as you might think. You have a map-merge problem, and someone is going to have to define the algorithm. The flattening done in YamlPropertiesFactoryBean and the merge done in YamlMapFactoryBean are just two choices out of (probably) a larger set of possibilities.

How do I get CsvDozerBeanWriter to pull column headers from Dozer XML mapping files

I'm writing a feature to produce CSV snapshots of screen data.
I need this to be data-driven. Thus I need to avoid hard-coding each snapshot in Java, but rather load it from a data source such as an XML file or a database. The data is contained in Java beans.
I'm using Super CSV with the Dozer extension, both at version 2.1.0.
This combination seems perfect since I can code the mappings from the beans to the columns in Dozer XML mapping files.
This works well for the data, but I have not found a way to specify the strings to use for the CSV's column headers other than to hard-code them in Java as is done in all of the examples and test cases I've looked at. That is not data-driven.
Is there a way for me to code the column headers in the mapper file? Or even to extract them from the mapper file, construct a List, and pass it to the writeHeader() method?
I think it would be OK to just use the bean property names as the headers, although the ideal situation is that I am provided some additional metadata notation in the XML's <field> tag that specifies the header.
I'd have posted this on SourceForge, but I'm getting a 500 error there.
I'm a Super CSV developer. You're the first person I've heard of who's using CsvDozerBeanWriter with their own DozerBeanMapper - great to hear that feature is useful :)
So what's the goal of being 'data driven'? It sounds like you want your code to be really generic, so you can alter the CSV just by changing the XML. Is that right? Of course, you can't configure the cell processors dynamically...or are you trying to do that too!!??
I'd take a look at the MappingMetadata API of Dozer, which you can access by calling getMappingMetadata() on the DozerBeanMapper. I've never used it, but it looks like you could derive the column names this way (though you'd probably be limited to the field names).
Otherwise, you'll have to parse the XML file yourself (I'd probably use XPath). You'd have to do it this way if you want to use some other metadata in the XML for the column name.
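If you do go the XPath route, a rough sketch using only standard JDK APIs might look like the following. It assumes the usual Dozer mapping layout, where each field element has a and b children naming the source and destination fields, and the mapping file name is hypothetical:

import java.io.File;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class DozerHeaderExtractor {

    // Collects the destination field names (the <b> elements) from a Dozer
    // mapping file; the resulting list could then be passed (as an array) to
    // the writer's writeHeader(...) method.
    public static List<String> extractHeaders(File mappingFile) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(mappingFile);
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate("//field/b", doc, XPathConstants.NODESET);

        List<String> headers = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            headers.add(nodes.item(i).getTextContent().trim());
        }
        return headers;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical mapping file location.
        System.out.println(extractHeaders(new File("dozer-mapping.xml")));
    }
}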
