Include YAML files from snakeyaml - yaml

I would like to have YAML files with an include, similar to this question, but with SnakeYAML:
How can I include an YAML file inside another?
For example:
%YAML 1.2
---
!include "load.yml"
!include "load2.yml"
I am having a lot of trouble with it. I have the Constructor defined, and I can make it import one document, but not two. The error I get is:
Exception in thread "main" expected '<document start>', but found Tag
in 'reader', line 5, column 1:
!include "load2.yml"
^
With one include, SnakeYAML is happy: it finds an EOF and processes the import. With two, it is not (see the error above).
My Java source is:
package yaml;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;

import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.constructor.AbstractConstruct;
import org.yaml.snakeyaml.constructor.Constructor;
import org.yaml.snakeyaml.nodes.Node;
import org.yaml.snakeyaml.nodes.ScalarNode;
import org.yaml.snakeyaml.nodes.Tag;

public class Main {

    final static Constructor constructor = new MyConstructor();

    private static class ImportConstruct extends AbstractConstruct {
        @Override
        public Object construct(Node node) {
            if (!(node instanceof ScalarNode)) {
                throw new IllegalArgumentException("Non-scalar !import: " + node.toString());
            }
            final ScalarNode scalarNode = (ScalarNode) node;
            final String value = scalarNode.getValue();
            File file = new File("src/imports/" + value);
            if (!file.exists()) {
                return null;
            }
            try {
                final InputStream input = new FileInputStream(new File("src/imports/" + value));
                final Yaml yaml = new Yaml(constructor);
                return yaml.loadAll(input);
            } catch (FileNotFoundException ex) {
                ex.printStackTrace();
            }
            return null;
        }
    }

    private static class MyConstructor extends Constructor {
        public MyConstructor() {
            yamlConstructors.put(new Tag("!include"), new ImportConstruct());
        }
    }

    public static void main(String[] args) {
        try {
            final InputStream input = new FileInputStream(new File("src/imports/example.yml"));
            final Yaml yaml = new Yaml(constructor);
            Object object = yaml.load(input);
            System.out.println("Loaded");
        } catch (FileNotFoundException ex) {
            ex.printStackTrace();
        }
    }
}
The question is: has anybody done something similar with SnakeYAML? Any thoughts as to what I might be doing wrong?

I see two issues:
final InputStream input = new FileInputStream(new File("src/imports/" + value));
final Yaml yaml = new Yaml(constructor);
return yaml.loadAll(input);
You should be using yaml.load(input), not yaml.loadAll(input). The loadAll() method returns multiple objects, but the construct() method expects to return a single object.
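For reference, a minimal sketch of the corrected ImportConstruct (assuming it stays nested inside the question's Main class and keeps the same src/imports layout; add an import for java.io.IOException). The substantive change is load() instead of loadAll():
private static class ImportConstruct extends AbstractConstruct {
    @Override
    public Object construct(Node node) {
        if (!(node instanceof ScalarNode)) {
            throw new IllegalArgumentException("Non-scalar !include: " + node);
        }
        final String value = ((ScalarNode) node).getValue();
        // try-with-resources closes the stream; load() returns the single root
        // object of the included document, which is what construct() must return
        try (InputStream input = new FileInputStream(new File("src/imports/" + value))) {
            return new Yaml(constructor).load(input);
        } catch (IOException ex) {
            throw new IllegalArgumentException("Cannot include " + value, ex);
        }
    }
}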
The other issue is that you may have some inconsistent expectations with the way that the YAML processing pipeline works:
If you expect your !include to work the way the C preprocessor does, splicing the text of the included file into the including one, then you would have to implement it at an earlier stage of the pipeline, while parsing (the presentation stage) or composing (the serialization stage). But you have implemented it while constructing (the representation stage), so !include returns an object, and the structure of your YAML file must be consistent with that.
Let's say that you have the following files:
test1a.yaml
activity: "herding cats"
test1b.yaml
33
test1.yaml
favorites: !include test1a.yaml
age: !include test1b.yaml
This would work ok, and would be equivalent to
favorites:
activity: "herding cats"
age: 33
But the following file would not work:
!include test1a.yaml
!include test1b.yaml
because there is nothing to say how the two values should be combined into a larger structure. If you want an array, you'd need to do this:
- !include test1a.yaml
- !include test1b.yaml
or, again, handle this custom logic in an earlier stage such as parsing or composing.
Alternatively, you need to tell the YAML library that you are starting a 2nd document (which is what the error is complaining about: expected '<document start>') since YAML supports multiple "documents" (top-level values) in a single .yaml file.
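If you do want to keep several top-level values, here is a sketch of that approach (not from the original answer), reusing the question's file layout. example.yml would separate the documents explicitly:
%YAML 1.2
---
!include "load.yml"
---
!include "load2.yml"
and the caller would iterate them with loadAll() instead of load() (inside a method declared to throw IOException):
final Yaml yaml = new Yaml(constructor);
try (InputStream input = new FileInputStream(new File("src/imports/example.yml"))) {
    for (Object document : yaml.loadAll(input)) {
        System.out.println("Loaded: " + document);
    }
}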

Related

Trying to read a csv file in Jmeter using Beanshell Sampler

I am trying to read the cells in a CSV file. This is the code I have. Please let me know where I am going wrong. I don't see any error in the log viewer, but it isn't printing anything.
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ApacheCommonsCSV {
    public void readCSV() throws IOException {
        String CSV_File_Path = "C:\\source\\Test.csv";
        // read the file
        Reader reader = Files.newBufferedReader(Paths.get(CSV_File_Path));
        // parse the file into csv values
        CSVParser csvParser = new CSVParser(reader, CSVFormat.DEFAULT);
        for (CSVRecord csvRecord : csvParser) {
            // Accessing Values by Column Index
            String name = csvRecord.get(0);
            String product = csvRecord.get(1);
            // print the value to console
            log.info("Record No - " + csvRecord.getRecordNumber());
            log.info("---------------");
            log.info("Name : " + name);
            log.info("Product : " + product);
            log.info("---------------");
        }
    }
}
You're declaring the readCSV() function but never calling it, which is why your code doesn't get executed at all.
You need to add a readCSV() call; if you're lucky it will start working, and if not you will see an error in the jmeter.log file.
For example like this:
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class ApacheCommonsCSV {
    public void readCSV() throws IOException {
        String CSV_File_Path = "C:\\source\\Test.csv";
        // read the file
        Reader reader = Files.newBufferedReader(Paths.get(CSV_File_Path));
        // parse the file into csv values
        CSVParser csvParser = new CSVParser(reader, CSVFormat.DEFAULT);
        for (CSVRecord csvRecord : csvParser) {
            // Accessing Values by Column Index
            String name = csvRecord.get(0);
            String product = csvRecord.get(1);
            // print the value to console
            log.info("Record No - " + csvRecord.getRecordNumber());
            log.info("---------------");
            log.info("Name : " + name);
            log.info("Product : " + product);
            log.info("---------------");
        }
    }

    readCSV(); // here is the entry point
}
Just make sure commons-csv.jar is in the JMeter classpath.
Last 2 cents:
Any reason for not using the CSV Data Set Config?
Since JMeter 3.1 you should be using JSR223 Test Elements and the Groovy language for scripting, so consider migrating to Groovy right away. Check out Apache Groovy - Why and How You Should Use It for reasons, tips and tricks.

Check for words in input text from words collection

I have a collection of noun phrases, around 10,000 of them. I want to check every new input text against this collection and extract the sentences that contain any of these noun phrases. I don't want to run a loop for every word because that makes my code dead slow. I am using Java and Stanford CoreNLP.
A quick and easy way to do this is to use RegexNER to identify all examples of anything in your dictionary, and then check for non "O" NER tags in the sentence.
package edu.stanford.nlp.examples;

import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.util.*;

import java.util.*;
import java.util.stream.Collectors;

public class FindSentencesWithPhrase {

    public static boolean checkForNamedEntity(CoreMap sentence) {
        for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
            if (token.ner() != null && !token.ner().equals("O")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,regexner");
        props.setProperty("regexner.mapping", "phrases.rules");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        String exampleText = "This sentence contains the phrase \"ice cream\"." +
                "This sentence is not of interest. This sentence contains pizza.";
        Annotation ann = new Annotation(exampleText);
        pipeline.annotate(ann);
        for (CoreMap sentence : ann.get(CoreAnnotations.SentencesAnnotation.class)) {
            if (checkForNamedEntity(sentence)) {
                System.out.println("---");
                System.out.println(sentence.get(CoreAnnotations.TokensAnnotation.class).
                        stream().map(token -> token.word()).collect(Collectors.joining(" ")));
            }
        }
    }
}
The file "phrases.rules" should look like this:
ice cream PHRASE_OF_INTEREST MISC 1
pizza PHRASE_OF_INTEREST MISC 1
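Since the question mentions roughly 10,000 noun phrases, it may help to generate phrases.rules from that list. Below is a small hypothetical helper (not part of the answer; the class and method names are made up) that writes one tab-separated rule per phrase:
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class PhraseRulesWriter {

    // writes pattern \t NER tag \t overwritable types \t priority, one phrase per line
    public static void writeRules(List<String> phrases, String outputPath) throws IOException {
        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(Paths.get(outputPath)))) {
            for (String phrase : phrases) {
                out.println(phrase + "\tPHRASE_OF_INTEREST\tMISC\t1");
            }
        }
    }
}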

PIG UDF handle multi-lined tuple split into different mapper

I have a file where each tuple spans multiple lines, for example:
START
name: Jim
phone: 2128789283
address: 56 2nd street, New York, USA
END
START
name: Tom
phone: 6308789283
address: 56 5th street, Chicago, 13611, USA
END
.
.
.
So the above are two tuples in my file. I wrote a UDF that defines a getNext() function which checks whether the current line is START (then I initialize my tuple), or END (then I return the tuple built from a string buffer); otherwise I just append the line to the string buffer.
It works well when the file size is less than the HDFS block size, which is 64 MB (on Amazon EMR), but it fails for anything larger. I tried to google around and found this blog post. Raja's explanation is easy to understand and he provided sample code. But the code implements the RecordReader part instead of getNext() for the Pig LoadFunc. Just wondering if anyone has experience handling this multi-line Pig tuple split problem? Should I go ahead and implement a RecordReader in Pig? If so, how?
Thanks.
You may preprocess your input as Guy mentioned, or apply the other tricks described here.
I think the cleanest solution would be to implement a custom InputFormat (along with its RecordReader) which creates one record per START-END block. Pig's LoadFunc sits on top of Hadoop's InputFormat, so you can define which InputFormat your LoadFunc will use.
A raw, skeleton implementation of a custom LoadFunc would look like this:
import java.io.IOException;

import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.pig.LoadFunc;
import org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

public class CustomLoader extends LoadFunc {

    private RecordReader reader;
    private TupleFactory tupleFactory;

    public CustomLoader() {
        tupleFactory = TupleFactory.getInstance();
    }

    @Override
    public InputFormat getInputFormat() throws IOException {
        return new MyInputFormat(); // custom InputFormat
    }

    @Override
    public Tuple getNext() {
        Tuple result = null;
        try {
            if (!reader.nextKeyValue()) {
                return null;
            }
            // value can be a custom Writable containing your name/value
            // field pairs for a given record
            Object value = reader.getCurrentValue();
            result = tupleFactory.newTuple();
            // ...
            // append fields to tuple
        } catch (Exception e) {
            // ...
        }
        return result;
    }

    @Override
    public void prepareToRead(RecordReader reader, PigSplit pigSplit)
            throws IOException {
        this.reader = reader;
    }

    @Override
    public void setLocation(String location, Job job) throws IOException {
        FileInputFormat.setInputPaths(job, location);
    }
}
After the LoadFunc initializes the InputFormat and its RecordReader, it locates your input data and starts obtaining records from the RecordReader, creating the resulting tuples (getNext()) until the input has been fully read.
Some remarks on the custom InputFormat:
I'd create a custom InputFormat in which the RecordReader is a modified version of org.apache.hadoop.mapreduce.lib.input.LineRecordReader: most of the methods would remain the same, except initialize(), which would call a custom LineReader (based on org.apache.hadoop.util.LineReader). The InputFormat's key would be the line offset (Long) and the value would be a custom Writable holding the fields of a record (i.e. the data between START and END) as a list of key-value pairs. Each time your RecordReader's nextKeyValue() is called, the record is written to the custom Writable by the LineReader. The gist of the whole thing is how you implement LineReader.readLine().
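A rough, untested sketch of this idea follows (class names are illustrative, not from the post). Instead of the custom Writable described above, the value here is a plain Text holding the raw lines between START and END, and the blunt simplification of making files non-splittable guarantees a START/END pair is never cut by a split boundary. Your LoadFunc's getInputFormat() would then return this class, and getNext() would split the Text into fields.
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;

public class StartEndInputFormat extends FileInputFormat<LongWritable, Text> {

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false; // blunt simplification: whole file in one split, so START/END pairs are never cut
    }

    @Override
    public RecordReader<LongWritable, Text> createRecordReader(InputSplit split, TaskAttemptContext context) {
        return new StartEndRecordReader();
    }

    public static class StartEndRecordReader extends RecordReader<LongWritable, Text> {

        private final LineRecordReader lineReader = new LineRecordReader();
        private LongWritable key;
        private Text value;

        @Override
        public void initialize(InputSplit split, TaskAttemptContext context) throws IOException {
            lineReader.initialize(split, context);
        }

        @Override
        public boolean nextKeyValue() throws IOException {
            StringBuilder record = new StringBuilder();
            boolean inRecord = false;
            while (lineReader.nextKeyValue()) {
                String line = lineReader.getCurrentValue().toString().trim();
                if (line.equals("START")) {
                    key = new LongWritable(lineReader.getCurrentKey().get());
                    inRecord = true;
                } else if (line.equals("END") && inRecord) {
                    value = new Text(record.toString());
                    return true; // one record per START..END block
                } else if (inRecord) {
                    record.append(line).append('\n');
                }
            }
            return false; // no more complete records
        }

        @Override
        public LongWritable getCurrentKey() {
            return key;
        }

        @Override
        public Text getCurrentValue() {
            return value;
        }

        @Override
        public float getProgress() throws IOException {
            return lineReader.getProgress();
        }

        @Override
        public void close() throws IOException {
            lineReader.close();
        }
    }
}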
Another, probably easier, approach would be to change the delimiter of TextInputFormat (it is configurable in Hadoop 0.23, see textinputformat.record.delimiter) to one that is appropriate for your data structure (if that is possible). In this case you'll end up with your data in a Text, from which you need to split out and extract the key-value pairs and turn them into tuples.
If you can use START as your delimiter, the code below will probably work without a UDF:
SET textinputformat.record.delimiter 'START';
a = load '<input path>' as (data:chararray);
dump a;
the output would look like:
(
name: Jim
phone: 2128789283
address: 56 2nd street, New York, USA
END
)
(
name: Tom
phone: 6308789283
address: 56 5th street, Chicago, 13611, USA
END
)
Now both are separated into two tuples.

How to filter a TreeGrid?

I currently have a TreeGrid which shows nodes with names. The data comes from a manually populated DataSource.
When I set the filter on the nodeName field, the filtering is not done recursively, so I can only filter the root node.
How can I tell the filter to search for matches in child nodes?
PS: in the code below, I have three nodes, Root > Run > Child1. If I try the filter and type "R", I get Root and Run. But if I type "C", I get "no results found".
Code
DataSource:
package murex.otk.gwt.gui.client.ui.datasource;

import java.util.List;

import murex.otk.gwt.gui.client.ui.record.TreeRecord;

import com.smartgwt.client.data.DataSource;
import com.smartgwt.client.data.fields.DataSourceIntegerField;
import com.smartgwt.client.data.fields.DataSourceTextField;

public class ClassesDataSource extends DataSource {

    private static ClassesDataSource instance = null;

    public static ClassesDataSource getInstance() {
        if (instance == null) {
            instance = new ClassesDataSource("classesDataSource");
        }
        return instance;
    }

    private ClassesDataSource(String id) {
        setID(id);
        DataSourceTextField nodeNameField = new DataSourceTextField("nodeName");
        nodeNameField.setCanFilter(true);
        nodeNameField.setRequired(true);
        DataSourceIntegerField nodeIdField = new DataSourceIntegerField("nodeId");
        nodeIdField.setPrimaryKey(true);
        nodeIdField.setRequired(true);
        DataSourceIntegerField nodeParentField = new DataSourceIntegerField("nodeParent");
        nodeParentField.setRequired(true);
        nodeParentField.setForeignKey(id + ".nodeId");
        nodeParentField.setRootValue(0);
        setFields(nodeIdField, nodeNameField, nodeParentField);
        setClientOnly(true);
    }

    public void populateDataSource(List<String> classNames) {
        TreeRecord root = new TreeRecord("Root", 0);
        addData(root);
        TreeRecord child1 = new TreeRecord("Child1", root.getNodeId());
        addData(child1);
        TreeRecord child2 = new TreeRecord("Run", child1.getNodeId());
        addData(child2);
    }
}
Main
public void onModuleLoad() {
    ClassesDataSource.getInstance().populateDataSource(new ArrayList<String>());
    final TreeGrid employeeTree = new TreeGrid();
    employeeTree.setHeight(350);
    employeeTree.setDataSource(ClassesDataSource.getInstance());
    employeeTree.setAutoFetchData(true);
    TreeGridField field = new TreeGridField("nodeName");
    field.setCanFilter(true);
    employeeTree.setFields(field);
    employeeTree.setShowFilterEditor(true);
    employeeTree.setAutoFetchAsFilter(true);
    employeeTree.setFilterOnKeypress(true);
    employeeTree.draw();
}
I solved this.
The problem was that the filter was calling the server to fetch data, whereas my DataSource was set to client-only. To fix this, the employeeTree must have employeeTree.setLoadDataOnDemand(false);
You may also use employeeTree.setKeepParentsOnFilter(true):
http://www.smartclient.com/smartgwtee/javadoc/com/smartgwt/client/widgets/tree/TreeGrid.html#setKeepParentsOnFilter(java.lang.Boolean)
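Applied to the onModuleLoad() above, the relevant lines would look roughly like this (a sketch; the rest of the setup stays the same):
final TreeGrid employeeTree = new TreeGrid();
employeeTree.setDataSource(ClassesDataSource.getInstance());
employeeTree.setAutoFetchData(true);
// load the whole client-only tree up front so the filter can see child nodes
employeeTree.setLoadDataOnDemand(false);
// keep ancestors of matching children visible in the filtered view
employeeTree.setKeepParentsOnFilter(true);
employeeTree.setShowFilterEditor(true);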

In freemarker is it possible to check to see if a file exists before including it?

We are trying to build a system in freemarker where extension files can be optionally added to replace blocks of the standard template.
We have gotten to this point
<#attempt>
<#include "extension.ftl">
<#recover>
Standard output
</#attempt>
So if the extension.ftl file exists, it is used; otherwise the part inside the recover block is output.
The problem with this is that FreeMarker always logs the error that caused the recover block to trigger.
So we need one of two things:
Don't call the include if the file doesn't exist - thus the need to check for file existence.
-OR-
A way to prevent the logging of the error inside the recover block, without changing the logging configuration in a way that hides ALL FreeMarker errors.
An easier solution would be:
<#attempt>
<#import "xyz.ftl" as xyz>
your_code_here
<#recover>
</#attempt>
We've written a custom macro which solves this for us. In early testing, it works well. To include it, add something like this (where mm is a Spring ModelMap):
mm.addAttribute(IncludeIfExistsMacro.MACRO_NAME, new IncludeIfExistsMacro());
import java.io.IOException;
import java.util.Map;

import org.apache.commons.io.FilenameUtils;

import freemarker.cache.TemplateCache;
import freemarker.cache.TemplateLoader;
import freemarker.core.Environment;
import freemarker.template.TemplateDirectiveBody;
import freemarker.template.TemplateDirectiveModel;
import freemarker.template.TemplateException;
import freemarker.template.TemplateModel;

/**
 * This macro optionally includes the template given by path. If the template isn't found, it doesn't
 * throw an exception; instead it renders the nested content (if any).
 *
 * For example:
 * <#include_if_exists path="module/${behavior}.ftl">
 *     <#include "default.ftl">
 * </#include_if_exists>
 *
 * @param path the path of the template to be included if it exists
 * @param nested (optional) body could be an include directive or any other block of code
 */
public class IncludeIfExistsMacro implements TemplateDirectiveModel {

    private static final String PATH_PARAM = "path";
    public static final String MACRO_NAME = "include_if_exists";

    @Override
    public void execute(Environment environment, Map map, TemplateModel[] templateModel,
            TemplateDirectiveBody directiveBody) throws TemplateException, IOException {
        if (!map.containsKey(PATH_PARAM)) {
            throw new RuntimeException("missing required parameter '" + PATH_PARAM + "' for macro " + MACRO_NAME);
        }
        // get the current template's parent directory to use when searching for relative paths
        final String currentTemplateName = environment.getTemplate().getName();
        final String currentTemplateDir = FilenameUtils.getPath(currentTemplateName);
        // look up the path relative to the current working directory (this also works for absolute paths)
        final String path = map.get(PATH_PARAM).toString();
        final String fullTemplatePath = TemplateCache.getFullTemplatePath(environment, currentTemplateDir, path);
        TemplateLoader templateLoader = environment.getConfiguration().getTemplateLoader();
        if (templateLoader.findTemplateSource(fullTemplatePath) != null) {
            // include the template for the path, if it's found
            environment.include(environment.getTemplateForInclusion(fullTemplatePath, null, true));
        } else {
            // otherwise render the nested content, if there is any
            if (directiveBody != null) {
                directiveBody.render(environment.getOut());
            }
        }
    }
}
I had this exact need as well but I didn't want to use FreeMarker's ObjectConstructor (it felt too much like a scriptlet for my taste).
I wrote a custom FileTemplateLoader:
public class CustomFileTemplateLoader extends FileTemplateLoader {

    private static final String STUB_FTL = "/tools/empty.ftl";

    public CustomFileTemplateLoader(File baseDir) throws IOException {
        super(baseDir);
    }

    @Override
    public Object findTemplateSource(String name) throws IOException {
        Object result = null;
        if (name.startsWith("optional:")) {
            result = super.findTemplateSource(name.replace("optional:", ""));
            if (result == null) {
                result = super.findTemplateSource(STUB_FTL);
            }
        }
        if (result == null) {
            result = super.findTemplateSource(name);
        }
        return result;
    }
}
And my corresponding FreeMarker macro:
<#macro optional_include name>
<#include "/optional:" + name>
</#macro>
An empty FTL file was required (/tools/empty.ftl) which just contains a comment explaining its existence.
The result is that an "optional" include will just include this empty FTL if the requested FTL cannot be found.
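To wire this in, the custom loader has to be registered on the FreeMarker Configuration. A minimal sketch, assuming FreeMarker 2.3.23+ and a template directory called "templates" (both of which are assumptions, not from the original answer):
import java.io.File;
import java.io.IOException;

import freemarker.template.Configuration;

public class FreemarkerSetup {
    public static Configuration buildConfiguration() throws IOException {
        Configuration cfg = new Configuration(Configuration.VERSION_2_3_23);
        // all template lookups, including the "/optional:..." names used by the macro,
        // now go through the custom loader
        cfg.setTemplateLoader(new CustomFileTemplateLoader(new File("templates")));
        return cfg;
    }
}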
You can also use a Java method to check whether the file exists.
Java method:
public static boolean checkFileExistance(String filePath) {
    File tmpDir = new File(filePath);
    boolean exists = tmpDir.exists();
    return exists;
}
FreeMarker code:
<#assign fileExists = (Static["ClassName"].checkFileExistance("Filename"))?c/>
<#if fileExists == "true">
    <#include "/home/demo.ftl"/>
<#else>
    <#include "/home/index.ftl">
</#if>
Try this to get the base path:
<#assign objectConstructor = "freemarker.template.utility.ObjectConstructor"?new()>
<#assign file = objectConstructor("java.io.File","")>
<#assign path = file.getAbsolutePath()>
<script type="text/javascript">
alert("${path?string}");
</script>
Then this to walk the directory structure:
<#assign objectConstructor = "freemarker.template.utility.ObjectConstructor"?new()>
<#assign file = objectConstructor("java.io.File","target/test.ftl")>
<#assign exist = file.exists()>
<script type="text/javascript">
alert("${exist?string}");
</script>
