Apache Camel regex pattern for txt files - spring-boot

I would like to use Apache Camel with a regex pattern for txt files, but I am not sure what the correct pattern is or how to use it in the from() method. The documentation only mentions the include and exclude options. What is the easiest way to check whether a file name matches a regex pattern? Thank you in advance.

The easiest might be the one you've mentioned: the include parameter on the file endpoint. Example (include every txt file):
from("file://input/directory?include=.*\\.txt")
Another option is to implement a GenericFileFilter.
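Before wiring a pattern into the endpoint, it can help to sanity-check it with plain java.util.regex. The sketch below assumes the include option tests the whole file name against the regex; Camel's own matching may differ in details such as case sensitivity, so this is a quick check and not a reproduction of Camel's behaviour. The class name IncludePatternCheck and the sample file names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Camel: checks file names against the
// same regex used in the include option above.
class IncludePatternCheck {
    // In Java source "\\." is the regex "\.", i.e. a literal dot.
    static final Pattern TXT = Pattern.compile(".*\\.txt");

    // matches() requires the entire file name to match the pattern.
    static boolean isTxt(String fileName) {
        return TXT.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        System.out.println(isTxt("report.txt"));     // true
        System.out.println(isTxt("report.txt.bak")); // false: does not end in .txt
        System.out.println(isTxt("reporttxt"));      // false: no literal dot
    }
}
```

If the pattern behaves as expected here, the same string can be placed after include= in the endpoint URI.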

Since you're only checking for the file name, you can do something like this for files appearing in a specific folder and choose what to do with them using a predicate:
from("file://fooFileFolder/")
.choice()
.when(header("CamelFileNameOnly").regex("fooPattern")).to("mock:fooHere")
.otherwise().to("mock:fooThere")
.end();
It's up to you to supply a regex that matches the file names you want to test. You can also use regex with Camel's Simple language.

Related

Multiple Pattern Match Algorithm

I have a lot of logs, and every record contains a URL. I also have 2000+ URL patterns to filter the logs with. Some patterns are regular expressions with capturing groups. I want to get the URL, the pattern that matched and, if possible, the captured groups. Is there a Java library that can help me, an algorithm that solves this problem, or anything else related to it? Thanks a lot.
Take a look at the Java regular expressions library.
You can construct a single large pattern by concatenating your original patterns with | between them (wrap each pattern in () so that the alternation applies to the whole pattern and not just to the adjacent character).
The regular expression can be compiled into an efficient matching finite automaton that you can run over your data. Just make sure you compile it once and reuse it for every record.
It will handle extracting groups, but you need to handle the groups in a generic way (since any group can be matched). If it makes things easier, consider using named groups to simplify the handling.
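The approach described above can be sketched roughly as follows. Each original pattern is wrapped in a named group (p0, p1, ...) and joined with |, the combined pattern is compiled once, and the first non-null named group tells you which pattern matched. The class name MultiPatternMatcher, the group-name scheme and the choice to require a full match with matches() are assumptions made for this illustration, not something the answer specifies.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: combines many patterns into one alternation and
// reports which alternative matched.
class MultiPatternMatcher {
    private final Pattern combined;
    private final int patternCount;

    MultiPatternMatcher(List<String> patterns) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < patterns.size(); i++) {
            if (i > 0) {
                sb.append('|');
            }
            // Wrap pattern i in a named group "p<i>" so it can be identified later.
            sb.append("(?<p").append(i).append('>').append(patterns.get(i)).append(')');
        }
        // Compile once; reuse the instance for every log record.
        this.combined = Pattern.compile(sb.toString());
        this.patternCount = patterns.size();
    }

    // Returns the index of the pattern that matched the whole URL, or -1 if none did.
    int matchIndex(String url) {
        Matcher m = combined.matcher(url);
        if (!m.matches()) {
            return -1;
        }
        for (int i = 0; i < patternCount; i++) {
            if (m.group("p" + i) != null) {
                return i;
            }
        }
        return -1;
    }
}
```

Two caveats with this sketch: numbered groups and numeric backreferences inside the original patterns shift once everything is concatenated, and any named groups already present in the patterns must not collide with the p0, p1, ... names. With thousands of patterns the linear scan over group names may also be worth replacing, for example by splitting the patterns into several smaller combined patterns.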

sonarqube xpath rule match multiple file patterns

I'm building custom rules in SonarQube 5.1.2 and I can't find out how to apply a rule to multiple file types.
I've seen that the ant-style file pattern accepts only one pattern, not a list.
Specifically I want my rule to match **/*.wsdl and **/*.WSDL and eventually files with other extensions.
Is there a better way to do this than replicating the rule?
thanks.

Sublime Text - Exclude comments in search

Every time I search for a function inside of hundreds of files, I see so many matches for comments which have no effect in the code.
Is there a way to limit Sublime Text's search scope to real code and exclude comments?
I use Sublime Text 3 for developing a C++ program.
I created a plugin that searches for a given string inside a given scope.
The default scope selector is -comment, which effectively searches outside of comments. The text to search for is taken from the current selection. The results are presented in a drop-down menu.
Basically I combined two API methods:
view.find_all(pattern), which searches for a pattern in the given view.
view.match_selector(position, scope_selector), which checks whether the given position is inside the given scope.
You could use a regex to find only the matches you care about.
Design the regex to fit your case.
You can enter a regex by turning on the 'Regular Expression' flag in the search panel.
Example
You can use this regex if you only need to skip matches in single-line comments.
^(?!\/\/)([^\/\n]*)YOUR_SEARCH_TERM
If you also want to skip matches in multi-line comments, use this.
^(?!(\/\/|(\/\*(.|\n)*([^\*])(?=\/))))YOUR_SEARCH_TERM

Most efficient way of matching file names with Ruby regex

The method Dir.glob is used for retrieving file names that match a certain pattern, but its argument has a Unix-like syntax (e.g., using *, ** as wildcards in a particular way, etc.). Instead, I want to use a Ruby (Onigmo) regex as the matching pattern to do the same thing (using its wildcards, quantifiers, anchors, escaped characters, etc.). What is the best way to do this?
One simple way that comes to mind is to use Dir.glob to get the list of all existing files in all directories and filter them using the regex, but that does not look efficient. Or is it? Is there a better way?
You could try the Find module in Ruby's standard library.
require 'find'
Find.find(path).grep(/regex/)
The find method returns every path that exists within the path you provide as an argument recursively, pretty much like what you mentioned with Dir.glob. You can then use the built-in grep method to filter the results with a regex.
This may not be the most efficient method though, since Dir.glob is written in C while the Find module is written in Ruby. I did a test on my home directory and it took Find a little longer to get the result than Dir.glob, but you can also use the Find module's prune method in order to not descend into particular folders, which could help make things more efficient using Find.

Solr : conserve hyphen word for suggest

I use Solr 3.3 and I need to use the suggest component to build an autocomplete.
I would like to keep hyphenated words intact in the suggestions (for example: "Wi-fi").
With the different field type configurations I have tried, I get the word "wifi" or "wi" instead.
Does anyone know which filter can do this?
Thanks
What does your schema look like (the autocomplete type)?
You could use solr.WhitespaceTokenizerFactory. It doesn't tokenize on extraneous characters like hyphens.
If you want to remove these characters, you need to use solr.PatternReplaceFilterFactory, solr.PatternReplaceCharFilterFactory, or even create your own custom Tokenizer.
