OpenNLP Name Entity Recognizer output - opennlp

I have trained an OpenNLP Name Entity Recognizer. When I use it over some data it gives an output like:
[0..1) location
I would rather output the original name that occurred in the data.

This is the output of a Span object's toString(). A single call to find(String[]) can match several names, which is why find() returns a Span[].
Use this code to get the actual named entities:
//"tokens" here is the String[] of words in your sentence
Span[] find = nf.find(tokens);
//use the Span's static method to get the String[] of names
String[] namedEntities = Span.spansToStrings(find, tokens);
A Span is simply a start index and an (exclusive) end index into your String[] tokens.
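To make the index semantics concrete, here is a minimal plain-Java sketch of what turning one span into text amounts to. It does not use OpenNLP at all: `SpanSketch` and `coveredText` are hypothetical names, and it assumes the end index is exclusive, as the `[0..1)` notation suggests.

```java
import java.util.Arrays;

class SpanSketch {
    // Hypothetical helper: joins tokens[start] .. tokens[end - 1] with single spaces.
    static String coveredText(String[] tokens, int start, int end) {
        return String.join(" ", Arrays.copyOfRange(tokens, start, end));
    }

    public static void main(String[] args) {
        String[] tokens = {"Berlin", "is", "the", "capital", "of", "Germany"};
        // A span printed as [0..1) covers only the first token.
        System.out.println(coveredText(tokens, 0, 1)); // prints "Berlin"
    }
}
```

In real code Span.spansToStrings(find, tokens) does this for every span at once, so the helper is only meant to show what the indices refer to.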


Initialize non existing keys to empty array list

I have a class:
class Con {
    private List<Ind> inds;
}
I am using Gson in the usual way to convert a JSON string to an object of this class. If the JSON does not contain the key inds, the field inds is assigned null. Is there a way to assign an empty ArrayList to inds instead?
My thoughts:
One straightforward way would be to go over all the null fields once Gson has built the object and assign a new ArrayList<>() to each of them. Is there a better approach?
public List<Ind> getInds() {
    return inds;
}
Currently I use the getter above in code like con.getInds().stream(), which throws a NullPointerException.
I am not sure what a good way to resolve this would be. Should I return an Optional instead of a List, or should I modify the getter like this:
public List<Ind> getInds() {
    return inds == null ? new ArrayList<>() : inds;
}
This also avoids the NullPointerException, but I am not sure about the pros and cons of this approach. With it there is no longer any way to tell whether the JSON had a key named inds. The code I am writing now may not need that, but some meaning is certainly lost.
One solution to this would be to assign default values to the fields, for example:
class Con {
    private List<Ind> inds = new ArrayList<>();
}
Gson keeps this default value; it reassigns the field only if the field is present in the JSON data.
There are, however, a few things to keep in mind:
Your class needs a no-args constructor (implicit or explicit); otherwise Gson might create instances without running the class's field initializers and initializer blocks, and the field will therefore be null.
If the field is present in the JSON but has a JSON null value, Gson will still set null as the value.
You cannot tell afterwards whether the field was present in the JSON with an empty JSON array as its value or whether it was missing.
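If that last distinction matters to you, one alternative is to leave the field without a default and normalize in the accessors instead. The following is only a sketch of that idea: hasInds() and setInds() are hypothetical additions, the empty Ind class stands in for the real one, and Gson itself is not involved (it would fill the field reflectively).

```java
import java.util.Collections;
import java.util.List;

class Ind {
}

class Con {
    // Deliberately no default value: null keeps meaning "key absent or JSON null".
    private List<Ind> inds;

    // Null-safe view for callers that only want to iterate or stream.
    List<Ind> getInds() {
        return inds == null ? Collections.emptyList() : inds;
    }

    // Hypothetical helper for callers that need to know whether a value was set.
    boolean hasInds() {
        return inds != null;
    }

    // Only here so the sketch can be exercised without Gson.
    void setInds(List<Ind> inds) {
        this.inds = inds;
    }
}
```

With this, con.getInds().stream() never throws, while con.hasInds() still tells "no value" apart from "empty list". Note that the list returned for a null field is immutable, so callers must not add to it.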

Spring AOP get method parameter value based on parameter name

Is it possible to get a method parameter's value based on the parameter's name in Spring AOP?
MethodSignature signature = (MethodSignature) proceedingJoinPoint.getSignature();
Method method = signature.getMethod();
method.getParameters()[0].getName()
// it is possible to get the parameter names this way
This approach gives the parameter names, not the values.
proceedingJoinPoint.getArgs()
returns the values, not the names.
So is it possible to get a value based on a parameter name?
I searched everywhere and found no function that returns a parameter value by name, so I wrote a simple method that does this:
public Object getParameterByName(ProceedingJoinPoint proceedingJoinPoint, String parameterName) {
    MethodSignature methodSig = (MethodSignature) proceedingJoinPoint.getSignature();
    Object[] args = proceedingJoinPoint.getArgs();
    String[] parametersName = methodSig.getParameterNames();
    // indexOf returns -1 when no parameter has the given name
    int idx = Arrays.asList(parametersName).indexOf(parameterName);
    if (idx >= 0 && idx < args.length) { // parameter exists
        return args[idx];
    } // otherwise no parameter with the given name exists
    return null;
}
I searched for the same thing when I had to use AOP to log function arguments and their values, but there seems to be no direct way to get a value based on an argument name.
What I noticed, however, is that the parameter names obtained from method.getParameters() and the values returned by proceedingJoinPoint.getArgs() were always in sync, i.e., for the function
public void foo(String a, String b)
called as
foo("hello", "world");
the names obtained from method.getParameters() were ["a", "b"] and proceedingJoinPoint.getArgs() returned ["hello", "world"], in the same order. So you can iterate over the arrays by index: for each index i, the i-th argument name corresponds to the i-th argument value.
I couldn't find supporting documentation for this behavior, but this code has been running on production servers for about a year and has never produced an incorrect result. I'd be glad if someone could link to documentation of this behavior. You could also dig into the reflection code to verify it.
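The index-by-index pairing described above can be sketched in plain Java as follows. `ArgumentsByName` and `zip` are hypothetical names; inside an aspect you would pass in the name array and proceedingJoinPoint.getArgs(), and the sketch simply assumes both arrays are in declaration order, as observed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ArgumentsByName {
    // Pairs names[i] with values[i]; insertion order is kept for readable logging.
    static Map<String, Object> zip(String[] names, Object[] values) {
        Map<String, Object> byName = new LinkedHashMap<>();
        if (names == null || values == null) {
            return byName; // defensive: treat missing names as "nothing to map"
        }
        int n = Math.min(names.length, values.length);
        for (int i = 0; i < n; i++) {
            byName.put(names[i], values[i]);
        }
        return byName;
    }

    public static void main(String[] args) {
        Map<String, Object> byName =
                zip(new String[] {"a", "b"}, new Object[] {"hello", "world"});
        System.out.println(byName.get("b")); // prints "world"
    }
}
```

Building the map once per invocation is convenient when you log several arguments; for a single lookup the indexOf approach is enough.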

Trie with association between words

I have a use case where I want to correct words.
I have a set of correct and incorrect (misspelled) words.
I populate the trie with all words.
I have both the correct and the incorrect version of every word.
Now, if I get the word "a" for correction,
I search for it in the trie. If the trie has this word, I want to associate it with its correct version.
Solution:
I can store the correct version ("a1") at the last node of the incorrect word in the trie and resolve "a" to "a1".
But then I have to store the correct version of each word at the last node, which increases the memory footprint,
since all words (correct and incorrect) are already loaded into the trie.
Is there any way to associate the correct and incorrect words without storing the entire word again as the value in the last node?
Any pointers?
public class TrieNode<T> {
    private Map<Character, TrieNode<T>> childs;
    private boolean complete;
    private T value;
    ....
}
You could use a single dictionary for this. In C#, that would be:
Dictionary<string, string> MisspellingsLookup = new Dictionary<string, string>();
The key is the misspelling, and the value is the correct spelling.
Now, some words are commonly misspelled in multiple ways. For example, "occasion" is often misspelled as "ocassion" or "occassion". If you want to reduce the memory used by the multiple misspellings, you can use a temporary dictionary during construction. Whenever you add a misspelling, you look up the correct spelling in the good words dictionary, and if it's already there, you use that value. That way you store a reference to an existing word rather than creating a new string. Here's an example:
Dictionary<string, string> GoodWords = new Dictionary<string, string>();
Dictionary<string, string> Misspellings = new Dictionary<string, string>();
void AddMisspelling(string misspelled, string correct)
{
    string goodWord;
    if (!GoodWords.TryGetValue(correct, out goodWord))
    {
        goodWord = correct;
        GoodWords.Add(correct, correct);
    }
    // Always use goodWord here, so you're not creating duplicate strings.
    Misspellings.Add(misspelled, goodWord);
}
When you're done adding words, you can clear the GoodWords dictionary to save space:
GoodWords = null;
I recommend a dictionary here because it'll almost certainly use less memory, and a lookup is a single hash probe (O(1) on average once the key is hashed) rather than one node traversal per character.
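Since the trie in the question is written in Java, here is a rough Java sketch of the same idea. The class and method names (`MisspellingIndex`, `addMisspelling`, `finishLoading`, `correct`) are made up for illustration; only java.util.HashMap is used.

```java
import java.util.HashMap;
import java.util.Map;

class MisspellingIndex {
    // Temporary map used only while loading, to reuse one String per correct word.
    private Map<String, String> goodWords = new HashMap<>();
    private final Map<String, String> misspellings = new HashMap<>();

    void addMisspelling(String misspelled, String correct) {
        // putIfAbsent returns the instance stored earlier, or null on first sight.
        String canonical = goodWords.putIfAbsent(correct, correct);
        if (canonical == null) {
            canonical = correct;
        }
        misspellings.put(misspelled, canonical);
    }

    // Call once after loading; the canonical strings stay alive via misspellings.
    void finishLoading() {
        goodWords = null;
    }

    // Returns the correct spelling, or null if the word is not a known misspelling.
    String correct(String word) {
        return misspellings.get(word);
    }
}
```

Every misspelling of the same word then points at one shared String instance, so each extra misspelling costs a map entry rather than another copy of the correct word.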

Spring Batch Passing list of values as a parameter

I want to pass a list of IDs as one of the parameters to a Spring Batch job. Is this possible?
Thanks in advance.
What you are trying to do is not possible.
From the JobParameter doc:
Domain representation of a parameter to a batch job. Only the
following types can be parameters: String, Long, Date, and Double. The
identifying flag is used to indicate if the parameter is to be used as
part of the identification of a job instance.
You might be tempted to write your list of IDs to a comma-delimited string and pass that as a single parameter, but beware that when stored in the DB it has a length of at most 250 bytes. You'll either have to increase that limit or use another way.
Perhaps you can explain why you need to pass that list of IDs.
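If you do go the comma-delimited route, it may be worth failing early instead of discovering the length problem at the database. Below is a minimal sketch with plain JDK classes only; `IdListParameter` and `encode` are hypothetical names, and 250 is simply the limit mentioned above, so adjust it if your schema differs.

```java
import java.util.List;
import java.util.stream.Collectors;

class IdListParameter {
    // Assumption: the limit quoted above. Digits and commas are one byte each
    // in common encodings, so the character count equals the byte count here.
    static final int MAX_LENGTH = 250;

    // Joins the IDs with commas and rejects results that would not fit.
    static String encode(List<Long> ids) {
        String joined = ids.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
        if (joined.length() > MAX_LENGTH) {
            throw new IllegalArgumentException(
                    "ID list needs " + joined.length() + " characters, limit is " + MAX_LENGTH);
        }
        return joined;
    }
}
```

The resulting string can then be passed as an ordinary String job parameter and split again on the receiving side.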
If you want to pass the list from ItemReader, then you have to get JobParameters first (you have to declare your reader to be step scoped for that, see this thread also).
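You will have to put your list into the JobParameters as a parameter. As JobParameters is immutable, you will have to create a new object:
List<String> yourList = ....
JobParameters params = (JobParameters) fac.getBean("params");
// written against the Spring Batch 4.x API: map values must be JobParameter objects,
// and a JobParameter cannot hold a List, so the list is joined into one String
Map<String, JobParameter> map = params.getParameters();
map.put("yourList", new JobParameter(String.join(",", yourList)));
params = new JobParameters(map);
launcher.run(job, params);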
You cannot use a List<T> itself as a job parameter in Spring Batch, but I think you can achieve what you intend (listOf(a, b, c, d..)) in the following way.
The job parameter itself receives a comma-separated string of items.
@Nonnull
private List<String> someList = Collections.emptyList();
@Value("#{jobParameters['someList']}")
public void setTableNames(@Nullable final String someList) {
    // keep the empty default when the parameter is missing or blank
    if (StringUtils.isNotBlank(someList)) {
        this.someList = Arrays.stream(StringUtils.split(someList, ","))
                .map(String::trim)
                .filter(StringUtils::isNotBlank)
                .collect(Collectors.toList());
    }
}
Hope it was helpful for using list-type parameters in spring-batch!
Thanks.

Accessing public static final field using JoSQL

I've been using JoSQL for quite a few months now and today I came across a problem I am not sure how to solve. I probably could solve it by binding variables/placeholders, but I'd like to include the fields in the query.
SELECT * FROM ...MyObject WHERE getType != com.mypackage.myclass.TYPE_A
This is the query that I have. TYPE_A is a public static final int attribute in "myclass" class. Accessing methods (such as getType) is easy, because getType is expected to be a method from MyObject - just that I do not write round brackets after it (this is how JoSQL works as far as I know).
Does anyone happen to have an idea how to access a public static final field?
JoSQL uses gentlyweb-utils; it seems to be some sort of Accessor/Getter/Setter framework. I'd love to access that attribute without having to bind variables, but I haven't been able to do so.
Thanks for your help in advance! I really appreciate it.
I think I have figured something out. First, it does not seem possible to access static fields directly, for whatever reason. I used the following approach to solve my issue:
create a method that takes a given JoSQL statement
mark the constants you want to replace with, say, "{?FULL_PACKAGE_AND$CONSTANT}"
use reflection to determine the class and the field named by each marker, and read the field's value
iteratively replace the markers until no "{?" is left in the statement
Example:
JoSQL-statement looks like this:
(isWeapon = TRUE AND getItem.getType2 = {?com.l2jserver.gameserver.model.items.L2Item$TYPE2_WEAPON})
Method using the query-object:
final Query query = DataLayer.createJoSqlQuery(joSql);
Method (pre)processing the JoSQL-statement:
final Query query = new Query();
int variableColumn = 0;
while (joSql.indexOf("{?") > -1) {
    variableColumn++;
    final int startIndex = joSql.indexOf("{?");
    final int endIndex = joSql.indexOf("}", startIndex);
    final String value = joSql.substring(startIndex + 2, endIndex);
    try {
        final Object variableValue = Class.forName(value.split("\\$")[0]).getField(value.split("\\$")[1]).get(null);
        query.setVariable(variableColumn, variableValue);
        joSql = joSql.replace("{?" + value + "}", "?");
    }
    catch (...) {
        e.printStackTrace();
        // stop here: the unresolved marker is still in joSql, so looping on would never terminate
        break;
    }
}
query.parse(joSql);
return query;
The preprocessing method basically iterates through the given JoSQL statement and checks whether it contains the string "{?". If it does, it extracts the text between "{?" and "}" (note the dollar sign right in front of the constant name).
It then looks up the referenced value and sets it on the Query with setVariable, similar to the setObject method of prepared statements. Finally, it replaces the marker in the JoSQL statement with a question mark ("?"), so the newly created Query object, which is later used to retrieve information, has a bound value for each one.
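The marker-resolution step on its own can be sketched without JoSQL, which makes it easier to test in isolation. The sketch below is not the code above: `PlaceholderResolver`, `Result` and `resolve` are hypothetical names, it scans the statement once from left to right, and it throws on an unresolvable marker instead of printing the stack trace.

```java
import java.util.ArrayList;
import java.util.List;

class PlaceholderResolver {
    // Rewritten statement plus the values to bind, in order of appearance.
    static final class Result {
        final String statement;
        final List<Object> values;

        Result(String statement, List<Object> values) {
            this.statement = statement;
            this.values = values;
        }
    }

    // Replaces each {?fully.qualified.Class$FIELD} marker with "?" and collects
    // the value of the named public static field via reflection.
    static Result resolve(String statement) {
        StringBuilder out = new StringBuilder();
        List<Object> values = new ArrayList<>();
        int pos = 0;
        int start;
        while ((start = statement.indexOf("{?", pos)) >= 0) {
            int end = statement.indexOf('}', start);
            if (end < 0) {
                throw new IllegalArgumentException("Unclosed marker at index " + start);
            }
            String name = statement.substring(start + 2, end);
            int dollar = name.lastIndexOf('$');
            if (dollar < 0) {
                throw new IllegalArgumentException("Expected Class$FIELD but got " + name);
            }
            try {
                // get(null) works because the field is expected to be static
                values.add(Class.forName(name.substring(0, dollar))
                        .getField(name.substring(dollar + 1))
                        .get(null));
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot resolve " + name, e);
            }
            out.append(statement, pos, start).append('?');
            pos = end + 1;
        }
        out.append(statement.substring(pos));
        return new Result(out.toString(), values);
    }
}
```

The collected values would then be bound in order, for example with query.setVariable(i + 1, values.get(i)) as in the code above, before calling query.parse on the rewritten statement. Because each marker is handled where it occurs, a constant that is used twice also gets two bound values.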
