How to make the custom RelationshipExtractor understand custom Entities? - stanford-nlp

I created a new NER model for my project in which I identified additional words and BusinessTerms. I merged the Stanford-provided NER with my own NER model and then obtained the NER tags for the set of sentences. I then converted these sentences into a form that would be understood by the RelationshipExtraction classifier, as shown below:
5 O 0 O DT The O O O
5 O 1 O NNP Balanced O O O
5 BusinessTerm 2 O NN option O O O
5 O 3 O VBZ s O O O
5 BusinessTerm 4 O NN return O O O
5 O 5 O IN of O O O
5 PERCENT 6 O CD 13.88 O O O
5 PERCENT 7 O NN % O O O
5 O 8 O IN for O O O
5 O 9 O DT the O O O
5 O 10 O NN year O O O
5 O 11 O TO to O O O
5 O 12 O CD 30 O O O
5 DATE 13 O NNP June O O O
5 DATE 14 O CD 2014 O O O
5 O 15 O VBD was O O O
5 O 16 O RB well O O O
5 O 17 O RB ahead O O O
5 O 18 O IN of O O O
5 O 19 O DT the O O O
5 O 20 O JJ median O O O
5 BusinessTerm 21 O NN fund O O O
5 O 22 O , , O O O
5 O 23 O WDT which O O O
5 O 24 O VBZ is O O O
5 O 25 O JJ great O O O
5 O 26 O NN news O O O
5 O 27 O . . O O O
Now when I run this as input to create a relationship extraction classifier, it throws an exception saying
"Cannot normalize ner tag BusinessTerm."
I think I am missing some step of the process.
How do I make the RelationshipExtraction classifier understand these terms that were created by my custom NER classifier (during the RelationshipExtraction classification process)?

To fix this problem, you need to adjust the method getNormalizedNERTag inside the class edu.stanford.nlp.ie.machinereading.domains.roth.RothCONLL04Reader.
For example, I needed to add 3 new classes (Concept, Entity and Paper), so I extended this method as follows:
private static String getNormalizedNERTag(String ner) {
  if (ner.equalsIgnoreCase("O"))
    return "O";
  else if (ner.equalsIgnoreCase("Peop"))
    return "PERSON";
  else if (ner.equalsIgnoreCase("Loc"))
    return "LOCATION";
  else if (ner.equalsIgnoreCase("Org"))
    return "ORGANIZATION";
  else if (ner.equalsIgnoreCase("Other"))
    return "OTHER";
  else if (ner.equalsIgnoreCase("Concept"))
    return "CONCEPT";
  else if (ner.equalsIgnoreCase("Entity"))
    return "ENTITY";
  else if (ner.equalsIgnoreCase("Paper"))
    return "PAPER";
  throw new RuntimeException("Cannot normalize ner tag " + ner);
}
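In the asker's case, the same pattern extends to the custom tag; a sketch of the extra branch (the normalized name BUSINESSTERM is my own choice here, any consistent label works):
  else if (ner.equalsIgnoreCase("BusinessTerm"))
    return "BUSINESSTERM";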
Once you do so, you need to recompile CoreNLP from source. To do so, install ant, then rename the directory containing the source code to src. Go into the CoreNLP folder (where the build.xml file and the src dir are) and run ant.
Further instructions can be found here: https://github.com/stanfordnlp/CoreNLP/wiki/Compilation-Instructions
After this, just pack the classes into a jar, and run your compiled version with your own classes.
I am not sure why these classes are hard-coded! If anyone has any hints, please advise.
Thanks!

For those of you who do not want to recompile the default CoreNLP from source: you can simply create your own reader class and adapt it from there. A sample follows:
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.logging.Level;
import java.util.logging.Logger;
import edu.stanford.nlp.ie.machinereading.GenericDataSetReader;
import edu.stanford.nlp.ie.machinereading.structure.AnnotationUtils;
import edu.stanford.nlp.ie.machinereading.structure.EntityMention;
import edu.stanford.nlp.ie.machinereading.structure.ExtractionObject;
import edu.stanford.nlp.ie.machinereading.structure.MachineReadingAnnotations;
import edu.stanford.nlp.ie.machinereading.structure.RelationMention;
import edu.stanford.nlp.ie.machinereading.structure.Span;
import edu.stanford.nlp.io.IOUtils;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.StringUtils;
/**
* A Reader designed for the relation extraction data studied in Dan Roth and Wen-tau Yih,
* A Linear Programming Formulation for Global Inference in Natural Language Tasks. CoNLL 2004.
* The format is a somewhat ad-hoc tab-separated value file format.
*
* @author Mihai, David McClosky, and agusev
* @author Sonal Gupta (sonalg@stanford.edu)
*/
public class TestReader extends GenericDataSetReader {
public TestReader() {
super(null, true, true, true);
// change the logger to one from our namespace
logger = Logger.getLogger(TestReader.class.getName());
// run quietly by default
logger.setLevel(Level.SEVERE);
}
@Override
public Annotation read(String path) throws IOException {
Annotation doc = new Annotation("");
logger.info("Reading file: " + path);
// Each iteration through this loop processes a single sentence along with any relations in it
for (Iterator<String> lineIterator = IOUtils.readLines(path).iterator(); lineIterator.hasNext(); ) {
Annotation sentence = readSentence(path, lineIterator);
AnnotationUtils.addSentence(doc, sentence);
}
return doc;
}
private boolean warnedNER; // = false;
private String getNormalizedNERTag(String ner) {
if (ner.equalsIgnoreCase("O")) {
return "O";
} else if (ner.equalsIgnoreCase("Peop")) {
return "PERSON";
} else if (ner.equalsIgnoreCase("LOCATION")) {
return "LOCATION";
} else if(ner.equalsIgnoreCase("Organization")) {
return "ORGANIZATION";
} else if(ner.equalsIgnoreCase("Other")) {
return "OTHER";
} else {
// Unknown tag (e.g. a custom one such as BusinessTerm): warn once and pass it
// through unchanged instead of throwing, so custom entity types survive.
if ( ! warnedNER) {
warnedNER = true;
logger.warning("This file contains NER tags not in the original Roth/Yih dataset, e.g.: " + ner);
}
return ner;
}
}
private Annotation readSentence(String docId, Iterator<String> lineIterator) {
Annotation sentence = new Annotation("");
sentence.set(CoreAnnotations.DocIDAnnotation.class, docId);
sentence.set(MachineReadingAnnotations.EntityMentionsAnnotation.class, new ArrayList<>());
// we'll need to set things like the tokens and textContent after we've
// fully read the sentence
// contains the full text that we've read so far
StringBuilder textContent = new StringBuilder();
int tokenCount = 0; // how many tokens we've seen so far
List<CoreLabel> tokens = new ArrayList<>();
// when we've seen two blank lines in a row, this sentence is over (one
// blank line separates the sentence and the relations)
int numBlankLinesSeen = 0;
String sentenceID = null;
// keeps tracks of entities we've seen so far for use by relations
Map<String, EntityMention> indexToEntityMention = new HashMap<>();
while (lineIterator.hasNext() && numBlankLinesSeen < 2) {
String currentLine = lineIterator.next();
currentLine = currentLine.replace("COMMA", ",");
List<String> pieces = StringUtils.split(currentLine);
String identifier;
int size = pieces.size();
switch (size) {
case 1: // blank line between sentences or relations
numBlankLinesSeen++;
break;
case 3: // relation
String type = pieces.get(2);
List<ExtractionObject> args = new ArrayList<>();
EntityMention entity1 = indexToEntityMention.get(pieces.get(0));
EntityMention entity2 = indexToEntityMention.get(pieces.get(1));
args.add(entity1);
args.add(entity2);
Span span = new Span(entity1.getExtentTokenStart(), entity2
.getExtentTokenEnd());
// identifier = "relation" + sentenceID + "-" + sentence.getAllRelations().size();
identifier = RelationMention.makeUniqueId();
RelationMention relationMention = new RelationMention(identifier,
sentence, span, type, null, args);
AnnotationUtils.addRelationMention(sentence, relationMention);
break;
case 9: // token
/*
* Roth token lines look like this:
*
* 19 Peop 9 O NNP/NNP Jamal/Ghosheh O O O
*/
// Entities may be multiple words joined by '/'; we split these up
List<String> words = StringUtils.split(pieces.get(5), "/");
//List<String> postags = StringUtils.split(pieces.get(4),"/");
String text = StringUtils.join(words, " ");
identifier = "entity" + pieces.get(0) + '-' + pieces.get(2);
String nerTag = getNormalizedNERTag(pieces.get(1)); // entity type of the word/expression
if (sentenceID == null)
sentenceID = pieces.get(0);
if (!nerTag.equals("O")) {
Span extentSpan = new Span(tokenCount, tokenCount + words.size());
// Temporarily sets the head span to equal the extent span.
// This is so the entity has a head (in particular, getValue() works) even if preprocessSentences isn't called.
// The head span is later modified if preprocessSentences is called.
EntityMention entity = new EntityMention(identifier, sentence,
extentSpan, extentSpan, nerTag, null, null);
AnnotationUtils.addEntityMention(sentence, entity);
// we can get by using these indices as strings since we only use them
// as a hash key
String index = pieces.get(2);
indexToEntityMention.put(index, entity);
}
// int i =0;
for (String word : words) {
CoreLabel label = new CoreLabel();
label.setWord(word);
//label.setTag(postags.get(i));
label.set(CoreAnnotations.TextAnnotation.class, word);
label.set(CoreAnnotations.ValueAnnotation.class, word);
// we don't set TokenBeginAnnotation or TokenEndAnnotation since we're
// not keeping track of character offsets
tokens.add(label);
// i++;
}
textContent.append(text);
textContent.append(' ');
tokenCount += words.size();
break;
}
}
sentence.set(CoreAnnotations.TextAnnotation.class, textContent.toString());
sentence.set(CoreAnnotations.ValueAnnotation.class, textContent.toString());
sentence.set(CoreAnnotations.TokensAnnotation.class, tokens);
sentence.set(CoreAnnotations.SentenceIDAnnotation.class, sentenceID);
return sentence;
}
/*
* Gets the index of an object in a list using == to test (List.indexOf uses
* equals() which could be problematic here)
*/
private static <X> int getIndexByObjectEquality(List<X> list, X obj) {
for (int i = 0, sz = list.size(); i < sz; i++) {
if (list.get(i) == obj) {
return i;
}
}
return -1;
}
/*
* Sets the head word and the index for an entity, given the parse tree for
* the sentence containing the entity.
*
* This code is no longer used, but I've kept it around (at least for now) as
* reference when we modify preProcessSentences().
*/
@SuppressWarnings("unused")
private void setHeadWord(EntityMention entity, Tree tree) {
List<Tree> leaves = tree.getLeaves();
Tree argRoot = tree.joinNode(leaves.get(entity.getExtentTokenStart()),
leaves.get(entity.getExtentTokenEnd()));
Tree headWordNode = argRoot.headTerminal(headFinder);
int headWordIndex = getIndexByObjectEquality(leaves, headWordNode);
if (StringUtils.isPunct(leaves.get(entity.getExtentTokenEnd()).label().value().trim())
&& (headWordIndex >= entity.getExtentTokenEnd()
|| headWordIndex < entity.getExtentTokenStart())) {
argRoot = tree.joinNode(leaves.get(entity.getExtentTokenStart()), leaves
.get(entity.getExtentTokenEnd() - 1));
headWordNode = argRoot.headTerminal(headFinder);
headWordIndex = getIndexByObjectEquality(leaves, headWordNode);
if (headWordIndex >= entity.getExtentTokenStart()
&& headWordIndex <= entity.getExtentTokenEnd() - 1) {
entity.setHeadTokenPosition(headWordIndex);
entity.setHeadTokenSpan(new Span(headWordIndex, headWordIndex + 1));
}
}
if (headWordIndex >= entity.getExtentTokenStart()
&& headWordIndex <= entity.getExtentTokenEnd()) {
entity.setHeadTokenPosition(headWordIndex);
entity.setHeadTokenSpan(new Span(headWordIndex, headWordIndex + 1));
} else {
// Re-parse the argument words by themselves
// Get the list of words in the arg by looking at the leaves between
// arg.getExtentTokenStart() and arg.getExtentTokenEnd() inclusive
List<String> argWords = new ArrayList<>();
for (int i = entity.getExtentTokenStart(); i <= entity.getExtentTokenEnd(); i++) {
argWords.add(leaves.get(i).label().value());
}
if (StringUtils.isPunct(argWords.get(argWords.size() - 1))) {
argWords.remove(argWords.size() - 1);
}
Tree argTree = parseStrings(argWords);
headWordNode = argTree.headTerminal(headFinder);
headWordIndex = getIndexByObjectEquality(argTree.getLeaves(),
headWordNode)
+ entity.getExtentTokenStart();
entity.setHeadTokenPosition(headWordIndex);
entity.setHeadTokenSpan(new Span(headWordIndex, headWordIndex + 1));
}
}
public static void main(String[] args) throws Exception {
// just a simple test, to make sure stuff works
Properties props = StringUtils.argsToProperties(args);
TestReader reader = new TestReader();
reader.setLoggerLevel(Level.INFO);
reader.setProcessor(new StanfordCoreNLP(props));
Annotation doc = reader.parse("/u/nlp/data/RothCONLL04/conll04.corp");
System.out.println(AnnotationUtils.datasetToString(doc));
}
}
Just remember to change the datasetReaderClass property in your properties file (roth.properties):
datasetReaderClass = test.TestReader

Related

Algorithm to get all the possible combinations of operations from a given numbers

I want to write a function that given a set of numbers, for example:
2, 3
It returns all the combinations of operations with +, -, *, and /.
The result for these two numbers would be:
2+3
2-3
2*3
2/3
For the numbers:
2, 3, 4
it would be:
(2+3)+4
(2+3)-4
(2+3)*4
(2+3)/4
(2-3)+4
(2-3)-4
(2-3)*4
(2-3)/4
...
2+(3+4)
2+(3*4)
2+(3-4)
2+(3/4)
...
3+(2+4)
3+(2*4)
3+(2-4)
3+(2/4)
...
and so on
The order of the operators doesn't matter; the point is to obtain all the results from all the possible combinations of operations.
I would tackle this by using Reverse Polish Notation, where you can just append operators and operands to a string while observing a few simple rules.
For example, the expression 2 + (3 * 4) would be 2 3 4 * + in Reverse Polish Notation. On the other hand, (2 + 3) * 4 would be 2 3 + 4 *.
If we already have a partial expression, we can either add an operand or an operator.
Adding an operand can always be done and will increase the size of the stack by 1. On the other hand, adding an operator decreases the size of the stack by 1 (remove the two top-most operands and add the result) and can therefore only be done if the stack has at least two entries. At the end, to form a valid expression, the stack size has to be exactly 1.
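To make that stack-size rule concrete, a small Java check (an illustrative helper, not part of the original answer) can track the stack size and confirm that a postfix string over single-character operands is well formed:
static boolean isValidRpn(String expr) {
    // an operand pushes one entry; a binary operator pops two and pushes one
    int stackSize = 0;
    for (char c : expr.toCharArray()) {
        if (c == '+' || c == '-' || c == '*' || c == '/') {
            if (stackSize < 2) return false; // an operator needs two operands on the stack
            stackSize -= 1;
        } else {
            stackSize += 1;
        }
    }
    return stackSize == 1; // a complete expression leaves exactly one result
}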
This motivates a recursive function with the following interface:
getSubexpressions(remainingOperands, currentStackSize)
The function returns a list of subexpressions that can be appended to a partial expression with stack size currentStackSize and using the operands remainingOperands.
The base case of this recursive function is when there are no more remaining operands and the stack size is 1:
if remainingOperands = ∅ and currentStackSize = 1
return { "" }
In this case, we can only add the empty string to the expression.
In all other cases, we need to gather a set of subexpressions
subexpressions = { } // initialize an empty set
If we can add an operator, we can simply append it:
if currentStackSize >= 2
for each possible operator o
subexpressions.add(o + getSubexpressions(remainingOperands, currentStackSize - 1))
The notation o + getSubexpressions(remainingOperands, currentStackSize - 1) is shorthand for concatenating the operator o with each subexpression returned from the call to getSubexpressions().
We are almost there. The last remaining bit is to add potential operands:
for each o in remainingOperands
subexpressions.add(o + getSubexpressions(remainingOperands \ { o }, currentStackSize + 1))
The notation remainingOperands \ { o } stands for set difference, i.e., the set of remaining operands without o.
That's it. In full:
getSubexpressions(remainingOperands, currentStackSize)
if remainingOperands = ∅ and currentStackSize = 1
return { "" }
subexpressions = { } // initialize an empty set
if currentStackSize >= 2
for each possible operator o
subexpressions.add(o + getSubexpressions(remainingOperands, currentStackSize - 1))
for each o in remainingOperands
subexpressions.add(o + getSubexpressions(remainingOperands \ { o }, currentStackSize + 1))
return subexpressions
This recursive call will usually have overlapping subcalls. Therefore, you can use memoization to cache intermediate results instead of re-calculating them over and over.
Here is a proof-of-concept implementation without memoization in C#. Especially the operand management could be designed more efficiently with more appropriate data structures:
static void Main(string[] args)
{
foreach (var expr in GetSubexpressions(new List<string> { "1", "2", "3" }, 0, new StringBuilder()))
{
Console.WriteLine(expr);
}
}
static char[] operators = { '+', '-', '*', '/' };
static IEnumerable<StringBuilder> GetSubexpressions(IList<string> remainingOperands, int currentStackSize, StringBuilder sb)
{
if (remainingOperands.Count() == 0 && currentStackSize == 1)
{
yield return sb;
yield break;
}
if(currentStackSize >= 2)
{
foreach (var o in operators)
{
sb.Append(o);
foreach (var expr in GetSubexpressions(remainingOperands, currentStackSize - 1, sb))
yield return expr;
sb.Remove(sb.Length - 1, 1);
}
}
for (int i = 0; i < remainingOperands.Count; ++i)
{
var operand = remainingOperands[i];
remainingOperands.RemoveAt(i);
sb.Append(operand);
foreach (var expr in GetSubexpressions(remainingOperands, currentStackSize + 1, sb))
yield return expr;
sb.Remove(sb.Length - operand.Length, operand.Length);
remainingOperands.Insert(i, operand);
}
}
The program prints the following output:
12+3+
12-3+
12*3+
12/3+
12+3-
12-3-
12*3-
12/3-
12+3*
12-3*
12*3*
12/3*
12+3/
12-3/
12*3/
12/3/
123++
123-+
123*+
123/+
123+-
123--
123*-
123/-
123+*
123-*
123**
123/*
123+/
123-/
123*/
123//
13+2+
13-2+
13*2+
13/2+
13+2-
13-2-
13*2-
13/2-
13+2*
13-2*
13*2*
13/2*
13+2/
13-2/
13*2/
13/2/
132++
132-+
132*+
132/+
132+-
132--
132*-
132/-
132+*
132-*
132**
132/*
132+/
132-/
132*/
132//
21+3+
21-3+
21*3+
21/3+
21+3-
21-3-
21*3-
21/3-
21+3*
21-3*
21*3*
21/3*
21+3/
21-3/
21*3/
21/3/
213++
213-+
213*+
213/+
213+-
213--
213*-
213/-
213+*
213-*
213**
213/*
213+/
213-/
213*/
213//
23+1+
23-1+
23*1+
23/1+
23+1-
23-1-
23*1-
23/1-
23+1*
23-1*
23*1*
23/1*
23+1/
23-1/
23*1/
23/1/
231++
231-+
231*+
231/+
231+-
231--
231*-
231/-
231+*
231-*
231**
231/*
231+/
231-/
231*/
231//
31+2+
31-2+
31*2+
31/2+
31+2-
31-2-
31*2-
31/2-
31+2*
31-2*
31*2*
31/2*
31+2/
31-2/
31*2/
31/2/
312++
312-+
312*+
312/+
312+-
312--
312*-
312/-
312+*
312-*
312**
312/*
312+/
312-/
312*/
312//
32+1+
32-1+
32*1+
32/1+
32+1-
32-1-
32*1-
32/1-
32+1*
32-1*
32*1*
32/1*
32+1/
32-1/
32*1/
32/1/
321++
321-+
321*+
321/+
321+-
321--
321*-
321/-
321+*
321-*
321**
321/*
321+/
321-/
321*/
321//
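Since the stated goal is also to obtain the results of all combinations, a small evaluator for these postfix strings can be bolted on. A minimal Java sketch (an illustrative helper, not from the original answer; it assumes single-digit operands as in the output above and uses double so that divisions such as 2/3 keep their fractional part, with division by zero yielding Infinity or NaN):
import java.util.ArrayDeque;
import java.util.Deque;

class RpnEval {
    static double evalRpn(String expr) {
        Deque<Double> stack = new ArrayDeque<>();
        for (char c : expr.toCharArray()) {
            if (Character.isDigit(c)) {
                stack.push((double) (c - '0')); // single-digit operand
            } else {
                double right = stack.pop();     // top of the stack is the right operand
                double left = stack.pop();
                switch (c) {
                    case '+': stack.push(left + right); break;
                    case '-': stack.push(left - right); break;
                    case '*': stack.push(left * right); break;
                    case '/': stack.push(left / right); break;
                    default: throw new IllegalArgumentException("Unknown operator: " + c);
                }
            }
        }
        return stack.pop(); // a valid expression leaves exactly one value
    }

    public static void main(String[] args) {
        System.out.println(evalRpn("12+3*")); // (1+2)*3 = 9.0
        System.out.println(evalRpn("23/1+")); // 2/3+1 = 1.666...
    }
}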

Android Kotlin replace while with for next loop

We have a HashMap<Integer, String> and in Java we would iterate over the HashMap and display 3 key/value pairs at a time with the click of a button. Java code below:
hm.put(1, "1");
hm.put(2, "Dwight");
hm.put(3, "Lakeside");
hm.put(4, "2");
hm.put(5, "Billy");
hm.put(6, "Georgia");
hm.put(7, "3");
hm.put(8, "Sam");
hm.put(9, "Canton");
hm.put(10, "4");
hm.put(11, "Linda");
hm.put(12, "North Canton");
hm.put(13, "5");
hm.put(14, "Lisa");
hm.put(15, "Phoenix");
onNEXT(null);
public void onNEXT(View view){
etCity.setText("");
etName.setText("");
etID.setText("");
X = X + 3;
for(int L = 1; L <= X; L++ ){
String id = hm.get(L);
String name = hm.get(L = L + 1);
String city = hm.get(L = L + 1);
etID.setText(id);
etName.setText(name);
etCity.setText(city);
}
if(X == hm.size()){
X = 0;
}
}
We decided to let Android Studio convert the above Java code to Kotlin.
The converter decided to change the for(int L = 1; L <= X; L++) loop to a while loop, which seemed OK at first; then we realized the while loop was running for 3 loops with each button click. Also, Kotlin complained a lot about these lines of code: String name = hm.get(L = L + 1); String city = hm.get(L = L + 1);
We will post the Kotlin code below and ask the questions.
fun onNEXT(view: View?) {
etCity.setText("")
etName.setText("")
etID.setText("")
X = X + 3
var L = 0
while (L <= X) {
val id = hm[L - 2]
val name = hm.get(L - 1)
val city = hm.get(L)
etID.setText(id)
etName.setText(name)
etCity.setText(city)
L++
}
if (X == hm.size) {
X = 0
}
}
We tried to write a for loop like this: for (L in 15 downTo 0 step 1)
It seems you cannot count upTo, so we thought we would use hm.size for the value 15 and just use downTo.
So the questions are:
How do we use the Kotlin for loop syntax and include hm.size in the construct?
We have L declared as an integer, but Kotlin will not let us use
L = L + 1 in the while loop nor in the for loop. WHY?
HERE is the strange part: notice we can increment X by using X = X + 3.
YES, X was declared above as internal var X = 0, as was L in the same way.
Okay, I'll bite.
The following code will print your triples:
val hm = HashMap<Int, String>()
hm[1] = "1"
hm[2] = "Dwight"
hm[3] = "Lakeside"
hm[4] = "2"
hm[5] = "Billy"
hm[6] = "Georgia"
hm[7] = "3"
hm[8] = "Sam"
hm[9] = "Canton"
hm[10] = "4"
hm[11] = "Linda"
hm[12] = "North Canton"
hm[13] = "5"
hm[14] = "Lisa"
hm[15] = "Phoenix"
for (i in 1..hm.size step 3) {
println(Triple(hm[i], hm[i + 1], hm[i + 2]))
}
Now let's convert the same idea into a function:
var count = 0
fun nextTriplet(hm: HashMap<Int, String>): Triple<String?, String?, String?> {
val result = mutableListOf<String?>()
for (i in 1..3) {
result += hm[(count++ % hm.size) + 1]
}
return Triple(result[0], result[1], result[2])
}
We used a far from elegant piece of code to arrive at an answer to the question.
We used a CharArray since Grendel seemed OK with that concept of an Array:
internal var YY = 0
val CharArray = arrayOf(1, "Dwight", "Lakeside",2,"Billy","Georgia",3,"Sam","Canton")
In the onCreate method we loaded the first set of data with a call to onCO(null)
Here is the working code to iterate over the CharArray that was used
fun onCO(view: View?){
etCity.setText("")
etName.setText("")
etID.setText("")
if(CharArray.size > YY){
val id = CharArray[YY]
val name = CharArray[YY + 1]
val city = CharArray[YY + 2]
etID.setText(id.toString())
etName.setText(name.toString())
etCity.setText(city.toString())
YY = YY + 3
}else{
YY = 0
val id = CharArray[YY]
val name = CharArray[YY + 1]
val city = CharArray[YY + 2]
etID.setText(id.toString())
etName.setText(name.toString())
etCity.setText(city.toString())
YY = YY + 3
}
}
Simple but not elegant. It seems the code is a better example of a counter than of iteration.
Controlling the for loop may involve fewer lines of code, but controlling the loop seemed like the wrong direction. We might try to use the keyword "when" to apply logic to this question; busy at the moment.
After some further research, here is a partial answer to our question.
This code only shows how to traverse a hash map; indexing this traversal every 3 records still needs to be added to make the code complete. This answer is for anyone who stumbles upon the question. The code and a link to the resource are provided below.
fun main(args: Array<String>) {
val map = hashMapOf<String, Int>()
map.put("one", 1)
map.put("two", 2)
for ((key, value) in map) {
println("key = $key, value = $value")
}
}
The link will let you try Kotlin code examples in your browser
LINK
We only did moderate research before asking this question. Our apologies. If anyone is starting anew with Kotlin, this second link may be of greater value. We seldom find understandable answers in the Android Developers pages. The Kotlin and Android pages are more beginner-friendly and not as technical in scope. Enjoy the link.
Kotlin and Android

Generate all valid combinations of N pairs of parentheses

UPDATE (detailed task explanation):
We have a string consisting of the digits 0 and 1, separated by the operators |, ^ or &. The task is to create all fully parenthesized expressions, so each final expression is divided into "2 parts".
For example
0^1 -> (0)^(1), but not with extraneous parentheses such as 0^1 -> (((0))^(1))
Example for expression 1|0&1:
(1)|((0)&(1))
((1)|(0))&(1)
As you can see, both expressions above have a left and a right part:
left: (1); right: ((0)&(1))
left: ((1)|(0)); right: (1)
I tried the following code, but it does not work correctly (see output):
// expression has type string
// result has type Array (ArrayList in Java)
function setParens(expression, result) {
if (expression.length === 1) return "(" + expression + ")";
for (var i = 0; i < expression.length; i++) {
var c = expression[i];
if (c === "|" || c === "^" || c === "&") {
var left = expression.substring(0, i);
var right = expression.substring(i + 1);
leftParen = setParens(left, result);
rightParen = setParens(right, result);
var newExp = leftParen + c + rightParen;
result.push(newExp);
}
}
return expression;
}
function test() {
var r = [];
setParens('1|0&1', r);
console.log(r);
}
test();
code output: ["(0)&(1)", "(0)|0&1", "(1)|(0)", "1|0&(1)"]
Assuming the input expression is not already partially parenthesized and you want only fully parenthesized results:
FullyParenthesize(expression[1...n])
result = {}
// looking for operators
for p = 1 to n do
// binary operator; parenthesize LHS and RHS
// parenthesize the binary operation
if expression[p] is a binary operator then
lps = FullyParenthesize(expression[1 ... p - 1])
rps = FullyParenthesize(expression[p + 1 ... n])
for each lp in lps do
for each rp in rps do
result = result U {"(" + lp + expression[p] + rp + ")"}
// no binary operations <=> single variable
if result == {} then
result = {"(" + expression + ")"}
return result
Example: 1|2&3
FullyParenthesize("1|2&3")
result = {}
binary operator | at p = 2;
lps = FullyParenthesize("1")
no operators
result = {"(" + "1" + ")"}
return result = {"(1)"}
rps = FullyParenthesize("2&3")
binary operator & at p = 2
lps = Parenthesize("2")
no operators
result = {"(" + "2" + ")"}
return result = {"(2)"}
rps = Parenthesize("3")
no operators
result = {"(" + "3" + ")"}
return result = {"(3)"}
lp = "(2)"
rp = "(3)"
result = result U {"(" + "(2)" + "&" + "(3)" + ")"}
return result = {"((2)&(3))"}
lp = "(1)"
rp = "((2)&(3))"
result = result U {"(" + "(1)" + "|" + "((2)&(3))" + ")"}
binary operator & at p = 4
...
result = result U {"(" + "((1)|(2))" + "&" + "(3)" + ")"}
return result {"((1)|((2)&(3)))", "(((1)|(2))&(3))"}
Given an input expression with k binary operators you will get the k-th Catalan number of unique fully parenthesized expressions (without redundant parentheses): 1, 2, 5, 14, ... for k = 1, 2, 3, 4.
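A direct Java transcription of the pseudocode above might look as follows (an illustrative sketch under the same assumptions: a flat, unparenthesized input with single-character operands; like the trace, it keeps the outermost parentheses):
import java.util.ArrayList;
import java.util.List;

class FullParenthesizer {
    static List<String> fullyParenthesize(String expr) {
        List<String> result = new ArrayList<>();
        for (int p = 0; p < expr.length(); p++) {
            char c = expr.charAt(p);
            if (c == '|' || c == '^' || c == '&') { // binary operator: parenthesize LHS and RHS
                for (String lp : fullyParenthesize(expr.substring(0, p))) {
                    for (String rp : fullyParenthesize(expr.substring(p + 1))) {
                        result.add("(" + lp + c + rp + ")");
                    }
                }
            }
        }
        if (result.isEmpty()) { // no binary operators: a single variable
            result.add("(" + expr + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fullyParenthesize("1|0&1"));
        // [((1)|((0)&(1))), (((1)|(0))&(1))]
    }
}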

OR operator can only be applied to Boolean expressions or to Integer or Long expressions?

I have the following code:
public class Testcode {
private static final Long[] P = new Long[18];
public void setKey( string key )
{
integer i, j, k;
long data;
integer N = 16;
string[] keytemp = new string[]{}; keytemp.add(key);
// Initialize P and S.
for ( i = 0; i < N + 2; ++i ){
P[i] = Pinit[i];
}
// XOR the key into P.
j = 0;
for ( i = 0; i < N + 2; ++i )
{
data = 0;
for ( k = 0; k < 4; ++k )
{
data = ( data << 8 ) | keytemp[j];
++j;
}
P[i] ^= data;
}
}
private static final long[] Pinit = new Long[] {
604135516L, 2242044355L, 320440478L , 57401183L,
2732047618L, 698298832L, 137296536L , 3964563569L,
1163258022L, 954160567L, 3193502383L, 887688400L,
3234508543L, 3380367581L, 1065660069L, 3041631479L,
2420952273L, 2306437331L
};
}
I'm getting the following error:
Error: Compile Error: OR operator can only be applied to Boolean expressions or to Integer or Long expressions at line 36 column 18
which is in this line:
data = ( data << 8 ) | keytemp[j];
Is there another way to write this line of code?
Thanks
I'm assuming that the keytemp array contains strings of length 1 since Apex doesn't have a Character primitive. You'll have to convert the first character of each string to an integer and then perform the OR.
Unfortunately Apex doesn't appear to have a built-in way of getting the ASCII value of a single-character String. You may have to write your own converter function. Here are some people with the same issue and some proposed solutions:
Discussion of how to convert strings to ASCII values
Something like this?
No loops except for the initial priming of the array... And later we have to get each character separately, but it looks like your algorithm needs 1-character strings anyway?
List<Integer> ints = new List<Integer>();
for(Integer i =0; i < 256; ++i){
ints.add(i);
}
String allAscii = String.fromCharArray(ints);
// System.debug(allAscii); // funny result if you really want to start from 0x00 character
System.debug(allAscii.substring(1)); // for demo purposes we'll show only from 0x01 though
String text = 'Hi StackOverflow.com!';
for(Integer i =0; i < text.length(); ++i){
String oneChar = text.mid(i, 1);
System.debug(oneChar + ' => ' + allAscii.indexOf(oneChar));
}
Output:
!"#$%&'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
H => 72
i => 105
=> 32
S => 83
t => 116
a => 97
c => 99
k => 107
O => 79
v => 118
e => 101
r => 114
f => 102
l => 108
o => 111
w => 119
. => 46
c => 99
o => 111
m => 109
! => 33
Looks good to me (space is # 32 etc).
Could be further optimized from a linear search (indexOf) to constant-time lookups if you'd build a Map<String, Integer> instead, but I'd say it's good enough?

Other examples of magical calculations

I have seen this topic here about John Carmack's magical way to calculate a square root, which refers to this article: http://www.codemaestro.com/reviews/9. This surprised me a lot; I had never realized that calculating sqrt could be so much faster.
I was just wondering what other examples of "magic" exist out there that computer games use to run faster.
UPDATE:
John Carmack is not the author of the magic code. This article tells more. Thanks @moocha.
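For reference, the trick discussed in that article can be ported to Java as a rough sketch (Float.floatToIntBits / intBitsToFloat stand in for the original C pointer cast; the constant 0x5f3759df is the widely quoted one from the Quake III source):
class FastInvSqrt {
    // Approximates 1/sqrt(x) with the bit-level trick plus one Newton-Raphson step.
    static float invSqrt(float x) {
        float xhalf = 0.5f * x;
        int i = Float.floatToIntBits(x);   // reinterpret the float's bits as an int
        i = 0x5f3759df - (i >> 1);         // the "magic" initial guess
        float y = Float.intBitsToFloat(i);
        y = y * (1.5f - xhalf * y * y);    // one Newton-Raphson refinement
        return y;
    }

    public static void main(String[] args) {
        System.out.println(invSqrt(4.0f));        // roughly 0.5
        System.out.println(1.0 / Math.sqrt(4.0)); // exact value for comparison
    }
}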
There is a book which gathers many of those 'magic tricks' and that may be interesting for you: Hacker's Delight.
It has, for example, many bit twiddling tricks (you can see several square root algorithms, for example, in the Google Books version).
Not exactly a mathematical hack, but I like this one about Roman Numerals in Java6:
public class Example {
public static void main(String[] args) {
System.out.println(
MCMLXXVII + XXIV
);
}
}
will give you the expected result (1977 + 24 = 2001), because of a rewrite rule:
a class Transform, which extends TreeTranslator (an internal class of the Java compiler), visits all identifiers in the source code and replaces each variable whose name matches a Roman numeral with an int literal of the same numeric value.
public class Transform extends TreeTranslator {
@Override
public void visitIdent(JCIdent tree) {
String name = tree.getName().toString();
if (isRoman(name)) {
result = make.Literal(numberize(name));
result.pos = tree.pos;
} else {
super.visitIdent(tree);
}
}
}
I'm a big fan of the Bresenham line algorithm, but man, the CORDIC rotator enabled all kinds of pixel chicanery for me when CPUs were slower.
Bit Twiddling Hacks has many cool tricks.
Although some of it is dated now, I was awed by some of the tricks in "The Zen of Code Optimization" by Michael Abrash. The implementation of the Game Of Life is mind-boggling.
I have always been impressed by two classic 'magic' algorithms that have to do with dates:
Zeller's congruence for computing the day of week of a given date
Gauss's algorithm to calculate the date of Easter
Some (untested) code follows:
import math

def dayOfWeek(dayOfMonth, month, year):
    yearOfCentury = year%100
    century = year // 100
    h = int(dayOfMonth + math.floor(26.0*(month + 1)/10) + yearOfCentury \
        + math.floor(float(yearOfCentury)/4) + math.floor(float(century)/4) \
        + 5*century) % 7
    return ['Saturday', 'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'][h]

def easter(year):
    a = year%19
    b = year%4
    c = year%7
    k = int(math.floor(float(year)/100))
    p = int(math.floor((13 + 8.0*k)/25))
    q = int(math.floor(float(k)/4))
    M = (15 - p + k - q)%30
    N = (4 + k - q)%7
    d = (19*a + M)%30
    e = (2*b + 4*c + 6*d + N)%7
    day1 = 22 + d + e
    if day1 <= 31: return "March %d"%day1
    day2 = d + e - 9
    if day2 == 26: return "April 19"
    if day2 == 25 and (11*M + 11)%30 < 19: return "April 18"
    return "April %d"%day2

print dayOfWeek(2, 12, 2008) # 'Tuesday'
print easter(2008) # 'March 23'
