Trie with association between words - algorithm

I have a use case where I want to correct words.
I have a set of correct and incorrect (misspelled) words.
I populate the trie with all of the words.
I have both the correct and the incorrect version of every word.
Now, if I get the word "a" for correction, I search for it in the trie. If the trie has this word, I want to associate it with its correct version.
Solution:
I can store the correct version ("a1") at the last node of the incorrect word in the trie, and resolve "a" to "a1".
But then I have to store the correct version of each word at the last node, which increases the memory footprint.
All of the words (correct and incorrect) are already loaded into the trie.
Is there any way to associate the incorrect word with the correct one without storing the entire word again as the value in the last node?
Any pointers?
public class TrieNode<T> {
private Map<Character, TrieNode<T>> childs;
private boolean complete;
private T value;
....
}

You could use a single dictionary for this. In C#, that would be:
Dictionary<string, string> MisspellingsLookup = new Dictionary<string, string>();
The key is the misspelling, and the value is the correct spelling.
Now, some words are commonly misspelled in multiple ways. For example, "occasion" is often misspelled as "ocassion" or "occassion". If you want to reduce the memory used by the multiple misspellings, you can use a temporary dictionary during construction. Whenever you add a misspelling, you look up the correct spelling in the good words dictionary, and if it's already there then you use that value. So all you do is store a reference to an existing word rather than creating a new string. Here's an example:
Dictionary<string, string> GoodWords = new Dictionary<string, string>();
Dictionary<string, string> Misspellings = new Dictionary<string, string>();
void AddMisspelling(string misspelled, string correct)
{
string goodWord;
if (!GoodWords.TryGetValue(correct, out goodWord))
{
goodWord = correct;
GoodWords.Add(correct, correct);
}
// Always use goodWord here, so you're not creating duplicate strings.
Misspellings.Add(misspelled, goodWord);
}
When you're done adding words, you can clear the GoodWords dictionary to save space:
GoodWords = null;
I recommend a dictionary here because it'll almost certainly use less memory, and a lookup is a single hash probe (O(1) on average in the number of stored words) rather than a node-by-node walk of the trie.
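Since the question's trie is written in Java, here is a minimal Java sketch of the same idea. The class and method names (MisspellingIndex, addMisspelling, finishLoading, correct) are made up for illustration and are not from the question or the answer above.

```java
import java.util.HashMap;
import java.util.Map;

final class MisspellingIndex {
    // Canonical String instance for each correct word; only needed while loading.
    private Map<String, String> goodWords = new HashMap<>();
    // Misspelling -> shared reference to the canonical correct word.
    private final Map<String, String> misspellings = new HashMap<>();

    // Must not be called after finishLoading().
    void addMisspelling(String misspelled, String correct) {
        // putIfAbsent returns the instance already stored, or null if there was none.
        String canonical = goodWords.putIfAbsent(correct, correct);
        if (canonical == null) {
            canonical = correct;
        }
        misspellings.put(misspelled, canonical);
    }

    // Drop the temporary map once all words have been added.
    void finishLoading() {
        goodWords = null;
    }

    // Returns the correct spelling, or the word itself if it is not a known misspelling.
    String correct(String word) {
        return misspellings.getOrDefault(word, word);
    }

    public static void main(String[] args) {
        MisspellingIndex index = new MisspellingIndex();
        index.addMisspelling("ocassion", "occasion");
        index.addMisspelling("occassion", "occasion");
        index.finishLoading();
        System.out.println(index.correct("occassion"));
    }
}
```

Both misspellings end up pointing at one String object, so each correct word is held in memory once no matter how many misspellings map to it.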

Is a map backwards compatible with a repeated keyvaluepair?

Am I able to use
message Foo {
map<string, string> foo = 1;
}
in place of
message Foo {
repeated KeyValuePair foo = 1;
}
message KeyValuePair {
string key = 1;
string value = 2;
}
?
The first source is in proto3 and the second is in proto2.
As long as you don't have duplicate keys, they will be very similar. If you have duplicate keys, using a map will behave differently, as duplicates will either cause overwrites or an exception (I can't recall which, sorry). Also, "repeated" is usually implemented as a list/array/etc., so order is retained. "map" is usually implemented with some kind of map/dictionary structure, where order is not usually guaranteed.
So: if order doesn't matter and you always have unique keys: you're fine.
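To make the duplicate-key and ordering points concrete, here is a small plain-Java model. It does not use the protobuf library, and the names (RepeatedVsMap, toMap) are invented for illustration. It assumes Java 9+ and assumes "last value wins" for duplicate keys, which is how the protobuf language guide describes parsing a map from the wire; treat that as an assumption to check against your own runtime.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RepeatedVsMap {
    // Collapse an ordered list of key/value pairs the way a map field would:
    // a later pair with the same key replaces the earlier one (assumed behaviour).
    static Map<String, String> toMap(List<Map.Entry<String, String>> pairs) {
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, String> pair : pairs) {
            result.put(pair.getKey(), pair.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // The "repeated" view: three entries, order preserved, duplicate key "a".
        List<Map.Entry<String, String>> pairs = List.of(
                Map.entry("a", "1"),
                Map.entry("b", "2"),
                Map.entry("a", "3"));
        Map<String, String> map = toMap(pairs);
        System.out.println(pairs.size()); // 3
        System.out.println(map.size());   // 2
        System.out.println(map.get("a")); // 3
    }
}
```

The list keeps all three pairs in order, while the map view keeps two entries and has no defined iteration order.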

Using processing and getting an unexpected token on my array. #hackathon

String[] files= {};
int[] fileNumber = {0};
String commandPromptTxt = "";
String CPTDummy = "";
String blankDummy = "";
String[] currentFile = {};
void makeFile(String[] file, int fileNum, String name1, int level1, int[]parents1, int[] children1, String type1) {
//Warning if you make a file and use the same file number more than once you will override the file
files[fileNum]= {"10"};
};
So I have this piece of code in Processing, and I am getting the error "unexpected token: {" on the line files[fileNum] = {};. I get the same error even when I put values inside the braces. Any ideas for a fix? Thanks.
Why are you including brackets at all?
The syntax you're using is an array initializer. You use it correctly here:
String[] files= {};
This initializes your files variable to an empty array. You also use the syntax correctly here:
int[] fileNumber = {0};
This initializes your fileNumber variable to an array with a single index, and in that index is the value 0.
This line is where it stops making sense:
files[fileNum]= {"10"}
First of all, you've already initialized your files variable to an array with zero indexes. That means that even if this would compile, you'd get an ArrayIndexOutOfBoundsException, because you're trying to use indexes of an array that doesn't have any.
Secondly, you're misusing the array initialization syntax. I'm pretty sure you don't want the indexes of your array to also be arrays, otherwise you'd have to make them 2D arrays.
So, to sum it up, you need to do two things:
1: Initialize your arrays to actually have indexes. Something like this:
String[] files = new String[10]; //array with 10 indexes
2: Stop misusing the array initialization syntax and just pass values into the array indexes:
files[fileNum]= "10";
You might be better off using ArrayLists instead, though. Then you don't need to know how many indexes you'll have ahead of time, and you can simply call the add() function to add stuff to them.
More info can be found in the Processing reference.
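A minimal sketch of the ArrayList approach, written as plain Java so it can run outside Processing. The FileTable class and the simplified makeFile signature are illustrative assumptions; in a Processing sketch the list and the method would sit at the top level instead of inside a class.

```java
import java.util.ArrayList;
import java.util.List;

final class FileTable {
    // Grows on demand, so no size has to be chosen up front.
    private final List<String> files = new ArrayList<>();

    // Appends a file entry and returns the index it was stored at.
    int makeFile(String contents) {
        files.add(contents);
        return files.size() - 1;
    }

    // Reads back the entry stored at the given index.
    String get(int fileNum) {
        return files.get(fileNum);
    }

    public static void main(String[] args) {
        FileTable table = new FileTable();
        int fileNum = table.makeFile("10");
        System.out.println(fileNum + " -> " + table.get(fileNum));
    }
}
```

Because add() always appends, two files can no longer overwrite each other by reusing an index.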

Accessing public static final field using JoSQL

I've been using JoSQL for quite a few months now and today I came across a problem I am not sure how to solve. I probably could solve it by binding variables/placeholders, but I'd like to include the fields in the query.
SELECT * FROM ...MyObject WHERE getType != com.mypackage.myclass.TYPE_A
This is the query that I have. TYPE_A is a public static final int attribute in "myclass" class. Accessing methods (such as getType) is easy, because getType is expected to be a method from MyObject - just that I do not write round brackets after it (this is how JoSQL works as far as I know).
Does anyone happen to have an idea how to access a public static final field?
JoSQL uses gentlyweb-utils; it seems to be some sort of Accessor/Getter/Setter framework. I'd love to access that attribute without having to bind variables, but I haven't been able to do so.
Thanks for your help in advance! I really appreciate it.
I think I have figured something out. First: it does not seem possible to access static variables directly, for whatever reason. I've used the following approach to solve my issue:
create a method, which picks up a given JoSQL-statement
mark the constants, which you want to replace, by say "{?FULL_PACKAGE_AND$CONSTANT}"
use reflection to determine the class as well as the field (and its value) from the placeholder
iteratively replace the statement until no "{?"-values are available
Example:
JoSQL-statement looks like this:
(isWeapon = TRUE AND getItem.getType2 = {?com.l2jserver.gameserver.model.items.L2Item$TYPE2_WEAPON})
Method using the query-object:
final Query query = DataLayer.createJoSqlQuery(joSql);
Method (pre)processing the JoSQL-statement:
final Query query = new Query();
int variableColumn = 0;
while (joSql.indexOf("{?") > -1) {
variableColumn++;
final int startIndex = joSql.indexOf("{?");
final int endIndex = joSql.indexOf("}", startIndex);
final String value = joSql.substring(startIndex + 2, endIndex);
try {
final Object variableValue = Class.forName(value.split("\\$")[0]).getField(value.split("\\$")[1]).get(null);
query.setVariable(variableColumn, variableValue);
joSql = joSql.replace("{?" + value + "}", "?");
}
catch (...) {
e.printStackTrace();
}
}
query.parse(joSql);
return query;
The JoSQL-statement preprocessing method basically iterates through a given JoSQL-statement and checks whether it contains the string "{?". If it does, it extracts the text up to the closing "}" and splits it into class name and constant name (note the dollar symbol right in front of the constant name).
Finally it creates the objects and sets them using something similar to prepared statements "setObject"-method. In the end it just replaces the values within the JoSQL-statement with question marks ("?") and sets a corresponding object in the newly created Query-object, which is later used to retrieve information.
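The placeholder step can be tried without JoSQL on the classpath. The following is a sketch under assumptions: the names (ConstantPlaceholders, Resolved, resolve) are invented here, the resolved values are collected in a list instead of being set on a Query, and the token is split at the last "$" so that nested class names still work. It resolves public static fields only.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

final class ConstantPlaceholders {
    // Rewritten statement plus the constant values, in placeholder order.
    static final class Resolved {
        final String statement;
        final List<Object> values;

        Resolved(String statement, List<Object> values) {
            this.statement = statement;
            this.values = values;
        }
    }

    // Replaces every {?pkg.Class$FIELD} with "?" and looks up the static field's value.
    static Resolved resolve(String statement) {
        List<Object> values = new ArrayList<>();
        StringBuilder out = new StringBuilder();
        int pos = 0;
        int start;
        while ((start = statement.indexOf("{?", pos)) >= 0) {
            int end = statement.indexOf('}', start);
            if (end < 0) {
                throw new IllegalArgumentException("Unclosed placeholder at index " + start);
            }
            String token = statement.substring(start + 2, end);
            int dollar = token.lastIndexOf('$');
            if (dollar < 0) {
                throw new IllegalArgumentException("Missing '$' in placeholder: " + token);
            }
            try {
                Field field = Class.forName(token.substring(0, dollar))
                        .getField(token.substring(dollar + 1));
                values.add(field.get(null)); // null receiver: the field must be static
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot resolve " + token, e);
            }
            out.append(statement, pos, start).append('?');
            pos = end + 1;
        }
        out.append(statement.substring(pos));
        return new Resolved(out.toString(), values);
    }

    public static void main(String[] args) {
        Resolved r = resolve("getType != {?java.lang.Integer$MAX_VALUE}");
        System.out.println(r.statement + " " + r.values);
    }
}
```

The collected values could then be bound in order with query.setVariable(index, value), starting at index 1 as in the method above. Throwing on an unresolvable placeholder also avoids looping forever on a token that is never replaced.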

Two arrays or one in Map structure?

I'm trying to create a Map whose data is static and does not change after the program starts (it is actually loaded from a server).
Is it better to have two arrays, e.g. in Java:
String keys[] = new String[10];
String values[] = new String[10];
where keys[i] corresponds to values[i]?
or to keep them in a single array, e.g.
String[][] map = new String[10][2];
where map[i][0] is the key and map[i][1] is the value?
Personally, the first makes more sense to me, but the second makes more sense to my partner. Is either better performance-wise? Easier to understand?
Update: I'm looking to do this in JavaScript where Map and KeyValuePairs don't exist
Using a Map implementation (in Java) would make this easier to understand as the association is clearer:
static final Map<String, String> my_map;
static
{
my_map = new HashMap<String, String>();
// Populate.
}
A Hashtable looks like what you need. It hashes the keys in such a way that lookup can happen in O(1).
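Since the data does not change after startup, the Map-based suggestions above can be combined with a read-only wrapper. This is a minimal sketch: the class name StaticLookup and the load() helper are hypothetical, and the keys and values are placeholders standing in for whatever is loaded from the server.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class StaticLookup {
    // Populated once when the class is loaded, then exposed as a read-only view.
    static final Map<String, String> MY_MAP = load();

    private static Map<String, String> load() {
        Map<String, String> map = new HashMap<>();
        map.put("key1", "value1"); // placeholder data
        map.put("key2", "value2"); // placeholder data
        return Collections.unmodifiableMap(map);
    }

    public static void main(String[] args) {
        System.out.println(MY_MAP.get("key1")); // value1
    }
}
```

Compared with two parallel arrays, the key and value can never drift out of step, and a lookup does not need a linear scan for the key's index.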
So, you're looking to do this in JavaScript. Any array or object in JS is a map, so you could just do
var mymap = {'key1':'value1','key2':'value2'};

Map Support in Shell Scripting

I am new to shell scripting, but I am familiar with Java Maps. I would like to know how I can use a map in shell scripting. Below is the functionality I need in the shell:
HashMap<String, ArrayList<String>> users = new HashMap<String, ArrayList<String>>();
String username = "test_user1";
String address = "test_user1_address";
String emailId = "test_user1_emailId";
ArrayList<String> values = new ArrayList<String>();
values.add(address);
values.add(emailId);
users.put(username, values);
String anotherUser = "test_user2";
if (users.containsKey(anotherUser)) {
System.out.println("Do some stuff here");
}
In short, I want a map with a String key and either a Vector or an ArrayList as the value (otherwise I have to live with arrays instead of ArrayList and manage the indexes manually), a put method to insert, and one more method to check whether a key is present in the existing map.
The above code is a sample code.
Thank you in advance.
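bash does not support nested structures like this: bash 4 and later has associative arrays (declare -A), but their values can only be strings, not arrays. Either use separate variables for each array, or use something more capable such as Python.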
