Trying to understand the space complexity of this algorithm

I see a lot of articles online explaining time complexity, but I haven't found anything good that explains space complexity well. I was trying to solve the following interview question:
You have two numbers represented by a linked list, where each node
contains a single digit. The digits are stored in reverse order, such
that the 1's digit is at the head of the list. Write a function that
adds the two numbers and returns the sum as a linked list.
EXAMPLE
Input: (7 -> 1 -> 6) + (5 -> 9 -> 2). That is, 617 + 295.
Output: 2 -> 1 -> 9. That is, 912.
My solution for it is the following:
private Node addLists(Node head1, Node head2) {
    Node summationHead = null;
    Node summationIterator = null;
    int num1 = extractNumber(head1);
    int num2 = extractNumber(head2);
    int sum = num1 + num2;
    StringBuilder strValue = new StringBuilder();
    strValue.append(sum);
    String value = strValue.reverse().toString();
    char[] valueArray = value.toCharArray();
    for (char charValue : valueArray) {
        Node node = createNode(Character.getNumericValue(charValue));
        if (summationHead == null) {
            summationHead = node;
            summationIterator = summationHead;
        } else {
            summationIterator.next = node;
            summationIterator = node;
        }
    }
    return summationHead;
}

private Node createNode(int value) {
    Node node = new Node(value);
    node.element = value;
    node.next = null;
    return node;
}

private int extractNumber(Node head) {
    Node iterator = head;
    StringBuilder strNum = new StringBuilder();
    while (iterator != null) {
        int value = iterator.element;
        strNum.append(value);
        iterator = iterator.next;
    }
    String reversedString = strNum.reverse().toString();
    return Integer.parseInt(reversedString);
}
Can someone please deduce the space complexity for this? Thanks.

Space complexity asks: how does the amount of space required to run this algorithm grow asymptotically as the inputs get larger?
So you have two lists of length N and M. The resulting list will have length max(N, M), possibly +1 if there's a carry. But that +1 is a constant, and we don't consider it part of the Big-O, since the larger of N and M dominates.
Also note this algorithm is pretty straightforward: there's no intermediate calculation requiring larger-than-linear space (the intermediate strings and integers are themselves at most linear in the input).
The space complexity is O(max(N, M)).
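If you want to avoid the round-trip through strings and ints entirely (Integer.parseInt would overflow for long lists anyway), here is a minimal sketch of a digit-by-digit version, reusing the question's createNode helper; it still uses O(max(N, M)) space, all of it for the result list:

private Node addListsDigitwise(Node head1, Node head2) {
    Node dummy = createNode(0); // placeholder head to simplify appending
    Node tail = dummy;
    int carry = 0;
    // walk both lists in lockstep; the reversed digit order makes the carry flow naturally
    while (head1 != null || head2 != null || carry != 0) {
        int sum = carry;
        if (head1 != null) { sum += head1.element; head1 = head1.next; }
        if (head2 != null) { sum += head2.element; head2 = head2.next; }
        carry = sum / 10;
        tail.next = createNode(sum % 10);
        tail = tail.next;
    }
    return dummy.next; // skip the placeholder
}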

Related

Trie Autocomplete with word weight (frequency)

I was asked this during a recent phone interview.
Given a Dictionary with words and each word's weight (frequency, higher is better), like so -
var words = new Dictionary<string,int>();
words.Add("am",7);
words.Add("ant", 5);
words.Add("amazon", 10);
words.Add("amazing", 8);
words.Add("an", 4);
words.Add("as", 11);
words.Add("be", 8);
words.Add("bee", 2);
words.Add("bed", 4);
words.Add("best", 12);
words.Add("amuck", 1);
words.Add("amock", 2);
words.Add("bestest", 1);
Design an API method that, given a prefix and a number k, returns the top k words that match the prefix.
The words should be sorted based on their weight, the higher the better.
So, prefix = "am", k = 5, returns amazon, amazing, am, amock, amuck - in that specific order.
Performance on the prefix lookup is paramount; you can pre-process and use as much space as you like, as long as the prefix lookup is fast.
This calls for a Trie implementation, but my question is how best to handle the word weight and optimise the lookup. In my mind the options are -
a. For each node in the Trie, also store a sorted list of words (SortedDictionary<int,List<string>>) that start with this prefix - more space, but faster lookup.
b. For each node, store the Child nodes in some kind of sorted list, so you would still need to do a DFS for each child node to get the K words needed - less space compared to a., but slower.
I decided to go with option a.
public class TrieWithSuggestions
{
    TrieWithSuggestions _trieRoot;

    public TrieWithSuggestions()
    {
    }

    public char Character { get; set; }
    public int WordCount { get; set; } = 1;
    public TrieWithSuggestions[] ChildNodes { get; set; } = new TrieWithSuggestions[26];

    // Stores all words with this prefix, keyed by negated weight so heavier words iterate first.
    public SortedDictionary<int, HashSet<string>> PrefixWordsDictionary = new SortedDictionary<int, HashSet<string>>();

    public TrieWithSuggestions ConstructTrie(Dictionary<string, int> words)
    {
        if (words.Count > 0)
        {
            _trieRoot = new TrieWithSuggestions() { Character = default(char) };
            foreach (var word in words)
            {
                var node = _trieRoot;
                for (int i = 0; i < word.Key.Length; i++)
                {
                    var c = word.Key[i];
                    if (node.ChildNodes[c - 'a'] != null)
                    {
                        node = node.ChildNodes[c - 'a'];
                        UpdateParentNodeInformation(node, word.Key, words[word.Key]);
                        node.WordCount++;
                    }
                    else
                    {
                        InsertIntoTrie(node, word.Key, i, words);
                        break;
                    }
                }
            }
        }
        return _trieRoot;
    }

    public List<string> GetMatchingWords(string prefix, int k)
    {
        if (_trieRoot != null)
        {
            var node = _trieRoot;
            foreach (var ch in prefix)
            {
                if (node.ChildNodes[ch - 'a'] != null)
                {
                    node = node.ChildNodes[ch - 'a'];
                }
                else
                    return null;
            }
            if (node != null)
                return GetWords(node, k);
            else
                return null;
        }
        return null;
    }

    List<string> GetWords(TrieWithSuggestions node, int k)
    {
        List<string> output = new List<string>();
        foreach (var dictEntry in node.PrefixWordsDictionary)
        {
            var entries = node.PrefixWordsDictionary[dictEntry.Key];
            var take = Math.Min(entries.Count, k);
            output.AddRange(entries.Take(take).ToList());
            k -= entries.Count;
            if (k <= 0) // was k == 0, which never fires when a bucket overshoots k
                break;
        }
        return output;
    }

    void InsertIntoTrie(TrieWithSuggestions parentNode, string word, int startIndex, Dictionary<string, int> words)
    {
        for (int i = startIndex; i < word.Length; i++)
        {
            var c = word[i];
            var childNode = new TrieWithSuggestions() { Character = c };
            parentNode.ChildNodes[c - 'a'] = childNode;
            UpdateParentNodeInformation(parentNode, word, words[word]);
            parentNode = childNode;
            if (i == word.Length - 1)
                UpdateParentNodeInformation(parentNode, word, words[word]);
        }
    }

    void UpdateParentNodeInformation(TrieWithSuggestions parentNode, string word, int wordWeight)
    {
        wordWeight *= -1; // negate so the SortedDictionary iterates highest weight first
        if (parentNode.PrefixWordsDictionary.ContainsKey(wordWeight))
        {
            if (!parentNode.PrefixWordsDictionary[wordWeight].Contains(word))
                parentNode.PrefixWordsDictionary[wordWeight].Add(word);
        }
        else
            parentNode.PrefixWordsDictionary.Add(wordWeight, new HashSet<string>() { word });
    }
}
Construct Trie - Runtime O(N * M * logN), Space O(N * M * N), i.e. O(N^2 * M), where N = # of words and M = average word length.
Justification -
If there were no SortedDictionary, construction would be O(N * M); each insertion into a SortedDictionary is O(logN), so the worst-case runtime must be O(N * M * logN).
Space seems trickier, but as before, without the SortedDictionary, space would be O(N * M); in the worst case, a node's dictionary could hold all N words, so the space complexity looks like O(N * M * N).
GetMatchingWords - Runtime O(len(prefix) + k)
Function call -
var trie = new TrieWithSuggestions();
trie.ConstructTrie(words);
var list = trie.GetMatchingWords("am", 10); // amazon, amazing, am, amock, amuck
QUESTION:
Given the conditions on space and pre-processing, is there a better way to do this?
EDIT 1 -
a. Given this setup, it is best to sort the words by weight and then insert them into the Trie. In that case a simple List<string> would suffice, since higher-frequency words would automatically have been inserted first.
b. Now let's say that, in addition to being initialized with a Dictionary<string,int>, we are also going to receive additional word-frequency pairs. We would still want a lookup that is as fast as possible. Given this requirement, what is now the best data structure to store the sorted list of words within a TrieNode - is a SortedDictionary<int,HashSet<string>> the best option?
You could first sort the input with respect to the weights. Then you could use Lists instead of Dictionaries on the nodes of the trie. Since the words come in increasing (or decreasing) order of weight, checking the last element of the list is enough to decide where to put a new word. This gets rid of the O(logN) cost the SortedDictionary imposes.
The input can be sorted in O(N * logN) with a comparison sort, or in O(N + W) with a counting sort where W is the maximum weight.
Time complexity of setting up the trie becomes O(N * logN + N * M). This is better than O(N * M * logN). Query time does not change.
(Last paragraph assumes HashSet operations execute in O(1) as in the question. It is wrong to make this assumption for arbitrary inputs and hash functions.)
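To make that concrete, here is a minimal sketch of the sorted-insertion design (in Java rather than the question's C#; WeightedTrie, TrieNode, and topK are illustrative names, not from the original post). Sort once by descending weight; after that, plain appends keep every node's list ordered, and the top-k query is just a prefix walk plus a slice:

import java.util.*;

class WeightedTrie {
    static class TrieNode {
        TrieNode[] children = new TrieNode[26];
        List<String> wordsByWeight = new ArrayList<>(); // stays sorted: inserts arrive heaviest-first
    }

    private final TrieNode root = new TrieNode();

    // Sort once by descending weight, then insert: O(N*logN + N*M) overall.
    WeightedTrie(Map<String, Integer> words) {
        words.entrySet().stream()
             .sorted((a, b) -> Integer.compare(b.getValue(), a.getValue()))
             .forEach(e -> insert(e.getKey()));
    }

    private void insert(String word) {
        TrieNode node = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (node.children[i] == null) node.children[i] = new TrieNode();
            node = node.children[i];
            node.wordsByWeight.add(word); // appended in weight order, no per-insert sorting
        }
    }

    // O(len(prefix) + k): walk to the prefix node, take the first k entries.
    List<String> topK(String prefix, int k) {
        TrieNode node = root;
        for (char c : prefix.toCharArray()) {
            node = node.children[c - 'a'];
            if (node == null) return Collections.emptyList();
        }
        return node.wordsByWeight.subList(0, Math.min(k, node.wordsByWeight.size()));
    }
}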

How to find the total number of items in a linked list?

I have a linked list which is cyclic and I want to find out the total number of elements in this list. How can I achieve this?
One solution that I can think of is maintaining two pointers. The first pointer (*start) will always point to the starting node, say Node A.
The other pointer (*current) will be initialized as: current = start->next.
Now, just keep advancing current with current = current->next until it points back to start,
incrementing a counter, numberOfNodes++, along the way.
The code will look like:
int countNumberOfItems(Node* start) {
    Node* current = start->next;
    int numberOfNodes = 1; // at least the starting node is there
    while (current != start) { // compare current itself, not current->next, so the last node is counted too
        numberOfNodes++;
        current = current->next;
    }
    return numberOfNodes;
}
Let's say the list has x nodes before the loop and y nodes in the loop. Run Floyd cycle detection, counting the number of slow steps, s. Once you detect a meeting point M, run around the loop once more to get y.
Now, starting from the list head, make s - y steps, getting to a node N. Finally, run two slow pointers from N and from M until they meet, for t steps. Convince yourself (or better, prove) that they meet where the initial part of the list enters the loop.
Therefore, the initial part has s - y + t + 1 nodes, and the loop is formed by y nodes, giving s + t + 1 nodes in total.
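Here is a sketch of the standard variant of that idea (assuming a minimal Node class with a next field): find a meeting point, walk the loop once to get its length y, then advance from the head and from the meeting point in lockstep, counting tail nodes until the two pointers coincide at the loop entrance.

static int countNodes(Node head) {
    Node slow = head, fast = head;
    do { // Floyd: advance 1x and 2x until the pointers meet inside the loop
        if (fast == null || fast.next == null) { // no cycle: just count to the end
            int n = 0;
            for (Node p = head; p != null; p = p.next) n++;
            return n;
        }
        slow = slow.next;
        fast = fast.next.next;
    } while (slow != fast);

    int y = 1; // loop length: walk around the cycle once
    for (Node p = slow.next; p != slow; p = p.next) y++;

    int x = 0; // tail length: head and meeting point converge at the loop entrance
    for (Node a = head, b = slow; a != b; a = a.next, b = b.next) x++;

    return x + y; // nodes before the loop + nodes in the loop
}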
You just want to count the nodes in your linked list, right? I've put an example below. But in your case there is a cycle, so you also need to detect it in order not to count some of the nodes multiple times.
I've corrected my answer; there is now an ordinary count and a count-in-loop (with a fast and a slow pointer).
static int count(Node n) {
    int res = 1;
    Node temp = n;
    while (temp.next != n) {
        res++;
        temp = temp.next;
    }
    return res;
}

static int countInLoop(Node list) {
    Node s_pointer = list, f_pointer = list;
    while (s_pointer != null && f_pointer != null && f_pointer.next != null) {
        s_pointer = s_pointer.next;
        f_pointer = f_pointer.next.next;
        if (s_pointer == f_pointer)
            return count(s_pointer); // count around the cycle, starting from the meeting point
    }
    return 0;
}
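For what it's worth, a small self-contained usage sketch (the Node class here is a hypothetical minimal one) with the two methods from above pasted in; it shows that countInLoop reports only the nodes inside the cycle, so a tail node before the loop is not included:

public class CountDemo {
    static class Node {
        int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    static int count(Node n) { // as above: length of the cycle containing n
        int res = 1;
        for (Node temp = n; temp.next != n; temp = temp.next) res++;
        return res;
    }

    static int countInLoop(Node list) { // as above: 0 if acyclic, else the cycle length
        Node s_pointer = list, f_pointer = list;
        while (s_pointer != null && f_pointer != null && f_pointer.next != null) {
            s_pointer = s_pointer.next;
            f_pointer = f_pointer.next.next;
            if (s_pointer == f_pointer) return count(s_pointer);
        }
        return 0;
    }

    public static void main(String[] args) {
        // 1 -> 2 -> 3 -> 4 -> back to 2: a tail of one node plus a cycle of three
        Node n1 = new Node(1), n2 = new Node(2), n3 = new Node(3), n4 = new Node(4);
        n1.next = n2; n2.next = n3; n3.next = n4; n4.next = n2;
        System.out.println(countInLoop(n1)); // prints 3, not 4: the tail node is missed
    }
}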
First find the cycle using Floyd's cycle detection algorithm, maintaining a count of the steps taken while checking for the cycle; once the loop is found, print that count.
function LinkedList() {
    let length = 0;
    let head = null;

    let Node = function(element) {
        this.element = element;
        this.next = null;
    }

    this.head = function() {
        return head;
    };

    this.add = function(element) {
        let node = new Node(element);
        if (head === null) {
            head = node;
        } else {
            let currentNode = head;
            while (currentNode.next) {
                currentNode = currentNode.next;
            }
            currentNode.next = node;
        }
    };

    this.detectLoopWithCount = function() {
        head.next.next.next.next.next.next.next.next = head; // make a cycle for the demo
        let fastPtr = head;
        let slowPtr = head;
        let count = 0;
        while (slowPtr && fastPtr && fastPtr.next) {
            count++;
            slowPtr = slowPtr.next;
            fastPtr = fastPtr.next.next;
            if (slowPtr == fastPtr) {
                console.log("\n Bingo :-) Cycle found ..!! \n ");
                // here the whole list is one cycle, so the pointers first meet
                // after exactly one lap, and count equals the number of elements
                console.log('Total no. of elements = ', count);
                return;
            }
        }
    }
}

let mylist = new LinkedList();
mylist.add('list1');
mylist.add('list2');
mylist.add('list3');
mylist.add('list4');
mylist.add('list5');
mylist.add('list6');
mylist.add('list7');
mylist.add('list8');
mylist.detectLoopWithCount();
There is a "slow" pointer which moves one node at a time. There is a "fast" pointer which moves twice as fast, two nodes at a time.
A visualization as slow and fast pointers move through linked list with 10 nodes:
1: |sf--------|
2: |-s-f------|
3: |--s--f----|
4: |---s---f--|
5: |----s----f|
At this point one of two things is true: 1) the linked list does not loop (checked with fast != null && fast.next != null) or 2) it does loop. Let's continue the visualization assuming it does loop:
6: |-f----s---|
7: |---f---s--|
8: |-----f--s-|
9: |-------f-s|
10: s == f
If the linked list is not looped, the fast pointer finishes the race in n/2 steps; we can drop the constant and call it O(n). If the linked list does loop, the slow pointer moves through the whole linked list and eventually equals the fast pointer in O(n) time.
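As a compact reference, a sketch of exactly the check the visualization describes (assuming a Node class with a next field):

static boolean hasCycle(Node head) {
    Node slow = head, fast = head;
    while (fast != null && fast.next != null) {
        slow = slow.next;       // one step
        fast = fast.next.next;  // two steps
        if (slow == fast) return true; // fast lapped slow: there is a cycle
    }
    return false; // fast ran off the end: no cycle
}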

Word ladder complexity analysis

I'd like to make sure that I am doing the time complexity analysis correctly. There seem to be many different analyses out there.
Just in case people don't know the problem, here is the description:
Given two words (beginWord and endWord), and a dictionary's word list, find the length of shortest transformation sequence from beginWord to endWord, such that:
Only one letter can be changed at a time.
Each transformed word must exist in the word list. Note that beginWord is not a transformed word.
For example,
Given:
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]
As one shortest transformation is "hit" -> "hot" -> "dot" -> "dog" -> "cog",
return its length 5.
And this is simple BFS algorithm.
static int ladderLength(String beginWord, String endWord, List<String> wordList) {
    int level = 1;
    Deque<String> queue = new LinkedList<>();
    queue.add(beginWord);
    queue.add(null);
    Set<String> visited = new HashSet<>();
    // worst case we can add all dictionary thus N (len(dict)) computation
    while (!queue.isEmpty()) {
        String word = queue.removeFirst();
        if (word != null) {
            if (word.equals(endWord)) {
                return level;
            }
            // m * 26 * log N
            for (int i = 0; i < word.length(); i++) {
                char[] chars = word.toCharArray();
                for (char c = 'a'; c <= 'z'; c++) {
                    chars[i] = c;
                    String newStr = new String(chars);
                    if (!visited.contains(newStr) && wordList.contains(newStr)) {
                        queue.add(newStr);
                        visited.add(newStr);
                    }
                }
            }
        } else {
            level++;
            if (!queue.isEmpty()) {
                queue.add(null);
            }
        }
    }
    return 0;
}
wordList (the dictionary) contains N elements, and the length of beginWord is m.
In the worst case, the queue would hold every element of the word list, so the outer while loop runs O(N) times.
For each word (length m), it tries 26 characters (a to z), so the inner nested loops are O(26 * m); inside them it calls wordList.contains, which I assume is O(logN).
So overall it's O(N * m * 26 * logN) => O(N * m * logN).
Is this correct?
The List<T> type does not sort its elements; it "faithfully" keeps them in the order they were added. So wordList.contains is in fact O(N), not O(logN). For a HashSet such as visited, however, this operation is O(1) (amortized), so consider switching the word list to a HashSet as well.
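For illustration, here is a sketch of the method above with that change applied (java.util imports assumed); the only real differences are the one-time HashSet copy and the membership test against it, which drops each candidate lookup from O(N) to amortized O(1):

static int ladderLength(String beginWord, String endWord, List<String> wordList) {
    Set<String> dict = new HashSet<>(wordList); // one O(N) copy up front
    Set<String> visited = new HashSet<>();
    Deque<String> queue = new LinkedList<>();
    queue.add(beginWord);
    queue.add(null); // level marker
    int level = 1;
    while (!queue.isEmpty()) {
        String word = queue.removeFirst();
        if (word == null) { // end of a BFS level
            level++;
            if (!queue.isEmpty()) queue.add(null);
            continue;
        }
        if (word.equals(endWord)) return level;
        char[] chars = word.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            char original = chars[i];
            for (char c = 'a'; c <= 'z'; c++) {
                chars[i] = c;
                String next = new String(chars);
                if (!visited.contains(next) && dict.contains(next)) { // O(1) amortized now
                    visited.add(next);
                    queue.add(next);
                }
            }
            chars[i] = original; // restore before mutating the next position
        }
    }
    return 0;
}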

Longest Common Substring

We have two strings a and b. The length of a is greater than or equal to that of b. We have to find the longest common substring. If there are multiple answers, then we have to output the substring that comes earliest in b (earliest as in whose starting index comes first).
Note: The lengths of a and b can be up to 10^6.
I tried to find the longest common substring using a suffix array (sorting the suffixes with quicksort). For the case when there is more than one answer, I tried pushing onto a stack all the common substrings whose length equals that of the longest common substring.
I wanted to know is there any faster way to do so?
Build a suffix tree of the string a$b, that is, a concatenated with some character like $ occurring in neither string, then concatenated with b. A (compressed) suffix tree can be built in O(|a|+|b|) time and memory, and has O(|a|+|b|) nodes.
Now, for each node, we know its depth (the length of the string obtained by starting from the root and traversing the tree down to that node). We also can keep track of two boolean quantities: whether this node was visited during the build phase corresponding to a, and whether it was visited during the build phase corresponding to b (for example, we might as well build the two trees separately and then merge them using pre-order traversal). Now, the task boils down to finding the deepest vertex which was visited during both phases, which can be done by a single pre-order traversal. The case of multiple answers should be easy to handle.
This Wikipedia page contains another (brief) overview of the technique.
This is the longest-substring problem; are you looking for it with repetition or without? Please go through this, it might be helpful:
http://www.programcreek.com/2013/02/leetcode-longest-substring-without-repeating-characters-java/
import java.util.Scanner;

public class JavaApplication8 {
    public static int find(String s1, String s2) {
        int n = s1.length();
        int m = s2.length();
        int ans = 0;
        int[] a = new int[m]; // current row: common-suffix lengths ending at s1[i], s2[j]
        int[] b = new int[m]; // previous row
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                if (s1.charAt(i) == s2.charAt(j)) {
                    if (i == 0 || j == 0) a[j] = 1;
                    else {
                        a[j] = b[j - 1] + 1;
                    }
                    ans = Math.max(ans, a[j]);
                } else {
                    a[j] = 0; // reset, otherwise stale values from two rows ago leak in after the swap
                }
            }
            int[] c = a; // swap the rolling rows
            a = b;
            b = c;
        }
        return ans;
    }

    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        String s1 = sc.next();
        String s2 = sc.next();
        System.out.println(find(s1, s2));
    }
}
Time complexity: O(N * M), for the two nested loops over the string lengths N and M.
Space complexity: O(M), for the two rolling rows.
package main

import (
    "fmt"
)

func main() {
    fmt.Println(lcs("CLCL", "LCLC"))
}

// lcs returns the length and the content of the longest common substring of s1 and s2.
func lcs(s1, s2 string) (max int, str string) {
    n, m := len(s1), len(s2)
    // dp[i][j] = length of the common suffix of s1[:i] and s2[:j]
    dp := make([][]int, n+1)
    for i := range dp {
        dp[i] = make([]int, m+1)
    }
    for i := 1; i <= n; i++ {
        for j := 1; j <= m; j++ {
            if s1[i-1] == s2[j-1] {
                dp[i][j] = dp[i-1][j-1] + 1
                if dp[i][j] > max {
                    max = dp[i][j]
                    str = s2[j-max : j] // the match ends at position j in s2
                }
            }
        }
    }
    return max, str
}

Lowest Common Ancestor implementations - what's the difference?

I've been reading about the Lowest Common Ancestor algorithm on TopCoder and I can't understand why the RMQ algorithm is involved - the solution listed there is insanely complicated and has the following properties:
O(sqrt(n)) time complexity for searches, O(n) precalculation time complexity
O(n) space complexity for storing parents of each node
O(n) space complexity again, for storing precalculations of each node
My solution: given 2 integer values, find the nodes through a simple preorder traversal. Take one of the nodes and go up the tree, storing the path in a Set. Take the other node, go up the tree, and check each node on the way: if the node is in the Set, stop and return it as the LCA. Full implementation below.
O(n) time complexity for finding each of the 2 nodes, given the values (because it's a regular tree, not a BST)
O(log n) space complexity for storing the path in the Set
O(log n) time complexity for going up the tree with the second node
So given these two choices, is the algorithm on TopCoder better, and if yes, why? That's what I can't understand. I thought O(log n) is better than O(sqrt(n)).
public class LCA {
    private class Node {
        int data;
        Node[] children = new Node[0];
        Node parent;

        public Node() {
        }

        public Node(int v) {
            data = v;
        }

        @Override
        public boolean equals(Object other) {
            return this.data == ((Node) other).data;
        }

        @Override
        public int hashCode() {
            return data; // must be consistent with equals, since nodes go into a HashSet
        }
    }
    private Node root;

    public LCA() {
        root = new Node(3);
        root.children = new Node[4];
        root.children[0] = new Node(15);
        root.children[0].parent = root;
        root.children[1] = new Node(40);
        root.children[1].parent = root;
        root.children[2] = new Node(100);
        root.children[2].parent = root;
        root.children[3] = new Node(10);
        root.children[3].parent = root;
        root.children[0].children = new Node[3];
        root.children[0].children[0] = new Node(22);
        root.children[0].children[0].parent = root.children[0];
        root.children[0].children[1] = new Node(11);
        root.children[0].children[1].parent = root.children[0];
        root.children[0].children[2] = new Node(99);
        root.children[0].children[2].parent = root.children[0];
        root.children[2].children = new Node[2];
        root.children[2].children[0] = new Node(120);
        root.children[2].children[0].parent = root.children[2];
        root.children[2].children[1] = new Node(33);
        root.children[2].children[1].parent = root.children[2];
        root.children[3].children = new Node[4];
        root.children[3].children[0] = new Node(51);
        root.children[3].children[0].parent = root.children[3];
        root.children[3].children[1] = new Node(52);
        root.children[3].children[1].parent = root.children[3];
        root.children[3].children[2] = new Node(53);
        root.children[3].children[2].parent = root.children[3];
        root.children[3].children[3] = new Node(54);
        root.children[3].children[3].parent = root.children[3];
        root.children[3].children[0].children = new Node[2];
        root.children[3].children[0].children[0] = new Node(25);
        root.children[3].children[0].children[0].parent = root.children[3].children[0];
        root.children[3].children[0].children[1] = new Node(26);
        root.children[3].children[0].children[1].parent = root.children[3].children[0];
        root.children[3].children[3].children = new Node[1];
        root.children[3].children[3].children[0] = new Node(27);
        root.children[3].children[3].children[0].parent = root.children[3].children[3];
    }

    private Node findNode(Node root, int value) {
        if (root == null) {
            return null;
        }
        if (root.data == value) {
            return root;
        }
        for (int i = 0; i < root.children.length; i++) {
            Node found = findNode(root.children[i], value);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    public void LCA(int node1, int node2) {
        Node n1 = findNode(root, node1);
        Node n2 = findNode(root, node2);
        Set<Node> ancestors = new HashSet<Node>();
        while (n1 != null) {
            ancestors.add(n1);
            n1 = n1.parent;
        }
        while (n2 != null) {
            if (ancestors.contains(n2)) {
                System.out.println("Found common ancestor between " + node1 + " and " + node2 + ": node " + n2.data);
                return;
            }
            n2 = n2.parent;
        }
    }

    public static void main(String[] args) {
        LCA tree = new LCA();
        tree.LCA(33, 27);
    }
}
The LCA algorithm works for any tree (not necessarily binary and not necessarily balanced). Your "simple algorithm" analysis breaks down because tracing a path to the root node is actually O(n) time and space, not O(log n): an arbitrary tree can degenerate into a single long path.
Just want to point out that the problem is about a general rooted tree, not a binary search tree. So, in your algorithm:
O(n) time complexity for finding each of the 2 nodes, given the values
O(n) space complexity for storing the path in the Set
O(n) time complexity for going up the tree with the second node and checking against the stored path.
Checking each node as we go up from the second node is O(1) with a hash set, so over a path of up to n nodes it takes O(n) in total.
The Harel and Tarjan LCA algorithm (referenced in the link you gave) uses an O(n) pre-calculation, after which each lookup is O(1) (not O(sqrt(n)) as you claim).
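For reference, here is a hedged sketch (all names illustrative) of the RMQ-based approach the TopCoder article builds toward, using the simpler O(n log n)-preprocessing sparse table rather than the full O(n) Harel-Tarjan machinery; queries are still O(1). Take an Euler tour of the tree, record the depth at each step, and answer LCA(u, v) as the minimum-depth node between the first occurrences of u and v in the tour.

import java.util.*;

class EulerTourLCA {
    private final List<List<Integer>> children;                // children.get(v) = child list of node v
    private final List<Integer> tourNode = new ArrayList<>();  // node visited at each tour step
    private final List<Integer> tourDepth = new ArrayList<>(); // its depth
    private final int[] firstSeen;                             // first tour index of each node
    private int[][] minIdx;                                    // sparse table of depth-minimizing tour indices

    EulerTourLCA(List<List<Integer>> children, int root) {
        this.children = children;
        firstSeen = new int[children.size()];
        Arrays.fill(firstSeen, -1);
        dfs(root, 0);
        buildSparseTable();
    }

    private void dfs(int node, int depth) {
        if (firstSeen[node] == -1) firstSeen[node] = tourNode.size();
        tourNode.add(node);
        tourDepth.add(depth);
        for (int child : children.get(node)) {
            dfs(child, depth + 1);
            tourNode.add(node); // re-enter the parent after each subtree
            tourDepth.add(depth);
        }
    }

    private void buildSparseTable() {
        int m = tourNode.size();
        int levels = 32 - Integer.numberOfLeadingZeros(m); // floor(log2(m)) + 1
        minIdx = new int[levels][m];
        for (int i = 0; i < m; i++) minIdx[0][i] = i;
        for (int j = 1; j < levels; j++)
            for (int i = 0; i + (1 << j) <= m; i++) {
                int a = minIdx[j - 1][i], b = minIdx[j - 1][i + (1 << (j - 1))];
                minIdx[j][i] = tourDepth.get(a) <= tourDepth.get(b) ? a : b;
            }
    }

    // LCA in O(1): range-minimum of depth between the first occurrences of u and v.
    int lca(int u, int v) {
        int lo = Math.min(firstSeen[u], firstSeen[v]);
        int hi = Math.max(firstSeen[u], firstSeen[v]);
        int j = 31 - Integer.numberOfLeadingZeros(hi - lo + 1); // floor(log2 of range length)
        int a = minIdx[j][lo], b = minIdx[j][hi - (1 << j) + 1];
        return tourNode.get(tourDepth.get(a) <= tourDepth.get(b) ? a : b);
    }

    public static void main(String[] args) {
        // 0 is the root with children 1 and 2; node 2 has a child 3
        List<List<Integer>> ch = List.of(List.of(1, 2), List.of(), List.of(3), List.of());
        EulerTourLCA lca = new EulerTourLCA(ch, 0);
        System.out.println(lca.lca(1, 3)); // 0
        System.out.println(lca.lca(2, 3)); // 2
    }
}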
