How to align long texts? [closed] - algorithm

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
I want to align a pair of long texts of ~20M characters each.
I've used the Smith-Waterman algorithm in the past, but (from my limited understanding) it requires a 2-dimensional array sized by the two texts (a 20M by 20M array, about 4 × 10^14 cells), which is not practical.
So I'm looking for an algorithm that aligns a pair of long texts with practical memory use and execution time.
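For what it's worth, the full table is only needed to recover the alignment itself. The best local-alignment score can be computed while keeping just two rows, since each cell depends only on the current and previous row. Below is a minimal sketch of that idea; the function name and the scoring values (match=2, mismatch=-1, gap=-1) are assumptions chosen for illustration. Note that it only reduces memory: the running time is still proportional to the product of the two lengths, so on its own it would not make 20M by 20M practical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score, keeping only two rows of the DP table.

    The scoring values are illustrative assumptions, not a recommendation.
    Memory is O(min(len(a), len(b))); time is still O(len(a) * len(b)).
    """
    if len(b) > len(a):
        a, b = b, a  # iterate rows over the longer text so each row is short
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            if score > best:
                best = score
        prev = curr
    return best


print(smith_waterman_score("xxabcxx", "yyabcyy"))
```

Recovering the actual alignment in linear space takes a divide-and-conquer scheme in the style of Hirschberg, which this sketch does not attempt.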
UPDATE
I've also tried Myers and Miller using this implementation: https://www.codeproject.com/Articles/42279/Investigating-Myers-diff-algorithm-Part-of
But I still got an out-of-memory exception on "not so large" texts (1 MB).

Related

Divide an amount of water into containers [closed]

Closed 2 days ago.
Suppose there are 𝑛 > 1 identical containers, one of them holding 𝑊 liters of water and the rest empty. You are allowed to perform the following action: take two of the containers and divide the total amount of water in them equally between the two. The objective is to minimize the amount of water left in the container that originally held all the water, using a sequence of such actions.
What is the best way to do this? How many actions are required? Is there an algorithm for it?
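One natural strategy to try is to average the full container with each empty container in turn. Every such action halves what the original container holds, so after 𝑛 − 1 actions it is left with 𝑊 / 2^(𝑛−1). The sketch below only simulates that strategy with exact fractions; it does not prove that nothing better exists, and the function name is made up for illustration.

```python
from fractions import Fraction


def pour_sequentially(n, w):
    """Average container 0 (initially full) with each empty container once.

    Returns (amount left in container 0, number of actions performed).
    This simulates one candidate strategy; optimality is not checked here.
    """
    containers = [Fraction(w)] + [Fraction(0)] * (n - 1)
    actions = 0
    for i in range(1, n):
        half = (containers[0] + containers[i]) / 2
        containers[0] = containers[i] = half
        actions += 1
    return containers[0], actions


amount, actions = pour_sequentially(4, 8)
print(amount, actions)
```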

What are some heuristics for choosing a diff algorithm? [closed]

Closed 6 years ago.
The Myers diff algorithm performs well when the differences between the two texts are small, because most simple implementations have complexity O((N+M) * D), where N and M are the lengths of the texts and D is the number of differences. However, when the differences are large, it takes a very long time to run. For example, if one of the texts is large and the other is the empty string, many implementations take several minutes to run.
If you knew the differences were large, then you could choose a different algorithm. How do diff tools make this determination in practice?
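To make the question concrete, here is a rough sketch of the kind of cheap pre-checks one could put in front of the greedy Myers search: strip the common prefix and suffix, short-circuit when one side is then empty, and cap D so the caller can fall back to something coarser. The function names and the `max_d` cutoff are assumptions made up for this example; this is not a description of what any particular diff tool does.

```python
def myers_distance(a, b, max_d=None):
    """Length of the shortest edit script (insertions + deletions) via the
    greedy Myers search. Returns None if it exceeds max_d."""
    n, m = len(a), len(b)
    limit = n + m if max_d is None else min(max_d, n + m)
    v = {1: 0}  # v[k] = furthest x reached on diagonal k = x - y
    for d in range(limit + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]      # step down (insertion)
            else:
                x = v[k - 1] + 1  # step right (deletion)
            y = x - k
            while x < n and y < m and a[x] == b[y]:
                x += 1            # follow the diagonal for free
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return None


def quick_diff_size(a, b, max_d=1000):
    """Hypothetical wrapper: cheap checks first, capped Myers search second."""
    limit = min(len(a), len(b))
    start = 0
    while start < limit and a[start] == b[start]:
        start += 1
    end = 0
    while end < limit - start and a[len(a) - 1 - end] == b[len(b) - 1 - end]:
        end += 1
    a = a[start:len(a) - end]
    b = b[start:len(b) - end]
    if not a or not b:
        return len(a) + len(b)  # pure insertion or deletion, no search needed
    return myers_distance(a, b, max_d)  # None signals "too different"


print(quick_diff_size("hello world", "hello brave world"))
```

With the empty-side check, the "large text versus empty string" case returns immediately instead of walking through every value of D.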

Algorithm to calculate the sum of a mathematical series [closed]

Closed 7 years ago.
Recently I came across an online tool which, given a summation, calculates its closed-form formula. I have tried it on many such summations and it has given me the correct answer each time.
I was curious as to which algorithm it uses to solve them.
EDIT: It turns out the tool uses the Wolfram Alpha API. But even if you search on Wolfram Alpha directly, you will get the same result.

Interview question: Sorting 20 GB of data [closed]

Closed 8 years ago.
Given 20 GB of data (normally numbers) and only 1 GB of RAM, how will you sort the data?
You can use something similar to merge sort.
Split the data into 20 groups, sort each group in memory, and write it to disk. Once they're sorted, read from all groups simultaneously using a buffer and write out the ordered master set. For this last merge step you should only need constant memory.
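A minimal sketch of that approach, assuming the numbers are integers and that each sorted run is stored as a text file with one number per line. The function name and the `chunk_size` parameter are made up for this example; `heapq.merge` does the final k-way merge lazily, holding only one pending value per run.

```python
import heapq
import os
import tempfile


def external_sort(numbers, chunk_size):
    """Yield the integers from `numbers` in sorted order while holding at
    most `chunk_size` of them in memory during the run-building phase."""
    run_paths = []
    chunk = []

    def flush():
        # Sort the current chunk in memory and write it out as one run.
        chunk.sort()
        fd, path = tempfile.mkstemp(suffix=".run", text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{x}\n" for x in chunk)
        run_paths.append(path)
        chunk.clear()

    for x in numbers:
        chunk.append(x)
        if len(chunk) >= chunk_size:
            flush()
    if chunk:
        flush()

    # Merge phase: read all runs simultaneously, one line at a time.
    files = [open(p) for p in run_paths]
    try:
        yield from heapq.merge(*(map(int, f) for f in files))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)


print(list(external_sort([5, 3, 9, 1, 4, 8, 2], chunk_size=3)))
```

For the 20 GB case, `numbers` would be a lazy reader over the input file, and `chunk_size` would be picked so that one chunk fits comfortably in the 1 GB of RAM.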

Are applications such as image processing naturally slow in Scheme for lacking a random-access data-structure? [closed]

Closed 9 years ago.
Scheme lists are slow for random access, which is a common operation in many applications such as image processing. Does this make Scheme inherently handicapped for that kind of application?
If performance is a concern, then you should definitely consider using fixed-access-time structures. Fortunately, Scheme has lots of these, too. The "vector" is the simplest one; it's a close match to what most languages call an "array".
