Evaluate compression algorithm [closed]

I'm researching compression algorithms (Huffman coding and LZ77) and was wondering how I would evaluate their efficiency depending on the input image. I know how they work, but I can't find information on how to evaluate them mathematically. Thanks!

General-purpose (universal) compressors like LZ77 are usually compared by testing them against a standard set of sources and comparing the results; see, for example: http://www.maximumcompression.com/ and http://www.maximumcompression.com/data/summary_mf.php.
Compressors for specific purposes are tested against source sets that are chosen to be as representative as possible.
For some applications it is also useful to place mathematical bounds on compression efficiency in terms of the source entropy.
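As a concrete starting point for the mathematical side, here is a minimal sketch (my own, not taken from the links above) that computes the zero-order (byte-frequency) Shannon entropy of an input file, which is a lower bound in bits per byte under a memoryless model, and compares it with what zlib (DEFLATE, i.e. LZ77 followed by Huffman coding) actually achieves. The file name input.png is just a placeholder.

```python
# Minimal sketch: zero-order entropy bound vs. an actual LZ77+Huffman
# compressor (zlib/DEFLATE). "input.png" is a placeholder file name.
import math
import zlib
from collections import Counter

def zero_order_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, treating bytes as independent symbols."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def evaluate(data: bytes) -> None:
    h = zero_order_entropy(data)
    compressed = zlib.compress(data, 9)            # DEFLATE = LZ77 + Huffman
    achieved = 8 * len(compressed) / len(data)     # bits per input byte
    print(f"zero-order entropy bound: {h:.3f} bits/byte")
    print(f"zlib achieved:            {achieved:.3f} bits/byte")
    print(f"compression ratio:        {len(data) / len(compressed):.2f}:1")

if __name__ == "__main__":
    with open("input.png", "rb") as f:
        evaluate(f.read())
```

Keep in mind that the zero-order bound ignores dependencies between symbols, so an LZ77-based compressor can legitimately go below it on data with long repeats; higher-order or conditional entropy gives tighter bounds for such sources.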

Related

What do resilient, robust, and resistant mean for an algorithm? [closed]

I have problems with certain algorithmic terms.
What is a robust algorithm?
What is a resistant algorithm?
What is a resilient algorithm?
Thank you in advance.
These attributes have no exact definition, so what they mean depends on your topic/problem.
They are all used to describe algorithms that can cope with some kind of error (e.g. outliers or noise) in the input data and still deliver a useful, or the expected, result.
In general, you specify the kind of errors the algorithm is expected to handle and how it should behave when they occur.
E.g. 'For an input with less than 5% outliers, this algorithm returns a result with an accuracy of 99%.'
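As a toy illustration (mine, not part of the original answer) of what "cope with outliers" can mean: the median of a sample stays useful when a small fraction of the input is corrupted, while the mean does not.

```python
# Toy illustration: the median as a "robust" estimator that still delivers a
# useful result when 5% of the input values are gross outliers.
import statistics

clean = [10.0] * 95
corrupted = clean + [1e6] * 5           # 5% outliers

print(statistics.mean(corrupted))       # 50009.5  -> dominated by the outliers
print(statistics.median(corrupted))     # 10.0     -> unaffected
```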

Find duplicate images algorithm [closed]

I want to create a program that finds duplicate images in a directory, something like this app does, and I wonder what algorithm I would use to determine whether two images are the same.
Any suggestions are welcome.
Depending on your use case, this task can be solved by perceptual hashing, combined with a data structure for nearest-neighbor search in high dimensions (k-d tree, ball tree, ...) that can replace the brute-force search (somewhat).
There are tons of approaches for images: DCT-based, wavelet-based, statistics-based, feature-based, CNN-based, and more.
Their designs are usually based on different assumptions about the task, e.g. whether rotation is allowed or not.
A Google Scholar search for perceptual image hashing will list a lot of papers. You can also look for the term image fingerprinting.
Here is some older, ugly Python/Cython code doing the statistics-based approach.
Remark: digiKam can do that for you too; I think it uses an older Haar-wavelet-based approach.
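To make the idea concrete, here is a minimal average-hash (aHash) sketch; it is my own illustration, not the statistics-based code linked above. It assumes Pillow is installed, and a.jpg / b.jpg are placeholder paths.

```python
# Minimal average-hash (aHash) sketch; requires Pillow. "a.jpg" and "b.jpg"
# are placeholder paths. Not the statistics-based code linked above.
from PIL import Image

def average_hash(path: str, hash_size: int = 8) -> int:
    """Downscale to hash_size x hash_size grayscale and threshold at the mean."""
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:                     # pack one bit per pixel into an int
        bits = (bits << 1) | (p >= mean)
    return bits

def hamming_distance(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

d = hamming_distance(average_hash("a.jpg"), average_hash("b.jpg"))
print("hamming distance:", d)
```

Two images whose hashes have a small Hamming distance (e.g. a handful of bits out of 64) are likely near-duplicates; the exact threshold, and whether aHash is robust enough, depends on which transformations (cropping, rotation, recompression) you need to tolerate.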

What are some heuristics for choosing a diff algorithm? [closed]

The Myers diff algorithm performs well when the differences between the two texts are small, because most simple implementations have complexity O((N+M) * D), where N and M are the lengths of the inputs and D is the size of the difference. However, when the differences are large, it takes a very long time to run. For example, if one of the texts is large and the other is the empty string, many implementations take several minutes to run.
If you knew the differences were large, then you could choose a different algorithm. How do diff tools make this determination in practice?
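I can't say how specific diff tools decide this internally, but one cheap pre-check you could run yourself is sketched below (my own illustration; old.txt, new.txt, and the 0.5 threshold are placeholders): compare the line multisets of the two inputs. The number of shared lines is an upper bound on the length of the longest common subsequence, so if it is small, the diff D is guaranteed to be large and a plain Myers run will be expensive.

```python
# Hedged sketch of a cheap pre-check before running Myers diff. The shared
# line count (as a multiset) is an upper bound on the longest common
# subsequence, so a small value guarantees a large diff D. File names and
# the 0.5 threshold are placeholders.
from collections import Counter

def estimated_difference_ratio(a_lines: list[str], b_lines: list[str]) -> float:
    ca, cb = Counter(a_lines), Counter(b_lines)
    shared = sum((ca & cb).values())            # shared lines, with multiplicity
    total = max(len(a_lines), len(b_lines), 1)
    return 1.0 - shared / total                 # 0.0 = identical, 1.0 = disjoint

with open("old.txt") as f:
    a = f.read().splitlines()
with open("new.txt") as f:
    b = f.read().splitlines()

if estimated_difference_ratio(a, b) > 0.5:
    print("difference looks large; consider a simpler fallback strategy")
else:
    print("difference looks small; Myers should be fast")
```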

Are applications such as image processing naturally slow in Scheme for lacking a random-access data-structure? [closed]

Scheme lists are slow for random access, which is a common operation in many applications such as image processing. Does this make the language naturally handicapped for that kind of application?
If performance is a concern, then you should definitely consider using fixed-access-time structures. Fortunately, Scheme has lots of these, too. The "vector" is the simplest one; it's a close match to what most languages call an "array".

Algorithm reference [closed]

This is a trivial question, but something I always miss in day-to-day programming.
Is there a good lookup reference available for the common algorithms we usually face in everyday programming: sorting, sequences, graphs?
The emphasis is more on applicability and pseudocode, rather than mathematical proofs (which I find is what books tend to stress).
The idea is to keep a ready reference for when we need to apply one of these algorithms in our respective development projects and languages.
Dictionary of Algorithms and Data Structures
How about this?
List of algorithms (Wikipedia)
