Algorithms under Plagiarism detection machines [closed] - algorithm

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I'm very impressed by how plagiarism checkers (such as the Turnitin website) work. But how do they do it so effectively? I'm new to this area, so is there a word-matching algorithm, or something similar, that is used to detect similar sentences?
Thank you very much.

I'm sure many real-world plagiarism detection systems use more sophisticated schemes, but the general problem of measuring how far apart two sequences are is captured by the edit distance. Reference material on edit distance covers many of the common algorithms used for this purpose. The gist is answering the question "How many edits must I perform to turn one input into the other?" The challenge for real-world systems is doing this efficiently across a large corpus. A related problem is the longest common subsequence, which might also be useful for identifying passages that are copied verbatim.
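As a rough illustration of the two ideas above, here is a minimal Python sketch of the Levenshtein edit distance and of the longest common subsequence length. The function names `edit_distance` and `lcs_length` are made up for this example, and this is the textbook dynamic-programming formulation, not what any particular plagiarism checker is known to use.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-item insertions,
    deletions, or substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, item_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty prefix takes i deletions
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # delete item_a
                            curr[j - 1] + 1,      # insert item_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for item_a in a:
        curr = [0]
        for j, item_b in enumerate(b, start=1):
            if item_a == item_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Both functions accept strings (character level) or lists of words.
print(edit_distance("kitten", "sitting"))
print(lcs_length("the quick brown fox".split(), "the brown lazy fox".split()))
```

Both run in time proportional to the product of the two input lengths, which is why comparing one document against a large corpus needs something smarter than pairwise comparison.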

Related

Find duplicate images algorithm [closed]

Closed 5 years ago.
I want to create a program that finds duplicate images in a directory, something like what this app does, and I wonder what algorithm would determine whether two images are the same.
Any suggestion is welcome.
This task can be solved with perceptual hashing, combined (depending on your use case) with a data structure for nearest-neighbor search in high dimensions (kd-tree, ball tree, ...), which can replace the brute-force search to some extent.
There are tons of approaches for images: DCT-based, wavelet-based, statistics-based, feature-based, CNNs (and more).
Their designs are usually based on different assumptions about the task, e.g. whether rotated copies should still count as matches.
A Google Scholar search for "perceptual image hashing" will list a lot of papers. You can also look for the term "image fingerprinting".
Here is some older, ugly Python/Cython code implementing the statistics-based approach.
Remark: digiKam can do that for you too. I think it uses an older Haar-wavelet-based approach.
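To make the perceptual-hashing idea concrete, here is a minimal sketch of an "average hash", one of the simplest perceptual hashes. It is not the DCT-based or wavelet-based variant mentioned above, and it assumes the image has already been converted to grayscale and shrunk to a tiny grid (commonly 8x8) by some image library; the names `average_hash` and `hamming_distance` are chosen for this example.

```python
def average_hash(pixels):
    """Hash a small grayscale image given as a 2D list of brightness values.

    Each pixel contributes one bit: 1 if it is brighter than the mean of the
    whole grid, 0 otherwise. Assumes the image was already downscaled.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


original = [[10, 200], [30, 220]]
brighter = [[30, 220], [50, 240]]   # same picture, uniformly brighter
mirrored = [[200, 10], [220, 30]]   # left and right halves swapped

print(hamming_distance(average_hash(original), average_hash(brighter)))
print(hamming_distance(average_hash(original), average_hash(mirrored)))
```

Two images are treated as likely duplicates when the Hamming distance between their hashes is below a threshold you choose. With many images, the nearest-neighbor structures mentioned above replace comparing every pair.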

Advanced Rudimentary Computing? [closed]

Closed 8 years ago.
Let's say that my definition of 'rudimentary programming' refers to the fundamental tools employed for a computer to perform a task.
Considering programming rudiments, the learning spectrum usually looks something like this:
Variables, data types and variable memory
Arrays/Lists and their manipulation
Looping and conditionals
Functions
Classes
Multi threading/processing
Streams (hard-disk and web)
My question is, have I missed any of the major rudiments? Is there a 'next' to the spectrum that still eludes me?
I think you missed the most important one: algorithms. That means understanding their complexity, knowing when and why to use them and, more importantly, how to implement them.
I'm pretty sure you already know a lot about algorithms, but if you think your tool knowledge (i.e. the programming languages) is good enough, you should start focusing more on the algorithms.
A great book to start with is Introduction to Algorithms by Thomas H. Cormen et al.
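As a small illustration of why complexity matters, the sketch below counts how many elements two search strategies inspect on the same sorted list. The helper names and the one-million-element list are assumptions made up for this example.

```python
def linear_search_steps(items, target):
    """Scan left to right; return how many elements were inspected."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Halve the search range each time; return how many elements were inspected."""
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


data = list(range(1_000_000))  # already sorted
print(linear_search_steps(data, 999_999))   # inspects every element
print(binary_search_steps(data, 999_999))   # inspects at most 20 elements
```

Linear search is O(n) and binary search is O(log n), but binary search only works on sorted data, which is the kind of trade-off that knowing the algorithms helps you judge.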

How would be an algorithm to simulate human interaction? [closed]

Closed 8 years ago.
Let's suppose that androids that physically resemble humans are a reality.
What would be an algorithm to make one interact with human beings if we want it to:
1) be indistinguishable from regular people in behavior
2) be as equally friendly to everyone as possible?
I understand that it is very hard to write an algorithm like that. I can, however, imagine an android simulating human behavior fairly well with some sort of machine learning technique.
But how would we train it? Collecting the data would also be a very big problem.
Which machine learning technique would be ideal?
If you treat requirement 1 as a hard requirement, such an algorithm would pass the Turing test, at least to some extent, so it would be a very advanced (world-class) algorithm.
Your problem is essentially equivalent to passing the Turing test, so check the linked article for the scientific literature produced by people working on this problem.
Assuming data availability and processing power are essentially unbounded, I believe an artificial neural network would be the best candidate to base such an algorithm on.

List of interesting and useful Algorithms [closed]

Closed 8 years ago.
I am on a quest to understand the most important algorithms that the SO community has used in real-world applications. I know a ready-made list can be extracted from the wiki page, but I am interested only in those algorithms or problems that the community has faced either in their projects or in interviews. A few lines on each algorithm would also be helpful.
I am looking beyond the generic categories such as D&C, DP, greedy...
If you are interested in optimization problems, which come up in many kinds of applications such as network and socket programming, these could be useful for you:
Nearest Neighbour
Munkres (also known as the Hungarian algorithm)
Brute Force
Min & Max Finding Algorithms
Ant and Bee Colony Algorithms
General Genetic Algorithms etc.
I strongly advise you to look up all of the above, but genetic algorithms and ant colony algorithms in particular are asked about by many interviewers.
I hope that helps.
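To give one of the items above some shape, here is a minimal sketch of the brute-force approach applied to the assignment problem, the same problem the Munkres (Hungarian) algorithm solves. The function name `brute_force_assignment` and the cost matrix are invented for this example.

```python
from itertools import permutations


def brute_force_assignment(cost):
    """Try every one-to-one assignment of workers (rows) to tasks (columns)
    and return (lowest total cost, task chosen for each worker).

    This checks all n! permutations, so it is only usable for small n.
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(cost[worker][task] for worker, task in enumerate(perm))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


cost_matrix = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
print(brute_force_assignment(cost_matrix))
```

The Munkres algorithm reaches the same optimum in polynomial time, which is why it is preferred once the matrix grows beyond a handful of rows.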

Algorithm vs Code [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I came across a statement in a software best-practices guide that algorithm and code shouldn't get mixed up. I'm not sure what is meant by this. As far as I understand, code is the implementation of the algorithm, isn't it? So what exactly does this statement mean, and why is it considered good practice?
Thank You!
The context would be clearer if you had pasted the lines surrounding the author's statement.
What it means to me, though, is that an algorithm is just the clear step-by-step logic that you will implement. While writing or designing the algorithm, you leave out the finer implementation details, such as choosing the right data structure.
A good explanation can be found here:
An algorithm is a series of steps for solving a problem, completing a task or performing a calculation. Algorithms are usually executed by computer programs but the term can also apply to steps in domains such as mathematics for human problem solving.
Code is a series of steps that machines can execute. In many cases, code is composed in a high level language that is then automatically translated into instructions that machines understand.
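One way to picture the distinction is to put the two side by side. In the sketch below, the comment block states Euclid's algorithm for the greatest common divisor in language-independent steps, and the function is one possible piece of code implementing it; the choice of Euclid's algorithm is just an example, not something taken from the guide in question.

```python
# Algorithm (Euclid's method, independent of any programming language):
#   1. Take two non-negative integers a and b.
#   2. If b is 0, the answer is a.
#   3. Otherwise replace (a, b) with (b, a mod b) and go back to step 2.


def gcd(a: int, b: int) -> int:
    """One Python implementation of the steps above."""
    while b != 0:
        a, b = b, a % b
    return a


print(gcd(48, 18))
```

The same three steps could equally be written recursively or in another language; the algorithm stays the same while the code changes.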
