Minimum Knowledge required in Datastructres [closed] - data-structures

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
I have found numerous data-structures on wikipedia, also have also looked into several books in data-structures and found that they vary. I want to know what are the basic or minimum list of data-structure knowledge a new CS graduate should have?
Also is it necessary to know their implementation in more than one programming knowledge considering there is a difference in the implementation. If i know the implementation of Linked list in C should i know its Java based implementation?
It would be great if you could help me understand categorically:
Basic datastructures(necessary for a CS grad)
Advanced Datastructures
Edit : i am more interested in the list of data structures.

Have a look at Introduction to Algorithms by Cormen et al. In my experience, if you know what is in there you are set for anything coming at you.
I would not consider knowing any implementation very useful. If you know the basics you should be able to implement your own version quickly, but chances are you will never have to because there are libraries for that. So the rule for practice is: know your libraries!
Even so, it is important that you know properties of data structures (e.g. space overhead, runtimes of central operations, behaviour under concurrent accesses, (im)mutability, ...) so you will always use the one best suited to your task at hand.

This question is really a little too broad, even the way you've narrowed it down, because it depends on what sort of future path you're looking at. Grad school? PhD track? Industry? Which industry?
But as a rough minimum, I'd say, take a look at CLRS (as Raphael suggests) and pick out the following:
Linked lists, and the variations like stacks, queues, etc.
Basic heaps
Basic hash tables
Trees, especially including binary search trees, and preferably familiarity with at least one self-balancing BST
Graphs, both matrix-representation and adjacency list representation
And probably some more based on what sort of job you're looking for. As someone on a PhD track... well. All of them. At some point you will take a qualifier and be expected to know most of them.

Check out the MIT's OCW Intro to Algorithm Course It is great tutorial theoretically.
For practicing data structures in Java check : Data Structures & Algorithms in Java by Robert Lafore, it is excellent.
Implementation in one language is sufficient, but try to solve it in structured-oriented language like C and OO language like Java/ C++. This will help a lot while preparing for interviews.
One good resource for basic data structures in C : here

Related

Need help regarding programming challenges solving algorithms [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I want to improve my programming skills, when I participate in some programming competition I feel that every challenge is so tough that i can not solve it,I have good knowledge of coding but I fell to decide the algorithm needed to solve particular problem for that can anyone tell me which books to read
I would suggest first to get comfortable with programming language of your choice. Once you have confidence on your language and Data structure, you can proceed confidently for any programming challenges. Make a habit of writing complete code with all edge cases handled on a sheet of paper rather than simple pseudo code for your practice session.
Now to solve algorithmic problem first to grasp elementary algo functioning via book or online resources. If you are using coreman (good book for algo) then you might want to understand basic concepts of different sorting techniques, heap, queue, hashing, greedy and dynamic algorithm. For some topic i would recommend to research online as well - like dynamic programming and hashing. Almost 70-80% interview questions are either hashing or DP based. Then look for major examples and their solution for these algorithm. Once your mind will set up you would be able to think quickly for any algorithmic problem.
Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest , Clifford Stein is a good one to start with.
Covers almost everything, from graph theory to geometric algorithms and all related data structures, furthermore they use the commonly used "Big O" notation to indicate the efficiency of the algorithms explained. Most of the time multiple algorithms are presented for the same problem, together with their advantages and disadvantages.

Is there any online judge for data mining [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 9 years ago.
Improve this question
There are many Online Judges (OJ) for ACM/ICPC questions. And another Online Judge for Interview questions, named Leetcode (http://leetcode.com).
I think these OJs are very useful for us to learn algorithms. Recently, I am going to learn data mining algorithms. Is there any OJ for data mining questions?
Thank you very much.
There is MLcomp, where you can submit an algorithm and it will run it on a number of data sets to judge how well it is doing.
Plus, there is Kaggle, which hosts various classification competitions.
And of course you can do classes at Cousera. These are pretty much low level, but in order to get submission points you need to reproduce the known performance.
In particular the first also allows you to run several standard algorithms such as naive bayes and SVM and see how well they did. Obviously, your own implementation should perform similar then.
Unfortunately, both are pretty much focused on machine learning (i.e. classification and regression). There is very little in the unsupervised domain, clustering and outlier detection. On unlabeled data, things get too hard even to evaluate locally, so doing any kind of online judging is pretty much unsolved. What you can do is largely a one-class classification, or you just strip labels before running the algorithm.

Naive approaches to detecting plagiarism? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
Let's say you wanted to compare essays by students and see if one of those essays was plagiarized. How would you go about this in a naive manner (i.e. not too complicated approach)? Of course, there are simple ways like comparing the words used in the essays, and complicated ways like using compressing functions, but what are some other ways to check plagiarism without too much complexity/theory?
There are several papers giving several approaches, I recommend reading this
The paper shows an algorithm based on an index structure
built over the entire file collection.
So they say their algorithm can be used to find similar code fragments in a large software system. Before the index is built, all the files in the
collection are tokenized. This is a simple parsing problem, and can be solved in
linear time. For each of the N files in the collection, The output of the tokenizer
for a file F_i is a string of n_i tokens.
here is other paper you could read
Other good algorithm is a scam based algorithm that consists on detecting plagiarism by making comparison on a set of words that are common between test document
and registered document. Our plagiarism detection system, like many Information Retrieval systems, is evaluated with metrics of precision and recall.
You could take a look at Dick Grune's similarity comparator, which claims to work on natural language texts as well (I've only tried it on software). The algorithms are described as well. (By the way, his book on parsing is really good, in my opinion.)

Resource for learning Algorithms for non-CS/Math degrees [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
Improve this question
I've been asked to recommend a resource (on-line, book or tutorial) to learn Algorithms (in the sense of of the MIT Intro to Algorithms) for non-CS or Math majors. Obviously the MIT book is way too involved and some of the lighter treatments (like OReilly's Algorithms in a Nutshell) still seem as if you would need to have some background in algorithmic analysis. Is there resource that presents the material in a way that developers who do not have a background in theoretical computer science will find useful?
I think the best way to learn algorithms are through the various competition sites.
USACO - my personal favorite, as it gives a clear path through the material
TopCoder - already mentioned
Sphere Online Judge - great if you want to work in another language other than C/C++/Java
As far as books, the best single intro I've seen for the non-math specialist is Data Structures and Algorithms. It takes you through an algorithm line by line and shows you how it decomposes mathematically, something CLRS's otherwise excellent analysis section is a little less clear on.
Skiena's Algorithm Design Manual is also excellent, as is his Programming Challenges, which is essentially a tutorial through the Valladolid Online Judge.
Honestly, though, I think the single most helpful thing a beginner can do is to implement the various algorithms -- merge sort, say, followed by Quicksort -- and time them against variously sized inputs. Create a spreadsheet with a graph that shows their growth over time. Very few non-specialists will have the patience or the know-how to set up a recurrence relation and solve their way through it. But you must understand the effect of, say O n^2 growth over time, and there's no better way to learn this than to watch your own program blow through its memory stack. :)
I say this as a non-CS, non-math programmer who has spent a good couple of months wrapping my mind around algorithmic analysis.
I'd go for the Algorithm Design Manual, by Steven Skiena. It's very readable and starts with the basics in an easy-to-understand way. For example, it explains big-O notation very well. The emphasis is on practical application, which is a big bonus for beginners coming from a non-theoretical field.
The second half of the book is a reference of common algorithm problems and practical approaches to their solutions. I found it invaluable as a learning aid, and now as a reference.
I'm not sure which MIT book you're referring to, but the canonical text is CLRS. I don't think it really assumes any background besides high school math.
Personally, I found doing TopCoder algorithm competitions over the course of the past few years to be the best way for me to learn common algorithms and put them into practice. Perhaps you should try the same. Whatever you do, I suggest that you spend a lot more hands-on-keyboard time implementing things you learn than head-in-book time, because that's the way to really internalize different techniques.

Which data structures and algorithms book should I buy? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
Improve this question
I know C and C++ and I have some experience with Java, but I don't know too much about Algorithms and Data Structures.
I did a search on Amazon, but I don't know what book should I choose. I don't want a book which put its basis only on the theoretic part; I want the practical part too (probably more than the theoretical one :) ).
I don't want the code to be implemented in a certain language, but if is in Java, probably I would happier. :)
Don't buy any book use
MIT OCW
.
Introduction to Algorithms by Cormen et. al. is a standard introductory algorithms book, and is used by many universities, including my own. It has pretty good coverage and is very approachable.
And anything by Robert Sedgewick is good too.
I think introduction to Algorithms is the reference books, and a must have for any serious programmer.
http://en.wikipedia.org/wiki/Introduction_to_Algorithms
Other fun book is The algorithm design manual http://www.algorist.com/. It covers more sophisticated algorithms.
I can't not mention The art of computer programming of Knuth
http://www-cs-faculty.stanford.edu/~knuth/taocp.html
If you want the algorithms to be implemented specifically in Java then there is Mitchell Waite's Series book "Data Structures & Algorithms in Java". It starts from basic data structures like linked lists, stacks and queues, and the basic algorithms for sorting and searching. Working your way through it you will eventually get to Tree data structures, Red-Black trees, 2-3 trees and Graphs.
All-in-all its not an extremely theoretical book, but if you just want an introduction in a language you are familiar with then its a good book. At the end of the day, if you want a deeper understanding of algorithms you're going to have to learn some of the more theoretical concepts, and read one of the classics, like Cormen/Leiserson/Rivest/Stein's Introduction to Algorithms.
If you don't need in a complete reference to the most part of algorithms and data structures that are in use and just want to get acquainted with common techniques I would recommend something more lightweight than Cormen, Sedgewick or Knuth. I think, Algorithms and Data Structures by N. Wirth is not as bad choice even in spite of it was printed far ago.

Resources