I want to implement a sort algo in Whitespace. As you know, Whitespace only uses a stack and a heap. Any idea which sort algo can be easily implemented?
Since Whitespace only has a single stack and a single heap, an in-place sorting algorithm would be ideal.
Insertion sort was the first that came to mind, where the sorting and reading from STDIN can be combined into one. Looking into it a bit further: Shell sort with an Insertion implementation is probably your best bet. In fact, on Rosetta Code's Shell sort page I've been able to find a Whitespace sort program, including explanation and working ideone link†.
† Since Whitespace compilers can each have their own implementation, this algorithm doesn't work in all online Whitespace compilers. In TIO, for example, it gives an error because of negative indices. And the Whitespace vii5ard compiler has unfortunately been broken for a couple of months now, so I'm not sure whether it would have worked there. I might take another look later on to modify the program so it works on TIO as well, unless you want to try to do this yourself to better understand the sorting algorithm used in the Whitespace program.
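For reference, here is what Shell sort with an insertion-style inner loop looks like in a conventional language; a minimal Python sketch (the halving gap sequence is chosen for simplicity, not as what the Whitespace program necessarily uses):

    def shell_sort(items):
        """Shell sort: insertion sort run over gap-separated elements,
        with the gap shrinking each pass; at gap 1 it IS insertion sort,
        which is why the two combine naturally."""
        gap = len(items) // 2
        while gap > 0:
            for i in range(gap, len(items)):
                current = items[i]
                j = i
                # insertion step: shift larger gap-neighbours to the right
                while j >= gap and items[j - gap] > current:
                    items[j] = items[j - gap]
                    j -= gap
                items[j] = current
            gap //= 2
        return items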
Related
At times (and some may argue more often than not) it's required to sort things the way a human would sort them (alphabetically, not ASCIIbetically), yet to my knowledge there is no convenient way to do so in C++ (I'm not even aware of a Boost solution), and people have to roll their own. Has this been rectified in C++11 or later?
Is there a predicate or algorithm included in the C++ standard (from C++11 onwards) that allows sorting as humans would?
That is, sorting strings like "z1, z2, z3" instead of "z1, z10, z100, z2, z20, z3", etc.
If not, is there some logical reason I've missed that they wouldn't include this functionality?
The answer:
No, there is nothing like that. There is one standard way of comparing strings: operator<. std::less defaults to this too, and it's based on a lexicographical compare. Also, by a "normal" sort I would understand std::sort, so this is it.
A slight nitpick: I don't think your concept is alphabetical sorting, since it does not compare on the character level. "Alphabetical" as opposed to "ASCIIbetical" would merely be a different ordering of the letters.
Your concept is more like alphanumerical sorting: it needs to interpret the string and treat a run of digits as an atomic unit for comparison. This needs some parsing logic.
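To illustrate the kind of parsing logic involved, a minimal sketch in Python (the same digit-run splitting would port to a hand-rolled C++ comparator):

    import re

    def natural_key(s):
        """Split a string into text and digit runs, comparing digit runs
        as integers, so that "z2" sorts before "z10"."""
        return [int(chunk) if chunk.isdigit() else chunk
                for chunk in re.split(r"(\d+)", s)]

    print(sorted(["z1", "z10", "z100", "z2", "z20", "z3"], key=natural_key))
    # ['z1', 'z2', 'z3', 'z10', 'z20', 'z100']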
An opinion: I believe there is not a single thing that every human would do the same way. This might be a bit extreme, but even for the common-sense example there would be ambiguity. Let's take your example and the standard sort program included in most GNU systems. It has a switch -h for "human numeric" sort, and it does roughly what you describe; however, it also checks the number's suffix, so 2K is smaller than 2G. It is not obvious which option to choose. After this there would come the question of what to do about fractional numbers, and so on.
To do this properly would require quite a sophisticated architecture and logic, and/or making non-obvious design choices.
Does anyone have insight into the typical big-O complexity of a compiler?
I know it must be >= n (where n is the number of lines in the program), because it needs to scan each line at least once.
I believe it must also be >= n log n for a procedural language, because the program can introduce O(n) variables, functions, procedures, types, etc., and when these are referenced within the program it will take O(log n) to look up each reference.
Beyond that my very informal understanding of compiler architecture has reached its limits and I am not sure if forward declarations, recursion, functional languages, and/or other tricks will increase the algorithmic complexity of the compiler.
So, in summary:
For a 'typical' procedural language (C, Pascal, C#, etc.), is there a limiting big-O for an efficiently designed compiler (as a measure of the number of lines)?
For a 'typical' functional language (Lisp, Haskell, etc.), is there a limiting big-O for an efficiently designed compiler (as a measure of the number of lines)?
This question is unanswerable in its current form. The complexity of a compiler certainly wouldn't be measured in lines of code or characters in the source file. That would describe the complexity of the parser or lexer, but no other part of the compiler will ever even touch that file.
After parsing, everything will be in terms of various ASTs representing the source file in a more structured manner. A compiler will have a lot of intermediate languages, each with its own AST. The complexity of various phases would be in terms of the size of the AST, which doesn't correlate at all to the character count, or necessarily even to the previous AST.
Consider this: we can parse most languages in time linear in the number of characters and generate some AST. Simple operations such as type checking are generally O(n) for a tree with n leaves. But then we'll translate this AST into a form with potentially double, triple, or even exponentially more nodes than the original tree. Now we again run single-pass optimizations on our tree, but this might be O(2^n) relative to the original AST, and lord knows what relative to the character count!
I think you're going to find it quite impossible to even find what n should be for some complexity f(n) for a compiler.
As a nail in the coffin, compiling some languages is undecidable, including Java, C#, and Scala (it turns out that nominal subtyping + variance leads to undecidable typechecking). Of course, C++'s templating system is Turing-complete, which makes deciding whether compilation terminates equivalent to the halting problem (undecidable). Haskell + some extensions is undecidable too. And there are many others that I can't think of off the top of my head. There is no worst-case complexity for these languages' compilers.
Reaching back to what I can remember from my compilers class... some of the details here may be a bit off, but the general gist should be pretty much correct.
Most compilers actually have multiple phases that they go through, so it'd be useful to narrow down the question somewhat. For example, the code is usually run through a tokenizer that pretty much just creates objects to represent the smallest possible units of text. var x = 1; would be split into tokens for the var keyword, a name, an assignment operator, and a literal number, followed by a statement finalizer (';'). Braces, parentheses, etc. each have their own token type.
The tokenizing phase is roughly O(n), though this can be complicated in languages where keywords can be contextual. For example, in C#, words like from and yield can be keywords, but they could also be used as variables, depending on what's around them. So depending on how much of that sort of thing you have going on in the language, and depending on the specific code that's being compiled, just this first phase could conceivably have O(n²) complexity. (Though that would be highly uncommon in practice.)
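As a toy illustration of that phase, a regex-driven tokenizer is a single left-to-right scan; the token set below is hypothetical, and Python is used for brevity:

    import re

    # hypothetical token set for a tiny language
    TOKEN_RE = re.compile(r"""
        (?P<KEYWORD>\bvar\b)
      | (?P<NAME>[A-Za-z_]\w*)
      | (?P<NUMBER>\d+)
      | (?P<ASSIGN>=)
      | (?P<END>;)
      | (?P<SKIP>\s+)
    """, re.VERBOSE)

    def tokenize(source):
        """One left-to-right pass over the text: O(n) in the character count."""
        for match in TOKEN_RE.finditer(source):
            if match.lastgroup != "SKIP":      # whitespace is discarded
                yield match.lastgroup, match.group()

    print(list(tokenize("var x = 1;")))
    # [('KEYWORD', 'var'), ('NAME', 'x'), ('ASSIGN', '='), ('NUMBER', '1'), ('END', ';')]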
After tokenizing, then there's the parsing phase, where you try to match up opening/closing brackets (or the equivalent indentations in some languages), statement finalizers, and so forth, and try to make sense of the tokens. This is where you need to determine whether a given name represents a particular method, type, or variable. A wise use of data structures to track what names have been declared within various scopes can make this task pretty much O(n) in most cases, but again there are exceptions.
In one video I saw, Eric Lippert said that correct C# code can be compiled in the time between a user's keystrokes. But if you want to provide meaningful error and warning messages, then the compiler has to do a great deal more work.
After parsing, there can be a number of extra phases including optimizations, conversion to an intermediate format (like byte code), conversion to binary code, just-in-time compilation (and extra optimizations that can be applied at that point), etc. All of these can be relatively fast (probably O(n) most of the time), but it's such a complex topic that it's hard to answer the question even for a single language, and practically impossible to answer it for a genre of languages.
As far as I know:
It depends on the type of parser the compiler uses in its parsing step.
The main types of parsers are LL and LR, and they have different complexities.
When faced with a problem that needs a filtering mechanism like sed, how do you analyse or model the problem so that you can solve it with sed? I am asking this question because I have found that deconstructing a sed program that solves a problem into its analytical constituents is very difficult for me. Doing analysis targeted towards the sed solution space, which involves filtering and cycles, really beats me.
You should think in sed's way of working (read a line, process it, and go on to the next). So, basically:
sed finds and changes patterns of characters.
sed can work with a cumulative buffer of lines (one buffer as memory, the "hold space", and one buffer for direct action, the "pattern space"), and these can be manipulated by appending one to the other, replacing, or swapping.
Also, there is a test mechanism (the t command) that can branch after a successful substitution.
One big thing is that, by default, sed works on only one line at a time. It reads the line from the input, processes it, then goes on to the next cycle. This means that, without buffering, there is no newline inside the line and one line cannot "see" another one.
sed is very efficient for quite simple tasks and can do very hard work (like games), but you need to think differently than in a C/Pascal/awk/shell script: often you have to pass through temporary states, like adding a marker or replacing text with a temporary pattern before going back to what you expect, a bit like working in Reverse Polish notation.
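To make that cycle concrete, here is a rough model of sed's default processing loop, sketched in Python (illustrative only; the s/foo/bar/ substitution is a placeholder, and this is not real sed):

    import re
    import sys

    def sed_cycle(stream):
        """Rough model of sed's default loop: one line at a time, with a
        'pattern space' under edit and a 'hold space' kept across cycles."""
        hold_space = ""
        for line in stream:
            pattern_space = line.rstrip("\n")   # read next line into pattern space
            # an s/foo/bar/ command; subn also reports whether it matched,
            # which is what the t (test) command branches on
            pattern_space, matched = re.subn(r"foo", "bar", pattern_space)
            hold_space = pattern_space          # 'h': copy pattern space to hold space
            print(pattern_space)                # auto-print, then start the next cycle

    sed_cycle(sys.stdin)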
Best is to give us a small concrete problem, and we can try to show you the way we (there are often several ways) would solve it.
I currently use soft tabs (i.e. spaces) for indenting my Ruby code. If I were to use hard tabs, would it increase performance when the code is interpreted? I assume it's faster to read one tab character than to parse four space characters (however negligible).
Do you have an idea of all the phases involved in interpreting from source? Only the very first one, lexical analysis, has to deal with whitespace, and in the case of whitespace, "deal with" means "ignore it". This phase only takes a tiny fraction of the total time; it's generally done using regular expressions and has pretty much linear complexity. Contrast that with parsing, which can take ages in comparison. And interpreting is only somewhat viable because those two phases (plus a third, bytecode generation, in implementations that use bytecode) take much less time than the actual execution for nontrivial programs.
Don't worry about this. There is no difference anyone would ever notice. Honestly, I'd be surprised if you could measure a difference using time and a small program that does close to no actual work.
Pretty sure that whatever negligible impact the parser may have between reading one byte for tabbed indentation vs. four bytes for spaces will be offset by the next person that has to read your code and fix your tabbed / spaced mess.
Please use spaces. Signed, the next guy to read your code.
Performance impact is ε, that is, a very small number greater than zero. The spaces only get read and parsed once; the Ruby code is then transformed into an intermediate form.
People in the Java/.NET world have frameworks which provide methods for sorting a list.
In CS, we all might have gone through Bubble/Insertion/Merge/Shell sorting algorithms.
Do you write any of it these days?
With frameworks in place, do you write code for sorting?
Do you think it makes sense to ask people to write code to sort in an interview? (Other than for an intern/junior developer position.)
There are two pieces of code I write today in order to sort data:

    list.Sort();
    enumerable.OrderBy(x => x); // Occasionally a different lambda is used
I work for a developer tools company, and as such I sometimes need to write new RTL types and routines. Sorting is something developers need, so it's something I sometimes need to write.
Don't forget, all that library code wasn't handed down from some mountain: some developer somewhere had to write it.
I don't write the sorting algorithm, but I have implemented IComparer in .NET for a few classes, which was kind of interesting the first couple of times.
I wouldn't write the code for sorting, given what is in the frameworks, in most cases. There should be an understanding of why a particular sorting strategy, like quicksort, is often used in frameworks like .NET.
I could see giving or being given a sorting question where some of the work is implementing the IComparer and understanding the different ways to sort a class. It would be a fairly easy thing to show someone a Bubble sort and ask, "Why wouldn't you want to do this in most applications?"
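For illustration, the textbook bubble sort in question looks something like this sketch; its O(n^2) behaviour is exactly what the interview question is probing:

    def bubble_sort(items):
        """Classic bubble sort: repeatedly swap adjacent out-of-order pairs.
        O(n^2) comparisons, which is why library sorts (O(n log n)) win
        for anything but tiny inputs."""
        n = len(items)
        for i in range(n):
            for j in range(n - 1 - i):
                if items[j] > items[j + 1]:
                    items[j], items[j + 1] = items[j + 1], items[j]
        return items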
I can say with 100% certainty that I haven't written one of the 'traditional' sort routines since leaving university. It's nice to know the theory behind them, but applying them to real-world situations that can't be handled by other means doesn't happen very often (at least in my experience...).
only on employer's interview/test =)
I wrote a merge sort when I had to sort multi-gigabyte files with a custom key comparison. I love merge sort - it's easy to comprehend, stable, and has a worst-case O(n log n) performance.
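A sketch of that external-sort pattern, assuming line-oriented files and with hypothetical file names (Python's heapq.merge does the k-way merge of sorted runs):

    import heapq
    import itertools
    import tempfile

    def external_sort(input_path, output_path, key, chunk_lines=1_000_000):
        """Sort a file too large for memory: sort chunks in RAM, spill each
        to a temp file, then k-way merge the sorted runs."""
        runs = []
        with open(input_path) as src:
            while True:
                chunk = list(itertools.islice(src, chunk_lines))
                if not chunk:
                    break
                chunk.sort(key=key)              # the custom key comparison
                run = tempfile.TemporaryFile(mode="w+")
                run.writelines(chunk)
                run.seek(0)
                runs.append(run)
        with open(output_path, "w") as dst:
            # heapq.merge lazily merges the sorted runs
            dst.writelines(heapq.merge(*runs, key=key))

    # e.g. external_sort("big.txt", "sorted.txt", key=str.lower)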
I've been looking for an excuse to try radix sort too. It's not as general purpose as most sorting algorithms, so there aren't going to be any libraries that provide it, but under the right circumstances it should be a good speedup.
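For the curious, a classic LSD (least-significant-digit) radix sort for non-negative integers might look like this sketch:

    def radix_sort(nums, base=256):
        """LSD radix sort for non-negative integers: O(k*n) for k digits,
        versus the O(n log n) of comparison-based sorts."""
        if not nums:
            return nums
        place = 1
        while place <= max(nums):
            buckets = [[] for _ in range(base)]
            for n in nums:
                buckets[(n // place) % base].append(n)   # stable pass per digit
            nums = [n for bucket in buckets for n in bucket]
            place *= base
        return nums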
Personally, I've not had a need to write my own sorting code for a while.
As far as interview questions go, it would weed out those who didn't pay attention during CS classes.
You could test API knowledge by asking how you would build Comparable (capital C) objects, or something along those lines.
The way I see it, just like many other fields of knowledge, programming also has a theoretical and a practical approach to it.
The field of "theoretical programming" is the one that gave us quicksort, radix sort, Dijkstra's algorithm, and many other things absolutely necessary to the advance of computing.
The field of "practical programming" deals with the fact that the solutions created in "theoretical programming" should be easily accessible to all in a much easier way, so that the theoretical ideas can find many, many creative uses. This gave us high-level languages like Python and allowed pretty much any language to provide packaged methods for the most basic operations, like sorting or searching, with performance good enough for almost everyone.
One can't live without the other...
Most of us not needing to hand-code a sorting algorithm doesn't mean no one should.
I've recently had to write a sort, of sorts.
I had a list of text entries... the ten most common had to show up first, according to the frequency at which they were selected. All other entries had to follow in alphabetical order.
It wasn't crazy hard to do but I did have to write a sort to support it.
I've also had to sort objects whose elements aren't easily sorted with out-of-the-box code.
Same goes for searching... I had to walk a file and search statically sized records. When I found a record, I had to move one record back, because I was inserting before it.
For the most part it was very simple, and I merely pasted in a binary search. Some changes needed to be made to support the method of access, because I wasn't using an array that was actually in memory... Ah, crap... I could have treated it like a stream... See, now I want to go back and take a look...
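A sketch of that idea: binary-searching sorted fixed-size records directly in a file via seek, rather than in an in-memory array. The record size and key layout here are hypothetical:

    import struct

    RECORD_SIZE = 64        # hypothetical fixed record size, in bytes
    KEY_FORMAT = "<q"       # hypothetical key: 8-byte little-endian int at offset 0

    def find_record(f, target_key):
        """Binary search over sorted fixed-size records in an open binary
        file, seeking instead of loading the whole file into memory."""
        f.seek(0, 2)                             # jump to end to get the size
        lo, hi = 0, f.tell() // RECORD_SIZE - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            f.seek(mid * RECORD_SIZE)
            (key,) = struct.unpack_from(KEY_FORMAT, f.read(RECORD_SIZE))
            if key == target_key:
                return mid                       # insert before this index if needed
            elif key < target_key:
                lo = mid + 1
            else:
                hi = mid - 1
        return None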
Man, if someone asked me in an interview what the best sort algorithm was, and didn't understand immediately when I said 'timsort', I'd seriously reconsider if I wanted to work there.
Timsort
This describes an adaptive, stable, natural mergesort, modestly called timsort (hey, I earned it <wink>). It has supernatural performance on many kinds of partially ordered arrays (less than lg(N!) comparisons needed, and as few as N-1), yet as fast as Python's previous highly tuned samplesort hybrid on random arrays. In a nutshell, the main routine marches over the array once, left to right, alternately identifying the next run, then merging it into the previous runs "intelligently". Everything else is complication for speed, and some hard-won measure of memory efficiency.
http://svn.python.org/projects/python/trunk/Objects/listsort.txt
Is timsort general-purpose or Python-specific?
I haven't really implemented a sort, except as a coding exercise and to observe interesting features of a language (like how you can do quicksort on one line in Python).
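For reference, one well-known form of that one-liner (a toy illustration, not an efficient sort):

    qsort = lambda xs: xs if len(xs) <= 1 else qsort([x for x in xs[1:] if x < xs[0]]) + [xs[0]] + qsort([x for x in xs[1:] if x >= xs[0]])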
I think it's a valid question to ask in an interview because it reflects whether the developer thinks about these kinds of things... I feel it's important to know what that list.sort() is doing when you call it. It's just part of my theory that you should know the fundamentals behind everything as a programmer.
I never write anything for which there's a library routine. I haven't coded a sort in decades. Nor would I ever. With quicksort and timsort directly available, there's no reason to write a sort.
I note that SQL does sorting for me.
There are lots of things I don't write, sorts being just one of them.
I never write my own I/O drivers. (Although I have in the past.)
I never write my own graphics libraries. (Yes, I did this once, too, in the '80s)
I never write my own file system. (Avoided this.)
There is definitely no reason to code one anymore. I think it is important though to understand the efficiency of what you are using so that you can pick the best one for the data you are sorting.
Yes. Sometimes digging out Shell sort beats the builtin sort routine when your list is only expected to be at most a few tens of records.