Can an algorithm be a single instruction? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
The most common definition of the word algorithm is:
"An algorithm is a finite ordered set of unambiguous instructions"
Is it correct to say:
"An algorithm is a finite ordered set of unambiguous instructions/instruction"
Simply put, can an algorithm be a single instruction?

The definition you quote says that an algorithm is a "finite ordered set", which not only allows an algorithm to be a single instruction (i.e. a set with one element), but even allows an algorithm to have no instructions at all (i.e. an empty set).
That said, we shouldn't take "finite ordered set" too literally, because a set cannot have repeated elements, whereas an algorithm can have repeated instructions. Also, there can be multiple different "implementations" of the "same" algorithm which would not strictly be the exact same ordered set of instructions; see for example Rosetta Code which lists many different implementations of the bubble sort algorithm, which are all different "sets of instructions" in a strict mathematical sense, but they are the same "algorithm" in the sense normally understood by programmers and computer scientists.
So the real answer is, an algorithm can be a single instruction if you define the word "algorithm" to allow that, and most definitions either allow it, don't specifically exclude it, or aren't meant to be strict mathematical definitions anyway.
As a grammatical note, it is not necessary to say "set of instructions/instruction" in order to include the possibility that the set has size 1; you would have to say "set of at least two instructions" if you wanted to exclude that possibility.

Arguably yes; for example, you may have an algorithm which adds two values (or, more interestingly, vectors of values).
This may be both one instruction in your programming language and also backed in the processor by a single instruction!
The processor may do a great deal of work to perform the instruction (and would certainly have an algorithm of its own to do so!), but there would only be one instruction to it.
There is, however, some haziness in how you define such a thing (so I wouldn't worry too much about it). For example, if you compile for a custom processor (like one written to an FPGA board), you could make your own instruction with tremendous algorithmic complexity backing it.
Or, on a logical processor (such as the famous JVM) or an intermediate representation (such as LLVM IR), you may have cases where one instruction in code becomes a collection of instructions on the logical system, but is then backed by a single operation on a modern processor (I do not know of real cases of this, but it certainly occurs with LLVM).
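As a minimal sketch of that point (assuming a 64-bit target and an optimizing compiler; the exact output depends on flags and architecture), here is a one-step "algorithm" in C whose body typically compiles down to a single machine instruction:

#include <stdint.h>

/* A one-step "algorithm": add two values. With optimizations enabled,
   a typical compiler emits a single add instruction for the body
   (plus the usual call/return overhead); whether it really is one
   instruction depends on the target and flags, so check the assembly. */
int64_t add(int64_t a, int64_t b) {
    return a + b;
}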

In that case you would usually just call it an instruction, but I suppose that would also be correct. At the end of the day it's just a matter of interpretation, like asking if water is wet.

Related

SIMD-exploiting implementation of Peterson and Monico's Lanczos algorithm over the field with two elements

(This question is probably flirting with the "no software recommendations" rule; I understand why it might be closed).
In their paper "F_2 Lanczos revisited", Peterson and Monico give a version of the Lanczos algorithm for finding a subspace of the kernel of a linear map over Z/2Z. If my cursory reading of their paper is correct (whether it is or not is clearly not a question for SO), the algorithm presented requires a number of iterations that scales inversely with the word size of the machine used. The authors implemented their proof-of-concept algorithm with a 64-bit word size.
Does there exist a publicly available implementation of that algorithm utilizing wide SIMD words for (a potentially significant) speedup?
An existing implementation would be a software recommendation. A more interesting question is "Is it possible to use SIMD to make this algorithm run faster?" From my glance at the paper, it sounds like SIMD is exactly what they are describing ("We will partition a 64 bit machine word x into eight subwords...where each ... is an 8-bit word"), so if the authors' implementation is publicly available somewhere, the answer is "yes" because they're already using it. If this algorithm were written in C/C++ or something like that, an optimizing compiler would likely do a pretty good job of vectorizing it with SIMD even without manually specifying how to split the registers (which can be verified by looking at the assembly). It would arguably be preferable to implement it in a high-level language without splitting registers manually, because then the compiler could optimize it for any target machine's word size.
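To make the auto-vectorization point concrete, here is a hedged sketch in C (not the authors' code; the function name and setup are made up for illustration): row addition over GF(2) on bit-packed vectors reduces to word-wise XOR, and a plain loop like this is a prime candidate for compiler SIMD vectorization.

#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch only: adding two bit-packed GF(2) row vectors is
   just XOR over machine words. Compilers such as gcc/clang at -O3
   (optionally with -march=native) will usually vectorize this loop;
   inspect the generated assembly to confirm SIMD registers are used. */
void xor_rows(uint64_t *restrict dst, const uint64_t *restrict src, size_t nwords) {
    for (size_t i = 0; i < nwords; ++i) {
        dst[i] ^= src[i];
    }
}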

Do "algorithms" exist in Functional Programming? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
Is it meaningless to be asked to document "algorithms" of your software (say, in a design specification) if it is implemented in a functional paradigm? Whenever I think of algorithms in technical documents I imagine a while loop with a bunch of sequential steps.
Looking at the informal dictionary meaning of an algorithm:
In mathematics and computer science, an algorithm is a step-by-step procedure for calculations.
The phrase "step-by-step" appears to be against the paradigm of functional programming (as I understand it), because functional programs, in contrast to imperative programs, have no awareness of time in their hypothetical machine. Is this argument correct? Or does lazy evaluation enforce an implicit time component that makes it "step by step"?
EDIT - so many good answers, it's unfair for me to choose a best response :( Thanks for all the perspectives, they all make great observations.
Yes, algorithms still exist in functional languages, although they don't always look the same as imperative ones.
Instead of using an implicit notion of "time" based on state to model steps, functional languages do it with composed data transformations. As a really nice example, you could think of heap sort in two parts: a transformation from a list into a heap and then from a heap into a list.
You can model step-by-step logic quite naturally with recursion, or, better yet, using existing higher-order functions that capture the various computations you can do. Composing these existing pieces is probably what I'd really call the "functional style": you might express your algorithm as an unfold followed by a map followed by a fold.
Laziness makes this even more interesting by blurring the lines between "data structure" and "algorithm". A lazy data structure, like a list, never has to completely exist in memory. This means that you can compose functions that build up large intermediate data structures without actually needing to use all that space or sacrificing asymptotic performance. As a trivial example, consider this definition of factorial (yes, it's a cliche, but I can't come up with anything better :/):
factorial n = product [1..n]
This has two composed parts: first, we generate a list from 1 to n and then we fold it by multiplying (product). But, thanks to laziness, the list never has to exist in memory completely! We evaluate as much of the generating function as we need at each step of product, and the garbage collector reclaims old cells as we're done with them. So even though this looks like it'll need O(n) memory, it actually gets away with O(1). (Well, assuming numbers all take O(1) memory.)
In this case, the "structure" of the algorithm, the sequence of steps, is provided by the list structure. The list here is closer to a for-loop than an actual list!
So in functional programming, we can create an algorithm as a sequence of steps in a few different ways: by direct recursion, by composing transformations (maybe based on common higher-order functions) or by creating and consuming intermediate data structures lazily.
I think you might be misunderstanding the functional programming paradigm.
Whether you use a functional language (Lisp, ML, Haskell) or an imperative/procedural one (C/Java/Python), you are specifying the operations and their order (sometimes the order might not be specified, but this is a side issue).
The functional paradigm sets certain limits on what you can do (e.g., no side effects), which makes it easier to reason about the code (and, incidentally, easier to write a "Sufficiently Smart Compiler").
Consider, e.g., a functional implementation of factorial:
(defun ! (n)
  (if (zerop n)
      1                     ; base case: 0! = 1
      (* n (! (1- n)))))    ; recursive case: n * (n-1)!
One can easily see the order of execution, 1 * 2 * 3 * ... * n, and the fact that there are n-1 multiplications and subtractions for argument n.
The most important part of computer science is to remember that the language is just a means of talking to computers. CS is about computers no more than astronomy is about telescopes, and algorithms are meant to be executed on an abstract (Turing) machine, emulated by the actual box in front of us.
No, I think if you solved a problem functionally and you solved it imperatively, what you have come up with are two separate algorithms. Each is an algorithm: one is a functional algorithm and one is an imperative algorithm. There are many books about algorithms in functional programming languages.
It seems like you are getting caught up in technicalities/semantics here. If you are asked to document an algorithm to solve a problem, whoever asked you wants to know how you are solving the problem. Even if it's functional, there will be a series of steps to reach the solution (even with all of the lazy evaluation). If you can write the code to reach the solution, then you can write it in pseudocode, which means you can describe it as an algorithm as far as I'm concerned.
And, since it seems like you are getting very hung up on definitions here, I'll posit a question your way that proves my point. Programming languages, whether functional or imperative, ultimately run on a machine, right? Your computer has to be given a step-by-step procedure of low-level instructions to run. If this holds true, then every high-level computer program can be described in terms of its low-level instructions. Therefore, every program, whether functional or imperative, can be described by an algorithm. And if you can't seem to find a way to describe the high-level algorithm, then output the bytecode/assembly and explain your algorithm in terms of those instructions.
Consider this functional Scheme example:
(define (make-list num)
  (let loop ((x num) (acc '()))
    (if (zero? x)
        acc
        (loop (- x 1) (cons x acc)))))
(make-list 5) ; dumb compilers might do this
(display (make-list 10)) ; force making a list (because we display it)
With your logic, make-list wouldn't be considered an algorithm since it doesn't do its calculation in steps, but is that really true?
Scheme is eager and performs its computations in order. Even in lazy languages, everything becomes a calculation in stages until you have a value. The difference is that lazy languages do the calculations in the order of their dependencies rather than in the order of your instructions.
The underlying machine of a functional language is a register machine, so it's hard to avoid your beautiful functional program actually becoming assembly instructions that mutate registers and memory. Thus a functional language (or a lazy language) is just an abstraction that makes it easier to write code with fewer bugs.

Big-O for a compiler [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
Does anyone have insight into the typical big-O complexity of a compiler?
I know it must be >= n (where n is the number of lines in the program), because it needs to scan each line at least once.
I believe it must also be >= n log n for a procedural language, because the program can introduce O(n) variables, functions, procedures, types, etc., and when these are referenced within the program it will take O(log n) to look up each reference.
Beyond that my very informal understanding of compiler architecture has reached its limits and I am not sure if forward declarations, recursion, functional languages, and/or other tricks will increase the algorithmic complexity of the compiler.
So, in summary:
For a 'typical' procedural language (C, Pascal, C#, etc.), is there a limiting big-O for an efficiently designed compiler (as a function of the number of lines)?
For a 'typical' functional language (Lisp, Haskell, etc.), is there a limiting big-O for an efficiently designed compiler (as a function of the number of lines)?
This question is unanswerable in its current form. The complexity of a compiler certainly wouldn't be measured in lines of code or characters in the source file. That would describe the complexity of the parser or lexer, but no other part of the compiler will ever even touch that file.
After parsing, everything will be in terms of various ASTs representing the source file in a more structured manner. A compiler will have a lot of intermediate languages, each with its own AST. The complexity of the various phases would be in terms of the size of the AST, which doesn't correlate at all with the character count, or necessarily even with the previous AST.
Consider this: we can parse most languages in time linear in the number of characters and generate some AST. Simple operations such as type checking are generally O(n) for a tree with n leaves. But then we'll translate this AST into a form with potentially double, triple, or even exponentially more nodes than the original tree. Now we again run single-pass optimizations on our tree, but this might be O(2^n) relative to the original AST, and lord knows what relative to the character count!
I think you're going to find it quite impossible to even find what n should be for some complexity f(n) for a compiler.
As a nail in the coffin, compiling some languages is undecidable, including Java, C#, and Scala (it turns out that nominal subtyping plus variance leads to undecidable typechecking). Of course, C++'s template system is Turing-complete, which makes deciding whether compilation terminates equivalent to the halting problem (undecidable). Haskell with some extensions is undecidable too, and there are many others that I can't think of off the top of my head. There is no worst-case complexity for these languages' compilers.
Reaching back to what I can remember from my compilers class... some of the details here may be a bit off, but the general gist should be pretty much correct.
Most compilers actually have multiple phases that they go through, so it'd be useful to narrow down the question somewhat. For example, the code is usually run through a tokenizer that pretty much just creates objects to represent the smallest possible units of text. var x = 1; would be split into tokens for the var keyword, a name, an assignment operator, and a literal number, followed by a statement finalizer (';'). Braces, parentheses, etc. each have their own token type.
The tokenizing phase is roughly O(n), though this can be complicated in languages where keywords can be contextual. For example, in C#, words like from and yield can be keywords, but they could also be used as variables, depending on what's around them. So depending on how much of that sort of thing you have going on in the language, and depending on the specific code that's being compiled, just this first phase could conceivably have O(n²) complexity. (Though that would be highly uncommon in practice.)
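As a rough illustration of that first phase (a toy sketch in C, not any real compiler's lexer), a tokenizer is essentially one linear pass that groups characters into the smallest units of text: words, numbers, and punctuation.

#include <ctype.h>
#include <stdio.h>

/* Toy tokenizer sketch: a single linear scan over the input, printing
   one token per unit. Real lexers also classify keywords, track source
   positions, and handle strings, comments, contextual keywords, etc. */
static void tokenize(const char *p) {
    while (*p) {
        if (isspace((unsigned char)*p)) {            /* skip whitespace */
            ++p;
        } else if (isalpha((unsigned char)*p)) {     /* keyword or identifier */
            const char *start = p;
            while (isalnum((unsigned char)*p)) ++p;
            printf("WORD   '%.*s'\n", (int)(p - start), start);
        } else if (isdigit((unsigned char)*p)) {     /* numeric literal */
            const char *start = p;
            while (isdigit((unsigned char)*p)) ++p;
            printf("NUMBER '%.*s'\n", (int)(p - start), start);
        } else {                                     /* operator or punctuation */
            printf("SYMBOL '%c'\n", *p);
            ++p;
        }
    }
}

int main(void) {
    /* "var x = 1;" -> WORD 'var', WORD 'x', SYMBOL '=', NUMBER '1', SYMBOL ';' */
    tokenize("var x = 1;");
    return 0;
}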
After tokenizing, then there's the parsing phase, where you try to match up opening/closing brackets (or the equivalent indentations in some languages), statement finalizers, and so forth, and try to make sense of the tokens. This is where you need to determine whether a given name represents a particular method, type, or variable. A wise use of data structures to track what names have been declared within various scopes can make this task pretty much O(n) in most cases, but again there are exceptions.
In one video I saw, Eric Lippert said that correct C# code can be compiled in the time between a user's keystrokes. But if you want to provide meaningful error and warning messages, then the compiler has to do a great deal more work.
After parsing, there can be a number of extra phases including optimizations, conversion to an intermediate format (like byte code), conversion to binary code, just-in-time compilation (and extra optimizations that can be applied at that point), etc. All of these can be relatively fast (probably O(n) most of the time), but it's such a complex topic that it's hard to answer the question even for a single language, and practically impossible to answer it for a genre of languages.
As far as I know:
It depends on the type of parser the compiler uses in its parsing step.
The main types of parsers are LL and LR, and they have different complexities.

How do programmers test their algorithm in TopCoder or other competitions?

Good programmers who write programs of moderate to high difficulty in competitions such as TopCoder or ACM ICPC have to ensure the correctness of their algorithm before submission.
Although they are provided with some sample test cases to check the output, how does that guarantee that the program will behave correctly? They can write some test cases of their own, but it won't be possible in all cases to know the correct answer through manual calculation. How do they do it?
Update: It seems it is not really possible to analyze and guarantee the outcome of an algorithm given the tight constraints of a competitive environment. However, describing any common manual habits adopted while solving such problems should be enough to answer the question. Something like best practices.
In competitions, the top programmers have enough experience to read the question, and think of some test cases that should catch most of the possibilities for input.
It usually catches most of the bugs, but it is NOT 100% safe.
However, in real-life critical applications (critical systems on airplanes or nuclear reactors, for example) there are methods to PROVE some piece of code does what it is supposed to do.
This is the field of formal verification - which is way too complex and time consuming to be done during a contest, but for some systems it is used because mistakes could not be tolerated.
Some additional information:
Formal verification basically consists of 2 parts:
Manual verification - here we use proof systems such as Hoare logic and manually prove that the program does what we want it to do.
Automatic model checking - modeling the problem as a state machine and using model-checking tools to verify that the module does what it is supposed to do (or does not do something "bad").
Specifying "what it should do" is usually done with temporal logic.
This is often used to verify the correctness of hardware models as well. For example, Intel uses it to ensure they won't get the floating-point bug again.
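Full Hoare-logic or model-checking examples are beyond a snippet, but the flavor of "specify what it should do" can be hinted at with a contract written as assertions (a hypothetical C sketch; real verification tools prove such postconditions for all inputs rather than checking them at run time):

#include <assert.h>

/* Hoare-style contract, informally: {true} isqrt(n) {r*r <= n < (r+1)*(r+1)}.
   Here the postcondition is only checked at run time with assert;
   a tool based on Hoare logic would prove that it always holds. */
static unsigned isqrt(unsigned n) {
    unsigned long long r = 0;
    while ((r + 1) * (r + 1) <= n)                    /* invariant: r*r <= n */
        ++r;
    assert(r * r <= n && n < (r + 1) * (r + 1));      /* postcondition */
    return (unsigned)r;
}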
Picture this: imagine you are a top programmer. That means you know a bunch of algorithms and wouldn't think twice while implementing them. You know how to modify an already-known algorithm to suit the problem's needs. You are strong at estimating time and complexity, and you expect that in the worst case your tailored algorithm will run within the time and memory constraints.
At this level you simply think, use a scratchpad for about five to ten minutes, and have a super clear algorithm before you start to code. Once you finish coding, you hit compile and there is usually no compilation error, because the code is so intuitive to you.
Then, based on the algorithm and data structures used, you expect that there might be one of the following issues:
a corner case
an overflow problem
A corner case is basically when you have coded for the general case, but when, say, N=1, the answer is different from the others, so you generally write it as a special case.
An overflow is when intermediate values or results overflow a data type's limits.
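A hypothetical C illustration of the overflow trap (contest-style code, not from any particular problem): the multiplication is carried out in the narrower type before the result is widened, so it can overflow even though the variable receiving it is big enough.

#include <stdint.h>

/* With w = h = 100000, the 32-bit multiplication overflows (undefined
   behavior in C) before the result ever reaches the 64-bit return type. */
int64_t area_buggy(int32_t w, int32_t h) {
    return w * h;             /* multiplied in 32 bits */
}

int64_t area_fixed(int32_t w, int32_t h) {
    return (int64_t)w * h;    /* widen first, then multiply */
}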
You make a note of any problems that arise at this point, and use this data during the challenge phase (as in TopCoder).
Once you have checked against these two, you hit Submit.
There's a time element to TopCoder, so it's not possible to test every combination within that constraint. They probably do the best they can and rely on experience for the rest, just as one does in real life. I don't know that it's ever possible to guarantee that a significant piece of code is error-free forever.

Proactively using 'lines of code' (LOC) metric in your software-development process? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
Codebase size has a lot to do with the complexity of a software system (the larger the codebase, the higher the costs of maintenance and extension). A way to measure codebase size is the simple 'lines of code' (LOC) metric (see also the blog entry 'implications of codebase-size').
I wondered how many of you out there are using this metric as part of a retrospective to create awareness (for removing unused functionality or dead code). I think creating awareness that more lines of code mean more complexity in maintenance and extension can be valuable.
I am not using LOC as a fine-grained metric (at the method or function level), but at the subcomponent or whole-product level.
I find it a bit useless. Some kinds of functions, user input handling for example, are going to be a bit long-winded no matter what. I'd much rather use some form of complexity metric. Of course, you can combine the two, and/or any other metrics that take your fancy. All you need is a good tool; I use Source Monitor (with which I have no relationship other than being a satisfied user), which is free and can do both LOC and complexity metrics for you.
I use SM when writing code to make me notice methods that have become too complex. I then go back and take a look at them. About half the time I say, OK, that NEEDS to be that complicated. What I'd really like is a (free) tool as good as SM that also supports a tag list of some sort which says "ignore methods X, Y & Z - they need to be complicated". But I guess that could be dangerous, which is why I have so far not suggested the feature to SM's author.
I'm thinking it could be used to reward the team when the LOC decreases (assuming they are still producing valuable software and readable code...).
Not always true. While it is usually preferable to have a low LOC, it doesn't mean the code is any less complex. In fact, it's usually more so. Code that's been optimized to use the minimal number of cycles can be completely unreadable, even by the person who wrote it a week later.
As an example from a recent project, imagine setting individual color values (RGBA) from a PNG file. You can do this a bunch of ways, the most compact being just one line using bitshifts. This is a lot less readable and maintainable than another approach, such as using bitfields, which would take a structure definition and many more lines.
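To illustrate that trade-off (a hypothetical C example; the channel layout is assumed to be 0xRRGGBBAA, and a plain struct stands in for the bitfield approach, but it makes the same point), compare the compact shift-and-mask one-liner with a more verbose unpacking:

#include <stdint.h>

/* Compact: one line per channel using shifts and masks. */
static inline uint8_t red_of(uint32_t px) { return (px >> 24) & 0xFF; }

/* Verbose: a structure definition and several more lines,
   but the intent is spelled out field by field. */
struct rgba {
    uint8_t r, g, b, a;
};

static struct rgba unpack(uint32_t px) {
    struct rgba c;
    c.r = (uint8_t)((px >> 24) & 0xFF);
    c.g = (uint8_t)((px >> 16) & 0xFF);
    c.b = (uint8_t)((px >>  8) & 0xFF);
    c.a = (uint8_t)( px        & 0xFF);
    return c;
}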
It also depends on the tool doing the LOC calculations. Does it count lines with just a single symbol on them as code (e.g. { and } in C-style languages)? That definitely doesn't make the code more complex, but it does make it more readable.
Just my two cents.
LOCs are easy to obtain and deliver reasonable information within any non-trivial project. My first step in a new project is always counting LOCs.
