I am taking a logic and functional programming course (with programming in SML) and as part of our first assignment the following question is asked
"... You need to define an (abstract) type called 'a set
Documentation: Describe formally how finite sets will be represented as lists, stating a representational invariant property. ..."
Can anyone explain what "describe formally" means?
Check your textbook and class notes to see how "formal description" is used for other types, and then follow that example.
"Describe formally" in mathematics usually means constraining things with precise mathematical expressions, using standard notation and terminology where necessary. In logic, you will generally use "implies" and "therefore" rather than more colloquial terms. Wikipedia has an article.
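For instance, one common representation (only an illustration; your course notes may prescribe a different one) is to represent a finite set by a list of its elements without repetitions, and to state the representational invariant precisely:

a value xs = [x1, x2, ..., xn] of type 'a list represents the finite set {x1, x2, ..., xn}, subject to the invariant that xs contains no duplicate elements (for all i <> j, the i-th and j-th elements of xs differ); every operation of the abstract type must preserve this invariant.

A statement at that level of precision is the kind of thing "describe formally" is asking for.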
While studying propositional logic, I came up with the following question:
Is a software bug always a logical contradiction between the program and the specification?
Consider the following example:
Our specification tells us that "we do action C iff premises A and B are true".
This is implemented as follows:
main ()
    if A then C
    if B then C
Clearly the implementation does not fit the specification, since the program above actually behaves as "we do C iff premise A or premise B is true".
Expressing our specification and our program as propositional formulas, we get the following equation:
We transform our specification to CNF and apply the resolution calculus, and now we can easily see that the very first clause contradicts the very last clause. Therefore the formula is not satisfiable, and hence our specification contradicts our implementation.
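To make the example concrete, here is a small sketch in Python (the names spec and program are mine, encoding the two statements above); enumerating all truth assignments exposes the disagreement:

from itertools import product

# The specification: "do C iff A and B"; the program's actual behaviour,
# as argued above: "do C iff A or B".
spec    = lambda A, B, C: C == (A and B)
program = lambda A, B, C: C == (A or B)

# Enumerate every truth assignment; any assignment the program allows but
# the specification forbids is a concrete witness of the bug.
for A, B, C in product([False, True], repeat=3):
    if program(A, B, C) and not spec(A, B, C):
        print("counterexample:", dict(A=A, B=B, C=C))
# e.g. A=True, B=False, C=True: the program does C, but the spec forbids it.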
My questions are now (since the above was only an example):
Is this true for every software bug, assuming a complete specification?
and if so:
If we convert a complete specification to propositional formulas could we automate the process of software bug finding?
To answer my own question: this is called "model checking" and is very common in computer science; large companies like Intel use it to check whether hardware actually does what it is supposed to do.
Recently, model checking has started to appear more and more in software development as well. For example, NASA and Microsoft use this technology to quite some extent.
In its basic form it works as follows: the specification is converted to logical statements, and a compiler translates the given program into a graph-like state-transition structure called a "Kripke structure". A model checker takes these as input and either produces a counterexample showing where the specification is violated, or reports that the property holds.
https://en.wikipedia.org/wiki/Model_checking
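To give a feel for what a model checker does (a toy sketch only, with made-up state names; real tools such as SPIN or NuSMV are far more sophisticated), here the "program" is a small Kripke structure and we check the safety property "the error state is never reached", reporting a counterexample path if it is violated:

# States of a tiny Kripke structure, each labelled with the atomic
# propositions that hold in it, plus a transition relation.
labels = {
    "s0": set(),
    "s1": {"request"},
    "s2": {"request", "grant"},
    "s3": {"error"},
}
transitions = {
    "s0": ["s1"],
    "s1": ["s2", "s3"],   # s3 is a bug state reachable from s1
    "s2": ["s0"],
    "s3": ["s3"],
}

def check_safety(init, bad_prop):
    """Check the safety property 'bad_prop never holds'; return a
    counterexample path if it is violated, else None."""
    stack, seen = [(init, [init])], set()
    while stack:
        state, path = stack.pop()
        if bad_prop in labels[state]:
            return path                      # counterexample trace
        if state in seen:
            continue
        seen.add(state)
        for nxt in transitions[state]:
            stack.append((nxt, path + [nxt]))
    return None

print(check_safety("s0", "error"))   # ['s0', 's1', 's3']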
I am reading a book on the λ-calculus, "Functional Programming Through Lambda Calculus" (Greg Michaelson). In the book the author introduces a shorthand notation for defining functions. For example
def identity = λx.x
and goes on to say that when using such shorthand we should insist that "all defined names should be replaced by their definitions before the expression is evaluated".
Later on, when introducing recursion he uses as an example a definition of the addition function such as:
def add x y = if iszero y then x else add (succ x) (pred y)
and goes on to say that, had we not had the restriction mentioned above, we would be able to evaluate this function by slowly expanding it. However, since we must replace all defined names before evaluating the expression, we cannot do that: we would go on replacing add indefinitely. Hence the need to think about recursion in a more careful way.
My question is thus the following: what are the theoretical or practical reasons for placing this restriction upon ourselves (of having to replace all defined names before the expression is evaluated)? Are there any?
I was trying to show how to build a rich language from a very simple one, by adding successive layers of syntax, where each layer could be translated into the previous layer. So it's important to distinguish translation, which must terminate, from evaluation which needn't. I think it's really interesting that recursion can be translated into non-recursion. I'm sorry if my explanation isn't helpful.
The reason is that we want to stay within the rules of the lambda calculus. Allowing names for terms to mean anything other than immediate substitution would mean adding a recursive let expression to the language, which would mean we would need a truly more expressive system (no longer the lambda calculus).
You can think of the names as no more than syntactic sugar for the original lambda term. The Y-combinator is exactly the way to introduce recursion into a system that does not have it built in.
If the book you are currently reading confuses you, you might want to search for some additional resources on the internet explaining the Y-combinator.
I will try to post my own answer, the way I understand it.
For the untyped lambda calculus there is no practical reason why we need the Y combinator. By practical I mean that if someone wants to build an expression evaluator, it is possible to do so without the combinator, just by slowly expanding the definition.
For theoretical reasons though, we need to make sure that when we define a function this definition has some meaning and is not defined in terms of itself. e.g. there is not much meaning in the following definition:
def something = something
For this reason, we need to see whether it is possible to rewrite the definition in a way that is not self-referential, i.e. whether it is possible to define the thing without referring to itself. It turns out that in the untyped lambda calculus we can always do that, through the Y-combinator.
Using the Y-combinator we can always construct a solution to the fixed-point equation x = f(x) = f(f(x)) = f(f(f(x))) = ... for any f,
i.e. we can always rewrite a self-referential definition into one that does not refer to itself.
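To illustrate this in executable form (a sketch in Python; the names Z and make_add are mine), here is the book's add defined without a self-referential name. Because Python evaluates arguments eagerly, the sketch uses the Z-combinator, the applicative-order variant of the Y-combinator:

# The Z-combinator: the eager-evaluation variant of the Y-combinator.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# The body mentions only its parameters; the name `rec` is supplied by Z,
# so the definition itself is not self-referential.
make_add = lambda rec: lambda x: lambda y: x if y == 0 else rec(x + 1)(y - 1)

add = Z(make_add)
print(add(3)(4))  # prints 7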
In Coq, defining an inductive proposition seems analogous to adding new inference rules/axioms to a logic. What constraints in defining an inductive proposition guarantee that Coq remains consistent?
This is a very good question, and not an easy one to answer. The "Calculus of Inductive Constructions" has been analyzed in literally hundreds of papers.
The most accepted justification of consistency is the equivalence of inductive data types with W-types. In this sense, every inductive type you add to the theory is just an instance of a W-type, which is an object that is well-founded and thus not a danger to the consistency of the theory.
However, the details of Coq's implementation are a bit more complicated, mainly due to the reliance on the "guard condition" for programming convenience. Coq also provides support for impredicative inductive definitions, and these tend to be quite complicated objects. I suggest you read a bit about this and ask more concrete questions. The main reference is "C. Paulin-Mohring, Inductive Definitions in the System Coq".
See also this wiki page
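As a small illustration of such a constraint (written in Lean 4 for brevity; Coq rejects the analogous Inductive declaration for the same reason), here is an inductive proposition that the strict-positivity checker refuses, since accepting it would allow deriving False:

-- Rejected by the positivity checker: `Bad` occurs to the left of an arrow
-- in the type of its own constructor argument, so it is not well-founded
-- (it is not an instance of a W-type).
inductive Bad : Prop where
  | mk : (Bad → False) → Bad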
I have been reading about formal verification, and the basic point is that it requires a formal specification and model to work with. However, many sources classify static analysis as a formal verification technique, and some mention abstract interpretation and its use in compilers.
So I am confused - how can these be formal verification if there is no formal description of the model?
EDIT: A source I found reads:
Static analysis: the abstract semantics is computed automatically from the program text according to predefined abstractions (that can sometimes be tailored automatically/manually by the user)
So does this mean it works just on the source code, with no need for a formal specification? That would be what static analysers do.
Also, is static analysis possible without formal verification? E.g. does SonarQube really perform formal methods?
In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal methods of mathematics.
How can these be formal verification if there is no formal description of the model?
A static analyser will generate control/data flow of a piece of code, upon which formal methods can then be applied to verify conformance to the system's/unit's expected design model.
Note that modelling/formal-specification is NOT a part of static-analysis.
However combined together, both of these tools are useful in formal verification.
For example, if a system is modeled as a Finite State Machine (FSM) with:
- a pre-defined number of states, defined by a combination of specific values of certain member data, and
- a pre-defined set of transitions between those states, defined by the list of member functions,
then the results of static analysis help in formal verification of the fact that the control NEVER flows along a path that is NOT present in the above FSM model.
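As a tiny sketch of that idea (hypothetical state and transition names, in Python): compare the transitions a static analyser recovered from the control/call graph against the transitions allowed by the FSM model:

# Transitions allowed by the FSM model.
allowed = {("Idle", "Running"), ("Running", "Idle"), ("Running", "Error")}

# Pretend these edges were recovered from the code by a static analyser.
extracted = {("Idle", "Running"), ("Running", "Idle"), ("Error", "Running")}

violations = extracted - allowed
if violations:
    print("control can flow along paths not in the model:", violations)
else:
    print("every observed transition is allowed by the FSM model")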
Also, if a model can be defined simply in terms of type definitions, data flow, and control flow / call graph, i.e. code metrics that a static analyser can verify, then static analysis itself is sufficient to formally verify that the code conforms to such a model.
NOTE 1: The yellow region above corresponds to static analysers used to enforce things such as coding guidelines and naming conventions, i.e. aspects of the code that cannot affect the program's behaviour.
NOTE 2: The red region above corresponds to formal verification that requires additional steps such as 100% dynamic code coverage and the elimination of unused and dead code; these cannot be detected/enforced using a static analyser.
Static analysis is highly effective in verifying that a system/unit is implemented using a subset of the language specification to meet goals laid out in the system/unit design.
For example, if it is a design goal to prevent the stack memory from exceeding a particular limit, then one could apply a limit on the depth of recursion (or forbid recursive function calls altogether). Static analysis is used to identify such violations of design goals (a concrete sketch of such a check is given below).
In the absence of any warnings from the static analyser, the system/unit code stands formally verified against such design goals of its respective model.
E.g. the MISRA-C standard for automotive software defines a subset of C for use in automotive systems.
MISRA-C:2012 contains:
- 143 rules, each of which is checkable using static program analysis, and
- 16 "directives" that are more open to interpretation or relate to the process.
Static analysis just means "read the source code and possibly complain". (Contrast with "dynamic analysis", meaning "run the program and possibly complain about some execution behavior".)
There are lots of different types of possible static-analysis complaints.
One possible complaint might be,
Your source code does not provably satisfy a formal specification
This complaint would be based on formal verification if the static analyzer had a formal specification which it interpreted "formally", a formal interpretation of the source code, and a trusted theorem prover that could not find an appropriate theorem.
All the other kinds of complaints you might get from a static analyzer are pretty much heuristic opinions, that is, they are based on some informal interpretation of the code (or specification if it indeed even exists).
The "heavy duty" static analyzers such as Coverity etc. have pretty good program models, but they don't tell you that your code meets a specification (they don't even look to see if you have one). At best they only tell you that your code does something undefined according to the language ("dereference a null pointer") and even that complaint isn't always right.
So-called "style checkers" such as MISRA are also static analyzers, but their complaints are essentially "You used a construct that some committee decided was bad form". That's not actually a bug, it is pure opinion.
You can certainly classify static analysis as a kind of formal verification.
how can these be formal verification if there is no formal description of the model?
For static analysis tools, the model is implicit (or in some tools, partly implicit). For example, "a well-formed C++ program will not leak memory, and will not access memory that hasn't been initialized". These sorts of rules can be derived from the language specification, or from the coding standards of a particular project.
Suppose you are given an algorithm, described in a paper of your choice with a bunch of symbols and very specialised notation.
How do I learn to read such algorithm descriptions and turn them into a computer program?
The following picture describes an algorithm to calculate the incremental local outlier factor:
What is the best approach to translate that to a programming language?
Your answer can also help me by pointing to articles that describe the symbols being used, the notation in general and tutorials on how to read and understand such papers.
You seem to be asking for what is generally described as part of a course in first-order predicate calculus. A general introduction (including notation) can be found here or in your library.
Generally, notation like ∀x ∈ S means "for all x that are in the set S", and would often be implemented in a programming language using a for loop or a similar looping structure, depending on the particular language.
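For example (a small sketch; the set S and predicate P are made up for illustration), the statement "∀x ∈ S. P(x)" maps directly onto a loop, or onto a built-in like all():

# Hypothetical set S and predicate P.
S = {2, 4, 6, 8}
P = lambda x: x % 2 == 0

# "for all x in S, P(x)" as an explicit loop ...
holds = True
for x in S:
    if not P(x):
        holds = False
        break

# ... or with Python's built-in all()
holds = all(P(x) for x in S)
print(holds)  # True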
Introductory books on algorithms will also likely be useful in answering the questions you have. An example would be Sedgewick's book Algorithms in C if your target computer language is C.
I think the question can only be answered in a very broad sense. The pseudocode notation in algorithmic papers does not follow a consistent standard, and sometimes there is no pseudocode at all for the algorithms in question. My general advice would be to study the problem covered as much as possible and to get into the mathematics a bit.
I have written a translator that is able to convert some simple mathematical formulas into various programming languages. This is the input in mathematical notation:
distance_formula(x1,y1,x2,y2) = sqrt((x1-x2)^2+(y1-y2)^2)
and this is the output in Python:
import math

def distance_formula(x1,y1,x2,y2):
    return math.sqrt(((x1-x2)**2)+((y1-y2)**2))
Similarly, there are several pseudocode translators that have been designed to convert descriptions of algorithms into equivalent programs.