I am trying to understand the purpose of the WAM at a conceptual high level, but all the sources I have consulted so far assume I know more than I currently do and approach the issue from the bottom up (the details). They start by throwing trees at me, whereas right now I am concerned with seeing the whole forest.
The answers to the following questions would help me in this endeavor:
Pick any group of accomplished, professional Prolog implementers - the SICStus people, the YAP people, the ECLiPSe people - whoever. Now give them the goal of implementing a professional, performant, WAM-based Prolog on an existing virtual machine - say the Erlang VM or the Java VM. To rule out answers such as "it depends on what your other goals are," let's say that any other goals they have besides the one I just gave are the ones they had when they developed their previous implementations.
Would they (typically) implement a virtual machine (the WAM) inside of a VM (Erlang/JVM), meaning would you have a virtual machine running on top of, or being simulated by, another virtual machine?
If the answer to 1 is no, does that mean that they would try to somehow map the WAM and its associated instructions and execution straight onto the underlying Erlang/Java VM, in order to make the WAM 'disappear' so to speak and only have one VM running (Erlang/JVM)? If so, does this imply that any WAM heaps, stacks, memory operations, register allocations, instructions, etc. would actually be Erlang/Java ones (with some tweaking or massaging)?
If the answer to 1 is yes, does that mean that any WAM heaps, stacks, memory ops, etc. would simply be normal arrays or linked lists in whatever language (Erlang or Java, or even Clojure running on the JVM for that matter) the developers were using?
What I'm trying to get at is this. Is the WAM merely an abstraction or tool to help the programmer organize code, understand what is going on, map Prolog onto the underlying machine, perhaps provide portability, and so on, or is it seen as an (almost) necessary, or at least quite useful, end in itself when implementing a Prolog?
Thanks.
I'm excited to see what those more knowledgeable than I are able to say in response to this interesting question, but in the unlikely event that I actually know more than you do, let me outline my understanding. We'll both benefit when the real experts show up and correct me and/or supply truer answers.
The WAM gives you a procedural description of one way of implementing Prolog. Prolog as specified does not say how exactly it must be implemented; it only says what behavior should be observed. So the WAM is an implementation approach. I don't think any of the popular systems follow it purely; they each have their own variation of it. It's more like an architectural pattern and algorithm sketch than a specification of the kind the Java virtual machine has. If it were firmer, the book Warren's Abstract Machine: A Tutorial Reconstruction probably wouldn't need to exist.
My (extremely sparse) understanding is that the principal trick is the employment of two stacks: one being the conventional call/return stack of every programming language since Algol, and the other being a special structure (the "trail") used to record bindings so they can be undone when backtracking through choice points. (edit: #false has now arrived and stated that WAM registers are the principal trick, which I had never heard of before, demonstrating my ignorance.)
In any case, to implement Prolog you need a correct way of handling the search. Before the WAM, people mostly used ad-hoc methods. I wouldn't be surprised to learn that there are newer and/or more sophisticated tricks, but the WAM is a sound architecture that is widely used and understood.
So the answer to your three-part question is, I think, both. There will be a VM within the VM. The VM within the VM will, of course, be implemented in the appropriate language and will therefore use that language's primitives for handling the invisible parts of the VM (the stack and the trail). Clojure might provide insight into the ways a language can share things with its own implementation language. You would be free to intermix as desired.
The answer to your final question, what you're trying to get at, is that the WAM is merely an abstraction for the purposes you describe and not an end in itself. There is not, for instance, such a thing as "portable WAM bytecode" in the way that compiled Java becomes portable JVM bytecode, which might otherwise justify it absent the other benefits. If you have a novel way of implementing Prolog, by all means try it and forget all about the WAM.
Considering that many libraries already ship optimized sorting routines, why do so many companies still ask about Big O and about implementing sorting algorithms, when in everyday computing this kind of hand-rolled implementation is rarely needed any longer?
For example, a self-balancing binary tree is the kind of problem that some big companies in the embedded industry still ask candidates to code as part of their testing and selection process.
Even in embedded development, are there any circumstances in which such an implementation would actually be written, given that Boost, SQLite, and other libraries are available? In other words, is it still worth thinking about ways to optimize such algorithms?
As an embedded programmer, I would say it all comes down to the problem and the system constraints. Especially on a small microprocessor, you may not want or need to pull in Boost, and SQLite may still be too heavy for a given problem. How you chop up problems looks different if you have, say, 2K of RAM - but that is definitely the extreme.
So, for example, you probably don't want to rewrite a red-black tree yourself because, as you pointed out, you will likely end up with highly non-optimized code. But in the pursuit of reusability, abstraction often adds layers of indirection to the code. So you may end up reimplementing at least the simpler "solved" problems where you can do better than the built-in library for certain niche cases. Recently I wrote a specialized version of linked lists using shared memory pools for allocation across lists, for example. I benchmarked against STL's list and it just wasn't cutting it because of the added layers of abstraction. Is my code as good? No, probably not. But I was more easily able to specialize it for the specific use cases, so it came out better.
So I guess I'd like to address a few things from your post:
- Why do companies still ask about big-O runtime? I have seen even pretty good engineers make bad choices with regard to data structures because they didn't reason carefully about the O() runtime. Quadratic versus linear, or linear versus constant-time, is a painful lesson when you get the assumption wrong. (Ah, the voice of experience.) A small illustration of the kind of trap I mean follows this list.
- Why do companies still ask about implementing classic structures/algorithms? Arguably you won't reimplement quicksort, but as stated, you may very well end up implementing slightly less complicated structures on a more regular basis. Truthfully, if I'm looking to hire you, I'm going to want to make sure that you understand the theory inside and out for existing solutions, so that if I need you to come up with a new solution you can take an educated crack at it. And if the other applicant has that and you don't, I'd probably say they have an advantage.
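To make the big-O point concrete, here is a small illustrative sketch (written in Haskell, though the same trap exists in any language; the function names are made up): building a result by appending to the end of an accumulator at every step is quadratic, while consing to the front and reversing once at the end is linear.

```haskell
module BigO where

-- Number each element of a list, two ways. Both look innocent.

-- Appends to the END of the accumulator each step: (++) copies the whole
-- accumulator, so the total work is 1 + 2 + ... + n = O(n^2).
labelQuadratic :: [a] -> [(Int, a)]
labelQuadratic = go 0 []
  where
    go _ acc []     = acc
    go i acc (x:xs) = go (i + 1) (acc ++ [(i, x)]) xs

-- Conses to the FRONT (O(1) per step) and reverses once at the end: O(n).
labelLinear :: [a] -> [(Int, a)]
labelLinear = go 0 []
  where
    go _ acc []     = reverse acc
    go i acc (x:xs) = go (i + 1) ((i, x) : acc) xs
```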
Also, here's another way to think about it. In software development, often the platform is quite powerful and the consumer already owns the hardware platform. In embedded software development, the consumer is probably purchasing the hardware platform - hopefully from your company. This means that the software is often selling the hardware. So often it makes more cents to use less powerful, cheaper hardware to solve a problem and take a bit more time to develop the firmware. The firmware is a one-time cost upfront, whereas hardware is per-unit. So even from the business side there are pressures for constrained hardware which in turn leads to specialized structure implementation on the software side.
If you suggest using SQLite on a 2 kB Arduino, you might hear a lot of laughter.
Embedded systems are a bit special. They often have extraordinarily tight memory budgets, so space complexity can be far more important than time complexity. I've rarely worked in that area myself, so I don't know exactly what embedded-systems companies are interested in, but if they're asking such questions, it's probably because you'll need to be more acquainted with these issues than in other areas of IT.
Nothing is optimized enough.
Besides, the questions are meant to test your understanding of the solution (and each part of the solution) and not how great you are at memorizing stuff. Hence it makes perfect sense to ask such questions.
So sometimes I need to write a data structure I can't find on Hackage, or what I find isn't well enough tested or of high enough quality for me to trust, or it's just something I don't want as a dependency. I am reading Okasaki's book right now, and it's quite good at explaining how to design asymptotically fast data structures.
However, I am working specifically with GHC. Constant factors are a big deal for my applications. Memory usage is also a big deal for me. So I have questions specifically about GHC.
In particular:
How to maximize sharing of nodes
How to reduce memory footprint
How to avoid space leaks due to improper strictness/laziness
How to get GHC to produce tight inner loops for important sections of code
I've looked around various places on the web, and I have a vague idea of how to work with GHC, for example, looking at core output, using UNPACK pragmas, and the like. But I'm not sure I get it.
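For instance, my (possibly shaky) picture of the UNPACK idiom is something like this made-up example - strict, unpacked fields stored directly in the constructor, plus a strict accumulator to avoid thunk build-up:

```haskell
{-# LANGUAGE BangPatterns #-}
module Point where

-- Strict, unpacked fields: with -O the two Ints live inside the Point
-- constructor itself, instead of behind two extra pointers to boxed Ints.
data Point = Point
  { px :: {-# UNPACK #-} !Int
  , py :: {-# UNPACK #-} !Int
  }

-- A strict accumulator avoids the classic foldl-style space leak.
sumX :: [Point] -> Int
sumX = go 0
  where
    go !acc []               = acc
    go !acc (Point x _ : ps) = go (acc + x) ps
```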
So I popped open my favorite data structures library, containers, and looked at the Data.Sequence module. I can't say I understand a lot of what they're doing to make Seq fast.
The first thing that catches my eye is the definition of FingerTree a. I assume that's just me being unfamiliar with finger trees though. The second thing that catches my eye is all the SPECIALIZE pragmas. I have no idea what's going on here, and I'm very curious, as these are littered all over the code.
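For reference, the pragmas I mean have this shape (a made-up example, not the containers code); the GHC docs say this generates extra copies of the overloaded function at the listed types, with the class dictionaries resolved away, and uses them at matching call sites.

```haskell
module Spec where

-- An overloaded function; by default every call pays for the Num dictionary.
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0

-- Ask GHC to also generate dictionary-free versions at these concrete types.
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}
{-# SPECIALIZE sumSquares :: [Double] -> Double #-}
```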
Many functions also have an INLINE pragma associated with them. I can guess what that means, but how do I make a judgement call on when to INLINE functions?
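Again a made-up example of the kind of definition where I would guess INLINE pays off - a small, non-recursive wrapper that should disappear at its call sites so the surrounding code can be optimized as a whole:

```haskell
module Inl where

-- A tiny wrapper we want GHC to copy into every call site, so the two
-- comparisons can be optimized together with the caller's code.
clamp :: Ord a => a -> a -> a -> a
clamp lo hi x = max lo (min hi x)
{-# INLINE clamp #-}
```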
Things get really interesting around line ~475, in a section headed 'Applicative Construction'. They define a newtype wrapper to represent the Identity monad, they write their own copy of the strict state monad, and they have a function called applicativeTree which, apparently, is specialized to the Identity monad, and this increases sharing of the function's output. I have no idea what's going on here. What sorcery is being used to increase sharing?
Anyway, I'm not sure there's much to learn from Data.Sequence. Are there other 'model programs' I can read to gain wisdom? I'd really like to know how to soup up my data structures when I really need them to go faster. One thing in particular is writing data structures that make fusion easy, and how to go about writing good fusion rules.
That's a big topic! Most has been explained elsewhere, so I won't try to write a book chapter right here. Instead:
Real World Haskell, ch 25, "Performance" - discusses profiling, simple specialization and unpacking, reading Core, and some optimizations.
Johan Tibell is writing a lot on this topic:
Computing the size of a data structure
Memory footprints of common data types
Faster persistent structures through hashing
Reasoning about laziness
And some things from here:
Reading GHC Core
How GHC does optimization
Profiling for performance
Tweaking GC settings
General improvements
More on unpacking
Unboxing and strictness
And some other things:
Intro to specialization of code and data
Code improvement flags
applicativeTree is quite fancy, but mainly in a way that has to do with FingerTrees in particular, which are quite a fancy data structure themselves. We had some discussion of the intricacies over at cstheory. Note that applicativeTree is written to work over any Applicative. It just so happens that when it is specialized to Id it can share nodes in a manner that it otherwise couldn't. You can work through the specialization yourself by inlining the Id methods and seeing what happens. Note that this specialization is used in only one place -- the O(log n) replicate function. The fact that the more general function specializes neatly to the constant case is a very clever bit of code reuse, but that's really all.
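To see the sharing effect in miniature, here is a hedged sketch of the idea (not the real Data.Sequence code): when there is no actual effect, the two halves of a balanced structure can literally be the same heap object, so building n logical copies takes only O(log n) construction work and O(log n) distinct nodes.

```haskell
module Share where

data Tree a = Leaf a | Node (Tree a) (Tree a)

-- Build a tree with n leaves, all containing x (assumes n >= 1). The
-- recursive result 'half' is used twice without being rebuilt, so the two
-- subtrees are shared on the heap.
replicateTree :: Int -> a -> Tree a
replicateTree n x
  | n <= 1    = Leaf x
  | even n    = Node half half
  | otherwise = Node (Leaf x) (Node half half)
  where
    half = replicateTree (n `div` 2) x

-- Walking the logical tree is still O(n), of course; only construction and
-- residency are logarithmic.
size :: Tree a -> Int
size (Leaf _)   = 1
size (Node l r) = size l + size r
```

The real applicativeTree gets the same kind of sharing once it is specialized to Id, which is why replicate manages to be O(log n).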
In general, Sequence teaches more about designing persistent data structures than about all the tricks for eking out performance, I think. Dons' suggestions are of course excellent. I'd also just browse through the source of the really canonical and tuned libraries -- Map, IntMap, Set, and IntSet in particular. Along with those, it's worth taking a look at Milan's paper on his improvements to containers.
I am developing an application using the wrong tools. I don't wish to get into the rights or wrongs of this - the platform has been blessed as Strategic and nothing is going to change that now. But I'm hoping somebody can give me some tips on how to make the best of the situation.
We use a server-side language, let's call it X, and client-side HTML/JS/CSS (on IE6). X is primitive from an application development point of view (but excellent for data processing, which is why we are using it); it doesn't even have the concept of user-defined functions, so trying to make the application modular in any way is a challenge. Think tens of thousands of lines of nested if/then/else statements.
My current assumption is that reducing the spaghetti-factor of the code will not be possible, and that really great documentation is my only weapon against this becoming a totally unsupportable nightmare that ends up on TheDailyWTF.
Anybody got any other ideas?
(I don't mention what language X is simply because I'm hoping for answers to the general problem of working with deficient tools, not any particular tactics for X.)
Edit:
OK, for the morbidly curious, X is SAS. I didn't want the question to focus on whether function-style macros are functions (they are not, and cannot implement design patterns), or to turn into blaming the language - given the constraints of this particular project, I actually agree with the decision to use it! I am also sure that the majority of software is developed in incredibly non-optimal environments (broken tools, bad management, overbearing legacy burden, etc.), and that there must be strategies for making things work even so.
Are you familiar with the Church-Turing thesis?
If you can't solve "A" in Y, but you can emulate Z in Y and Z can solve "A", then by definition Y can solve "A".
Maybe you can write some generalized routine that somehow makes X more effective for the problem at hand? A sort of extension to X, or, even better, a little-language implemented in X?
It seems that others tend to conflate "little language" with documentation. While you can try to go that way (in that case I suggest you have a look at Robodoc), I was thinking of something closer to Wasabi in approach - i.e., really using your tool X to create a sort of interpreter for X++ or even Y. Without knowing what X is, I can't, of course, be more specific than that.
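To make the "little language" suggestion concrete, here is a toy sketch of the shape such a thing usually takes: a small expression type plus an evaluator (shown in Haskell purely for illustration, since X is deliberately unspecified; every name here is made up). The same data-plus-interpreter shape can be encoded, however painfully, with whatever X actually provides.

```haskell
module Little where

import qualified Data.Map as Map

-- A tiny expression language: literals, named references, arithmetic, and a
-- conditional. Extend with whatever your domain needs.
data Expr
  = Lit Double
  | Ref String
  | Add Expr Expr
  | Mul Expr Expr
  | If  Expr Expr Expr   -- non-zero condition selects the first branch

-- The interpreter: one environment lookup table, one recursive walk.
eval :: Map.Map String Double -> Expr -> Double
eval env expr = case expr of
  Lit d    -> d
  Ref name -> Map.findWithDefault 0 name env
  Add a b  -> eval env a + eval env b
  Mul a b  -> eval env a * eval env b
  If c t f -> if eval env c /= 0 then eval env t else eval env f
```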
Does X have comments?
Write your little language - i.e., pseudocode - in the comments.
In addition to documentation, choices of variable names and conventions for how they are used may help a bit. You may also be able to set up some structural conventions in the code so that there is some regularity. Way back when folks wrote assembler, good coders still produced readable code.
Hmm, sounds like another MUMPS/InterSystems Caché developer ;)
Seriously though, you might want to check whether there are any tools for X that can map the flow of the program, or, as part of the documentation process, break out something like Visio or a similar tool and walk through the code, mapping out (more or less) what it does. The hardest part will probably be having to go back, stare at that wall of code, and jump right back in, so anything you can do to document it, graph it, or chart it will help.
Is it possible to use a different technology, better suited to your problem between X and the client-side?
Alternatively, you could use more if/then/else statements to construct modular blocks of code, which might help with maintenance.
I find it hard to believe that you don't have any form of user-defined functions available in X - even batch files have functions (kind of).
As soon as you have functions, you can make things at least fairly modular.
You could find a language you like, and implement the usual "slap some data into a template"-level web-app stuff in that, and implement wrappers to call out to 'X' for the things it is good at.
Every language that is being used is being used for its advantages, generally.
What are the advantages of Prolog?
What are the general situations/categories of problems where one can use Prolog more effectively than other languages?
Compared to what, exactly? Prolog is really just the pre-eminent implementation of logic programming, so if your question is really about comparing programming paradigms, that's very broad indeed and you should look here.
If your question is more specifically about Prolog versus the more commonly seen OO languages, I would argue that you're really comparing apples to oranges - the "advantage" (such as it is) is just a different way of thinking about the world, and sometimes changing the way you ask a question provides a better tool for solving a problem.
Basically, if your program can be stated easily as declarative formal logic statements, Prolog (or another language in that family) will give the fastest development time. If you use a good Prolog compiler, it will also give the best performance and reliability, because the engine will have had a lot of design and development effort put into it.
Trying to implement this kind of thing in another language tends to be a mess. The cleanest and most general solution probably involves implementing your own unification engine. Even naive implementations aren't exactly trivial; the Warren Abstract Machine has a book or two written about it, and doing better will at the very least involve a fair bit of research and reading some headache-inducing papers.
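To give a feel for what even the naive end of that looks like, here is a minimal first-order unification sketch (illustrative Haskell with made-up names; it has the basic occurs check but none of the WAM's representation or performance tricks):

```haskell
module Unify where

import qualified Data.Map as Map
import           Data.Map (Map)

data Term = Var String | Fun String [Term]
  deriving (Eq, Show)

-- A substitution maps variable names to terms.
type Subst = Map String Term

-- Follow bindings until we reach an unbound variable or a functor.
walk :: Subst -> Term -> Term
walk s t@(Var v) = maybe t (walk s) (Map.lookup v s)
walk _ t         = t

-- Unify two terms under an existing substitution, extending it on success.
unify :: Subst -> Term -> Term -> Maybe Subst
unify s a b = go (walk s a) (walk s b)
  where
    go (Var x) (Var y) | x == y = Just s
    go (Var x) t                = bind s x t
    go t (Var y)                = bind s y t
    go (Fun f as) (Fun g bs)
      | f == g && length as == length bs = unifyAll s as bs
      | otherwise                        = Nothing
    unifyAll s' []     []     = Just s'
    unifyAll s' (x:xs) (y:ys) = unify s' x y >>= \s'' -> unifyAll s'' xs ys
    unifyAll _  _      _      = Nothing

-- Occurs check: refuse to bind a variable to a term that contains it.
bind :: Subst -> String -> Term -> Maybe Subst
bind s x t
  | occurs t  = Nothing
  | otherwise = Just (Map.insert x t s)
  where
    occurs u = case walk s u of
      Var y    -> y == x
      Fun _ ts -> any occurs ts

-- Example: unify f(X, b) with f(a, Y)  ==>  {X -> a, Y -> b}
example :: Maybe Subst
example = unify Map.empty (Fun "f" [Var "X", Fun "b" []])
                          (Fun "f" [Fun "a" [], Var "Y"])
```

And this is only unification; clause indexing, backtracking, and cut still have to be layered on top of it.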
Of course in the real world, key parts of your program may benefit from Prolog, but a lot of other stuff is better handled using another language. That's why a lot of Prolog compilers can interface with, e.g., C.
One of the best times to use Prolog is when you have a problem suited to solving with backtracking. And that is when you have lots of possible solutions to a problem, and perhaps you want to order them to include/exclude depending on some context. This suggests a lot of ambiguity... as in natural language processing.
It sure would be a lot tidier to write all the potential answers as Prolog clauses. With an imperative language, all I think you can really do is write a giant (really giant) CASE statement, which is not much fun.
The things that are inherent in Prolog:
pattern matching!
anything that involves a depth-first search (in Java, if you want to do a DFS, you may have to implement it with a visitor pattern or, again, a really giant CASE)
unification
??
Paul Graham is a Lisp person, but he nonetheless argues that Prolog is really good for 2% of problems. I would like to break that 2% down myself and figure out how he came up with such a number.
His argument for "better" languages is "less code, more power". Prolog is definitely "less code", and if you go for later flavours of it (typed ones), you get more power too. The only thing that bothered me when using Prolog was the fact that I don't have random access in lists (no arrays).
Prolog is a very high level programming language. An analogy could be (Prolog : C) as (C : Assembler)
Why is it not used that much, then? I think it has to do with the machines we use; they are based on Turing machines. C can be compiled to native machine code directly, but Prolog is compiled to run on an emulation of the Warren Abstract Machine, and thus it is not as efficient.
Also, Prolog is based on first-order logic, which is not capable of expressing every solvable problem in a declarative manner; thus, at some point, you need to rely on imperative-like code.
I'd say Prolog works well for problems where a knowledge base forms an important part of the solution, especially when the knowledge structure is suited to being encoded as logical rules.
For example, writing a natural language interpreter for a particular problem domain would require a lot of knowledge in that domain. Expert systems also fall within this knowledge driven category.
It's also a nice language to explore solutions to logical puzzles ;-)
I have been programming (for fun) for over a year with SWI-Prolog. I think one of the advantages of Prolog is that it has no side effects: Prolog is a language that kind of has no use for (local or class member) variables; it kind of forces the programmer not to use variables. Prolog objects have no state, kind of. I think. I have been writing command-line Prolog (no GUI, except a few XPCE tests): it is like a train on a track.
What types of applications have you used model checking for?
What model checking tool did you use?
How would you summarize your experience w/ the technique, specifically in evaluating its effectiveness in delivering higher quality software?
In the course of my studies, I had a chance to use Spin, and it aroused my curiosity as to how much actual model checking is going on and how much value organizations are getting out of it. In my work experience, I've worked on business applications, where there is (naturally) no consideration of applying formal verification to the logic. I'd really like to learn about SO folks' model-checking experience and thoughts on the subject. Will model checking ever become a more widely used development practice that we should have in our toolkit?
I just finished a class on model checking and the big tools we used were Spin and SMV. We ended up using them to check properties on common synchronization problems, and I found SMV just a little bit easier to use.
Although these tools were fun to use, I think they really shine when you combine them with something that dynamically enforces constraints on your program (so that it's a bit easier to verify 'useful' things about it). We ended up taking the Spring WebFlow framework, which uses XML to write a state-machine-like file specifying which web pages can transition to which other ones, and using SMV to perform verification on such applications (shameless plug here).
To answer your last question, I think model checking is definitely useful to have, but I lean more towards using unit testing as a technique that makes me feel comfortable about delivering my final product.
We have used several model checkers in teaching, systems design, and systems development. Our toolbox includes SPIN, UPPAAL, Java PathFinder, PVS, and Bogor. Each has its strengths and weaknesses. All find problems with models that are simply impossible for human beings to discover on their own. Their usability varies, though most are push-button automated.
When to use a model checker? I'd say any time you are describing a model that must have (or not have) particular properties and it is any larger than a handful of concepts. Anyone who thinks that they can describe and understand anything larger or more complex is fooling themselves.
What types of applications have you used model checking for?
We used the Java PathFinder model checker to verify some safety properties (deadlock, race conditions) and temporal properties (using linear temporal logic to specify them). It supports classical assertions (like NotNull) on Java (bytecode) - it is a program model checker.
What model checking tool did you use?
We used Java Path Finder (for academic purposes). It's open source software developed by NASA initially.
How would you summarize your experience w/ the technique, specifically in evaluating its effectiveness in delivering higher quality software?
Program model checking has a major problem with state-space explosion (memory and disk usage), but there is a wide variety of techniques to mitigate it and to handle large artifacts, such as partial-order reduction, abstraction, symmetry reduction, etc.
I used SPIN to find a concurrency issue in PLC software. It found an unsuspected race condition that would have been very tough to find by inspection or testing.
By the way, is there a "SPIN for Dummies" book? I had to learn it out of "The SPIN Model Checker" book and various on-line tutorials.
I've done some research on that subject during my time at the university, expanding the State Exploring Assembly Model Checker.
We used a virtual machine to walk each and every possible path/state of the program, using A* and some heuristic, depending on the kind of error (deadlock, I/O errors, ...)
It was inspired by Java Pathfinder and it worked with C++ code. (Everything GCC could compile)
But in our experience this kind of technology will not be used in business applications any time soon, because of GUI-related problems, the work necessary to create an initial test environment, and the enormous hardware requirements (you need lots of RAM and disk space because of the gigantic state space).