casting in implementing gflist_vt_mergesort$cmp - ats

In gflist_vt.sats, the signature of gflist_vt_mergesort$cmp implies that the order used for sorting must be the same as that of the stamp. I understand that if such a comparison function is given, the soundness of the function is ensured.
In this example, gflist_vt_mergesort$cmp seems to be implemented using unsafe casting.
Is it safe to do that? (i.e. Doesn't that cause any problem? e.g. What if the list is sorted multiple times with different ordering?)
Is there any other (safer) way?

Unsafe casting is inherently unsafe.
Unsafe casting can be removed. In order to do so, you need to implement the abstract type stamped_vt0ype (which is given the alias stamped_vt). For instance, to sort an integer list in descending order, you could do something like the following:
local
assume stamped_vt0ype(_, i) = int(~i)
in (* in-of-local *)
implement
{a}
gflist_vt_mergesort$cmp(x, y) = g1int_sgn(y - x)
end // end of [local]
However, doing something like this does not seem to offer much in terms of practical programming.


Consistent formulations of sets in Coq?

I'm quite new at Coq and trying to develop a framework based on my research. My work is quite definition-heavy and I'm having trouble encoding it because of how Coq seems to treat sets.
There are Type and Set, which they call 'sorts', and I can use them to define a new set:
Variable X: Type.
And then there's a library encoding (sub)sets as 'Ensembles', which are functions from some Type to a Prop. In other words, they are predicates on a Type:
Variable Y: Ensemble X.
Ensembles feel more like proper mathematical sets. Plus, they are built upon by many other libraries. I've tried focussing on them: defining one universal set U: Set, and then limiting myself to (sub)Ensembles on U. But no. Ensembles cannot be used as types for other variables, nor to define new subsets:
Variable y: Y. (* Error *)
Variable Z: Ensemble Y. (* Error *)
Now, I know there are several ways to get around that. The question "Subset parameter" offers two. Both use coercions. The first sticks to Sets. The second essentially uses Ensembles (though not by name). But both require quite some machinery to accomplish something so simple.
Question: What is the recommended way of consistently (and elegantly) handling sets?
Example: Here's an example of what I want to do: Assume a set DD. Define a pair dm = (D, <) where D is a finite subset of DD and < is a strict partial order on D.
I'm sure that with enough tinkering with coercions or other structures, I could accomplish it; but not in a particularly readable way; and without a good intuition of how to manipulate the structure further. For example, the following type-checks:
Record OrderedSet {DD: Set} : Type := {
D : (Ensemble DD);
order : (relation {d | In _ D d});
is_finite : (Finite _ D);
is_strict_partial : (is_strict_partial_order order)
}.
But I'm not so sure it's what I want; and it certainly doesn't look very pretty. Note that I'm going backwards and forwards between Set and Ensemble in a seemingly arbitrary way.
There are plenty of libraries out there which use Ensembles, so there must be a nice way to treat them, but those libraries don't seem to be documented very well (or... at all).
Update: To complicate matters further, there appear to be a number of other set implementations too, like MSets. This one seems to be completely separate and incompatible with Ensemble. It also uses bool rather than Prop for some reason. There is also FSets, but it appears to be an outdated version of MSets.
It's been (literally) years since I used Coq, but let me try to help.
I think, mathematically speaking, U: Set is like saying U is a universe of elements, and Ensemble U would then mean a set of elements from that universe. So for generic notions and definitions you will almost certainly use Set, and Ensemble is one possible way of reasoning about subsets of elements.
I'd suggest that you take a look at great work by Matthieu Sozeau who introduced type classes to Coq, a very useful feature based on Haskell's type classes. In particular in the standard library you will find a class-based definition of a PartialOrder that you mention in your question.
Another reference would be the CoLoR library formalizing notions needed to prove termination of term rewriting. It has a fairly large set of generic purpose definitions on orders and what-not.

C++ how to sort dynamically using lambda functions for a vector of unique_ptrs?

So I have a std::vector<std::unique_ptr<Base>> vec and I'm trying to sort it dynamically, given that there are logical comparisons between Derived1 to Derivedn (Derivedn always > Derivedn-1 > ... > Derived1) (say n = 10 or so) and each Derivedx has its own different comparison with Derivedx. As an example, think 10 digit integer > 9 digit integer > 1 digit integer, but within each derived class 53 > 32 (but I'm not sorting integers).
So I can do this:
std::sort(vec.begin(), vec.end(),
    [](const std::unique_ptr<Base>& a, const std::unique_ptr<Base>& b){
        return *a < *b;});
And then in Base, have a function Base::operator<(const Base& b) make the comparison if they are different Derived classes, and cast to Derivedx and use Derivedx::operator<(const Derivedx& d) if they are the same derived class.
However, I would think that there's a way that I can compare a to b automatically given the appropriate definitions in the derived classes, but I have been unable to implement it due to compile errors. I cannot get the lambda function to compare Derivedx < Derivedy dynamically.
I've tried Base::operator<(const std::unique_ptr<Base>) and then use return *a<b for a compiler error, saying that I used a deleted copy assignment operator (which I don't understand, where is the assignment??). An abstract virtual Base::operator<(const Base& b) does practically the same thing I'm doing now with more work because I have to implement Derivedx::operator<(const Base& b) (for each Derivedx) and then cast down to (Derivedx) if they're the same.
It may just be better to compare everything in the base class rather than implementing n^2 comparisons (n comparisons in each of n derived classes), though. But I do want to see if I can keep things "object oriented".
Any thoughts on the design issue?
Thanks.
Take a look at Item 31, "Making functions virtual with respect to more than one object", in Scott Meyers, More Effective C++.
Also, try googling on the phrases double dispatch and multiple dispatch.
Hmm... I probably would have overloaded operator< for the set of relevant variants. This would then be independent of any class hierarchy. But maybe this is not what you want.
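One way the ordering described in the question could be sketched, without full double dispatch: each class reports a rank, and a single free operator< compares ranks first and only falls back to a same-type comparison when the ranks match. The names rank(), lessSameType(), the value member, and sortByValue() are hypothetical and only serve the illustration; the sketch assumes each derived class has a distinct rank.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical hierarchy: rank() orders the classes themselves,
// lessSameType() orders two objects that share a dynamic type.
struct Base {
    virtual ~Base() = default;
    virtual int rank() const = 0;
    virtual bool lessSameType(const Base& other) const = 0;
};

// One comparison for the whole hierarchy: class rank first,
// then the class-specific comparison when both ranks are equal.
inline bool operator<(const Base& a, const Base& b) {
    if (a.rank() != b.rank()) return a.rank() < b.rank();
    return a.lessSameType(b);
}

struct Derived1 : Base {
    int value;
    explicit Derived1(int v) : value(v) {}
    int rank() const override { return 1; }
    bool lessSameType(const Base& other) const override {
        // Assumes distinct ranks per class, so equal ranks imply equal types.
        return value < static_cast<const Derived1&>(other).value;
    }
};

struct Derived2 : Base {
    int value;
    explicit Derived2(int v) : value(v) {}
    int rank() const override { return 2; }
    bool lessSameType(const Base& other) const override {
        return value < static_cast<const Derived2&>(other).value;
    }
};

// Sorts through the pointers; the lambda takes const references,
// so no unique_ptr is ever copied.
inline void sortByValue(std::vector<std::unique_ptr<Base>>& vec) {
    std::sort(vec.begin(), vec.end(),
              [](const std::unique_ptr<Base>& a, const std::unique_ptr<Base>& b) {
                  return *a < *b;
              });
}
```

With this layout each derived class implements one comparison against its own type, and the n^2 cross-class cases collapse into the rank check.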

Use cases of std::multimap

I don't quite get the purpose of this data structure. What's the difference between std::multimap<K, V> and std::map<K, std::vector<V>>? The same goes for std::multiset: it could just be std::map<K, int> where the int counts the number of occurrences of K. Am I missing something on the uses of these structures?
A counter-example seems to be in order.
Consider a PhoneEntry in an AddressList grouped by name.
// The comparator must be a type, so it is written as a function object.
struct AddressListCompare {
    bool operator()(const PhoneEntry& p1, const PhoneEntry& p2) const {
        return p1.name < p2.name;
    }
};
std::multiset<PhoneEntry, AddressListCompare> addressList;
addressList.insert( PhoneEntry("Cpt.G", "123-456", "Cellular") );
addressList.insert( PhoneEntry("Cpt.G", "234-567", "Work") );
// Getting the entries
addressList.equal_range( PhoneEntry("Cpt.G") ); // All numbers
This would not be feasible with a set+count. Your Object+count approach seems to be faster if this behavior is not required. For instance the multiset::count() member states
"Complexity: logarithmic in size +
linear in count."
You could make the substitutions that you suggest, and extract similar behavior. But the interfaces would be very different from those of regular standard containers. A major design theme of these containers is that they share as much interface as possible, making them as interchangeable as possible so that the appropriate container can be chosen without having to change the code that uses it.
For instance, std::map<K, std::vector<V>> would have iterators that dereference to std::pair<K, std::vector<V>> instead of std::pair<K, V>. std::map<K, std::vector<V>>::count() wouldn't return the correct result, failing to account for the duplicates in the vector. Of course you could change your code to do the extra steps needed to correct for this, but now you are interfacing with the container in a much different way. You can't later drop in unordered_map or some other map implementation to see if it performs better.
In a broader sense, you are breaking the container abstraction by handling container implementation details in your code rather than having a container that handles its own business.
It's entirely possible that your compiler's implementation of std::multimap is really just a wrapper around std::map<K, std::vector<V>>. Or it might not be. It could be more efficient and friendly to object pool allocation (which vectors are not).
Using std::map<K, int> instead of std::multiset is the same case. count() would not return the expected value, iterators would not iterate over the duplicates, and iterators would dereference to std::pair<K, int> instead of directly to K.
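A small sketch of that interface difference, using string keys and values as an assumption. The overloaded helper valuesFor() is hypothetical: for std::multimap it can simply forward to count(), while for the map-of-vectors substitute it has to look inside the mapped vector, because count() there only reports whether the key is present.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// std::multimap: count() already reports how many values share the key.
inline std::size_t valuesFor(const std::multimap<std::string, std::string>& m,
                             const std::string& key) {
    return m.count(key);
}

// std::map<K, std::vector<V>>: count() is only ever 0 or 1,
// so the caller has to inspect the vector stored under the key.
inline std::size_t valuesFor(const std::map<std::string, std::vector<std::string>>& m,
                             const std::string& key) {
    auto it = m.find(key);
    return it == m.end() ? 0 : it->second.size();
}
```

Code written against the multimap version keeps working if the container is swapped for std::unordered_multimap; the map-of-vectors version carries its extra step everywhere it is used.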
A multimap or multiset allows you to have elements with duplicate keys.
i.e., a set (in the mathematical sense) is an unordered group of elements that are all unique, so that {A,B,C} == {B,C,A}

Where is DropWhile in Mathematica?

Mathematica 6 added TakeWhile, which has the syntax:
TakeWhile[list, crit]
gives elements e_i from the beginning of list, continuing so long as crit[e_i] is True.
There is however no corresponding "DropWhile" function. One can construct DropWhile using LengthWhile and Drop, but it almost seems as though one is discouraged from using DropWhile. Why is this?
To clarify, I am not asking for a way to implement this function. Rather: why is it not already present? It seems to me that there must be a reason for its absence other than an oversight, or it would have been corrected by now. Is there something inefficient, undesirable, or superfluous about DropWhile?
There appears to be some ambiguity about the function of DropWhile, so here is an example:
DropWhile = Drop[#, LengthWhile[#, #2]] &;
DropWhile[{1,2,3,4,5}, # <= 3 &]
Out= {4, 5}
Just a blind guess.
There are a lot of list operations that could take a While criterion. For example:
Total..While
Accumulate..While
Mean..While
Map..While
Etc..While
They are not difficult to construct, anyway.
I think those are not included just because the number of "primitive" functions is already growing too large, and the criterion "is it frequently needed and difficult to implement with good performance by the user?" prevails in those cases.
The ubiquitous Lists in Mathematica are fixed-length vectors, and when they consist of machine numbers they are stored as packed arrays.
Thus the natural functions for a recursively defined linked list (e.g. in Lisp or Haskell) are not the primary tools in Mathematica.
So I am inclined to think this explains why Wolfram did not fill out its repertoire of manipulation functions.

Haskell caching results of a function

I have a function that takes a parameter and produces a result. Unfortunately, it takes quite long for the function to produce the result. The function is being called quite often with the same input, that's why it would be convenient if I could cache the results. Something like
let cachedFunction = createCache slowFunction
in (cachedFunction 3.1) + (cachedFunction 4.2) + (cachedFunction 3.1)
I was looking into Data.Array and although the array is lazy, I need to initialize it with a list of pairs (using listArray), which is impractical. If the 'key' is e.g. the 'Double' type, I cannot initialize it at all, and even if I can theoretically assign an Integer to every possible input, I have several tens of thousands of possible inputs and I only actually use a handful. I would need to initialize the array (or, preferably, a hash table, as only a handful of results will be used) using a function instead of a list.
Update: I am reading the memoization articles and as far as I understand it the MemoTrie could work the way I want. Maybe. Could somebody try to produce the 'cachedFunction'? Preferably for a slow function that takes 2 Double arguments? Or, alternatively, one that takes one Int argument in a domain of ~ [0..1 billion] and wouldn't eat all memory?
Well, there's Data.HashTable. Hash tables don't tend to play nicely with immutable data and referential transparency, though, so I don't think it sees a lot of use.
For a small number of values, stashing them in a search tree (such as Data.Map) would probably be fast enough. If you can put up with doing some mangling of your Doubles, a more robust solution would be to use a trie-like structure, such as Data.IntMap; these have lookup times proportional primarily to key length, and roughly constant in collection size. If Int is too limiting, you can dig around on Hackage to find trie libraries that are more flexible in the type of key used.
As for how to cache the results, I think what you want is usually called "memoization". If you want to compute and memoize results on demand, the gist of the technique is to define an indexed data structure containing all possible results, in such a way that when you ask for a specific result it forces only the computations needed to get the answer you want. Common examples usually involve indexing into a list, but the same principle should apply for any non-strict data structure. As a rule of thumb, non-function values (including infinite recursive data structures) will often be cached by the runtime, but not function results, so the trick is to wrap all of your computations inside a top-level definition that doesn't depend on any arguments.
Edit: MemoTrie example ahoy!
This is a quick and dirty proof of concept; better approaches may exist.
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
import Data.MemoTrie
import Data.Binary
import Data.ByteString.Lazy hiding (map)
mangle :: Double -> [Int]
mangle = map fromIntegral . unpack . encode
unmangle :: [Int] -> Double
unmangle = decode . pack . map fromIntegral
instance HasTrie Double where
    data Double :->: a = DoubleTrie ([Int] :->: a)
    trie f = DoubleTrie $ trie $ f . unmangle
    untrie (DoubleTrie t) = untrie t . mangle
slow x
    | x < 1 = 1
    | otherwise = slow (x / 2) + slow (x / 3)
memoSlow :: Double -> Integer
memoSlow = memo slow
Do note the GHC extensions used by the MemoTrie package; hopefully that isn't a problem. Load it up in GHCi and try calling slow vs. memoSlow with something like (10^6) or (10^7) to see it in action.
Generalizing this to functions taking multiple arguments or whatnot should be fairly straightforward. For further details on using MemoTrie, you might find this blog post by its author helpful.
See memoization
There are a number of tools in GHC's runtime system explicitly to support memoization.
Unfortunately, memoization isn't really a one-size fits all affair, so there are several different approaches that we need to support in order to cope with different user needs.
You may find the original 1999 writeup useful as it includes several implementations as examples:
Stretching the Storage Manager: Weak Pointers and Stable Names in Haskell by Simon Peyton Jones, Simon Marlow, and Conal Elliott
I will add my own solution, which seems to be quite slow as well. First parameter is a function that returns Int32 - which is unique identifier of the parameter. If you want to uniquely identify it by different means (e.g. by 'id'), you have to change the second parameter in H.new to a different hash function. I will try to find out how to use Data.Map and test if I get faster results.
import qualified Data.HashTable as H
import Data.Int
import System.IO.Unsafe
cache :: (a -> Int32) -> (a -> b) -> (a -> b)
cache ident f = unsafePerformIO $ createfunc
  where
    createfunc = do
      storage <- H.new (==) id
      return (doit storage)
    doit storage = unsafePerformIO . comp
      where
        comp x = do
          look <- H.lookup storage (ident x)
          case look of
            Just res -> return res
            Nothing -> do
              result <- return (f x)
              H.insert storage (ident x) result
              return result
You can write the slow function as a higher order function, returning a function itself. Thus you can do all the preprocessing inside the slow function and the part that is different in each computation in the returned (hopefully fast) function. An example could look like this:
(SML code, but the idea should be clear)
fun computeComplicatedThing (x:real) (y:real) = (* ... some very complicated computation *)
val computeComplicatedThingFast = computeComplicatedThing 3.14 (* provide x, do computation that needs only x *)
val result1 = computeComplicatedThingFast 2.71 (* provide y, do computation that needs x and y *)
val result2 = computeComplicatedThingFast 2.81
val result3 = computeComplicatedThingFast 2.91
I have several tens of thousands possible inputs and I only actually use a handful. I would need to initialize the array ... using a function instead of a list.
I'd go with listArray (start, end) (map func [start..end])
func doesn't really get called above. Haskell is lazy and creates thunks which will be evaluated when the value is actually required.
When using a normal array you always need to initialize its values. So the work required for creating these thunks is necessary anyhow.
Several tens of thousands is far from a lot. If you had trillions, then I would suggest using a hash table, yada yada.
I don't know Haskell specifically, but how about keeping existing answers in some hashed data structure (it might be called a dictionary or hash map)? You can wrap your slow function in another function that first checks the map and only calls the slow function if it hasn't found an answer.
You could make it fancy by limiting the size of the map to a certain size and when it reaches that, throwing out the least recently used entry. For this you would additionally need to keep a map of key-to-timestamp mappings.
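Since this suggestion is language-neutral, here is a minimal sketch of the wrapper idea in C++ rather than Haskell. The helper name makeCached is hypothetical, the key type is assumed to be hashable with std::hash, and the least-recently-used eviction mentioned above is deliberately left out to keep the sketch short.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical helper: returns a function that consults a hash map first
// and calls the slow function only for inputs it has not seen yet.
template <typename Arg, typename Result>
std::function<Result(const Arg&)> makeCached(std::function<Result(const Arg&)> slow) {
    // Shared so that copies of the returned function use the same cache.
    auto cache = std::make_shared<std::unordered_map<Arg, Result>>();
    return [slow = std::move(slow), cache](const Arg& x) -> Result {
        auto it = cache->find(x);
        if (it != cache->end()) return it->second;  // cache hit
        Result r = slow(x);                         // cache miss: compute once
        cache->emplace(x, r);
        return r;
    };
}
```

The template arguments have to be given explicitly when a lambda is passed, for example makeCached<double, double>(...), because they cannot be deduced through std::function. Nothing is ever evicted here, so the cache grows with the number of distinct inputs.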
