OK, imagine I have this matrix: {{1, 2}, {2, 3}}, and I'd rather have {{4, 1, 2}, {5, 2, 3}}. That is, I prepended a column to the matrix. Is there an easy way to do it?
My best proposal is this:
PrependColumn[vector_List, matrix_List] :=
Outer[Prepend[#1, #2] &, matrix, vector, 1]
But it obfuscates the code and constantly requires loading more and more code. Isn't this built in somehow?
Since ArrayFlatten was introduced in Mathematica 6, the least obfuscated solution must be
matrix = {{1, 2}, {2, 3}}
vector = {{4}, {5}}
ArrayFlatten@{{vector, matrix}}
A nice trick is that replacing any matrix block with 0 gives you a zero block of the right size.
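For example, the zero-block trick makes building a block-diagonal matrix from the matrix above a one-liner:
ArrayFlatten@{{matrix, 0}, {0, matrix}}
(* {{1, 2, 0, 0}, {2, 3, 0, 0}, {0, 0, 1, 2}, {0, 0, 2, 3}} *)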
I believe the most common way is to transpose, prepend, and transpose again:
PrependColumn[vector_List, matrix_List] :=
Transpose[Prepend[Transpose[matrix], vector]]
I think the least obscure way of doing this is:
PrependColumn[vector_List, matrix_List] := MapThread[Prepend, {matrix, vector}];
In general, MapThread is the function that you'll use most often for tasks like this one (I use it all the time when adding labels to arrays before formatting them nicely with Grid), and it can make things a lot clearer and more concise to use Prepend instead of the equivalent Prepend[#1, #2] &.
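For example, with the matrix and vector from the question:
MapThread[Prepend, {{{1, 2}, {2, 3}}, {4, 5}}]
(* {{4, 1, 2}, {5, 2, 3}} *)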
THE... ABSOLUTELY... BY FAR... FASTEST method to append or prepend a column, from my tests of various methods on the array RandomReal[100, {10^8, 5}] (kids, don't try this at home; if your machine isn't built for speed and memory, operations on an array this size are guaranteed to hang your computer)...
...is this: Append[tmp\[Transpose], Range@Length@tmp]\[Transpose].
Replace Append with Prepend at will.
The next fastest thing is this: Table[tmp[[n]] ~Join~ {n}, {n, Length@tmp}] - almost twice as slow.
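If you want to reproduce the comparison without risking your machine, here is a rough sketch on a much smaller array (exact timings will of course depend on your hardware):
tmp = RandomReal[100, {10^6, 5}];
AbsoluteTiming[Append[tmp\[Transpose], Range@Length@tmp]\[Transpose];]
AbsoluteTiming[Table[tmp[[n]] ~Join~ {n}, {n, Length@tmp}];]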
I am kind of new to Mathematica, and there are lots of AppendTo calls in my code which I think take up a lot of time. I know there are other ways to optimize this, but I don't know exactly how to do it. I think getBucketShocks can be improved a lot - anyone?
getBucketShocks[BucketPivots_,BucketShock_,parallelOffset_:0]:=
Module[{shocks,pivotsNb},
shocks={};
pivotsNb=Length[BucketPivots];
If[pivotsNb>1,
AppendTo[shocks,LinearFunction[{0,BucketShock},{BucketPivots[[1]],BucketShock},{BucketPivots[[2]],0},{BucketPivots[[2]],0},parallelOffset]];
Do[AppendTo[shocks,LinearFunction[{BucketPivots[[i-1]],0},{BucketPivots[[i]],BucketShock},{BucketPivots[[i+1]],0},{BucketPivots[[i+1]],0},parallelOffset]],{i,2,pivotsNb-1}];
AppendTo[shocks,LinearFunction[{BucketPivots[[pivotsNb-1]],0},{BucketPivots[[pivotsNb]],BucketShock},{BucketPivots[[pivotsNb]],BucketShock},{BucketPivots[[pivotsNb]],BucketShock},parallelOffset]],
If[pivotsNb==1,AppendTo[shocks,BucketShock+parallelOffset&]];
];
shocks];
LinearInterpolation[x_,{x1_,y1_},{x2_,y2_},parallelOffset_:0]:=parallelOffset+y1+(y2-y1)/(x2-x1)*(x-x1);
LinearFunction[p1_,p2_,p3_,p4_,parallelOffset_:0]:=Which[
#<=p1[[1]],parallelOffset+p1[[2]],
#<=p2[[1]],LinearInterpolation[#,p1,p2,parallelOffset],
#<=p3[[1]],LinearInterpolation[#,p2,p3,parallelOffset],
#<=p4[[1]],LinearInterpolation[#,p3,p4,parallelOffset],
#>p4[[1]],parallelOffset+p4[[2]]]&;
I think you can optimize the middle Do loop a lot by using some form of Map one way or another. At every iteration, you're trying to access 3 adjacent elements of BucketPivots. This seems easiest to do with MovingMap, but you need to jump through a few hoops to get the arguments in the right place. This is probably the easiest solution:
shocks = MovingMap[
LinearFunction[
{#[[1]], 0},
{#[[2]], BucketShock},
{#[[3]], 0},
{#[[3]], 0},
parallelOffset
]&,
BucketPivots,
2
]
As a general principle: if you want to do a Do or For loop in Mathematica that runs over the Length of another list, try to find a way you can do it with a function from the Map family (Map, MapIndexed, MapAt, MapThread, etc.) and get familiar with those. They are great substitutions for iterations!
After this, you can add the first and last elements of shocks with PrependTo and AppendTo; a full sketch along these lines is below.
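For illustration only, here is one possible sketch of the whole function without AppendTo. It uses Partition[..., 3, 1] (so the three-element window is explicit) instead of MovingMap, it is untested, it assumes LinearFunction as defined in the question, and the name getBucketShocksNew is made up:
getBucketShocksNew[BucketPivots_, BucketShock_, parallelOffset_: 0] :=
 Module[{n = Length[BucketPivots], first, middle, last},
  (* Partition[BucketPivots, 3, 1] gives every window of 3 adjacent pivots,
     playing the role MovingMap plays in the snippet above *)
  Which[
   n == 0, {},
   n == 1, {BucketShock + parallelOffset &},
   True,
   (
    first = LinearFunction[{0, BucketShock}, {BucketPivots[[1]], BucketShock},
      {BucketPivots[[2]], 0}, {BucketPivots[[2]], 0}, parallelOffset];
    middle = LinearFunction[{#[[1]], 0}, {#[[2]], BucketShock},
        {#[[3]], 0}, {#[[3]], 0}, parallelOffset] & /@ Partition[BucketPivots, 3, 1];
    last = LinearFunction[{BucketPivots[[n - 1]], 0}, {BucketPivots[[n]], BucketShock},
      {BucketPivots[[n]], BucketShock}, {BucketPivots[[n]], BucketShock}, parallelOffset];
    Join[{first}, middle, {last}]
   )
  ]]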
BTW, here's a free tip: in Mathematica, avoid giving variables and functions names that start with a capital letter (like you did with BucketPivots). All of Mathematica's own symbols start with capitals, so if you avoid starting your own names with them, you'll never clash with a built-in function.
I've been looking at the ways to check arguments of functions. I noticed that
MatrixQ takes 2 arguments, the second is a test to apply to each element.
But ListQ only takes one argument. (also for some reason, ?ListQ does not have a help page, like ?MatrixQ does).
So, for example, to check that an argument to a function is a matrix of numbers, I write
ClearAll[foo]
foo[a_?(MatrixQ[#, NumberQ] &)] := Module[{}, a + 1]
What would be a good way to do the same for a List? This below only checks that the input is a List
ClearAll[foo]
foo[a_?(ListQ[#] &)] := Module[{}, a + 1]
I could do something like this:
ClearAll[foo]
foo[a_?(ListQ[#] && (And @@ Map[NumberQ[#] &, #]) &)] := Module[{}, a + 1]
so that foo[{1, 2, 3}] will work, but foo[{1, 2, x}] will not (assuming x is a symbol). But it seems to me a somewhat complicated way to do this.
Question: do you know a better way to check that an argument is a list and also check that the list contents are numbers (or of any other head known to Mathematica)?
And a related question: Any major run-time performance issues with adding such checks to each argument? If so, do you recommend these checks be removed after testing and development is completed so that final program runs faster? (for example, have a version of the code with all the checks in, for the development/testing, and a version without for production).
You might use VectorQ in a way completely analogous to MatrixQ. For example,
f[vector_ /; VectorQ[vector, NumericQ]] := ...
Also note two differences between VectorQ and ListQ (illustrated below):
A plain VectorQ (with no second argument) only gives True if no elements of the list are lists themselves (i.e. only for 1D structures)
VectorQ will handle SparseArrays while ListQ will not
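For example (x below is assumed to be an undefined symbol):
VectorQ[{1, 2, 3}, NumericQ]       (* True *)
VectorQ[{1, 2, x}, NumericQ]       (* False *)
VectorQ[{{1}, {2}}]                (* False: elements are lists *)
VectorQ[SparseArray[{1, 2, 3}]]    (* True *)
ListQ[SparseArray[{1, 2, 3}]]      (* False *)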
I am not sure about the performance impact of using these in practice, I am very curious about that myself.
Here's a naive benchmark. I am comparing two functions: one that only checks the arguments, but does nothing, and one that adds two vectors (this is a very fast built-in operation, i.e. anything faster than this could be considered negligible). I am using NumericQ which is a more complex (therefore potentially slower) check than NumberQ.
In[2]:= add[a_ /; VectorQ[a, NumericQ], b_ /; VectorQ[b, NumericQ]] :=
a + b
In[3]:= nothing[a_ /; VectorQ[a, NumericQ],
b_ /; VectorQ[b, NumericQ]] := Null
Packed array. It can be verified that the check is constant time (not shown here).
In[4]:= rr = RandomReal[1, 10000000];
In[5]:= Do[add[rr, rr], {10}]; // Timing
Out[5]= {1.906, Null}
In[6]:= Do[nothing[rr, rr], {10}]; // Timing
Out[6]= {0., Null}
Homogeneous non-packed array. The check is linear time, but very fast.
In[7]:= rr2 = Developer`FromPackedArray@RandomInteger[10000, 1000000];
In[8]:= Do[add[rr2, rr2], {10}]; // Timing
Out[8]= {1.75, Null}
In[9]:= Do[nothing[rr2, rr2], {10}]; // Timing
Out[9]= {0.204, Null}
Non-homogeneous non-packed array. The check takes the same time as in the previous example.
In[10]:= rr3 = Join[rr2, {Pi, 1.0}];
In[11]:= Do[add[rr3, rr3], {10}]; // Timing
Out[11]= {5.625, Null}
In[12]:= Do[nothing[rr3, rr3], {10}]; // Timing
Out[12]= {0.282, Null}
Conclusion based on this very simple example:
VectorQ is highly optimized, at least when using common second arguments. It's much faster than e.g. adding two vectors, which itself is a well optimized operation.
For packed arrays VectorQ is constant time.
@Leonid's answer is very relevant too; please see it.
Regarding the performance hit (since your first question has been answered already) - by all means, do the checks, but in your top-level functions (those which receive data directly from the user of your functionality; the user can also be another independent module, written by you or someone else). Don't put these checks in all your intermediate functions, since such checks would be duplicated and indeed unjustified.
EDIT
To address the problem of errors in intermediate functions, raised by @Nasser in the comments: there is a very simple technique which allows one to switch pattern checks on and off in "one click". You can store your patterns in variables inside your package, defined prior to your function definitions.
Here is an example, where f is a top-level function, while g and h are "inner functions". We define two patterns: for the main function and for the inner ones, like so:
Clear[nlPatt,innerNLPatt ];
nlPatt= _?(!VectorQ[#,NumericQ]&);
innerNLPatt = nlPatt;
Now, we define our functions:
ClearAll[f,g,h];
f[vector:nlPatt]:=g[vector]+h[vector];
g[nv:innerNLPatt ]:=nv^2;
h[nv:innerNLPatt ]:=nv^3;
Note that the patterns are substituted inside the definitions at definition time, not run time, so this is exactly equivalent to coding those patterns by hand. Once you are done testing, you just have to change one line: from
innerNLPatt = nlPatt
to
innerNLPatt = _
and reload your package.
A final question is - how do you quickly find errors? I answered that here, in sections "Instead of returning $Failed, one can throw an exception, using Throw.", and "Meta-programming and automation".
END EDIT
I included a brief discussion of this issue in my book here. In that example, the performance hit was on the level of a 10% increase in running time, which IMO is borderline acceptable. In the case at hand, the check is simpler and the performance penalty is much smaller. Generally, for any function which is computationally intensive, correctly-written type checks cost only a small fraction of the total run-time.
A few tricks which are good to know:
The pattern matcher can be very fast when used syntactically (no Condition or PatternTest present in the pattern).
For example:
randomString[]:=FromCharacterCode#RandomInteger[{97,122},5];
rstest = Table[randomString[],{1000000}];
In[102]:= MatchQ[rstest,{__String}]//Timing
Out[102]= {0.047,True}
In[103]:= MatchQ[rstest,{__?StringQ}]//Timing
Out[103]= {0.234,True}
Just because PatternTest was used in the latter case, the check is much slower: the evaluator is invoked by the pattern matcher for every element, while in the first case everything is purely syntactic and handled entirely inside the pattern matcher.
The same is true for unpacked numerical lists (the timing difference is similar). However, for packed numerical lists, MatchQ and other pattern-testing functions don't unpack for certain special patterns; moreover, for some of them the check is instantaneous.
Here is an example:
In[113]:=
test = RandomInteger[100000,1000000];
In[114]:= MatchQ[test,{__?IntegerQ}]//Timing
Out[114]= {0.203,True}
In[115]:= MatchQ[test,{__Integer}]//Timing
Out[115]= {0.,True}
In[116]:= Do[MatchQ[test,{__Integer}],{1000}]//Timing
Out[116]= {0.,Null}
The same appears to be true for functions like VectorQ, MatrixQ and ArrayQ with certain predicates (NumericQ): these tests are extremely efficient.
A lot depends on how you write your test, i.e. to what degree you reuse the efficient Mathematica structures.
For example, we want to test that we have a real numeric matrix:
In[143]:= rm = RandomInteger[10000,{1500,1500}];
Here is the most straightforward (and slowest) way:
In[144]:= MatrixQ[rm,NumericQ[#]&&Im[#]==0&]//Timing
Out[144]= {4.125,True}
This is better, since we reuse the pattern-matcher better:
In[145]:= MatrixQ[rm,NumericQ]&&FreeQ[rm,Complex]//Timing
Out[145]= {0.204,True}
We did not utilize the packed nature of the matrix however. This is still better:
In[146]:= MatrixQ[rm,NumericQ]&&Total[Abs[Flatten[Im[rm]]]]==0//Timing
Out[146]= {0.047,True}
However, this is not the end. The following one is near instantaneous:
In[147]:= MatrixQ[rm,NumericQ]&&Re[rm]==rm//Timing
Out[147]= {0.,True}
Since ListQ just checks that the head is List, the following is a simple solution:
foo[a:{___?NumberQ}] := Module[{}, a + 1]
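For example:
foo[{1, 2, 3}]
(* {2, 3, 4} *)
foo[{1, 2, x}]
(* returned unevaluated, since the pattern does not match *)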
I have an issue here which I don't know how to solve in a good way. For example, I want to use BaseForm[1/3, 3]. However, this does not do what I intended unless I input BaseForm[1/3., 3]. Given data in Rational form, how do I turn it into a Real? I tried Apply; it does not work. (Strange, huh? To me, Apply can always be used to change the head.)
For this specific problem, I could have done something like BaseForm[1/3*1., 3], but it really isn't very nice.
Thanks for your help.
BaseForm takes a rational in base 10 to a rational in whatever base you want, so it does what you expect.
In[1]:= BaseForm[1/3,3]
Out[1]//BaseForm= Subscript[1, 3]/Subscript[10, 3]
And as you pointed out, giving it a Real number can be done like:
In[2]:= BaseForm[1/3.,3]
Out[2]//BaseForm= Subscript[0.1, 3]
The safest way to change things would be to define your own baseForm, which is the same as BaseForm except when it's given rational numbers:
baseForm[r_Rational,b_]:=BaseForm[N[r],b]
Then
In[3]:= baseForm[1/3,3]
Out[3]//BaseForm= Subscript[0.1, 3]
The less safe way (because you don't know what else it might break) is to redefine BaseForm
Unprotect[BaseForm];
BaseForm[r_Rational, b_] := BaseForm[N[r], b]
Protect[BaseForm];
and then use as normal.
I may be missing the subtlety of your request, but if you always want a real-number output, why not merely use N?
BaseForm[N[1/3], 3]
(* Out//BaseForm= Subscript[0.1, 3] *)
Mathematica 6 added TakeWhile, which has the syntax:
TakeWhile[list, crit]
gives elements ei from the beginning of list, continuing so long as crit[ei] is True.
There is however no corresponding "DropWhile" function. One can construct DropWhile using LengthWhile and Drop, but it almost seems as though one is discouraged from using DropWhile. Why is this?
To clarify, I am not asking for a way to implement this function. Rather: why is it not already present? It seems to me that there must be a reason for its absence other than an oversight, or it would have been corrected by now. Is there something inefficient, undesirable, or superfluous about DropWhile?
There appears to be some ambiguity about the function of DropWhile, so here is an example:
DropWhile = Drop[#, LengthWhile[#, #2]] &;
DropWhile[{1,2,3,4,5}, # <= 3 &]
Out= {4, 5}
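For comparison, TakeWhile with the same criterion returns the complementary prefix:
TakeWhile[{1, 2, 3, 4, 5}, # <= 3 &]
Out= {1, 2, 3}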
Just a blind guess.
There are a lot of list operations that could take a While criterion. For example:
Total..While
Accumulate..While
Mean..While
Map..While
Etc..While
They are not difficult to construct, anyway.
I think those are not included just because the number of "primitive" functions is already growing too long, and the criterion "is it frequently needed and difficult to implement with good performance by the user?" prevails in those cases.
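For example, here is what a couple of them might look like (totalWhile and mapWhile are made-up names, just to illustrate how short the definitions are):
totalWhile[list_, crit_] := Total@TakeWhile[list, crit];
mapWhile[f_, list_, crit_] := f /@ TakeWhile[list, crit];

totalWhile[{1, 2, 3, 10, 1}, # < 5 &]
Out= 6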
The ubiquitous Lists in Mathematica are fixed-length vectors, and when they consist of machine numbers they are stored as packed arrays.
Thus the natural functions for a recursively defined linked list (e.g. in Lisp or Haskell) are not the primary tools in Mathematica.
So I am inclined to think this explains why Wolfram did not fill out its repertoire of manipulation functions.
FIRST PROBLEM
I have timed how long it takes to compute the following statements (where V[x] is a time-intensive function call):
Alice = Table[V[i],{i,1,300},{1000}];
Bob = Table[Table[V[i],{i,1,300}],{1000}]\[Transpose];
Chris_pre = Table[V[i],{i,1,300}];
Chris = Table[Chris_pre,{1000}]\[Transpose];
Alice, Bob, and Chris are identical matrices computed in 3 slightly different ways. I find that Chris is computed 1000 times faster than Alice and Bob.
It is not surprising that Alice is computed 1000 times slower because, naively, the function V must be called 1000 times more often than when Chris is computed. But it is very surprising that Bob is so slow, since he is computed identically to Chris except that Chris stores the intermediate step Chris_pre.
Why does Bob evaluate so slowly?
SECOND PROBLEM
Suppose I want to compile a function in Mathematica of the form
f(x)=x+y
where "y" is a constant fixed at compile time (but which I prefer not to directly replace in the code with its numerical because I want to be able to easily change it). If y's actual value is y=7.3, and I define
f1=Compile[{x},x+y]
f2=Compile[{x},x+7.3]
then f1 runs 50% slower than f2. How do I make Mathematica replace "y" with "7.3" when f1 is compiled, so that f1 runs as fast as f2?
EDIT:
I found an ugly workaround for the second problem:
f1=ReleaseHold[Hold[Compile[{x},x+z]]/.{z->y}]
There must be a better way...
You probably should've posted these as separate questions, but no worries!
Problem one
The problem with Alice is of course what you expect. The problem with Bob is that the inner Table is evaluated once per iteration of the outer Table. This is clearly visible with Trace:
Trace[Table[Table[i, {i, 1, 2}], {2}]]
{
Table[Table[i,{i,1,2}],{2}],
{Table[i,{i,1,2}],{i,1},{i,2},{1,2}},{Table[i,{i,1,2}],{i,1},{i,2},{1,2}},
{{1,2},{1,2}}
}
Line breaks added for emphasis, and yeah, the output of Trace on Table is a little weird, but you can see it. Clearly Mathematica could optimize this better, knowing that the outer Table has no iterator, but for whatever reason it doesn't take that into account. Only Chris does what you want, though you could modify Bob:
Transpose[Table[Evaluate[Table[V[i],{i,1,300}]],{1000}]]
This looks like it actually outperforms Chris by a factor of two or so, because it doesn't have to store the intermediate result.
Problem two
There's a simpler solution with Evaluate, though I expect it won't work with all possible functions to be compiled (i.e. ones that really should be Held):
f1 = Compile[{x}, Evaluate[x + y]];
You could also use a With:
With[{y=7.3},
f1 = Compile[{x}, x + y];
]
Or if y is defined elsewhere, use a temporary:
y = 7.3;
With[{z = y},
f1 = Compile[{x}, x + z];
]
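Assuming y = 7.3 as above, a quick check that the value really was baked in at compile time:
f1[1.0]
(* 8.3 *)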
I'm not an expert on Mathematica's scoping and evaluation mechanisms, so there could easily be a much better way, but hopefully one of those does it for you!
Your first problem has already been explained, but I want to point out that ConstantArray was introduced in Mathematica 6 to address this issue. Prior to that time Table[expr, {50}] was used for both fixed and changing expressions.
Since the introduction of ConstantArray there is a clear separation between iteration with reevaluation and simple duplication of an expression. You can see the behavior using this:
ConstantArray[Table[Pause[1]; i, {i, 5}], {50}] ~Monitor~ i
It takes five seconds to loop through Table because of Pause[1], but after that loop is complete it is not reevaluated and the 50 copies are immediately printed.
First problem
Have you checked the output of the Chris_pre computation? You will find that it is not a large matrix at all, since you're trying to store an intermediate result in a pattern rather than a variable. Try ChrisPre instead. Then all the timings are comparable.
Second problem
Compile has a number of tricky restrictions on its use. One issue is that you cannot refer to global variables. The With construct that was already suggested is the usual way around this. If you want to learn more about Compile, check out Ted Ersek's tricks:
http://www.verbeia.com/mathematica/tips/Tricks.html