Show a number with specified number of significant digits - wolfram-mathematica

I use the following function to convert a number to a string for display purposes (don't use scientific notation, don't use a trailing dot, round as specified):
(* Show Number. Convert to string w/ no trailing dot. Round to the nearest r. *)
Unprotect[Round]; Round[x_,0] := x; Protect[Round];
shn[x_, r_:0] := StringReplace[
ToString#NumberForm[Round[N#x,r], ExponentFunction->(Null&)], re#"\\.$"->""]
(Note that re is an alias for RegularExpression.)
That's been serving me well for years.
But sometimes I don't want to specify the number of digits to round to, rather I want to specify a number of significant figures.
For example, 123.456 should display as 123.5 but 0.00123456 should display as 0.001235.
To get really fancy, I might want to specify significant digits both before and after the decimal point.
For example, I might want .789 to display as 0.8 but 789.0 to display as 789 rather than 800.
Do you have a handy utility function for this sort of thing, or suggestions for generalizing my function above?
Related: Suppressing a trailing "." in numerical output from Mathematica
UPDATE: I tried asking a general version of this question here:
https://stackoverflow.com/questions/5627185/displaying-numbers-to-non-technical-users

dreeves, I think I finally understand what you want, and you already had it, pretty much. If not, please try again to explain what I am missing.
shn2[x_, r_: 0] :=
StringReplace[
ToString#NumberForm[x, r, ExponentFunction -> (Null &)],
RegularExpression#"\\.0*$" -> ""]
Testing:
shn2[#, 4] & /# {123.456, 0.00123456}
shn2[#, {3, 1}] & /# {789.0, 0.789}
shn2[#, {10, 2}] & /# {0.1234, 1234.}
shn2[#, {4, 1}] & /# {12.34, 1234.56}
Out[1]= {"123.5", "0.001235"}
Out[2]= {"789", "0.8"}
Out[3]= {"0.12", "1234"}
Out[4]= {"12.3", "1235"}

This may not be the complete answer (you need to convert from/to string), but this function takes arguments a number x and significant figures sig wanted. The number of digits it keeps is the maximum of sig or the number of digits to the left of the decimal.
A[x_,sig_]:=NumberForm[x, Max[Last[RealDigits[x]], sig]]
RealDigits

Here's a possible generalization of my original function.
(I've determined that it's not equivalent to Mr Wizard's solution but I'm not sure yet which I think is better.)
re = RegularExpression;
(* Show Number. Convert to string w/ no trailing dot. Use at most d significant
figures after the decimal point. Target t significant figures total (clipped
to be at least i and at most i+d, where i is the number of digits in integer
part of x). *)
shn[x_, d_:5, t_:16] := ToString[x]
shn[x_?NumericQ, d_:5, t_:16] := With[{i= IntegerLength#IntegerPart#x},
StringReplace[ToString#NumberForm[N#x, Clip[t, {i,i+d}],
ExponentFunction->(Null&)],
re#"\\.$"->""]]
Testing:
Here we specify 4 significant digits, but never dropping any to the left of the decimal point and never using more than 2 significant digits to the right of the decimal point.
(# -> shn[#, 2, 4])& /#
{123456, 1234.4567, 123.456, 12.345, 1.234, 1.0001, 0.123, .0001234}
{ 123456 -> "123456",
1234.456 -> "1234",
123.456 -> "123.5"
12.345 -> "12.35",
1.234 -> "1.23",
1.0001 -> "1",
0.123 -> "0.12",
0.0001234 -> "0.00012" }

Related

Mathematica: Print approximate number using E-notation with specified accuracy

In Mathematica, what is the easiest way to print a given approximate number using E-notation with a specified accuracy? For example given:
a = 1.23456789 10^-23
how can a be printed to 3 significant digits to get:
1.23E-23
Notice that trying FortranForm[] with N[] does not work since N[] will not truncate an approximate number.
After some playing around I found an answer:
NumberForm[a, {5, 3},
NumberFormat -> (SequenceForm[#1, "E", #3] &),
ExponentFunction -> (# &)]
If you want to make this more robust and be able to handle rational numbers as well, replace a by N[a,3] .
Notice that the ExponentFunction is needed since a number like 1.2345 would be outputted as 1.23E .
You may use ScientificForm with the NumberFormat option.
With
a = 1.23456789 10^-23;
Then
ScientificForm[a, 3]
1.23*10^(-23)
The output can be further formatted to you spec with the NumberFormat option.
ScientificForm[a, 3, NumberFormat -> (Row[{#1, "E", #3}] &)]
1.23E-23
Hope this helps.

How can I prevent these warnings while using ParallelTable

I have this code to find all the permutations of a set of letters that form legal words.
>>Combinatorica`
Module[{str = "abc", chars, len, r, check},
chars = Characters[str];
len = StringLength[str];
r = Range[len];
check[n_Integer] :=
DictionaryLookup[{"BritishEnglish",
StringJoin[chars[[UnrankPermutation[n, r]]]]}, 1];
DistributeDefinitions[check, chars, r];
ParallelTable[check[i], {i, 1, len!}]]
I've verified that, if I replace ParallelTable with Table, I get this:
{{}, {}, {}, {"cab"}, {}, {}}
With ParallelTable, however, in addition to that result, I also get warnings like these:
Part::pspec: Part specification Combinatorica`UnrankPermutation[1,{1,2,3}] is neither a machine-sized integer nor a list of machine-sized integers.
Part::pspec: Part specification Combinatorica`UnrankPermutation[2,{1,2,3}] is neither a machine-sized integer nor a list of machine-sized integers.
StringJoin::string: String expected at position 1 in StringJoin[{a,b,c}[[Combinatorica`UnrankPermutation[1,{1,2,3}]]]].
StringJoin::string: String expected at position 1 in StringJoin[{a,b,c}[[Combinatorica`UnrankPermutation[2,{1,2,3}]]]].
These warnings seem to come from kernel 7 and higher. My guess is that the computation reaches those kernels and there isn't any data left, since there are only 6 permutations, and causes them to spit out those warnings.
Is my understanding correct? How do I prevent these warnings?
I don't think that's it - if it was the case this simple test would fail too:
ParallelTable[k^2,{k,3}] (* Assuming more than 3 kernels *)
... which runs just fine.
Rather, it seems to me that the function UnrankPermutations[] behaves badly under ParallelTable as you can see from running the simplified version (which also fails):
ParallelTable[Part[chars, UnrankPermutation[n, r]], {n, len!}]
I am not sure that the brute force approach you are taking with this is a good one (consider what happens when word lengths exceed 10 characters), but a work-around following that idea is this:
list = LexicographicPermutations[chars]
ParallelMap[DictionaryLookup[{"BritishEnglish", #}] &,
Map[StringJoin[#] &, list]]
Good luck!

How to construct a list of all Fibonacci numbers less than n in Mathematica

I would like to write a Mathematica function that constructs a list of all Fibonacci numbers less than n. Moreover, I would like to do this as elegantly and functionally as possible(so without an explicit loop).
Conceptually I want to take an infinite list of the natural numbers, map Fib[n] onto it, and then take elements from this list while they are less than n. How can I do this in Mathematica?
The first part can be done fairly easily in Mathematica. Below, I provide two functions nextFibonacci, which provides the next Fibonacci number greater than the input number (just like NextPrime) and fibonacciList, which provides a list of all Fibonacci numbers less than the input number.
ClearAll[nextFibonacci, fibonacciList]
nextFibonacci[m_] := Fibonacci[
Block[{n},
NArgMax[{n, 1/Sqrt[5] (GoldenRatio^n - (-1)^n GoldenRatio^-n) <= m, n ∈ Integers}, n]
] + 1
]
nextFibonacci[1] := 2;
fibonacciList[m_] := Fibonacci#
Range[0, Block[{n},
NArgMax[{n, 1/Sqrt[5] (GoldenRatio^n - (-1)^n GoldenRatio^-n) < m, n ∈ Integers}, n]
]
]
Now you can do things like:
nextfibonacci[15]
(* 21 *)
fibonacciList[50]
(* {0, 1, 1, 2, 3, 5, 8, 13, 21, 34} *)
The second part though, is tricky. What you're looking for is a Haskell type lazy evaluation that will only evaluate if and when necessary (as otherwise, you can't hold an infinite list in memory). For example, something like (in Haskell):
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
which then allows you to do things like
take 10 fibs
-- [0,1,1,2,3,5,8,13,21,34]
takeWhile (<100) fibs
-- [0,1,1,2,3,5,8,13,21,34,55,89]
Unfortunately, there is no built-in support for what you want. However, you can extend Mathematica to implement lazy style lists as shown in this answer, which was also implemented as a package. Now that you have all the pieces that you need, I'll let you work on this yourself.
If you grab my Lazy package from GitHub, your solution is as simple as:
Needs["Lazy`"]
LazySource[Fibonacci] ~TakeWhile~ ((# < 1000) &) // List
If you want to slightly more literally implement your original description
Conceptually I want to take an infinite list of the natural numbers, map Fib[n] onto it, and then take elements from this list while they are less than n.
you could do it as follows:
Needs["Lazy`"]
Fibonacci ~Map~ Lazy[Integers] ~TakeWhile~ ((# < 1000) &) // List
To prove that this is completely lazy, try the previous example without the // List on the end. You'll see that it stops with the (rather ugly) form:
LazyList[First[
LazyList[Fibonacci[First[LazyList[1, LazySource[#1 &, 2]]]],
Fibonacci /# Rest[LazyList[1, LazySource[#1 &, 2]]]]],
TakeWhile[
Rest[LazyList[Fibonacci[First[LazyList[1, LazySource[#1 &, 2]]]],
Fibonacci /# Rest[LazyList[1, LazySource[#1 &, 2]]]]], #1 <
1000 &]]
This consists of a LazyList[] expression whose first element is the first value of the expression that you're lazily evaluating and whose second element is instructions for how to continue the expansion.
Improvements
It's a little bit inefficient to continually call Fibonacci[n] all the time, especially as n starts getting large. It's actually possible to construct a lazy generator that will calculate the current value of the Fibonacci sequence as we stream:
Needs["Lazy`"]
LazyFibonacci[a_,b_]:=LazyList[a,LazyFibonacci[b,a+b]]
LazyFibonacci[]:=LazyFibonacci[1,1]
LazyFibonacci[] ~TakeWhile~ ((# < 1000)&) // List
Finally, we could generalize this up to a more abstract generating function that takes an initial value for an accumulator, a List of Rules to compute the accumulator's value for the next step and a List of Rules to compute the result from the current accumulator value.
LazyGenerator[init_, step_, extract_] :=
LazyList[Evaluate[init /. extract],
LazyGenerator[init /. step, step, extract]]
And could use it to generate the Fibonacci sequence as follows:
LazyGenerator[{1, 1}, {a_, b_} :> {b, a + b}, {a_, b_} :> a]
Ok, I hope I understood the question. But please note, I am not pure math major, I am mechanical engineering student. But this sounded interesting. So I looked up the formula and this is what I can come up with now. I have to run, but if there is a bug, please let me know and I will fix it.
This manipulate asks for n and then lists all Fibonacci numbers less than n. There is no loop to find how many Fibonacci numbers there are less than n. It uses Reduce to solve for the number of Fibonacci numbers less than n. I take the floor of the result and also threw away a constant that came up with in the solution a complex multiplier.
And then simply makes a table of all these numbers using Mathematica Fibonacci command. So if you enter n=20 it will list 1,1,2,3,5,8,13 and so on. I could do it for infinity as I ran out of memory (I only have 8 GB ram on my pc).
I put the limit for n to 500000 Feel free to edit the code and change it.
Manipulate[
Module[{k, m},
k = Floor#N[Assuming[Element[m, Integers] && m > 0,
Reduce[f[m] == n, m]][[2, 1, 2]] /. Complex[0, 2] -> 0];
TableForm#Join[{{"#", "Fibonacci number" }},
Table[{i, Fibonacci[i]}, {i, 1, k}]]
],
{{n, 3, "n="}, 2, 500000, 1, Appearance -> "Labeled", ImageSize -> Small},
SynchronousUpdating -> False,
ContentSize -> {200, 500}, Initialization :>
{
\[CurlyPhi][n_] := ((1 + Sqrt[5])/2)^n;
\[Psi][n_] := -(1/\[CurlyPhi][n]);
f[n_] := (\[CurlyPhi][n] - \[Psi][n])/Sqrt[5];
}]
Screen shot
The index k of the Fibonacci number Fk is k=Floor[Log[GoldenRatio,Fk]*Sqrt[5]+1/2]],
https://en.wikipedia.org/wiki/Fibonacci_number. Hence, the list of Fibonacci numbers less than or equal to n is
FibList[n_Integer]:=Fibonacci[Range[Floor[Log[GoldenRatio,Sqrt[5]*n+1/2]]]]

Algorithm for picking pattern free downvalues from a sparse definition list

I have the following problem.
I am developing a stochastic simulator which samples configurations of the system at random and stores the statistics of how many times each configuration has been visited at certain time instances. Roughly the code works like this
f[_Integer][{_Integer..}] :=0
...
someplace later in the code, e.g.,
index = get index;
c = get random configuration (i.e. a tuple of integers, say a pair {n1, n2});
f[index][c] = f[index][c] + 1;
which tags that configuration c has occurred once more in the simulation at time instance index.
Once the code has finished there is a list of definitions for f that looks something like this (I typed it by hand just to emphasize the most important parts)
?f
f[1][{1, 2}] = 112
f[1][{3, 4}] = 114
f[2][{1, 6}] = 216
f[2][{2, 7}] = 227
...
f[index][someconfiguration] = some value
...
f[_Integer][{_Integer..}] :=0
Please note that pattern free definitions that come first can be rather sparse. Also one cannot know which values and configurations will be picked.
The problem is to efficiently extract down values for a desired index, for example issue something like
result = ExtractConfigurationsAndOccurences[f, 2]
which should give a list with the structure
result = {list1, list2}
where
list1 = {{1, 6}, {2, 7}} (* the list of configurations that occurred during the simulation*)
list2 = {216, 227} (* how many times each of them occurred *)
The problem is that ExtractConfigurationsAndOccurences should be very fast. The only solution I could come up with was to use SubValues[f] (which gives the full list) and filter it with Cases statement. I realize that this procedure should be avoided at any cost since there will be exponentially many configurations (definitions) to test, which slows down the code considerably.
Is there a natural way in Mathematica to do this in a fast way?
I was hoping that Mathematica would see f[2] as a single head with many down values but using DownValues[f[2]] gives nothing. Also using SubValues[f[2]] results in an error.
This is a complete rewrite of my previous answer. It turns out that in my previous attempts, I overlooked a much simpler method based on a combination of packed arrays and sparse arrays, that is much faster and more memory - efficient than all previous methods (at least in the range of sample sizes where I tested it), while only minimally changing the original SubValues - based approach. Since the question was asked about the most efficient method, I will remove the other ones from the answer (given that they are quite a bit more complex and take a lot of space. Those who would like to see them can inspect past revisions of this answer).
The original SubValues - based approach
We start by introducing a function to generate the test samples of configurations for us. Here it is:
Clear[generateConfigurations];
generateConfigurations[maxIndex_Integer, maxConfX_Integer, maxConfY_Integer,
nconfs_Integer] :=
Transpose[{
RandomInteger[{1, maxIndex}, nconfs],
Transpose[{
RandomInteger[{1, maxConfX}, nconfs],
RandomInteger[{1, maxConfY}, nconfs]
}]}];
We can generate a small sample to illustrate:
In[3]:= sample = generateConfigurations[2,2,2,10]
Out[3]= {{2,{2,1}},{2,{1,1}},{1,{2,1}},{1,{1,2}},{1,{1,2}},
{1,{2,1}},{2,{1,2}},{2,{2,2}},{1,{2,2}},{1,{2,1}}}
We have here only 2 indices, and configurations where both "x" and "y" numbers vary from 1 to 2 only - 10 such configurations.
The following function will help us imitate the accumulation of frequencies for configurations, as we increment SubValues-based counters for repeatedly occurring ones:
Clear[testAccumulate];
testAccumulate[ff_Symbol, data_] :=
Module[{},
ClearAll[ff];
ff[_][_] = 0;
Do[
doSomeStuff;
ff[#1][#2]++ & ## elem;
doSomeMoreStaff;
, {elem, data}]];
The doSomeStuff and doSomeMoreStaff symbols are here to represent some code that might preclude or follow the counting code. The data parameter is supposed to be a list of the form produced by generateConfigurations. For example:
In[6]:=
testAccumulate[ff,sample];
SubValues[ff]
Out[7]= {HoldPattern[ff[1][{1,2}]]:>2,HoldPattern[ff[1][{2,1}]]:>3,
HoldPattern[ff[1][{2,2}]]:>1,HoldPattern[ff[2][{1,1}]]:>1,
HoldPattern[ff[2][{1,2}]]:>1,HoldPattern[ff[2][{2,1}]]:>1,
HoldPattern[ff[2][{2,2}]]:>1,HoldPattern[ff[_][_]]:>0}
The following function will extract the resulting data (indices, configurations and their frequencies) from the list of SubValues:
Clear[getResultingData];
getResultingData[f_Symbol] :=
Transpose[{#[[All, 1, 1, 0, 1]], #[[All, 1, 1, 1]], #[[All, 2]]}] &#
Most#SubValues[f, Sort -> False];
For example:
In[10]:= result = getResultingData[ff]
Out[10]= {{2,{2,1},1},{2,{1,1},1},{1,{2,1},3},{1,{1,2},2},{2,{1,2},1},
{2,{2,2},1},{1,{2,2},1}}
To finish with the data-processing cycle, here is a straightforward function to extract data for a fixed index, based on Select:
Clear[getResultsForFixedIndex];
getResultsForFixedIndex[data_, index_] :=
If[# === {}, {}, Transpose[#]] &[
Select[data, First## == index &][[All, {2, 3}]]];
For our test example,
In[13]:= getResultsForFixedIndex[result,1]
Out[13]= {{{2,1},{1,2},{2,2}},{3,2,1}}
This is presumably close to what #zorank tried, in code.
A faster solution based on packed arrays and sparse arrays
As #zorank noted, this becomes slow for larger sample with more indices and configurations. We will now generate a large sample to illustrate that (note! This requires about 4-5 Gb of RAM, so you may want to reduce the number of configurations if this exceeds the available RAM):
In[14]:=
largeSample = generateConfigurations[20,500,500,5000000];
testAccumulate[ff,largeSample];//Timing
Out[15]= {31.89,Null}
We will now extract the full data from the SubValues of ff:
In[16]:= (largeres = getResultingData[ff]); // Timing
Out[16]= {10.844, Null}
This takes some time, but one has to do this only once. But when we start extracting data for a fixed index, we see that it is quite slow:
In[24]:= getResultsForFixedIndex[largeres,10]//Short//Timing
Out[24]= {2.687,{{{196,26},{53,36},{360,43},{104,144},<<157674>>,{31,305},{240,291},
{256,38},{352,469}},{<<1>>}}}
The main idea we will use here to speed it up is to pack individual lists inside the largeres, those for indices, combinations and frequencies. While the full list can not be packed, those parts individually can:
In[18]:= Timing[
subIndicesPacked = Developer`ToPackedArray[largeres[[All,1]]];
subCombsPacked = Developer`ToPackedArray[largeres[[All,2]]];
subFreqsPacked = Developer`ToPackedArray[largeres[[All,3]]];
]
Out[18]= {1.672,Null}
This also takes some time, but it is a one-time operation again.
The following functions will then be used to extract the results for a fixed index much more efficiently:
Clear[extractPositionFromSparseArray];
extractPositionFromSparseArray[HoldPattern[SparseArray[u___]]] := {u}[[4, 2, 2]]
Clear[getCombinationsAndFrequenciesForIndex];
getCombinationsAndFrequenciesForIndex[packedIndices_, packedCombs_,
packedFreqs_, index_Integer] :=
With[{positions =
extractPositionFromSparseArray[
SparseArray[1 - Unitize[packedIndices - index]]]},
{Extract[packedCombs, positions],Extract[packedFreqs, positions]}];
Now, we have:
In[25]:=
getCombinationsAndFrequenciesForIndex[subIndicesPacked,subCombsPacked,subFreqsPacked,10]
//Short//Timing
Out[25]= {0.094,{{{196,26},{53,36},{360,43},{104,144},<<157674>>,{31,305},{240,291},
{256,38},{352,469}},{<<1>>}}}
We get a 30 times speed-up w.r.t. the naive Select approach.
Some notes on complexity
Note that the second solution is faster because it uses optimized data structures, but its complexity is the same as that of Select- based one, which is, linear in the length of total list of unique combinations for all indices. Therefore, in theory, the previously - discussed solutions based on nested hash-table etc may be asymptotically better. The problem is, that in practice we will probably hit the memory limitations long before that. For the 10 million configurations sample, the above code was still 2-3 times faster than the fastest solution I posted before.
EDIT
The following modification:
Clear[getCombinationsAndFrequenciesForIndex];
getCombinationsAndFrequenciesForIndex[packedIndices_, packedCombs_,
packedFreqs_, index_Integer] :=
With[{positions =
extractPositionFromSparseArray[
SparseArray[Unitize[packedIndices - index], Automatic, 1]]},
{Extract[packedCombs, positions], Extract[packedFreqs, positions]}];
makes the code twice faster still. Moreover, for more sparse indices (say, calling the sample-generation function with parameters like generateConfigurations[2000, 500, 500, 5000000] ), the speed-up with respect to the Select- based function is about 100 times.
I'd probably use SparseArrays here (see update below), but if you insist on using functions and *Values to store and retrieve values an approach would be to have the first part (f[2] etc.) replaced by a symbol you create on the fly like:
Table[Symbol["f" <> IntegerString[i, 10, 3]], {i, 11}]
(* ==> {f001, f002, f003, f004, f005, f006, f007, f008, f009, f010, f011} *)
Symbol["f" <> IntegerString[56, 10, 3]]
(* ==> f056 *)
Symbol["f" <> IntegerString[56, 10, 3]][{3, 4}] = 12;
Symbol["f" <> IntegerString[56, 10, 3]][{23, 18}] = 12;
Symbol["f" <> IntegerString[56, 10, 3]] // Evaluate // DownValues
(* ==> {HoldPattern[f056[{3, 4}]] :> 12, HoldPattern[f056[{23, 18}]] :> 12} *)
f056 // DownValues
(* ==> {HoldPattern[f056[{3, 4}]] :> 12, HoldPattern[f056[{23, 18}]] :> 12} *)
Personally I prefer Leonid's solution, as it's much more elegant but YMMV.
Update
On OP's request, about using SparseArrays:
Large SparseArrays take up a fraction of the size of standard nested lists. We can make f to be a large (100,000 entires) sparse array of sparse arrays:
f = SparseArray[{_} -> 0, 100000];
f // ByteCount
(* ==> 672 *)
(* initialize f with sparse arrays, takes a few seconds with f this large *)
Do[ f[[i]] = SparseArray[{_} -> 0, {100, 110}], {i,100000}] // Timing//First
(* ==> 18.923 *)
(* this takes about 2.5% of the memory that a normal array would take: *)
f // ByteCount
(* ==> 108000040 *)
ConstantArray[0, {100000, 100, 100}] // ByteCount
(* ==> 4000000176 *)
(* counting phase *)
f[[1]][[1, 2]]++;
f[[1]][[1, 2]]++;
f[[1]][[42, 64]]++;
f[[2]][[100, 11]]++;
(* reporting phase *)
f[[1]] // ArrayRules
f[[2]] // ArrayRules
f // ArrayRules
(*
==>{{1, 2} -> 2, {42, 64} -> 1, {_, _} -> 0}
==>{{100, 11} -> 1, {_, _} -> 0}
==>{{1, 1, 2} -> 2, {1, 42, 64} -> 1, {2, 100, 11} -> 1, {_, _, _} -> 0}
*)
As you can see, ArrayRules makes a nice list with contributions and counts. This can be done for each f[i] separately or the whole bunch together (last line).
In some scenarios (depending upon the performance needed to generate the values), the following easy solution using an auxiliary list (f[i,0]) may be useful:
f[_Integer][{_Integer ..}] := 0;
f[_Integer, 0] := Sequence ## {};
Table[
r = RandomInteger[1000, 2];
f[h = RandomInteger[100000]][r] = RandomInteger[10];
f[h, 0] = Union[f[h, 0], {r}];
, {i, 10^6}];
ExtractConfigurationsAndOccurences[f_, i_] := {f[i, 0], f[i][#] & /# f[i, 0]};
Timing#ExtractConfigurationsAndOccurences[f, 10]
Out[252]= {4.05231*10^-15, {{{172, 244}, {206, 115}, {277, 861}, {299,
862}, {316, 194}, {361, 164}, {362, 830}, {451, 306}, {614,
769}, {882, 159}}, {5, 2, 1, 5, 4, 10, 4, 4, 1, 8}}}
Many thanks for everyone on the help provided. I've been thinking a lot about everybody's input and I believe that in the simulation setup the following is the optimal solution:
SetAttributes[linkedList, HoldAllComplete];
temporarySymbols = linkedList[];
SetAttributes[bookmarkSymbol, Listable];
bookmarkSymbol[symbol_]:=
With[{old = temporarySymbols}, temporarySymbols= linkedList[old,symbol]];
registerConfiguration[index_]:=registerConfiguration[index]=
Module[
{
cs = linkedList[],
bookmarkConfiguration,
accumulator
},
(* remember the symbols we generate so we can remove them later *)
bookmarkSymbol[{cs,bookmarkConfiguration,accumulator}];
getCs[index] := List ## Flatten[cs, Infinity, linkedList];
getCsAndFreqs[index] := {getCs[index],accumulator /# getCs[index]};
accumulator[_]=0;
bookmarkConfiguration[c_]:=bookmarkConfiguration[c]=
With[{oldCs=cs}, cs = linkedList[oldCs, c]];
Function[c,
bookmarkConfiguration[c];
accumulator[c]++;
]
]
pattern = Verbatim[RuleDelayed][Verbatim[HoldPattern][HoldPattern[registerConfiguration [_Integer]]],_];
clearSimulationData :=
Block[{symbols},
DownValues[registerConfiguration]=DeleteCases[DownValues[registerConfiguration],pattern];
symbols = List ## Flatten[temporarySymbols, Infinity, linkedList];
(*Print["symbols to purge: ", symbols];*)
ClearAll /# symbols;
temporarySymbols = linkedList[];
]
It is based on Leonid's solution from one of previous posts, appended with belsairus' suggestion to include extra indexing for configurations that have been processed. Previous approaches are adapted so that configurations can be naturally registered and extracted using the same code more or less. This is hitting two flies at once since bookkeeping and retrieval and strongly interrelated.
This approach will work better in the situation when one wants to add simulation data incrementally (all curves are normally noisy so one has to add runs incrementally to obtain good plots). The sparse array approach will work better when data are generated in one go and then analyzed, but I do not remember being personally in such a situation where I had to do that.
Also, I was rather naive thinking that the data extraction and generation could be treated separately. In this particular case it seems one should have both perspectives in mind. I profoundly apologise for bluntly dismissing any previous suggestions in this direction (there were few implicit ones).
There are some open/minor problems that I do not know how to handle, e.g. when clearing the symbols I cannot clear headers like accumulator$164, I can only clean subvalues associated with it. Have not clue why. Also, if With[{oldCs=cs}, cs = linkedList[oldCs, c]]; is changed into something like cs = linkedList[cs, c]]; configurations are not stored. Have no clue either why the second option does not work. But these minor problems are well defined satellite issues that one can address in the future. By and large the problem seems solved by the generous help from all involved.
Many thanks again for all the help.
Regards
Zoran
p.s. There are some timings, but to understand what is going on I will append the code that is used for benchmarking. In brief, idea is to generate lists of configurations and just Map through them by invoking registerConfiguration. This essentially simulates data generation process. Here is the code used for testing:
fillSimulationData[sampleArg_] :=MapIndexed[registerConfiguration[#2[[1]]][#1]&, sampleArg,{2}];
sampleForIndex[index_]:=
Block[{nsamples,min,max},
min = Max[1,Floor[(9/10)maxSamplesPerIndex]];
max = maxSamplesPerIndex;
nsamples = RandomInteger[{min, max}];
RandomInteger[{1,10},{nsamples,ntypes}]
];
generateSample :=
Table[sampleForIndex[index],{index, 1, nindexes}];
measureGetCsTime :=((First # Timing[getCs[#]])& /# Range[1, nindexes]) // Max
measureGetCsAndFreqsTime:=((First # Timing[getCsAndFreqs[#]])& /# Range[1, nindexes]) // Max
reportSampleLength[sampleArg_] := StringForm["Total number of confs = ``, smallest accumulator length ``, largest accumulator length = ``", Sequence## {Total[#],Min[#],Max[#]}& [Length /# sampleArg]]
The first example is relatively modest:
clearSimulationData;
nindexes=100;maxSamplesPerIndex = 1000; ntypes = 2;
largeSample1 = generateSample;
reportSampleLength[largeSample1];
Total number of confs = 94891, smallest accumulator length 900, largest accumulator length = 1000;
First # Timing # fillSimulationData[largeSample1]
gives 1.375 secs which is fast I think.
With[{times = Table[measureGetCsTime, {50}]},
ListPlot[times, Joined -> True, PlotRange -> {0, Max[times]}]]
gives times around 0.016 secs, and
With[{times = Table[measureGetCsAndFreqsTime, {50}]},
ListPlot[times, Joined -> True, PlotRange -> {0, Max[times]}]]
gives same times. Now the real killer
nindexes = 10; maxSamplesPerIndex = 100000; ntypes = 10;
largeSample3 = generateSample;
largeSample3 // Short
{{{2,2,1,5,1,3,7,9,8,2},92061,{3,8,6,4,9,9,7,8,7,2}},8,{{4,10,1,5,9,8,8,10,8,6},95498,{3,8,8}}}
reported as
Total number of confs = 933590, smallest accumulator length 90760, largest accumulator length = 96876
gives generation times of ca 1.969 - 2.016 secs which is unbeliavably fast. I mean this is like going through the gigantic list of ca one million elements and applying a function to each element.
The extraction times for configs and {configs, freqs} are roughly 0.015 and 0.03 secs respectivelly.
To me this is a mind blowing speed I would never expect from Mathematica!

a very strange question in mathematica

I am doing this in mma v7.0:
r[x_] := Rationalize[x, 0];
N[Nest[Sqrt, 10., 53] // r, 500]
It gave me
1.000000000000000222044604925031308084726333618164062500000000000000000
However, if I go one step further
N[Nest[Sqrt, 10., 54] // r, 500]
I got all zeros. Does anybody know an explanation, or it is a bug?
Besides, looks like this way to produce more digits from the solution Nest[Sqrt, 10., 53] is not working very well. How to obtain more significant digits for this calculation?
Many thanks.
Edit
If I did Nest[Sqrt, 10., 50], I still got a lot of significant digits.
You have no significant digits other than zeros if you do this 54 times. Hence rationalizing as you do (which simply preserves bit pattern) gives what you saw.
InputForm[n53 = Nest[Sqrt, 10., 53]]
Out[180]//InputForm=
1.0000000000000002
InputForm[n54 = Nest[Sqrt, 10., 54]]
Out[181]//InputForm=
1.
Rationalize[n53, 0]
4503599627370497/4503599627370496
Rationalize[n54, 0]
Out[183]= 1
For the curious: the issue is not loss of precision in the sense of degradation with iterations computation. Indeed, iterating these square roots actually increases precision. We can see this with bignum input.
InputForm[n54 = Nest[Sqrt, 10.`20, 54]]
Out[188]//InputForm=
1.0000000000000001278191493200323453724568038240908339267044`36.25561976585499
Here is the actual problem. When we use machine numbers then after 54 iterations there are no significant digits other than zeros in the resulting machine double. That is to say, our size restriction on the numbers is the cause.
The reason is not too mysterious. Call the resulting value 1+eps. Then we have (1+eps)^(2^54) equal (to close approximation) to 10. A second order expansion then shows eps must be smaller than machine epsilon.
InputForm[epsval =
First[Select[
eps /. N[Solve[Sum[eps^j*Binomial[2^54, j], {j, 2}] == 9, eps]],
Head[#] === Real && # > 0 &]]]
Out[237]//InputForm=
1.864563472253985*^-16
$MachineEpsilon
Out[235]= 2.22045*10^-16
Daniel Lichtblau
Wolfram Research
InputForm /# NestList[Sqrt, 10., 54]
10.
3.1622776601683795
1.7782794100389228
1.333521432163324
1.1547819846894583
1.0746078283213176
1.036632928437698
1.018151721718182
1.0090350448414476
1.0045073642544626
1.002251148292913
1.00112494139988
1.0005623126022087
1.00028111678778
1.0001405485169472
1.0000702717894114
1.000035135277462
1.0000175674844227
1.0000087837036347
1.0000043918421733
1.0000021959186756
1.000001097958735
1.0000005489792168
1.0000002744895706
1.000000137244776
1.0000000686223856
1.000000034311192
1.0000000171555958
1.0000000085777978
1.0000000042888988
1.0000000021444493
1.0000000010722245
1.0000000005361123
1.0000000002680562
1.0000000001340281
1.000000000067014
1.000000000033507
1.0000000000167535
1.0000000000083769
1.0000000000041884
1.0000000000020943
1.0000000000010472
1.0000000000005236
1.0000000000002618
1.000000000000131
1.0000000000000655
1.0000000000000329
1.0000000000000164
1.0000000000000082
1.000000000000004
1.000000000000002
1.0000000000000009
1.0000000000000004
1.0000000000000002
1.
Throwing N[x, 500] on this is like trying to squeeze water from a rock.
The calculations above are done in machine precision, which is very fast. If you are willing to give up speed, you can utilize Mathematica's arbitrary precision arithmetic by specifying a non-machine precision on the input values. The "backtick" can be used to do this (as in the example below) or you can use SetPrecision or SetAccuracy. Here I will specify that the input is the number 10 up to 20 digits of precision.
NestList[Sqrt, 10`20, 54]
10.000000000000000000
3.1622776601683793320
1.77827941003892280123
.
.
.
1.00000000000000051127659728012947952
1.00000000000000025563829864006470708
1.000000000000000127819149320032345372
As you can see you do not need to use InputForm as Mathematica will automatically print arbitrary-precision numbers to as many places as it accurately can.
If you do use InputForm or FullForm you will see a backtick and then a number, which is the current precision of that number.

Resources