Select[nested_list, condition] in Mathematica - wolfram-mathematica

Let's say I have a list:
list=Table[{RandomReal[],RandomReal[],RandomReal[]}, {i,1,100}];
I'd like to make a new list based on conditions. Now I've seen that I should use the Select function, but I don't understand how to define the condition where selection should be based on some element of nested list.
Someone asked a similar question and the answer I like was:
data = {{0,2},{2,3},{4,3},{5,4},{8,4}};
filtered = Select[data, First[#]>3&];
However this only works if the condition is set on first element of sublist. In my case:
Select[list,0.2>First[#]>0.1&]
gives all members of the list where 1. element of sublist is between 0.1 and 0.2. But what if I wanted to make selection based on a second element of a sublist, or in general for the nth element?
Also an example with combination of elements would be nice, for example where the sum of first 2 elements of sublist is smaller than 0.5.

First, generating the list is easier as:
list = RandomReal[1, {100, 3}];
You can use Part (see the docs!!), which is equivalent to the [[ ]] syntax, to take the nth element. E.g. this selects those items which have a second element larger than 0.1.
Select[list, #[[2]] > 0.1 &]
Alternatively use
Cases[list, l_ /; l[[2]] > 0.1]
or
Cases[list, {_, y_, _} /; y > 0.1]
I recommend you explore the tutorials in the documentation, especially the part about list manipulation.

Related

Operate on the y's of a {{x1,y1},{x2,y2},...{xn,yn}} list

I've been scratching my head and I can't figure out a way to conveniently apply an operation to the y values of a list of the form{{x1,y1},{x2,y2},...{xn,yn}}. The list is in this form for plotting with ListPlot[] mostly.
The type of operations I'd like to apply would include:
Mathematica Operations. Ex.: LowpassFilter[y's] (not point-by-point, I know)
Generic mathematic point-by-point operations. Ex: y's*10 + 2
I know I can transpose and then filpity-flop turn the list arround and then target each element, and then transpose back and flopity-flip and overwrite the original list. This becomes tiresome after dealing with each case. I bet there is a cleaver way to do this. Or what would be the best way to hold values in a list that can easily be plotted and manipulated?
Thanks
Map[{#[[1]],2+10 #[[2]]}&,{{x1,y1},{x2,y2},...{xn,yn}}]
MapAt[2+10#&,{{x1,y1},{x2,y2},...{xn,yn}},{All,2}]
if you need to operate on the 'y' list as a list, do like this:
Transpose#MapAt[LowpassFilter[#,1]&,
Transpose#{{x1,y1},{x2,y2},...{xn,yn}},2]
Suppose you named your list as l, i.e.
l={{x1,y1},{x2,y2},...{xn,yn}}
You can get all ys by:
ylist=l[[All,2]]
{#, 10 # + 2} & ### lst
{{x1, 2 + 10 x1}, {x2, 2 + 10 x2}, {xn, 2 + 10 xn}}

Generating chains from an order of lists

I am searching for how to accomplish something I've somewhat a grasp on, but do not:
I have n number of lists of varying size:
{A, B, C, D}
{1,2}
{X, Y, Z}
...to the nth potentially
How do I generate all possible chains of 1 item from each level A1X, A1Y, A1Z, etc. Its an algotrithmic and mathematic task, no its not homework(I know school is starting), its part of something I'm working on, I have no code --- I just need to be pointed in the right direction to formulate my terms.
(You didn't ask for code, but I tend to use Python for executable pseudo-code. You still need to translate the core algorithm to the language of your choice).
In effect, you are talking about forming the Cartesian Product of the lists. It can be done in various ways. If the number of lists isn't known ahead of time, a recursive approach is the most natural.
Let L1*L2* ... *Ln denote the list of all strings which are of the form
s1+s2+...+sn where si in Li and + is the concatenation operator. For a basis you could either take n ==1, a single List, or n == 0, no lists at all. In many ways the latter is more elegant, in which case it is natural to define the product of an empty list of strings to be the list whose sole element is the empty string.
Then:
Return [''] if n == 0
Otherwise return
[a+b | a ranges over L1 and b ranges over (L2 * L3 * ... * Ln)]
where (L2 * L3 * ... *Ln) was alread computed recursively (which will just be the empty string if n is 1).
The last list can easily be built up in a nested loop, or expressed more directly in any language which supports list comprehensions.
Here is a Python implementation which returns the list of all products given a list of lists of strings (abbreviated as lls in the code):
def product(lls):
if len(lls) == 0:
return ['']
else:
return [a+b for a in lls[0] for b in product(lls[1:])]
Tested like thus:
lists_of_strings = [['A','B','C','D'],['1','2','3'],['X','Y','Z']]
print(product(lists_of_strings))
With output:
['A1X', 'A1Y', 'A1Z', 'A2X', 'A2Y', 'A2Z', 'A3X', 'A3Y', 'A3Z', 'B1X', 'B1Y', 'B1Z', 'B2X', 'B2Y', 'B2Z', 'B3X', 'B3Y', 'B3Z', 'C1X', 'C1Y', 'C1Z', 'C2X', 'C2Y', 'C2Z', 'C3X', 'C3Y', 'C3Z', 'D1X', 'D1Y', 'D1Z', 'D2X', 'D2Y', 'D2Z', 'D3X', 'D3Y', 'D3Z']
In Python itself there isn't much motivation to do this since the itertools module has a nice product and the same product can be expressed as:
[''.join(p) for p in itertools.product(*lists_of_strings)]

How to construct a list of all Fibonacci numbers less than n in Mathematica

I would like to write a Mathematica function that constructs a list of all Fibonacci numbers less than n. Moreover, I would like to do this as elegantly and functionally as possible(so without an explicit loop).
Conceptually I want to take an infinite list of the natural numbers, map Fib[n] onto it, and then take elements from this list while they are less than n. How can I do this in Mathematica?
The first part can be done fairly easily in Mathematica. Below, I provide two functions nextFibonacci, which provides the next Fibonacci number greater than the input number (just like NextPrime) and fibonacciList, which provides a list of all Fibonacci numbers less than the input number.
ClearAll[nextFibonacci, fibonacciList]
nextFibonacci[m_] := Fibonacci[
Block[{n},
NArgMax[{n, 1/Sqrt[5] (GoldenRatio^n - (-1)^n GoldenRatio^-n) <= m, n ∈ Integers}, n]
] + 1
]
nextFibonacci[1] := 2;
fibonacciList[m_] := Fibonacci#
Range[0, Block[{n},
NArgMax[{n, 1/Sqrt[5] (GoldenRatio^n - (-1)^n GoldenRatio^-n) < m, n ∈ Integers}, n]
]
]
Now you can do things like:
nextfibonacci[15]
(* 21 *)
fibonacciList[50]
(* {0, 1, 1, 2, 3, 5, 8, 13, 21, 34} *)
The second part though, is tricky. What you're looking for is a Haskell type lazy evaluation that will only evaluate if and when necessary (as otherwise, you can't hold an infinite list in memory). For example, something like (in Haskell):
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
which then allows you to do things like
take 10 fibs
-- [0,1,1,2,3,5,8,13,21,34]
takeWhile (<100) fibs
-- [0,1,1,2,3,5,8,13,21,34,55,89]
Unfortunately, there is no built-in support for what you want. However, you can extend Mathematica to implement lazy style lists as shown in this answer, which was also implemented as a package. Now that you have all the pieces that you need, I'll let you work on this yourself.
If you grab my Lazy package from GitHub, your solution is as simple as:
Needs["Lazy`"]
LazySource[Fibonacci] ~TakeWhile~ ((# < 1000) &) // List
If you want to slightly more literally implement your original description
Conceptually I want to take an infinite list of the natural numbers, map Fib[n] onto it, and then take elements from this list while they are less than n.
you could do it as follows:
Needs["Lazy`"]
Fibonacci ~Map~ Lazy[Integers] ~TakeWhile~ ((# < 1000) &) // List
To prove that this is completely lazy, try the previous example without the // List on the end. You'll see that it stops with the (rather ugly) form:
LazyList[First[
LazyList[Fibonacci[First[LazyList[1, LazySource[#1 &, 2]]]],
Fibonacci /# Rest[LazyList[1, LazySource[#1 &, 2]]]]],
TakeWhile[
Rest[LazyList[Fibonacci[First[LazyList[1, LazySource[#1 &, 2]]]],
Fibonacci /# Rest[LazyList[1, LazySource[#1 &, 2]]]]], #1 <
1000 &]]
This consists of a LazyList[] expression whose first element is the first value of the expression that you're lazily evaluating and whose second element is instructions for how to continue the expansion.
Improvements
It's a little bit inefficient to continually call Fibonacci[n] all the time, especially as n starts getting large. It's actually possible to construct a lazy generator that will calculate the current value of the Fibonacci sequence as we stream:
Needs["Lazy`"]
LazyFibonacci[a_,b_]:=LazyList[a,LazyFibonacci[b,a+b]]
LazyFibonacci[]:=LazyFibonacci[1,1]
LazyFibonacci[] ~TakeWhile~ ((# < 1000)&) // List
Finally, we could generalize this up to a more abstract generating function that takes an initial value for an accumulator, a List of Rules to compute the accumulator's value for the next step and a List of Rules to compute the result from the current accumulator value.
LazyGenerator[init_, step_, extract_] :=
LazyList[Evaluate[init /. extract],
LazyGenerator[init /. step, step, extract]]
And could use it to generate the Fibonacci sequence as follows:
LazyGenerator[{1, 1}, {a_, b_} :> {b, a + b}, {a_, b_} :> a]
Ok, I hope I understood the question. But please note, I am not pure math major, I am mechanical engineering student. But this sounded interesting. So I looked up the formula and this is what I can come up with now. I have to run, but if there is a bug, please let me know and I will fix it.
This manipulate asks for n and then lists all Fibonacci numbers less than n. There is no loop to find how many Fibonacci numbers there are less than n. It uses Reduce to solve for the number of Fibonacci numbers less than n. I take the floor of the result and also threw away a constant that came up with in the solution a complex multiplier.
And then simply makes a table of all these numbers using Mathematica Fibonacci command. So if you enter n=20 it will list 1,1,2,3,5,8,13 and so on. I could do it for infinity as I ran out of memory (I only have 8 GB ram on my pc).
I put the limit for n to 500000 Feel free to edit the code and change it.
Manipulate[
Module[{k, m},
k = Floor#N[Assuming[Element[m, Integers] && m > 0,
Reduce[f[m] == n, m]][[2, 1, 2]] /. Complex[0, 2] -> 0];
TableForm#Join[{{"#", "Fibonacci number" }},
Table[{i, Fibonacci[i]}, {i, 1, k}]]
],
{{n, 3, "n="}, 2, 500000, 1, Appearance -> "Labeled", ImageSize -> Small},
SynchronousUpdating -> False,
ContentSize -> {200, 500}, Initialization :>
{
\[CurlyPhi][n_] := ((1 + Sqrt[5])/2)^n;
\[Psi][n_] := -(1/\[CurlyPhi][n]);
f[n_] := (\[CurlyPhi][n] - \[Psi][n])/Sqrt[5];
}]
Screen shot
The index k of the Fibonacci number Fk is k=Floor[Log[GoldenRatio,Fk]*Sqrt[5]+1/2]],
https://en.wikipedia.org/wiki/Fibonacci_number. Hence, the list of Fibonacci numbers less than or equal to n is
FibList[n_Integer]:=Fibonacci[Range[Floor[Log[GoldenRatio,Sqrt[5]*n+1/2]]]]

What is the optimal way to match list entries after rounding in Mathematica?

I have two lists in Mathematica:
list1 = {{a1, b1, c1}, ... , {an, bn, cn}}
and
list2 = {{d1, e1, f1}, ... , {dn, en, fn}}
the lists contain numerical results and are roughly consisting of 50000 triplets each. Each triplet represents two coordinates and a numerical value of some property at these coordinates. Each list has different length and the coordinates are not quite the same range. My intention is to correlate the numerical values of the third property from each list so I need to scan through the lists and identify the properties whose coordinates are matching. My output will be something like
list3 = {{ci, fj}, ... , {cl, fm}}
where
{ai, bi}, ..., {al, bl}
will be (roughly) equal to, respectively
{dj, ej}, ..., {dm, em}
By "roughly" I mean the coordinates will match once rounded to some desired accuracy:
list1(2) = Round[{#[[1]], #[[2]], #[[3]]}, {1000, 500, 0.1}] & /# list1(2)
so after this process I's have two lists that contain some matching coordinates amongst them. My question is how to perform the operation of identifying them and picking out the pairs of properties in the optimal way?
An example of a 6 element list would be
list1 = {{-1.16371*10^6, 548315., 14903.}, {-1.16371*10^6, 548322., 14903.9},
{-1.16371*10^6, 548330., 14904.2}, {-1.16371*10^6, 548337., 14904.8},
{-1.16371*10^6, 548345., 14905.5}, {-1.16371*10^6, 548352., 14911.5}}
You may want to use something like this:
{Round[{#, #2}], #3} & ### Join[list1, list2];
% ~GatherBy~ First ~Select~ (Length## > 1 &)
This will group all data points that having matching coordinates after rounding. You can use a second argument to Round to specify the fraction to round by.
This assumes that there are not duplicated points within a single list. If there are, you will need to remove those to get useful pairs. Tell me if this is the case and I will update my answer.
Here is another method using Sow and Reap. The same caveats apply. Both of these examples are simply guidelines for how you may implement your functionality.
Reap[
Sow[#3, {Round[{#, #2}]}] & ### Join[list1, list2],
_,
List
][[2]] ~Cases~ {_, {_, __}}
To deal with duplicate-after-round elements within each list, you may use Round and GatherBy on each list as follows.
newList1 = GatherBy[{Round[{#, #2}], #3} & ### list1, First][[All, 1]];
newList2 = GatherBy[{Round[{#, #2}], #3} & ### list2, First][[All, 1]];
and then proceed with:
newList1 ~Join~ newList2 ~GatherBy~ First ~Select~ (Length## > 1 &)
Here's my approach, relying on Nearest to match the points.
Let's assume that list1 doesn't have fewer elements than list2. (Otherwise you can swap them using {list1, list2} = {list2, list1})
(* extract points *)
points1=list1[[All,{1,2}]];
points2=list2[[All,{1,2}]];
(* build a "nearest-function" for matching them *)
nf=Nearest[points1]
(* two points match only if they're closer than threshold *)
threshold=100;
(* This function will find the match of a point from points2 in points1.
If there's no match, the point is discarded using Sequence[]. *)
match[point_]:=
With[{m=First#nf[point]},
If[Norm[m-point]<threshold, {m,point}, Unevaluated#Sequence[]]
]
(* find matching point-pairs *)
matches=match/#points1;
(* build hash tables to retrieve the properties associated with points quickly *)
Clear[values1,values2]
Set[values1[{#1,#2}],#3]&###list1;
Set[values2[{#1,#2}],#3]&###list2;
(* get the property-pairs *)
{values1[#1],values2[#2]}&###matches
An altrenative is to use a custom DistanceFunction in nearest to avoid the use of values1 & values2, and have a shorter program. This may be slower or faster, I didn't test this with large data at all.
Note: How complicated the implementation needs to be really depends on your particular dataset. Does each point from the first set have a match in the second one? Are there any duplicates? How close can points from the same dataset be? Etc. I tried to provide something which can be tweaked to be relatively robust, at the cost of having longer code.

Finding first element of a Mathematica list greater than a threshold

I was wondering how I could obtain the first element of a (already ordered) list that is greater than a given threshold.
I don't know really well the list manipulation function in Mathematica, maybe someone can give me a trick to do that efficiently.
Select does what you need, and will be consistent, respecting the pre-existing order of the list:
Select[list, # > threshold &, 1]
For example:
In[1]:= Select[{3, 5, 4, 1}, # > 3 &, 1]
Out[1]= {5}
You can provide whatever threshold or criterion function you need in the second argument.
The third argument specifies you only one (i.e., the first) element that matches.
Hope that helps!
Joe correctly states in his answer that one would expect a binary search technique to be faster than Select, which seem to just do a linear search even if the list is sorted:
ClearAll[selectTiming]
selectTiming[length_, iterations_] := Module[
{lst},
lst = Sort[RandomInteger[{0, 100}, length]];
(Do[Select[lst, # == 2 &, 1], {i, 1, iterations}] // Timing //
First)/iterations
]
(I arbitrarily put the threshold at 2 for demonstration purposes).
However, the BinarySearch function in Combinatorica is a) not appropriate (it returns an element which does match the requested one, but not the first (leftmost), which is what the question is asking.
To obtain the leftmost element that is larger than a threshold, given an ordered list, we may proceed either recursively:
binSearch[lst_,threshold_]:= binSearchRec[lst,threshold,1,Length#lst]
(*
return position of leftmost element greater than threshold
breaks if the first element is greater than threshold
lst must be sorted
*)
binSearchRec[lst_,threshold_,min_,max_] :=
Module[{i=Floor[(min+max)/2],element},
element=lst[[i]];
Which[
min==max,max,
element <= threshold,binSearchRec[lst,threshold,i+1,max],
(element > threshold) && ( lst[[i-1]] <= threshold ), i,
True, binSearchRec[lst,threshold,min,i-1]
]
]
or iteratively:
binSearchIterative[lst_,threshold_]:=Module[
{min=1,max=Length#lst,i,element},
While[
min<=max,
i=Floor[(min+max)/2];
element=lst[[i]];
Which[
min==max, Break[],
element<=threshold, min=i+1,
(element>threshold) && (lst[[i-1]] <= threshold), Break[],
True, max=i-1
]
];
i
]
The recursive approach is clearer but I'll stick to the iterative one.
To test its speed,
ClearAll[binSearchTiming]
binSearchTiming[length_, iterations_] := Module[
{lst},
lst = Sort[RandomInteger[{0, 100}, length]];
(Do[binSearchIterative[lst, 2], {i, 1, iterations}] // Timing //
First)/iterations
]
which produces
so, much faster and with better scaling behaviour.
Actually it's not necessary to compile it but I did anyway.
In conclusion, then, don't use Select for long lists.
This concludes my answer. There follow some comments on doing a binary search by hand or via the Combinatorica package.
I compared the speed of a (compiled) short routine to do binary search vs the BinarySearch from Combinatorica. Note that this does not do what the question asks (and neither does BinarySearch from Combinatorica); the code I gave above does.
The binary search may be implemented iteratively as
binarySearch = Compile[{{arg, _Integer}, {list, _Integer, 1}},
Module[ {min = 1, max = Length#list,
i, x},
While[
min <= max,
i = Floor[(min + max)/2];
x = list[[i]];
Which[
x == arg, min = max = i; Break[],
x < arg, min = i + 1,
True, max = i - 1
]
];
If[ 0 == max,
0,
max
]
],
CompilationTarget -> "C",
RuntimeOptions -> "Speed"
];
and we can now compare this and BinarySearch from Combinatorica. Note that a) the list must be sorted b) this will not return the first matching element, but a matching element.
lst = Sort[RandomInteger[{0, 100}, 1000000]];
Let us compare the two binary search routines. Repeating 50000 times:
Needs["Combinatorica`"]
Do[binarySearch[2, lst], {i, 50000}] // Timing
Do[BinarySearch[lst, 2], {i, 50000}] // Timing
(*
{0.073437, Null}
{4.8354, Null}
*)
So the handwritten one is faster. Now since in fact a binary search just visits 6-7 points in the list for these parameters (something like {500000, 250000, 125000, 62500, 31250, 15625, 23437} for instance), clearly the difference is simply overhead; perhaps BinarySearch is more general, for instance, or not compiled.
You might want to look at TakeWhile[] and LengthWhile[] as well.
http://reference.wolfram.com/mathematica/ref/TakeWhile.html
http://reference.wolfram.com/mathematica/ref/LengthWhile.html
list /. {___, y_ /; y > 3, ___} :> {y}
For example
{3, 5, 4, 1} /. {___, y_ /; y > 3, ___} :> {y}
{5}
Using Select will solve the problem, but it is a poor solution if you care about efficiency. Select goes over all the elements of the list, and therefore will take time which is linear in the length of the list.
Since you say the list is ordered, it is much better to use BinarySearch, which will work in a time which is logarithmic in the size of the list. The expression (edit: I have made a small adjustment since the previous expression I wrote did not handle correctly recurring elements in the list. another edit: this still doesn't work when the threshold itself appears in the list as a recurring element, see comments):
Floor[BinarySearch[list,threshold]+1]
will give you the index of the desired element. If all the elements are smaller than the threshold, you'll get the length of the list plus one.
p.s. don't forget to call Needs["Combinatorica'"] before using BinarySearch.
Just for future reference, starting from v10 you can use SelectFirst
It has some added niceties, such as returning Missing[] or default values.
From the docs:
SelectFirst[{e1,e2,…}, crit] gives the first ei for which crit[ei] is True, or Missing["NotFound"] if none is found.
SelectFirst[{e1,e2,…}, crit, default] gives default if there is no ei such that crit[ei] is True.
For your specific case, you would use:
SelectFirst[list, # > threshold &]

Resources