I have 3 xyplots from lattice. Up to now I have only ever used
print(pd1, split = c(1, 1, 2, 2), more = TRUE)
print(pd2, split = c(2, 1, 2, 2), more = TRUE)
and so on (using split) to arrange plots in a 2x2 grid. How can I use it for a 1x3 or 3x3 arrangement? I tried a few positions, but I have not quite understood how it actually works.
# I think this is what you want
library(lattice)
data <- data.frame(class = LETTERS[1:6], value = 1:6)
pd1 <- dotplot(value ~ class, data)
pd2 <- dotplot(class ~ value, data)
pd3 <- dotplot(class ~ value | cut(value, c(0, 3, 6)), data)
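# split = c(x, y, nx, ny): put the plot in column x, row y of an nx-by-ny grid.
# Here: 1 column, 3 rows. For three plots side by side use c(i, 1, 3, 1).
print(pd1, split = c(1, 1, 1, 3), more = TRUE)
print(pd2, split = c(1, 2, 1, 3), more = TRUE)
print(pd3, split = c(1, 3, 1, 3))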
I have used the solution mentioned here to get the top n elements of a Scala Iterable, efficiently.
End example:
scala> val li = List (4, 3, 6, 7, 1, 2, 9, 5)
li: List[Int] = List(4, 3, 6, 7, 1, 2, 9, 5)
scala> top (2, li)
res0: List[Int] = List(2, 1)
Now, suppose I want to get the top n elements with a lower resolution. The range of integers is divided (binned/grouped) into sub-ranges, for example of width 2: {0-1, 2-3, 4-5, ...}, and within each sub-range I do not differentiate between integers, e.g. 0 and 1 are the same to me. Therefore, the top element in the above example would still be 1, but the next element could be either 2 or 3. To be clear, these two results are equivalent:
scala> top (2, li)
res0: List[Int] = List(2, 1)
scala> top (2, li)
res0: List[Int] = List(3, 1)
How do I change this nice function to fit these needs?
Is my intuition correct that this sort should be faster? The sort would only be on the bins/groups, after which we take all or some of the elements of each bin, in no specific order, until we reach n elements.
Comments:
The binning/grouping is something simple and fixed like modulo k; it doesn't have to be generic, like allowing different lengths of sub-ranges.
Inside each bin, assuming we need only some of the elements, we can just take the first elements, or even some random elements; it doesn't have to follow any specific system.
Per the comment, you're just changing the comparison.
In this version, 4 and 3 compare equal and 4 is taken first.
object Firstly extends App {
  def firstly(taking: Int, vs: List[Int]) = {
    import collection.mutable.{ SortedSet => S }
    def bucketed(i: Int) = (i + 1) / 2
    vs.foldLeft(S.empty[Int]) { (s, i) =>
      if (s.size < taking) s += i
      else if (bucketed(i) >= bucketed(s.last)) s
      else {
        s += i
        s -= s.last
      }
    }
  }
  assert(firstly(taking = 2, List(4, 6, 7, 1, 9, 3, 5)) == Set(4, 1))
}
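For comparison, the same "only change the comparison" idea can be sketched with a bounded max-heap instead of a SortedSet. This is a minimal sketch, not the method above: `topBinned` and its bin-width parameter `k` are hypothetical names, and it assumes non-negative integers so that `x / k` is the bin index. Unlike the set-based version, it keeps duplicate values.

```scala
import scala.collection.mutable

// Hypothetical helper: keep n elements from the lowest bins, where bin = x / k.
// Assumes non-negative inputs and k > 0.
def topBinned(n: Int, k: Int, xs: Iterable[Int]): List[Int] = {
  val byBin = Ordering.by[Int, Int](_ / k)            // compare by bin only
  val heap  = mutable.PriorityQueue.empty[Int](byBin) // head = kept element in the highest bin
  for (x <- xs) {
    if (heap.size < n) heap.enqueue(x)
    else if (heap.nonEmpty && byBin.lt(x, heap.head)) { // strictly lower bin than the worst kept
      heap.dequeue()
      heap.enqueue(x)
    }
  }
  // Drain from the highest bin to the lowest, then reverse for ascending bins.
  List.fill(heap.size)(heap.dequeue()).reverse
}

val picked = topBinned(2, 2, List(4, 3, 6, 7, 1, 2, 9, 5))
// picked.map(_ / 2) == List(0, 1): one element from bin 0-1, one from bin 2-3
```

The heap never holds more than n elements, so the cost is roughly one comparison per input element plus a heap update whenever a strictly lower bin shows up.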
Edit: example of sorting buckets instead of keeping sorted "top N":
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.flatMap {
| case (i, vs) => vs }.take(5)
res10: List[Int] = List(1, 4, 3, 6, 5)
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.map {
| case (i, vs) => vs.head }.take(5)
res11: List[Int] = List(1, 4, 6, 7, 9)
Not sure which result you prefer, of the last two.
As to whether sorting buckets is better, it depends how many buckets.
How about mapping with integer division before using the original algorithm?
def top(n: Int, li: List[Int]) = li.sorted.distinct.take(n)
val li = List (4, 3, 6, 7, 1, 2, 9, 5)
top(2, li) // List(1, 2)
def topBin(n: Int, bin: Int, li: List[Int]) =
  top(n, li.map(_ / bin)) // e.g. List(0, 1)
    .map(i => (i * bin) until ((i + 1) * bin))
topBin(2, 2, li) // List(0 to 1, 2 to 3)
Below is a program that is supposed to display all positive integer solutions of the equation x1 + x2 + ... + xk = n, where k and n are positive integers:
func solution(k: Int, n: Int) {
    if k > n || k <= 0 {
        print("No solution")
    } else if k == 1 {
        print(n)
    } else {
        for i in 1...(n-k+1) {
            print(i, terminator: "")
            solution(k-1, n: n-i)
            print("")
        }
    }
}
solution(4, n: 4)
This program runs well with n = 4 and k = 1, 2, 4, but its output is incorrect when k = 3. Can somebody help me find the mistake?
The problem is that for n = 4 and k = 1, 2 or 4, each printed i is followed by exactly one solution, so your print(i, terminator: "") works correctly.
However, for k = 3, after printing 1 at the top level there is more than one valid completion, (1, 1, 2) or (1, 2, 1), which means a single print(1, terminator: "") at k = 3 is not sufficient.
Imagine the printing routine goes something like this:
at k = 3, i = 1, print 1
at k = 2, i = 1, print 1
at k = 1, n = 2, print 2
So, at this point, we have (1, 1, 2), which looks good.
However, when we backtrack to k = 2, i = 2, print 2
at k = 1, n = 1, print 1
So, this time we only have (2, 1), which is not correct, because the leading 1 is not printed again.
One simple way to fix this is, rather than printing at each recursive step, to store the values chosen so far in an array and print the whole array only when the recursion reaches its base case and the solution is complete.
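The fix described above can be sketched as follows. This is written in Scala rather than Swift, and it is only one possible shape for the idea: the function name `solutions` and the `prefix` parameter are hypothetical, and the function returns the rows instead of printing inside the recursion.

```scala
// Hypothetical rewrite of the recursion: carry the values chosen so far in
// `prefix` and emit a row only once it is complete.
def solutions(k: Int, n: Int, prefix: Vector[Int] = Vector.empty): Vector[Vector[Int]] =
  if (k <= 0 || k > n) Vector.empty                 // no positive solution
  else if (k == 1) Vector(prefix :+ n)              // last variable takes the remainder
  else (1 to (n - k + 1)).toVector.flatMap { i =>   // same range as the original loop
    solutions(k - 1, n - i, prefix :+ i)
  }

solutions(3, 4).foreach(row => println(row.mkString(" ")))
// prints:
// 1 1 2
// 1 2 1
// 2 1 1
```

Because every row carries its own prefix, backtracking from one branch to the next can no longer drop the leading values.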
I am given two collections (RDDs) and a number of samples. Let's say:
val v = sc.parallelize(List("a", "b", "c"))
val a = sc.parallelize(List(1, 2, 3, 4, 5))
val samplesCount = 2
I want to create two collections (samples) consisting of pairs, where one value is from 'v' and the second one is from 'a'. Each collection must contain all values from 'v' and random values from 'a'.
Example result would be:
(
(("a", 3), ("b", 5), ("c", 1)),
(("a", 4), ("b", 2), ("c", 5))
)
One more thing to add: the values from v or a can't repeat within a sample.
I can't think of any good way to achieve this.
You randomly shuffle the RDD to be sampled and then join the two RDDs by line index:
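import org.apache.spark.rdd.RDD

// Attach a random key to every element, sort by that key, then drop the key.
def shuffle[A: reflect.ClassTag](a: RDD[A]): RDD[A] = {
  val randomized = a.map(util.Random.nextInt() -> _)
  randomized.sortByKey().values
}
// Number the elements of both RDDs and join on that index (an inner join,
// so the result is as long as the shorter RDD).
def joinLines[A: reflect.ClassTag, B](a: RDD[A], b: RDD[B]): RDD[(A, B)] = {
  val aNumbered = a.zipWithIndex.map { case (x, i) => (i, x) }
  val bNumbered = b.zipWithIndex.map { case (x, i) => (i, x) }
  aNumbered.join(bNumbered).values
}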
val v = sc.parallelize(List("a", "b", "c"))
val a = sc.parallelize(List(1, 2, 3, 4, 5))
val sampled = joinLines(v, shuffle(a))
RDDs are immutable, so you don't need to "multiply" anything. If you want multiple samples just do:
val sampledRDDs: Seq[RDD[(String, Int)]] =
(1 to samplesCount).map(_ => joinLines(v, shuffle(a)))
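To see the "shuffle, then pair by position" idea without a Spark cluster, here is a local-collections analogue. It is only a sketch: `sampleOnce` is a hypothetical helper on plain Scala `Seq`s, not part of the RDD code above, and the fixed seed is there only to make runs repeatable.

```scala
import scala.util.Random

// Local-collection analogue (no Spark): shuffle `a`, then pair by position.
// zip stops at the shorter input, so every value of `v` appears exactly once
// and no value of `a` is reused within a sample.
def sampleOnce[A, B](v: Seq[A], a: Seq[B], rnd: Random): Seq[(A, B)] =
  v.zip(rnd.shuffle(a))

val v = List("a", "b", "c")
val a = List(1, 2, 3, 4, 5)
val rnd = new Random(42) // fixed seed, an assumption for repeatability only
val samples = (1 to 2).map(_ => sampleOnce(v, a, rnd))
```

Each call reshuffles `a`, so the samples are independent of each other while still never repeating a value inside one sample.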
Let's say I want to write a function that does this:
input: [1,1,3,3,4,2,2,5,6,6]
output: [[1,1],[3,3],[4],[2,2],[5],[6,6]]
It groups adjacent elements that are the same.
What should the name of this method be? Is there a standard name for this operation?
In [1,1,3,3,4,2,2,5,6,6], a thing like [1,1] is very often referred to as a run (as in run-length encoding, see RLE in Scala). I'd therefore call the method groupRuns.
import scala.annotation.tailrec

@tailrec
def groupRuns[A](c: Seq[A], acc: Seq[Seq[A]] = Seq.empty): Seq[Seq[A]] = {
  c match {
    case Seq() => acc
    case xs =>
      // Split off the leading run of elements equal to the head.
      val (same, rest) = xs.span { _ == xs.head }
      groupRuns(rest, acc :+ same)
  }
}
scala> groupRuns(Vector(1, 1, 3, 3, 4, 2, 2, 5, 6, 6))
res7: Seq[Seq[Int]] = List(Vector(1, 1), Vector(3, 3), Vector(4), Vector(2, 2), Vector(5), Vector(6, 6))
For[n = 1, n < 6, n = n + 1,
 For[m = 1, m < 6, m = m + 1,
  abc = doc[[n]];
  kk = doc[[m]];
  v = vector[abc, kk];
  vl = VectorLength[v]]]
I want to store the data from each loop into an array or table form. How can I do that?
Try using a Table instead of two For loops. It returns a list of lists of the results (basically a matrix):
Table[
abc = doc[[n]];
kk = doc[[m]];
v = vector[abc, kk];
vl = VectorLength[v], {n, 1, 5}, {m, 1, 5}]
It's not clear to me what data you want to save, but the general way to do this is to use Sow and Reap.
Reap[
For[n = 1, n < 6, n = n + 1, For[m = 1, m < 6, m = m + 1,
abc = doc[[n]];
kk = doc[[m]];
Sow[v = vector[abc, kk]];
vl = VectorLength[v]]]
][[2, 1]]
This saves every value of v = vector[abc, kk]. The Part extraction [[2, 1]] returns only this list.
If you want to save multiple data sets, you can use tags within Sow:
Reap[
For[n = 1, n < 6, n = n + 1, For[m = 1, m < 6, m = m + 1,
abc = doc[[n]];
kk = doc[[m]];
Sow[v = vector[abc, kk], "v"];
Sow[vl = VectorLength[v], "v1"]
]]
]
Here I omit the Part extraction. The output is in the form {body, {{data1, ...}, {data2, ...}}}, where body is any output from the expression itself (Null in the case of For). Data sets appear in the order in which their tags were first sown. You can get an explicit order of sets with another argument of Reap, as follows:
Reap[
For[ ... ],
{"v1", "v"}
]
See the documentation for Reap for more options.