Kth smallest element in a BST(scheme) - algorithm

Please disregard my broken English as i’m not a native speaker.
I’m looking for the best way to find the kth smallest element in a BST, I’ve thought of ways like appending the tree to a list and traverse that list, but that takes too much time O(n)
I’ve also considered deleting elements from the tree then find the smallest but that also takes more time.
What is the best algorithm to approach this problem?
As scheme is a functional programming language, the solution must be recursive. I’ve tried to look for an answer but most of the answers in C or Java would use some kind of an iterative format.
Thank you for your help,
My function should look like
(Define (kth-smallest T k)
...)

If the BST is ordered and balanced, such that the leaf not most left is the lowest value. Then the nth lowest value would be the nth value you iterate over in an in order tree traversal.
So finding the lowest kth value in a balanced tree is between O(log n) and O(2n). O(n) it is then.
My implementation would create a helper that takes a continuation as well as k and the tree. The default continuation can be (lambda (v) #f) and thus if you want the 30th smallest in a 10 node tree it will call that and you get #f back. The second you find the deepest node and it's k is zero instead of calling the continuation it evaluates to the value of the current node.
If you were to delete the lowest value k times you'd have O(k * log n) ~ O(n log n) > O(n)
Good luck

If the size of the left subtree is larger than k, we look in the left sub-tree, if it is smaller we look at the right subtree with the new k: k - size-of-left-subtree - 1, otherwise (in the equal case) we return the value at the current node. This is performed in O(lg n) time.
#lang racket
(struct no-info ())
(define NONE (no-info))
; [BST X] is one of:
; - (node NONE NONE NONE 0)
; - (node BST BST X Number)
(struct node [left right val size])
(define leaf (node NONE NONE NONE 0))
; examples
(define lt (node (node (node leaf leaf 10 1) (node leaf leaf 24 1) 15 3) leaf 29 4))
(define rt (node (node leaf leaf 77 1) (node leaf (node leaf leaf 99 1) 95 2) 89 4))
(define t (node lt rt 63 9))
; 63
; / \
; 29 89
; / / \
; 15 77 95
; / \ \
; 10 24 99
; node val: 10 15 24 29 63 77 89 95 99
; rank: 1 2 3 4 5 6 7 8 9
; kth-smallest : BST -> X
; the value of the `k`th smallest node in `t`
(define (kth-smallest t k)
(define (kth-smallest-h t k)
(cond [(equal? (node-size t) 1) (node-val t)]
[else (let ([s (node-size (node-left t))])
(cond [(> s k) (kth-smallest-h (node-left t) k)]
[(< s k) (kth-smallest-h (node-right t) (sub1 (- k s)))]
[else (node-val t)]))]))
(if (or (<= k 0) (> k (node-size t)))
(error "k out of range")
(kth-smallest-h t (sub1 k))))
(map (λ (x) (kth-smallest t x)) '(1 2 3 4 5 6 7 8 9))
; => '(10 15 24 29 63 77 89 95 99)

Related

Finding primes up to a certain number in Racket

I'm learning Racket (with the HtDP course) and it's my first shot at a program in a functional language.
I've tried to design a function that finds all primes under a certain input n using (what I think is) a functional approach to the problem, but the program can get really slow (86 seconds for 100.000, while my Python, C and C++ quickly-written solutions take just a couple of seconds).
The following is the code:
;; Natural Natural -> Boolean
;; Helper function to avoid writing the handful (= 0 (modulo na nb))
(define (divisible na nb) (= 0 (modulo na nb)))
;; Natural ListOfNatural -> Boolean
;; n is the number to check, lop is ALL the prime numbers less than n
(define (is-prime? n lop)
(cond [(empty? lop) true]
[(divisible n (first lop)) false]
[ else (is-prime? n (rest lop))]))
;; Natural -> ListOfNatural
(define (find-primes n)
(if (= n 2)
(list 2)
(local [(define LOP (find-primes (sub1 n)))]
(if (is-prime? n LOP)
(append LOP (list n))
LOP))))
(time (find-primes 100000))
I'm using the divisible function instead of just plowing the rest in because I really like to have separated functions when they could be of use in another part of the program. I also should probably define is-prime? inside of find-primes, since no one will ever call is-prime? on a number while also giving all the prime numbers less than that number.
Any pointers on how to improve this?
Here are some ideas for improving the performance, the procedure now returns in under two seconds for n = 100000.
(define (is-prime? n lop)
(define sqrtn (sqrt n))
(if (not (or (= (modulo n 6) 1) (= (modulo n 6) 5)))
false
(let loop ([lop lop])
(cond [(or (empty? lop) (< sqrtn (first lop))) true]
[(zero? (modulo n (first lop))) false]
[else (loop (rest lop))]))))
(define (find-primes n)
(cond [(<= n 1) '()]
[(= n 2) '(2)]
[(= n 3) '(2 3)]
[else
(let loop ([lop '(2 3)] [i 5])
(cond [(> i n) lop]
[(is-prime? i lop) (loop (append lop (list i)) (+ i 2))]
[else (loop lop (+ i 2))]))]))
Some of the optimizations are language-related, others are algorithmic:
The recursion was converted to be in tail position. In this way, the recursive call is the last thing we do at each step, with nothing else to do after it - and the compiler can optimize it to be as efficient as a loop in other programming languages.
The loop in find-primes was modified for only iterating over odd numbers. Note that we go from 3 to n instead of going from n to 2.
divisible was inlined and (sqrt n) is calculated only once.
is-prime? only checks up until sqrt(n), it makes no sense to look for primes after that. This is the most important optimization, instead of being O(n) the algorithm is now O(sqrt(n)).
Following #law-of-fives's advice, is-prime? now skips the check when n is not congruent to 1 or 5 modulo 6.
Also, normally I'd recommend to build the list using cons instead of append, but in this case we need the prime numbers list to be constructed in ascending order for the most important optimization in is-prime? to work.
Here's Óscar López's code, tweaked to build the list in the top-down manner:
(define (is-prime? n lop)
(define sqrtn (sqrt n))
(let loop ([lop lop])
(cond [(or (empty? lop) (< sqrtn (mcar lop))) true]
[(zero? (modulo n (mcar lop))) false]
[else (loop (mcdr lop))])))
(define (find-primes n)
(let* ([a (mcons 3 '())]
[b (mcons 2 a)])
(let loop ([p a] [i 5] [d 2] ; d = diff +2 +4 +2 ...
[c 2]) ; c = count of primes found
(cond [(> i n) c]
[(is-prime? i (mcdr a))
(set-mcdr! p (mcons i '()))
(loop (mcdr p) (+ i d) (- 6 d) (+ c 1))]
[else (loop p (+ i d) (- 6 d) c )]))))
Runs at about ~n1.25..1.32, empirically; compared to the original's ~n1.8..1.9, in the measured range, inside DrRacket (append is the culprit of that bad behaviour). The "under two seconds" for 100K turns into under 0.05 seconds; two seconds gets you well above 1M (one million):
; (time (length (find-primes 100000))) ; with cons times in milliseconds
; 10K 156 ; 20K 437 ; 40K 1607 ; 80K 5241 ; 100K 7753 .... n^1.8-1.9-1.7 OP's
; 10K 62 ; 20K 109 ; 40K 421 ; 80K 1217 ; 100K 2293 .... n^1.8-1.9 Óscar's
; mcons:
(time (find-primes 2000000))
; 100K 47 ; 200K 172 ; 1M 1186 ; 2M 2839 ; 3M 4851 ; 4M 7036 .... n^1.25-1.32 this
; 9592 17984 78498 148933 216816 283146
It's still just a trial division though... :) The sieve of Eratosthenes will be much faster yet.
edit: As for set-cdr!, it is easy to emulate any lazy algorithm with it... Otherwise, we could use extendable arrays (lists of...), for the amortized O(1) snoc/append1 operation (that's lots and lots of coding); or maintain the list of primes split in two (three, actually; see the code below), building the second portion in reverse with cons, and appending it in reverse to the first portion only every so often (specifically, judging the need by the next prime's square):
; times: ; 2M 1934 ; 3M 3260 ; 4M 4665 ; 6M 8081 .... n^1.30
;; find primes up to and including n, n > 2
(define (find-primes n)
(let loop ( [k 5] [q 9] ; next candidate; square of (car LOP2)
[LOP1 (list 2)] ; primes to test by
[LOP2 (list 3)] ; more primes
[LOP3 (list )] ) ; even more primes, in reverse
(cond [ (> k n)
(append LOP1 LOP2 (reverse LOP3)) ]
[ (= k q)
(if (null? (cdr LOP2))
(loop k q LOP1 (append LOP2 (reverse LOP3)) (list))
(loop (+ k 2)
(* (cadr LOP2) (cadr LOP2)) ; next prime's square
(append LOP1 (list (car LOP2)))
(cdr LOP2) LOP3 )) ]
[ (is-prime? k (cdr LOP1))
(loop (+ k 2) q LOP1 LOP2 (cons k LOP3)) ]
[ else
(loop (+ k 2) q LOP1 LOP2 LOP3 ) ])))
;; n is the number to check, lop is list of prime numbers to check it by
(define (is-prime? n lop)
(cond [ (null? lop) #t ]
[ (divisible n (car lop)) #f ]
[ else (is-prime? n (cdr lop)) ]))
edit2: The easiest and simplest fix though, closest to your code, was to decouple the primes calculations of the resulting list, and of the list to check divisibility by. In your
(local [(define LOP (find-primes (sub1 n)))]
(if (is-prime? n LOP)
LOP is used as the list of primes to check by, and it is reused as part of the result list in
(append LOP (list n))
LOP))))
immediately afterwards. Breaking this entanglement enables us to stop the generation of testing primes list at the sqrt of the upper limit, and thus it gives us:
;times: ; 1M-1076 2M-2621 3M-4664 4M-6693
; n^1.28 ^1.33 n^1.32
(define (find-primes n)
(cond
((<= n 4) (list 2 3))
(else
(let* ([LOP (find-primes (inexact->exact (floor (sqrt n))))]
[lp (last LOP)])
(local ([define (primes k ps)
(if (<= k lp)
(append LOP ps)
(primes (- k 2) (if (is-prime? k LOP)
(cons k ps)
ps)))])
(primes (if (> (modulo n 2) 0) n (- n 1)) '()))))))
It too uses the same is-prime? code as in the question, unaltered, as does the second variant above.
It is slower than the 2nd variant. The algorithmic reason for this is clear — it tests all numbers from sqrt(n) to n by the same list of primes, all smaller or equal to the sqrt(n) — but in testing a given prime p < n it is enough to use only those primes that are not greater than sqrt(p), not sqrt(n). But it is the closest to your original code.
For comparison, in Haskell-like syntax, under strict evaluation,
isPrime n lop = null [() | p <- lop, rem n p == 0]
-- OP:
findprimes 2 = [2]
findprimes n = lop ++ [n | isPrime n lop]
where lop = findprimes (n-1)
= lop ++ [n | n <- [q+1..n], isPrime n lop]
where lop = findprimes q ; q = (n-1)
-- 3rd:
findprimes n | n < 5 = [2,3]
findprimes n = lop ++ [n | n <- [q+1..n], isPrime n lop]
where lop = findprimes q ;
q = floor $ sqrt $ fromIntegral n
-- 2nd:
findprimes n = g 5 9 [2] [3] []
where
g k q a b c
| k > n = a ++ b ++ reverse c
| k == q, [h] <- b = g k q a (h:reverse c) []
| k == q, (h:p:ps) <- b = g (k+2) (p*p) (a++[h]) (p:ps) c
| isPrime k a = g (k+2) q a b (k:c)
| otherwise = g (k+2) q a b c
The b and c together (which is to say, LOP2 and LOP3 in the Scheme code) actually constitute a pure functional queue a-la Okasaki, from which sequential primes are taken and appended at the end of the maintained primes prefix a (i.e. LOP1) now and again, on each consecutive prime's square being passed, for a to be used in the primality testing by isPrime.
Because of the rarity of this appending, its computational inefficiency has no impact on the time complexity of the code overall.

Summing up the first n elements of each row in a 2D array in Scheme

I have the following array in Scheme:
((64 28 52 24) (68 29 62 29) (20 72 48 60) (45 102 75 51) (36 84 72 64) (80 9 63 60) (20 56 72 24) (57 53 88 63))
I want to form a 2d array by summing up 'combining' n rows in a single row. For example, let's say that n= 2 for the first two rows we will be combined into one row ((132 57 114 53) ...)?
okay so you have a list of things (rows) in this case, and you want to apply an operation f (adding rows) on them n at a time. So I'd make a general function for that (on-chunks n f lst).
Then your operation can be defined as (on-chunks 2 row-sum matrix). It's easy to define row-sum: (define (row-sum row1 row2) (map + row1 row2)).
For the chunking part, I would recommend defining a helper function for it that takes one chunk off the front of the list and returns two values: that chunk, the rest of the list. Then on-chunks can be implemented with recursion by calling that function. e.g. (chunk-aux 2 '(a b c d e f)) ;=> (a b) (c d e f).
I hope that helps!
Expanding upon the answer above, a fold would be a very easy way to recursively sum up n rows, while srfi-1's (take) and (drop) procedures will split the two-dimensional list as needed:
(use srfi-1)
(define (sum-rows xs)
(foldl (lambda (x acc)
(map + x acc))
(car xs)
(cdr xs)))
(define (merge-rows xs n)
(cons (sum-rows (take xs n))
(drop xs n)))
Though performance wouldn't be ideal for absurdly long lists, as (take) and (drop) would be redundant computations.

Checks the primality of consecutive odd integers in a specified range

The following program finds the smallest integral divisor (greater than 1) of a given number n. It does this in a straightforward way, by testing n for divisibility by successive integers starting with 2.
n is prime if and only if n is its own smallest divisor.
(define (square x) (* x x))
(define (divisible? a b)
(= (remainder a b) 0))
(define (find-divisor n test)
(cond ((> (square test) n) n)
((divisible? n test) test)
(else (find-divisor n (+ test 1)))))
(define (smallest-divisor n)
(find-divisor n 2))
(define (prime? n)
(= (smallest-divisor n) n))
How to write a procedure that checks the primality of consecutive odd integers in a specified range?
(define (search_for_primes from to)
(cond ((> from to) false)
((prime? from) (display from))
(else (search_for_primes (+ 1 from) to))))
My solution just write 1 to the output.
A cond will stop at the first match and execute the corresponding expressions only. So if you execute (search_for_primes 1 xxx), 1 is erroneously identified as a prime and the procedure stops there.
What you want is something like
(define (search_for_primes from to)
(unless (> from to)
(when (prime? from)
(display from)
(display " "))
(search_for_primes (+ 1 from) to)))
where the recursion is done regardless of whether you found a prime or not.
Testing:
> (search_for_primes 2 100)
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
You should definitely start with doing an efficient sieve (like the sieve of Eratosthenes) over the range to efficiently catch multiples of small primes. If your numbers are small, just doing that up to sqrt(n) is good enough. (This is good enough for, for example, Project Euler problems.)
If your range is small and numbers large, use that to just get "likely primes", then use your favorite primality test (see https://en.wikipedia.org/wiki/Primality_test for some options) on each one.
If your range is large and your numbers are large...you've got problems. :-)

Cut the stick HackerRank Challenge Lisp implementation

I am pretty stumped right now. Mind you the struggle has taught me a lot about lisp already. However, I may need a little nudge or guidance at this point.
Cut the sticks challenge
You are given N sticks, where each stick is of positive integral length. A cut operation is performed on the sticks such that all of them are reduced by the length of the smallest stick.
Suppose we have 6 sticks of length
5 4 4 2 2 8
then in one cut operation we make a cut of length 2 from each of the 6 sticks. For next cut operation 4 sticks are left (of non-zero length), whose length are
3 2 2 6
Above step is repeated till no sticks are left.
Given length of N sticks, print the number of sticks that are cut in subsequent cut operations.
Input Format
The first line contains a single integer N.
The next line contains N integers: a0, a1,...aN-1 separated by space, where ai represents the length of ith stick.
Output Format
For each operation, print the number of sticks that are cut in separate line.
Constraints
1 ≤ N ≤ 1000
1 ≤ ai ≤ 1000
So I get all of the sample test cases correct but some others I do not. For example
With an input of:
8
8 8 14 10 3 5 14 12
They expect an output of
8
7
6
4
3
2
However my code gives
8
7
6
4
2
Here is the function I have come up with for now.
(defun cut-print (numbers cut-length)
(let ((x numbers) (y cut-length) (k 0))
(loop while (> (length x) 0) do
(tagbody
;; subtracting the min value from all list elements
(setq x (map 'list (lambda (i) (- i y)) x))
;; Don't print if the list length hasn't changed
;; from last iteration
;; else save length changes and print
(cond ((= k (length x)) (go bottom))
((not (= k (length x)))
(setq k (length x))
(format t "~d~%" k)))
;; move to here if nothing is printed to
;; stdout during the current iteration
bottom
(setq x (remove-if (lambda (x) (<= x 0)) x))))))
What am I overlooking? Depending on the test case it seems that the logic above will skip over a cut operation according to their expected output.
How is y changing? In your program it is not changing...
Style:
Get rid of TAGBODY and GO.
Replace COND with IF.
What is the use of variables x and y?
Use descriptive names instead of x, y, i, k.
A simple recursive version:
(defun cut (sticks)
(when sticks
(print (length sticks))
(let ((smallest (reduce #'min sticks)))
(cut (remove-if-not #'plusp
(mapcar (lambda (stick)
(- stick smallest))
sticks))))))
Another recursive version could look like this:
(defun cut (sticks)
(labels ((%cut (sticks)
(when sticks
(print (length sticks))
(let ((smallest (first sticks)))
(%cut (mapcar (lambda (stick)
(- stick smallest))
(member smallest (rest sticks)
:test-not #'=)))))))
(%cut (sort sticks #'<))))
or even:
(defun cut (sticks)
(labels ((%cut (sticks length)
(when sticks
(print length)
(let ((prefix-length (or (position (first sticks) sticks
:test-not #'=)
1)))
(%cut (nthcdr prefix-length sticks)
(- length prefix-length))))))
(setf sticks (sort sticks #'<))
(%cut sticks (length sticks))))
A simple LOOP version:
(defun cut (numbers)
(loop with smallest
while numbers do
(print (length numbers))
(setf smallest (reduce #'min numbers)
numbers (loop for n in numbers
for n1 = (- n smallest)
when (plusp n1)
collect n1))))
As a small brain teaser, here is a shorter solution to the problem:
(defun sticks (&rest sticks)
(do ((rest (sort sticks #'<) (remove (car rest) rest)))
((null rest))
(print (length rest))))
Edit: I agree with Rainer Joswig, but leave to code unchanged so that his comment still makes sense.
It looks like you complicating the things. Why to use tagbody at all? Here is a simple Common Lisp solution for this so-called challenge. It passes their test.
(defun cut (sticks)
(let ((shortest (reduce #'min sticks)))
(mapcan (lambda (x) ;; I user mapcan to not traverse list twice
(let ((res (- x shortest)))
(when (plusp res) (list res)))) sticks)))
(defun cut-the-sticks (n sticks)
(if (null sticks)
nil
(let ((cutted (cut sticks)))
(format t "~&~D" n)
(cut-the-sticks (length cutted) cutted))))
(cut-the-sticks (read)
(with-input-from-string (in (read-line))
(loop :for x = (read in nil nil)
:while x :collect x)))
Have very little practice with lisp (can't get hang of cons cells) so I will give solution in python
def cutprint(lst):
#sort the list
lst = sorted(lst)
#let the maxcut so far be the size of first stick
maxcut = lst[0]
#get the size of the list
n = len(lst)
#submit the initial size of the list
yield n
#Loop over all sticks in the list
for stick in lst:
#subtract the current max cut from the stick
stick -= maxcut
#if the cut was to little, we have done the maximum cuts possible
if stick > 0:
#Add the remainder of the last cut to maxcut
maxcut += stick
#submit the current value of n
yield n
#Since we are cutting at each iteration, subtract 1 from n
n -= 1
I think the code is pretty self explanatory and it should be easy to understand
Usage:
>>> import stick
>>> for k in stick.cutprint([2, 2, 3, 4, 5, 7, 4, 2, 3, 4, 5, 6, 7, 34]):
... print k
...
14
11
9
6
4
3
1

In clojure[script], how to return nearest elements between 2 sorted vectors

In clojure[script], how to write a function nearest that receives two sorted vectors a, b and returns for each element of a the nearest element of b?
As an example,
(nearest [1 2 3 101 102 103] [0 100 1000]); [0 0 0 100 100 100]
I would like the solution to be both idiomatic and with good performances: O(n^2) is not acceptable!
Using a binary search or a sorted-set incurs a O(n*log m) time complexity where n is (count a) and m (count b).
However leveraging the fact that a and b are sorted the time complexity can be O(max(n, m)).
(defn nearest [a b]
(if-let [more-b (next b)]
(let [[x y] b
m (/ (+ x y) 2)
[<=m >m] (split-with #(<= % m) a)]
(lazy-cat (repeat (count <=m) x)
(nearest >m more-b)))
(repeat (count a) (first b))))
=> (nearest [1 2 3 101 102 103 501 601] [0 100 1000])
(0 0 0 100 100 100 100 1000)
Let n be (count a) and m be (count b). Then, if a and b are both ordered, then this can be done in what I believe ought to be O(n log(log m)) time, in other words, very close to linear in n.
First, let's re-implement abs and a binary-search (improvements here) to be independent of host (leveraging a native, e.g. Java's, version ought to be significantly faster)
(defn abs [x]
(if (neg? x) (- 0 x) x))
(defn binary-search-nearest-index [v x]
(if (> (count v) 1)
(loop [lo 0 hi (dec (count v))]
(if (== hi (inc lo))
(min-key #(abs (- x (v %))) lo hi)
(let [m (quot (+ hi lo) 2)]
(case (compare (v m) x)
1 (recur lo m)
-1 (recur m hi)
0 m))))
0))
If b is sorted, a binary search in b takes log m steps. So, mapping this over a is a O(n log m) solution, which for the pragmatist is likely good enough.
(defn nearest* [a b]
(map #(b (binary-search-nearest-index b %)) a))
However, we can also use the fact that a is sorted to divide and conquer a.
(defn nearest [a b]
(if (< (count a) 3)
(nearest* a b)
(let [m (quot (count a) 2)
i (binary-search-nearest-index b (a m))]
(lazy-cat
(nearest (subvec a 0 m) (subvec b 0 (inc i)))
[(b i)]
(nearest (subvec a (inc m)) (subvec b i))))))
I believe this ought to be O(n log(log m)). We start with the median of a and find nearest in b in log m time. Then we recurse on each half of a with split portions of b. If a m-proportional factor of b are split each time, you have O(n log log m). If only a constant portion is split off then the half of a working on that portion is linear time. If that continues (iterative halves of a work on constant size portions of b) then you have O(n).
Inspired by #amalloy, I have found this interesting idea by Chouser and wrote this solution:
(defn abs[x]
(max x (- x)))
(defn nearest-of-ss [ss x]
(let [greater (first (subseq ss >= x))
smaller (first (rsubseq ss <= x))]
(apply min-key #(abs (- % x)) (remove nil? [greater smaller]))))
(defn nearest[a b]
(map (partial nearest-of-ss (apply sorted-set a)) b))
Remark: It's important to create the sorted-set only once, in order to avoid performance penalty!

Resources