Points and segments - algorithm

I'm doing an online course and got stuck on this problem.
The first line contains two integers n and m (1 ≤ n, m ≤ 50000): the number of segments and the number of points on a line, respectively. The next n lines each contain two integers a_i ≤ b_i defining the i-th segment. The next line contains m integers defining the points. All the integers have absolute value at most 10^8. For each point, output the number of segments that contain it.
My solution is:
for point in points:
    occurrence = 0
    for l, r in segments:
        if l <= point <= r:
            occurrence += 1
    print(occurrence, end=" ")
The complexity of this algorithm is O(m*n), which is obviously not very efficient. What is the best way of solving this problem? Any help will be appreciated!
Sample Input:
2 3
0 5
7 10
1 6 11
Sample Output:
1 0 0
Sample Input 2:
1 3
-10 10
-100 100 0
Sample Output 2:
0 0 1

You can use a sweep line algorithm to solve this problem.
First, break each segment into two events: its open point and its close point.
Add all these events together with the m query points, and sort them by coordinate. When several share a coordinate, put open points first, then query points, then close points, so that a point lying on a segment's boundary is counted as inside it.
Iterate through the sorted list while maintaining a counter: every time you encounter an open point, increase the counter; every time you encounter a close point, decrease it. When you encounter one of the m query points, the result for that point is the value of the counter at that moment.
For example 2, we have:
1 3
-10 10
-100 100 0
After sorting, what we have is:
-100 -10 0 10 100
At point -100, this is a query point and `counter = 0`, so the result is 0
At point -10, this is an open point, so we increase: `counter = 1`
At point 0, this is a query point, so the result is 1
At point 10, this is a close point, so we decrease: `counter = 0`
At point 100, this is a query point, so the result is 0
So the result for point -100 is 0, for point 100 is 0 and for point 0 is 1, as expected.
Time complexity is O((n + m) log (n + m)).
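The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation; the function name and the numeric event codes are assumptions made for the example, chosen so that ties sort as open, then query, then close.

```python
def count_segments_per_point(segments, points):
    """For each point, count the segments [l, r] (inclusive) that contain it."""
    # Hypothetical event codes: at equal coordinates a segment start (0)
    # sorts before a query point (1), which sorts before a segment end (2).
    START, POINT, END = 0, 1, 2
    events = []
    for l, r in segments:
        events.append((l, START, -1))
        events.append((r, END, -1))
    for i, p in enumerate(points):
        events.append((p, POINT, i))  # remember the input position
    events.sort()

    result = [0] * len(points)
    counter = 0  # number of segments currently open
    for _, kind, index in events:
        if kind == START:
            counter += 1
        elif kind == END:
            counter -= 1
        else:
            result[index] = counter
    return result


print(count_segments_per_point([(0, 5), (7, 10)], [1, 6, 11]))  # [1, 0, 0]
print(count_segments_per_point([(-10, 10)], [-100, 100, 0]))    # [0, 0, 1]
```

Storing the original index with each query point is what lets the answers come out in input order even though the events are processed in sorted order.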

[Original answer] by how many segments is each point used
I am not sure I got the problem correctly, but it looks like a simple example of histogram use ...
create counter array (one item per point)
set it to zero
process the last line incrementing each used point counter O(m)
write the answer by reading histogram O(n)
So the result should be O(m+n) something like (C++):
const int n=2,m=3;
const int p[n][2]={ {0,5},{7,10} };
const int s[m]={1,6,11};
int i,cnt[n];
for (i=0;i<n;i++) cnt[i]=0;
for (i=0;i<m;i++) if ((s[i]>=0)&&(s[i]<n)) cnt[s[i]]++;
for (i=0;i<n;i++) cout << cnt[i] << " "; // result: 0 1
But as you can see, the p[] coordinates are never used, so either I missed something in your problem description, or you are missing something, or it is there just to trick solvers ...
[edit1] after clearing the inconsistencies in OP the result is a bit different
By how many points is each segment used:
create counter array (one item per segment)
set it to zero
process the last line incrementing each used point counter O(m)
write the answer by reading histogram O(m)
So the result is O(m) something like (C++):
const int n=2,m=3;
const int p[n][2]={ {0,5},{7,10} };
const int s[m]={1,6,11};
int i,cnt[m];
for (i=0;i<m;i++) cnt[i]=0;
for (i=0;i<m;i++) if ((s[i]>=0)&&(s[i]<n)) cnt[i]++;
for (i=0;i<m;i++) cout << cnt[i] << " "; // result: 1,0,0
[Notes]
After the new sample set was added to the OP, it is now clear that:
indexes start from 0
the problem is how many points from table p[n] are really used by each segment (m numbers in output)

Use Binary Search.
Sort the line segments by their first value, then by their second value. If you use C++, you can use a custom comparator like this:
bool fun(pair<int,int> a, pair<int,int> b){
    if(a.first<b.first)
        return true;
    if(a.first>b.first)
        return false;
    return a.second < b.second;
}
sort(a,a+n,fun); //a is your array of pair<int,int>, each holding the two end coordinates of a line
//(this is also the default ordering of std::pair, so plain sort(a,a+n) behaves the same)
Then, for every point, find the first line that captures the point and the first line after it that does not. If no line captures the point, you can return -1 or something similar (and skip the search for the line that does not).
Something like:
int checkFirstHold(pair<int,int> a[], int p,int min, int max){ //p is the point
    while(min < max){
        int mid = (min + max)/2;
        if(a[mid].first <= p && a[mid].second>=p && a[mid-1].first<p && a[mid-1].second<p) //i.e., p is in line a[mid] and not in line a[mid-1]
            return mid;
        if(a[mid].first <= p && a[mid].second>=p && a[mid-1].first<=p && a[mid-1].second>=p) //i.e., p is in both line a[mid] and line a[mid-1]
            max = mid-1;
        if(a[mid].first < p && a[mid].second<p ) //i.e., p is not in line a[mid]
            min = mid + 1;
    }
    return -1; //implying no line holds the point
}
Similarly, write a checkLastHold function.
Then, find checkLastHold - checkFirstHold for every point, which is the answer.
The complexity of this solution will be O(m log n) for the queries, as each of the m points needs two binary searches over the n segments, plus O(n log n) for the initial sort.
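A different binary-search formulation, shown here only as a hedged sketch and not as the method described above, avoids searching the sorted segment list directly. It sorts the start coordinates and the end coordinates separately and uses the fact that the segments containing p are exactly those that have started (l <= p) minus those that have already finished (r < p). The function name is made up for illustration.

```python
from bisect import bisect_left, bisect_right


def count_segments_bisect(segments, points):
    """Count, for each point, the inclusive segments [l, r] containing it."""
    starts = sorted(l for l, _ in segments)
    ends = sorted(r for _, r in segments)
    result = []
    for p in points:
        started = bisect_right(starts, p)  # segments with l <= p
        finished = bisect_left(ends, p)    # segments with r < p
        result.append(started - finished)
    return result


print(count_segments_bisect([(0, 5), (7, 10)], [1, 6, 11]))  # [1, 0, 0]
print(count_segments_bisect([(-10, 10)], [-100, 100, 0]))    # [0, 0, 1]
```

Sorting costs O(n log n) and each of the m queries costs O(log n), so the total is O((n + m) log n).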

Here is my counter-based solution in Java.
Note that all points, segment start and segment end are read into one array.
If points of different PointType have the same x-coordinate, then the point is sorted after segment start and before segment end. This is done to count the point as "in" the segment if it coincides with both the segment start (counter already increased) and the segment end (counter not yet decreased).
For storing an answer in the same order as the points from the input, I create the array result of size pointsCount (only points counted, not the segments) and set its element with index SuperPoint.index, which stores the position of the point in the original input.
import java.util.Arrays;
import java.util.Scanner;
public final class PointsAndSegmentsSolution {
enum PointType { // in order of sort, so that the point will be counted on both segment start and end coordinates
SEGMENT_START,
POINT,
SEGMENT_END,
}
static class SuperPoint {
final PointType type;
final int x;
final int index; // -1 (actually does not matter) for segments, index for points
public SuperPoint(final PointType type, final int x) {
this(type, x, -1);
}
public SuperPoint(final PointType type, final int x, final int index) {
this.type = type;
this.x = x;
this.index = index;
}
}
private static int[] countSegments(final SuperPoint[] allPoints, final int pointsCount) {
Arrays.sort(allPoints, (o1, o2) -> {
if (o1.x < o2.x)
return -1;
if (o1.x > o2.x)
return 1;
return Integer.compare( o1.type.ordinal(), o2.type.ordinal() ); // points with the same X coordinate by order in PointType enum
});
final int[] result = new int[pointsCount];
int counter = 0;
for (final SuperPoint superPoint : allPoints) {
switch (superPoint.type) {
case SEGMENT_START:
counter++;
break;
case SEGMENT_END:
counter--;
break;
case POINT:
result[superPoint.index] = counter;
break;
default:
throw new IllegalArgumentException( String.format("Unknown SuperPoint type: %s", superPoint.type) );
}
}
return result;
}
public static void main(final String[] args) {
final Scanner scanner = new Scanner(System.in);
final int segmentsCount = scanner.nextInt();
final int pointsCount = scanner.nextInt();
final SuperPoint[] allPoints = new SuperPoint[(segmentsCount * 2) + pointsCount];
int allPointsIndex = 0;
for (int i = 0; i < segmentsCount; i++) {
final int start = scanner.nextInt();
final int end = scanner.nextInt();
allPoints[allPointsIndex] = new SuperPoint(PointType.SEGMENT_START, start);
allPointsIndex++;
allPoints[allPointsIndex] = new SuperPoint(PointType.SEGMENT_END, end);
allPointsIndex++;
}
for (int i = 0; i < pointsCount; i++) {
final int x = scanner.nextInt();
allPoints[allPointsIndex] = new SuperPoint(PointType.POINT, x, i);
allPointsIndex++;
}
final int[] pointsSegmentsCounts = countSegments(allPoints, pointsCount);
for (final int count : pointsSegmentsCounts) {
System.out.print(count + " ");
}
}
}

Find all anagrams in a string O(n) solution

Here is the problem:
Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.
Input: s: "cbaebabacd" p: "abc"
Output: [0, 6]
Input: s: "abab" p: "ab"
Output: [0, 1, 2]
Here is my solution
vector<int> findAnagrams(string s, string p) {
    vector<int> res, s_map(26,0), p_map(26,0);
    int s_len = s.size();
    int p_len = p.size();
    if (s_len < p_len) return res;
    for (int i = 0; i < p_len; i++) {
        ++s_map[s[i] - 'a'];
        ++p_map[p[i] - 'a'];
    }
    if (s_map == p_map)
        res.push_back(0);
    for (int i = p_len; i < s_len; i++) {
        ++s_map[s[i] - 'a'];
        --s_map[s[i - p_len] - 'a'];
        if (s_map == p_map)
            res.push_back(i - p_len + 1);
    }
    return res;
}
However, I think it is an O(n^2) solution because I have to compare the vectors s_map and p_map.
Does an O(n) solution exist for this problem?
Let's say p has size n.
Let's say you have an array A of size 26 that holds how many times each of a, b, c, ... occurs in p.
Then you create a new array B of size 26 filled with 0.
Let's call the given (big) string s.
First of all, you initialize B with the counts of a, b, c, ... in the first n chars of s.
Then you iterate through each window of size n in s, always updating B to fit the current window.
Whenever B matches A, the start index of the current window is the start of an anagram.
To change B from one window to the next, notice that you just have to remove from B the first char of the previous window and add the new char of the next window.
Look at the example:
Input
s: "cbaebabacd"
p: "abc" n = 3 (size of p)
A = {1, 1, 1, 0, 0, 0, ... } // p contains just 1a, 1b and 1c.
B = {1, 1, 1, 0, 0, 0, ... } // initially, the first n-sized word contains this.
compare(A,B)
for i = n; i < size of s; i++ {
    B[ s[i-n] ]--;
    B[ s[ i ] ]++;
    compare(A,B)
}
and suppose that compare(A,B) prints the current start index whenever A matches B.
The total complexity will be:
first fill of A = O(size of p)
first fill of B = O(size of p)
first comparison = O(26)
for-loop = |s| * (2 + O(26)) = |s| * O(28) = O(28|s|) = O(size of s)
____________________________________________________________________
O(size of s) + 2 * O(size of p) + O(26)
which is linear in the size of s, since p is no longer than s whenever there is anything to find.
Your solution is the O(n) solution. The size of the s_map and p_map vectors is a constant (26) that doesn't depend on n. So the comparison between s_map and p_map takes a constant amount of time regardless of how big n is.
Your solution takes about 26 * n integer comparisons to complete, which is O(n).
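If the 26-entry comparison at every step is a concern, it can be avoided by tracking how many letters currently differ between the window and p. The following Python sketch is one possible way to do that, assuming lowercase a-z input as in the question's code; the function and variable names are made up for illustration.

```python
def find_anagrams(s, p):
    """Return start indices of anagrams of p in s (lowercase a-z assumed)."""
    n, m = len(s), len(p)
    if m == 0 or n < m:
        return []
    # diff[k] = (count of letter k in the window) - (count of letter k in p)
    diff = [0] * 26
    for ch in p:
        diff[ord(ch) - ord("a")] -= 1
    for ch in s[:m]:
        diff[ord(ch) - ord("a")] += 1
    nonzero = sum(1 for d in diff if d != 0)  # letters that currently differ
    result = [0] if nonzero == 0 else []

    def bump(ch, delta):
        # Update one letter and keep the 'nonzero' tally in sync, in O(1).
        nonlocal nonzero
        k = ord(ch) - ord("a")
        if diff[k] == 0:
            nonzero += 1
        diff[k] += delta
        if diff[k] == 0:
            nonzero -= 1

    for i in range(m, n):
        bump(s[i], +1)       # letter entering the window
        bump(s[i - m], -1)   # letter leaving the window
        if nonzero == 0:
            result.append(i - m + 1)
    return result


print(find_anagrams("cbaebabacd", "abc"))  # [0, 6]
print(find_anagrams("abab", "ab"))         # [0, 1, 2]
```

Each slide of the window now does a constant amount of work that does not depend on the alphabet size; only the initial setup touches all 26 counters.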
// In papers on string searching algorithms, the alphabet is often
// called Sigma, and it is often not considered a constant. Your
// algorithm works in O(Sigma * n) time, where n is the length of the
// longer string. Below is an algorithm that works in O(n) time even
// when Sigma is too large to make an array of size Sigma, as long as
// values from Sigma fit in a constant number of "machine words".
// This solution works in O(n) time "with high probability", meaning
// that for all c > 2 the probability that the algorithm takes more
// than c*n time is o(n^-c). This is a weaker guarantee than O(n)
// worst-case because it uses hash tables, which depend on randomness.
#include <cstdint>
#include <functional>
#include <iostream>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <vector>
using namespace std;
// Finding a needle in a haystack. This works for any iterable type
// whose members can be stored as keys of an unordered_map.
template <typename T>
vector<size_t> AnagramLocations(const T& needle, const T& haystack) {
  // Think of a contiguous region of an ordered container as
  // representing a function f with the domain being the type of item
  // stored in the container and the codomain being the natural
  // numbers. We say that f(x) = n when there are n x's in the
  // contiguous region.
  //
  // Then two contiguous regions are anagrams when they have the same
  // function. We can track how close they are to being anagrams by
  // subtracting one function from the other, pointwise. When that
  // difference is uniformly 0, then the regions are anagrams.
  unordered_map<remove_const_t<remove_reference_t<decltype(*needle.begin())>>,
                intmax_t> difference;
  // As we iterate through the haystack, we track the lead (part
  // closest to the end) and lag (part closest to the beginning) of a
  // contiguous region in the haystack. When we move the region
  // forward by one, one part of the function f is increased by +1 and
  // one part is decreased by -1, so the same is true of difference.
  auto lag = haystack.begin(), lead = haystack.begin();
  // To compare difference to the uniformly-zero function in O(1)
  // time, we make sure it does not contain any points that map to
  // 0. Then the property of being uniformly zero is the same as the
  // property of having an empty difference.
  const auto find = [&](const auto& x) {
    difference[x]++;
    if (0 == difference[x]) difference.erase(x);
  };
  const auto lose = [&](const auto& x) {
    difference[x]--;
    if (0 == difference[x]) difference.erase(x);
  };
  vector<size_t> result;
  // First we initialize the difference with the first needle.size()
  // items from both needle and haystack. A haystack shorter than the
  // needle cannot contain an anagram of it.
  for (const auto& x : needle) {
    if (lead == haystack.end()) return result;
    lose(x);
    find(*lead);
    ++lead;
  }
  size_t i = 0;
  if (difference.empty()) result.push_back(i);
  // Now we iterate through the haystack with lead, lag, and i (the
  // start index of the current region), updating difference in O(1)
  // time at each spot.
  for (; lead != haystack.end(); ++lead, ++lag) {
    find(*lead);
    lose(*lag);
    ++i;  // the region now starts one item later
    if (difference.empty()) result.push_back(i);
  }
  return result;
}
int main() {
  string needle, haystack;
  cin >> needle >> haystack;
  const auto result = AnagramLocations(needle, haystack);
  for (auto x : result) cout << x << ' ';
}
import java.util.*;
public class FindAllAnagramsInAString_438{
public static void main(String[] args){
String s="abab";
String p="ab";
// String s="cbaebabacd";
// String p="abc";
System.out.println(findAnagrams(s,p));
}
public static List<Integer> findAnagrams(String s, String p) {
int i=0;
int j=p.length();
List<Integer> list=new ArrayList<>();
while(j<=s.length()){
//System.out.println("Substring >>"+s.substring(i,j));
if(isAnamgram(s.substring(i,j),p)){
list.add(i);
}
i++;
j++;
}
return list;
}
public static boolean isAnamgram(String s,String p){
HashMap<Character,Integer> map=new HashMap<>();
if(s.length()!=p.length()) return false;
for(int i=0;i<s.length();i++){
char chs=s.charAt(i);
char chp=p.charAt(i);
map.put(chs,map.getOrDefault(chs,0)+1);
map.put(chp,map.getOrDefault(chp,0)-1);
}
for(int val:map.values()){
if(val!=0) return false;
}
return true;
}
}

Special numbers challenge in programming

First, sorry for my bad English.
Special numbers are numbers whose sum of digits is divisible by their number of digits.
Example: 135 is a special number because the sum of its digits is 1+3+5 = 9, the number of digits is 3, and 9 is divisible by 3 because 9 % 3 == 0. 2, 3, 9, 13, 17, 15, 225 and 14825 are also special numbers.
Requirement:
Write a program that reads numbers n (n <= 10^6) from a file named SNUMS.INP (SNUMS.INP can contain up to 10^6 numbers) and prints the results to the file SNUMS.OUT. Each number n is a position in the ordered list of special numbers, and the result for it is the n-th special number.
Example: n = 3 means you have to print the 3rd special number, which is 3; for n = 10 you have to print the 10th special number, which is 11; for n = 13, the 13th special number, which is 17; for n = 15, the 15th special number, which is 20.
The example below shows the files SNUMS.INP and SNUMS.OUT (remember: SNUMS.INP can contain up to 10^6 numbers).
SNUMS.INP:
2
14
17
22
SNUMS.OUT:
2
19
24
35
I have my own algorithm, but the running time exceeds 1 second (my SNUMS.INP has 10^6 numbers). So I need an optimal algorithm whose running time is at most 1 s.
I decided to post my own code, which is written in Java; it always takes more than 4 seconds to run. Could you please suggest some ideas to improve it or make it run faster?
import java.util.Scanner;
import java.io.*;
public class Test
{
public static void main(String[]args) throws IOException
{
File file = new File("SNUMS.INP");
Scanner inputFile = new Scanner(file);
int order = 1;
int i = 1;
int[] special = new int[1000000+1];
// Write all 10^6 special numbers into an array named "special"
while (order <= 1000000)
{
if (specialNumber(i) == true)
{
special[order] = i;
order++;
}
i++;
}
// Write the result to file
PrintWriter outputFile = new PrintWriter("SNUMS.OUT");
outputFile.println(special[inputFile.nextInt()]);
while (inputFile.hasNext())
outputFile.println(special[inputFile.nextInt()]);
outputFile.close();
}
public static boolean specialNumber(int i)
{
// This method check whether the number is a special number
boolean specialNumber = false;
byte count=0;
long sum=0;
while (i != 0)
{
sum = sum + (i % 10);
count++;
i = i / 10;
}
if (sum % count == 0) return true;
else return false;
}
}
This is a sample SNUMS.INP file containing 10^6 numbers, if you want to test.
https://drive.google.com/file/d/0BwOJpa2dAZlUNkE3YmMwZmlBOTg/view?usp=sharing
I've managed to solve it in 0.6 seconds with C# 6.0 (.NET 4.6 IA-64) on a Core i7 3.2 GHz with a 7200 rpm HDD; I hope the precomputation will be fast enough on your workstation:
// Precompute beautiful numbers
private static int[] BeautifulNumbers(int length) {
    int[] result = new int[length];
    int index = 0;
    for (int i = 1; ; ++i) {
        int sum = 0;
        int count = 0;
        for (int v = i; v > 0; sum += v % 10, ++count, v /= 10)
            ;
        if (sum % count == 0) {
            result[index] = i;
            if (++index >= result.Length)
                return result;
        }
    }
}
...
// Test file with 1e6 items
File.WriteAllLines(@"D:\SNUMS.INP", Enumerable
    .Range(1, 1000000)
    .Select(index => index.ToString()));
...
Stopwatch sw = new Stopwatch();
sw.Start();
// Precomputed numbers (about 0.3 seconds to be created)
int[] data = BeautifulNumbers(1000000);
// File (about 0.3 seconds for both reading and writing)
var result = File
    .ReadLines(@"D:\SNUMS.INP")
    .Select(line => data[int.Parse(line) - 1].ToString());
File.WriteAllLines(@"D:\SNUMS.OUT", result);
sw.Stop();
Console.Write("Elapsed time {0}", sw.ElapsedMilliseconds);
The output varies from
Elapsed time 516
to
Elapsed time 660
with an average elapsed time of about 580 milliseconds.
Now that you have the abacus metaphor implemented below, here are some hints:
Instead of just incrementing by 1 inside a loop, can we increment more aggressively? Indeed we can, but with an extra bit of care.
First, how aggressive can we be? Looking at 11 (the first special number with 2 digits), it doesn't pay to increment by just 1; we can increment by 2. Looking at 102 (the first special number with 3 digits), we can increment by 3. Is it natural to think we should use increments equal to the number of digits?
Now the "extra bit of care": whenever the "increment by the number of digits" causes a carry, the naive increment breaks, because the carry will add 1 to the sum of digits, so we may need to subtract that one from something to keep the sum of digits well behaved.
One of the issues in the above is that we jumped quite happily to the "first special number with N digits", but the computer cannot see it at a glance as we do. Fortunately, the "first special number with N digits" is easy to compute: it is 10^(N-1)+(N-1). The term 10^(N-1) contributes a 1 followed by zeros, and N-1 brings the sum of digits up to N, the first value divisible by N. Of course, this will break down if N > 10, but fortunately the problem is limited to 10^6 special numbers, which will require at most 7 digits (the millionth special number is 6806035, which has 7 digits).
So, we can detect the "first special number with N digits", and we know we should try, with care, to increment it by N. Can we now take a closer look at that "extra care"?
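The formula for the first special number with N digits can be checked against a brute-force search. The Python sketch below is only a sanity check for the digit counts relevant here (1 to 7); the helper names are made up for illustration.

```python
def is_special(x):
    """True when the digit sum of x is divisible by its number of digits."""
    digits = [int(c) for c in str(x)]
    return sum(digits) % len(digits) == 0


def first_special_with_digits(n_digits):
    """Brute force: scan upward from the smallest number with n_digits digits."""
    x = 10 ** (n_digits - 1)
    while not is_special(x):
        x += 1
    return x


for n_digits in range(1, 8):
    formula = 10 ** (n_digits - 1) + (n_digits - 1)
    assert first_special_with_digits(n_digits) == formula
    print(n_digits, formula)
```

For example, with 2 digits the check confirms 11 and with 3 digits it confirms 102, matching the numbers used in the hints.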
The code - twice as speedy as the previous one and totally "orthodox" in obtaining the data (via getters instead of direct access to data members).
Feel free to inline:
import java.util.ArrayList;
import java.util.Arrays;
public class Abacus {
static protected int pow10[]=
{1,10,100,1000, 10000, 100000, 1000000, 10000000, 100000000}
;
// the value stored for line[i] corresponds to digit[i]*pow10[i]
protected int lineValues[];
protected int sumDigits;
protected int representedNumber;
public Abacus() {
this.lineValues=new int[0];
this.sumDigits=0;
this.representedNumber=0;
}
public int getLineValue(int line) {
return this.lineValues[line];
}
public void clearUnitLine() {
this.sumDigits-=this.lineValues[0];
this.representedNumber-=this.lineValues[0];
this.lineValues[0]=0;
}
// This is how you operate the abacus in real life being asked
// to add a number of units to the line presenting powers of 10
public boolean addWithCarry(int units, int line) {
if(line==pow10.length) {
// don't have enough pow10 stored
pow10=Arrays.copyOf(pow10, pow10.length+1);
pow10[line]=pow10[line-1]*10;
}
if(line>=this.lineValues.length) {
// don't have enough lines for the carry
this.lineValues=Arrays.copyOf(this.lineValues, line+1);
}
int digitOnTheLine=this.lineValues[line]/pow10[line];
int carryOnTheNextLine=0;
while(digitOnTheLine+units>=10) {
carryOnTheNextLine++;
units-=10;
}
if(carryOnTheNextLine>0) {
// we have a carry, the sumDigits will be affected
// 1. the next two statememts are equiv with "set a value of zero on the line"
this.sumDigits-=digitOnTheLine;
this.representedNumber-=this.lineValues[line];
// this is the new value of the digit to set on the line
digitOnTheLine+=units;
// 3. set that value and keep all the values synchronized
this.sumDigits+=digitOnTheLine;
this.lineValues[line]=digitOnTheLine*pow10[line];
this.representedNumber+=this.lineValues[line];
// 4. as we had a carry, the next line will be affected as well.
this.addWithCarry(carryOnTheNextLine, line+1);
}
else { // we an simply add the provided value without carry
int delta=units*pow10[line];
this.lineValues[line]+=delta;
this.representedNumber+=delta;
this.sumDigits+=units;
}
return carryOnTheNextLine>0;
}
public int getSumDigits() {
return this.sumDigits;
}
public int getRepresentedNumber() {
return this.representedNumber;
}
public int getLinesCount() {
return this.lineValues.length;
}
static public ArrayList<Integer> specials(int N) {
ArrayList<Integer> ret=new ArrayList<>(N);
Abacus abacus=new Abacus();
ret.add(1);
abacus.addWithCarry(1, 0); // to have something to add to
int increment=abacus.getLinesCount();
while(ret.size()<N) {
boolean hadCarry=abacus.addWithCarry(increment, 0);
if(hadCarry) {
// need to resynch the sum for a perfect number
int newIncrement=abacus.getLinesCount();
abacus.clearUnitLine();
if(newIncrement!=increment) {
// we switched powers of 10
abacus.addWithCarry(newIncrement-1, 0);
increment=newIncrement;
}
else { // simple carry
int digitsSum=abacus.getSumDigits();
// how much we should add to the last digit to make the sumDigits
// divisible again with the increment?
int units=increment-digitsSum % increment;
if(units<increment) {
abacus.addWithCarry(units, 0);
}
}
}
ret.add(abacus.getRepresentedNumber());
}
return ret;
}
// to understand how the addWithCarry works, try the following code
static void add13To90() {
Abacus abacus=new Abacus(); // starts with a represented number of 0
// line==1 means units of 10^1
abacus.addWithCarry(9, 1); // so this should make the abacus store 90
System.out.println(abacus.getRepresentedNumber());
// line==0 means units of 10^0
abacus.addWithCarry(13, 0);
System.out.println(abacus.getRepresentedNumber()); // 103
}
static public void main(String[] args) {
int count=1000000;
long t1=System.nanoTime();
ArrayList<Integer> s1=Abacus.specials(count);
long t2=System.nanoTime();
System.out.println("t:"+(t2-t1));
}
}
Constructing the numbers from their digits is bound to be faster.
Remember the abacus? Ever used one?
import java.util.ArrayList;
public class Specials {
static public ArrayList<Integer> computeNSpecials(int N) {
ArrayList<Integer> specials = new ArrayList<>();
int abacus[] = new int[0]; // at index i we have the digit for 10^i
// This way, when we don't have enough specials,
// we simply reallocate the array and continue
while (specials.size() < N) {
// see if a carry operation is necessary
int currDigit = 0;
for (; currDigit < abacus.length && abacus[currDigit] == 9; currDigit++) {
abacus[currDigit] = 0; // a carry occurs when adding 1
}
if (currDigit == abacus.length) {
// a carry, but we don't have enough lines on the abacus
abacus = new int[abacus.length + 1];
abacus[currDigit] = 1; // we resolved the carry, all the digits below
// are 0
} else {
abacus[currDigit]++; // we resolve the carry (if there was one),
currDigit = 0; // now it's safe to continue incrementing at 10^0
}
// let's obtain the current number and the sum of the digits
int sumDigits = 0;
for (int i = 0; i<abacus.length; i++) {
sumDigits += abacus[i];
}
// is it special?
if (sumDigits % abacus.length == 0) {
// only now compute the number and collect it as special
int number = 0;
for (int i = abacus.length - 1; i >= 0; i--) {
number = 10 * number + abacus[i];
}
specials.add(number);
}
}
return specials;
}
static public void main(String[] args) {
ArrayList<Integer> specials=Specials.computeNSpecials(100);
for(int i=0; i<specials.size(); i++) {
System.out.println(specials.get(i));
}
}
}

Check if binary string can be partitioned such that each partition is a power of 5

I recently came across this question - Given a binary string, check if we can partition/split the string into 0..n parts such that each part is a power of 5. Return the minimum number of splits, if it can be done.
Examples would be:
input = "101101" - returns 1, as the string can be split once to form "101" and "101",as 101= 5^1.
input = "1111101" - returns 0, as the string itself is 5^3.
input = "100"- returns -1, as it can't be split into power(s) of 5.
I came up with this recursive algorithm:
Check if the string itself is a power of 5. if yes, return 0
Else, iterate over the string character by character, checking at every point if the number seen so far is a power of 5. If yes, add 1 to split count and check the rest of the string recursively for powers of 5 starting from step 1.
return the minimum number of splits seen so far.
I implemented the above algo in Java. I believe it works alright, but it's a straightforward recursive solution. Can this be solved using dynamic programming to improve the run time?
The code is below:
public int partition(String inp){
    if(inp==null || inp.length()==0)
        return 0;
    int count = partition(inp,inp.length(),0);
    return count==Integer.MAX_VALUE ? -1 : count; // -1: no valid partition
}
public int partition(String inp,int len,int index){
    if(len==index)
        return 0;
    if(isPowerOfFive(inp,index))
        return 0;
    long sub=0;
    int count = Integer.MAX_VALUE;
    for(int i=index;i<len;++i){
        sub = sub*2 +(inp.charAt(i)-'0');
        if(isPowerOfFive(sub)){
            int rest = partition(inp,len,i+1);
            if(rest!=Integer.MAX_VALUE) // skip dead ends; 1+MAX_VALUE would overflow
                count = Math.min(count,1+rest);
        }
    }
    return count;
}
Helper functions:
public boolean isPowerOfFive(String inp,int index){
    long sub = 0;
    for(int i=index;i<inp.length();++i){
        sub = sub*2 +(inp.charAt(i)-'0');
    }
    return isPowerOfFive(sub);
}
public boolean isPowerOfFive(long val){
    if(val==0)
        return false; // 0 is not a power of 5
    if(val==1)
        return false; // 5^0 = 1 is rejected, so only 5^1 and higher count
    while(val>1){
        if(val%5 != 0)
            return false;
        val = val/5;
    }
    return true;
}
Here are some simple improvements that can be made:
Calculate all powers of 5 up front, so the checks are faster.
Stop splitting the input string once the number of splits is already no better than the best split found so far.
Here is my solution using these ideas:
public static List<String> powers = new ArrayList<String>();
public static int bestSplit = Integer.MAX_VALUE;
public static void main(String[] args) throws Exception {
// input string (5^5, 5^1, 5^10)
String inp = "110000110101101100101010000001011111001";
// calc all powers of 5 that fits in given string
for (int pow = 1; ; ++pow) {
String powStr = Long.toBinaryString((long) Math.pow(5, pow));
if (powStr.length() <= inp.length()) { // can be fit in input string
powers.add(powStr);
} else {
break;
}
}
Collections.reverse(powers); // simple heuristics, sort powers in decreasing order
// do simple recursive split
split(inp, 0, -1);
// print result
if (bestSplit == Integer.MAX_VALUE) {
System.out.println(-1);
} else {
System.out.println(bestSplit);
}
}
public static void split(String inp, int start, int depth) {
if (depth >= bestSplit) {
return; // can't do better split
}
if (start == inp.length()) { // perfect split
bestSplit = depth;
return;
}
for (String pow : powers) {
if (inp.startsWith(pow, start)) {
split(inp, start + pow.length(), depth + 1);
}
}
}
EDIT:
I also found another approach, which looks very fast.
Calculate all powers of 5 whose binary string representation is no longer than the input string. Save those strings in a powers array.
For every power string in the powers array: wherever it occurs as a substring of the input, save the start and end indexes of the occurrence into an edges array (array of tuples).
Now we just need to find the shortest path from index 0 to index input.length() along edges from the edges array. Every edge has the same weight, so the shortest path can be found very fast with BFS.
The number of edges in the shortest path is the number of parts, so the minimum number of splits of the input string is that number minus one.
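The graph approach can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name is made up, 5^0 = 1 is not treated as a valid part (matching the loop above that starts at 5^1), and the result is the number of parts minus one so that it agrees with the question's examples.

```python
from collections import deque


def min_splits_bfs(s):
    """Minimum number of splits of binary string s into powers of 5, or -1."""
    n = len(s)
    # Binary strings of all powers of 5 (from 5^1) that fit in the input.
    powers = []
    value = 5
    while value.bit_length() <= n:
        powers.append(bin(value)[2:])
        value *= 5
    # edges[start] lists every end index such that s[start:end] is a power.
    edges = [[] for _ in range(n + 1)]
    for pw in powers:
        start = s.find(pw)
        while start != -1:
            edges[start].append(start + len(pw))
            start = s.find(pw, start + 1)
    # Plain BFS from index 0; dist[v] = fewest parts needed to cover s[:v].
    dist = [-1] * (n + 1)
    dist[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in edges[u]:
            if dist[v] == -1:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist[n] - 1 if dist[n] > 0 else -1


print(min_splits_bfs("101101"))   # 1
print(min_splits_bfs("1111101"))  # 0
print(min_splits_bfs("100"))      # -1
```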
Instead of calculating all possible substrings, you can check the binary representation of the powers of 5 in search of a common pattern. Using something like:
bc <<< "obase=2; for(i = 1; i < 40; i++) 5^i"
You get:
5^1 = 101_2
5^2 = 11001_2
5^3 = 1111101_2
5^4 = 1001110001_2
5^5 = 110000110101_2
5^6 = 11110100001001_2
5^7 = 10011000100101101_2
5^8 = 1011111010111100001_2
5^9 = 111011100110101100101_2
5^10 = 100101010000001011111001_2
5^11 = 10111010010000111011011101_2
5^12 = 1110100011010100101001010001_2
5^13 = 1001000110000100111001110010101_2
5^14 = 101101011110011000100000111101001_2
5^15 = 11100011010111111010100100110001101_2
5^16 = 10001110000110111100100110111111000001_2
5^17 = 1011000110100010101111000010111011000101_2
5^18 = 110111100000101101101011001110100111011001_2
...
5^29 = 10100001100011110000011111010111001101101011100100001011111001010101_2
As you can see, odd powers of 5 always end with 101 and even powers of 5 end with the pattern 10+1 (where + means one or more occurrences).
You could put your input string in a trie and then iterate over it looking for the 10+1 pattern; once you have a match, evaluate it to check that it is not a false positive.
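The ending pattern can be confirmed numerically: 5^k mod 8 alternates between 5 (binary 101) for odd k and 1 (binary 001) for even k. The short Python check below is only a sketch of that observation; the helper name is made up for illustration.

```python
def last_bits(k, width=3):
    """Last `width` binary digits of 5**k, as a string."""
    return format(5 ** k, "b")[-width:]


for k in range(1, 40):
    expected = "101" if k % 2 == 1 else "001"
    assert last_bits(k) == expected

print(last_bits(1), last_bits(2), last_bits(3), last_bits(4))  # 101 001 101 001
```

Because the ending only narrows down candidates, a substring that matches it still has to be evaluated to confirm that it really is a power of 5.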
You just have to save the value for a given string in a map. For example, if you have a string ending like this (each letter may be a string of arbitrary size):
ABCD
You find that part A mod 5 is ok, so you try again for BCD, but find that B mod 5 is also ok, same for C and D as well as CD together. Now you should have the following results cached:
C -> 0
D -> 0
CD -> 0
BCD -> 1 # split B/CD is the best
But you're not finished with ABCD - you find that AB mod 5 is ok, so you check the resulting CD - it's already in the cache and you don't have to process it from the beginning.
In practice you just need to cache answers from partition() - either for the actual string or for the (string, start, length) tuple. Which one is better depends on how many repeating sequences you have and whether it's faster to compare the contents, or just indexes.
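A minimal Python sketch of this caching idea follows, keyed on the start index rather than on the substring itself. The names are made up for illustration, and the same assumptions as in the question's examples apply: a part must be 5^1 or higher, and the result is the number of parts minus one.

```python
from functools import lru_cache


def min_splits_memo(s):
    """Minimum number of splits of binary string s into powers of 5, or -1."""
    n = len(s)
    # Binary strings of all powers of 5 (from 5^1) that fit in the input.
    powers = set()
    value = 5
    while value.bit_length() <= n:
        powers.add(bin(value)[2:])
        value *= 5

    @lru_cache(maxsize=None)
    def parts(start):
        # Fewest parts covering s[start:], or None if it cannot be done.
        if start == n:
            return 0
        best = None
        for end in range(start + 1, n + 1):
            if s[start:end] in powers:
                rest = parts(end)  # cached after the first call
                if rest is not None and (best is None or rest + 1 < best):
                    best = rest + 1
        return best

    total = parts(0)
    return -1 if not total else total - 1


print(min_splits_memo("101101"))   # 1
print(min_splits_memo("1111101"))  # 0
print(min_splits_memo("100"))      # -1
```

Keying the cache on the start index keeps lookups O(1); keying on the substring would also share results between identical repeated suffixes, at the cost of hashing the string.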
Given below is a solution in C++. Using dynamic programming I am considering all the possible splits and saving the best results.
#include <algorithm>
#include <climits>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
typedef long long ll;
// Exact integer test; note that 1 = 5^0 is accepted as a power of 5.
int isPowerOfFive(ll n)
{
    if(n <= 0) return 0;
    while(n % 5 == 0) n /= 5;
    return n == 1;
}
ll solve(string s)
{
    // dp[i] = minimum number of parts that cover the first i characters
    vector<ll> dp(s.length()+1);
    for(int i = 1; i <= (int)s.length(); i++)
    {
        dp[i] = INT_MAX;
        for(int j = 1; j <= i; j++)
        {
            if(s[j-1] == '0') continue; // a part may not start with 0
            if(i-j+1 > 62) continue;    // value would not fit in long long
            ll num = stoll(s.substr(j-1, i-j+1), nullptr, 2);
            if(isPowerOfFive(num))
            {
                dp[i] = min(dp[i], dp[j-1]+1);
            }
        }
    }
    if(dp[s.length()] == INT_MAX) return -1;
    return dp[s.length()] - 1; // parts - 1 = number of splits
}
int main()
{
    string s;
    cin>>s;
    cout<<solve(s);
}

Finding union length of many line segments

I have a few bold line segments on the x-axis, given by their beginning and ending x-coordinates. Some line segments may overlap. How do I find the union length of all the line segments?
Example: one line segment is from (5,0) to (8,0) and another is from (9,0) to (12,0). They do not overlap, so the sum of lengths is 3 + 3 = 6.
Another example: one line segment is from (5,0) to (8,0) and another is from (7,0) to (12,0). They overlap on the range (7,0) to (8,0), so the union length is 7.
Note that the x-coordinates may be floating-point numbers.
Represent a line segment as 2 EndPoint object. Each EndPoint object has the form <coordinate, isStartEndPoint>. Put all EndPoint objects of all the line segments together in a list endPointList.
The algorithm:
Sort endPointList, first by coordinate in ascending order, then place the start end points in front of the tail end points (regardless of which segment, since it doesn't matter - all at the same coordinate).
Loop through the sorted list according to this pseudocode:
prevCoordinate = -Inf
numSegment = 0
unionLength = 0
for (endPoint in endPointList):
    if (numSegment > 0):
        unionLength += endPoint.coordinate - prevCoordinate
    prevCoordinate = endPoint.coordinate
    if (endPoint.isStartEndPoint):
        numSegment = numSegment + 1
    else:
        numSegment = numSegment - 1
The numSegment variable will tell whether we are in a segment or not. When it is larger than 0, we are inside some segment, so we can include the distance to the previous end point. If it is 0, it means that the part before the current end point doesn't contain any segment.
The complexity is dominated by the sorting part: any comparison-based sorting algorithm has a lower bound of Omega(n log n), while the loop is O(n). So the algorithm runs in O(n log n) if you choose an O(n log n) comparison-based sorting algorithm.
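The pseudocode above can be turned into a short runnable sketch. The following Python version is one possible rendering, not the answerer's own code: the function name `union_length` and the encoding of an end point as a `(coordinate, kind)` tuple (0 for a start, 1 for an end) are assumptions made for the example.

```python
def union_length(segments):
    """Total length covered by a list of (start, end) segments."""
    # kind 0 marks a start and kind 1 an end, so at equal coordinates
    # the tuple sort places start end points before tail end points.
    end_points = []
    for start, end in segments:
        end_points.append((start, 0))
        end_points.append((end, 1))
    end_points.sort()

    num_segment = 0   # how many segments cover the gap before this end point
    total = 0
    prev = 0          # only read once num_segment > 0, so the initial value is irrelevant
    for coordinate, kind in end_points:
        if num_segment > 0:
            total += coordinate - prev
        prev = coordinate
        num_segment += 1 if kind == 0 else -1
    return total


print(union_length([(5, 8), (9, 12)]))   # non-overlapping example from the question
print(union_length([(5, 8), (7, 12)]))   # overlapping example from the question
```

The same function works for floating-point coordinates, since it only sorts and subtracts them.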
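Use a range tree. Building it costs O(n log n), just like sorting the begin/end points, but it has the additional advantage that overlapping ranges reduce the number of elements (at the price of a possibly more expensive insertion). As written, the merge step only looks at the direct children of the node being extended, so an overlapping range stored deeper in a subtree can be missed and counted twice.
Snippet (untested):
#include <stdlib.h> /* malloc */
struct segment {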
struct segment *ll, *rr;
float lo, hi;
};
struct segment * newsegment(float lo, float hi) {
struct segment * ret;
ret = malloc (sizeof *ret);
ret->lo = lo; ret->hi = hi;
ret->ll= ret->rr = NULL;
return ret;
}
struct segment * insert_range(struct segment *root, float lo, float hi)
{
if (!root) return newsegment(lo, hi);
/* ranges that neither overlap nor touch the current node go into the {ll,rr} subtrees */
if (hi < root->lo) {
root->ll = insert_range(root->ll, lo, hi);
return root;
}
if (lo > root->hi) {
root->rr = insert_range(root->rr, lo, hi);
return root;
}
/* when we get here, we must have overlap; we can extend the current node
** we also need to check if the broader range overlaps the child nodes
*/
if (lo < root->lo ) {
root->lo = lo;
while (root->ll && root->ll->hi >= root->lo) {
struct segment *tmp;
tmp = root->ll;
root->lo = tmp->lo;
root->ll = tmp->ll;
tmp->ll = NULL;
// freetree(tmp);
}
}
if (hi > root->hi ) {
root->hi = hi;
while (root->rr && root->rr->lo <= root->hi) {
struct segment *tmp;
tmp = root->rr;
root->hi = tmp->hi;
root->rr = tmp->rr;
tmp->rr = NULL;
// freetree(tmp);
}
}
return root;
}
float total_width(struct segment *ptr)
{
float ret;
if (!ptr) return 0.0;
ret = ptr->hi - ptr->lo;
ret += total_width(ptr->ll);
ret += total_width(ptr->rr);
return ret;
}
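For comparison, the merge-on-insert idea can be sketched on a flat sorted list instead of a tree. This is a simplification and not the structure from the snippet above: each insertion scans the whole list, so it costs O(n) per insert, and the names `insert_range` and `total_width` are reused here only for readability.

```python
def insert_range(intervals, lo, hi):
    """Insert [lo, hi] into a sorted list of disjoint (lo, hi) tuples,
    absorbing every interval it overlaps or touches. Returns a new list."""
    merged = []
    placed = False
    for a, b in intervals:
        if b < lo:
            merged.append((a, b))            # entirely left of the new range
        elif a > hi:
            if not placed:
                merged.append((lo, hi))      # the new range goes before the first interval to its right
                placed = True
            merged.append((a, b))            # entirely right of the new range
        else:
            lo, hi = min(lo, a), max(hi, b)  # overlap or touch: widen the new range
    if not placed:
        merged.append((lo, hi))
    return merged


def total_width(intervals):
    """Union length of a list of disjoint intervals."""
    return sum(b - a for a, b in intervals)


intervals = []
for lo, hi in [(10, 20), (1, 2), (5, 6), (5.5, 12)]:
    intervals = insert_range(intervals, lo, hi)
print(intervals, total_width(intervals))
```

The last insertion in the usage example overlaps both (5, 6) and (10, 20), which is the kind of case where every affected interval has to be absorbed, not only the nearest one.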
Here is a solution I just wrote in Haskell and below it is an example of how it can be used at the interpreter command prompt. The segments must be presented in the form of a list of tuples [(a,a)]. I hope you can get a sense of the algorithm from the code.
import Data.List (sort)

unionSegments :: (Ord a, Num a) => [(a, a)] -> a
unionSegments segments = go (sort segments)
  where
    go [] = 0
    go [(lo, hi)] = hi - lo
    go ((lo1, hi1) : (lo2, hi2) : rest)
      -- overlap: merge the two segments, keeping the larger end so that
      -- a segment nested inside the previous one is not counted wrongly
      | lo2 <= hi1 = go ((lo1, max hi1 hi2) : rest)
      | otherwise  = (hi1 - lo1) + go ((lo2, hi2) : rest)
*Main> :load "unionSegments.hs"
[1 of 1] Compiling Main ( unionSegments.hs, interpreted )
Ok, modules loaded: Main.
*Main> unionSegments [(5,8), (7,12)]
7
Java implementation
import java.util.*;
public class HelloWorld {
    static void unionLength(int[][] a, int sets)
    {
        // coordinate -> (segments starting here) - (segments ending here);
        // summing the deltas keeps end points that share a coordinate from overwriting each other
        TreeMap<Integer, Integer> delta = new TreeMap<>();
        for (int i = 0; i < sets; i++)
        {
            delta.merge(a[i][0], 1, Integer::sum);
            delta.merge(a[i][1], -1, Integer::sum);
        }
        int count = 0; // number of segments covering the gap before the current coordinate
        int res = 0;
        int prev = 0;
        for (Map.Entry<Integer, Integer> e : delta.entrySet())
        {
            if (count > 0)
                res += e.getKey() - prev;
            count += e.getValue();
            prev = e.getKey();
        }
        System.out.println(res);
    }
    public static void main(String[] args) {
        int[][] a = {{0, 4}, {3, 6}, {8, 10}};
        int[][] b = {{5, 10}, {8, 12}};
        unionLength(a, 3); // prints 8
        unionLength(b, 2); // prints 7
    }
}

Find second largest number in array in at most n+log₂(n)−2 comparisons

You are given as input an unsorted array of n distinct numbers, where n is a power of 2. Give an algorithm that identifies the second-largest number in the array, and that uses at most n+log₂(n)−2 comparisons.
Start by comparing the elements of the n-element array in odd and even positions and determining the largest element of each pair. This step requires n/2 comparisons. Now you've got only n/2 elements. Continue pairwise comparisons to get n/4, n/8, ... elements. Stop when the largest element is found. This step requires a total of n/2 + n/4 + n/8 + ... + 1 = n-1 comparisons.
During the previous step, the largest element was directly compared with log₂(n) other elements. You can determine the largest of these elements in log₂(n)-1 comparisons. That is the second-largest number in the array.
Example: array of 8 numbers [10,9,5,4,11,100,120,110].
Comparisons on level 1: [10,9] -> 10, [5,4] -> 5, [11,100] -> 100, [120,110] -> 120.
Comparisons on level 2: [10,5] -> 10, [100,120] -> 120.
Comparisons on level 3: [10,120] -> 120.
Maximum is 120. It was directly compared with: 10 (on level 3), 100 (on level 2), 110 (on level 1).
Step 2 should find the maximum of 10, 100, and 110, which is 110. That's the second largest element.
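The two steps can be sketched in Python with an explicit comparison counter, to check the bound on the example above. This is an illustrative sketch, assuming distinct values and a length that is a power of 2 (at least 2); the name `second_largest_counted` is made up for the example.

```python
def second_largest_counted(values):
    """Return (second largest value, number of comparisons used)."""
    comparisons = 0
    # Each entry pairs a value with the list of values that lost directly to it.
    current = [(v, []) for v in values]
    # Step 1: knockout tournament, n - 1 comparisons in total.
    while len(current) > 1:
        winners = []
        for i in range(0, len(current), 2):
            (a, lost_to_a), (b, lost_to_b) = current[i], current[i + 1]
            comparisons += 1
            if a > b:
                lost_to_a.append(b)
                winners.append((a, lost_to_a))
            else:
                lost_to_b.append(a)
                winners.append((b, lost_to_b))
        current = winners
    # Step 2: the maximum among the log2(n) direct losers to the champion.
    losers = current[0][1]
    second = losers[0]
    for v in losers[1:]:
        comparisons += 1
        if v > second:
            second = v
    return second, comparisons


print(second_largest_counted([10, 9, 5, 4, 11, 100, 120, 110]))  # (110, 9)
```

For n = 8 the count is 7 + 2 = 9, which equals n + log₂(n) − 2.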
sly s's answer is derived from this paper, but he didn't explain the algorithm, which means someone stumbling across this question has to read the whole paper, and his code isn't very sleek either. I'll give the crux of the algorithm from the aforementioned paper, complete with complexity analysis, and also provide a Python implementation.
Basically, we do two passes:
Find the max, and keep track of which elements the max was compared to.
Find the max among the elements the max was compared to; the result is the second largest element.
In the example (the array [10, 4, 5, 8, 7, 2, 12, 3, 1, 6, 9, 11], traced below), 12 is the largest number in the array, and was compared to 3, 1, 11, and 10 in the first pass. In the second pass, we find the largest among {3, 1, 11, 10}, which is 11, which is the second largest number in the original array.
Time Complexity:
All elements must be looked at, therefore, n - 1 comparisons for pass 1.
Since we divide the problem into two halves each time, there are at most log₂n recursive calls, for each of which, the comparisons sequence grows by at most one; the size of the comparisons sequence is thus at most log₂n, therefore, log₂n - 1 comparisons for pass 2.
Total number of comparisons <= (n - 1) + (log₂n - 1) = n + log₂n - 2
from typing import MutableSequence, Sequence, Tuple

def second_largest(nums: Sequence[int]) -> int:
    def _max(lo: int, hi: int, seq: Sequence[int]) -> Tuple[int, MutableSequence[int]]:
        # Returns the maximum of seq[lo..hi] and the elements it was compared to
        if lo >= hi:
            return seq[lo], []
        mid = lo + (hi - lo) // 2
        x, a = _max(lo, mid, seq)
        y, b = _max(mid + 1, hi, seq)
        if x > y:
            a.append(y)
            return x, a
        b.append(x)
        return y, b

    comparisons = _max(0, len(nums) - 1, nums)[1]
    return _max(0, len(comparisons) - 1, comparisons)[0]
The first run for the given example is as follows:
lo=0, hi=1, mid=0, x=10, a=[], y=4, b=[]
lo=0, hi=2, mid=1, x=10, a=[4], y=5, b=[]
lo=3, hi=4, mid=3, x=8, a=[], y=7, b=[]
lo=3, hi=5, mid=4, x=8, a=[7], y=2, b=[]
lo=0, hi=5, mid=2, x=10, a=[4, 5], y=8, b=[7, 2]
lo=6, hi=7, mid=6, x=12, a=[], y=3, b=[]
lo=6, hi=8, mid=7, x=12, a=[3], y=1, b=[]
lo=9, hi=10, mid=9, x=6, a=[], y=9, b=[]
lo=9, hi=11, mid=10, x=9, a=[6], y=11, b=[]
lo=6, hi=11, mid=8, x=12, a=[3, 1], y=11, b=[9]
lo=0, hi=11, mid=5, x=10, a=[4, 5, 8], y=12, b=[3, 1, 11]
Things to note:
There are exactly n - 1=11 comparisons for n=12.
From the last line, y=12 wins over x=10, and the next pass starts with the sequence [3, 1, 11, 10], which has log₂(12)=3.58 ~ 4 elements, and will require 3 comparisons to find the maximum.
I have implemented in Java the algorithm from Evgeny Kluev's answer. The total number of comparisons is n+log2(n)−2. There is also a good reference:
Alexander Dekhtyar: CSC 349: Design and Analysis of Algorithms. This is similar to the top-voted algorithm.
import java.util.Arrays;
public class op1 {
private static int findSecondRecursive(int n, int[] A){
int[] firstCompared = findMaxTournament(0, n-1, A); //n-1 comparisons;
int[] secondCompared = findMaxTournament(2, firstCompared[0]-1, firstCompared); //log2(n)-1 comparisons.
//Total comparisons: n+log2(n)-2;
return secondCompared[1];
}
private static int[] findMaxTournament(int low, int high, int[] A){
if(low == high){
int[] compared = new int[2];
compared[0] = 2;
compared[1] = A[low];
return compared;
}
int[] compared1 = findMaxTournament(low, (low+high)/2, A);
int[] compared2 = findMaxTournament((low+high)/2+1, high, A);
if(compared1[1] > compared2[1]){
int k = compared1[0] + 1;
int[] newcompared1 = new int[k];
System.arraycopy(compared1, 0, newcompared1, 0, compared1[0]);
newcompared1[0] = k;
newcompared1[k-1] = compared2[1];
return newcompared1;
}
int k = compared2[0] + 1;
int[] newcompared2 = new int[k];
System.arraycopy(compared2, 0, newcompared2, 0, compared2[0]);
newcompared2[0] = k;
newcompared2[k-1] = compared1[1];
return newcompared2;
}
private static void printarray(int[] a){
for(int i:a){
System.out.print(i + " ");
}
System.out.println();
}
public static void main(String[] args) {
//Demo.
System.out.println("Original array: ");
int[] A = {10,4,5,8,7,2,12,3,1,6,9,11};
printarray(A);
int secondMax = findSecondRecursive(A.length,A);
Arrays.sort(A);
System.out.println("Sorted array(for check use): ");
printarray(A);
System.out.println("Second largest number in A: " + secondMax);
}
}
The problem is this:
At comparison level 1 the algorithm has to remember every array element, because the largest is not yet known; the same holds at level 2, then level 3, and so on. Keeping track of these elements costs additional value assignments, and once the largest is known you also have to trace back through them. As a result, it will not be significantly faster than the simple 2N-2 comparison algorithm. Moreover, because the code is more complicated, you also need to account for potential debugging time.
E.g. in PHP, the running time of a comparison versus a value assignment is roughly (11-19) to 16.
Here is an example for better understanding:
Example 1:
>12 56 98 12 76 34 97 23
>>(12 56) (98 12) (76 34) (97 23)
>>> 56 98 76 97
>>>> (56 98) (76 97)
>>>>> 98 97
>>>>>> 98
The largest element is 98
Now compare the elements that lost directly to the largest element 98 (that is 12, 56 and 97). The largest of them, 97, is the second largest.
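Divide-and-conquer implementation (each half reports its maximum and second maximum, so it uses a constant number of comparisons per merge, which gives O(n) time but more than n + log₂(n) − 2 comparisons):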
public class Test {
public static void main(String...args){
int arr[] = new int[]{1,2,2,3,3,4,9,5, 100 , 101, 1, 2, 1000, 102, 2,2,2};
System.out.println(getMax(arr, 0, 16));
}
public static Holder getMax(int[] arr, int start, int end){
if (start == end)
return new Holder(arr[start], Integer.MIN_VALUE);
else {
int mid = ( start + end ) / 2;
Holder l = getMax(arr, start, mid);
Holder r = getMax(arr, mid + 1, end);
if (l.compareTo(r) > 0 )
return new Holder(l.high(), r.high() > l.low() ? r.high() : l.low());
else
return new Holder(r.high(), l.high() > r.low() ? l.high(): r.low());
}
}
static class Holder implements Comparable<Holder> {
private int low, high;
public Holder(int r, int l){low = l; high = r;}
public String toString(){
return String.format("Max: %d, SecMax: %d", high, low);
}
public int compareTo(Holder data){
if (high == data.high)
return 0;
if (high > data.high)
return 1;
else
return -1;
}
public int high(){
return high;
}
public int low(){
return low;
}
}
}
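Why not use a simple linear scan over the given array[n]? It runs in c*n, where c is the constant time per element. Note, however, that tracking the runner-up correctly needs up to two comparisons per element, i.e. about 2n in the worst case, which is more than the n + log₂(n) − 2 allowed.
int first = 0; // assumes all values are positive
int second = 0;
for(int i = 0; i < n; i++) {
    if(array[i] > first) {
        second = first;
        first = array[i];
    } else if(array[i] > second) {
        second = array[i]; // without this branch, values between second and first are missed
    }
}
Or do I just not understand the question?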
In Python 2.7: the following code runs in O(n log log n) because of the extra sort. Any optimizations?
def secondLargest(testList):
    secondList = []
    # Iterate through the list
    while(len(testList) > 1):
        left = testList[0::2]
        right = testList[1::2]
        if (len(testList) % 2 == 1):
            right.append(0)
        myzip = zip(left,right)
        mymax = [ max(list(val)) for val in myzip ]
        myzip.sort()
        secondMax = [x for x in myzip[-1] if x != max(mymax)][0]
        if (secondMax != 0 ):
            secondList.append(secondMax)
        testList = mymax
    return max(secondList)
public static int FindSecondLargest(int[] input)
{
Dictionary<int, List<int>> dictWinnerLoser = new Dictionary<int, List<int>>();//Keeps track of the losers for each winner
List<int> lstWinners = null;
List<int> lstLoosers = null;
int winner = 0;
int looser = 0;
while (input.Count() > 1)//Runs till we get max in the array
{
lstWinners = new List<int>();//Keeps track of winners of each run, as we have to run with winners of each run till we get one winner
for (int i = 0; i < input.Count() - 1; i += 2)
{
if (input[i] > input[i + 1])
{
winner = input[i];
looser = input[i + 1];
}
else
{
winner = input[i + 1];
looser = input[i];
}
lstWinners.Add(winner);
if (!dictWinnerLoser.ContainsKey(winner))
{
lstLoosers = new List<int>();
lstLoosers.Add(looser);
dictWinnerLoser.Add(winner, lstLoosers);
}
else
{
lstLoosers = dictWinnerLoser[winner];
lstLoosers.Add(looser);
dictWinnerLoser[winner] = lstLoosers;
}
}
input = lstWinners.ToArray();//run the loop again with winners
}
List<int> loosersOfWinner = dictWinnerLoser[input[0]];//Gives all the elements that lost to the max element of the array; the input array now holds only one element, which is the max
winner = 0;
for (int i = 0; i < loosersOfWinner.Count(); i++)//Now the max among the losers of the winner gives the second largest
{
if (winner < loosersOfWinner[i])
{
winner = loosersOfWinner[i];
}
}
return winner;
}

Resources