Two ways of doing Counting Sort - sorting

Here are my two implementations of Counting Sort.
Implementation 1
In this implementation, which is a very simple one, all I do is count the number of occurrences of each element and then write each element into the output array that many times.
public class Simple
{
static int[] a = {5,6,6,4,4,4,8,8,8,9,4,4,3,3,4};
public static void main(String[] args)
{
fun(a);
print(a);
}
static void fun(int[] a)
{
int max = findMax(a);
int[] temp = new int[max+1];
for(int i = 0;i<a.length;i++)
{
temp[a[i]]++;
}
print(temp);
int k = 0;
for(int i = 0;i<temp.length;i++)
{
for(int j = 0;j<temp[i];j++)
a[k++] = i;
}
print(a);
}
static int findMax(int[] a)
{
int max = a[0];
for(int i= 1;i<a.length;i++)
{
if(a[i] > max)
max = a[i];
}
return max;
}
static void print(int[] a)
{
for(int i = 0;i<a.length;i++)
System.out.print(a[i] + " ");
System.out.println("");
}
}
Implementation 2
In this implementation, which I have seen in a lot of places online, you first build an array that says, for each element, how many elements are less than or equal to it, and then insert the element at that position in the output. Once you insert an element, you decrement its count, since that element has now been placed. By the end, every count has been decremented once per occurrence of its element. As you can see, this implementation is fairly complex compared to the previous one, and I am not sure why it is so widely popular online.
public class NotVerySimple {
public static void main(String[] args) {
int[] a = {5,6,6,4,4,4,8,8,8,9,4,4,3,3,4};
sort(a);
}
static void sort(int[] a)
{
int min = smallest(a);
int max = largest(a);
int[] A = new int[max - min + 1];
for(int i = 0;i<a.length;i++)
{
A[a[i] - min]++;
}
for(int i = 1;i<A.length;i++)
A[i] = A[i-1] + A[i];
int[] B = new int[a.length];
for(int i = 0;i<a.length;i++)
{
B[ A[a[i] - min] - 1 ] = a[i];
A[a[i] - min]--;
}
print(B);
}
static int smallest(int[] a)
{
int ret = a[0];
for(int i = 1;i<a.length;i++)
{
if(a[i] < ret)
ret = a[i];
}
return ret;
}
static int largest(int[] a)
{
int ret = a[0];
for(int i = 1;i<a.length;i++)
{
if(a[i] > ret)
ret = a[i];
}
return ret;
}
static void print(int[] a)
{
for(int x : a)
System.out.print(x+ " ");
}
}
Are there any advantages of the second, more complex implementation over the first, simple one that make it so popular?
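One commonly cited difference, offered here as a hedged sketch rather than a definitive answer: the first version rebuilds the output from the counts alone, so it only works when an element is nothing more than its integer key. The prefix-sum version computes a target position for every original element, so it can move whole records, and when the input is scanned from right to left it keeps records with equal keys in their original order (it is stable). Note that the loop in Implementation 2 above scans left to right, which places equal keys in reverse order. The class name `StableCountingSort`, the method `sortByLength`, and the choice of string length as the key are assumptions made for illustration only.

```java
import java.util.Arrays;

final class StableCountingSort {
    // Sorts words by length (the integer key).
    // Words of equal length keep their relative input order.
    static String[] sortByLength(String[] words) {
        int max = 0;
        for (String w : words) max = Math.max(max, w.length());

        // count[k] = number of words whose length is exactly k
        int[] count = new int[max + 1];
        for (String w : words) count[w.length()]++;

        // Prefix sums: count[k] = number of words whose length is <= k
        for (int k = 1; k <= max; k++) count[k] += count[k - 1];

        String[] out = new String[words.length];
        // Scan right to left so that equal keys stay in input order.
        for (int i = words.length - 1; i >= 0; i--) {
            out[--count[words[i].length()]] = words[i];
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = {"pear", "fig", "kiwi", "plum", "date", "yam"};
        System.out.println(Arrays.toString(sortByLength(words)));
        // prints [fig, yam, pear, kiwi, plum, date]
    }
}
```

The simple count-and-refill version could not produce this output, because a count of "four words of length 4" does not remember which words they were.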

Related

Sorting algorithm is skipping the last element in my array

I have a simple algorithm to order numbers in an array. All of the elements become ordered except for the last one. I have tried changing the bounds of my loops to fix this, but that just creates an infinite loop instead.
while (pointer < arrayLength){
int min = findMinFrom(pointer);
for (int i = pointer; i < arrayLength; i ++){
if (A[i] == min){
swap(i, pointer);
pointer ++;
}
compNewS ++;
}
}
Do you see the problem? Your pointer is updated only if A[i] == min; otherwise the while loop keeps looping. Move pointer++ outside that condition.
This can be done with only two loops but here is an adjusted version of your code:
public class Numbers {
    public static void main(String[] args) {
        int[] array = {3,2,1,4,5,6,7,8,9,7};
        newSort(array, array.length);
        for (int i = 0; i < array.length; i++)
            System.out.println(array[i]);
    }
    // Selection sort: on each pass, move the smallest remaining element to position p.
    public static void newSort(int[] array, int arrayLength) {
        int p = 0;
        while (p < arrayLength) {
            int min = findMinFrom(p, array); // index of the smallest element in array[p..]
            int temp = array[p];
            array[p] = array[min];           // swap the values, not the index
            array[min] = temp;
            p++;
        }
    }
    // Returns the index of the smallest element in array[p..].
    public static int findMinFrom(int p, int[] array) {
        int min = p;
        for (int i = p; i < array.length; i++) {
            if (array[i] < array[min]) {
                min = i;
            }
        }
        return min;
    }
}

Optimizing quick sort

I am implementing the quick sort algorithm in Java, and here is the code:
public class quickSort {
private int array[];
private int length;
public void sort(int[] inputArr) {
if (inputArr == null || inputArr.length == 0) {
return;
}
this.array = inputArr;
length = inputArr.length;
quickSorter(0, length - 1);
}
private void quickSorter(int lowerIndex, int higherIndex) {
int i = lowerIndex;
int j = higherIndex;
// calculate pivot number, I am taking pivot as middle index number
int pivot = array[lowerIndex+(higherIndex-lowerIndex)/2];
// Divide into two arrays
while (i <= j) {
while (array[i] < pivot) {
i++;
}
while (array[j] > pivot) {
j--;
}
if (i <= j) {
exchangeNumbers(i, j);
//move index to next position on both sides
i++;
j--;
}
}
// call quickSort() method recursively
if (lowerIndex < j)
quickSorter(lowerIndex, j);
if (i < higherIndex)
quickSorter(i, higherIndex);
}
private void exchangeNumbers(int i, int j) {
int temp = array[i];
array[i] = array[j];
array[j] = temp;
}
}
Then I implemented it with median-of-three pivot selection:
public class quickSortOptimize {
private int array[];
private int length;
public void sort(int[] inputArr) {
if (inputArr == null || inputArr.length == 0) {
return;
}
this.array = inputArr;
length = inputArr.length;
quickSorter(0, length - 1);
}
private void quickSorter(int lowerIndex, int higherIndex) {
int i = lowerIndex;
int j = higherIndex;
int mid = lowerIndex+(higherIndex-lowerIndex)/2;
if (array[i]>array[mid]){
exchangeNumbers( i, mid);
}
if (array[i]>array[j]){
exchangeNumbers( i, j);
}
if (array[j]<array[mid]){
exchangeNumbers( j, mid);
}
int pivot = array[mid];
// Divide into two arrays
while (i <= j) {
while (array[i] < pivot) {
i++;
}
while (array[j] > pivot) {
j--;
}
if (i <= j) {
exchangeNumbers(i, j);
//move index to next position on both sides
i++;
j--;
}
}
// call quickSort() method recursively
if (lowerIndex < j)
quickSorter(lowerIndex, j);
if (i < higherIndex)
quickSorter(i, higherIndex);
}
private void exchangeNumbers(int i, int j) {
int temp = array[i];
array[i] = array[j];
array[j] = temp;
}
}
and the testing main :
public static void main(String[] args) {
File number = new File ("f.txt");
final int size = 10000000;
try{
quickSortOptimize opti = new quickSortOptimize();
quickSort s = new quickSort();
PrintWriter printWriter = new PrintWriter(number);
for (int i=0;i<size;i++){
printWriter.println((int)(Math.random()*100000));
}
printWriter.close();
Scanner in = new Scanner (number);
int [] arr1 = new int [size];
for (int i=0;i<size;i++){
arr1[i]=Integer.parseInt(in.nextLine());
}
long a=System.currentTimeMillis();
opti.sort(arr1);
long b=System.currentTimeMillis();
System.out.println("Optimized quicksort: "+(double)(b-a)/1000);
in.close();
int [] arr2 = new int [size];
Scanner in2= new Scanner(number);
for (int i=0;i<size;i++){
arr2[i]=Integer.parseInt(in2.nextLine());
}
long c=System.currentTimeMillis();
s.sort(arr2);
long d=System.currentTimeMillis();
System.out.println("normal Quicksort: "+(double)(d-c)/1000);
}catch (Exception ex){ex.printStackTrace();}
}
The problem is that this method of optimization should improve performance by 5%,
but I have run this test many times and the normal quicksort almost always gives a better result than the optimized one.
So what is wrong with the second implementation?
A median of three (or more) will usually be slower for input that's randomly ordered.
A median of three is intended to help prevent a really bad case from being quite as horrible. There are ways of making it pretty bad anyway, but it at least avoids the problem for a few common orderings. For example, selecting the first element as the pivot can produce terrible results if/when (most of) the input is already ordered.
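The effect on already-ordered input can be made visible by counting comparisons. The following is a minimal sketch, not the code from the question: it uses a Lomuto-style partition for brevity, and the names `PivotDemo`, `countComparisons`, and `medianIndex` are made up for illustration. It compares a first-element pivot with a median-of-three pivot on a sorted array.

```java
final class PivotDemo {
    static long comparisons;

    // Sorts a in place and returns how many element comparisons the partitioning made.
    static long countComparisons(int[] a, boolean medianOfThree) {
        comparisons = 0;
        sort(a, 0, a.length - 1, medianOfThree);
        return comparisons;
    }

    static void sort(int[] a, int lo, int hi, boolean medianOfThree) {
        if (lo >= hi) return;
        int p = medianOfThree ? medianIndex(a, lo, lo + (hi - lo) / 2, hi) : lo;
        swap(a, p, hi);                 // park the pivot at the end
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            comparisons++;
            if (a[i] < pivot) swap(a, i, store++);
        }
        swap(a, store, hi);             // pivot moves to its final position
        sort(a, lo, store - 1, medianOfThree);
        sort(a, store + 1, hi, medianOfThree);
    }

    // Returns whichever of the indexes i, j, k holds the median of the three values.
    static int medianIndex(int[] a, int i, int j, int k) {
        if (a[i] < a[j]) {
            if (a[j] < a[k]) return j;
            return a[i] < a[k] ? k : i;
        }
        if (a[i] < a[k]) return i;
        return a[j] < a[k] ? k : j;
    }

    static void swap(int[] a, int x, int y) {
        int t = a[x]; a[x] = a[y]; a[y] = t;
    }

    public static void main(String[] args) {
        int n = 2000;
        int[] s1 = new int[n], s2 = new int[n];
        for (int i = 0; i < n; i++) { s1[i] = i; s2[i] = i; }
        System.out.println("first-element pivot: " + countComparisons(s1, false));
        System.out.println("median-of-three:     " + countComparisons(s2, true));
    }
}
```

On sorted input the first-element pivot is always the minimum, so every partition peels off a single element and the comparison count grows as n(n-1)/2, while the median-of-three pivot splits each range roughly in half. On randomly ordered input that advantage largely disappears and the extra comparisons for choosing the pivot are pure overhead, which is consistent with the timings in the question.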

Are these complexity classes correct?

I have a few problems to do, and I have a decent understanding of how they work; I just want feedback on whether I am correct. I need to figure out the Big-O complexity of the following.
1.
public static int[] mystery1(int[] list) {
int[] result = new int[2*list.length];
for (int i=0; i<list.length; i++) {
result[2*i] = list[i] / 2+list[i] % 2;
result[2*i+1] = list[i] / 2;
}
return result;
}
I think this one would be O(N log N).
2.
public static int[] mystery2(int[] list) {
for (int i=0; i<list.length/2; i++) {
int j = list.length-1-i;
int temp = list[i];
list[i] = list[j];
list[j] = temp;
}
return list;
}
I think this one would be O(log N) because it's dividing by 2 until it finishes.
3.
public static void mystery3(ArrayList<String> list) {
for (int i=0; i<list.size()-1; i+=2) {
String first = list.remove(i);
list.add(i+1, first);
}
}
I think this one would be O(N)
4.
public static void mystery4(ArrayList<String> list) {
for (int i=0; i<list.size()-1; i+=2) {
String first = list.get(i);
list.set(i, list.get(i+1));
list.set(i+1, first);
}
}
I think this one would be O(N).
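All are O(N) except mystery3, which is O(N^2): ArrayList.remove(i) and ArrayList.add(i+1, ...) each have to shift every element after that index, so each call costs O(N) on its own.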
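To see where the quadratic cost of mystery3 comes from, the element moves inside the ArrayList can be counted directly. This is a sketch under the assumption that the list is a java.util.ArrayList backed by a plain array; the class and method names are invented for illustration, and the code only does the arithmetic rather than touching a real list.

```java
final class ShiftCount {
    // Counts how many elements ArrayList has to move while mystery3 runs on n items.
    static long mystery3Shifts(int n) {
        long shifts = 0;
        for (int i = 0; i < n - 1; i += 2) {
            // remove(i) on n elements: the n - 1 - i elements after index i move left
            shifts += n - 1 - i;
            // add(i + 1, x) on the remaining n - 1 elements:
            // the (n - 1) - (i + 1) elements from index i + 1 onward move right
            shifts += (n - 1) - (i + 1);
        }
        return shifts;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1000, 2000, 4000}) {
            System.out.println(n + " -> " + mystery3Shifts(n));
        }
        // prints:
        // 1000 -> 499500
        // 2000 -> 1999000
        // 4000 -> 7998000
    }
}
```

Doubling n roughly quadruples the number of moves, which is the signature of O(N^2). In mystery4, get and set touch one slot each, so the loop stays O(N).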

MergeSort gives StackOverflow error

This is the code for my mergeSort. It gives a StackOverflowError at lines 53 and 54 (mergeSort(l,m); and mergeSort(m,h);).
Any help would be very valuable. I am clueless, so please help me out. Thank you.
package codejam;
public class vector {
static int[] a;
static int[] b;
public static void main(String[] args) {
int[] a1 = {12,33,2,1};
int[] b1 = {12,333,11,1};
mergeSort(0,a1.length);
a1=b1;
mergeSort(0,b1.length);
for (int i = 0; i < a1.length; i++) {
System.out.println(a[i]);
}
}
public static void merge(int l,int m,int h) {
int n1=m-l+1;
int n2 = h-m+1;
int[] left = new int[n1];
int[] right = new int[n2];
int k=l;
for (int i = 0; i < n1 ; i++) {
left[i] = a[k];
k++;
}
for (int i = 0; i < n2; i++) {
right[i] = a[k];
k++;
}
left[n1] = 100000000;
right[n1] = 10000000;
int i=0,j=0;
for ( k =l ; k < h; k++) {
if(left[i]>=right[j])
{
a[k] = right[j];
j++;
}
else
{
a[k] = left[i];
i++;
}
}
}
public static void mergeSort(int l,int h) {
int m =(l+h)/2;
if(l<h)
{
mergeSort(l,m);
mergeSort(m,h);
merge(l,m,h);;
}
}
}
Trace the recursive calls of the mergeSort function with arguments l=0 and h=4: mergeSort(0,4) computes m=2 and calls mergeSort(0,2), which computes m=1 and calls mergeSort(0,1).
When l is 0 and h is 1, the expression for m evaluates to 0. The first recursive call, mergeSort(0,0), returns immediately, but the second call is mergeSort(m,h), which is mergeSort(0,1) again, and the condition l<h (0<1) is still true. The same call therefore repeats forever, the function never terminates, the stack runs out of memory, and a StackOverflowError is thrown.
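One way to make the recursion terminate while keeping the question's convention that h is exclusive is to stop as soon as the range holds fewer than two elements. The following is a minimal sketch under that assumption, not a drop-in patch for the code above: it passes the array as a parameter instead of using static fields and replaces the sentinel-based merge with a plain two-pointer merge.

```java
import java.util.Arrays;

final class HalfOpenMergeSort {
    // Sorts a[l..h), where h is exclusive.
    static void mergeSort(int[] a, int l, int h) {
        if (h - l < 2) return;        // 0 or 1 element: already sorted, recursion stops
        int m = l + (h - l) / 2;      // l < m < h whenever h - l >= 2
        mergeSort(a, l, m);
        mergeSort(a, m, h);
        merge(a, l, m, h);
    }

    // Merges the sorted ranges a[l..m) and a[m..h).
    static void merge(int[] a, int l, int m, int h) {
        int[] left = Arrays.copyOfRange(a, l, m);
        int[] right = Arrays.copyOfRange(a, m, h);
        int i = 0, j = 0, k = l;
        while (i < left.length && j < right.length) {
            a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) a[k++] = left[i++];
        while (j < right.length) a[k++] = right[j++];
    }

    public static void main(String[] args) {
        int[] a1 = {12, 33, 2, 1};
        mergeSort(a1, 0, a1.length);
        System.out.println(Arrays.toString(a1)); // prints [1, 2, 12, 33]
    }
}
```

Because both halves are strictly smaller than the range they came from, mergeSort(0,1) no longer calls itself.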
import java.lang.*;
import java.util.Random;
public class MergeSort {
public static int[] merge_sort(int[] arr, int low, int high ) {
if (low < high) {
int middle = low + (high-low)/2;
merge_sort(arr,low, middle);
merge_sort(arr,middle+1, high);
arr = merge (arr,low,middle, high);
}
return arr;
}
public static int[] merge(int[] arr, int low, int middle, int high) {
int[] helper = new int[arr.length];
for (int i = 0; i <=high; i++){
helper[i] = arr[i];
}
int i = low;
int j = middle+1;
int k = low;
while ( i <= middle && j <= high) {
if (helper[i] <= helper[j]) {
arr[k++] = helper[i++];
} else {
arr[k++] = helper[j++];
}
}
while ( i <= middle){
arr[k++] = helper[i++];
}
while ( j <= high){
arr[k++] = helper[j++];
}
return arr;
}
public static void printArray(int[] B) {
for (int i = 0; i < B.length ; i++) {
System.out.print(B[i] + " ");
}
System.out.println("");
}
public static int[] populateA(int[] B) {
Random rand = new Random(); // create the generator once, not once per element
for (int i = 0; i < B.length; i++) {
B[i] = rand.nextInt(20);
}
return B;
}
public static void main(String[] args) {
// TODO Auto-generated method stub
int A[] = new int[10];
A = populateA(A);
System.out.println("Before sorting");
printArray(A);
A = merge_sort(A,0, A.length -1);
System.out.println("Sorted Array");
printArray(A);
}
}

Finding the longest repeated substring

What would be the best approach (performance-wise) in solving this problem?
I was recommended to use suffix trees. Is this the best approach?
Check out this link: http://introcs.cs.princeton.edu/java/42sort/LRS.java.html
/*************************************************************************
* Compilation: javac LRS.java
* Execution: java LRS < file.txt
* Dependencies: StdIn.java
*
* Reads a text corpus from stdin, replaces all consecutive blocks of
* whitespace with a single space, and then computes the longest
* repeated substring in that corpus. Suffix sorts the corpus using
* the system sort, then finds the longest repeated substring among
* consecutive suffixes in the sorted order.
*
* % java LRS < mobydick.txt
* ',- Such a funny, sporty, gamy, jesty, joky, hoky-poky lad, is the Ocean, oh! Th'
*
* % java LRS
* aaaaaaaaa
* 'aaaaaaaa'
*
* % java LRS
* abcdefg
* ''
*
*************************************************************************/
import java.util.Arrays;
public class LRS {
// return the longest common prefix of s and t
public static String lcp(String s, String t) {
int n = Math.min(s.length(), t.length());
for (int i = 0; i < n; i++) {
if (s.charAt(i) != t.charAt(i))
return s.substring(0, i);
}
return s.substring(0, n);
}
// return the longest repeated string in s
public static String lrs(String s) {
// form the N suffixes
int N = s.length();
String[] suffixes = new String[N];
for (int i = 0; i < N; i++) {
suffixes[i] = s.substring(i, N);
}
// sort them
Arrays.sort(suffixes);
// find longest repeated substring by comparing adjacent sorted suffixes
String lrs = "";
for (int i = 0; i < N - 1; i++) {
String x = lcp(suffixes[i], suffixes[i+1]);
if (x.length() > lrs.length())
lrs = x;
}
return lrs;
}
// read in text, replacing all consecutive whitespace with a single space
// then compute longest repeated substring
public static void main(String[] args) {
String s = StdIn.readAll();
s = s.replaceAll("\\s+", " ");
StdOut.println("'" + lrs(s) + "'");
}
}
Have a look at http://en.wikipedia.org/wiki/Suffix_array as well. Suffix arrays are quite space-efficient, and there are some reasonably programmable algorithms to construct them, such as "Simple Linear Work Suffix Array Construction" by Kärkkäinen and Sanders.
Here is a simple implementation of longest repeated substring using the simplest form of suffix tree (an uncompressed suffix trie, which takes O(n^2) time and space in the worst case). A suffix tree is very easy to implement this way.
#include <iostream>
#include <vector>
#include <unordered_map>
#include <string>
using namespace std;
class Node
{
public:
char ch;
unordered_map<char, Node*> children;
vector<int> indexes; //store the indexes of the substring from where it starts
Node(char c):ch(c){}
};
int maxLen = 0;
string maxStr = "";
void insertInSuffixTree(Node* root, string str, int index, string originalSuffix, int level=0)
{
root->indexes.push_back(index);
// it is repeated and length is greater than maxLen
// then store the substring
if(root->indexes.size() > 1 && maxLen < level)
{
maxLen = level;
maxStr = originalSuffix.substr(0, level);
}
if(str.empty()) return;
Node* child;
if(root->children.count(str[0]) == 0) {
child = new Node(str[0]);
root->children[str[0]] = child;
} else {
child = root->children[str[0]];
}
insertInSuffixTree(child, str.substr(1), index, originalSuffix, level+1);
}
int main()
{
string str = "banana"; //"abcabcaacb"; //"banana"; //"mississippi";
Node* root = new Node('#');
//insert all suffixes into the suffix tree
for(int i=0; i<str.size(); i++){
string s = str.substr(i);
insertInSuffixTree(root, s, i, s);
}
cout << maxLen << "->" << maxStr << endl;
return 0;
}
/*
s = "mississippi", return "issi"
s = "banana", return "ana"
s = "abcabcaacb", return "abca"
s = "aababa", return "aba"
*/
The LRS problem is best solved using either a suffix tree or a suffix array. Both approaches have a best time complexity of O(n).
Here is an O(n log(n)) solution to the LRS problem using a suffix array. My solution can be improved to O(n) if you have a linear-time construction algorithm for the suffix array (which is quite hard to implement). The code was taken from my library. If you want more information on how suffix arrays work, make sure to check out my tutorials.
/**
* Finds the longest repeated substring(s) of a string.
*
* Time complexity: O(nlogn), bounded by suffix array construction
*
* @author William Fiset, william.alexandre.fiset@gmail.com
**/
import java.util.*;
public class LongestRepeatedSubstring {
// Example usage
public static void main(String[] args) {
String str = "ABC$BCA$CAB";
SuffixArray sa = new SuffixArray(str);
System.out.printf("LRS(s) of %s is/are: %s\n", str, sa.lrs());
str = "aaaaa";
sa = new SuffixArray(str);
System.out.printf("LRS(s) of %s is/are: %s\n", str, sa.lrs());
str = "abcde";
sa = new SuffixArray(str);
System.out.printf("LRS(s) of %s is/are: %s\n", str, sa.lrs());
}
}
class SuffixArray {
// ALPHABET_SZ is the default alphabet size, this may need to be much larger
int ALPHABET_SZ = 256, N;
int[] T, lcp, sa, sa2, rank, tmp, c;
public SuffixArray(String str) {
this(toIntArray(str));
}
private static int[] toIntArray(String s) {
int[] text = new int[s.length()];
for(int i=0;i<s.length();i++)text[i] = s.charAt(i);
return text;
}
// Designated constructor
public SuffixArray(int[] text) {
T = text;
N = text.length;
sa = new int[N];
sa2 = new int[N];
rank = new int[N];
c = new int[Math.max(ALPHABET_SZ, N)];
construct();
kasai();
}
private void construct() {
int i, p, r;
for (i=0; i<N; ++i) c[rank[i] = T[i]]++;
for (i=1; i<ALPHABET_SZ; ++i) c[i] += c[i-1];
for (i=N-1; i>=0; --i) sa[--c[T[i]]] = i;
for (p=1; p<N; p <<= 1) {
for (r=0, i=N-p; i<N; ++i) sa2[r++] = i;
for (i=0; i<N; ++i) if (sa[i] >= p) sa2[r++] = sa[i] - p;
Arrays.fill(c, 0, ALPHABET_SZ, 0);
for (i=0; i<N; ++i) c[rank[i]]++;
for (i=1; i<ALPHABET_SZ; ++i) c[i] += c[i-1];
for (i=N-1; i>=0; --i) sa[--c[rank[sa2[i]]]] = sa2[i];
for (sa2[sa[0]] = r = 0, i=1; i<N; ++i) {
if (!(rank[sa[i-1]] == rank[sa[i]] &&
sa[i-1]+p < N && sa[i]+p < N &&
rank[sa[i-1]+p] == rank[sa[i]+p])) r++;
sa2[sa[i]] = r;
} tmp = rank; rank = sa2; sa2 = tmp;
if (r == N-1) break; ALPHABET_SZ = r + 1;
}
}
// Use Kasai algorithm to build LCP array
private void kasai() {
lcp = new int[N];
int [] inv = new int[N];
for (int i = 0; i < N; i++) inv[sa[i]] = i;
for (int i = 0, len = 0; i < N; i++) {
if (inv[i] > 0) {
int k = sa[inv[i]-1];
while( (i + len < N) && (k + len < N) && T[i+len] == T[k+len] ) len++;
lcp[inv[i]-1] = len;
if (len > 0) len--;
}
}
}
// Finds the LRS(s) (Longest Repeated Substring) that occurs in a string.
// Traditionally we are only interested in substrings that appear at
// least twice, so this method returns an empty set if this is not the case.
// @return an ordered set of longest repeated substrings
public TreeSet <String> lrs() {
int max_len = 0;
TreeSet <String> lrss = new TreeSet<>();
for (int i = 0; i < N; i++) {
if (lcp[i] > 0 && lcp[i] >= max_len) {
// We found a longer LRS
if ( lcp[i] > max_len )
lrss.clear();
// Append substring to the list and update max
max_len = lcp[i];
lrss.add( new String(T, sa[i], max_len) );
}
}
return lrss;
}
public void display() {
System.out.printf("-----i-----SA-----LCP---Suffix\n");
for(int i = 0; i < N; i++) {
int suffixLen = N - sa[i];
String suffix = new String(T, sa[i], suffixLen);
System.out.printf("% 7d % 7d % 7d %s\n", i, sa[i],lcp[i], suffix );
}
}
}
public class LongestSubString {
public static void main(String[] args) {
String s = findMaxRepeatedString("ssssssssssss this is a ddddddd word with iiiiiiiiiis and loads of these are ppppppppppppps");
System.out.println(s);
}
private static String findMaxRepeatedString(String s) {
Processor p = new Processor();
char[] c = s.toCharArray();
for (char ch : c) {
p.process(ch);
}
System.out.println(p.bigger());
return new String(new char[p.bigger().count]).replace('\0', p.bigger().letter);
}
static class CharSet {
int count;
Character letter;
boolean isLastPush;
boolean assign(char c) {
if (letter == null) {
count++;
letter = c;
isLastPush = true;
return true;
}
return false;
}
void reassign(char c) {
count = 1;
letter = c;
isLastPush = true;
}
boolean push(char c) {
if (isLastPush && letter == c) {
count++;
return true;
}
return false;
}
@Override
public String toString() {
return "CharSet [count=" + count + ", letter=" + letter + "]";
}
}
static class Processor {
Character previousLetter = null;
CharSet set1 = new CharSet();
CharSet set2 = new CharSet();
void process(char c) {
if ((set1.assign(c)) || set1.push(c)) {
set2.isLastPush = false;
} else if ((set2.assign(c)) || set2.push(c)) {
set1.isLastPush = false;
} else {
set1.isLastPush = set2.isLastPush = false;
smaller().reassign(c);
}
}
CharSet smaller() {
return set1.count < set2.count ? set1 : set2;
}
CharSet bigger() {
return set1.count < set2.count ? set2 : set1;
}
}
}
I had an interview and I needed to solve this problem. This is my solution:
import java.util.Hashtable;
public class FindLargestSubstring {
public static void main(String[] args) {
String test = "ATCGATCGA";
System.out.println(hasRepeatedSubString(test));
}
private static String hasRepeatedSubString(String string) {
Hashtable<String, Integer> hashtable = new Hashtable<>();
int length = string.length();
for (int subLength = length - 1; subLength >= 1; subLength--) { // include length 1 so a repeated single character is found
for (int i = 0; i <= length - subLength; i++) {
String sub = string.substring(i, subLength + i);
if (hashtable.containsKey(sub)) {
return sub;
} else {
hashtable.put(sub, subLength);
}
}
}
return "No repeated substring!";
}}
There are way too many things that affect performance for us to answer this question with only what you've given us. (Operating System, language, memory issues, the code itself)
If you're just looking for a mathematical analysis of the algorithm's efficiency, you probably want to change the question.
EDIT
When I mentioned "memory issues" and "the code" I didn't provide all the details. The length of the strings you will be analyzing is a BIG factor. Also, the code doesn't operate alone; it must sit inside a program to be useful. What are the characteristics of that program which impact this algorithm's use and performance?
Basically, you can't performance tune until you have a real situation to test. You can make very educated guesses about what is likely to perform best, but until you have real data and real code, you'll never be certain.

Resources