I use the following code to sort both items and an index:
There are about 256*1024*1024 entries in the values and index arrays.
The uncompressed values array occupies about 10GB, which is why I want to compress it by grouping together duplicate values of which there are many.
I've come up with the following proof of concept code, but there is a problem in the Compress method. Every time it finds a duplicate it performs a search in the index which costs O(N2) time.
I need to keep an index, because I need to be able to access elements in the array as if no de-duplication took place.
This is done, using a simple default property array property, thus mimicking the original array:
function TLookupTable.GetItems(Index: integer): TSlice;
begin
Result:= FData[FIndex[Index]];
end;
The proof of concept code (which is dog slow) is as follows.
TMyArray = class
class procedure QuickSort<T,Idx>(var Values: array of T; var Index: array of Idx; const Comparer: IComparer<T>;
L, R: Integer);
class procedure Compress<T>(const Values: array of T; var Index: array of Integer; out CompressedValues: TArray<T>; const Comparer: IComparer<T>);
end;
class procedure TMyArray.QuickSort<T,Idx>(var Values: array of T; var Index: array of Idx; const Comparer: IComparer<T>;
L, R: Integer);
var
I, J: Integer;
pivot, temp: T;
TempIdx: Idx;
begin
if (Length(Values) = 0) or ((R - L) <= 0) then
Exit;
repeat
I := L;
J := R;
pivot := Values[L + (R - L) shr 1];
repeat
while Comparer.Compare(Values[I], pivot) < 0 do Inc(I);
while Comparer.Compare(Values[J], pivot) > 0 do Dec(J);
if I <= J then begin
if I <> J then begin
temp := Values[I];
Values[I] := Values[J];
Values[J] := temp;
//Keep the index in sync
tempIdx := Index[I];
Index[I] := Index[J];
Index[J] := tempIdx;
end;
Inc(I);
Dec(J);
end;
until I > J;
if L < J then QuickSort<T,Idx>(Values, Index, Comparer, L, J);
L := I;
until I >= R;
end;
class procedure TMyArray.Compress<T>(const Values: array of T; var Index: array of integer; out CompressedValues: TArray<T>; const Comparer: IComparer<T>);
var
i,j: integer;
Count: integer;
Duplicate: integer;
begin
Count:= 256*1024*1024;
SetLength(CompressedValues, Count);
CompressedValues[0]:= Values[0];
Duplicate:= 0;
for i := 1 to High(Values) do begin
//Compress duplicate values
if Comparer.Compare(Values[i], CompressedValues[Duplicate]) = 0 then begin
//Search for the indexed item
//Very time consuming: O(N*N)
for j:= i to High(Index) do if Index[j] = i then begin
Index[j]:= Duplicate; //Fix up the index
Break;
end; {for j}
end else begin
Inc(Duplicate);
if Duplicate >= Count then begin
Inc(Count, 1024*1024);
SetLength(CompressedValues, Count);
end;
CompressedValues[Duplicate]:= Values[i];
end;
end; {for i}
SetLength(CompressedValues, Duplicate+1)
end;
How can I speed up the compress step so that it takes O(N) time?
If there is a way to both keep the index and remove duplicates and sort, all at the same time, that would be great. My answer below splits the sort and the dedup into two separate stages.
The trick is to leave the original data array alone and just sort the index.
We can then exploit the fact that the original data is in the correct order and use that to build a new index.
In addition that means that we no longer need a custom Sort function; it also moves a lot less data around.
Create the index like so:
FIndex: TArray<integer>;
....
SetLength(FIndex, Length(FAllData));
for i:= 0 to count-1 do begin
FIndex[i]:= i;
end;
TArray.Sort<Integer>(FIndex, TDelegatedComparer<integer>.Construct(
function(const Left, Right: Integer): Integer
begin
if FAllData[Left] > FAllData[Right] then Exit(1);
if FAllData[Left] < FAllData[Right] then Exit(-1);
Result:= 0;
end));
Change the TMyArray class like so:
TMyArray = class
class procedure Compress<T>(const Values: array of T; var Index: TArray<integer>; out CompressedValues: TArray<T>; const Comparer: IComparer<T>);
end;
class procedure TMyArray.Compress<T>(const Values: array of T; var Index: TArray<integer>; out CompressedValues: TArray<T>; const Comparer: IComparer<T>);
const
Equal = 0;
var
i,j: integer;
Count: integer;
Duplicate: integer;
IndexEntry: integer;
OutIndex: TArray<integer>;
begin
Count:= 16*1024*1024; //Start with something reasonable
SetLength(CompressedValues, Count);
SetLength(OutIndex, Length(Index));
Duplicate:= 0;
CompressedValues[0]:= Values[Index[0]];
OutIndex[Index[0]]:= 0;
for i:= 1 to High(Index) do begin
if Comparer.Compare(Values[Index[i]], CompressedValues[Duplicate]) = Equal then begin
OutIndex[Index[i]]:= Duplicate;
end else begin
Inc(Duplicate);
//Grow as needed
if (Duplicate >= Length(CompressedValues)) then SetLength(CompressedValues, Length(CompressedValues) + 1024*1024);
CompressedValues[Duplicate]:= Values[Index[i]];
OutIndex[Index[i]]:= Duplicate;
end;
end;
Index:= OutIndex;
SetLength(CompressedValues, Duplicate+1);
end;
I was wondering if given a binary sequence we can check if it matches a string using the Huffman algorithm.
for example, if we a string "abdcc" and several binary sequences we can calculate which one is a possible representation of "abdcc" that used Huffman's algorithm
Interesting puzzle. As mentioned by j_random_hacker in a comment, it's possible to do this using a backtracking search. There are a few constraints to valid Huffman encodings of the string that we can use to narrow the search down:
No two Huffman codes of length n and m can be identical in the first n or m bits (whichever is shorter). This is because otherwise a Huffman decoder wouldn't be able to tell if it had encountered the longer or the shorter code when decoding. And obviously two codes of the same length cannot be identical. (1)
If at any time there are less bits remaining in the bitstream than characters remaining in the string we are matching then the string cannot match. (2)
If we reach the end of the string and there are still bits remaining in the bitstream then the string does not match (3)
If we encounter a character in the string for the second time, and we have already assumed a Huffman code for that same character earlier in the string, then an identical code must be present in the bit stream or the string cannot match. (4)
We can define a function matchHuffmanString that matches a string with Huffman encoded bitstream, with a Huffman code table as part of the global state. To begin with the code table is empty and we call matchHuffmanString, passing the start of the string and the start of the bitstream.
When the function is called, it checks if there are enough bits in the stream to match the string and returns if not. (2)
If the string is empty, then if the bitstream is also empty then there is a match and the code table is output. If the stream is empty but the bitstream is not then there is no match so the function returns. (3)
If characters remain in the string, then the first character is read. The function checks if there is already an entry in the code table for that character, and if so then the same code must be present in the bitstream. If not then there is no match so the function returns (4). If there is then the function calls itself, moving on to the next character and past the matching code in the bitstream.
If there is no matching code for the character, then the possibility that it is represented by a code of every possible length n from 1 bit to 32 bits (an arbitrary limit) is considered. n bits are read from the bitstream and checked to see if such a code would conflict with any existing codes according to rule (1). If no conflict exists then the code is added to the code table, then the function recurses, moving onto the next character and past the assumed code of length n bits. After returning then it backtracks by removing the code from the table.
Simple implementation in C:
#include <stdio.h>
// Huffman table:
// a 01
// b 0001
// c 1
// d 0010
char* string = "abdcc";
// 01 0001 0010 1 1
// reverse bit order (MSB first) an add extra 0 for padding to stop getBits reading past the end of the array:
#define MESSAGE_LENGTH (12)
unsigned int message[] = {0b110100100010, 0};
// can handle messages of >32 bits, even though the above message is only 12 bits long
unsigned int getBits(int start, int n)
{
return ((message[start>>5] >> (start&31)) | (message[(start>>5)+1] << (32-(start&31)))) & ((1<<n)-1);
}
unsigned int codes[26];
int code_lengths[26];
int callCount = 0;
void outputCodes()
{
// output the codes:
int i, j;
for(i = 0; i < 26; i++)
{
if(code_lengths[i] != 0)
{
printf("%c ", i + 'a');
for(j = 0; j < code_lengths[i]; j++)
printf("%s", codes[i] & (1 << j) ? "1" : "0");
printf("\n");
}
}
}
void matchHuffmanString(char* s, int len, int startbit)
{
callCount++;
if(len > MESSAGE_LENGTH - startbit)
return; // not enough bits left to encode the rest of the message even at 1 bit per char (2)
if(len == 0) // no more characters to match
{
if(startbit == MESSAGE_LENGTH)
{
// (3) we exactly used up all the bits, this stream matches.
printf("match!\n\n");
outputCodes();
printf("\nCall count: %d\n", callCount);
}
return;
}
// read a character from the string (assume 'a' to 'z'):
int c = s[0] - 'a';
// is there already a code for this character?
if(code_lengths[c] != 0)
{
// check if the code in the bit stream matches:
int length = code_lengths[c];
if(startbit + length > MESSAGE_LENGTH)
return; // ran out of bits in stream, no match
unsigned int bits = getBits(startbit, length);
if(bits != codes[c])
return; // bits don't match (4)
matchHuffmanString(s + 1, len - 1, startbit + length);
}
else
{
// this character doesn't have a code yet, consider every possible length
int i, j;
for(i = 1; i < 32; i++)
{
// are there enough bits left for a code this long?
if(startbit + i > MESSAGE_LENGTH)
continue;
unsigned int bits = getBits(startbit, i);
// does this code conflict with an existing code?
for(j = 0; j < 26; j++)
{
if(code_lengths[j] != 0) // check existing codes only
{
// do the two codes match in the first i or code_lengths[j] bits, whichever is shorter?
int length = code_lengths[j] < i ? code_lengths[j] : i;
if((bits & ((1 << length)-1)) == (codes[j] & ((1 << length)-1)))
break; // there's a conflict (1)
}
}
if(j != 26)
continue; // there was a conflict
// add the new code to the codes array and recurse:
codes[c] = bits; code_lengths[c] = i;
matchHuffmanString(s + 1, len - 1, startbit + i);
code_lengths[c] = 0; // clear the code (backtracking)
}
}
}
int main(void) {
int i;
for(i = 0; i < 26; i++)
code_lengths[i] = 0;
matchHuffmanString(string, 5, 0);
return 0;
}
output:
match!
a 01
b 0001
c 1
d 0010
Call count: 42
Ideone.com Demo
The above code could be improved by iterating over the string as long as it is encountering characters that it already has a code for, and only recursing when it finds one it doesn't. Also it only works for lowercase letters a-z with no spaces and doesn't do any validation. I'd have to test it to be sure, but I think it's a tractable problem even for long strings, because any possible combinatorial explosion only happens when encountering new characters that don't already have codes in the table, and even then it's subject to contraints.
I seem to have a problem with a function in Pascal. The program is just an ordinary recursive binsearch, but it returns always the value of 4? Can anybody point to the mistake in the solution?
var i: integer;
const n = 10;
type tablice = array[1..n] of integer;
function Binsearch(const tab:tablice;l:integer;p:integer;x:integer):integer;
var s: integer;
begin
if l=p then
begin
if tab[l]=x then
Binsearch:=p
else
Binsearch:=-1;
end
else
begin
s:=(l+p) div 2;
if tab[s]<x then
l:=s+1
else
p:=s;
Binsearch(tab,l,p,x);
end;
end;
var A:tablice;
x:integer;
begin
for i:=1 to n do A[i]:=i;
x:=30;
writeln(Binsearch(A,1,n,x));
readln;
end.
On the other hand same code in C++ works fine:
using namespace std;
int rekursja(int tab[], int l, int p, int x){
if(l==p){
if(tab[l]==x) return l;
else return -1;
}else{
int s=(l+p)/2;
if(tab[s]<x) l=s+1;
else p=s;
rekursja(tab,l,p,x);
}
}
int main(){
int t[] = {1,2,3,4,5,6,7,8,9,11};
cout << rekursja(t,0,9,11);
}
Binsearch returns the result only exiting from the last call (result := -1 in this case). In the other cases no result is assigned, so a random value is shown.
Change the ricorsive call with
Binsearch:=Binsearch(tab,l,p,x);
In this way it returns the result to all previos call.
What is the best method to find the number of digits of a positive integer?
I have found this 3 basic methods:
conversion to string
String s = new Integer(t).toString();
int len = s.length();
for loop
for(long long int temp = number; temp >= 1;)
{
temp/=10;
decimalPlaces++;
}
logaritmic calculation
digits = floor( log10( number ) ) + 1;
where you can calculate log10(x) = ln(x) / ln(10) in most languages.
First I thought the string method is the dirtiest one but the more I think about it the more I think it's the fastest way. Or is it?
There's always this method:
n = 1;
if ( i >= 100000000 ) { n += 8; i /= 100000000; }
if ( i >= 10000 ) { n += 4; i /= 10000; }
if ( i >= 100 ) { n += 2; i /= 100; }
if ( i >= 10 ) { n += 1; }
Well the correct answer would be to measure it - but you should be able to make a guess about the number of CPU steps involved in converting strings and going through them looking for an end marker
Then think how many FPU operations/s your processor can do and how easy it is to calculate a single log.
edit: wasting some more time on a monday morning :-)
String s = new Integer(t).toString();
int len = s.length();
One of the problems with high level languages is guessing how much work the system is doing behind the scenes of an apparently simple statement. Mandatory Joel link
This statement involves allocating memory for a string, and possibly a couple of temporary copies of a string. It must parse the integer and copy the digits of it into a string, possibly having to reallocate and move the existing memory if the number is large. It might have to check a bunch of locale settings to decide if your country uses "," or ".", it might have to do a bunch of unicode conversions.
Then finding the length has to scan the entire string, again considering unicode and any local specific settings such as - are you in a right->left language?.
Alternatively:
digits = floor( log10( number ) ) + 1;
Just because this would be harder for you to do on paper doesn't mean it's hard for a computer! In fact a good rule in high performance computing seems to have been - if something is hard for a human (fluid dynamics, 3d rendering) it's easy for a computer, and if it's easy for a human (face recognition, detecting a voice in a noisy room) it's hard for a computer!
You can generally assume that the builtin maths functions log/sin/cos etc - have been an important part of computer design for 50years. So even if they don't map directly into a hardware function in the FPU you can bet that the alternative implementation is pretty efficient.
I don't know, and the answer may well be different depending on how your individual language is implemented.
So, stress test it! Implement all three solutions. Run them on 1 through 1,000,000 (or some other huge set of numbers that's representative of the numbers the solution will be running against) and time how long each of them takes.
Pit your solutions against one another and let them fight it out. Like intellectual gladiators. Three algorithms enter! One algorithm leaves!
Test conditions
Decimal numeral system
Positive integers
Up to 10 digits
Language: ActionScript 3
Results
digits: [1,10],
no. of runs: 1,000,000
random sample: 8777509,40442298,477894,329950,513,91751410,313,3159,131309,2
result: 7,8,6,6,3,8,3,4,6,1
CONVERSION TO STRING: 724ms
LOGARITMIC CALCULATION: 349ms
DIV 10 ITERATION: 229ms
MANUAL CONDITIONING: 136ms
Note: Author refrains from making any conclusions for numbers with more than 10 digits.
Script
package {
import flash.display.MovieClip;
import flash.utils.getTimer;
/**
* #author Daniel
*/
public class Digits extends MovieClip {
private const NUMBERS : uint = 1000000;
private const DIGITS : uint = 10;
private var numbers : Array;
private var digits : Array;
public function Digits() {
// ************* NUMBERS *************
numbers = [];
for (var i : int = 0; i < NUMBERS; i++) {
var number : Number = Math.floor(Math.pow(10, Math.random()*DIGITS));
numbers.push(number);
}
trace('Max digits: ' + DIGITS + ', count of numbers: ' + NUMBERS);
trace('sample: ' + numbers.slice(0, 10));
// ************* CONVERSION TO STRING *************
digits = [];
var time : Number = getTimer();
for (var i : int = 0; i < numbers.length; i++) {
digits.push(String(numbers[i]).length);
}
trace('\nCONVERSION TO STRING - time: ' + (getTimer() - time));
trace('sample: ' + digits.slice(0, 10));
// ************* LOGARITMIC CALCULATION *************
digits = [];
time = getTimer();
for (var i : int = 0; i < numbers.length; i++) {
digits.push(Math.floor( Math.log( numbers[i] ) / Math.log(10) ) + 1);
}
trace('\nLOGARITMIC CALCULATION - time: ' + (getTimer() - time));
trace('sample: ' + digits.slice(0, 10));
// ************* DIV 10 ITERATION *************
digits = [];
time = getTimer();
var digit : uint = 0;
for (var i : int = 0; i < numbers.length; i++) {
digit = 0;
for(var temp : Number = numbers[i]; temp >= 1;)
{
temp/=10;
digit++;
}
digits.push(digit);
}
trace('\nDIV 10 ITERATION - time: ' + (getTimer() - time));
trace('sample: ' + digits.slice(0, 10));
// ************* MANUAL CONDITIONING *************
digits = [];
time = getTimer();
var digit : uint;
for (var i : int = 0; i < numbers.length; i++) {
var number : Number = numbers[i];
if (number < 10) digit = 1;
else if (number < 100) digit = 2;
else if (number < 1000) digit = 3;
else if (number < 10000) digit = 4;
else if (number < 100000) digit = 5;
else if (number < 1000000) digit = 6;
else if (number < 10000000) digit = 7;
else if (number < 100000000) digit = 8;
else if (number < 1000000000) digit = 9;
else if (number < 10000000000) digit = 10;
digits.push(digit);
}
trace('\nMANUAL CONDITIONING: ' + (getTimer() - time));
trace('sample: ' + digits.slice(0, 10));
}
}
}
This algorithm might be good also, assuming that:
Number is integer and binary encoded (<< operation is cheap)
We don't known number boundaries
var num = 123456789L;
var len = 0;
var tmp = 1L;
while(tmp < num)
{
len++;
tmp = (tmp << 3) + (tmp << 1);
}
This algorithm, should have speed comparable to for-loop (2) provided, but a bit faster due to (2 bit-shifts, add and subtract, instead of division).
As for Log10 algorithm, it will give you only approximate answer (that is close to real, but still), since analytic formula for computing Log function have infinite loop and can't be calculated precisely Wiki.
Use the simplest solution in whatever programming language you're using. I can't think of a case where counting digits in an integer would be the bottleneck in any (useful) program.
C, C++:
char buffer[32];
int length = sprintf(buffer, "%ld", (long)123456789);
Haskell:
len = (length . show) 123456789
JavaScript:
length = String(123456789).length;
PHP:
$length = strlen(123456789);
Visual Basic (untested):
length = Len(str(123456789)) - 1
conversion to string: This will have to iterate through each digit, find the character that maps to the current digit, add a character to a collection of characters. Then get the length of the resulting String object. Will run in O(n) for n=#digits.
for-loop: will perform 2 mathematical operation: dividing the number by 10 and incrementing a counter. Will run in O(n) for n=#digits.
logarithmic: Will call log10 and floor, and add 1. Looks like O(1) but I'm not really sure how fast the log10 or floor functions are. My knowledge of this sort of things has atrophied with lack of use so there could be hidden complexity in these functions.
So I guess it comes down to: is looking up digit mappings faster than multiple mathematical operations or whatever is happening in log10? The answer will probably vary. There could be platforms where the character mapping is faster, and others where doing the calculations is faster. Also to keep in mind is that the first method will creats a new String object that only exists for the purpose of getting the length. This will probably use more memory than the other two methods, but it may or may not matter.
You can obviously eliminate the method 1 from the competition, because the atoi/toString algorithm it uses would be similar to method 2.
Method 3's speed depends on whether the code is being compiled for a system whose instruction set includes log base 10.
For very large integers, the log method is much faster. For instance, with a 2491327 digit number (the 11920928th Fibonacci number, if you care), Python takes several minutes to execute the divide-by-10 algorithm, and milliseconds to execute 1+floor(log(n,10)).
import math
def numdigits(n):
return ( int(math.floor(math.log10(n))) + 1 )
Regarding the three methods you propose for "determining the number of digits necessary to represent a given number in a given base", I don't like any of them, actually; I prefer the method I give below instead.
Re your method #1 (strings): Anything involving converting back-and-forth between strings and numbers is usually very slow.
Re your method #2 (temp/=10): This is fatally flawed because it assumes that x/10 always means "x divided by 10". But in many programming languages (eg: C, C++), if "x" is an integer type, then "x/10" means "integer division", which isn't the same thing as floating-point division, and it introduces round-off errors at every iteration, and they accumulate in a recursive formula such as your solution #2 uses.
Re your method #3 (logs): it's buggy for large numbers (at least in C, and probably other languages as well), because floating-point data types tend not to be as precise as 64-bit integers.
Hence I dislike all 3 of those methods: #1 works but is slow, #2 is broken, and #3 is buggy for large numbers. Instead, I prefer this, which works for numbers from 0 up to about 18.44 quintillion:
unsigned NumberOfDigits (uint64_t Number, unsigned Base)
{
unsigned Digits = 1;
uint64_t Power = 1;
while ( Number / Power >= Base )
{
++Digits;
Power *= Base;
}
return Digits;
}
Keep it simple:
long long int a = 223452355415634664;
int x;
for (x = 1; a >= 10; x++)
{
a = a / 10;
}
printf("%d", x);
You can use a recursive solution instead of a loop, but somehow similar:
#tailrec
def digits (i: Long, carry: Int=1) : Int = if (i < 10) carry else digits (i/10, carry+1)
digits (8345012978643L)
With longs, the picture might change - measure small and long numbers independently against different algorithms, and pick the appropriate one, depending on your typical input. :)
Of course nothing beats a switch:
switch (x) {
case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7: case 8: case 9: return 1;
case 10: case 11: // ...
case 99: return 2;
case 100: // you get the point :)
default: return 10; // switch only over int
}
except a plain-o-array:
int [] size = {1,1,1,1,1,1,1,1,1,2,2,2,2,2,... };
int x = 234561798;
return size [x];
Some people will tell you to optimize the code-size, but yaknow, premature optimization ...
log(x,n)-mod(log(x,n),1)+1
Where x is a the base and n is the number.
Here is the measurement in Swift 4.
Algorithms code:
extension Int {
var numberOfDigits0: Int {
var currentNumber = self
var n = 1
if (currentNumber >= 100000000) {
n += 8
currentNumber /= 100000000
}
if (currentNumber >= 10000) {
n += 4
currentNumber /= 10000
}
if (currentNumber >= 100) {
n += 2
currentNumber /= 100
}
if (currentNumber >= 10) {
n += 1
}
return n
}
var numberOfDigits1: Int {
return String(self).count
}
var numberOfDigits2: Int {
var n = 1
var currentNumber = self
while currentNumber > 9 {
n += 1
currentNumber /= 10
}
return n
}
}
Measurement code:
var timeInterval0 = Date()
for i in 0...10000 {
i.numberOfDigits0
}
print("timeInterval0: \(Date().timeIntervalSince(timeInterval0))")
var timeInterval1 = Date()
for i in 0...10000 {
i.numberOfDigits1
}
print("timeInterval1: \(Date().timeIntervalSince(timeInterval1))")
var timeInterval2 = Date()
for i in 0...10000 {
i.numberOfDigits2
}
print("timeInterval2: \(Date().timeIntervalSince(timeInterval2))")
Output
timeInterval0: 1.92149806022644
timeInterval1: 0.557608008384705
timeInterval2: 2.83262193202972
On this measurement basis String conversion is the best option for the Swift language.
I was curious after seeing #daniel.sedlacek results so I did some testing using Swift for numbers having more than 10 digits. I ran the following script in the playground.
let base = [Double(100090000000), Double(100050000), Double(100050000), Double(100000200)]
var rar = [Double]()
for i in 1...10 {
for d in base {
let v = d*Double(arc4random_uniform(UInt32(1000000000)))
rar.append(v*Double(arc4random_uniform(UInt32(1000000000))))
rar.append(Double(1)*pow(1,Double(i)))
}
}
print(rar)
var timeInterval = NSDate().timeIntervalSince1970
for d in rar {
floor(log10(d))
}
var newTimeInterval = NSDate().timeIntervalSince1970
print(newTimeInterval-timeInterval)
timeInterval = NSDate().timeIntervalSince1970
for d in rar {
var c = d
while c > 10 {
c = c/10
}
}
newTimeInterval = NSDate().timeIntervalSince1970
print(newTimeInterval-timeInterval)
Results of 80 elements
0.105069875717163 for floor(log10(x))
0.867973804473877 for div 10 iterations
Adding one more approach to many of the already mentioned approaches.
The idea is to use binarySearch on an array containing the range of integers based on the digits of the int data type.
The signature of Java Arrays class binarySearch is :
binarySearch(dataType[] array, dataType key) which returns the index of the search key, if it is contained in the array; otherwise, (-(insertion point) – 1).
The insertion point is defined as the point at which the key would be inserted into the array.
Below is the implementation:
static int [] digits = {9,99,999,9999,99999,999999,9999999,99999999,999999999,Integer.MAX_VALUE};
static int digitsCounter(int N)
{
int digitCount = Arrays.binarySearch(digits , N<0 ? -N:N);
return 1 + (digitCount < 0 ? ~digitCount : digitCount);
}
Please note that the above approach only works for : Integer.MIN_VALUE <= N <= Integer.MAX_VALUE, but can be easily extended for Long data type by adding more values to the digits array.
For example,
I) for N = 555, digitCount = Arrays.binarySearch(digits , 555) returns -3 (-(2)-1) as it's not present in the array but is supposed to be inserted at point 2 between 9 & 99 like [9, 55, 99].
As the index we got is negative we need to take the bitwise compliment of the result.
At last, we need to add 1 to the result to get the actual number of digits in the number N.
In Swift 5.x, you get the number of digit in integer as below :
Convert to string and then count number of character in string
let nums = [1, 7892, 78, 92, 90]
for i in nums {
let ch = String(describing: i)
print(ch.count)
}
Calculating the number of digits in integer using loop
var digitCount = 0
for i in nums {
var tmp = i
while tmp >= 1 {
tmp /= 10
digitCount += 1
}
print(digitCount)
}
let numDigits num =
let num = abs(num)
let rec numDigitsInner num =
match num with
| num when num < 10 -> 1
| _ -> 1 + numDigitsInner (num / 10)
numDigitsInner num
F# Version, without casting to a string.