Print integer with "most appropriate" kilo/mega/etc multiplier [duplicate] - algorithm

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to convert byte size into human readable format in java?
Given an integer, I'd like to print it in a human-readable way using kilo, mega, giga etc. multipliers. How do I pick the "best" multiplier?
Here are some examples
1 print as 1
12345 print as 12.3k
987654321 print as 988M
Ideally the number of digits printed should be configurable, e.g. in the last example, 3 digits would lead to 988M, 2 digits would lead to 1.0G, 1 digit would lead to 1G, and 4 digits would lead to 987.7M.
Example: Apple uses an algorithm of this kind, I think, when OSX tells me how many more bytes have to be copied.
This will be for Java, but I'm more interested in the algorithm than the language.

As a starting point, you could use the Math.log() function to get the "magnitude" of your value, and then use some form of associative container for the suffix (k, M, G, etc).
var magnitude = Math.log(value) / Math.log(10);
Hope this helps somehow

Related

Fastest algorithm to convert hexadecimal numbers into decimal form without using a fixed length variable to store the result

I want to write a program to convert hexadecimal numbers into their decimal forms without using a variable of fixed length to store the result because that would restrict the range of inputs that my program can work with.
Let's say I were to use a variable of type long long int to calculate, store and print the result. Doing so would limit the range of hexadecimal numbers that my program can handle to between 8000000000000001 and 7FFFFFFFFFFFFFFF. Anything outside this range would cause the variable to overflow.
I did write a program that calculates and stores the decimal result in a dynamically allocated string by performing carry and borrow operations but it runs much slower, even for numbers that are as big as 7FFFFFFFF!
Then I stumbled onto this site which could take numbers that are way outside the range of a 64 bit variable. I tried their converter with numbers much larger than 16^65 - 1 and still couldn't get it to overflow. It just kept on going and printing the result.
I figured that they must be using a much better algorithm for hex to decimal conversion, one that isn't limited to 64 bit values.
So far, Google's search results have only led me to algorithms that use some fixed-length variable for storing the result.
That's why I am here. I wanna know if such an algorithm exists and if it does, what is it?
Well, it sounds like you already did it when you wrote "a program that calculates and stores the decimal result in a dynamically allocated string by performing carry and borrow operations".
Converting from base 16 (hexadecimal) to base 10 means implementing multiplication and addition of numbers in a base 10x representation. Then for each hex digit d, you calculate result = result*16 + d. When you're done you have the same number in a 10-based representation that is easy to write out as a decimal string.
There could be any number of reasons why your string-based method was slow. If you provide it, I'm sure someone could comment.
The most important trick for making it reasonably fast, though, is to pick the right base to convert to and from. I would probably do the multiplication and addition in base 109, so that each digit will be as large as possible while still fitting into a 32-bit integer, and process 7 hex digits at a time, which is as many as I can while only multiplying by single digits.
For every 7 hex digts, I'd convert them to a number d, and then do result = result * ‭(16^7) + d.
Then I can get the 9 decimal digits for each resulting digit in base 109.
This process is pretty easy, since you only have to multiply by single digits. I'm sure there are faster, more complicated ways that recursively break the number into equal-sized pieces.

Conversion of Integer to Characters [duplicate]

This question already has answers here:
How do I print an integer in Assembly Level Programming without printf from the c library? (itoa, integer to decimal ASCII string)
(5 answers)
Closed 3 years ago.
I have been looking for the way that integers are converted to characters. I understand that there are ways using modulo and division to extract each number. I am looking for the way that programming languages do this.
Example:
int a = 101;
printf("%d\", &a);
This prints 101 to the console.
I want to understand how the bits 01100101 turn into "101" at the processor level.
I want to understand how the bits 01100101 turn into "101" at the processor level.
It doesn't. At the Assembly level the meanings of bitpatterns are entirely dependent on their usage. If I do mov eax, 1 the computer doesn't know whether I mean the decimal number 1, the boolean true or even the ASCII delete character. Meaning is something humans ascribe to their programs.
I understand that there are ways using modulo and division to extract each number. I am looking for the way that programming languages do this.
This is pretty much what programming languages do, sometimes they will also use look up tables and such to speed things up but the core algorithm stays the same.

How does Ruby store large numbers? [duplicate]

This question already has answers here:
Arbitrary precision arithmetic with Ruby
(5 answers)
Closed 8 years ago.
Ruby can store extremely large numbers. Now that I think about it though, I don't even know how that's possible.
Computers store data in a series of two digits (0s and 1s). This is referred to as binary notation. However, there is a limit to the size of numbers they can store.
Most current Operating Systems these days run at 64 bits. That means the highest allocatable address space for a variable is 64 bits.
Integers are stored in a base 2 system, which means the highest value a computer should be able to store is
1111111111111111111111111111111111111111111111111111111111111111
Since computers can only read 2 possible values, this means the above number can be represented as
2 ^ 64
This means that the highest value an integer can read should be at most 18,446,744,073,709,551,615
I honestly don't even understand how it's possible to store integer values higher than that.
Ruby uses Bignum objects to store number larger than 2^64. You can see here a description of how this works:
On the left, you can see RBignum contains an inner structure called
RBasic, which contains internal, technical values used by all Ruby
objects. Below that I show values specific to Bignum objects: digits
and len. digits is a pointer to an array of 32-bit values that contain
the actual big integer’s bits grouped into sets of 32. len records how
many 32-bit groups are in the digits array. Since there can be any
number of groups in the digits array, Ruby can represent arbitrarily
large integers using RBignum.

ZPL - Code 128 Understanding better how to use Subsets B and C

I'm getting involved with ZPL (a little bit) since a few days, so I'm sorry if the questions will look stupid.
I've got to build a bar code 128 and I finally realized: I got to make it as shorter as possible.
My main question is: is it possible to switch to subset C and then back to B for just 2 digits? I read the documentation and subset C will ready digits from 00 to 99, so in theory it should work, practically, will it be worth it?
Basically when I translate a bar code with Zebra designer, and print it to a file, it doesn't bother to switch to subset C for just a couple of digits.
This is the text I need to see in the bar code:
AB1C234D567890123456
By the documentation I read, I would build something like this:
FD>:AB1C>523>64D>5567890123456
Instead Zebra Designer does:
FD>:AB1C234D>5567890123456
So the other question is, will the bar code be the same length? Actually, will mine be shorter? [I don't have a printer with me at the moment]
Last question:
Let's say I don't want to spend much time scripting this up, will the following work ok, or will it make the bar code larger?
AB1C>523>64D>556>578>590>512>534>556
So I can just build a very simple script which checks two chars per time, if they're both numbers, then add >5 to the string.
Thank you :)
Ah, some nice loose terminology. Do you mean couple="exactly 2" or couple="a few"?
Changing from one subset to another takes one code element, so for exactly 2 digits, you'd need one element to change and one to represent the 2 digits in subset C. On the other hand, staying with your original subset would take 2 elements - so no, it's not worth the change.
Further, if you were to change to C for 2 digits and then back to your original, the change would actually be costly - C(12)B = 3 elements whereas 12 would only be 2.
If you repeat the exercise for 4 digits, then switching to C would generate C(12)(34) = 3 elements against 4 to stay with what you have; or C(12)(34)B = 4 elements if you switch and change back, or 4 elements if you stick - so no gain.
With 6 or more successive numerics, then you gain regardless of whether or not you switch back.
So overall,
2-digit terminal : No difference
2-digit other : code is longer
4-digit terminal : code is shorter
4-digit other : no difference
more than 4 digits : code is shorter.
And an ODD number of digits would need to be output in code A or B for the first digit and then the above table applies to the remainder.
This may not be the answer you're looking for, but specifying A (Automatic Mode) as the final parameter to the ^BC command will make the printer do this for you.
Example:
^XA
^FO100,100
^BY3
^BCN,100,N,N,A
^FD0123456789^FS
^XZ

What methods can I use to analyse and guess 4-bit checksum algorithm?

[Background Story]
I am working with a 5 year old user identification system, and I am trying to add IDs to the database. The problem I have is that the system that reads the ID numbers requires some sort of checksum, and no-one working here now has ever worked with it, so no-one knows how it works.
I have access to the list of existing IDs, which already have correct checksums. Also, as the checksum only has 16 possible values, I can create any ID I want and run it through the authentication system up to 16 times until I get the correct checksum (but this is quite time consuming)
[Question]
What methods can I use to help guess the checksum algorithm of used for some data?
I have tried a few simple methods such as XORing and summing, but these have not worked.
So my question is: if I have data (in hexadecimal) like this:
data checksum
00029921 1
00013481 B
00026001 3
00004541 8
What methods can I use work out what sort of checksum is used?
i.e. should I try sequential numbers such as 00029921,00029922,00029923,... or 00029911,00029921,00029931,... If I do this what patterns should I look for in the changing checksum?
Similarly, would comparing swapped digits tell me anything useful about the checksum?
i.e. 00013481 and 00031481
Is there anything else that could tell me something useful? What about inverting one bit, or maybe one hex digit?
I am assuming that this will be a common checksum algorithm, but I don't know where to start in testing it.
I have read the following links, but I am not sure if I can apply any of this to my case, as I don't think mine is a CRC.
stackoverflow.com/questions/149617/how-could-i-guess-a-checksum-algorithm
stackoverflow.com/questions/2896753/find-the-algorithm-that-generates-the-checksum
cosc.canterbury.ac.nz/greg.ewing/essays/CRC-Reverse-Engineering.html
[ANSWER]
I have now downloaded a much larger list of data, and it turned out to be simpler than I was expecting, but for completeness, here is what I did.
data:
00024901 A
00024911 B
00024921 C
00024931 D
00042811 A
00042871 0
00042881 1
00042891 2
00042901 A
00042921 C
00042961 0
00042971 1
00042981 2
00043021 4
00043031 5
00043041 6
00043051 7
00043061 8
00043071 9
00043081 A
00043101 3
00043111 4
00043121 5
00043141 7
00043151 8
00043161 9
00043171 A
00044291 E
From these, I could see that when just one value was increased by a value, the checksum was also increased by the same value as in:
00024901 A
00024911 B
Also, two digits swapped did not change the checksum:
00024901 A
00042901 A
This means that the polynomial value (for these two positions at least) must be the same
Finally, the checksum for 00000000 was A, so I calculated the sum of digits plus A mod 16:
( (Σxi) +0xA )mod16
And this matched for all the values I had. Just to check that there was nothing sneaky going on with the first 3 digits that never changed in my data, I made up and tested some numbers as Eric suggested, and those all worked with this too!
Many checksums I've seen use simple weighted values based on the position of the digits. For example, if the weights are 3,5,7 the checksum might be 3*c[0] + 5*c[1] + 7*c[2], then mod 10 for the result. (In your case, mod 16, since you have 4 bit checksum)
To check if this might be the case, I suggest that you feed some simple values into your system to get an answer:
1000000 = ?
0100000 = ?
0010000 = ?
... etc. If there are simple weights based on position, this may reveal it. Even if the algorithm is something different, feeding in nice, simple values and looking for patterns may be enlightening. As Matti suggested, you/we will likely need to see more samples before decoding the pattern.

Resources