How to convert 4 char string, "1995", to decimal 1.9? - ruby

I am displaying a large number of vehicle records (120k+), and each record has an engine size for that vehicle. The engine size can be in one of two formats:
"1.8"
OR
"1995" #cc's
If the engine size is saved as a 4 char string, I want to abbreviate it in the view, rounded to the nearest 100 cc: for example, "1995" should be displayed as "2.0" and "1900" should be displayed as "1.9".
What is the best way I can do this? (cannot update database - this is view logic only)
Thanks!

(size.to_f / 100).round / 10.0

I don't know what other cases you have, but if you can separate the two cases by a simple comparison, it is
if size.to_f > 1000 # or whatever condition here; size is a string, so convert before comparing
  sprintf("%.1f", size.to_f / 1000)
else
  sprintf("%.1f", size.to_f)
end

Try this:
size = (size.length == 4) ? (size.to_f / 100).round / 10.0 : size.to_f
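Putting the answers above together, a view helper might look like the sketch below. The name format_engine_size is hypothetical, and it assumes that a value made of exactly four digits is in cc while anything else is already in litres.

```ruby
# Hypothetical view helper: returns the engine size in litres with one decimal place.
# Assumption: a string of exactly four digits is a value in cc; anything else is litres.
def format_engine_size(size)
  text = size.to_s.strip
  litres =
    if text.match?(/\A\d{4}\z/)
      (text.to_f / 100).round / 10.0 # round to the nearest 100 cc, then convert to litres
    else
      text.to_f
    end
  format("%.1f", litres)
end

puts format_engine_size("1995") # prints 2.0
puts format_engine_size("1900") # prints 1.9
puts format_engine_size("1.8")  # prints 1.8
```

Returning a formatted string rather than a float keeps the trailing ".0" in values such as "2.0".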

Related

Express decimal with precision of 2 in KQL?

I have a value, expressed in bytes, being returned from an Azure Log Analytics query:
I want to convert this to megabytes and make it more human readable. In this case, "4.19 MB".
When I try to convert the byte value into megabyte, I can't seem to get KQL to add the desired precision of 2 places.
Tried:
RequestBodySize = strcat(round(RequestBodySize / 1000 / 1000, 2), ' MB') but this results in "4.0 MB".
How do I get this value to correctly reflect a precision of 2?
EDIT 1:
format_bytes(RequestBodySize, 2) returns "4 MB". No precision.
Same with format_bytes(RequestBodySize, 2, 'MB').
I used a simple query to simulate the case and it works as expected for me.
In the first example, I added the unit to the field's name to maintain consistent value format that is aligned with the way values are projected in queries:
AzureDiagnostics
| where TimeGenerated > startofday(ago(20d))
| summarize volumeSizeMB = round(sum(_BilledSize)/pow(1024,2),2)
Results:
17.27
And when adding the unit to the value:
AzureDiagnostics
| where TimeGenerated > startofday(ago(20d))
| summarize volumeSize = strcat(round(sum(_BilledSize)/pow(1024,2),2), ' MB')
Results:
17.27 MB
If your issue persists and you don't see the expected precision, I suggest you open a support case to have it investigated.
Use print format_bytes(12345678) to get 12 MB.
Use print format_bytes(12345678, 2) to get 11.77 MB.
Read the doc for more info.
The answers above are great; I just wanted to add one small point.
The reason you are not getting the fractional part is that you are dividing two integers.
To get a real number, you first need to convert one of the numbers to a double by using todouble() (or its alias toreal()): https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/todoublefunction
RequestBodySize = strcat(round(todouble(RequestBodySize) / 1024 / 1024, 2), ' MB')
or, as suggested by Yossi, just multiply by 1.0
RequestBodySize = strcat(round(1.0 * RequestBodySize / 1024 / 1024, 2), ' MB')
try something like this:
| summarize GB = 1.0 * sum(TheThingsYouSub) / 1024 / 1024 / 1024 by SomeFilter

How to efficiently slice binary data in Ruby?

After reviewing SO post Ruby: Split binary data, I used the following code which works.
z = 'A' * 1_000_000
z.bytes.each_slice( STREAMING_CHUNK_SIZE ).each do | chunk |
  c = chunk.pack( 'C*' )
end
However, it is very slow:
Benchmark.realtime do
...
=> 0.0983949700021185
98ms to slice and pack a 1MB file. This is very slow.
Use Case:
Server receives binary data from an external API, and streams it using socket.write chunk.pack( 'C*' ).
The data is expected to be between 50KB and 5MB, with an average of 500KB.
So, how to efficiently slice binary data in Ruby?
Notes
Your code looks nice, uses the correct Ruby methods and the correct syntax, but it still :
creates a huge Array of Integers
slices this big Array in multiple Arrays
packs those Arrays back to a String
Alternative
The following code extracts the parts directly from the string, without converting anything :
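def get_binary_chunks(string, size)
  Array.new((string.bytesize + size - 1) / size) { |i| string.byteslice(i * size, size) }
end
(string.bytesize + size - 1) / size is a ceiling division: it avoids missing the last chunk if it is smaller than size. bytesize is used rather than length so that the chunk count matches byteslice even when the string contains multibyte characters.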
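To make the chunk-count arithmetic concrete, here is a small sketch. The helper name chunk_count and the 10-byte sample string are illustrative assumptions, not part of the answer.

```ruby
# Ceiling division: number of chunks needed to cover `total` bytes.
def chunk_count(total, size)
  (total + size - 1) / size
end

data = ("A" * 10).b # 10 bytes of binary data
chunks = Array.new(chunk_count(data.bytesize, 4)) { |i| data.byteslice(i * 4, 4) }

p chunk_count(10, 4)     # 3: two full chunks plus a 2-byte remainder
p chunk_count(8, 4)      # 2: an exact multiple needs no extra chunk
p chunks.map(&:bytesize) # [4, 4, 2]
p chunks.join == data    # true
```

Joining the chunks back together is a cheap way to check that nothing was dropped at the end.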
Performance
With a 500kB pdf file and chunks of 12345 bytes, Fruity returns :
Running each test 16 times. Test will take about 28 seconds.
_eric_duminil is faster than _b_seven by 380x ± 100.0
get_binary_chunks is also 6 times faster than StringIO#each(n) with this example.
Further optimization
If you're sure the string is binary (not UTF8 with multibyte characters like 'ä'), you can use slice instead of byteslice:
def get_binary_chunks(string, size)
  Array.new((string.length + size - 1) / size) { |i| string.slice(i * size, size) }
end
which makes the code even faster (about 500x compared to your method).
If you use this code with a Unicode String, the chunks will have size characters but might have more than size bytes.
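The difference between the two methods shows up as soon as the string holds multibyte characters. The following sketch uses a two-character UTF-8 string chosen purely for illustration:

```ruby
text = "\u00E4\u00E4" # "ää": 2 characters, 4 bytes in UTF-8

by_chars = text.slice(0, 1)     # first character
by_bytes = text.byteslice(0, 1) # first byte only

p text.length              # 2
p text.bytesize            # 4
p by_chars.bytesize        # 2: one character, but two bytes
p by_bytes.bytesize        # 1
p by_bytes.valid_encoding? # false: the character was cut in half

# With BINARY (ASCII-8BIT) encoding every character is one byte,
# so slice and byteslice agree.
binary = text.b
p binary.slice(0, 1).bytesize # 1
```

For data received from a socket or an API, forcing or copying to BINARY encoding first keeps the two views consistent.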
Using the chunks directly
Finally, if you're not interested in getting an Array of Strings, you could use the chunks directly :
def send_binary_chunks(socket, string, size)
  ((string.length + size - 1) / size).times do |i|
    socket.write string.slice(i * size, size)
  end
end
Use StringIO#each(n) with a string that has BINARY encoding:
require 'stringio'
string.force_encoding(Encoding::BINARY)
StringIO.new(string).each(size) { |chunk| socket.write(chunk) }
This only allocates each chunk string just before pushing it to the socket.

Hashing a long integer ID into a smaller string

Here is the problem: I need to transform an ID (defined as a long integer) into a smaller alphanumeric identifier. The details are the following:
Each individual in the problem has a unique ID, a long integer of size 13 (something like 123123412341234).
I need to generate a smaller representation of this unique ID, an alphanumeric string, something like A1CB3X. The problem is that a length of 5 or 6 characters will not be enough to represent such a large integer.
The new ID (eg A1CB3X) should be valid in a context where we know that only a small number of individuals are present (less than 500). The new ID should be unique within that small set of individuals.
The new ID (eg A1CB3X) should be the result of a calculation made over the original ID. This means that taking the original ID elsewhere and applying the same calculation, we should get the same new ID (eg A1CB3X).
This calculation should occur when the individual is added to the set, meaning that not all individuals belonging to that set will be known at that time.
Any directions on how to solve such a problem?
Assuming that you don't need a formula that goes in both directions (which is impossible if you are reducing a 13-digit number to a 5 or 6-character alphanum string):
If you can have up to 6 alphanumeric characters, that gives you 36^6 = 2,176,782,336 possibilities, assuming only numbers and uppercase letters.
To map your larger 13-digit number onto this space, you can take it modulo some prime number slightly smaller than that, for example 2,176,782,317, then encode it with base-36 encoding.
alphanum_id = base36encode(longnumber_id % 2176782317)
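For a set of 500, this gives you a
1 - (2176782317 P 500) / 2176782317^500 chance of a collision
(n P k is the number of k-permutations of n, so the fraction is the probability that all 500 IDs are distinct)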
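As a rough sketch of this approach in Ruby: the names short_id and collision_probability are hypothetical, the modulus is the value quoted in the answer, and Integer#to_s(36) stands in for base36encode. The probability is computed as a running product so that the huge permutation and power never have to be evaluated.

```ruby
MODULUS = 2_176_782_317 # value quoted in the answer; it is below 36**6

# Reduce the long ID modulo MODULUS and encode it as 6 base-36 characters.
def short_id(long_id)
  (long_id % MODULUS).to_s(36).upcase.rjust(6, "0")
end

# Probability that at least two of `count` uniformly distributed IDs
# fall on the same value in a space of `space` values.
def collision_probability(count, space)
  all_distinct = (0...count).reduce(1.0) { |prob, i| prob * (space - i) / space }
  1.0 - all_distinct
end

puts short_id(123_123_412_341_234)
puts collision_probability(500, MODULUS) # on the order of 0.006%
```

The same long ID always yields the same short ID, but the mapping cannot be reversed, and uniqueness within the set is only very likely, not guaranteed.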
Best option is to change the base to 62 using case sensitive characters
If you want it to be shorter, you can add unicode characters. See below.
Here is javascript code for you: https://jsfiddle.net/vewmdt85/1/
function compress(n) {
var symbols = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïð'.split('');
var d = n;
var compressed = '';
while (d >= 1) {
compressed = symbols[(d - (symbols.length * Math.floor(d / symbols.length)))] + compressed;
d = Math.floor(d / symbols.length);
}
return compressed;
}
$('input').keyup(function() {
$('span').html(compress($(this).val()))
})
$('span').html(compress($('input').val()))
How about using some base-X conversion, for example 123123412341234 becomes 17N644R7CI in base-36 and 9999999999999 becomes 3JLXPT2PR?
If you need a mapping that works both directions, you can simply go for a larger base.
Meaning: using base 16, you can reduce the values 0 to 15 to a single character.
So, base36 is the "maximum" that allows for shorter strings (when 1-1 mapping is required)!
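For a reversible mapping, Ruby's built-in conversions cover bases up to 36. A minimal sketch, using the sample ID from the question:

```ruby
id = 123_123_412_341_234

encoded = id.to_s(36).upcase # Integer#to_s accepts bases 2 through 36
decoded = encoded.to_i(36)   # String#to_i reverses it, ignoring letter case

p id.to_s.length # 15 decimal digits
p encoded.length # 10 base-36 characters
p decoded == id  # true
```

Going beyond base 36 (for example base 62 with case-sensitive letters) needs a custom alphabet, since the built-in methods stop at 36.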

How would I find an unknown pattern in an array of bytes?

I am building a tool to help me reverse engineer database files. I am targeting my tool towards fixed record length flat files.
What I know:
1) Each record has an index(ID).
2) Each record is separated by a delimiter.
3) Each record is fixed width.
4) Each column in each record is separated by at least one x00 byte.
5) The file header is at the beginning (I say this because the header does not contain the delimiter..)
Delimiters I have found in other files are: ( xFAxFA, xFExFE, xFDxFD ) But this is kind of irrelevant considering that I may use the tool on a different database in the future. So I will need something that will be able to pick out a 'pattern' despite how many bytes it is made of. Probably no more than 6 bytes? It would probably eat up too much data if it was more. But, my experience doing this is limited.
So I guess my question is, how would I find UNKNOWN delimiters in a large file? I feel that, given 'what I know', I should be able to program something; I just don't know where to begin...
# Really loose pseudo code
def begin_some_how
  # THIS IS THE PART I NEED HELP WITH...
  # find all non-zero non-ascii sets of 2 or more bytes that repeat more than twice.
end
def check_possible_record_lengths
  possible_delimiter = begin_some_how
  # test if any of the above are always the same number of bytes apart from each other (except one instance, the header...)
  possible_records = file.split(possible_delimiter)
  rec_length_count = possible_records.map { |record| record.length }.uniq.count
  if rec_length_count == 2 # The header will most likely not be the same size.
    puts "Success! We found the fixed record delimiter: #{possible_delimiter}"
  else
    puts "Wrong delimiter found"
  end
end
possible = [",", "."]
result = [0, ""]
possible.each do |delimiter|
  sizes = file.split(delimiter).map { |record| record.size }
  next if sizes.size < 2
  average = 0.0 + sizes.inject { |sum, x| sum + x }
  average /= sizes.size # This should be the record length if this is the right delimiter
  # Start the sum at 0.0 so the first size is also counted as a squared deviation
  deviation = sizes.inject(0.0) { |sum, x| sum + (x - average)**2 }
  matching_value = average / (deviation**2)
  if matching_value > result[0]
    result[0] = matching_value
    result[1] = delimiter
  end
end
Take advantage of the fact that the records have constant size. Take every possible delimiter and check how much each record deviates from the usual record length. If the header is small enough compared to the rest of the file, this should work.
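One possible way to generate the candidate delimiters instead of hard-coding them is sketched below. It follows the question's own hint (runs of non-zero, non-ASCII bytes that repeat more than twice) and then keeps the first candidate that splits the data into equal-length records. The method names, the default width of 2 bytes, and the synthetic file are all assumptions made for illustration.

```ruby
# Collect every `width`-byte sequence made only of non-ASCII bytes (>= 0x80)
# that occurs more than twice in the data.
def candidate_delimiters(data, width = 2)
  data = data.b
  counts = Hash.new(0)
  (0..data.bytesize - width).each do |i|
    gram = data.byteslice(i, width)
    counts[gram] += 1 if gram.bytes.all? { |byte| byte >= 0x80 }
  end
  counts.select { |_gram, count| count > 2 }.keys
end

# Return the first candidate that yields records of one single length,
# ignoring the first piece, which is assumed to be the file header.
def fixed_width_delimiter(data, candidates)
  candidates.find do |delimiter|
    records = data.b.split(delimiter).drop(1)
    records.size > 1 && records.map(&:bytesize).uniq.size == 1
  end
end

# Synthetic file: a header followed by four 8-byte records, each preceded by \xFA\xFA.
delimiter = "\xFA\xFA".b
header = "HDR\x00v1".b
records = (1..4).map { |id| [id].pack("C") + "\x00rec\x00\x00\x00".b }
file = (header + records.map { |record| delimiter + record }.join).b

candidates = candidate_delimiters(file)
p candidates
p fixed_width_delimiter(file, candidates)
```

For real files the scan could be repeated for widths from 2 up to about 6 bytes, as suggested in the question, and the deviation-based score from the answer could replace the strict equal-length test.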

Division on rails 4

I'm having a hard time figuring out what to do about this problem. I'm dividing two numbers and expecting a non-whole-number answer, that is, a decimal. Unfortunately, the result is a whole number.
example: 5 / 2 = 2
s.apts = sum_pts.to_f / sum_game.to_f
You just need to tell Ruby you are doing non-integer division, by writing the problem as "5.0 / 2.0"
See:
Why is division in Ruby returning an integer instead of decimal value?
Try this:
a = 5.0
b = 2.0
puts a / b
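When the operands come from variables rather than literals, converting one of them is enough. The sketch below shows the options; the helper average_points and its 0.0 fallback for zero games are assumptions, not something stated in the question.

```ruby
# Hypothetical helper: average points per game as a Float.
# Assumption: an average over zero games is reported as 0.0.
def average_points(sum_pts, sum_game)
  return 0.0 if sum_game.zero?

  sum_pts.fdiv(sum_game) # fdiv always performs floating-point division
end

p 5 / 2                # 2, integer division truncates
p 5.to_f / 2           # 2.5, one Float operand is enough
p 5.fdiv(2)            # 2.5
p average_points(5, 2) # 2.5
p average_points(5, 0) # 0.0
```

Without the zero check, a Float division by zero would return Infinity or NaN instead of raising an error.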
