bcrypt cost 1 blank hash? - bcrypt

i want the crypt-formatted bcrypt hash for a blank password with cost 1, but the api i'm using to hash these passwords refuses to generate a hash with a cost below 4 (for obvious reasons). what does a cost 1 hash look like?
for completeness, here's a blank password with the salt being 22x A with cost 4, it's the closest i've got so far: $2a$04$AAAAAAAAAAAAAAAAAAAAA.lvvkzzqrMPdnab8Xxl8zf7j6C1s84c6
and that was generated with the php code crypt("",'$2a$04$AAAAAAAAAAAAAAAAAAAAAA') (but php refuses to create hashes below cost 4)

for password=(emptystring) cost=1 and salt=AAAAAAAAAAAAAAAAAAAAA the hash is:
$2a$01$AAAAAAAAAAAAAAAAAAAAA.9/Ai1w9JdKxud1gCb2hYi1hHz9IYr0m
it's also possible to use cost 0, in which case the hash is
$2a$00$AAAAAAAAAAAAAAAAAAAAA.xQIqw.yJbZDA8V.ef0psIHmBBBaQfhy
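For reference, the 60-character crypt string splits as $&lt;ident&gt;$&lt;cost&gt;$ followed by 22 salt characters and 31 digest characters; that's also why the 22 input A's come back as 21 A's plus a dot (only some bits of the last salt character are used, so crypt canonicalises it). A minimal Python sketch of that split (format inspection only, no hashing; `parse_bcrypt` is a name made up here):

```python
def parse_bcrypt(h):
    # crypt-format bcrypt: $<ident>$<cost>$ + 22-char salt + 31-char digest
    ident, cost, rest = h[1:].split("$")
    return ident, int(cost), rest[:22], rest[22:]

h = "$2a$04$AAAAAAAAAAAAAAAAAAAAA.lvvkzzqrMPdnab8Xxl8zf7j6C1s84c6"
ident, cost, salt, digest = parse_bcrypt(h)
```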
here is output from a modified php where the minimum cost is removed:
root@x2ratma:/temp/php-src# sapi/cli/php -r 'var_dump(crypt("",'\''$2a$04$AAAAAAAAAAAAAAAAAAAAAA'\''));'
string(60) "$2a$04$AAAAAAAAAAAAAAAAAAAAA.lvvkzzqrMPdnab8Xxl8zf7j6C1s84c6"
root@x2ratma:/temp/php-src# sapi/cli/php -r 'var_dump(crypt("",'\''$2a$03$AAAAAAAAAAAAAAAAAAAAAA'\''));'
string(60) "$2a$03$AAAAAAAAAAAAAAAAAAAAA.TCFhOtNOtk2Oeef1z4xP561tW1AQOMW"
root@x2ratma:/temp/php-src# sapi/cli/php -r 'var_dump(crypt("",'\''$2a$02$AAAAAAAAAAAAAAAAAAAAAA'\''));'
string(60) "$2a$02$AAAAAAAAAAAAAAAAAAAAA.C43gUGAZorcHKVYot1kaRPwCYWt2Ehm"
root@x2ratma:/temp/php-src# sapi/cli/php -r 'var_dump(crypt("",'\''$2a$01$AAAAAAAAAAAAAAAAAAAAAA'\''));'
string(60) "$2a$01$AAAAAAAAAAAAAAAAAAAAA.9/Ai1w9JdKxud1gCb2hYi1hHz9IYr0m"
root@x2ratma:/temp/php-src# sapi/cli/php -r 'var_dump(crypt("",'\''$2a$00$AAAAAAAAAAAAAAAAAAAAAA'\''));'
string(60) "$2a$00$AAAAAAAAAAAAAAAAAAAAA.xQIqw.yJbZDA8V.ef0psIHmBBBaQfhy"
the changes made to php-src were:
root@x2ratma:/temp/php-src# git diff --patch
diff --git a/ext/standard/crypt_blowfish.c b/ext/standard/crypt_blowfish.c
index 3806a290ae..f72cfb51df 100644
--- a/ext/standard/crypt_blowfish.c
+++ b/ext/standard/crypt_blowfish.c
@@ -647,7 +647,7 @@ static const unsigned char flags_by_subtype[26] =
static char *BF_crypt(const char *key, const char *setting,
char *output, int size,
BF_word min)
-{
+{min=0;
struct {
BF_ctx ctx;
BF_key expanded_key;
diff --git a/ext/standard/password.c b/ext/standard/password.c
index a19266d214..3c7e1d6926 100644
--- a/ext/standard/password.c
+++ b/ext/standard/password.c
@@ -192,7 +192,7 @@ static zend_string* php_password_bcrypt_hash(const zend_string *password, zend_a
cost = zval_get_long(zcost);
}
- if (cost < 4 || cost > 31) {
+ if (cost < 0 || cost > 31) {
zend_value_error("Invalid bcrypt cost parameter specified: " ZEND_LONG_FMT, cost);
return NULL;
}
... unfortunately php's password_verify() (incorrectly?) returns bool(false) for hashes below cost 4...

Bash array transformation like .map() in JavaScript

In JavaScript, the Array.map() function exists such that
const array1 = [1, 4, 9, 16];
const map1 = array1.map(x => x * 2);
console.log(map1);
// expected output: Array [2, 8, 18, 32]
I need a bash equivalent where I can take my array, manipulate its contents, then receive a new array with the manipulations.
array1=(1 4 9 16)
map1=# ????
echo ${map1[*]}
Soooooooo, just write the loop.
array1=(1 4 9 16)
map1=()
for i in "${array1[@]}"; do
map1+=("$((i * 2))")
done
echo "${map1[@]}"
Might be a good time to re-read an introduction to Bash arrays.
It is possible to implement an array_walk with a callback to perform an operation on each element this way:
#!/usr/bin/env bash
# Applies the user-defined callback function to each element of the array.
#
# @params
# $1: The array name to walk
# $2: The callback command or function name
array_walk() {
local -n __array=$1
local -- __callback=$2 __i
for __i in "${!__array[@]}"; do
"$__callback" "$1[$__i]"
done
}
x2() {
local -n __e=$1
__e=$((2 * __e))
}
array1=(1 4 9 16)
array_walk array1 x2
printf '%s\n' "${array1[*]}"

Compare strings and remove more general pattern in Perl

I have an array of strings that contain numbers, possibly separated by forward slashes, e.g. 754 or 754/128. These strings can have an undefined length; in other words, something such as the following is possible: 1234/34/21/120/3. In the array I want to only keep the more specialised patterns that contain other patterns. For instance, in the first example above 754/128 contains 754, so 754 can be removed from the array.
This concept of containing is as broad as one would expect, maybe even broader: it is similar to how you look at a directed graph where each slash in the pattern refers to a step forward. The contained pattern can be of arbitrary length as long as it is inside the containing pattern one way or another. This means that the small path can occur in any (chronologically correct) form. For instance, 903/900 is contained in 903/902/900 even though the pattern is 'split open'. One way to visualise this is: in the small path we go from point A to point B. In the larger path we also go from point A to B, but we make a stop at C. The larger path visits more places than the small path without missing out on anything. As a consequence, the smaller path can occur in any split-up form, as long as the order of the path is respected. As an example:
2/5 - 1/2/3/4/5
# included
5/2 - 1/2/3/4/5
# not included
What I mean here is that the position of the 'contained' items should be identical in the large path. For instance: 1/3/2 'matches' in 1/5/3/4/2 because the order is identical in the small and large path: 1 comes in a position before 3, which in turn is in some position before 2. 1/2/3, 2/1/3 and so on would NOT match the larger path 1/5/3/4/2 even though they are valid paths with the same items. This is because the order of occurrence is different.
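In other words, containment here is an ordered-subsequence test over the slash-separated components. As an illustrative sketch (not from the question), in Python:

```python
def is_contained(small, large):
    # True if small's components appear in large, in the same order,
    # and the two paths are not identical
    it = iter(large.split("/"))
    return small != large and all(part in it for part in small.split("/"))
```

The `part in it` membership test consumes the iterator up to and including the match, which is what enforces the left-to-right order.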
The above example also illustrates that the items in the small pattern can occur anywhere in the large pattern; not only in the first and last position or in subsequent positions. In other words, all included paths for 1/2/3/4 are:
1/2
1/2/3
1/3
1/4
2/3
2/3/4
2/4
3/4
I am looking for an efficient way to remove paths in a given array that are included in others from that same array.
I got this far, but I'm not sure how I should efficiently check the contains relationship between two items.
#!/usr/bin/perl
my @arr = ("903/900", "903/902/900", "903/904/902/901", "903/904/902/908/900", "903");
my @res = ();
OUTER: for (my $i = 0; $i < @arr; $i++) {
my $first = $arr[$i];
my $j = $i+1;
INNER: while($j < @arr) {
my $second = $arr[$j];
&compare_paths($first, $second);
$j++;
}
}
sub compare_paths {
my ($first, $second) = @_;
@first_items = split(/\//, $first);
@second_items = split(/\//, $second);
# Compare values from 1 and 2
}
The expected output for the code above would be
@res = ("903/904/902/901", "903/904/902/908/900");
Reasons for removal:
903/900 contained in 903/902/900
903/902/900 contained in 903/904/902/908/900
903 contained in 903/904/902/901
How can I implement such an algorithm efficiently? My main idea would be to check if an item of @first_items is present in $second, if not continue, but if so check if a second item is also present and if so: check its substring position. This has to be larger than the first item's substring position. Continue for each item (and the other way around for @second_items and $first) until all strings are matched. (If it helps for speed, the initial array can be swapped for a hash with the former array as the keys.)
I expect that there are general algorithms that can account for this problem, and probably libraries that can be leveraged. However, here is a hand-rolled one.
First, we sort the array by the number of terms in the path. Then we go up that array, comparing each element with all longer ones. This way each path is excluded at the earliest opportunity.
The comparison is between arrays obtained by splitting on /. It checks whether all elements of the smaller array are in the larger one as an exact sub-sequence, so that the larger one would yield the smaller one by only removing elements (without rearrangements).
use warnings;
use strict;
my @arr = qw(902/904 903/900 903/902/900 903/904/902/901
903/904/902/908/900 903);
my @sorted = sort { (split '/', $a) <=> (split '/', $b) } @arr;
my @primes;
OUTER:
for my $i (0..$#sorted) {
for my $j ($i+1..$#sorted) {
next OUTER if is_contained($sorted[$i], $sorted[$j]);
}
push @primes, $sorted[$i];
}
print "@primes\n";
sub is_contained
{
my ($small, $large) = @_;
my @small = split '/', $small;
my @large = split '/', $large;
# There can be no duplicates so equal-length paths are distinct
return 0 if @small == @large;
# Indices of elements of @small in @large cannot decrease
my ($match, $index) = (0, 0);
for my $sm (@small) {
for my $i (0..$#large) {
$sm == $large[$i] || next;
return 0 if $i < $index; # out of order
$index = $i;
$match = 1;
last;
}
return 0 if not $match; # $sm from @small not in @large
$match = 0;
}
return 1;
}
Prints the line: 902/904 903/904/902/901 903/904/902/908/900.
A note on how we check that the @smaller matches a subsequence in @larger.
Once a @smaller element is found in @larger, its index in @larger
cannot be lower than the one found previously. An element must come after the previous one, not before. See a different procedure below.
So with 2/7/5 and 1/2/5/7/8, first 2 is found at index 1, then 7 at index 3, then 5 but at index 2. The subsequence 2-5-7 does not match 2-7-5. I added 902/904 to the data to test for this.
This is an alternate procedure for checking whether a path is contained in another.
Once it finds an element of @smaller in @larger it searches for the next one starting from the next index in @larger. This way it skips the searched part of the path, but it cannot detect out-of-order elements early either.
With the example of 2/7/5 and 1/2/5/7/8, after it found 7 at index 3 it starts from index 4 and detects failure by not finding a 5 in the rest of the target path.
sub is_contained_2
{
my @large = split '/', $_[0];
my @small = split '/', $_[1];
# Is @small found in @large as an exact sub-sequence?
my ($match, $j) = (0, 0);
for my $sm (@small) {
for my $i ($j..$#large) {
$sm == $large[$i] || next;
$j = $i+1, $match = 1;
last;
}
return 0 if not $match;
$match = 0;
}
return 1;
}
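For comparison, the scan-forward variant can be sketched in Python (an illustrative translation, not the benchmarked code; like the Perl sub, it does not special-case equal paths):

```python
def is_contained_2(small, large):
    # resume searching just past the previous match; an out-of-order
    # element only shows up as running off the end of the larger path
    lg = large.split("/")
    j = 0
    for part in small.split("/"):
        while j < len(lg) and lg[j] != part:
            j += 1
        if j == len(lg):
            return False  # part not found in the remainder
        j += 1
    return True
```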
This is slower (by 10-15%) for this data set, see benchmark with a comment below.
I benchmarked the two array-based versions here and ikegami's regex+trie. So far I have used only the specific data set from the question, with 902/904 added.
use warnings;
use strict;
use Benchmark qw(cmpthese);
my $secs_to_run = shift || 10;
my @arr = ('902/904', '903/900', '903/902/900', '903/904/902/901',
'903/904/902/908/900', '903');
# sorted array checked shorter-to-longer, manual iterations
sub contained {
my ($rarr) = @_; my @arr = @$rarr;
# program copied from this post
return \@primes;
}
sub is_contained { ... } # copied
# Same program, but using is_contained_2()
sub contained_2 { ... }
sub is_contained_2 { ... }
# Regex-trie, copied from ikegami's post
sub add { my $p = \shift; $p = \( $$p->{$_} ) for @_, ''; }
sub as_pat { my $trie = shift; ... } # copied
sub regex_trie {
my ($rpaths) = @_; my @paths = @$rpaths;
# program copied from ikegami's post
return \@filtered_paths;
}
cmpthese(-$secs_to_run, {
containted => sub { my $rprimes = contained(\@arr) },
cont_next => sub { my $rprimes = contained_2(\@arr) },
regex_trie => sub { my $rfiltered = regex_trie(\@arr) },
});
With bench_cont.pl 300, on a newer workstation-laptop (2.5GHz) with v5.16
Rate regex_trie cont_next containted
regex_trie 15264/s -- -15% -27%
cont_next 17946/s 18% -- -14%
containted 20939/s 37% 17% --
on an older server (2.8GHz) with v5.16
Rate regex_trie cont_next containted
regex_trie 11750/s -- -13% -27%
cont_next 13537/s 15% -- -16%
containted 16042/s 37% 19% --
on an older server (3.5GHz) with v5.10
Rate cont_next regex_trie containted
cont_next 12266/s -- -17% -17%
regex_trie 14832/s 21% -- -0%
containted 14845/s 21% 0% --
This surprised me, as I expected the regex-based solution to be fastest.
I expect the trend to reverse for data composed of longer paths, having more distinct (not contained) paths, with containment found later in the path, and with a few out-of-order dismissals.
I'll add tests once I get to generate such data, or once it is provided.
To track some of the processing, change the body to
use feature 'say';
OUTER:
for my $i (0..$#sorted) {
say "Check $sorted[$i]";
for my $j ($i+1..$#sorted) {
my $is_inside = is_contained($sorted[$i], $sorted[$j]);
say "\t$is_inside: $sorted[$i] vs. $sorted[$j]";
next OUTER if $is_inside;
}
push @primes, $sorted[$i];
}
say "\nNot contained: @primes";
This prints
Check 903
0: 903 vs. 902/904
1: 903 vs. 903/900
Check 902/904
0: 902/904 vs. 903/900
0: 902/904 vs. 903/902/900
0: 902/904 vs. 903/904/902/901
0: 902/904 vs. 903/904/902/908/900
Check 903/900
1: 903/900 vs. 903/902/900
Check 903/902/900
0: 903/902/900 vs. 903/904/902/901
1: 903/902/900 vs. 903/904/902/908/900
Check 903/904/902/901
0: 903/904/902/901 vs. 903/904/902/908/900
Check 903/904/902/908/900
Not contained: 902/904 903/904/902/901 903/904/902/908/900
To optimize what you have so far, I'd suggest that you pre-split all elements of the array (and then re-merge them later):
@arr = map [split "/", $_], @arr;
With that done:
sub contains(\@\@) {
my ($larger_ref, $smaller_ref) = @_;
return '' if @$larger_ref <= @$smaller_ref;
my ($i, $j) = (0, 0);
while ($i < @$larger_ref && $j < @$smaller_ref) {
++$j if $larger_ref->[$i] == $smaller_ref->[$j];
++$i;
}
return $j == @$smaller_ref;
}
I: for (my $i = 0; $i < @arr; ++$i) {
J: for (my $j = 0; $j < @arr; ++$j) {
next J if $j == $i;
next I if contains @{$arr[$j]}, @{$arr[$i]};
}
push @res, join '/', @{$arr[$i]};
}
There are some potential further optimizations you can make in contains (for example, it might make sense to abort early if/when @$larger_ref - $i < @$smaller_ref - $j), but you'd want to test: they might turn out to be pessimizations.
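That early-abort idea can be sketched in Python (illustrative only; `contains` here mirrors the two-pointer Perl sub above, operating on pre-split component lists):

```python
def contains(larger, smaller):
    # walk both lists with two pointers; bail out as soon as the
    # remainder of `larger` is too short to hold the rest of `smaller`
    i = j = 0
    while i < len(larger) and j < len(smaller):
        if len(larger) - i < len(smaller) - j:
            return False  # early abort: cannot match the rest
        if larger[i] == smaller[j]:
            j += 1
        i += 1
    return j == len(smaller)
```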
If a/b/c is path, you want to remove the paths for which the following is true:
"/$path" =~ m{ ^ (?:/a)?+ (?:/b)?+ (?:/c)?+ \z }x && $path ne 'a/b/c'
That can also be written as
"/$path" =~ m{ ^ (?:/a)?+ (?:/b)?+ (?:/c)?+ \z (?<! ^ /a/b/c ) }x
If both a/b/c, a/i/j, a/x/y and d/e/f are paths, you want to remove the paths for which the following is true:
"/$path" =~ m{
^ (?:/a)?+ (?:/b)?+ (?:/c)?+ \z (?<! ^ /a/b/c )
| ^ (?:/a)?+ (?:/i)?+ (?:/j)?+ \z (?<! ^ /a/i/j )
| ^ (?:/a)?+ (?:/x)?+ (?:/y)?+ \z (?<! ^ /a/x/y )
| ^ (?:/d)?+ (?:/e)?+ (?:/f)?+ \z (?<! ^ /d/e/f )
}x
We removed a lot of the backtracking by using the possessive modifier (+), but there is still a possibility of backtracking because of the common prefixes. So let's remove them!
"/$path" =~ m{
^
(?: (?:/a)?+ (?: (?:/b)?+ (?:/c)?+ \z (?<! ^ /a/b/c )
| (?:/i)?+ (?:/j)?+ \z (?<! ^ /a/i/j )
| (?:/x)?+ (?:/y)?+ \z (?<! ^ /a/x/y )
)
| (?:/d)?+ (?:/e)?+ (?:/f)?+ \z (?<! ^ /d/e/f )
)
}x
Now we have an efficient solution!
The following uses a trie to do this, removing the common prefixes.
use strict;
use warnings;
use feature qw( say );
sub add {
my $p = \shift;
$p = \( $$p->{$_} ) for @_, '';
}
sub as_pat {
my $trie = shift;
my @sub_pats =
map { $_ eq '' ? '' : $_ . as_pat($trie->{$_}) }
keys(%$trie);
if (@sub_pats == 1) {
return $sub_pats[0];
} else {
return '(?:'.join('|', @sub_pats).')';
}
}
my @paths = ( "903/900", "903/902/900", "903/904/902/901", "903/904/902/908/900", "903" );
my $trie;
add($trie, map({ "(?:/\Q$_\E)?+" } split qr{/}), "\\z(?<!^/\Q$_\E)" )
for @paths;
my $pat = as_pat($trie);
my $re = qr{^(?:$pat)};
my @filtered_paths = grep { "/$_" !~ /$re/ } @paths;
say for @filtered_paths;
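The trie-to-alternation step can be illustrated in a few lines of Python (dict nodes, with '' marking end of a path; this shows only the prefix merging, not the possessive quantifiers or lookbehinds of the real pattern):

```python
def add(trie, parts):
    # descend/create one dict level per component; '' terminates a path
    for part in list(parts) + [""]:
        trie = trie.setdefault(part, {})

def as_pat(trie):
    # emit each child followed by its subtree; wrap siblings in (?:...|...)
    subs = [k + (as_pat(v) if k else "") for k, v in trie.items()]
    return subs[0] if len(subs) == 1 else "(?:" + "|".join(subs) + ")"

trie = {}
for p in ["a/b/c", "a/i/j"]:
    add(trie, ["/" + s for s in p.split("/")])
```

With the two example paths sharing the prefix /a, the emitted pattern factors it out exactly once.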
Say N is the number of paths, and O(M) bounds the length of the paths. Like earlier answers, this one completes in O(N^2 * M^2) time, so it doesn't scale any better. However, I believe you will find mine faster should you benchmark them.
Say N is the number of paths, and O(M) bounds the length of the paths; the earlier answers complete in O(N^2 * M^2) time.
The following solution is O(N * 2^M). This means it can handle very large numbers of paths much more efficiently, as long as the paths are rather short (because it effectively becomes O(N^2) vs O(N)). It does require far more memory than the solutions in the earlier answers.
use strict;
use warnings;
use feature qw( say );
sub fetch {
my $trie = shift;
for (@_, '') {
return () if !$trie;
$trie = $trie->{$_}
}
return $trie;
}
sub add {
local *_helper = sub {
my $trie_ptr = \shift;
my $exact = shift;
if (@_) {
my $lead = shift(@_);
_helper($$trie_ptr->{$lead}, $exact, @_);
_helper($$trie_ptr, 0, @_);
} else {
if ($exact) {
$$trie_ptr->{''} ||= 1;
} else {
$$trie_ptr->{''} = 2;
}
}
};
my $trie_ptr = \shift;
return _helper($$trie_ptr, 1, @_);
}
my @paths = ( "903/900", "903/902/900", "903/904/902/901", "903/904/902/908/900", "903" );
my @split_paths = map [ split qr{/} ], @paths;
my $trie;
add($trie, @$_)
for @split_paths;
use Data::Dumper qw( Dumper );
local $Data::Dumper::Sortkeys = 1;
print(Dumper($trie));
my @filtered_paths =
map join('/', @$_),
grep { fetch($trie, @$_) == 1 }
@split_paths;
say for #filtered_paths;
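The same mark-every-subsequence idea can be condensed into Python using a dict keyed by component tuples instead of a literal trie (illustrative; `filter_paths` is a name made up here, and the subsequence enumeration is what costs O(2^M) per path):

```python
from itertools import combinations

def filter_paths(paths):
    split = [tuple(p.split("/")) for p in paths]
    mark = {}  # subsequence -> 1 (seen only as an exact path) or 2 (derived)
    for parts in split:
        for r in range(1, len(parts) + 1):
            for sub in combinations(parts, r):  # order-preserving subsequences
                if sub == parts:
                    mark.setdefault(sub, 1)  # exact path; never downgrade a 2
                else:
                    mark[sub] = 2            # derivable from a longer path
    # keep only paths never produced as a subsequence of another path
    return ["/".join(p) for p in split if mark[p] == 1]
```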

How to write comments inside a patch/diff file? [duplicate]

This question already has answers here:
is it possible to add a comment to a diff file (unified)?
(3 answers)
Closed 5 years ago.
I would like to review a patch from a colleague. We are unable to use a review tool, so I would like to comment on the patch file he made. Is it possible to write inline comments in a (svn) patch file?
I couldn't find any information in the svn red book on it. I was even unable to find the patch file grammar to figure it out myself.
The diff format is just the unified diff format. If you wanted you could put some text after the range info. Consider this diff produced with command svn diff -c 1544711 https://svn.apache.org/repos/asf/subversion/trunk:
Index: subversion/mod_dav_svn/mod_dav_svn.c
===================================================================
--- subversion/mod_dav_svn/mod_dav_svn.c (revision 1544710)
+++ subversion/mod_dav_svn/mod_dav_svn.c (revision 1544711)
@@ -1097,7 +1097,8 @@
/* Fill the filename on the request with a bogus path since we aren't serving
* a file off the disk. This means that <Directory> blocks will not match and
- * that %f in logging formats will show as "svn:/path/to/repo/path/in/repo". */
+ * %f in logging formats will show as "dav_svn:/path/to/repo/path/in/repo".
+ */
static int dav_svn__translate_name(request_rec *r)
{
const char *fs_path, *repos_basename, *repos_path;
@@ -1146,7 +1147,7 @@
if (repos_path && '/' == repos_path[0] && '\0' == repos_path[1])
repos_path = NULL;
- /* Combine 'svn:', fs_path and repos_path to produce the bogus path we're
+ /* Combine 'dav_svn:', fs_path and repos_path to produce the bogus path we're
* placing in r->filename. We can't use our standard join helpers such
* as svn_dirent_join. fs_path is a dirent and repos_path is a fspath
* (that can be trivially converted to a relpath by skipping the leading
@@ -1154,7 +1155,7 @@
* repository is 'trunk/c:hi' this results in a non canonical dirent on
* Windows. Instead we just cat them together. */
r->filename = apr_pstrcat(r->pool,
- "svn:", fs_path, repos_path, SVN_VA_NULL);
+ "dav_svn:", fs_path, repos_path, SVN_VA_NULL);
/* Leave a note to ourselves so that we know not to decline in the
* map_to_storage hook. */
If you add the option -x-p to that command you'll get:
Index: subversion/mod_dav_svn/mod_dav_svn.c
===================================================================
--- subversion/mod_dav_svn/mod_dav_svn.c (revision 1544710)
+++ subversion/mod_dav_svn/mod_dav_svn.c (revision 1544711)
@@ -1097,7 +1097,8 @@ static int dav_svn__handler(request_rec *r)
/* Fill the filename on the request with a bogus path since we aren't serving
* a file off the disk. This means that <Directory> blocks will not match and
- * that %f in logging formats will show as "svn:/path/to/repo/path/in/repo". */
+ * %f in logging formats will show as "dav_svn:/path/to/repo/path/in/repo".
+ */
static int dav_svn__translate_name(request_rec *r)
{
const char *fs_path, *repos_basename, *repos_path;
@@ -1146,7 +1147,7 @@ static int dav_svn__translate_name(request_rec *r)
if (repos_path && '/' == repos_path[0] && '\0' == repos_path[1])
repos_path = NULL;
- /* Combine 'svn:', fs_path and repos_path to produce the bogus path we're
+ /* Combine 'dav_svn:', fs_path and repos_path to produce the bogus path we're
* placing in r->filename. We can't use our standard join helpers such
* as svn_dirent_join. fs_path is a dirent and repos_path is a fspath
* (that can be trivially converted to a relpath by skipping the leading
@@ -1154,7 +1155,7 @@ static int dav_svn__translate_name(request_rec *r)
* repository is 'trunk/c:hi' this results in a non canonical dirent on
* Windows. Instead we just cat them together. */
r->filename = apr_pstrcat(r->pool,
- "svn:", fs_path, repos_path, SVN_VA_NULL);
+ "dav_svn:", fs_path, repos_path, SVN_VA_NULL);
/* Leave a note to ourselves so that we know not to decline in the
* map_to_storage hook. */
Note how the function is added after the @@ on the range lines. This portion of the line is ignored by any software processing the diff, so you're free to put whatever you want there. You could put your comments there.
Unidiff hunks start each line with ' ' (space) to mean context (as in an unchanged line), '+' to mean an added line, or '-' to mean a removed line. A lot of parsers (including Subversion's svn patch command) will discard lines that start with some other character. So you might be able to simply insert a line that starts with some other character, but that's not guaranteed to be as portable as the above method.
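To illustrate that caveat, here is a hypothetical filter that mimics such a lenient parser by keeping only lines starting with prefixes a unified-diff consumer recognises (the exact set of accepted prefixes varies by tool, so this is an assumption, not a spec):

```python
def strip_free_comments(diff_text):
    # keep hunk bodies (' ', '+', '-'), range lines ('@'), and file headers;
    # drop anything else, e.g. an interleaved "NOTE: ..." comment line
    kept = []
    for line in diff_text.splitlines():
        if line[:1] in (" ", "+", "-", "@") or line.startswith(("Index:", "===")):
            kept.append(line)
    return "\n".join(kept)
```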

Calculate sum of size notated figures?

I want to calculate the total size of all .mobi files from this
link (it's a good link by the way).
In my attempt at making this a learning experience, I have made a 'pipe' (let's call it a) that outputs all the sizes from that page, which look like:
189K
20M
549K
2.2M
1.9M
3.1M
2.5M
513K
260K
1.1M
2.8M
5.1M
3.7M
1.5M
5.6M
1.0M
5.6M
1.5M
4.9M
3.4M
810K
My target is to get the total size (ex: 50.50M, or 50000K) - sum of all these numbers.
My question is, how to calculate that target, using pipeling (a | some_other_commands). Answers using python or any other language (preferably one liners) are welcome. Thanks a lot.
For the fun a solution in shell:
a | sed -e 's/M$/ 1024 * +/' -e 's/K$/ +/' | dc -e '0' -f - -e 'p'
Perl one-liner:
a | perl -ne 's/^([\d.]+)M$/$1*1024/e;$sum+=$_; END{print $sum."K"}'
It assumes that all entries are in either kilobytes or megabytes, as shown in the OP's input.
Sigh, someone says “one-liner” and all my code-golf reflexes fire...
ruby -e 'puts $<.read.split.inject(0){ |m,e| m += e.to_f * { "M" => 1, "K" => 0.001 }[e[-1,1]]}.to_s+"M"'
or, with some shortcuts...
ruby -ne 'p @e=@e.to_f+$_.to_f*{"M"=>1,"K"=>0.001}[$_[-2,1]]'
Update: Heh, ok, hard to read. The OP asked for a "one liner". :-)
#!/usr/bin/env ruby
total = 0
while s = gets # get line
scalefactorMK = s.chomp[-1,1] # get the M or K
scalefactor = { 'M'=>1,'K'=>0.001 }[scalefactorMK] # get numeric scale
total += s.to_f * scalefactor # accumulate total
end
puts "%5.1fM" % [total]
if you have Ruby (1.9+)
require 'net/http'
url="http://hewgill.com/~greg/stackoverflow/ebooks/"
response = Net::HTTP.get_response( URI.parse(url) )
data=response.body
total=0
data.split("\n").each do |x|
if x=~/\.mobi/
size = x.split(/\s+/)[-1]
c = case size[-1]
when 'K' then 1024
when 'M' then 1024 * 1024
when 'G' then 1024 * 1024 * 1024
end
total+=size[0..-2].to_f * c
end
end
puts "Total size: %.2f MB" % ( total/(1024.0 * 1024.0) )
awk (assume files less than 1K don't substantially add to the total):
a | awk '/K/ {sum += $1/1024} /M/ {sum += $1} END {printf("%.2fM\n", sum)}'
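For comparison, the same reduction in Python, under the same assumption that every entry ends in K or M (treated as binary units, result in megabytes; `total_megabytes` is a name made up here):

```python
def total_megabytes(sizes):
    # map each suffix to a megabyte scale factor, then sum
    scale = {"K": 1 / 1024.0, "M": 1.0}
    return sum(float(s[:-1]) * scale[s[-1]] for s in sizes)
```

Fed the listing above, e.g. `total_megabytes(["189K", "20M", "549K"])`, it returns the total in megabytes.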

Determining All Possibilities for a Random String?

I was hoping someone with better math capabilities would assist me in figuring out the total possibilities for a string given its length and character set.
i.e. [a-f0-9]{6}
What are the possibilities for this pattern of random characters?
It is equal to the number of characters in the set raised to 6th power.
In Python (3.x) interpreter:
>>> len("0123456789abcdef")
16
>>> 16**6
16777216
>>>
EDIT 1:
Why 16.7 million? Well, 000000 ... 999999 = 10^6 = 1M, 16/10 = 1.6 and
>>> 1.6**6
16.77721600000000
EDIT 2:
To create a list in Python, do: print(['{0:06x}'.format(i) for i in range(16**6)])
However, this is too huge. Here is a simpler, shorter example:
>>> ['{0:06x}'.format(i) for i in range(100)]
['000000', '000001', '000002', '000003', '000004', '000005', '000006', '000007', '000008', '000009', '00000a', '00000b', '00000c', '00000d', '00000e', '00000f', '000010', '000011', '000012', '000013', '000014', '000015', '000016', '000017', '000018', '000019', '00001a', '00001b', '00001c', '00001d', '00001e', '00001f', '000020', '000021', '000022', '000023', '000024', '000025', '000026', '000027', '000028', '000029', '00002a', '00002b', '00002c', '00002d', '00002e', '00002f', '000030', '000031', '000032', '000033', '000034', '000035', '000036', '000037', '000038', '000039', '00003a', '00003b', '00003c', '00003d', '00003e', '00003f', '000040', '000041', '000042', '000043', '000044', '000045', '000046', '000047', '000048', '000049', '00004a', '00004b', '00004c', '00004d', '00004e', '00004f', '000050', '000051', '000052', '000053', '000054', '000055', '000056', '000057', '000058', '000059', '00005a', '00005b', '00005c', '00005d', '00005e', '00005f', '000060', '000061', '000062', '000063']
>>>
EDIT 3:
As a function:
def generateAllHex(numDigits):
assert(numDigits > 0)
ceiling = 16**numDigits
for i in range(ceiling):
formatStr = '{0:0' + str(numDigits) + 'x}'
print(formatStr.format(i))
This will take a while to print at numDigits = 6.
I recommend dumping this to file instead like so:
def generateAllHex(numDigits, fileName):
assert(numDigits > 0)
ceiling = 16**numDigits
with open(fileName, 'w') as fout:
for i in range(ceiling):
formatStr = '{0:0' + str(numDigits) + 'x}'
fout.write(formatStr.format(i))
If you are just looking for the number of possibilities, the answer is (charset.length)^(length). If you need to actually generate a list of the possibilities, just loop through each character, recursively generating the remainder of the string.
e.g.
void generate(char[] charset, int length)
{
generate("",charset,length);
}
void generate(String prefix, char[] charset, int length)
{
for(int i=0;i<charset.length;i++)
{
if(length==1)
System.out.println(prefix + charset[i]);
else
generate(prefix+charset[i],charset,length-1);
}
}
The number of possibilities is the size of your alphabet, to the power of the size of your string (in the general case, of course)
assuming your string size is 4: _ _ _ _ and your alphabet = { 0 , 1 }:
there are 2 possibilities to put 0 or 1 in the first place, second place and so on.
so it all sums up to: alphabet_size^String_size
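The counting rule can be cross-checked by brute force on a tiny alphabet; a short Python sketch (`count_possibilities` is a name made up here):

```python
from itertools import product

def count_possibilities(charset, length):
    # one independent choice among len(charset) symbols per position
    return len(charset) ** length
```

Enumerating all strings with `itertools.product` and counting them gives the same number, which is why the closed form works.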
first: 000000
last: ffffff
This matches hexadecimal numbers.
For any given set of possible values, the number of permutations is the number of possibilities raised to the power of the number of items.
In this case, that would be 16 to the 6th power, or 16777216 possibilities.
