Can this be modified to run faster? - performance

I'm creating a word list using Python that hits every combination of characters, which is a monster of a calculation past 94^4. Before you ask where I'm getting 94, 94 covers ASCII characters 32 to 127. Understandably this function runs super slow; I'm curious if there's a way to make it more efficient.
This is the meat and potatoes of my code.
def CreateTable(name,ASCIIList,size):
    f = open(name + '.txt','w')
    combo = itertools.product(ASCIIList, repeat = size)
    for x in combo:
        passwords = ''.join(x)
        f.write(str(passwords) + '\n')
    f.close()
I'm using this so that I can make lists to use in a brute force where I don't know the length of the passwords or what characters the password contains. Using a list like this I hit every possible combination, so I'm sure to hit the right one eventually. As stated earlier, this is a slow program; the list is also slow to read in and will not be my first choice for a brute force. This is more or less a last-ditch effort.
To give you an idea of how long that piece of code runs: I was creating all the combinations of size 5 and it ran for 3 hours, ending at a little over 50GB.
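A quick back-of-the-envelope check of those numbers, as a sketch only: with 94 characters and length 5 there are 94^5 words, and each line costs the word plus a line terminator. The function name `wordlist_size` and its `newline_bytes` parameter are made up for this illustration.

```python
def wordlist_size(alphabet_len, word_len, newline_bytes=1):
    """Return (number of words, file size in bytes) for a full word list."""
    words = alphabet_len ** word_len
    return words, words * (word_len + newline_bytes)

words, size = wordlist_size(94, 5)
print(words)          # 7339040224 words of length 5
print(size / 1e9)     # about 44 GB with a one-byte newline

# With a two-byte "\r\n" terminator (an assumption: text mode on Windows)
# every line takes 7 bytes, which gives roughly 51 GB.
print(wordlist_size(94, 5, newline_bytes=2)[1] / 1e9)
```

Whatever the code does, it has to push tens of gigabytes to disk, so disk throughput puts a floor under the running time.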

Warning: I have not tested this code.
I would convert combo to a list: combo_list = list(combo)
I would then break it into chunks:
# https://stackoverflow.com/a/312464/596841
def get_chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]
# Change 1000 to whatever works.
chunks = get_chunks(combo_list, 1000)
Next, I would use multithreading to process each chunk:
import threading
class myThread(threading.Thread):
    def __init__(self, chunk_id, chunk):
        threading.Thread.__init__(self)
        self.chunk_id = chunk_id
        self.chunk = chunk
    def run(self):
        # chunk_id is an integer, so convert it before concatenating
        print("Starting " + str(self.chunk_id))
        self.process_data()
        print("Exiting " + str(self.chunk_id))
    def process_data(self):
        # Each thread writes its chunk to its own file, e.g. "0.txt"
        with open(str(self.chunk_id) + '.txt', 'w') as f:
            for item in self.chunk:
                passwords = ''.join(item)
                f.write(passwords + '\n')
I would then do something like this:
threads = []
for i, chunk in enumerate(chunks):
    thread = myThread(i, chunk)
    thread.start()
    threads.append(thread)
# Wait for all threads to complete
for t in threads:
    t.join()
You could then write another script to merge all the output files, if you need.
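One practical concern with the approach above is that `list(combo)` has to hold every combination in memory at once. A possible variation, sketched here under assumptions, is to split the work by first character so that each worker runs its own small `itertools.product` and writes its own file. The names `write_prefix` and `create_table_parallel` and the `part_NNN.txt` file naming are invented for this sketch, and with CPython's GIL threads may give little or no speedup for this CPU-heavy loop.

```python
import itertools
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def write_prefix(path, prefix, alphabet, size):
    """Write every word of length `size` that starts with `prefix`."""
    with open(path, "w", newline="\n") as f:
        for tail in itertools.product(alphabet, repeat=size - 1):
            f.write(prefix + "".join(tail) + "\n")
    return path

def create_table_parallel(out_dir, alphabet, size, workers=4):
    """One output file per first character; returns the paths in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(write_prefix,
                        os.path.join(out_dir, "part_%03d.txt" % i),
                        ch, alphabet, size)
            for i, ch in enumerate(alphabet)
        ]
        return [fut.result() for fut in futures]

# Tiny demonstration: 3 characters, length 3, written to a temp directory.
out_dir = tempfile.mkdtemp()
paths = create_table_parallel(out_dir, "abc", 3)
print(len(paths))   # one file per first character
```

Concatenating the part files in order gives the same sequence that a single `itertools.product` call would produce.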

I did some testing on this, and I think the main problem is that you're writing in text mode.
Binary mode is faster, and you're only dealing with ASCII, so you might as well just spit out bytes rather than strings.
Here's my code:
import itertools
import time
def CreateTable(name,ASCIIList,size):
    f = open(name + '.txt','w')
    combo = itertools.product(ASCIIList, repeat = size)
    for x in combo:
        passwords = ''.join(x)
        f.write(str(passwords) + '\n')
    f.close()
def CreateTableBinary(name,ASCIIList,size):
    f = open(name + '.txt', 'wb')
    combo = itertools.product(ASCIIList, repeat = size)
    for x in combo:
        passwords = bytes(x)
        f.write(passwords)
        f.write(b'\n')
    f.close()
def CreateTableBinaryFast(name,first,last,size):
    f = open(name + '.txt', 'wb')
    x = bytearray(chr(first) * size, 'ASCII')
    while True:
        f.write(x)
        f.write(b'\n')
        # Increment x like an odometer, starting from the last position
        i = size - 1
        while (x[i] == last) and (i > 0):
            x[i] = first
            i -= 1
        if i == 0 and x[i] == last:
            break
        x[i] += 1
    f.close()
def CreateTableTheoreticalMax(name,ASCIIList,size):
    f = open(name + '.txt', 'wb')
    combo = range(0, len(ASCIIList)**size)
    passwords = b'A' * size
    for x in combo:
        f.write(passwords)
        f.write(b'\n')
    f.close()
print("writing real file in text mode")
start = time.time()
chars = [chr(x) for x in range(32, 126)]
CreateTable("c:/temp/output", chars, 4)
print("that took ", time.time() - start, "seconds.")
print("writing real file in binary mode")
start = time.time()
chars = bytes(range(32, 126))
CreateTableBinary("c:/temp/output", chars, 4)
print("that took ", time.time() - start, "seconds.")
print("writing real file in fast binary mode")
start = time.time()
CreateTableBinaryFast("c:/temp/output", 32, 125, 4)
print("that took ", time.time() - start, "seconds.")
print("writing fake file at max speed")
start = time.time()
chars = [chr(x) for x in range(32, 126)]
CreateTableTheoreticalMax("c:/temp/output", chars, 4)
print("that took ", time.time() - start, "seconds.")
Output:
writing real file in text mode
that took 101.5869083404541 seconds.
writing real file in binary mode
that took 40.960529804229736 seconds.
writing real file in fast binary mode
that took 35.54869604110718 seconds.
writing fake file at max speed
that took 26.43029284477234 seconds.
So you can see a pretty big improvement just by switching to binary mode.
Also, there still seems to be some slack to take up, since omitting the itertools.product and writing hard-coded bytes is even faster. Maybe you could write your own version of product that directly outputs bytes-like objects. Not sure about that.
Edit: I had a go at a manual itertools.product working directly on a bytearray. It's a bit faster - see "fast binary mode" in the code.
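Another thing that might be worth trying, sketched here without any timing behind it, is to cut the number of `write` calls by joining a batch of words into one bytes object first. The function name `create_table_batched` and the `batch` parameter are invented for this example; whether it beats the versions above would have to be measured.

```python
import itertools
import os
import tempfile

def create_table_batched(path, alphabet, size, batch=100_000):
    """Write all words over `alphabet` (a bytes object), one per line."""
    combos = itertools.product(alphabet, repeat=size)
    with open(path, "wb") as f:
        while True:
            # bytes(t) turns a tuple of ints such as (97, 98) into b"ab".
            chunk = [bytes(t) for t in itertools.islice(combos, batch)]
            if not chunk:
                break
            # One write per batch instead of two writes per word.
            f.write(b"\n".join(chunk) + b"\n")

# Tiny demonstration: the characters "abc", length 2, batches of 4 words.
path = os.path.join(tempfile.mkdtemp(), "output.txt")
create_table_batched(path, bytes(range(97, 100)), 2, batch=4)
```

Only one batch is held in memory at a time, so the batch size trades memory for fewer write calls.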

Related

Faster concatenation of cell arrays of different sizes

I have a cell array of size m x 1 and each cell is again s x t cell array (size varies). I would like to concatenate vertically. The code is as follows:
function cell_out = vert_cat(cell_in)
    [row,col] = cellfun(@size,cell_in,'Uni',0);
    fcn_vert = @(x)([x,repmat({''},size(x,1),max(cell2mat(col))-size(x,2))]);
    cell_out = cellfun(fcn_vert,cell_in,'Uni',0); % Taking up lot of time
    cell_out = vertcat(cell_out{:});
end
Step 3 takes a lot of time. Is this the right way to do it, or is there a faster way to achieve this?
cellfun has been found to be slower than loops (kind of old, but agrees with what I have seen).
In addition, repmat has also been a performance hit in the past (though that may be different now).
Try this two-loop code that aims to accomplish your task:
function cellOut = vert_cat(c)
    nElem = length(c);
    colPad = zeros(nElem,1);
    nRow = zeros(nElem,1);
    for k = 1:nElem
        [nRow(k),colPad(k)] = size(c{k});
    end
    colMax = max(colPad);
    colPad = colMax - colPad;
    cellOut = cell(sum(nRow),colMax);
    bottom = cumsum(nRow) - nRow + 1;
    top = bottom + nRow - 1;
    for k = 1:nElem
        cellOut(bottom(k):top(k),:) = [c{k},cell(nRow(k),colPad(k))];
    end
end
My test for this code was
A = rand(20,20);
A = mat2cell(A,ones(20,1),ones(20,1));
C = arrayfun(@(c) A(1:c,1:c),randi([1,15],1,5),'UniformOutput',false);
ccat = vert_cat(C);
I used this piece of code to generate data:
%generating some dummy data
m=1000;
s=100;
t=100;
cell_in=cell(m,1);
for idx=1:m
cell_in{idx}=cell(randi(s),randi(t));
end
Applying some minor modifications, I was able to speed up the code by a factor of 5:
%Minor modifications of the original code
%use arrays instead of cells for row and col
[row,col] = cellfun(@size,cell_in);
%calculate max(col) once
tcol=max(col);
%use cell instead of repmat to generate an empty cell
fcn_vert = @(x)([x,cell(size(x,1),tcol-size(x,2))]);
cell_out = cellfun(fcn_vert,cell_in,'Uni',0); % Taking up lot of time
cell_out = vertcat(cell_out{:});
Using a simple for loop is even faster, because the data is only moved once:
%new approach. Basic idea: move every data item only once
[row,col] = cellfun(@size,cell_in);
trow=sum(row);
tcol=max(col);
r=1;
cell_out2 = cell(trow,tcol);
for idx=1:numel(cell_in)
    cell_out2(r:r+row(idx)-1,1:col(idx))=cell_in{idx};
    r=r+row(idx);
end
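The copy-once idea is not MATLAB-specific. As a rough illustration in Python (a sketch only: nested lists stand in for cell arrays, `None` stands in for the empty cell, and the name `vert_cat` is reused purely for readability), the output is allocated at its final size and every value is written exactly once:

```python
def vert_cat(blocks, fill=None):
    """Stack 2-D blocks (lists of rows) vertically, padding short rows."""
    total_rows = sum(len(b) for b in blocks)
    max_cols = max((len(row) for b in blocks for row in b), default=0)
    # Preallocate the result at its final size.
    out = [[fill] * max_cols for _ in range(total_rows)]
    r = 0
    for block in blocks:
        for row in block:
            out[r][:len(row)] = row   # each value is copied exactly once
            r += 1
    return out

print(vert_cat([[[1, 2]], [[3], [4]]]))
```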

speeding up some for loops in matlab

Basically I am trying to solve a 2nd order differential equation with the forward euler method. I have some for loops inside my code, which take considerable time to solve and I would like to speed things up a bit. Does anyone have any suggestions how could I do this?
And also when looking at the time it takes, I notice that my end at line 14 takes 45 % of my total time. What is end actually doing and why is it taking so much time?
Here is my simplified code:
t = 0:0.01:100;
dt = t(2)-t(1);
B = 3.5 * t;
F0 = 2 * t;
BB=zeros(1,length(t)); % Preallocation
x = 2; % Initial value
u = 0; % Initial value
for ii = 1:length(t)
for kk = 1:ii
BB(ii) = BB(ii) + B(kk) * u(ii-kk+1)*dt; % This line takes the most time
end % This end takes 45% of the other time
x(ii+1) = x(ii) + dt*u(ii);
u(ii+1) = u(ii) + dt * (F0(ii) - BB(ii));
end
Running the code it takes me 8.552 sec.
You can remove the inner loop, I think:
for ii = 1:length(t)
for kk = 1:ii
BB(ii) = BB(ii) + B(kk) * u(ii-kk+1)*dt; % This line takes the most time
end % This end takes 45% of the other time
x(ii+1) = x(ii) + dt*u(ii);
u(ii+1) = u(ii) + dt * (F0(ii) - BB(ii));
end
So BB(ii) = BB(ii) (zero at initialisation) + the sum for kk = 1 to ii of B(kk)*u(ii-kk+1)*dt,
but kk = 1:ii, so for a given ii, ii-kk+1 → ii-(1:ii)+1 → ii:-1:1.
So I think this is equivalent to:
for ii = 1:length(t)
BB(ii) = sum(B(1:ii).*u(ii:-1:1)*dt);
x(ii+1) = x(ii) + dt*u(ii);
u(ii+1) = u(ii) + dt * (F0(ii) - BB(ii));
end
It doesn't take as long as 8 seconds for me using either method, but the version with only one loop is about 2x as fast (the output of BB appears to be the same).
Is the sum loop of B(kk) * u(ii-kk+1) just conv(B(1:ii),u(1:ii),'same')?
The best way to speed up loops in MATLAB is to try to avoid them. See whether you can perform a matrix operation instead of the inner loop. For example, break the calculation there into small parts, then decide whether there are parts you can compute in advance without knowing the results of the next iteration of the loop.
As to the second part of your question, my guess: the end contains the check for whether the loop runs another round. This check by itself is not slow, but it is called 50,015,001 times!
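On the convolution question: the inner sum is one element of a discrete convolution, namely element ii of the full convolution of B(1:ii) and u(1:ii). A small Python check of that identity follows, with made-up test data and a hand-written `full_conv` helper (both are assumptions for the sketch). As far as I understand, the 'same' option returns only the central part of the result, so it would not necessarily pick out this element. Also, because u(ii+1) depends on BB(ii), the whole convolution cannot simply be computed up front.

```python
def full_conv(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

dt = 0.01
B = [3.5 * k * dt for k in range(50)]   # arbitrary test data
u = [0.1 * k for k in range(50)]        # arbitrary test data

ii = 20   # 0-based index here; the MATLAB code above is 1-based
direct = sum(B[kk] * u[ii - kk] for kk in range(ii + 1)) * dt
via_conv = full_conv(B[:ii + 1], u[:ii + 1])[ii] * dt
print(abs(direct - via_conv) < 1e-9)
```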

How to optimize MATLAB bitwise operations

I have written my own SHA1 implementation in MATLAB, and it gives correct hashes. However, it's very slow (a string of 1000 a's takes 9.9 seconds on my Core i7-2760QM), and I think the slowness is a result of how MATLAB implements bitwise logical operations (bitand, bitor, bitxor, bitcmp) and bitwise shifts (bitshift, bitrol, bitror) of integers.
In particular, I wonder about the need to construct fixed-point numeric objects for bitrol and bitror using the fi command, because in Intel x86 assembly there are rol and ror instructions for both registers and memory addresses of all sizes. However, bitshift is quite fast (it doesn't need any fixed-point numeric constructs; a regular uint64 variable works fine), which makes the situation stranger: why do bitrol and bitror in MATLAB need fixed-point numeric objects constructed with fi, whereas bitshift does not, when at the assembly level it all comes down to shl, shr, rol and ror?
So, before writing this function in C/C++ as a .mex file, I'd be happy to know if there is any way to improve the performance of this function. I know there are some specific optimizations for SHA1, but that's not the issue, if the very basic implementation of bitwise rotations is so slow.
Testing a little bit with tic and toc, it's evident that what makes it slow are the loops with bitrol and fi. There are two such loops:
%# Define some variables.
FFFFFFFF = uint64(hex2dec('FFFFFFFF'));
%# constants: K(1), K(2), K(3), K(4).
K(1) = uint64(hex2dec('5A827999'));
K(2) = uint64(hex2dec('6ED9EBA1'));
K(3) = uint64(hex2dec('8F1BBCDC'));
K(4) = uint64(hex2dec('CA62C1D6'));
W = uint64(zeros(1, 80));
... some other code here ...
%# First slow loop begins here.
for index = 17:80
W(index) = uint64(bitrol(fi(bitxor(bitxor(bitxor(W(index-3), W(index-8)), W(index-14)), W(index-16)), 0, 32, 0), 1));
end
%# First slow loop ends here.
H = sha1_handle_block_struct.H;
A = H(1);
B = H(2);
C = H(3);
D = H(4);
E = H(5);
%# Second slow loop begins here.
for index = 1:80
rotatedA = uint64(bitrol(fi(A, 0, 32, 0), 5));
if (index <= 20)
% alternative #1.
xorPart = bitxor(D, (bitand(B, (bitxor(C, D)))));
xorPart = bitand(xorPart, FFFFFFFF);
temp = rotatedA + xorPart + E + W(index) + K(1);
elseif ((index >= 21) && (index <= 40))
% FIPS.
xorPart = bitxor(bitxor(B, C), D);
xorPart = bitand(xorPart, FFFFFFFF);
temp = rotatedA + xorPart + E + W(index) + K(2);
elseif ((index >= 41) && (index <= 60))
% alternative #2.
xorPart = bitor(bitand(B, C), bitand(D, bitxor(B, C)));
xorPart = bitand(xorPart, FFFFFFFF);
temp = rotatedA + xorPart + E + W(index) + K(3);
elseif ((index >= 61) && (index <= 80))
% FIPS.
xorPart = bitxor(bitxor(B, C), D);
xorPart = bitand(xorPart, FFFFFFFF);
temp = rotatedA + xorPart + E + W(index) + K(4);
else
error('error in the code of sha1_handle_block.m!');
end
temp = bitand(temp, FFFFFFFF);
E = D;
D = C;
C = uint64(bitrol(fi(B, 0, 32, 0), 30));
B = A;
A = temp;
end
%# Second slow loop ends here.
Measuring with tic and toc, the entire computation of SHA1 hash of message abc takes on my laptop around 0.63 seconds, of which around 0.23 seconds is passed in the first slow loop and around 0.38 seconds in the second slow loop. So is there some way to optimize those loops in MATLAB before writing a .mex file?
There's this DataHash from the MATLAB File Exchange that calculates SHA-1 hashes lightning fast.
I ran the following code:
x = 'The quick brown fox jumped over the lazy dog'; %# Just a short sentence
y = repmat('a', [1, 1e6]); %# A million a's
opt = struct('Method', 'SHA-1', 'Format', 'HEX', 'Input', 'bin');
tic, x_hashed = DataHash(uint8(x), opt), toc
tic, y_hashed = DataHash(uint8(y), opt), toc
and got the following results:
x_hashed = F6513640F3045E9768B239785625CAA6A2588842
Elapsed time is 0.029250 seconds.
y_hashed = 34AA973CD4C4DAA4F61EEB2BDBAD27316534016F
Elapsed time is 0.020595 seconds.
I verified the results with a random online SHA-1 tool, and the calculation was indeed correct. Also, the 10^6 a's were hashed ~1.5 times faster than the first sentence.
So how does DataHash do it so fast??? Using the java.security.MessageDigest library, no less!
If you're interested in a fast MATLAB-friendly SHA-1 function, this is the way to go.
However, if this is just an exercise for implementing fast bit-level operations, then MATLAB doesn't really handle them efficiently, and in most cases you'll have to resort to MEX.
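As a cross-check that needs no MATLAB, the same idea of delegating to a library digest can be tried with Python's standard `hashlib` module. This is only an illustration and not part of DataHash; it reproduces the digest reported above for a million a's, which is also a standard SHA-1 test vector.

```python
import hashlib

# SHA-1 of one million 'a' characters, as an upper-case hex string.
digest = hashlib.sha1(b"a" * 10**6).hexdigest().upper()
print(digest)   # 34AA973CD4C4DAA4F61EEB2BDBAD27316534016F
```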
"why in MATLAB bitrol and bitror need fixed-point numeric objects constructed with fi, whereas bitshift does not"
bitrol and bitror are not part of the set of bitwise logic functions that are applicable for uints. They are part of the fixed-point toolbox, which also contains variants of bitand, bitshift etc that apply to fixed-point inputs.
A bitrol could be expressed as two bitshifts, a bitand and a bitor if you want to try using only the uint-functions. That might be even slower though.
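To make the shift-based rotation concrete, here is the formula in Python for 32-bit values, as a sketch; the name `rol32` is made up here. In MATLAB the same expression would be assembled from bitshift, bitand and bitor, and whether that beats bitrol with fi would need to be timed.

```python
MASK32 = 0xFFFFFFFF

def rol32(x, n):
    """Rotate the 32-bit value x left by n bits (0 <= n < 32)."""
    x &= MASK32
    # Left shift moves the low bits up, right shift brings the bits that
    # fell off the top back in at the bottom, and the mask trims to 32 bits.
    return ((x << n) | (x >> (32 - n))) & MASK32

print(hex(rol32(0x12345678, 4)))   # 0x23456781
print(rol32(0x80000000, 1))        # 1
```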
Like most MATLAB functions, bitand, bitor and bitxor are vectorized, so they run a lot faster if you give them vector input rather than calling them in a loop over each element.
Example:
%# create two sets of 10k random numbers
num = 10000;
hex = '0123456789ABCDEF';
A = uint64(hex2dec( hex(randi(16, [num 16])) ));
B = uint64(hex2dec( hex(randi(16, [num 16])) ));
%# compare loop vs. vectorized call
tic
C1 = zeros(size(A), class(A));
for i=1:numel(A)
C1(i) = bitxor(A(i),B(i));
end
toc
tic
C2 = bitxor(A,B);
toc
assert(isequal(C1,C2))
The timing was:
Elapsed time is 0.139034 seconds.
Elapsed time is 0.000960 seconds.
That's two orders of magnitude faster!
The problem is that, as far as I can tell, the SHA-1 computation cannot be well vectorized, so you might not be able to take advantage of such vectorization.
As an experiment, I implemented a pure MATLAB-based function to compute such bit operations:
function num = my_bitops(op,A,B)
%# operation to perform: not, and, or, xor
if ischar(op)
op = str2func(op);
end
%# integer class: uint8, uint16, uint32, uint64
clss = class(A);
depth = str2double(clss(5:end));
%# bit exponents
e = 2.^(depth-1:-1:0);
%# convert to binary
b1 = logical(dec2bin(A,depth)-'0');
if nargin == 3
b2 = logical(dec2bin(B,depth)-'0');
end
%# perform binary operation
if nargin < 3
num = op(b1);
else
num = op(b1,b2);
end
%# convert back to integer
num = sum(bsxfun(@times, cast(num,clss), cast(e,clss)), 2, 'native');
end
Unfortunately, this was even worse in terms of performance:
tic, C1 = bitxor(A,B); toc
tic, C2 = my_bitops('xor',A,B); toc
assert(isequal(C1,C2))
The timing was:
Elapsed time is 0.000984 seconds.
Elapsed time is 0.485692 seconds.
Conclusion: write a MEX function or search the File Exchange to see if someone already did :)

Number crunching in Ruby (optimisation needed)

Ruby may not be the optimal language for this but I'm sort of comfortable working with this in my terminal so that's what I'm going with.
I need to process the numbers from 1 to 666666 and pick out all the numbers that contain a 6 but don't contain 7, 8 or 9. The first number will be 6, the next 16, then 26 and so forth.
Then I needed it printed like this (6=6) (16=6) (26=6) and when I have ranges like 60 to 66 I need it printed like (60 THRU 66=6) (SPSS syntax).
I have this code and it works but it's neither beautiful nor very efficient so how could I optimize it?
(silly code may follow)
class Array
def to_ranges
array = self.compact.uniq.sort
ranges = []
if !array.empty?
# Initialize the left and right endpoints of the range
left, right = array.first, nil
array.each do |obj|
# If the right endpoint is set and obj is not equal to right's successor
# then we need to create a range.
if right && obj != right.succ
ranges << Range.new(left,right)
left = obj
end
right = obj
end
ranges << Range.new(left,right) unless left == right
end
ranges
end
end
write = ""
numbers = (1..666666).to_a
# split each number in an array containing it's ciphers
numbers = numbers.map { |i| i.to_s.split(//) }
# delete the arrays that doesn't contain 6 and the ones that contains 6 but also 8, 7 and 9
numbers = numbers.delete_if { |i| !i.include?('6') }
numbers = numbers.delete_if { |i| i.include?('7') }
numbers = numbers.delete_if { |i| i.include?('8') }
numbers = numbers.delete_if { |i| i.include?('9') }
# join the ciphers back into the original numbers
numbers = numbers.map { |i| i.join }
numbers = numbers.map { |i| i = Integer(i) }
# rangify consecutive numbers
numbers = numbers.to_ranges
# edit the ranges that go from 1..1 into just 1
numbers = numbers.map do |i|
if i.first == i.last
i = i.first
else
i = i
end
end
# string stuff
numbers = numbers.map { |i| i.to_s.gsub(".."," thru ") }
numbers = numbers.map { |i| "(" + i.to_s + "=6)"}
numbers.each { |i| write << " " + i }
File.open('numbers.txt','w') { |f| f.write(write) }
As I said it works for numbers even in the millions - but I'd like some advice on how to make prettier and more efficient.
I deleted my earlier attempt to parlez-vous-ruby? and made up for that. I now have an optimized version of x3ro's excellent example.
$,="\n"
puts ["(0=6)", "(6=6)", *(1.."66666".to_i(7)).collect {|i| i.to_s 7}.collect do |s|
s.include?('6')? "(#{s}0 THRU #{s}6=6)" : "(#{s}6=6)"
end ]
Compared to x3ro's version
... It is down to three lines
... 204.2 x faster (to 66666666)
... has byte-identical output
It uses all my ideas for optimization
gen numbers based on modulo 7 digits (so base-7 numbers)
generate the last digit 'smart': this is what compresses the ranges
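The two ideas can be sketched in Python as follows. This is a transliteration of the approach, not the Ruby code itself, and the names `to_base7`, `spss_lines` and `prefix_digits` are invented here. Every prefix is written in base 7, so it can never contain 7, 8 or 9; the last digit is then either forced to be 6 or free to run from 0 to 6.

```python
def to_base7(n):
    """Base-7 digits of a non-negative integer as a string ('' for 0)."""
    digits = ""
    while n:
        digits = str(n % 7) + digits
        n //= 7
    return digits

def spss_lines(prefix_digits):
    """Yield SPSS items for numbers with up to prefix_digits + 1 digits."""
    for p in range(7 ** prefix_digits):
        s = to_base7(p)                         # prefix without 7, 8 or 9
        if "6" in s:
            yield "(%s0 THRU %s6=6)" % (s, s)   # any last digit 0..6 works
        else:
            yield "(%s6=6)" % s                 # last digit must be the 6

print(" ".join(spss_lines(1)))
# (6=6) (16=6) (26=6) (36=6) (46=6) (56=6) (60 THRU 66=6)
```

Calling `spss_lines(5)` would cover 1 to 666666; because it is a generator, the lines can be written out one at a time without building a large list.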
So... what are the timings? This was testing with 8 digits (to 66666666, or 823544 lines of output):
$ time ./x3ro.rb > /dev/null
real 8m37.749s
user 8m36.700s
sys 0m0.976s
$ time ./my.rb > /dev/null
real 0m2.535s
user 0m2.460s
sys 0m0.072s
Even though the performance is actually good, it isn't even close to the C optimized version I posted before: I couldn't run my.rb to 6666666666 (ten 6s) because of an OutOfMemory error. When running to 9 digits, this is the comparative result:
sehe@meerkat:/tmp$ time ./my.rb > /dev/null
real 0m21.764s
user 0m21.289s
sys 0m0.476s
sehe@meerkat:/tmp$ time ./t2 > /dev/null
real 0m1.424s
user 0m1.408s
sys 0m0.012s
The C version is still some 15x faster... which is only fair considering that it runs on the bare metal.
Hope you enjoyed it, and can I please have your votes if only for learning Ruby for the purpose :)
(Can you tell I'm proud? This is my first encounter with ruby; I started the ruby koans 2 hours ago...)
Edit by @johndouthat:
Very nice! The use of base 7 is very clever and this is a great job for your first Ruby trial :)
Here's a slight modification of your snippet that will let you test 10+ digits without getting an OutOfMemory error:
puts ["(0=6)", "(6=6)"]
(1.."66666666".to_i(7)).each do |i|
s = i.to_s(7)
puts s.include?('6') ? "(#{s}0 THRU #{s}6=6)" : "(#{s}6=6)"
end
# before:
real 0m26.714s
user 0m23.368s
sys 0m2.865s
# after
real 0m15.894s
user 0m13.258s
sys 0m1.724s
Exploiting patterns in the numbers, you can short-circuit lots of the loops, like this:
If you define a prefix as the 100s place and everything before it,
and define the suffix as everything in the 10s and 1s place, then, looping
through each possible prefix:
If the prefix is blank (i.e. you're testing 0-99), then there are 13 possible matches
elsif the prefix contains a 7, 8, or 9, there are no possible matches.
elsif the prefix contains a 6, there are 49 possible matches (a 7x7 grid)
else, there are 13 possible matches. (see the image below)
(the code doesn't yet exclude numbers that aren't specifically in the range, but it's pretty close)
number_range = (1..666_666)
prefix_range = ((number_range.first / 100)..(number_range.last / 100))
for p in prefix_range
ps = p.to_s
# TODO: if p == prefix_range.last or p == prefix_range.first,
# TODO: test to see if number_range.include?("#{ps}6".to_i), etc...
if ps == '0'
puts "(6=6) (16=6) (26=6) (36=6) (46=6) (56=6) (60 thru 66) "
elsif ps =~ /7|8|9/
# there are no candidate suffixes if the prefix contains 7, 8, or 9.
elsif ps =~ /6/
# If the prefix contains a 6, then there are 49 candidate suffixes
for i in (0..6)
print "(#{ps}#{i}0 thru #{ps}#{i}6) "
end
puts
else
# If the prefix doesn't contain 6, 7, 8, or 9, then there are only 13 candidate suffixes.
puts "(#{ps}06=6) (#{ps}16=6) (#{ps}26=6) (#{ps}36=6) (#{ps}46=6) (#{ps}56=6) (#{ps}60 thru #{ps}66) "
end
end
Which prints out the following:
(6=6) (16=6) (26=6) (36=6) (46=6) (56=6) (60 thru 66)
(106=6) (116=6) (126=6) (136=6) (146=6) (156=6) (160 thru 166)
(206=6) (216=6) (226=6) (236=6) (246=6) (256=6) (260 thru 266)
(306=6) (316=6) (326=6) (336=6) (346=6) (356=6) (360 thru 366)
(406=6) (416=6) (426=6) (436=6) (446=6) (456=6) (460 thru 466)
(506=6) (516=6) (526=6) (536=6) (546=6) (556=6) (560 thru 566)
(600 thru 606) (610 thru 616) (620 thru 626) (630 thru 636) (640 thru 646) (650 thru 656) (660 thru 666)
(1006=6) (1016=6) (1026=6) (1036=6) (1046=6) (1056=6) (1060 thru 1066)
(1106=6) (1116=6) (1126=6) (1136=6) (1146=6) (1156=6) (1160 thru 1166)
(1206=6) (1216=6) (1226=6) (1236=6) (1246=6) (1256=6) (1260 thru 1266)
(1306=6) (1316=6) (1326=6) (1336=6) (1346=6) (1356=6) (1360 thru 1366)
(1406=6) (1416=6) (1426=6) (1436=6) (1446=6) (1456=6) (1460 thru 1466)
(1506=6) (1516=6) (1526=6) (1536=6) (1546=6) (1556=6) (1560 thru 1566)
(1600 thru 1606) (1610 thru 1616) (1620 thru 1626) (1630 thru 1636) (1640 thru 1646) (1650 thru 1656) (1660 thru 1666)
etc...
Note: I don't speak Ruby, but I have done a Ruby version later just for speed comparison :)
If you just iterate all numbers from 0 to 117648 (ruby <<< 'print "666666".to_i(7)') and print them in base-7 notation, you'll at least have discarded any numbers containing 7,8,9. This includes the optimization suggestion by MrE, apart from lifting the problem to simple int arithmetic instead of char-sequence manipulations.
All that remains is to check for the presence of at least one 6. This would make the algorithm skip at most 6 items in a row, so I deem it less important (the average number of skippable items on the total range is 40%).
Simple benchmark to 6666666666
(Note that this means outputting 222,009,073 (222M) lines of 6-y numbers)
Staying close to this idea, I wrote this quite highly optimized C code (I don't speak Ruby) to demonstrate the idea. I ran it to 282475248 (which is 6666666666 in base 7), so it was more of a benchmark to measure: 0m26.5s
#include <stdio.h>
static char buf[11];
char* const bufend = buf+10;
char* genbase7(int n)
{
char* it = bufend; int has6 = 0;
do
{
has6 |= 6 == (*--it = n%7);
n/=7;
} while(n);
return has6? it : 0;
}
void asciify(char* rawdigits)
{
do { *rawdigits += '0'; }
while (++rawdigits != bufend);
}
int main()
{
*bufend = 0; // init
long i;
for (i=6; i<=282475248; i++)
{
char* b7 = genbase7(i);
if (b7)
{
asciify(b7);
puts(b7);
}
}
}
I also benchmarked another approach, which unsurprisingly ran in less than half the time because
this version directly manipulates the results in ascii string form, ready for display
this version shortcuts the has6 flag for deeper recursion levels
this version also optimizes the 'twiddling' of the last digit when it is required to be '6'
the code is simply shorter...
Running time: 0m12.8s
#include <stdio.h>
#include <string.h>
inline void recursive_permute2(char* const b, char* const m, char* const e, int has6)
{
if (m<e)
for (*m = '0'; *m<'7'; (*m)++)
recursive_permute2(b, m+1, e, has6 || (*m=='6'));
else
if (has6)
for (*e = '0'; *e<'7'; (*e)++)
puts(b);
else /* optimize for last digit must be 6 */
puts((*e='6', b));
}
inline void recursive_permute(char* const b, char* const e)
{
recursive_permute2(b, b, e-1, 0);
}
int main()
{
char buf[] = "0000000000";
recursive_permute(buf, buf+sizeof(buf)/sizeof(*buf)-1);
}
Benchmarks measured with:
gcc -O4 t6.c -o t6
time ./t6 > /dev/null
$range_start = -1
$range_end = -1
$f = File.open('numbers.txt','w')
def output_number(i)
if $range_end == i-1
$range_end = i
elsif $range_start < $range_end
$f.puts "(#{$range_start} thru #{$range_end})"
$range_start = $range_end = i
else
$f.puts "(#{$range_start}=6)" if $range_start > 0 # no range, print out previous number
$range_start = $range_end = i
end
end
'1'.upto('666') do |n|
next unless n =~ /6/ # keep only numbers that contain 6
next if n =~ /[789]/ # remove numbers that contain 7, 8 or 9
output_number n.to_i
end
if $range_start < $range_end
$f.puts "(#{$range_start} thru #{$range_end})"
end
$f.close
puts "Ruby is beautiful :)"
I came up with this piece of code, which I tried to keep more or less in FP style. It is probably not much more efficient. As has been said, with basic number logic you will be able to increase performance, for example by skipping from 19xx to 2000 directly, but that I will leave up to you :)
def check(n)
n = n.to_s
n.include?('6') and
not n.include?('7') and
not n.include?('8') and
not n.include?('9')
end
def spss(ranges)
ranges.each do |range|
if range.first === range.last
puts "(" + range.first.to_s + "=6)"
else
puts "(" + range.first.to_s + " THRU " + range.last.to_s + "=6)"
end
end
end
range = (1..666666)
range = range.select { |n| check(n) }
range = range.inject([0..0]) do |ranges, n|
temp = ranges.last
if temp.last + 1 === n
ranges.pop
ranges.push(temp.first..n)
else
ranges.push(n..n)
end
end
spss(range)
My first answer was trying to be too clever. Here is a much simpler version:
class MutablePrintingCandidateRange < Struct.new(:first, :last)
def to_s
if self.first == nil and self.last == nil
''
elsif self.first == self.last
"(#{self.first}=6)"
else
"(#{self.first} thru #{self.last})"
end
end
def <<(x)
if self.first == nil and self.last == nil
self.first = self.last = x
elsif self.last == x - 1
self.last = x
else
puts(self) # print the candidates
self.first = self.last = x # reset the range
end
end
end
and how to use it:
numer_range = (1..666_666)
current_range = MutablePrintingCandidateRange.new
for i in numer_range
candidate = i.to_s
if candidate =~ /6/ and candidate !~ /7|8|9/
# number contains a 6, but not a 7, 8, or 9
current_range << i
end
end
puts current_range
Basic observation: If the current number is (say) 1900 you know that you can safely skip up to at least 2000...
(I didn't bother updating my C solution for formatting. Instead I went with x3ro's excellent ruby version and optimized that)
Undeleted:
I am still not sure whether the changed range-notation behaviour isn't actually what the OP wants: this version changes the behaviour of breaking up ranges that are actually contiguous modulo 6; I wouldn't be surprised if the OP actually expected
....
(555536=6)
(555546=6)
(555556 THRU 666666=6)
instead of
....
(666640 THRU 666646=6)
(666650 THRU 666656=6)
(666660 THRU 666666=6)
I'll let the OP decide, and here is the modified version, which runs in about 18% of the time of x3ro's version (3.2s instead of 17.0s when generating up to 6666666 (seven 6s)).
def check(n)
n.to_s(7).include?('6')
end
def spss(ranges)
ranges.each do |range|
if range.first === range.last
puts "(" + range.first.to_s(7) + "=6)"
else
puts "(" + range.first.to_s(7) + " THRU " + range.last.to_s(7) + "=6)"
end
end
end
range = (1..117648)
range = range.select { |n| check(n) }
range = range.inject([0..0]) do |ranges, n|
temp = ranges.last
if temp.last + 1 === n
ranges.pop
ranges.push(temp.first..n)
else
ranges.push(n..n)
end
end
spss(range)
My answer below is not complete, but just to show a path (I might come back and continue the answer):
There are only two cases:
1) All the digits besides the lowest one are either absent or not 6
6, 16, ...
2) At least one digit besides the lowest one includes 6
60--66, 160--166, 600--606, ...
Cases in (1) do not include any continuous numbers because they all have 6 in the lowest digit, and are different from one another. Cases in (2) all appear as continuous ranges where the lowest digit continues from 0 to 6. Any single continuation in (2) is not continuous with another one in (2) or with anything from (1) because a number one less than xxxxx0 will be xxxxy9, and a number one more than xxxxxx6 will be xxxxxx7, and hence be excluded.
Therefore, the question reduces to the following:
3)
Get all strings between "" to "66666" that do not include "6"
For each of them ("xxx"), output the string "(xxx6=6)"
4)
Get all strings between "" to "66666" that include at least one "6"
For each of them ("xxx"), output the string "(xxx0 THRU xxx6=6)"
The killer here is
numbers = (1..666666).to_a
Range supports iterations so you would be better off by going over the whole range and accumulating numbers that include your segments in blocks. When one block is finished and supplanted by another you could write it out.

Airplane scheduling challenge from 2009 ACM-ICPC World Finals

Out of curiosity, I was checking out the problem set to the 2009 ACM International Collegiate Programming Contest. The questions are pretty interesting. They're available at http://cm.baylor.edu/resources/pdf/2009Problems.pdf. I could not come up with an algorithm that solved problem 1, which I will reproduce here. It set off a lively discussion in the office, and we think we're pretty close to an answer, but we'd really appreciate it if somebody could find/work out a full solution (code not required).
I will reproduce problem here for your convenience:
Problem 1
Consider the task of scheduling the airplanes that are landing at an airport. Incoming airplanes report their positions, directions, and speeds, and then the controller has to devise a landing schedule that brings all airplanes safely to the ground. Generally, the more time there is between successive landings, the “safer” a landing schedule is. This extra time gives pilots the opportunity to react to changing weather and other surprises.
Luckily, part of this scheduling task can be automated – this is where you come in. You will be given scenarios of airplane landings. Each airplane has a time window during which it can safely land. You must compute an order for landing all airplanes that respects these time windows. Furthermore, the airplane landings should be stretched out as much as possible so that the minimum time gap between successive landings is as large as possible. For example, if three airplanes land at 10:00am, 10:05am, and 10:15am, then the smallest gap is five minutes, which occurs between the first two airplanes. Not all gaps have to be the same, but the smallest gap should be as large as possible.
Input
The input file contains several test cases consisting of descriptions of landing scenarios. Each test case starts with a line containing a single integer n (2 ≤ n ≤ 8), which is the number of airplanes in the scenario. This is followed by n lines, each containing two integers ai, bi, which give the beginning and end of the closed interval [ai, bi] during which the ith plane can land safely. The numbers ai and bi are specified in minutes and satisfy 0 ≤ ai ≤ bi ≤ 1440.
The input is terminated with a line containing the single integer zero.
Output
For each test case in the input, print its case number (starting with 1) followed by the minimum achievable time gap between successive landings. Print the time split into minutes and seconds, rounded to the closest second. Follow the format of the sample output.
Sample Input
3
0 10
5 15
10 15
2
0 10
10 20
0
Sample Output
Case 1: 7:30
Case 2: 20:00
I'll give a sketch of the algorithm.
First you binary search on the answer (the minimal interval between landings). To do that, for each candidate interval T you must be able to check whether it is achievable. If T is achievable, try a larger value; if it is not, try a smaller one.
To check whether you can achieve T, try all n! orders in which the planes may be landing (8! is small enough for this algo to work in time). For each permutation P1...Pn, you try assigning the times in a greedy manner:
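// a[i], b[i]: window of the i-th plane in the permutation being tested; T: candidate gap.
double land = a[0]; // T can be fractional, so keep the landing time as a double
for (int i = 1; i < n; i++) {
land = max(a[i], land + T);
if (land > b[i]) return "CAN NOT ACHIEVE INTERVAL T";
}
return "CAN ACHIEVE";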
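To make the sketch above concrete, here is a minimal Python version of the same idea: binary search on the gap, with a greedy feasibility check over every landing order. The names `can_achieve`, `max_min_gap` and `format_gap`, the fixed 60 bisection steps, and the round-half-up formatting are my own assumptions for illustration, not part of the original answer or the problem statement.

```python
from itertools import permutations


def can_achieve(windows, gap):
    """Return True if some landing order keeps every gap >= `gap` (in minutes)."""
    for order in permutations(windows):
        land = order[0][0]  # the first plane lands as early as it can
        feasible = True
        for a, b in order[1:]:
            land = max(a, land + gap)  # earliest time that respects the gap
            if land > b:
                feasible = False
                break
        if feasible:
            return True
    return False


def max_min_gap(windows, iterations=60):
    """Binary search for the largest achievable minimum gap, in minutes."""
    lo, hi = 0.0, 1440.0  # a gap of 0 is always achievable; 1440 is the upper bound
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if can_achieve(windows, mid):
            lo = mid  # achievable: try a larger gap
        else:
            hi = mid  # not achievable: try a smaller gap
    return lo


def format_gap(minutes):
    """Format a gap as minutes:seconds, rounded to the closest second."""
    total_seconds = int(minutes * 60 + 0.5)
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"
```

For the two sample cases, `format_gap(max_min_gap([(0, 10), (5, 15), (10, 15)]))` gives `7:30` and `format_gap(max_min_gap([(0, 10), (10, 20)]))` gives `20:00`.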
This optimization problem can be solved by linear programming http://en.wikipedia.org/wiki/Linear_programming
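As a rough illustration of what the linear program could look like: if the landing order is fixed to some permutation π (my notation, not from the answer above), with t_i the landing time of the i-th plane to land and T the minimum gap, one possible formulation is:

```latex
\begin{aligned}
\text{maximize}\quad   & T \\
\text{subject to}\quad & a_{\pi(i)} \le t_i \le b_{\pi(i)}, && i = 1, \dots, n, \\
                       & t_{i+1} - t_i \ge T,               && i = 1, \dots, n-1.
\end{aligned}
```

Choosing the order is not a linear constraint, so under this formulation you would solve one such program per permutation (at most 8! = 40320 of them) and keep the largest T.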
I would do something like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef unsigned int MASK;
#define INPUT_SCALE 60
#define MAX_TIME (1440 * 60 + 1) /* index 1440 * 60 is a valid landing time, so one extra slot is needed */
void readPlaneData(int& endTime, MASK landingMask[MAX_TIME], int index)
{
char buf[128];
fgets(buf, sizeof(buf), stdin); // gets() is unsafe and was removed from the standard
int start, end;
sscanf(buf, "%d %d", &start, &end);
for(int i=start * INPUT_SCALE; i<=end * INPUT_SCALE; i++)
landingMask[i] |= 1 << index;
if(end * INPUT_SCALE > endTime)
endTime = end * INPUT_SCALE;
}
int findNextLandingForPlane(MASK landingMask[MAX_TIME], int start, int index)
{
while(start < MAX_TIME)
{
if(landingMask[start] & (1 << index))
return start;
start++;
}
return -1;
}
bool canLandPlanes(int minTime, MASK landingMask[MAX_TIME], int planeCount)
{
int next = 0;
for(int i=0; i<planeCount; i++)
{
int nextForPlane = findNextLandingForPlane(landingMask, next, i);
if(nextForPlane == -1)
return false;
next = nextForPlane + minTime;
}
return true;
}
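int main(int argc, char* argv[])
{
int caseNumber = 0;
while(true)
{
char buf[128];
// gets() is unsafe and was removed from the standard; also stop on end of input
if(fgets(buf, sizeof(buf), stdin) == NULL)
break;
int count = atoi(buf);
if(count == 0)
break;
MASK landingMask[MAX_TIME];
memset(landingMask, 0, sizeof(landingMask));
int endTime = 0;
for(int i=0; i<count; i++)
readPlaneData(endTime, landingMask, i);
// Try the largest gap first and shrink it one second at a time
while((endTime > 0) && !canLandPlanes(endTime, landingMask, count))
endTime--;
// The expected output prefixes each result with its case number
printf("Case %d: %d:%02d\n", ++caseNumber, endTime / 60, endTime % 60);
}
return 0;
}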
Here's some Ruby code that brute-forces the solution. Note that test_case_one actually fails because I have commented out the code that would make this work with seconds (instead of just whole minutes).
The brute-force strategy is to permute all the sequences in which the planes may land. For each landing sequence, create the product of all possible landing times. This is fine with whole minutes, brutal with seconds.
But of course premature optimization, evil, and all that, so this is a first step:
require 'test/unit'
class SampleTests < Test::Unit::TestCase
def test_case_one
problem = Problem.new
problem.add_plane(Plane.new(0, 10))
problem.add_plane(Plane.new(5, 15))
problem.add_plane(Plane.new(10, 15))
problem.solve()
minimum_gap = problem.minimum_gap()
assert_equal(7.5, minimum_gap)
end
def test_case_two
problem = Problem.new
problem.add_plane(Plane.new(0,10))
problem.add_plane(Plane.new(10, 20))
problem.solve()
minimum_gap = problem.minimum_gap()
assert_equal(20, minimum_gap)
end
def test_case_three
problem = Problem.new
problem.add_plane(Plane.new(0, 2))
problem.add_plane(Plane.new(7, 10))
problem.add_plane(Plane.new(4, 6))
minimum_gap = problem.minimum_gap()
assert_equal(5, minimum_gap)
end
def test_case_four
problem = Problem.new
problem.add_plane(Plane.new(1439, 1440))
problem.add_plane(Plane.new(1439, 1440))
problem.add_plane(Plane.new(1439, 1440))
assert_equal(0, problem.minimum_gap())
end
def test_case_five
problem = Problem.new
problem.add_plane(Plane.new(0, 10))
problem.add_plane(Plane.new(1, 2))
assert_equal(9, problem.minimum_gap())
end
def test_case_six
problem = Problem.new
problem.add_plane(Plane.new(8, 9))
problem.add_plane(Plane.new(0, 10))
assert_equal(9, problem.minimum_gap())
end
end
class Plane
def initialize(min, max)
@ts = Array.new
#This is a cheat to prevent combinatorial explosion. Just ignore 60 seconds in a minute!
#min = min * 60
#max = max * 60
min.upto(max) { | t | @ts << t}
end
#Array of times at which the plane might land.
def times
return @ts
end
end
#from 'permutation' gem
class Array
def permute(prefixed=[])
if (length < 2)
# there are no elements left to permute
yield(prefixed + self)
else
# recursively permute the remaining elements
each_with_index do |e, i|
(self[0,i]+self[(i+1)..-1]).permute(prefixed+[e]) { |a| yield a }
end
end
end
end
class Problem
def initialize
@solved = false
@maximum_gap = 0
@planes = Array.new
end
def add_plane(plane)
@planes << plane
end
#given a particular landing schedule, what's the minimum gap?
#A: Sort schedule and spin through it, looking for the min diff
#Note that this will return 0 for invalid schedules (planes landing simultaneously)
def gap_for(schedule)
schedule.sort!
min_gap = 1440
0.upto(schedule.length - 2) { | i |
gap = schedule[i + 1] - schedule[i]
if gap < min_gap
min_gap = gap
end
}
return min_gap
end
#Brute-force strategy
#Get every possible plane sequence (permute)
#Get every possible schedule for that sequence (brute_force_schedule)
#Check that schedule
def solve
@planes.permute { | sequence |
schedules = brute_force_schedule(sequence)
schedules.each { | schedule |
schedule.flatten!
gap = gap_for(schedule)
if gap > @maximum_gap
#puts "Found a new one: #{schedule.inspect}"
@maximum_gap = gap
end
}
}
#Remember that we are done so minimum_gap does not solve again
@solved = true
end
#The list of all possible schedules associated with an array of planes
def brute_force_schedule(planes)
head = planes[0]
tail = planes[1..-1]
if tail.empty?
#Last element, return the times
return head.times.to_a
else
#Recurse and combine (product)
return head.times.to_a.product(brute_force_schedule(tail))
end
end
def minimum_gap
unless @solved
solve
end
return @maximum_gap
end
end
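For comparison, the same brute-force idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `brute_force_gap` and the `ticks_per_minute` parameter are hypothetical, and the search only considers landing times on a fixed grid, so it finds the true optimum only when that optimum lies on the grid.

```python
from itertools import product


def brute_force_gap(windows, ticks_per_minute=1):
    """Largest minimum gap (in minutes) over landing times on a fixed grid."""
    # Every grid time at which each plane may land, in ticks
    grids = [range(a * ticks_per_minute, b * ticks_per_minute + 1)
             for a, b in windows]
    best = 0
    for schedule in product(*grids):  # one landing time per plane
        times = sorted(schedule)
        # Smallest difference between consecutive landings in this schedule
        gap = min(later - earlier for earlier, later in zip(times, times[1:]))
        best = max(best, gap)
    return best / ticks_per_minute
```

Because each schedule is sorted before its gaps are measured, this version does not need to permute the planes separately. With whole minutes the first sample gives 7.0, and `brute_force_gap([(0, 10), (5, 15), (10, 15)], ticks_per_minute=2)` gives 7.5, at the cost of a search space that grows quickly as the grid gets finer.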

Resources