Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Introduction
A valid Sudoku grid is filled with numbers 1 to 9, with no number occurring more than once in each sub-block of 9, row or column. Read this article for further details if you're unfamiliar with this popular puzzle.
Challenge
The challenge is to write the shortest program that validates a Sudoku grid that might not be full.
Input will be a string of 9 lines of 9 characters each, representing the grid. An empty cell will be represented by a .. Your output should be Valid if the grid is valid, otherwise output Invalid.
Example
Input
123...789
...456...
456...123
789...456
...123...
564...897
...231...
897...564
...564...
Output
Valid
Input
123456789
987654321
123456789
123456789
987654321
123456789
123456789
987654321
123456789
Output
Invalid
Code Golf Rules
Please post your shortest code in any language that solves this problem. Input and output may be handled via stdin and stdout or by other files of your choice.
Winner will be the shortest solution (by byte count) in a language with an implementation existing prior to the posting of this question. So while you are free to use a language you've just made up in order to submit a 0-byte solution, it won't count, and you'll probably get downvotes.
Golfscript: 56
n%{zip''+9/.{'.'-..&=}%$0=\}:|2*{3/}%|;**"InvV"3/="alid"
C: 165 162 161 160 159
int v[1566],x,y=9,c,b;main(){while(y--)for(x=9;x--+1;)if((c
=getchar()*27)>1242)b|=v[x+c]++|v[y+9+c]++|v[x-x%3+y/3+18+c]
++;puts(b?"Invalid":"Valid");return 0;}
The two newlines are not needed. One char saved by josefx :-) ...
Haskell: 207 230 218 195 172
import List
t=take 3
h=[t,t.drop 3,drop 6]
v[]="V"
v _="Inv"
f s=v[1|v<-[s,transpose s,[g=<<f s|f<-h,g<-h]],g<-map(filter(/='.'))v,g/=nub g]++"alid\n"
main=interact$f.lines
Perl: 168 128
$_=join'',<>;#a=/.../g;print+(/(\d)([^\n]{0,8}|(.{10})*.{9})\1/s
+map"#a[$_,$_+3,$_+6]"=~/(\d).*\1/,0..2,9..11,18..20)?Inv:V,alid
The first regex checks for duplicates that are in the same row and column; the second regex handles duplicates in the "same box".
Further improvement is possible by replacing the \n in the first regex with a literal newline (1 char), or with >= Perl 5.12, replacing [^\n] with \N (3 char)
Earlier, 168 char solution:
Input is from stdin, output is to stderr because it makes things so easy. Linebreaks are optional and not counted.
$_=join'',<>;$m=alid.$/;$n=Inv.$m;/(\d)(\N{0,8}|(.{10})*.{9})\1/s&&
die$n;#a=/.../g;for$i(0,8,17){for$j($i..$i+2){
$_=$a[$j].$a[$j+3].$a[$j+6];/(\d).*\1/&&die$n}}die"V$m"
Python: 230 221 200 185
First the readable version at len=199:
import sys
r=range(9)
g=[raw_input()for _ in r]
s=[[]for _ in r*3]
for i in r:
for j in r:
n=g[i][j]
for x in i,9+j,18+i/3*3+j/3:
<T>if n in s[x]:sys.exit('Invalid')
<T>if n>'.':s[x]+=n
print'Valid'
Since SO doesn't display tab characters, I've used <T> to represent a single tab character.
PS. the same approach minEvilized down to 185 chars:
r=range(9)
g=[raw_input()for _ in r]
s=['']*27
for i in r:
for j in r:
for x in i,9+j,18+i/3*3+j/3:n=g[i][j];s[x]+=n[:n>'.']
print['V','Inv'][any(len(e)>len(set(e))for e in s)]+'alid'
Perl, 153 char
#B contains the 81 elements of the board.
&E tests whether a subset of #B contains any duplicate digits
main loop validates each column, "block", and row of the puzzle
sub E{$V+="#B[#_]"=~/(\d).*\1/}
#B=map/\S/g,<>;
for$d(#b=0..80){
E grep$d==$_%9,#b;
E grep$d==int(($_%9)/3)+3*int$_/27,#b;
E$d*9..$d*9+8}
print$V?Inv:V,alid,$/
Python: 159 158
v=[0]*244
for y in range(9):
for x,c in enumerate(raw_input()):
if c>".":
<T>for k in x,y+9,x-x%3+y//3+18:v[k*9+int(c)]+=1
print["Inv","V"][max(v)<2]+"alid"
<T> is a single tab character
Common Lisp: 266 252
(princ(let((v(make-hash-table))(r "Valid"))(dotimes(y 9)(dotimes(x
10)(let((c(read-char)))(when(>(char-code c)46)(dolist(k(list x(+ 9
y)(+ 18(floor(/ y 3))(- x(mod x 3)))))(when(>(incf(gethash(+(* k
9)(char-code c)-49)v 0))1)(setf r "Invalid")))))))r))
Perl: 186
Input is from stdin, output to stdout, linebreaks in input optional.
#y=map/\S/g,<>;
sub c{(join'',map$y[$_],#$h)=~/(\d).*\1/|c(#_)if$h=pop}
print(('V','Inv')[c map{$x=$_;[$_*9..$_*9+8],[grep$_%9==$x,0..80],[map$_+3*$b[$x],#b=grep$_%9<3,0..20]}0..8],'alid')
(Linebreaks added for "clarity".)
c() is a function that checks the input in #y against a list of lists of position numbers passed as an argument. It returns 0 if all position lists are valid (contain no number more than once) and 1 otherwise, using recursion to check each list. The bottom line builds this list of lists, passes it to c() and uses the result to select the right prefix to output.
One thing that I quite like is that this solution takes advantage of "self-similarity" in the "block" position list in #b (which is redundantly rebuilt many times to avoid having #b=... in a separate statement): the top-left position of the ith block within the entire puzzle can be found by multiplying the ith element in #b by 3.
More spread out:
# Grab input into an array of individual characters, discarding whitespace
#y = map /\S/g, <>;
# Takes a list of position lists.
# Returns 0 if all position lists are valid, 1 otherwise.
sub c {
# Pop the last list into $h, extract the characters at these positions with
# map, and check the result for multiple occurences of
# any digit using a regex. Note | behaves like || here but is shorter ;)
# If the match fails, try again with the remaining list of position lists.
# Because Perl returns the last expression evaluated, if we are at the
# end of the list, the pop will return undef, and this will be passed back
# which is what we want as it evaluates to false.
(join '', map $y[$_], #$h) =~ /(\d).*\1/ | c(#_) if $h = pop
}
# Make a list of position lists with map and pass it to c().
print(('V','Inv')[c map {
$x=$_; # Save the outer "loop" variable
[$_*9..$_*9+8], # Columns
[grep$_%9==$x,0..80], # Rows
[map$_+3*$b[$x],#b=grep$_%9<3,0..20] # Blocks
} 0..8], # Generates 1 column, row and block each time
'alid')
Perl: 202
I'm reading Modern Perl and felt like coding something... (quite a cool book by the way:)
while(<>){$i++;$j=0;for$s(split//){$j++;$l{$i}{$s}++;$c{$j}{$s}++;
$q{(int(($i+2)/3)-1)*3+int(($j+2)/3)}{$s}++}}
$e=V;for$i(1..9){for(1..9){$e=Inv if$l{$i}{$_}>1or$c{$i}{$_}>1or$q{$i}{$_}>1}}
print $e.alid
Count is excluding unnecessary newlines.
This may require Perl 5.12.2.
A bit more readable:
#use feature qw(say);
#use JSON;
#$json = JSON->new->allow_nonref;
while(<>)
{
$i++;
$j=0;
for $s (split //)
{
$j++;
$l{$i}{$s}++;
$c{$j}{$s}++;
$q{(int(($i+2)/3)-1)*3+int(($j+2)/3)}{$s}++;
}
}
#say "lines: ", $json->pretty->encode( \%l );
#say "columns: ", $json->pretty->encode( \%c );
#say "squares: ", $json->pretty->encode( \%q );
$e = V;
for $i (1..9)
{
for (1..9)
{
#say "checking {$i}{$_}: " . $l{$i}{$_} . " / " . $c{$i}{$_} . " / " . $q{$i}{$_};
$e = Inv if $l{$i}{$_} > 1 or $c{$i}{$_} > 1 or $q{$i}{$_} > 1;
}
}
print $e.alid;
Ruby — 176
f=->x{x.any?{|i|(i-[?.]).uniq!}}
a=[*$<].map{|i|i.scan /./}
puts f[a]||f[a.transpose]||f[a.each_slice(3).flat_map{|b|b.transpose.each_slice(3).map &:flatten}]?'Invalid':'Valid'
Lua, 341 bytes
Although I know that Lua isn't the best golfing language, however, considering it's size, I think it's worth posting it ;).
Non-golfed, commented and error-printing version, for extra fun :)
i=io.read("*a"):gsub("\n","") -- Get input, and strip newlines
a={{},{},{}} -- checking array, 1=row, 2=columns, 3=squares
for k=1,3 do for l=1,9 do a[k][l]={0,0,0,0,0,0,0,0,0}end end -- fillup array with 0's (just to have non-nils)
for k=1,81 do -- loop over all numbers
n=tonumber(i:sub(k,k):match'%d') -- get current character, check if it's a digit, and convert to a number
if n then
r={math.floor((k-1)/9)+1,(k-1)%9+1} -- Get row and column number
r[3]=math.floor((r[1]-1)/3)+3*math.floor((r[2]-1)/3)+1 -- Get square number
for l=1,3 do v=a[l][r[l]] -- 1 = row, 2 = column, 3 = square
if v[n] then -- not yet eliminated in this row/column/square
v[n]=nil
else
print("Double "..n.." in "..({"row","column","square"}) [l].." "..r[l]) --error reporting, just for the extra credit :)
q=1 -- Flag indicating invalidity
end
end
end
end
io.write(q and"In"or"","Valid\n")
Golfed version, 341 bytes
f=math.floor p=io.write i=io.read("*a"):gsub("\n","")a={{},{},{}}for k=1,3 do for l=1,9 do a[k][l]={0,0,0,0,0,0,0,0,0}end end for k=1,81 do n=tonumber(i:sub(k,k):match'%d')if n then r={f((k-1)/9)+1,(k-1)%9+1}r[3]=f((r[1]-1)/3)+1+3*f((r[2]-1)/3)for l=1,3 do v=a[l][r[l]]if v[n]then v[n]=nil else q=1 end end end end p(q and"In"or"","Valid\n")
Python: 140
v=[(k,c) for y in range(9) for x,c in enumerate(raw_input()) for k in x,y+9,(x/3,y/3) if c>'.']
print["V","Inv"][len(v)>len(set(v))]+"alid"
ASL: 108
args1["\n"x2I3*x;{;{:=T(T'{:i~{^0}?})}}
{;{;{{,0:e}:;{0:^},u eq}}/`/=}:-C
dc C#;{:|}C&{"Valid"}{"Invalid"}?P
ASL is a Golfscript inspired scripting language I made.
Related
def pick_random_line
chosen_line = nil
File.foreach("id'sForCascade.txt").each_with_index do |line, id|
chosen_line = line if rand < 1.0/(id+1)
end
return chosen_line
end`enter code here
Hey, i'm trying to make that code pick 37 different lines. So how would I do that i'm stuck and confused.
Assuming you don't want the same line to repeat more than once, I would do it in one line like this:
File.read("test.txt").split("\n").shuffle.first(37)
File.read("test.txt") reads the entire file.
split("\n") splits the file to lines based on the \n delimiter (I assume your file is textual and have lines separated by new line character).
shuffle is a very convenient method of Array that shuffles the lines randomly. You can read about it here:
http://docs.ruby-lang.org/en/2.0.0/Array.html#method-i-shuffle
Finally, first(37) gives you the first 37 lines out of the shuffled array. These are guaranteed to be random from the shuffle operation.
You can do something like this:
input_lines = File.foreach("test.txt").map(&:to_s)
output_lines = []
37.times do
output_lines << input_lines.delete_at(rand(input_lines.length))
end
puts output_lines
This will ensure that you aren't grabbing duplicate lines and you don't need to do any fancy checking.
However, if your file is less than 37 lines this may cause a problem, it also assumes that your file exists.
EDIT:
What is happening is the rand call is now changing the range on which it is called based on the size of the input lines. And since you are deleting at an index when you take the line out, the length shrinks and you do not risk duplicating lines.
If you want to save relatively few lines from a large file, reading the entire file into an array (and then randomly selecting lines) could be costly. It might be better to count the number of lines in the file, randomly select line offsets and then save the lines at those offsets to an array. This approach is no more difficult to implement than the former one, but makes the method more robust, even if the files in the current application are not overly large.1
Suppose your filename were given by FName. Here are three ways to count the numbers of lines in the file:
Count lines, literally
cnt = File.foreach(FName).reduce(0) { |c,_| c+1 }
Use $.
File.foreach(FName) {}
cnt = $.
On Unix-family computers, shell-out to the operating system
cnt = %x{wc -l #{FName}}.split.first.to_ii
The third option is very fast.
Random offsets (base 1) for n lines to be saved could be computed as follows:
lines = (1..cnt).to_a.sample(n).sort
Saving the lines at those offsets to an array is straightforward; for example:
File.foreach(FName).with_object([]) do |line,a|
if lines.first == $.
a << line
lines.shift
break a if lines.empty?
end
end
Note that $. #=> 1 after the first line is first line is read, and $. is incremented by 1 after each successive line is read. (Hence base 1 for line offsets.)
1 Moreover, many programmers, not just Rubiests, are repelled by the idea of amassing large numbers of anything and then discarding all but a few.
I've been looking everywhere for an answer to this, and I just can't get it to work.
I have an input file that is read into an array using Perl. The file is a text file containing a table. Perl reads it in as an array, with each element being a full line (including all five columns). This is what the array looks like:
0__len__340 16 324 0 0.0470588235294118
1__len__251 2 249 0 0.00796812749003984
2__len__497 0 497 0 0
3__len__55 7 48 0 0.127272727272727
4__len__171 0 171 0 0
5__len__75 0 75 0 0
6__len__160 75 85 0 0.46875
7__len__285 1 284 0 0.00350877192982456
8__len__94 44 50 0 0.468085106382979
I need to sort this table by the last column in descending order. So my output should be:
6__len__160 75 85 0 0.46875
8__len__94 44 50 0 0.468085106382979
3__len__55 7 48 0 0.127272727272727
0__len__340 16 324 0 0.0470588235294118
1__len__251 2 249 0 0.00796812749003984
7__len__285 1 284 0 0.00350877192982456
2__len__497 0 497 0 0
4__len__171 0 171 0 0
5__len__75 0 75 0 0
I've tried a few approaches, but none have worked. Here's the code I've tried:
#input = <FILENAME>;
#Close the file
close FILENAME;
my #fractions;
my $y = 0;
for (my $x = 1; $x <= $#input; ++$x) {
$fractions[$y] = (split (/\s/, $input[$x]))[4];
++$y;
}
my #sorted = sort {$b <=> $a} #fractions;
my $e = 1;
my $z = 0;
my $f = 0;
my #final;
do {
do {
if ((split (/\s/, $input[$e]))[4] == $sorted[$z]){
$final[$f] = $input[$e];
++$e;
++$f;
} else {
++$e;
}
} until ($e > $#input);
do {
++$z;
} until ($sorted[$z] != $sorted[$z - 1]);
$e = 0;
} until ($z > $#sorted);
for (my $h = 0; $h <= $#final; ++$h) {
print $final[$h] . "\n\n";
}
With this one, I basically tried to put the column 5 numbers into their own array, sort them, and then go back through the original array and pull out the elements that match the sorted array, and put them into the final array.
This may work if I keep working on it, but it takes so long to run that it's impractical. This small table I'm using to test my code with took a long time for this to run, and once the code is working it will be dealing with a table that has millions of rows.
I also tried applying the sort command to the table itself, but my output is the exact same table as my input...it doesn't get sorted.
#input = <FILENAME>;
close FILENAME;
my #sorted = sort { $b->[4] <=> $a->[4] } #input;
for (my $h = 0; $h <= $#sorted; ++$h) {
print $sorted[$h] . "\n\n";
}
exit;
Lastly, I tried to put the array into a hash where the key was the first four columns, since the first column name is unique, and the values being the fifth column.
Then I hoped I could sort the hash by the values and the keys would stay with their assigned values. I couldn't get this to work either, though unfortunately it was a couple days ago and I erased the code.
One problem was that I couldn't figure out how to split the string only before the fifth column, so I end up with two strings, one containing the first four columns and one containing the fifth.
What am I doing wrong with the sort command? Is there a better way to do this?
In your last code example you can replace
my #sorted = sort { $b->[4] <=> $a->[4] } #input;
with
my #sorted = sort { (split(' ', $b))[4] <=> (split(' ', $a))[4] } #input;
or even
my #sorted = sort { (split(/\s+/, $b))[4] <=> (split(/\s+/, $a))[4] } #input;
if input data has no lines with leading spaces.
In case this helps folks dropping by in the future - here are some inelegant attempts to sort() the content of lines.txt (data from question), by its fifth column, with a Perl one-liner. This should work:
perl -E 'say "#$_" for sort {$a->[4] <=> $b->[4]} map {[(split)]} <>' file
This is more or less the same thing but with the split "automated" with the autosplit (-a) switch which creates the #F array:
perl -anE 'push #t,[#F]}{say "#$_" for sort {$a->[4] <=> $b->[4]} #t' file
If the split pattern is not white space, you can substitute it for the default (\s+) shown here:
perl -E 'say sort {(split(/\s+/,$a))[4] <=> (split(/\s+/,$b))[4]} <>' file
This is the shortest way to sort and print the fifth column:
perl -E 'say for sort map{ (split)[4] } <>' file
Transforming the sort
Can we map, split and sort in one pass? This is a short way to sort the fifth column:
perl -E 'say for sort map{ [(split)[4], $_]->[0] } <>' file
Dissecting this last example: perl first maps the STDIN to split() - making a list; takes the fifth element (i.e. [4]) of this split() list and wraps that list item and the whole line that was just read ($_) inside an array constructor []; then takes the first element of that anonymous array (i.e. the fifth column of each line) and passes it to sort(). Phew!
This just prints the fifth column since we only passed the first element ->[0] of the anonymous array to sort. To print the whole line sorted by the column in this way we need to pass the whole anonymous array to sort and tell sort to use the element which holds the column's contents to do its work, and then pass the other element of the anonymous array (the one that holds the entire line) to print (or say) - this way we can sort by the fifth column, but print out the whole line:
perl -E 'say $_->[1] for sort{$a->[0] <=> $b->[0]} map{[(split)[4], $_]} <>' file
This is just like our very first example above. If, instead of running through the list that is created using for, we map the second element and pass it to print we get:
perl -E 'say map $_->[1], sort{$a->[0] <=> $b->[0]} map{[(split)[4],$_]} <>' file
We have reinvented the Schwartzian transform which is such a great Perl idiom that it is "built in" to Perl 6 ;-)
To get a sense of how this works you can "visualize" things with Data::Printer:
perl -MDDP -e '#t = sort map{ [ (split /\s+/)[4], $_ ] } <> ; p #t' file
Learn more about Perl idioms from the Perl Idioms Explained posts at perlmonks.org and the Perl Beyond Syntax chapter of Modern Perl.
You might also like the nsort_by function from List::UtilsBy:
use List::UtilsBy 'rev_nsort_by';
my #sorted = rev_nsort_by { (split(' ', $_))[4] } #input;
Reading this page on TLDP: http://www.tldp.org/LDP/abs/html/mathc.html
I've found the following command:
bash$ echo "16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D68736142snlbxq" | dc
Bash
And I was just wondering: someone knows what type of hash is that and how I can convert a simple text to generate that kind of hash?
thanks
Relatively good tutorial is here. Using it, you'll get the next decomposition:
16i - the input will be hexadecimal (like in bc - ibase=16)
[q] - the [..] mean string. So this store a string q to the top of stack. Usually the [....] is a series of commands stored for later execution (like a macro), in this case the command q - quit.
the s - take the top of stack and store it to register identified by the next letter, so the [q]sa together in this case mean: store the the string q into register a
same with the next, store the string ln0=aln100%Pln100/snlbx into register b
the next A0D68736142sn, store the A0D68736142 into register n. In this case not a string, but because we have input hexaadecimal, this is an hexadecimal number
The above is an like a preparation phase, translated into common programming language:
$a = "q";
$b = "ln0=aln100%Pln100/snlbx";
$n = 0xA0D68736142";
The stack is now empty (everything is stored in registres).
The next is the final calculation:
the l mean take the the the value from the followed register, and put it to the top of stack so the lb mean: take the value of register b.
the x mean: execute it as an series of commands - in this case read it as eval "$b";
the last q mean quit
now need decompose the ln0=aln100%Pln100/snlbx (content of register b - what is eval-ing.)
ln - as above, take the value from the register n in our case: the hexa-number
next is a tricky part: 0=a, take the top of stack, and if it is zero execute the string (macro) stored in the registry "a" (in our case this is a "q" - quit), e.g. like:
if( $n == 0 ) eval "a"; #a contains "q" = quit
if the $n isn't zero, take the content of regsiter n again (ln) (put to the top of stack)
put the number 100 (it is hexa) to to top of stack
% modulo the two numbers from the top of the stack (so: $n % 0x100) and put it to the top of the stack
print the top of stack as an ascii string ( chr($stack) )
ln take again the number $n (again, it comes to the top of stack)
put 100 to the stack
/ divide the to numbers in the stack (result comes into the top)
sn store the stack into the $n (replaces the original number)
lbx - as above - eval the register b
So, in short:
from the number 0xA0D68736142 stored in $n,
modulo it by 0x100 get the last two digits, convert it to ascii and print it,
divide it by 0x100 (removes last two digits) - and store it to $n
repeat until nothing remained (zero)
For the 0xA0D68736142 you will get (use man ascii)
42 - B
61 - a
73 - s
68 - h
0D - \r
A (as 0A) - \n
Creating such hexsting from normal ascii string with perl
perl -E '$s=reverse("Bash\r\n");$s =~ s/(.)/sprintf("%02x",ord($1))/seg;say uc $s'
prints
0A0D68736142
or
perl -E '$s=reverse("Stackoverflow rocks\n");$s =~ s/(.)/sprintf("%02x",ord($1))/seg;say uc $s'
0A736B636F7220776F6C667265766F6B63617453
and
echo "16i[q]sa[ln0=aln100%Pln100/snlbx]sb0A736B636F7220776F6C667265766F6B63617453snlbxq" | dc
prints:
Stackoverflow rocks
Ps: with dc is one strange thing - it is much-much easier to read and understand a "program" as creating one. :)
Say I have a file blah.rb which is constantly written to somehow and has patterns like :
bagtagrag" " hellobello " blah0 blah1 " trag kljesgjpgeagiafw blah2 " gneo" whatttjtjtbvnblah3
Basically, it's garbage. But I want to check for the blah that keeps on coming up and find the latest value i.e. number in front of the blah.
Hence, something like :
grep "blah"{$1} | tail var/test/log
My file is at location var/test/log and as you can see, I need to get the number in front of the blah.
def get_last_blah("filename")
// Code to get the number after the last blah in the less of the filename
end
def display_the_last_blah()
puts get_last_blah("var/test/log")
end
Now, I could just keep on reading the file and performing something akin to string pattern search on the entire file again and again. Obtaining the last value, I can then get the number. But what if I only want to look at the added text in the less and not the entire text.
Moreover, is there a quick one-liner or smart command to get this?
Use IO.open to read the file and Enumerable#grep to search the desired text using a regular expression like the following code does:
def get_last_blah(filename)
open(filename) { |f| f.grep(/.*blah(\d).*$/){$1}.last.to_i }
end
puts get_last_blah('var/test/log')
# => 3
The method return the number in from of the last "blah" word of the file. It is reading the entire file but the result is the same as if is done with tail.
If you want to use a proper tail, take a look at the File::Tail gem.
I presume you wish to avoid reading the entire file each time; rather, you want to start at the end and work backward until you find the last string of interest. Here's a way to do that.
Code
BLOCK_SIZE = 30
MAX_BLAH_NBR = 123
def doit(fname, blah_text)
#f = File.new(fname)
#blah_text = blah_text
#chars_to_read = BLOCK_SIZE + #blah_text.size + MAX_BLAH_NBR.to_s.size
ptr = #f.size
block_size = BLOCK_SIZE
loop do
return nil if ptr.zero?
ptr -= block_size
if ptr < 0
block_size += ptr
ptr = 0
end
blah_nbr = read_block(ptr)
(f.close; return blah_nbr.to_i) if blah_nbr
end
end
def read_block(ptr)
#f.seek(ptr)
#f.read(#chars_to_read)[/.*#{#blah_text}(\d+)/,1]
end
Demo
Let's first write something interesting to a file.
MY_FILE = 'my_file.txt'
text =<<_
Now is the time
for all blah2 to
come to the aid of
their blah3, blah4 enemy or
perhaps do blagh5 something
else like wash the dishes.
_
File.write(MY_FILE, text)
Now run the program:
p doit(MY_FILE, "blah") #=> 4
We expected it to return 4 and it did.
Explanation
doit first instructs read_block to read up to 37 characters, beginning BLOCK_SIZE (30) characters from the end of the file. That's at the beginning of the string
"ng\nelse like wash the dishes.\n"
which is 30 characters long. (I'll explain the "37" in a moment.) read_block finds no text matching the regex (like "blah3"), so returns nil.
As nil was returned, doit makes the same request of read_block, but this time starting BLOCK_SIZE characters closer to the beginning of the file. This time read_block reads the 37 character string:
"y or\nperhaps do blagh5 something\nelse"
but, again, does not match the regex, so returns nil to doit. Notice that it read the seven characters, "ng\nelse", that it read previously. This overlap is necessary in case one 30-character block ended, "...bla" and the next one began "h3...". Hence the need to read more characters (here 37) than the block size.
read_block next reads the string:
"aid of\ntheir blah3, blah4 enemy or\npe"
and finds that "blah4" matches the regex (not "blah3", because the regex is being "greedy" with .*), so it returns "4" to doit, which converts that to the number 4, which it returns.
doit would return nil if the regex did not match any text in the file.
A word is an anagram if the letters in that word can be re-arranged to form a different word.
Task:
The shortest source code by character count to find all sets of anagrams given a word list.
Spaces and new lines should be counted as characters
Use the code ruler
---------10--------20--------30--------40--------50--------60--------70--------80--------90--------100-------110-------120
Input:
a list of words from stdin with each word separated by a new line.
e.g.
A
A's
AOL
AOL's
Aachen
Aachen's
Aaliyah
Aaliyah's
Aaron
Aaron's
Abbas
Abbasid
Abbasid's
Output:
All sets of anagrams, with each set separated by a separate line.
Example run:
./anagram < words
marcos caroms macros
lump's plum's
dewar's wader's
postman tampons
dent tend
macho mocha
stoker's stroke's
hops posh shop
chasity scythia
...
I have a 149 char perl solution which I'll post as soon as a few more people post :)
Have fun!
EDIT: Clarifications
Assume anagrams are case insensitive (i.e. upper and lower case letters are equivalent)
Only sets with more than 1 item should be printed
Each set of anagrams should only be printed once
Each word in an anagram set should only occur once
EDIT2: More Clarifications
If two words differ only in capitalization, they should be collapsed into the same word, and it's up to you to decide which capitalization scheme to use for the collapsed word
sets of words only have to end in a new line, as long as each word is separated in some way, e.g. comma separated, or space separated is valid. I understand some languages have quick array printing methods built in so this should allow you to take advantage of that if it doesn't output space separated arrays.
Powershell, 104 97 91 86 83 chars
$k=#{};$input|%{$k["$([char[]]$_|%{$_+0}|sort)"]+=#($_)}
$k.Values|?{$_[1]}|%{"$_"}
Update for the new requirement (+8 chars):
To exclude the words that only differ in capitalization, we could just remove the duplicates (case-insensitvely) from the input list, i.e. $input|sort -u where -u stands for -unique. sort is case-insenstive by default:
$k=#{};$input|sort -u|%{$k["$([char[]]$_|%{$_+0}|sort)"]+=#($_)}
$k.Values|?{$_[1]}|%{"$_"}
Explanation of the [char[]]$_|%{$_+0}|sort -part
It's a key for the hashtable entry under which anagrams of a word are stored. My initial solution was: $_.ToLower().ToCharArray()|sort. Then I discovered I didn't need ToLower() for the key, as hashtable lookups are case-insensitive.
[char[]]$_|sort would be ideal, but sorting of the chars for the key needs to be case-insensitive (otherwise Cab and abc would be stored under different keys). Unfortunately, sort is not case-insenstive for chars (only for strings).
What we need is [string[]][char[]]$_|sort, but I found a shorter way of converting each char to string, which is to concat something else to it, in this case an integer 0, hence [char[]]$_|%{$_+0}|sort. This doesn't affect the sorting order, and the actual key ends up being something like: d0 o0 r0 w0. It's not pretty, but it does the job :)
Perl, 59 characters
chop,$_{join'',sort split//,lc}.="$_ "for<>;/ ./&&say for%_
Note that this requires Perl 5.10 (for the say function).
Haskell, 147 chars
prior sizes: 150 159 chars
import Char
import List
x=sort.map toLower
g&a=g(x a).x
main=interact$unlines.map unwords.filter((>1).length).groupBy((==)&).sortBy(compare&).lines
This version, at 165 chars satisifies the new, clarified rules:
import Char
import List
y=map toLower
x=sort.y
g&f=(.f).g.f
w[_]="";w a=show a++"\n"
main=interact$concatMap(w.nubBy((==)&y)).groupBy((==)&x).sortBy(compare&x).lines
This version handles:
Words in the input that differ only by case should only count as one word
The output needs to be one anagram set per line, but extra punctuation is acceptable
Ruby, 94 characters
h={};(h[$_.upcase.bytes.sort]||=[])<<$_ while gets&&chomp;h.each{|k,v|puts v.join' 'if v.at 1}
Python, 167 characters, includes I/O
import sys
d={}
for l in sys.stdin.readlines():
l=l[:-1]
k=''.join(sorted(l)).lower()
d[k]=d.pop(k,[])+[l]
for k in d:
if len(d[k])>1: print(' '.join(d[k]))
Without the input code (i.e. if we assume the wordlist already in a list w), it's only 134 characters:
d={}
for l in w:
l=l[:-1]
k=''.join(lower(sorted(l)))
d[k]=d.pop(k,[])+[l]
for k in d:
if len(d[k])>1: print(' '.join(d[k]))
AWK - 119
{split(toupper($1),a,"");asort(a);s="";for(i=1;a[i];)s=a[i++]s;x[s]=x[s]$1" "}
END{for(i in x)if(x[i]~/ .* /)print x[i]}
AWK does not have a join function like Python, or it could have been shorter...
It assumes uppercase and lowercase as different.
C++, 542 chars
#include <iostream>
#include <map>
#include <vector>
#include <boost/algorithm/string.hpp>
#define ci const_iterator
int main(){using namespace std;typedef string s;typedef vector<s> vs;vs l;
copy(istream_iterator<s>(cin),istream_iterator<s>(),back_inserter(l));map<s, vs> r;
for (vs::ci i=l.begin(),e=l.end();i!=e;++i){s a=boost::to_lower_copy(*i);
sort(a.begin(),a.end());r[a].push_back(*i);}for (map<s,vs>::ci i=r.begin(),e=r.end();
i!=e;++i)if(i->second.size()>1)*copy(i->second.begin(),i->second.end(),
ostream_iterator<s>(cout," "))="\n";}
Python, O(n^2)
import sys;
words=sys.stdin.readlines()
def s(x):return sorted(x.lower());
print '\n'.join([''.join([a.replace('\n',' ') for a in words if(s(a)==s(w))]) for w in words])