Use wordlist for CodeIgniter captcha

I have a wordlist of dictionary words in .txt format. How can I use this with the captcha_helper instead of random characters? I've already extended the captcha_helper file but am having issues integrating my wordlist.txt file for use.
After doing some poking around, I found a solution:
// This is the modified section in captcha_helper.php
if ($word == '') {
    // Assumes every line in the words file has the same length,
    // so a random line can be reached by seeking to length * line.
    $wordsfile = '../words.php';
    $fp = fopen($wordsfile, 'r');
    $length = strlen(fgets($fp));
    $line = rand(1, (filesize($wordsfile) / $length) - 2);
    if (fseek($fp, $length * $line) == -1) return FALSE;
    $word = trim(fgets($fp));
    fclose($fp);
}
But I noticed that sometimes the last letter would get cut off. Is there a way to make sure that the first and last letter never get placed outside of the bounding box?

In this case you can use a function that picks one of your words at random and passes it to the script for display. I think this is a better option than working inside the helper file itself.
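For instance, CodeIgniter's create_captcha() accepts a 'word' key in its config array, so you can choose the word in your controller and hand it over. A rough, untested sketch; the wordlist location and image paths are assumptions:

<?php
// Pick a random entry from the wordlist; assumes one word per line.
$words = file(APPPATH . 'data/wordlist.txt',
              FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$word  = $words[array_rand($words)];

$this->load->helper('captcha');
$cap = create_captcha(array(
    'word'     => $word,         // dictionary word instead of random characters
    'img_path' => './captcha/',  // adjust to your setup
    'img_url'  => '/captcha/',
));
echo $cap['image'];

This way the stock helper stays untouched, and the seek arithmetic over a fixed-width words file disappears entirely.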

Related

Search for a specific file name in a specific folder in laravel

Hi everyone. What I basically want to do is search for all files that start with "dm" or end with ".tmp" in storage_path("app/public/session").
I already tried File::allFiles() and File::files(), but what I get is all the files in that session folder, and I can't figure out how to filter them. What I could find here were questions on how to empty a folder, but that's not what I'm looking for. Thanks.
Try this code:
$files = File::allFiles(storage_path("app/public/session"));
$files = array_filter($files, function ($file) {
    return (strpos($file->getFilename(), 'dm') === 0) || (substr($file->getFilename(), -4) === '.tmp');
});
Or you can use the glob function like this:
$files = array_merge(
    glob(storage_path("app/public/session/dm*")),
    glob(storage_path("app/public/session/*.tmp"))
);
In Laravel, you can use the File facade's glob() method to search for files that match a certain pattern. The glob() function searches for all the pathnames matching a specified pattern according to the rules used by the libc glob() function, which is similar to the rules used by common shells.
You can use the glob() method to search for files that start with "dm" or end with ".tmp" in the "app/public/session" directory like this:
use Illuminate\Support\Facades\File;
$storagePath = storage_path("app/public/session");
// Find files that start with "dm"
$files = File::glob("$storagePath/dm*");
// Find files that end with ".tmp"
$files = File::glob("$storagePath/*.tmp");
You can also use the ? and [] wildcard characters to search for more specific patterns: ? matches any single character, and [] matches one character out of the set between the square brackets. For example:
// Find files that start with "dm" and end with ".tmp"
$files = File::glob("$storagePath/dm*.tmp");
Note that the File::glob() method returns an array of matched paths; you can loop over it and inspect the files, or use it according to your needs.
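For instance, combining those wildcards and looping over the result (the patterns below are only illustrative):

use Illuminate\Support\Facades\File;

$storagePath = storage_path("app/public/session");

// "dm" followed by exactly three characters
$files = File::glob("$storagePath/dm???");

// "dm" followed by a single digit, then anything
$files = File::glob("$storagePath/dm[0-9]*");

foreach ($files as $path) {
    echo basename($path), PHP_EOL;  // each entry is a full pathname
}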

decode_utf8 consumes time in execution

The following is the code snippet that I have been working on to improve its speed.
use strict;
use warnings;
use Encode;
open(IN, "<:utf8", $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n"; ## treat it as a huge file of 35,000 lines in Devanagari script
my @in = <IN>;
close(IN);
my $key = "अच्छा"; # key to be matched; a Devanagari string
foreach my $in (@in) {
    chomp($in);
    $key = decode_utf8($key);
    $in = decode_utf8($in);
    if ($key eq $in) {
        print "$key: matched\n";
    }
    else {
        print "Not matched\n";
    }
}
I am trying to match the lines in the file against the key.
Profiling my code gives the following result: decode_utf8 consumes 34% of the time.
Since my data is in UTF-8, I used decode_utf8.
What can I do to improve the speed here? Is there another way to match the Unicode data that avoids decode_utf8?
decode_utf8 consumes 34% of the time.
Well yeah, that's basically all your program does.
More importantly, your code is buggy. You're decoding previously decoded strings!
You decode the contents of the file when you read them (via :utf8), and then you decode the already-decoded content in the loop.
You decode the contents of $key every pass through the loop so that by the fourth pass, you are using decode_utf8(decode_utf8(decode_utf8(decode_utf8($key)))).
Fix:
use utf8;                            # Source code is encoded using UTF-8.
use open ':std', ':encoding(UTF-8)'; # Terminal provides and expects UTF-8. Default for files.
use strict;
use warnings;

my $key = "अच्छा";

my $found = 0;
while (my $line = <>) {
    chomp($line);
    if ($line eq $key) {
        $found = 1;
        last;
    }
}

if ($found) {
    print "Match found\n";
} else {
    print "No match\n";
}
This fixes other issues too:
Encodes outputs (using use open ':std').
Doesn't needlessly use global variables. (Use open my $IN instead of open IN.)
Doesn't needlessly load the entire file into memory.
Doesn't needlessly read the entire file.
Doesn't print Not matched 34,999 times when the key is found.
Avoids :utf8 in favour of :encoding(UTF-8).
Doesn't reinvent <>.
Doesn't hide a die in the middle of a line. (Put a line break before or die.)
Doesn't use "cannot". (Use "can't"!)

perl text mining code can't handle massive amounts of data

I'm doing a large text mining project. I have 100,000 text files. I've extracted two- and three-word phrases from sets of 1,000 documents at a time and have created 100 files. Each file has roughly 8 million lines in this format:
total_references num_docs_referencing_phrase phrase
I want to create an aggregate list of total references and number of docs referencing each phrase by processing the 100 intermediate files. To that end I wrote this program.
#!/usr/bin/perl -w
$| = 1;  # Don't buffer output
use File::Find;

$dir = "/home/sl/phrase-counts";
find(\&processFile, $dir);

for $key ( keys %TOTALREFS ) {
    print "$TOTALREFS{$key} $NUMDOCS{$key} ${key}\n";
}

sub processFile {
    my $file = $_;
    my $fullName = $File::Find::name;
    if ( $fullName =~ /\.txt$/ ) {
        $date = `date`;
        chomp $date;
        print "($date) file: $fullName\n";
        open INFILE, "$fullName" or die "Cannot read ${fullName}";
        while ( <INFILE> ) {
            my $line = $_;
            chomp $line;
            ( $totalRefs, $numDocs, $phrase ) = split (/\s+/, $line, 3);
            $TOTALREFS{$phrase} += $totalRefs;
            $NUMDOCS{$phrase} += $numDocs;
        }
        close ( INFILE );
    }
}
The code produces strange errors after 8 or so files are processed, and then it hangs, i.e. it stops listing the files it should be processing:
Use of uninitialized value $date in scalar chomp at ./getCounts line 21.
Use of uninitialized value $date in concatenation (.) or string at ./getCounts line 22.
I don't believe the problem is really my date command, especially since it runs fine for a number of early files and since the problem does not occur at the same point in the run every time. I assume the problem is that my program is consuming too much system resource and corrupting the state of the running environment. Running top and watching memory use climb to 97% of the machine concerns me, although I notice that the errors and the hang occur before top shows memory running low. And there is some swap on the machine.
My question is, how can I rewrite this program to actually complete its execution? With 8 million lines of data in each of 100 files there could be 800 million lines of output, although I would guess that the total is more likely in the range of 50-100 million lines. I have done some cleanup of the data and could consider more aggressive sanitizing of phrases to cut down on the numbers, but I'd like to understand how I can design this code better.
I've seen articles that tell programmers to put their data into a database. My concern is the time it might take to update a database 100 million times.
Suggestions?
It looks like you're running on a *nix system, so make sort do all the work for you. It knows how to use memory efficiently.
sort -k 3 all_your_input_files*.txt > sorted.txt
Why do this? Because now all lines corresponding to the same phrase appear in a single block within the file, so you can compute totals easily: just write a short Perl script that adds the current line's numbers to the current totals, and writes them out whenever the phrase changes from the previous line (and at the end):
my ($oldPhrase, $totTotalRefs, $totNumDocs) = (undef, 0, 0);
while ( <INFILE> ) {
    my $line = $_;
    chomp $line;
    my ( $totalRefs, $numDocs, $phrase ) = split (/\s+/, $line, 3);
    if (defined($oldPhrase) && $phrase ne $oldPhrase) {
        print "$totTotalRefs $totNumDocs $oldPhrase\n";
        $totTotalRefs = $totNumDocs = 0;
    }
    $totTotalRefs += $totalRefs;
    $totNumDocs += $numDocs;
    $oldPhrase = $phrase;
}
close ( INFILE );
print "$totTotalRefs $totNumDocs $oldPhrase\n";
The above code is untested, but I think it should work with appropriate boilerplate added.
[EDIT: Fix bug in which $oldPhrase never gets set, as suggested by Sol.]
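For reference, here is one way that boilerplate might look: an untested sketch that pipes the data straight from sort(1) instead of writing sorted.txt first (the input glob is an assumption):

#!/usr/bin/perl
use strict;
use warnings;

# Let sort(1) do the heavy lifting and read its output line by line.
open my $in, '-|', 'sort -k 3 all_your_input_files*.txt'
    or die "Can't run sort: $!\n";

my ($oldPhrase, $totTotalRefs, $totNumDocs) = (undef, 0, 0);
while (my $line = <$in>) {
    chomp $line;
    my ($totalRefs, $numDocs, $phrase) = split /\s+/, $line, 3;
    if (defined($oldPhrase) && $phrase ne $oldPhrase) {
        print "$totTotalRefs $totNumDocs $oldPhrase\n";
        $totTotalRefs = $totNumDocs = 0;
    }
    $totTotalRefs += $totalRefs;
    $totNumDocs   += $numDocs;
    $oldPhrase     = $phrase;
}
print "$totTotalRefs $totNumDocs $oldPhrase\n" if defined $oldPhrase;
close $in;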
You are storing all of the different phrases as keys for both %TOTALREFS and %NUMDOCS, so things are at least twice as bad as they need to be.
I suggest you try the following
Add use strict and use warnings (instead of -w) and declare all of your variables properly
Don't use capitals in your variable names. Capital letters are reserved for global identifiers
Don't start 100 subprocesses just to get the time of day. Just use localtime like this
printf "(%s) file: %s\n", scalar localtime, $full_name;
Use find just to generate an array of the files to be processed, so it would look like this
my @files;
find(sub {
    push @files, $File::Find::name if -f and /\.txt$/i;
}, $dir);
Then you can process each file with a simple for loop
for my $file (@files) {
    ...
}
Take two passes through the files: the first generating a hash that relates each phrase to an integer starting at zero, and the second using those integers to index the arrays @total_refs and @num_docs and increment their elements
You may still run out of memory, but those measures will certainly give you a better chance.
Update
Just to be clear, this is how I imagine it would work. I've done this as a single pass, but it may be better to write it as two passes as I described so that you can check your intermediate data.
Note that this isn't tested apart from making sure that it compiles.
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use autodie;

STDOUT->autoflush;

use File::Find;

my $dir = '/home/sl/phrase-counts';

my @files;
find(sub {
    push @files, $File::Find::name if -f and /\.txt$/i;
}, $dir);

my (%phrases, @total_refs, @num_docs);
my $num_phrases = 0;

for my $file (@files) {
    printf "(%s) file: %s\n", scalar localtime, $file;
    open my $in_fh, '<', $file;
    while (<$in_fh>) {
        chomp;
        my ($total_refs, $num_docs, $phrase) = split ' ', $_, 3;
        my $phrase_num = $phrases{$phrase} //= $num_phrases++;
        $total_refs[$phrase_num] += $total_refs;
        $num_docs[$phrase_num] += $num_docs;
    }
}

for my $phrase (keys %phrases) {
    my $phrase_num = $phrases{$phrase};
    printf "%s %s %s\n",
        $total_refs[$phrase_num],
        $num_docs[$phrase_num],
        $phrase;
}
Trying to use more resources than are available causes exceptions for being unable to allocate memory, or results in system calls returning error messages. It doesn't corrupt memory.
In this case, the result of backticks is undef, which means the command could not be executed. That could very well be because you have insufficient memory left. Where did you get the idea that being unable to execute a program is the result of corrupted memory?! Furthermore, you have an error you don't understand, yet you didn't check what error was returned? Backticks set $? (and $! when $? is negative) as per system. Assuming it's a bug in Perl is a very bad assumption to make, especially when the system tells you what error occurred.
Use less memory, either through the use of a more appropriate and/or efficient data structure, or by keeping a portion of the data out of memory (e.g. on disk or in a database).
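On the database worry from the question (100 million updates): batched upserts keep that workable. A rough sketch, assuming DBI with DBD::SQLite is installed and SQLite 3.24+ for ON CONFLICT:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=phrases.db', '', '',
                       { RaiseError => 1, AutoCommit => 0 });
$dbh->do('CREATE TABLE IF NOT EXISTS counts (
              phrase     TEXT PRIMARY KEY,
              total_refs INTEGER,
              num_docs   INTEGER)');

# Upsert: add to the running totals when the phrase already exists.
my $sth = $dbh->prepare(
    'INSERT INTO counts (phrase, total_refs, num_docs) VALUES (?, ?, ?)
     ON CONFLICT(phrase) DO UPDATE SET
         total_refs = total_refs + excluded.total_refs,
         num_docs   = num_docs   + excluded.num_docs');

while (<>) {
    chomp;
    my ($total_refs, $num_docs, $phrase) = split ' ', $_, 3;
    $sth->execute($phrase, $total_refs, $num_docs);
    $dbh->commit unless $. % 100_000;   # commit in batches, not per row
}
$dbh->commit;

The working set then lives on disk, and the per-row cost is dominated by the batched commits rather than 100 million individual transactions.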

how to find substring between given characters in smarty?

I have the following code in PHP, which I am looking to replicate in Smarty:
function get_string_between($string, $start, $end) {
    $string = " " . $string;
    $ini = strpos($string, $start);
    if ($ini == 0) return "";
    $ini += strlen($start);
    $len = strpos($string, $end, $ini) - $ini;
    return substr($string, $ini, $len);
}
$fullstring = '[1]wholesale';
$parsed = get_string_between($fullstring, "[", "]");
$filename = substr(strrchr($fullstring, "]"), 1);
echo $parsed . '/' . $filename;
// Output: 1/wholesale
I want this output, 1/wholesale, in two different variables in Smarty (as done in PHP with $parsed and $filename; let's say the string is [1]wholesale).
You could achieve it this way:
{assign var="var" value="[1]wholesale"}
{$var|substr:($var|strpos:'['+1):($var|strpos:']' - $var|strpos:'['-1)}/{$var|substr:($var|strpos:']'+1)}
Output for this will be:
1/wholesale
However, as you can see, it's rather illegible. You could of course move it to some function/modifier (see the sketch below), but as always I have to say the best option is to prepare the data in PHP and then simply display it in Smarty, without using advanced transformations.
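A sketch of the modifier option, following Smarty's plugin naming convention; the splitbrackets name and file are hypothetical:

<?php
// plugins/modifier.splitbrackets.php -- hypothetical plugin
// Turns "[1]wholesale" into "1/wholesale".
function smarty_modifier_splitbrackets($string)
{
    $open  = strpos($string, '[');
    $close = strpos($string, ']');
    if ($open === false || $close === false || $close < $open) {
        return $string;  // no bracket pair: leave the value untouched
    }
    $parsed   = substr($string, $open + 1, $close - $open - 1);
    $filename = substr($string, $close + 1);
    return $parsed . '/' . $filename;
}

With that in place, the template reduces to {$var|splitbrackets}.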

aggregate totals when key changes in Perl

I have an input file with the following format
ant,1
bat,1
bat,2
cat,4
cat,1
cat,2
dog,4
I need to aggregate col2 for each key (column 1), so the result is:
ant,1
bat,3
cat,7
dog,4
Other considerations:
Assume that the input file is sorted
The input file is pretty large (about 1M rows), so I don't want to use an array and take up memory
Each input line should be processed as we read it, and move to the next line
I need to write the results to an outFile
I need to do this in Perl, but pseudo-code or an algorithm would help just as well
Thanks!
This is what I came up with... I want to see if this can be written better/more elegantly.
open infile, outFile
$prev_line = <infile>;
$print_line = $prev_line;
while (<infile>) {
    $curr_line = $_;
    @prev_cols = split(',', $prev_line);
    @curr_cols = split(',', $curr_line);
    if ( $prev_cols[0] eq $curr_cols[0] ) {
        $prev_cols[1] += $curr_cols[1];
        $print_line = "$prev_cols[0],$prev_cols[1]\n";
        $print_flag = 0;
        $prev_line = $print_line;   # carry the running total forward
    }
    else {
        print outFile "$print_line";
        $print_flag = 1;
        $print_line = $curr_line;
        $prev_line = $curr_line;
    }
}
if ($print_flag == 1) {
    print outFile "$curr_line";
}
else {
    print outFile "$print_line";
}
#!/usr/bin/perl
use warnings;
use strict;
use integer;

my %a;
while (<>) {
    my ($animal, $n) = /^\s*(\S+)\s*,\s*(\S+)/;
    $a{$animal} += $n if defined $n;
}
print "$_,$a{$_}\n" for sort keys %a;
This short code affords you the chance to learn Perl's excellent hash facility, as %a. Hashes are central to Perl. One really cannot write fluent Perl without them.
Observe incidentally that the code exercises Perl's interesting autovivification feature. The first time a particular animal is encountered in the input stream, no count exists, so Perl implicitly assumes a pre-existing count of zero. Thus, the += operator does not fail, even though it seems that it should. It just adds to zero in the first instance.
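A two-line illustration of that behaviour (the values are made up):

my %count;
$count{bat} += 3;        # no entry yet, so Perl adds 3 to an implicit zero
print "$count{bat}\n";   # prints 3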
On the other hand, it may happen that not only the number of data but the number of animals is so large that one would not like to store the hash %a. In this case, one can still calculate totals, provided only that the data are sorted by animal in the input, as they are in your example. In this case, something like the following might suit (though regrettably it is not nearly so neat as the above).
#!/usr/bin/perl
use warnings;
use strict;
use integer;

my $last_animal = undef;
my $total_for_the_last_animal = 0;

sub start_new_animal ($$) {
    my $next_animal = shift;
    my $n = shift;
    print "$last_animal,$total_for_the_last_animal\n"
        if defined $last_animal;
    $last_animal = $next_animal;
    $total_for_the_last_animal = $n;
}

while (<>) {
    my ($animal, $n) = /^\s*(\S+)\s*,\s*(\S+)/;
    if (
        defined($n) && defined($animal) && defined($last_animal)
        && $animal eq $last_animal
    ) { $total_for_the_last_animal += $n; }
    else { start_new_animal $animal, $n; }
}
start_new_animal undef, 0;
Use Perl’s awk mode.
-a
turns on autosplit mode when used with -n or -p. An implicit split command to the @F array is done as the first thing inside the implicit while loop produced by the -n or -p.
perl -ane 'print pop(@F), "\n";'
is equivalent to
while (<>) {
    @F = split(' ');
    print pop(@F), "\n";
}
An alternate delimiter may be specified using -F.
All that’s left for you is to accumulate the sums in a hash and print them.
$ perl -F, -lane '$s{$F[0]} += $F[1];
END { print "$_,$s{$_}" for sort keys %s }' input
Output:
ant,1
bat,3
cat,7
dog,4
It's trivial in Perl. Loop on the file input. Split the input line on comma. For each key in column one, keep a hash to which you add the value in column two. At the end of the file, print the list of hash keys and their values. It can be done in one line, but that would obfuscate the algorithm.
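Spelled out rather than squeezed into one line, that algorithm might look like this sketch:

#!/usr/bin/perl
use strict;
use warnings;

my %sum;
while (<>) {                        # loop on the file input
    chomp;
    my ($key, $value) = split /,/;  # split the line on comma
    $sum{$key} += $value;           # accumulate column two per key
}
print "$_,$sum{$_}\n" for sort keys %sum;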
