I have difficulty understanding when and why the value held by a pushed Scalar container is affected after the push. I'll try to illustrate the issue that I ran into in a more complicated context in two stylized examples.
*Example 1* In the first example, a scalar $i is pushed onto an array @b as part of a List. After the push, the value held by the scalar is explicitly updated in later iterations of the for loop using the $i++ instruction. These updates affect the value in the array @b: at the end of the for loop, @b[0;0] is equal to 3, and no longer to 2.
my @b;
my $i=0;
for 1..3 -> $x {
    $i++;
    say 'Loose var $i: ', $i.VAR.WHICH, " ", $i.VAR.WHERE;
    if $x == 2 {
        @b.push(($i,1));
        say 'Pushed $i : ', @b[0;0].VAR.WHICH, " ", @b[0;0].VAR.WHERE;
    }
}
say "Post for-loop";
say "Array : ", @b;
say 'Pushed $i : ', @b[0;0].VAR.WHICH, " ", @b[0;0].VAR.WHERE;
Output example 1:
Loose var $i: Scalar|94884317665520 139900170768608
Loose var $i: Scalar|94884317665520 139900170768648
Pushed $i : Scalar|94884317665520 139900170768648
Loose var $i: Scalar|94884317665520 139900170768688
Post for-loop
Array : [(3 1)]
Pushed $i : Scalar|94884317665520 139900170768688
*Example 2* In the second example, the scalar $i is the loop variable. Even though $i is updated after it has been pushed (now implicitly rather than explicitly), the value of $i in array @c does not change after the push; i.e. after the for loop, it is still 2, not 3.
my @c;
for 1..3 -> $i {
    say 'Loose var $i: ', $i.VAR.WHICH, " ", $i.VAR.WHERE;
    if $i == 2 {
        @c.push(($i,1));
        say 'Pushed $i : ', @c[0;0].VAR.WHICH, " ", @c[0;0].VAR.WHERE;
    }
}
say "Post for-loop";
say "Array : ", @c;
say 'Pushed $i : ', @c[0;0].VAR.WHICH, " ", @c[0;0].VAR.WHERE;
Output example 2:
Loose var $i: Scalar|94289037186864 139683885277408
Loose var $i: Scalar|94289037186864 139683885277448
Pushed $i : Scalar|94289037186864 139683885277448
Loose var $i: Scalar|94289037186864 139683885277488
Post for-loop
Array : [(2 1)]
Pushed $i : Scalar|94289037186864 139683885277448
Question: Why is $i in @b in example 1 updated after the push, while $i in @c in example 2 is not?
edit:
Following @timotimo's comment, I included the output of .WHERE in the examples. This shows that the (WHICH/logical) scalar identity of $i stays the same, while its memory address changes through the various loop iterations. But it does not explain why in example 2 the pushed scalar remains tied to the same WHICH identity in combination with an old address (...448).
A scalar value is just a container. You can think of them as a kind of smart pointer, rather than a primitive value.
If you do an assignment
$foo = "something"; #or
$bar++;
you are changing the scalar's value; the container stays the same.
Consider
my @b;
my $i=0;
for 1..5 -> $x {
    $i++;
    @b.push(($i<>,1)); # decontainerize $i and use the bare value
}
say @b;
and
my @b;
my $i=0;
for 1..5 -> $x {
    $i := $i + 1; # replaces the container with value / change value
    @b.push(($i,1));
}
say @b;
Both of these work as expected. But in both cases, the thing in the list is no longer mutable, because there is no container.
@b[4;0] = 99;
will therefore die. So just use the loop variable then, right?
No.
for 1..5 -> $x {
    @b.push(($x,1));
}
@b[4;0] = 99; # dies
even if we iterate over a list of mutable things.
my $one = 1;
my $two = 2;
my $three = 3;
my $four = 4;
my $five = 5;
for ($one, $two, $three, $four, $five) -> $x {
    @b.push(($x,1));
}
@b[4;0] = 99; # dies
So there is no aliasing happening here, instead the loop variable is always the same container and gets values assigned that come from the other containers.
We can do this though.
for ($one, $two, $three, $four, $five) <-> $x {
    @b.push(($x,1));
}
@b[4;0] = 99; # works
for ($one, $two, $three, $four, $five) -> $x is rw {
    @b.push(($x,1));
}
@b[4;0] = 99; # works too
A way to make "the thing" mutable is using an intermediate variable.
for 1..5 -> $x {
    my $j = $x;
    @b.push(($j,1)); # a new container
}
@b[4;0] = 99;
works fine. Or shorter and more in the original context
my @b;
my $i=0;
for 1..5 -> $x {
    $i++;
    @b.push((my $ = $i, 1)); # a new anonymous container
}
@b[4;0] = 99;
say @b; # [(1 1) (2 1) (3 1) (4 1) (99 1)]
See also:
https://perl6advent.wordpress.com/2017/12/02/#theoneandonly
https://docs.perl6.org/language/containers
After playing with and thinking about my above question for some time, I'll wager an answer... It's pure conjecture on my part, so please feel free to say it's nonsense if it is, and, if you happen to know, why...
In the first example, $i is defined outside of the lexical scope of the for loop. Consequently, $i exists independently of the loop and its iterations. When $i is referenced from inside the loop, there is only one $i that can be affected. It is this $i that gets pushed into @b, and has its contents modified afterwards in the loop.
In the second example, $i is defined inside the lexical scope of the for loop. As @timotimo pointed out, the pointed block gets called for each iteration, like a subroutine; $i is therefore freshly declared for each iteration, and scoped to the respective block. When $i is referenced inside the loop, the reference is to the block-iteration-specific $i, which would normally cease to exist when the respective loop iteration ends. But because at some point $i is pushed to @c, the block-iteration-specific $i holding value 2 cannot be reclaimed by the garbage collector after the iteration ends. It stays in existence, but is still different from $i in later iterations.
How should the check be made so that there are no duplicate lines in the file?
open ( FILE, ">newfile");
for( $a = 1; $a < 20; $a = $a + 1 ) {
my $random_number = 1+ int rand(10);;
# check to avoid inserting the line if the line is already present in the file
print FILE "Random number is $random_number \n";
}
close(FILE);
!$seen{$_}++ is a common idiom for identifying duplicates.
use feature 'say'; # say is not enabled by default
my %seen;
for (1..19) {
    my $random_number = 1 + int rand(10);
    say "Random number is $random_number" if !$seen{$random_number}++;
}
But that doesn't guarantee that you will get all numbers from 1 to 10 in random order. If that's what you are trying to achieve, the following is a far better solution:
use List::Util qw( shuffle );
say "Random number is $_" for shuffle 1..10;
It seems like what you are asking is how to randomize the order of the numbers 1 to 20. I.e. no duplicates, random order. That can be easily done with a Schwartzian transform. For example:
perl -le'print for map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [$_, rand()] } 1..20'
6
7
16
14
5
20
3
13
19
17
4
8
15
10
9
11
18
1
2
12
In this case, reading from the end and backwards, we create a list of numbers 1 .. 20, we feed that into a map statement which turns each number into an array-ref, containing the number, and a random number. Then we feed that list of array refs to a sort, where we sort numerically on the second argument in the array ref: the random number (hence creating a random order). Then we transform the array ref back into a simple number with another map statement. Finally we print the list using a for loop.
So in your case, the code would look something like:
print "Random number is: $_\n" for # print each number
map { $_->[0] } # restore to a number
sort { $a->[1] <=> $b->[1] } # sort the list on the random number
map { [ $_, rand() ] } # create array ref with random number as index
1 .. 20; # create list of numbers to randomize order of
Then you can use the program like below to redirect output to a file:
$ perl numbers.pl > newfile.txt
Enter each line into a hash as well, which makes it easy and efficient to check for it later:
use warnings;
use strict;
use feature 'say';
my $filename = shift or die "Usage: $0 filename\n";
open my $fh, '>', $filename or die "Can't open $filename: $!";
my %existing_lines;
for my $i (1..19)
{
my $random_number = 1 + int rand(10);
# Check to avoid inserting the line if it is already in the file
if (not exists $existing_lines{$random_number}) {
say $fh "Random number is $random_number";
$existing_lines{$random_number} = 1;
}
}
close $fh;
This assumes that the intent in the question is to not repeat that number (symbolizing content to be stored without repetition).
But if it is indeed the whole line (sentence) to be avoided, where that random number is used merely to make each line different, then use the whole line for the key
for my $i (1..19)
{
my $random_number = 1 + int rand(10);
my $line = "Random number is $random_number";
# Check to avoid inserting the line if it is already in the file
if (not exists $existing_lines{$line}) {
say $fh $line;
$existing_lines{$line} = 1;
}
}
Notes and literature
Lexical filehandles (my $fh) are much better than globs (FILE), and the three-argument open is better. See the guide perlopentut and the reference for open.
Always check the open call (the or die... above). It can and does fail, quietly. In that check, always print the error for which it failed, $!.
The C-style for loop is very rarely needed, while the usual foreach (with synonym for) is much nicer to use; see it in perlsyn. The .. is the range operator.
Always declare variables with my, and enforce that with the strict pragma; always use warnings.
If the filehandle refers to a pipe-open (not the case here), always check its close.
See perlintro for a general overview and for hashes; for more about Perl's data types see perldata. Keep in mind for later the notion of complex data structures, perldsc.
return false will do the trick.
Because you cannot generate 20 distinct numbers in the range [1, 10].
I have a list in bash that can have repeated values in it. I would like to remove duplicates and get a list with only the unique values in it. Order must be preserved and the last occurrence of the unique values is the one I wish to keep.
For example, if I have this list:
A=( D B A C D )
I'm looking for this:
result=( B A C D )
I've seen solutions for this when the data is a list in a file, but I'd prefer to keep the list in-memory without jumping through any hoops.
I think I can use an associative array and loop through the list adding the entries as keys in the array and then just dump the keys into the unique list but I'm not an expert with associative arrays across platforms -- do they sort themselves on key value sort of like a lot of C++ STL containers do or do they preserve the order of insertion regardless of key values?
I'd like to avoid a reliance on associative arrays though, because not all systems I may need to run on have bash 4.x or higher... some will be bash 3.x...
Any help would be great.
Without Associative Arrays
You can do it with indexed arrays by using an intermediate indexed array to hold unique values from A. This requires a nested loop over values stored in c[] for each element of A, e.g.
#!/bin/bash
declare -a result        # declare result indexed array
declare -a c             # declare temp intermediate indexed array
A=( D B A C D )          # original with duplicates
## loop descending over A, reset found flag, loop over c, if present continue,
#  otherwise store A at index in c
for ((i = ${#A[@]} - 1; i >= 0; i--)); do
    found=0
    for j in "${c[@]}"; do
        [ "$j" = "${A[i]}" ] && { found=1; break; }
    done
    [ "$found" -eq 1 ] && continue
    c[i]=${A[i]}
done
## loop over c testing if index for A exists, add from c to result
for ((i = 0; i < ${#A[@]}; i++)); do
    [ "${c[i]}" ] && result+=("${c[i]}")
done
declare -p result        # output result
Example Use/Output
$ bash lastuniqindexed.sh
declare -a result='([0]="B" [1]="A" [2]="C" [3]="D")'
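A more compact variant of the same idea, offered only as a sketch: walk A backwards and prepend each element not yet seen to result, so no sparse intermediate array is needed. It uses only indexed arrays, so it should also run on bash 3.x, but it still costs a nested loop. The names seen and r are illustrative and not part of the script above.

```shell
#!/bin/bash
A=( D B A C D )          # original with duplicates
result=()                # unique values, last occurrence kept

# Walk A from the end; the first time a value is met is its last occurrence.
for (( i = ${#A[@]} - 1; i >= 0; i-- )); do
    seen=0
    for r in "${result[@]}"; do
        [ "$r" = "${A[i]}" ] && { seen=1; break; }
    done
    # Prepend, so the relative order of the kept elements is preserved.
    (( seen )) || result=( "${A[i]}" "${result[@]}" )
done

echo "${result[@]}"      # prints: B A C D
```

Quoting the expansions keeps elements that contain whitespace intact.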
Using Associative Arrays with BASH_VERSION Test
You can do it with a combination of indexed and associative arrays, making only a single pass through each array. You use an associative array B keyed with the value of A, using B as a frequency array indicating whether an element of A has been seen. You then store the element of A in a temporary indexed array c[] so that the unique values can be added to result preserving the original order.
You can address whether associative array functionality is present with a bash version test at the beginning, e.g.
#!/bin/bash
case $BASH_VERSION in
    ## empty or beginning with 1, 2, 3
    ''|[123].*) echo "ERROR: Bash 4.0 needed" >&2
                exit 1;;
esac
declare -A B             # declare associative array
declare -a result        # declare indexed array
A=( D B A C D )          # original with duplicates
## loop descending over A, if B[A] doesn't exist, set B[A]=1, store in c[]
for ((i = ${#A[@]} - 1; i >= 0; i--)); do
    [ -n "${B[${A[i]}]}" ] || { B[${A[i]}]=1; c[i]=${A[i]}; }
done
## loop over c testing if index for A exists, add from c to result
for ((i = 0; i < ${#A[@]}; i++)); do
    [ "${c[i]}" ] && result+=("${c[i]}")
done
declare -p result        # output result
Without the use of associative arrays, the nested loops looping over the original checking against each entry in c[] will be much less efficient as the size of the array grows.
Example Use/Output
$ bash lastuniq.sh
declare -a result='([0]="B" [1]="A" [2]="C" [3]="D")'
Look things over and let me know if you have further questions.
I am working on my programming language, which compiles into bash 4.3+ code. I am in the final stages of my language, but I have a small issue with recursive functions. Here's the bash code which is supposed to return the Fibonacci number for a given index.
#!/bin/bash
function fib() {
local a=$1
declare -n ret=$2
if (( $a <= 2 )); then
ret=1
return
fi
fib $((a-1)) fib1
fib $((a-2)) fib2
ret=$((fib1+fib2))
echo "fib($((a-1))) + fib($((a-2))) = $ret"
return
}
num=5
fib $num result
echo
echo "fib($num) = $result"
The problem with this code is that fib(5) gives 3, which is clearly wrong. What I think the problem is, when I pass fib1 and fib2 as a way to store the return value, they get overwritten by each call which assigns them. If that is the problem, how can I make fib1 and fib2 local to their execution scope?
Please note that I do not want to use a return statement to return the values, I want to try finding a solution using declare -n namerefs.
Thank you
"What I think the problem is, when I pass fib1 and fib2 as a way to store the return value, they get overwritten by each call which assigns them."
Yep, and you can see that by printing the value of fib1 between and after the recursive calls:
fib $((a-1)) fib1
echo "fib($a): fib1: $fib1"
fib $((a-2)) fib2
echo "fib($a): fib1: $fib1 fib2: $fib2"
You should see the value of fib1 change during the second call. That's to be expected, since it wasn't declared local and there only is one global copy of fib1.
If you make them local... it doesn't help much.
Assume you start by calling fib 4 result. The first iteration will make fib1 local, and call fib 3 fib1. Now the second iteration will also make fib1 local, but it will also try to assign its return value to a variable of the same name. Since the access is by name, it saves the return value to its own copy of fib1.
This can be seen with a somewhat simpler script too. This one tries to return a fixed value up from the bottom of the recursion:
#!/bin/bash
foo() {
declare -n ret=$2
if (( $1 == 0 )); then
echo "foo($1) returning"
ret=end # this is the value that should bubble up
return
fi
local x=initial$1 # use $1 here to track the level the value came from
foo $(($1 - 1)) x
ret=$x
echo "foo($1) = $x"
return
}
foo 3 result
echo "main: $result"
The workaround I can think of is to have a separate global variable for the return value, and to immediately copy it to a local variable:
local fib1 fib2
fib $((a-1)) retval
fib1=$retval
fib $((a-2)) retval
fib2=$retval
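Putting the workaround together, a minimal sketch of the complete function might look like the following. It assumes bash 4.3+ for namerefs; retval is the global scratch variable from the snippet above, and fib1/fib2 are now locals, so each level of the recursion keeps its own copies.

```shell
#!/bin/bash
# Recursive Fibonacci returning its result through a nameref.
# Recursive calls always write to the global scratch variable "retval",
# and the value is copied into a local immediately afterwards.
fib() {
    local a=$1
    declare -n ret=$2        # ret refers to the variable named by the caller
    if (( a <= 2 )); then
        ret=1
        return
    fi
    local fib1 fib2          # per-call copies, not shared between levels
    fib $((a-1)) retval
    fib1=$retval
    fib $((a-2)) retval
    fib2=$retval
    ret=$((fib1 + fib2))
}

fib 10 result
echo "fib(10) = $result"     # prints: fib(10) = 55
```

The caller still must not pass a name that the function uses locally (a, ret, fib1 or fib2), since the nameref would then resolve to the local variable instead of the caller's.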
I use code like this to find data values for my calculations:
sub get_data {
$x =0 if($_[1] eq "A"); #get column number by name
$data{'A'}= [2.00000, 0.15000, -0.00143, 33.51030, 0.77, 1, 0, 12];
return $data{$_[0]}[$x];
}
Data is stored like this in a Perl file. I plan no more than 100 columns. Then to get a value I call:
get_data(column, row);
Now I realize that this is a terribly slow way to look up data in a table. How can I do it faster? SQL?
Looking at your github code, the main problem you have is that your
big hash of arrays is initialized every time the function is called.
Your current code:
my @atom;
# {'name'}= radius, depth, solvation_parameter, volume, covalent_radius, hydrophobic, H_acceptor, MW
$atom{'C'}= [2.00000, 0.15000, -0.00143, 33.51030, 0.77, 1, 0, 12];
$atom{'A'}= [2.00000, 0.15000, -0.00052, 33.51030, 0.77, 0, 0, ''];
$atom{'N'}= [1.75000, 0.16000, -0.00162, 22.44930, 0.75, 0, 1, 14];
$atom{'O'}= [1.60000, 0.20000, -0.00251, 17.15730, 0.73, 0, 1, 16];
...
Time taken for your test case on the slow netbook I'm typing this on: 6m24.400s.
The most important thing to do is to move this out of the function, so it's
initialized only once, when the module is loaded.
Time taken after this simple change: 1m20.714s.
But since I'm making suggestions, you could write it more legibly:
my %atom = (
C => [ 2.00000, 0.15000, -0.00143, 33.51030, 0.77, 1, 0, 12 ],
A => [ 2.00000, 0.15000, -0.00052, 33.51030, 0.77, 0, 0, '' ],
...
);
Note that %atom is a hash in both cases, so your code doesn't do what you
were imagining: it declares a lexically-scoped array @atom, which is unused, then proceeds to fill up an unrelated global variable %atom. (Also do you really want an empty string for MW of A? And what kind of atom is A anyway?)
Secondly, your name-to-array-index mapping is also slow. Current code:
#take correct value from data table
$x = 0 if($_[1] eq "radius");
$x = 1 if($_[1] eq "depth");
$x = 2 if($_[1] eq "solvation_parameter");
$x = 3 if($_[1] eq "volume");
$x = 4 if($_[1] eq "covalent_radius");
$x = 5 if($_[1] eq "hydrophobic");
$x = 6 if($_[1] eq "H_acceptor");
$x = 7 if($_[1] eq "MW");
This is much better done as a hash (again, initialized outside the function):
my %index = (
radius => 0,
depth => 1,
solvation_parameter => 2,
volume => 3,
covalent_radius => 4,
hydrophobic => 5,
H_acceptor => 6,
MW => 7
);
Or you could be snazzy if you wanted:
my %index = map { [qw[radius depth solvation_parameter volume
covalent_radius hydrophobic H_acceptor MW
]]->[$_] => $_ } 0..7;
Either way, the code inside the function is then simply:
$x = $index{$_[1]};
Time now: 1m13.449s.
Another approach is just to define your field numbers as constants.
Constants are capitalized by convention:
use constant { RADIUS=>0, DEPTH=>1, ... };
Then the code in the function is
$x = $_[1];
and you then need to call the function using the constants instead of strings:
get_atom_parameter('C', RADIUS);
I haven't tried this.
But stepping back a bit and looking at how you are using this function:
while($ligand_atom[$x]{'atom_type'}[0]) {
print STDERR $ligand_atom[$x]{'atom_type'}[0];
$y=0;
while($protein_atom[$y]) {
$d[$x][$y] = sqrt(distance_sqared($ligand_atom[$x],$protein_atom[$y]))
- get_atom_parameter::get_atom_parameter($ligand_atom[$x]{'atom_type'}[0], 'radius');
- get_atom_parameter::get_atom_parameter($protein_atom[$y]{'atom_type'}[0], 'radius');
$y++;
}
$x++;
print STDERR ".";
}
Each time through the loop you are calling get_atom_parameter twice to
retrieve the radius.
But for the inner loop, one atom is constant throughout. So hoist the call
to get_atom_parameter out of the inner loop, and you've almost halved the
number of calls:
while($ligand_atom[$x]{'atom_type'}[0]) {
print STDERR $ligand_atom[$x]{'atom_type'}[0];
$y=0;
my $lig_radius = get_atom_parameter::get_atom_parameter($ligand_atom[$x]{'atom_type'}[0], 'radius');
while($protein_atom[$y]) {
$d[$x][$y] = sqrt(distance_sqared($ligand_atom[$x],$protein_atom[$y]))
- $lig_radius
- get_atom_parameter::get_atom_parameter($protein_atom[$y]{'atom_type'}[0], 'radius');
$y++;
}
$x++;
print STDERR ".";
}
But there's more. In your test case the ligand has 35 atoms and the
protein 4128 atoms. This means that your initial code made
4128*35*2 = 288960 calls to get_atom_parameter, and while now it's
only 4128*35 + 35 = 144515 calls, it's easy to just make some arrays with
the radii so that it's only 4128 + 35 = 4163 calls:
my $protein_size = $#protein_atom;
my $ligand_size;
{
my $x=0;
$x++ while($ligand_atom[$x]{'atom_type'}[0]);
$ligand_size = $x-1;
}
#print STDERR "protein_size = $protein_size, ligand_size = $ligand_size\n";
my @protein_radius;
for my $y (0..$protein_size) {
$protein_radius[$y] = get_atom_parameter::get_atom_parameter($protein_atom[$y]{'atom_type'}[0], 'radius');
}
my @lig_radius;
for my $x (0..$ligand_size) {
$lig_radius[$x] = get_atom_parameter::get_atom_parameter($ligand_atom[$x]{'atom_type'}[0], 'radius');
}
for my $x (0..$ligand_size) {
print STDERR $ligand_atom[$x]{'atom_type'}[0];
my $lig_radius = $lig_radius[$x];
for my $y (0..$protein_size) {
$d[$x][$y] = sqrt(distance_sqared($ligand_atom[$x],$protein_atom[$y]))
- $lig_radius
- $protein_radius[$y]
}
print STDERR ".";
}
And finally, the call to distance_sqared [sic]:
#distance between atoms
sub distance_sqared {
my $dxs = ($_[0]{'x'}-$_[1]{'x'})**2;
my $dys = ($_[0]{'y'}-$_[1]{'y'})**2;
my $dzs = ($_[0]{'z'}-$_[1]{'z'})**2;
return $dxs+$dys+$dzs;
}
This function can usefully be replaced with the following, which uses
multiplication instead of **.
sub distance_sqared {
my $dxs = ($_[0]{'x'}-$_[1]{'x'});
my $dys = ($_[0]{'y'}-$_[1]{'y'});
my $dzs = ($_[0]{'z'}-$_[1]{'z'});
return $dxs*$dxs+$dys*$dys+$dzs*$dzs;
}
Time after all these modifications: 0m53.639s.
More about **: elsewhere you declare
use constant e_math => 2.71828;
and use it thus:
$Gauss1 += e_math ** (-(($d[$x][$y]*2)**2));
The built-in function exp() calculates this for you (in fact, ** is commonly
implemented as x**y = exp(log(x)*y), so each time you are doing this you are
performing an unnecessary logarithm the result of which is just slightly less
than 1 as your constant is only accurate to 6 d.p.). This change would alter
the output very slightly. And again, **2 should be replaced by multiplication.
Anyway, this answer is probably long enough for now, and calculation of d[]
is no longer the bottleneck it was.
Summary: hoist constant values out of loops and functions! Calculating the
same thing repeatedly is no fun at all.
Using any kind of database for this would not help your performance in the
slightest. One thing that might help you though is Inline::C. Perl is
not really built for this kind of intensive computation, and Inline::C
would allow you to easily move performance-critical bits into C while
keeping your existing I/O in Perl.
I would be willing to take a shot at a partial C port. How stable
is this code, and how fast do you want it to be? :)
Putting this in a DB will make it MUCH easier to maintain, scale, expand, etc.... Using a DB can also save you a lot of RAM -- it gets and stores in RAM only the desired result instead of storing ALL values.
With regards to speed it depends. With a text file you take a long time to read all the values into RAM, but once it is loaded, retrieving the values is super fast, faster than querying a DB.
So it depends on how your program is written and what it is for. Do you read all the values ONCE and then run 1000 queries? The TXT file way is probably faster. Do you read all the values every time you make a query (to make sure you have the latest value set)? Then the DB would be faster. Do you run 1 query/day? Use a DB. And so on.
I'm making a script to create a username. It should be four letters long; traditionally we've used 3 letters of the last name + 1 of the first name.
If it was already used we manually thought of an alternative.
So if my name is Fred Flintstone we should try FLIF. If this doesn't work, we loop through the name: FLIA, FLIB, FLIC ... FLIZ, FLAA, FLAB, FLAC, ... FLZZ, FAAA, FAAB, ...
The easiest way is to loop through last letters; then make another set of loops through second last letter and loop through last letters; then a set of loops through third last, second last, last; and fourth+third+second+last.
This makes for a lot of do-while loops nested in each other, which is unreadable for other humans and a lot of typing.
I could use a counter per letter, but that also doesn't seem elegant.
I could try with one counter and then use mod 26 to see how many letters need replacement (but that seems very complex).
Is there an elegant/efficient way to do this?
Bonus points for first trying to keep the string as 'logically correct' as possible (e.g. keeping the last letter an F for Fred, or skipping letters: FLIF, FLNF, FLSF, FLTF, ...).
Not sure if this is what you mean, but if you structure your username-script in the following way (I used PHP as language), you can extend it by adding options with higher fuzz factors while keeping the code readable:
echo findName('FLINTSTONE', 'FRED');
function findName($last, $first) {
for ($fuzzFactor = 0; ; $fuzzFactor++) {
$candidates = fuzzNames($last, $first, $fuzzFactor);
if (empty($candidates)) {
// exhausted
return "sorry, I'm out of options!";
}
foreach ($candidates as $candidate) {
if (isUnique($candidate)) {
return $candidate;
}
}
}
}
function fuzzNames($last, $first, $fuzzFactor) {
switch ($fuzzFactor) {
case 0:
// no fuzz, return first choice
return array(substr($last, 0, 3) . $first[0]);
case 1:
// replace the third letter of the last name
// by the fourth/fifth/.../last letter (FLNF, FLTF, ... , FLEF)
$candidates = array();
for ($i = 3; $i < strlen($last); $i++) {
$candidates[] = substr($last, 0, 2) . $last[$i] . $first[0];
}
return $candidates;
case 2:
// replace the second and third letter of the last name
// by their follow-ups (FINF, FITF, ... , FNEF)
$candidates = array();
for ($i = 2; $i < strlen($last) - 1; $i++) {
for ($j = $i + 1; $j < strlen($last); $j++) {
$candidates[] = $last[0] . $last[$i] . $last[$j] . $first[0];
}
}
return $candidates;
default:
return array();
}
}