PHP str_replace limit in array

$orjinal =$_POST['orjinal'];
$false = file_get_contents("false.txt");
$true = file_get_contents("true.txt");
$false1 = explode("\n", $false);
$true1 = explode("\n", $true);
$new = str_replace($false1, $true1, $orjinal);
This works, but it replaces the same word twice.
false.txt:
apple
melon
true.txt:
melon
strawberry
$orjinal = "i am eating apple";
I want this output: "i am eating melon"
Actual output: "i am eating strawberry"

You have three options:
1. Fix your false.txt and true.txt:
false.txt:
apple
true.txt:
melon
2. Call str_replace() twice (or as often as needed):
$new = str_replace($false1, $true1, $orjinal);
$new = str_replace($false1, $true1, $new);
3. Sit back and think once more about what exactly you want to achieve and if the simple str_replace() works exactly the way you intended or not.
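The double replacement happens because str_replace() with arrays applies the pairs one after another, each on the result of the previous one: apple becomes melon, and then melon becomes strawberry. In PHP, strtr() called with an array of pairs avoids this, because text that has already been replaced is not searched again. The sketch below illustrates the same single-pass idea in Python; replace_once and the pairs dictionary are hypothetical names used only for illustration, not part of the original code.

```python
import re


def replace_once(text, pairs):
    """Replace every key of `pairs` found in `text` in a single pass.

    Text produced by one replacement is never scanned again, so
    "apple" -> "melon" is not followed by "melon" -> "strawberry".
    """
    # Skip empty keys (e.g. from a trailing newline in the word list).
    # Longest keys first, so a longer search term wins over its own prefix.
    keys = sorted((k for k in pairs if k), key=len, reverse=True)
    if not keys:
        return text
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda match: pairs[match.group(0)], text)


# Hypothetical data mirroring false.txt and true.txt from the question.
pairs = {"apple": "melon", "melon": "strawberry"}
print(replace_once("i am eating apple", pairs))
```

With the question's data this leaves "melon" in place instead of turning it into "strawberry", while a real "melon" in the input would still be replaced.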

Avoid duplicate Bulk-New-ADUser Creation via a csv file

I need some modifications to this script. UserID is normally a user's first name followed by their last name.
What I need to do is compare the proxy address and SAMAccountName attributes associated with each username before the account is created.
For example, for Jack Sparrow: if jsparrow is already in use, the script should try jasparrow (the first two letters of the first name); if jasparrow is in use as well, it should try jacsparrow, and so on. I want to avoid duplicate usernames.
2. I also decided that it would be better to make 'fun' passwords built from the first two letters of the first name, the day and month, and the first two letters of the last name. The end result is that users get a password like "Ja1009Sp".
Firstname,LastName,Department,Manager,MobilePhone
Jack,Sparrow,IT,jsmith,1 88 635 5254-0551
John Smith,Sparrow,Finance,jsmith,188 635 5254-0554
Script:
Import-Module ActiveDirectory
$UserList = Import-CSV -Path C:\Temp\CreateUsers.csv
$targetOU='OU=usersOU,DC=My,DC=Domain,DC=org'
$upnDomain='sec.local'
foreach($Person in $UserList){
    $useritems=@{
        GivenName=$Person.Firstname
        Surname=$Person.LastName
        Department=$Person.Department
        AccountPassword=ConvertTo-SecureString -String $Person.Password -AsPlainText -force
        ChangePasswordAtLogon=$false
        Enabled=$true
        DisplayName="$($Person.Firstname) $($Person.Lastname)"
        Manager=$Person.Manager
        MobilePhone=$Person.MobilePhone
        Name="$($Person.Firstname) $($Person.Lastname)"
        SamAccountName="$($Person.Firstname+$Person.LastName.Substring(0,1))"
        UserPrincipalName="$($Person.FirstName+$Person.LastName.Substring(0,1))@$upnDomain"
        Company="Contoso"
    }
    New-ADUser @useritems -Path $targetOU
}
Try something like this. I don't have AD available at the moment to test the Get-ADUser query used to look for an existing account, so it might need some tuning.
foreach ($Person in $UserList) {
    #Reset counters
    $i = 1
    $n = 1
    do {
        if($i -le $person.Firstname.Length) {
            $user = "$($Person.Firstname.Substring(0,$i)+$Person.LastName)"
            $i++
        } else {
            #All combinations in use, adding number
            $user = "$($Person.Firstname.Substring(0,1)+$Person.LastName+$n)"
            $n++
        }
    } while ((Get-ADUser -Filter "(samAccountName -eq '$user') -or (proxyaddresses -like '$user*')"))
    #Result username
    #$user
    #$useritems = @{
    #.....
    #SamAccountName=$user
    #UserPrincipalName="$user@$upnDomain"
    #....
    #}
}
If all combinations are in use, including jacksparrow, it tries jsparrow1, jsparrow2, and so on until it finds a free number.
The password can be generated using:
$Password = "{0}{1}{2}" -f $Person.Firstname.Substring(0,2), (Get-Date).ToString("ddMM"), $Person.Lastname.Substring(0,2)
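The naming and password rules described above can be sketched outside of Active Directory as follows. This is only an illustration in Python under stated assumptions: the set called taken stands in for the Get-ADUser lookup, names are lowercased (the PowerShell version does not do this), first names are assumed to be non-empty, and the function names are hypothetical.

```python
from datetime import date


def candidate_usernames(first, last):
    """Yield jsparrow, jasparrow, ..., jacksparrow, then jsparrow1, jsparrow2, ..."""
    first, last = first.lower(), last.lower()  # assumption: lowercase logins
    for i in range(1, len(first) + 1):
        yield first[:i] + last
    n = 1
    while True:  # every letter combination is used up, so append a number
        yield f"{first[0]}{last}{n}"
        n += 1


def pick_username(first, last, taken):
    """Return the first candidate that is not in `taken` (stand-in for the AD query)."""
    for name in candidate_usernames(first, last):
        if name not in taken:
            return name


def fun_password(first, last, today):
    """First two letters of the first name, day and month, first two of the last name."""
    return f"{first[:2]}{today:%d%m}{last[:2]}"


print(pick_username("Jack", "Sparrow", {"jsparrow", "jasparrow"}))
print(fun_password("Jack", "Sparrow", date(2024, 9, 10)))
```

Separating candidate generation from the lookup keeps the "is this name taken?" check in one place, which is where the real directory query would go.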

Downloading multiple fasta files from ncbi

I'm trying to download all FASTA files associated with one organism from NCBI.
I tried wget -r -l3 -A "*.fna.gz" ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/Microcystis_aeruginosa/ to get all files ending in .fna.gz from the third level down, but then it just rejects everything with the following output:
Removed “ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/Microcystis_aeruginosa/latest_assembly_versions/.listing”.
Rejecting “GCF_000010625.1_ASM1062v1”.
Rejecting “GCF_000307995.1_ASM30799v2”.
Rejecting “GCF_000312165.1_ASM31216v1”.
Rejecting “GCF_000312185.1_ASM31218v1”.
Rejecting “GCF_000312205.1_ASM31220v1”.
Rejecting “GCF_000312225.1_ASM31222v1”.
Rejecting “GCF_000312245.1_ASM31224v1”.
Rejecting “GCF_000312265.1_ASM31226v1”.
Rejecting “GCF_000312285.1_ASM31228v1”.
Rejecting “GCF_000312725.1_ASM31272v1”.
Rejecting “GCF_000330925.1_MicAerT1.0”.
Rejecting “GCF_000332585.1_MicAerD1.0”.
Rejecting “GCF_000412595.1_spc777-v1”.
Rejecting “GCF_000599945.1_Mic70051.0”.
Rejecting “GCF_000787675.1_ASM78767v1”.
Rejecting “GCF_000981785.1_ASM98178v1”.
Any ideas on why it's rejecting these directories? Thanks for your help.
I'm not exactly sure why it's rejecting your request, but when I was still doing this type of thing, I found that if I didn't download queries in smaller batches, the NCBI server timed me out and blocked my IP for a while before I could download again. This doesn't seem to be the same problem that you're seeing, but maybe this script will get the same thing done. Let me know if this helps.
#!/usr/bin/env python3
from Bio import Entrez
search_term = input("Organism name: ")
Entrez.email = "your_email@isp.com"  # required by NCBI
search_handle = Entrez.esearch(db="nucleotide", term=search_term, usehistory="y")
search_results = Entrez.read(search_handle)
search_handle.close()
gi_list = search_results["IdList"]
count = int(search_results["Count"])
webenv = search_results["WebEnv"]
query_key = search_results["QueryKey"]
batch_size = 5  # download sequences in batches so NCBI doesn't time you out
with open("ALL_SEQ.fasta", "w") as out_handle:
    for start in range(0, count, batch_size):
        end = min(count, start + batch_size)
        print("Going to download record %i to %i" % (start + 1, end))
        fetch_handle = Entrez.efetch(db="nucleotide", rettype="fasta", retmode="text", retstart=start, retmax=batch_size, webenv=webenv, query_key=query_key)
        data = fetch_handle.read()
        fetch_handle.close()
        out_handle.write(data)
print("\nDownload completed")
I found a Perl script that gets me close to accomplishing this task. Unfortunately, this script returns only the IDs of the genomes, not the actual sequences.
For example, the head of my output is:
gi|425458296|ref|NZ_CAIN00000000.1|NZ_CAIN01000000 Microcystis aeruginosa PCC 9808, whole genome shotgun sequencing project
gi|425448636|ref|NZ_CAIK00000000.1|NZ_CAIK01000000 Microcystis aeruginosa PCC 7941, whole genome shotgun sequencing project
Do any Perl users know what's going on?
use strict;
use LWP::Simple;
my ($name, $outname, $url, $xml, $out, $count, $query_key, $webenv, $ids);
my @genomeId;
my $base = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/';
my $limit = 'wgs[prop]+AND+srcdb+refseq[prop])';
my @species = ('Microcystis aeruginosa');
foreach my $s (@species) {
    undef @genomeId;
    $query_key = $webenv = '';
    $s =~ s/ /+/g;
    # ESearch
    $url = $base . "esearch.fcgi?db=genome&term=$s";
    $xml = get($url);
    $count = $1 if ($xml =~ /<Count>(\d+)<\/Count>/);
    if ($count > 30) {
        $url = $base . "esearch.fcgi?db=genome&term=$s&retmax=$count";
        $xml = get($url);
    }
    while ($xml =~ /<Id>(\d+?)<\/Id>/gs) {
        push(@genomeId, $1);
    }
    $ids = join(',', @genomeId);
    # ELink
    $url = $base . "elink.fcgi?dbfrom=genome&db=nuccore&cmd=neighbor_history&id=$ids&term=$limit";
    $xml = get($url);
    $query_key = $1 if ($xml =~ /<QueryKey>(\d+)<\/QueryKey>/);
    $webenv = $1 if ($xml =~ /<WebEnv>(\S+)<\/WebEnv>/);
    # EFetch
    $url = $base . "efetch.fcgi?db=nuccore&query_key=$query_key&WebEnv=$webenv&rettype=fasta&retmode=text";
    $out = get($url);
    open (OUT, ">$s.fna");
    print OUT $out; # write the fetched records to the output file
    close OUT;
}

I want to create a new local variable from the sum of two others in Ruby but I'm stuck

I am trying to change an example so that it takes user input rather than hard-coded values, and then uses those local variables to work out the items needed.
So far my code looks like this:
print "Number of cars available today."
cars = gets.chomp()
print "Number of available seats in the car."
space_in_a_car = gets.chomp()
print "Number of drivers available."
drivers = gets.chomp()
print "Number of passagers that need transport."
passangers = gets.chomp
cars_not_driven = #{cars} - #{drivers}
cars_driven = drivers
carpool_capacity = #{cars_driven} * #{space_in_a_car}
average_passanger_per_car = #{passangers} / #{drivers}
print "The number of cars being driven today is #{cars_driven}.\n"
print "The number of cars not being driven today is #{cars_not_driven}.\n"
print "We have #{carpool_capacity} cars available.\n"
print "So we need to carry #{average_passanger_per_car} passangers per car to make sure we can transport everyone.\n"
The code runs without throwing any errors, but of course, because I am not writing these assignments correctly:
cars_not_driven = #{cars} - #{drivers}
cars_driven = drivers
carpool_capacity = #{cars_driven} * #{space_in_a_car}
average_passanger_per_car = #{passangers} / #{drivers}
the only value I get in the output is:
print "The number of cars being driven today is #{cars_driven}.\n"
How should I be writing:
cars_not_driven = #{cars} - #{drivers} etc
to get the number of cars_not_driven?
I don't quite understand why you use #{var} and <br> here. If I am right that you want to use plain Ruby, this should be the solution:
print "Number of cars available today."
cars = gets.chomp().to_i
print "Number of available seats in the car."
space_in_a_car = gets.chomp().to_i
print "Number of drivers available."
drivers = gets.chomp().to_i
print "Number of passagers that need transport."
passangers = gets.chomp.to_i
cars_not_driven = cars - drivers
cars_driven = drivers
carpool_capacity = cars_driven * space_in_a_car
average_passanger_per_car = passangers / drivers
print "The number of cars being driven today is #{cars_driven}.\n"
print "The number of cars not being driven today is #{cars_not_driven}.\n"
print "We have #{carpool_capacity} cars available.\n"
print "So we need to carry #{average_passanger_per_car} passangers per car to make sure we can transport everyone.\n"
When you use gets, Ruby reads and returns a String. So the variables cars, drivers, etc. are all Strings.
To do integer arithmetic on them, you need to convert them to integers. In Ruby you can do this using .to_i.
Now with that information, try:
cars_not_driven = cars.to_i - drivers.to_i
cars_driven = drivers.to_i
carpool_capacity = cars_driven * space_in_a_car.to_i
average_passanger_per_car = passangers.to_i / drivers.to_i
You can check the class of a variable using .class. Here:
cars = gets.chomp()
10
# => "10"
cars.class
# => String
drivers = gets.chomp()
20
# => "20"
drivers.class
# => String
Now let's add them:
cars + drivers
# => "1020"
Since they are strings, the + operator concatenates the two strings into one, which is not what you intended. Now try this:
cars.to_i + drivers.to_i
# => 30
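For comparison, the same pitfall can be sketched in Python, where text read from input is also a string until it is converted. This is only an illustration: carpool_stats and the sample values are hypothetical, and integer division with // is an assumption chosen to mirror how Ruby divides two positive Integers.

```python
def carpool_stats(cars, space_in_a_car, drivers, passengers):
    """Convert the raw input strings to integers, then do the arithmetic."""
    cars, space_in_a_car, drivers, passengers = (
        int(value.strip()) for value in (cars, space_in_a_car, drivers, passengers)
    )
    return {
        "cars_not_driven": cars - drivers,
        "cars_driven": drivers,
        "carpool_capacity": drivers * space_in_a_car,
        "average_passengers_per_car": passengers // drivers,
    }


# Without conversion, + joins the two strings instead of adding numbers.
print("10" + "20")
# After conversion the values behave like numbers.
print(carpool_stats("100", "4", "30", "90"))
```

Converting once, right where the input is read, avoids repeating the conversion in every expression.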

Perl, cmd, $ARGV[0], slow

[Strawberry Perl v5.16.3, Windows 7 x64, executing via cmd, eg c:\strawberry> perl test.pl 100000]
SYMPTOM: The following code: foreach (1..$ARGV[0]) { foo($_); }, executes roughly 20% slower than if I had included this extra line, before it: my $num = $ARGV[0];
QUESTION: Can anyone help me understand why?
Notice, in the second case, that after I declare and set $num, I do not then use $num in the loop parameters. If I did, I could probably be convinced that repeatedly testing against $ARGV[0] in a for loop is somehow slower than testing against a variable that I define myself... but that is not the case here.
To track time, I use: use Time::HiRes; my $time = [Time::HiRes::gettimeofday()]; at the top of my script, and: print "\n1: ", Time::HiRes::tv_interval($time); at the bottom.
Confused!
Thanks,
Michael
EDIT
I am including the entire script, with a comment preceding the offending line... Interestingly, it looks like the time discrepancy is at least partially dependent on my redundant initialization of %h, as well as @chain... This is getting weird.
use Time::HiRes; my $time = [Time::HiRes::gettimeofday()];
#my $max=$ARGV[0];
my %h = (1=>1,89=>89);
$h{1}=1;
$h{89}=89;
my @chain=();
my $ans=0;
sub sum{my $o=0; foreach (@_){$o+=$_}; return $o;}
foreach (1..$ARGV[0]-1){
    my $x=$_;
    my @chain = ();
    while(!exists($h{$x})){
        push(@chain,$x);
        $x = sum(map {$_**2} split('',$x));
    }
    foreach (@chain){$h{$_}=$h{$x} if !exists($h{$_});}
}
print "\n1: ", Time::HiRes::tv_interval($time);
foreach (1..$ARGV[0]){$ans++ if ($h{$_}==89);}
print "\n2: ", Time::HiRes::tv_interval($time);
On my system (perl 5.16.3 on GNU/Linux) there is no measurable difference. The standard deviation of the timings is larger than the difference between measurements of different versions.
For each variant of the script, 10 executions were performed. The $ARGV[0] was 3.5E5 in all cases (350000).
Without my $num = $ARGV[0]:
$ perl measure.pl
2.369921 2.38991 2.380969 4.419895 2.398861 2.420928 2.388721 2.368144 2.387212 2.386347
mean: 2.5910908
sigma: 0.609763793801797
With my $num = $ARGV[0]:
$ perl measure.pl
4.435764 2.419485 2.403696 2.401771 2.411345 2.466776 4.408127 2.416889 2.389191 2.397409
mean: 2.8150453
sigma: 0.803721101668365
The measure.pl script:
use strict; use warnings; use 5.016;
use List::Util 'sum';
my @times = map qx/perl your-algorithm.pl 3.5E5/, 1..10;
chomp @times;
say "@times";
say "mean: ", mean(@times);
say "sigma: ", sigma(@times);
sub mean { sum(@_)/@_ }
sub sigma {
    my $mean = mean(@_);
    my $variance = sum(map { ($_-$mean)**2 } @_) / @_;
    sqrt $variance;
}
With your-algorithm.pl being reduced so that only one timing is printed:
foreach (1..$ARGV[0]){$ans++ if ($h{$_}==89);}
print Time::HiRes::tv_interval($time), "\n";
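The comparison above can be re-checked from the listed timings. The following is a small Python sketch, not part of the original answer: it recomputes the mean and the population standard deviation (dividing by N, as measure.pl does) and tests whether the gap between the two means is smaller than the spread. The variable and function names are hypothetical.

```python
from statistics import fmean, pstdev

# Timings in seconds, copied from the two runs listed above.
without_num = [2.369921, 2.38991, 2.380969, 4.419895, 2.398861,
               2.420928, 2.388721, 2.368144, 2.387212, 2.386347]
with_num = [4.435764, 2.419485, 2.403696, 2.401771, 2.411345,
            2.466776, 4.408127, 2.416889, 2.389191, 2.397409]


def summarize(times):
    """Return (mean, population standard deviation) of the timings."""
    return fmean(times), pstdev(times)


mean_a, sigma_a = summarize(without_num)
mean_b, sigma_b = summarize(with_num)
difference = abs(mean_a - mean_b)

print("without:", mean_a, sigma_a)
print("with:   ", mean_b, sigma_b)
# True when the difference of the means is smaller than either spread.
print("within noise:", difference < min(sigma_a, sigma_b))
```

Each run contains one or two outliers above 4 seconds, which is what inflates both the means and the standard deviations.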

Perl Out Of Memory

I have a script that reads two csv files and compares them to find out if an ID that appears in one also appears in the other. The error I am receiving is as follows:
Out of memory during "large" request for 67112960 bytes, total sbrk() is 348203008 bytes
And now for the code:
use strict;
use File::Basename;
my $DAT = $ARGV[0];
my $OPT = $ARGV[1];
my $beg_doc = $ARGV[2];
my $end_doc = $ARGV[3];
my $doc_counter = 0;
my $page_counter = 0;
my %opt_beg_docs;
my %beg_docs;
my ($fname, $dir, $suffix) = fileparse($DAT, qr/\.[^.]*/);
my $outfile = $dir . $fname . "._IMGLOG";
open(OPT, "<$OPT");
while(<OPT>){
    my @OPT_Line = split(/,/, $_);
    $beg_docs{@OPT_Line[0]} = "Y" if(@OPT_Line[3] eq "Y");
    $opt_beg_docs{@OPT_Line[0]} = "Y";
}
close(OPT);
open(OUT, ">$outfile");
while((my $key, my $value) = each %opt_beg_docs){
    print OUT "$key\n";
}
close(OUT);
open(DAT, "<$DAT");
readline(DAT); #skips header line
while(<DAT>){
    $_ =~ s/\xFE//g;
    my @DAT_Line = split(/\x14/, $_);
    #gets the prefix and the range of the beg and end docs
    (my $pre = @DAT_Line[$beg_doc]) =~ s/[0-9]//g;
    (my $beg = @DAT_Line[$beg_doc]) =~ s/\D//g;
    (my $end = @DAT_Line[$end_doc]) =~ s/\D//g;
    #print OUT "BEGDOC: $beg ENDDOC: $end\n";
    foreach($beg .. $end){
        my $doc_id = $pre . $_;
        if($opt_beg_docs{$doc_id} ne "Y"){
            if($beg_docs{$doc_id} ne "Y"){
                print OUT "$doc_id,DOCUMENT NOT FOUND IN OPT FILE\n";
                $doc_counter++;
            } else {
                print OUT "$doc_id,PAGE NOT FOUND IN OPT FILE\n";
                $page_counter++;
            }
        }
    }
}
close(DAT);
close(OUT);
print "Found $page_counter missing pages and $doc_counter missing document(s)";
Basically, I collect all the IDs from the file I am checking against. Then I loop over the other file and generate its IDs, because they are presented as a range. Then I take each generated ID and check for it in the hash of IDs.
I also forgot to note that I am using Windows.
You're not using use warnings;, you're not checking for errors on opening files, and you're not printing out debugging statements showing the lines that you are reading in.
Do you know what the input file looks like? If it has no line breaks, you are reading the entire file in all at once, which will be disastrous if it is large. Pay attention to how you are parsing the file.
I'm not sure if it's the cause of your error, but inside your loop where you're reading DAT, you probably want to replace this:
(my $pre = @DAT_Line[$beg_doc]) =~ s/[0-9]//g;
with this:
(my $pre = $DAT_Line[$beg_doc]) =~ s/[0-9]//g;
and same for the other two lines there.
You're closing your OUT filehandle and then trying to print to it inside the DAT loop. Printing to a closed filehandle fails, so none of those lines reach the output file; with use warnings enabled, Perl would have reported "print() on closed filehandle".
Remove the first close(OUT); and see if that improves.
I still don't know what your question is. If it's about the error message, it means you've run out of memory. If it's about the message itself, you're trying to consume too much memory. If it's about why you're consuming too much memory, I'd first ask whether you read my message above, then I'd ask how much memory your system has, and then I'd check whether it improves if you take the regex away.
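To make the memory discussion concrete, here is a sketch of the same check in Python that keeps only the OPT IDs in memory and generates each range lazily, one ID at a time. It rests on assumptions that are not confirmed by the question: IDs are taken to be a non-digit prefix followed by digits (for example the hypothetical ABC0001), zero padding is preserved, and the function names and sample data are made up for illustration.

```python
import io
import re


def load_opt_ids(opt_lines):
    """Collect the first comma-separated field of every non-empty OPT line."""
    return {line.split(",", 1)[0] for line in opt_lines if line.strip()}


def expand_range(beg_id, end_id):
    """Lazily yield every ID from beg_id to end_id, keeping prefix and padding."""
    # Assumption: an ID is a non-digit prefix followed by digits.
    prefix, beg_digits = re.fullmatch(r"(\D*)(\d+)", beg_id).groups()
    end_digits = re.fullmatch(r"\D*(\d+)", end_id).group(1)
    width = len(beg_digits)
    for number in range(int(beg_digits), int(end_digits) + 1):
        yield f"{prefix}{number:0{width}d}"


def missing_ids(ranges, known_ids):
    """Yield each generated ID that does not appear in known_ids."""
    for beg_id, end_id in ranges:
        for doc_id in expand_range(beg_id, end_id):
            if doc_id not in known_ids:
                yield doc_id


# Hypothetical sample data standing in for the OPT file and one DAT row.
opt_file = io.StringIO("ABC0001,x,y,Y\nABC0002,x,y,\nABC0004,x,y,Y\n")
known = load_opt_ids(opt_file)
print(list(missing_ids([("ABC0001", "ABC0005")], known)))
```

If a begin or end field is parsed incorrectly, the range can become far larger than intended, so printing the parsed values for the first few rows is a cheap sanity check.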
