I ran a simple test: reading a CSV file with 100,000 rows (10 columns) filled with random English words. The script opened the file and put every row into an array variable.
ab -n 100 -l http://localhost:8000
I sent 100 requests with the Apache Benchmark command above to measure how much faster PHP 8.2 is than PHP 7.4, but the result surprised me: PHP 7 was faster, finishing in 66.9 seconds, while PHP 8 took 71 seconds. I ran the tests a few times with the same result. Why?
Both PHP versions were run in identical environments: Docker, Ubuntu 20.04, default PHP configuration.
In previous tests calculating prime numbers, PHP 8 was much faster. PHP 8 is promoted as the fastest version of them all.
Source code below:
<?php
if (!isset($_GET['phpinfo'])) {
    $csvData = [];
    if (($handle = fopen("test.csv", "r")) !== FALSE) {
        $row = 0;
        while (($data = fgetcsv($handle)) !== FALSE) {
            $csvData[] = $data;
            $row++;
        }
        fclose($handle);
        if ($row === 100000) {
            http_response_code(200);
        } else {
            http_response_code(400);
        }
    }
} else {
    phpinfo();
}
Dockerfile (same for PHP 8, with the PHP version replaced by 8.2 and the workdir adjusted):
FROM ubuntu:20.04
ARG PHP_VERSION="7.4"
RUN apt update && \
    apt -y install --no-install-recommends && \
    apt -y install software-properties-common && \
    add-apt-repository ppa:ondrej/php && \
    apt update && \
    apt -y install --no-install-recommends && \
    apt -y install php${PHP_VERSION}
WORKDIR /var/www/html/php7
COPY . /var/www/html/php7
docker-compose.yml (same for PHP 8, with php7 replaced by php8):
version: "3.9"
services:
php7:
container_name: php7
build: ./php7
ports:
- "8000:8000"
volumes:
- ./php7:/var/www/html/php7
stdin_open: true
tty: true
restart: always
command: php -S 0.0.0.0:8000 -t .
test.csv (rows like the one below):
"sides","opportunity","thin","remove","mud","this","appearance","proud","bad","round"
the "Geographic units, by industry and statistical area: 2000–2022 descending order – CSV" from https://www.stats.govt.nz/large-datasets/csv-files-for-download/ is 135MB and has 5.9 million rows starting with
anzsic06,Area,year,geo_count,ec_count
A,A100100,2022,93,190
A,A100200,2022,138,190
A,A100300,2022,6,25
Downloading that and taking the first 100,000 lines:
cat Data7602DescendingYearOrder.csv | head -n100000 > 100000.csv
I have both php8.2.0-cli and php7.4.33-cli installed side by side (courtesy of https://deb.sury.org/), and I'm running this code:
<?php
declare(strict_types=1);
$csvData = [];
$handle = fopen("100000.csv", "r");
if (!$handle) {
    throw new Exception("Could not open file");
}
$row = 0;
while (($data = fgetcsv($handle)) !== false) {
    $csvData[] = $data;
    $row++;
}
fclose($handle);
Running them both through the hyperfine benchmark:
hans#devad22:/temp2/csv$ php7.4 --version
PHP 7.4.33 (cli) (built: Nov 8 2022 11:33:53) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
with Zend OPcache v7.4.33, Copyright (c), by Zend Technologies
hans#devad22:/temp2/csv$ php8.2 --version
PHP 8.2.0 (cli) (built: Dec 10 2022 10:53:01) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.2.0, Copyright (c) Zend Technologies
with Zend OPcache v8.2.0, Copyright (c), by Zend Technologies
hans#devad22:/temp2/csv$ cat 100000.csv | head
anzsic06,Area,year,geo_count,ec_count
A,A100100,2022,93,190
A,A100200,2022,138,190
A,A100300,2022,6,25
A,A100400,2022,57,50
A,A100500,2022,57,95
A,A100600,2022,12,30
A,A100700,2022,15,30
A,A100800,2022,30,85
A,A100900,2022,54,30
hans#devad22:/temp2/csv$ cat 100000.csv | wc -l
100000
hans#devad22:/temp2/csv$ cat csv.php
<?php
declare(strict_types=1);
$csvData = [];
$handle = fopen("100000.csv", "r");
if (!$handle) {
    throw new Exception("Could not open file");
}
$row = 0;
while (($data = fgetcsv($handle)) !== false) {
    $csvData[] = $data;
    $row++;
}
fclose($handle);
hans#devad22:/temp2/csv$ hyperfine --warmup 10 'php7.4 csv.php'
Benchmark 1: php7.4 csv.php
Time (mean ± σ): 345.5 ms ± 16.3 ms [User: 306.3 ms, System: 29.7 ms]
Range (min … max): 319.2 ms … 378.7 ms 10 runs
hans#devad22:/temp2/csv$ hyperfine --warmup 10 'php8.2 csv.php'
Benchmark 1: php8.2 csv.php
Time (mean ± σ): 337.7 ms ± 22.0 ms [User: 299.7 ms, System: 29.5 ms]
Range (min … max): 306.6 ms … 381.3 ms 10 runs
My conclusion is that PHP 8.2 is about 2% faster than PHP 7.4 at this particular task...
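For what it's worth, hyperfine can also take both commands in a single invocation and report the relative speed directly, which avoids eyeballing two separate runs:
hyperfine --warmup 10 'php7.4 csv.php' 'php8.2 csv.php'
It ends with a summary roughly of the form "'php8.2 csv.php' ran 1.02 ± 0.09 times faster than 'php7.4 csv.php'" (the exact figures will vary from run to run).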
Related
On Ubuntu 18.04 I am trying to generate a vera++ report to import into SonarQube, but it fails with the error below.
bash -c 'find src -regex ".*\.cc\|.*\.hh" | vera++ - -showrules -nodup 2>&1 | vera++Report2checkstyleReport.perl > /sonar-test/valgrind-test/sonar-cxx/sonar-cxx-plugin/src/samples/SampleProject2/build/vera++-report.xml'
bash: vera++Report2checkstyleReport.perl: command not found
I have installed vera++ version 1.2.1, but I don't have the vera++Report2checkstyleReport.perl file. Do I need to download it separately? Could I please have an explanation of how this is supposed to work?
Yes, I downloaded the file below and executed it; now it is working fine.
#! /usr/bin/env perl
# An vera++ to checkstyle XML report generator for
# Copyright (C) 2010 - 2011, Neticoa SAS France - Tous droits réservés.
# Author(s) : Franck Bonin, Neticoa SAS France.
#
# This Software is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 3 of the License, or (at your option) any later version.
#
# Sonar is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with This Software; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02
# _________________________________________________________________________
use strict;
##my $rapport = $ARGV[0];
##chomp $rapport;
##open (DATAFILE, "$rapport") || die("Can't open $rapport\n");
my ($file,$line,$rule,$comment);
print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
print "<checkstyle version=\"5.0\">\n";
my $lastfile = "";
while (<STDIN>)
{
    chomp $_;
    ## replace '(RULENumber)' with 'RULENumber:'
    $_ =~ s/(.*) \((.*)\) (.*)/\1\2:\3/g;
    ($file,$line,$rule,$comment) = split(":", $_);
    my $severity = "error";
    $severity = "warning";
    #if ($rule =~ m/G.*/) { $severity = "ignore"; }
    #if ($rule =~ m/F.*/) { $severity = "info"; }
    #if ($rule =~ m/L.*/) { $severity = "warning"; }
    #if ($rule =~ m/T.*/) { $severity = "error"; }
    if ($file ne $lastfile)
    {
        if ($lastfile ne "")
        {
            print "\t</file>\n";
        }
        print "\t<file name=\"$file\">\n";
        $lastfile = $file;
    }
    print "\t\t<error line=\"$line\" severity=\"$severity\" message=\"$comment\" source=\"$rule\"/>\n";
}
if ($lastfile ne "")
{
    print "\t</file>\n";
}
print "</checkstyle>\n";
I have set up a Scrapyd server on Amazon EC2. I have deployed the Scrapy project to the server successfully, but as soon as I schedule a spider run, it instantly runs and finishes the job without scraping a single item.
I have also set up another server and tried with that, but no luck.
curl http://my.ec2/schedule.json -d project=default -d spider=somespider
yum install -y gcc
yum install -y openssl-devel
yum install python3
yum install -y python3-devel.x86_64
pip3 install python-dateutil
pip3 install Scrapy==1.5.1
pip3 install scrapyd
pip3 install scrapyd_client
pip3 install dateparser
pip3 install pyyaml
pip3 install botocore
export PATH=$PATH:/usr/local/bin
yum install -y git
cd /home/ec2-user
echo "[scrapyd]
eggs_dir = eggs
logs_dir =
items_dir =
jobs_to_keep = 5
dbs_dir = dbs
max_proc = 0
max_proc_per_cpu = 4
finished_to_keep = 250
poll_interval = 5.0
bind_address = 0.0.0.0
http_port = 6800
debug = off
runner = scrapyd.runner
application = scrapyd.app.application
launcher = scrapyd.launcher.Launcher
webroot = scrapyd.website.Root
[services]
schedule.json = scrapyd.webservice.Schedule
cancel.json = scrapyd.webservice.Cancel
addversion.json = scrapyd.webservice.AddVersion
listprojects.json = scrapyd.webservice.ListProjects
listversions.json = scrapyd.webservice.ListVersions
listspiders.json = scrapyd.webservice.ListSpiders
delproject.json = scrapyd.webservice.DeleteProject
delversion.json = scrapyd.webservice.DeleteVersion
listjobs.json = scrapyd.webservice.ListJobs
daemonstatus.json = scrapyd.webservice.DaemonStatus" > scrapyd.conf
scrapyd
{"node_name": "my.ec2", "status": "ok", "pending": [], "running": [], "finished": [{"id": "abcd", "spider": "rishtml",
"start_time": "2019-09-14 19:33:42.667420", "end_time": "2019-09-14 19:33:43.563293"}]}
I am trying to install mecab and the ipadic dictionary as outlined here: http://taku910.github.io/mecab/#install-unix
I was able to successfully download and install mecab, and successfully downloaded ipadic, but I get stuck on the second line of the instructions below:
% tar zxfv mecab-ipadic-2.7.0-XXXX.tar.gz
% mecab-ipadic-2.7.0-XXXX
% ./configure
% make
% su
# make install
I am getting:
mecab-ipadic-2.7.0-20070801: command not found
I tried chmod -x on it and then tried again, but got the same result.
Any help is appreciated.
Edit (result of cat /etc/mecabrc)
;
; Configuration file of MeCab
;
; $Id: mecabrc.in,v 1.3 2006/05/29 15:36:08 taku-ku Exp $;
;
dicdir = /usr/local/lib/mecab/dic/mecab-ipadic-neologd
; userdic = /home/foo/bar/user.dic
; output-format-type = wakati
; input-buffer-size = 8192
; node-format = %m\n
; bos-format = %S\n
; eos-format = EOS\n
There is no reason to compile from source on Ubuntu 16.04.
Simply do:
$ sudo apt-get update
$ sudo apt install mecab mecab-ipadic-utf8
Then test it with
$ echo "日本語です" | mecab
日本 ニッポン ニッポン 日本 名詞-固有名詞-地名-国
語 ゴ ゴ 語 名詞-普通名詞-一般
です デス デス です 助動詞 助動詞-デス 終止形-一般
EOS
If things don't work, you may need to point /etc/mecabrc at the installed dictionary by setting dicdir=SOMEPATH_TO_IPADIC.
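A minimal sketch of doing that, assuming the Ubuntu mecab-ipadic-utf8 package (verify the actual path on your system):
# locate the directory the dictionary was installed to
dpkg -L mecab-ipadic-utf8 | grep dicrc
# then point /etc/mecabrc at that directory, e.g.:
# dicdir = /var/lib/mecab/dic/ipadic-utf8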
The commands are like:
docker run / stop / rm ...
which work in a terminal but cause a segmentation fault when run from a bash script.
I compared the environments of the bash script and the terminal; the diff is shown below.
2c2
< BASHOPTS=cmdhist:complete_fullquote:extquote:force_fignore:hostcomplete:interactive_comments:progcomp:promptvars:sourcepath
---
> BASHOPTS=cmdhist:complete_fullquote:expand_aliases:extquote:force_fignore:hostcomplete:interactive_comments:login_shell:progcomp:promptvars:sourcepath
7,8c7,8
< BASH_LINENO=([0]="0")
< BASH_SOURCE=([0]="./devRun.sh")
---
> BASH_LINENO=()
> BASH_SOURCE=()
10a11
> COLUMNS=180
14a16,18
> HISTFILE=/home/me/.bash_history
> HISTFILESIZE=500
> HISTSIZE=500
19a24
> LINES=49
22a28
> MAILCHECK=60
28c34,37
< PPID=12558
---
> PIPESTATUS=([0]="0")
> PPID=12553
> PS1='[\u#\h \W]\$ '
> PS2='> '
32,33c41,42
< SHELLOPTS=braceexpand:hashall:interactive-comments
< SHLVL=2
---
> SHELLOPTS=braceexpand:emacs:hashall:histexpand:history:interactive-comments:monitor
> SHLVL=1
42,52c51
< _=./devRun.sh
< dao ()
< {
< echo "Dao";
< docker run -dti -v /tmp/projStatic:/var/projStatic -v ${PWD}:/home --restart always -p 50000:50000 --name projDev daocloud.io/silencej/python3-uwsgi-alpine-docker sh;
< echo "Dao ends."
< }
< docker ()
< {
< docker run -dti -v ${PWD}:/home --restart always -p 50000:50000 --name projDev owen263/python3-uwsgi-alpine-docker sh
< }
---
> _=/tmp/env.log
UPDATE:
The info and version:
docker version
Client:
Version: 1.13.1
API version: 1.26
Go version: go1.7.5
Git commit: 092cba3727
Built: Sun Feb 12 02:40:56 2017
OS/Arch: linux/amd64
Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Go version: go1.7.5
Git commit: 092cba3727
Built: Sun Feb 12 02:40:56 2017
OS/Arch: linux/amd64
Experimental: false
docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 1
Server Version: 1.13.1
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: aa8187dbd3b7ad67d
You've redefined the docker command as a shell function, and quite possibly this is even a recursive definition. Remove this from your environment:
docker ()
{
docker run -dti -v ${PWD}:/home --restart always -p 50000:50000 --name projDev owen263/python3-uwsgi-alpine-docker sh
}
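If a wrapper like that is actually wanted, the recursion can be avoided by calling the real binary through bash's command builtin (or the function can simply be dropped for the current shell); a minimal sketch:
# remove the function from the current shell
unset -f docker
# or keep a wrapper that calls the real docker binary instead of itself
docker () {
    command docker run -dti -v "${PWD}":/home --restart always -p 50000:50000 --name projDev owen263/python3-uwsgi-alpine-docker sh
}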
When I launch this batch of commands to create and merge the deltas:
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf idx_product_delta --rotate
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf --merge idx_product_main idx_product_delta --rotate
In searchd.log I found this error, and the deltas are not merged into the main index:
[Fri Sep 25 15:34:42.549 2015] [ 2312] WARNING: rotating index 'idx_product_main': cur to old rename failed: rename D:\Sphinx\project\data\product.spa to D:\Sphinx\project\data\product.old.spa failed: Broken pipe
Console output is:
using config file 'D:\Sphinx\project\product.conf'...
merging index 'idx_product_delta' into index 'idx_product_main'...
read 7.2 of 7.2 MB, 100.0% done
merged 11.5 Kwords
merged in 0.127 sec
ERROR: index 'idx_product_main': failed to delete 'D:\Sphinx\project\data\product.new.spa': Permission denied
total 671 reads, 0.006 sec, 15.3 kb/call avg, 0.0 msec/call avg
total 36 writes, 0.004 sec, 277.8 kb/call avg, 0.1 msec/call avg
My product.conf is:
source src_product_main
{
type = mysql
sql_host = localhost
sql_user = root
sql_pass =
sql_db = database
sql_port = 3306 # optional, default is 3306
sql_query_pre = REPLACE INTO sphinx_index_meta(index_name, last_update) \
VALUES('idx_prodotti_main', current_timestamp())
sql_query_range = SELECT MIN(id),MAX(id) \
FROM product \
WHERE deleted = 0 AND visible= 1
sql_range_step = 1000
sql_query = SELECT id, text, last_update \
FROM product \
WHERE id>=$start AND id<=$end AND deleted = 0 AND visible = 1
sql_attr_timestamp = last_update
}
index idx_product_main
{
source = src_product_main
path = D:\Sphinx\project\data\product
ondisk_attrs = 1
stopwords = D:\Sphinx\project\stopwords.txt
min_word_len = 2
min_prefix_len = 0
min_infix_len = 3
ngram_len = 1
}
source src_product_delta : src_product_main
{
sql_query_range = SELECT MIN(id),MAX(id) \
FROM product \
WHERE deleted = 0 AND visible= 1
sql_range_step = 1000
sql_query = SELECT id, text, last_update \
FROM product \
WHERE id>=$start AND id<=$end AND deleted = 0 AND visible = 1
}
index idx_product_delta : idx_product_main
{
source = src_product_delta
path = D:\Sphinx\project\delta\product
ondisk_attrs = 1
stopwords = D:\Sphinx\project\stopwords.txt
min_word_len = 2
min_prefix_len = 0
min_infix_len = 3
ngram_len = 1
}
indexer
{
mem_limit = 128M
max_iosize = 1M
}
searchd
{
listen = 9312
listen = 9306:mysql41
log = D:\Sphinx\project\log\searchd.log
query_log = D:\Sphinx\project\log\query.log
read_timeout = 5
client_timeout = 300
max_children = 30
pid_file = D:\Sphinx\project\log\searchd.pid
seamless_rotate = 1
preopen_indexes = 0
unlink_old = 1
workers = threads # for RT to work
binlog_path = D:\Sphinx\project\data
}
I have also tried on Windows 7 and Windows 8, with both the stable 2.2.10 and the beta 2.3.1-id64-beta (r4926), with the same error.
The indexer runs from a scheduled task (Windows scheduler) as the SYSTEM user.
searchd runs as a service under the SYSTEM user.
The D:\Sphinx\project\data\ folder permissions give SYSTEM full control.
How can I solve this issue?
UPDATE for Eugene Soldatov's answer
I have also tried (first command without --rotate):
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf idx_product_delta
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf --merge idx_product_main idx_product_delta --rotate
but in the console output I found this error:
Sphinx 2.2.10-id64-release (2c212e0)
Copyright (c) 2001-2015, Andrew Aksyonoff
Copyright (c) 2008-2015, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file 'D:\Sphinx\project\product.conf'...
indexing index 'idx_prodotti_delta'...
FATAL: failed to lock D:\Sphinx\project\delta\prodotti.spl: No error, will not index. Try --rotate option.
Sphinx 2.2.10-id64-release (2c212e0)
Copyright (c) 2001-2015, Andrew Aksyonoff
Copyright (c) 2008-2015, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file 'D:\Sphinx\project\product.conf'...
merging index 'idx_prodotti_delta' into index 'idx_prodotti_main'...
read 7.2 of 7.2 MB, 100.0% done
merged 11.5 Kwords
merged in 0.214 sec
ERROR: index 'idx_prodotti_main': failed to delete 'D:\Sphinx\project\data\prodotti.new.spa': Permission denied
total 20136 reads, 0.071 sec, 30.9 kb/call avg, 0.0 msec/call avg
total 36 writes, 0.012 sec, 283.3 kb/call avg, 0.3 msec/call avg
In searchd.log I found this error:
[Wed Sep 30 09:09:29.371 2015] [ 4244] rotating index 'idx_prodotti_main': started
[Wed Sep 30 09:09:29.381 2015] [ 4244] WARNING: rotating index 'idx_prodotti_main': cur to old rename failed: rename D:\Sphinx\project\data\prodotti.spa to D:\Sphinx\project\data\prodotti.old.spa failed: Broken pipe
[Wed Sep 30 09:09:29.381 2015] [ 4244] rotating index: all indexes done
UPDATE 2
I also tried inserting a sleep between the two commands:
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf idx_product_delta --rotate
timeout /t 60
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf --merge idx_product_main idx_product_delta --rotate
Console output:
ERROR: index 'idx_prodotti_main': failed to delete 'D:\Sphinx\project\data\prodotti.new.spa': Permission denied
total 20137 reads, 0.072 sec, 30.9 kb/c
UPDATE 3: Issue solved
The issue was solved by the Sphinx guys here:
http://sphinxsearch.com/bugs/view.php?id=2335
The reason for this behavior is that the --rotate command is asynchronous, so when you run the second command:
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf --merge idx_product_main idx_product_delta --rotate
the first one may still be working with the index idx_product_delta:
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf idx_product_delta --rotate
and so the index is still locked.
If possible, remove the --rotate option from the first command.
UPDATE:
It seems that you do need the --rotate option in the first command, so you could measure the average time it takes to finish and insert a sleep between the two commands. For example, 30 seconds:
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf idx_product_delta --rotate
timeout /t 30
D:\Sphinx\bin\indexer.exe --config D:\Sphinx\project\product.conf --merge idx_product_main idx_product_delta --rotate