I have set up a Scrapyd server on Amazon EC2. I deployed the Scrapy project to the server successfully, but as soon as I schedule a spider run, the job starts and finishes almost instantly without scraping a single item.
I also set up another server and tried with that, but no luck.
curl http://my.ec2/schedule.json -d project=default -d spider=somespider
yum install -y gcc
yum install -y openssl-devel
yum install python3
yum install -y python3-devel.x86_64
pip3 install python-dateutil
pip3 install Scrapy==1.5.1
pip3 install scrapyd
pip3 install scrapyd_client
pip3 install dateparser
pip3 install pyyaml
pip3 install botocore
export PATH=$PATH:/usr/local/bin
yum install -y git
cd /home/ec2-user
echo "[scrapyd]
eggs_dir = eggs
logs_dir =
items_dir =
jobs_to_keep = 5
dbs_dir = dbs
max_proc = 0
max_proc_per_cpu = 4
finished_to_keep = 250
poll_interval = 5.0
bind_address = 0.0.0.0
http_port = 6800
debug = off
runner = scrapyd.runner
application = scrapyd.app.application
launcher = scrapyd.launcher.Launcher
webroot = scrapyd.website.Root
[services]
schedule.json = scrapyd.webservice.Schedule
cancel.json = scrapyd.webservice.Cancel
addversion.json = scrapyd.webservice.AddVersion
listprojects.json = scrapyd.webservice.ListProjects
listversions.json = scrapyd.webservice.ListVersions
listspiders.json = scrapyd.webservice.ListSpiders
delproject.json = scrapyd.webservice.DeleteProject
delversion.json = scrapyd.webservice.DeleteVersion
listjobs.json = scrapyd.webservice.ListJobs
daemonstatus.json = scrapyd.webservice.DaemonStatus" > scrapyd.conf
scrapyd
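One detail in the configuration above may matter for debugging: logs_dir and items_dir are both empty, and as far as I know Scrapyd treats an empty logs_dir as "do not store Scrapy logs", so there would be no per-job log to explain why the spider stopped. The following is a minimal sketch that flags such empty directory options; the excerpt is copied from the config above, and the helper name empty_dir_options is made up for illustration.

```python
import configparser

# Excerpt of the scrapyd.conf shown above (only the directory options).
SCRAPYD_CONF_EXCERPT = """
[scrapyd]
eggs_dir = eggs
logs_dir =
items_dir =
dbs_dir = dbs
"""


def empty_dir_options(conf_text):
    """Return the *_dir options in the [scrapyd] section that have no value."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = parser["scrapyd"]
    return sorted(
        key for key in section
        if key.endswith("_dir") and not section[key].strip()
    )


if __name__ == "__main__":
    print(empty_dir_options(SCRAPYD_CONF_EXCERPT))
```

If logs_dir shows up in the result, setting it to a directory such as logs and rescheduling the spider should leave a log file to inspect.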
{"node_name": "my.ec2", "status": "ok", "pending": [], "running": [], "finished": [{"id": "abcd", "spider": "rishtml",
"start_time": "2019-09-14 19:33:42.667420", "end_time": "2019-09-14 19:33:43.563293"}]}
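To make "instantly" concrete, the timestamps in the response above can be subtracted. This is a small sketch that parses that response and reports how long each finished job ran; the function name job_durations is hypothetical, and the timestamp format is assumed from the sample shown.

```python
import json
from datetime import datetime

# The response shown above, joined into a single JSON string.
LISTJOBS_RESPONSE = (
    '{"node_name": "my.ec2", "status": "ok", "pending": [], "running": [], '
    '"finished": [{"id": "abcd", "spider": "rishtml", '
    '"start_time": "2019-09-14 19:33:42.667420", '
    '"end_time": "2019-09-14 19:33:43.563293"}]}'
)

# Timestamp layout as it appears in the sample response.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def job_durations(response_text):
    """Map each finished job id to its run time in seconds."""
    data = json.loads(response_text)
    durations = {}
    for job in data.get("finished", []):
        start = datetime.strptime(job["start_time"], TIME_FORMAT)
        end = datetime.strptime(job["end_time"], TIME_FORMAT)
        durations[job["id"]] = (end - start).total_seconds()
    return durations


if __name__ == "__main__":
    for job_id, seconds in job_durations(LISTJOBS_RESPONSE).items():
        print(f"{job_id}: {seconds:.3f}s")
```

For the job above this gives a run time of under one second, which suggests the spider process exited during startup rather than crawling anything.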
Hi, I'm stuck with my user_data script: the EC2 instance does not execute it and I don't understand why.
In Terraform I have
resource "aws_instance" "ec2_b" {
ami = "ami-0c2b8ca1dad447f8a"
instance_type = "t2.micro"
subnet_id = aws_subnet.private_b.id
vpc_security_group_ids = [aws_security_group.main.id]
tags = {
Name = "my-ec2-b"
}
key_name = "vockey"
user_data = file("./user_data.sh")
}
with the following script
#!/bin/bash
sudo echo "test" > /home/ec2-user/test.txt
sudo yum update -y
sudo yum install httpd -y
sudo systemctl start httpd
sudo systemctl enable httpd
sudo echo "Hello, World" | sudo tee /var/www/html/index.html
In the AWS console I can see that the script is there.
However, when I SSH into the instance I can see that the script has not been executed.
I am trying to run Laravel with spatie/laravel-medialibrary under Docker (PHP 8.1.4). When I try to read the properties of an image with this code:
if ( ! empty($CMSItemMedia) and File::exists($CMSItemMedia->getPath())) {
$CMSItemImage['url'] = $CMSItemMedia->getUrl();
\Log::info( varDump($CMSItemMedia->getUrl(), ' -1 $CMSItemMedia->getUrl()::') );
$imageInstance = Image::load($CMSItemMedia->getUrl());
\Log::info( varDump($imageInstance, ' -1 $imageInstance::') );
$CMSItemImage['width'] = $imageInstance->getWidth();
I get the following log lines and error:
[2022-05-31 13:37:20] local.INFO: scalar => (string) : -1 $CMSItemMedia->getUrl():: : http://127.0.0.1:8088/storage/currency_app/21/our_rules.png
[2022-05-31 13:37:20] local.INFO: (Object of Spatie\Image\Image) : -1 $imageInstance:: : Array
(
[manipulations] => Spatie\Image\Manipulations Object
(
[manipulationSequence:protected] => Spatie\Image\ManipulationSequence Object
(
[groups:protected] => Array
(
[0] => Array
(
)
)
)
)
[imageDriver] => gd
[temporaryDirectory] =>
[optimizerChain] =>
[pathToImage] => http://127.0.0.1:8088/storage/currency_app/21/our_rules.png
)
[2022-05-31 13:37:20] local.ERROR: Unable to init from given url (http://127.0.0.1:8088/storage/currency_app/21/our_rules.png). {"exception":"[object] (Intervention\\Image\\Exception\\NotReadableException(code: 0): Unable to init from given url (http://127.0.0.1:8088/storage/currency_app/21/our_rules.png). at /var/www/BiCurrencies_DOCKER_ROOT/vendor/intervention/image/src/Intervention/Image/AbstractDecoder.php:85)
[stacktrace]
If I open the file URL in a browser
http://127.0.0.1:8088/storage/currency_app/21/our_rules.png
the file opens fine.
In the phpinfo output I see that gd is installed in my Docker image:
GD Support enabled
GD Version bundled (2.1.0 compatible)
FreeType Support enabled
FreeType Linkage with freetype
FreeType Version 2.10.4
In my app I have:
"spatie/laravel-medialibrary": "^10.0.0",
"intervention/image": "^2.7",
and in .env:
APP_URL=http://127.0.0.1:8088
so my app runs at http://127.0.0.1:8088.
In my Dockerfile.yml file I have gd installed:
FROM php:8.1.4-apache
RUN apt-get update && \
apt-get install --assume-yes --no-install-recommends --quiet \
python \
libfreetype6-dev \
libwebp-dev \
libjpeg62-turbo-dev \
libpng-dev \
libzip-dev \
nano \
mc \
git-core \
libmagickwand-dev \
curl \
build-essential \
libnotify-bin \
openssl \
libssl-dev \
libgmp-dev \
libldap2-dev \
netcat \
locate \
# composer \
&& git clone https://github.com/nodejs/node.git && \
cd node \
&& git checkout v14.18.0 \
&& ./configure \
&& make \
&& make install
RUN pecl install imagick \
&& docker-php-ext-enable imagick
RUN npm install cross-env
RUN npm install -g yarn
RUN docker-php-ext-configure gd --with-freetype --with-jpeg --with-webp --with-jpeg
RUN docker-php-ext-install gd
# Install Composer
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
RUN docker-php-ext-install pdo pdo_mysql zip gmp bcmath pcntl ldap sysvmsg exif \
&& a2enmod rewrite
RUN apt-get install -y grep mlocate
COPY virtualhost.conf /etc/apache2/sites-enabled/000-default.conf
I do not have this problem on my Kubuntu 20 host.
What is wrong? Did I miss some Docker options?
Thanks!
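One thing that may be worth ruling out, as an assumption rather than something the logs above confirm: the URL handed to Image::load points at 127.0.0.1, and inside a container the loopback address refers to the container itself, not to the host where the browser runs. The dump above also shows the URL stored in a field named pathToImage, while the existence check in the snippet uses getPath(). The sketch below only illustrates the loopback check; it is written in Python for brevity, and the helper name points_at_loopback is hypothetical.

```python
import ipaddress
from urllib.parse import urlparse


def points_at_loopback(url):
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (an ordinary host name), so not loopback by this test.
        return False


if __name__ == "__main__":
    url = "http://127.0.0.1:8088/storage/currency_app/21/our_rules.png"
    print(points_at_loopback(url))
```

If the check is true for the media URL, fetching that URL from inside the container (for example with curl) would show whether PHP can reach it at all.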
I have a Vagrant setup where I want to change how Ansible is installed on RHEL 8 by the ansible_local provisioner, because the default installation fails: packages are missing due to the repository setup. I want to install it with pip3 instead. I know ansible_local has a pip install mode, but it still fails afterwards because a repository is missing, so I worked out my own fix.
In my Vagrantfile I have these lines:
.....
node.vm.provision "ansible_local" do |ansible|
ansible.playbook = ansible_playbook
ansible.verbose = true
ansible.install = true
## This alone does not solve my problem, because pip requires other packages. See the *.rb file below
if i == 100
ansible.install_mode = "pip"
ansible.version = "2.9"
#ansible.ansible_rpm_install = Foo
end
end
...
.....
But because pip still fails, I ended up editing /opt/vagrant/embedded/gems/2.2.9/gems/vagrant-2.2.9/plugins/provisioners/ansible/cap/guest/redhat/ansible_install.rb as follows:
cap/guest/redhat/ansible_install.rb
require_relative "../facts"
require_relative "../pip/pip"
module VagrantPlugins
module Ansible
module Cap
module Guest
module RedHat
module AnsibleInstall
def self.ansible_install(machine, install_mode, ansible_version, pip_args, pip_install_cmd = "")
case install_mode
when :pip
pip_setup machine, pip_install_cmd
Pip::pip_install machine, "ansible", ansible_version, pip_args, true
when :pip_args_only
pip_setup machine, pip_install_cmd
Pip::pip_install machine, "", "", pip_args, false
else
# My added part as a quick fix/solution
if machine.config.vm.box == "generic/rhel8"
ansible_rpm_install_rhel8 machine
else
ansible_rpm_install machine
end
end
end
private
def self.ansible_rpm_install(machine)
rpm_package_manager = Facts::rpm_package_manager(machine)
epel = machine.communicate.execute "#{rpm_package_manager} repolist epel | grep -q epel", error_check: false
if epel != 0
machine.communicate.sudo 'sudo rpm -i https://dl.fedoraproject.org/pub/epel/epel-release-latest-`rpm -E %dist | sed -n \'s/.*el\([0-9]\).*/\1/p\'`.noarch.rpm'
end
machine.communicate.sudo "#{rpm_package_manager} -y --enablerepo=epel install ansible"
end
def self.pip_setup(machine, pip_install_cmd = "")
rpm_package_manager = Facts::rpm_package_manager(machine)
machine.communicate.sudo("#{rpm_package_manager} -y install curl gcc libffi-devel openssl-devel python-crypto python-devel python-setuptools")
Pip::get_pip machine, pip_install_cmd
end
# My added part as a quick fix/solution
def self.ansible_rpm_install_rhel8(machine)
rpm_package_manager = Facts::rpm_package_manager(machine)
epel = machine.communicate.execute "#{rpm_package_manager} repolist epel | grep -q epel", error_check: false
if epel != 0
machine.communicate.sudo 'sudo rpm -i https://dl.fedoraproject.org/pub/epel/epel-release-latest-`rpm -E %dist | sed -n \'s/.*el\([0-9]\).*/\1/p\'`.noarch.rpm'
end
machine.communicate.sudo "dnf -y update; dnf -y install python3 python3-pip; pip3 install ansible"
end
end
end
end
end
end
end
So I added a method, self.ansible_rpm_install_rhel8(machine), and a conditional branch that calls it when the box name is "generic/rhel8". This works perfectly on my end. However, I don't like this approach, because my change to /opt/vagrant/embedded/gems/2.2.9/gems/vagrant-2.2.9/plugins/provisioners/ansible/cap/guest/redhat/ansible_install.rb will be lost whenever that file changes or Vagrant is upgraded. Is there a better way to do this entirely inside the Vagrantfile, perhaps by overriding or extending the class itself? I have no idea how to do this.
Any ideas are welcome.
Thank you!
You can define multiple provisioning sections one after another; they are executed in the order in which you define them in your Vagrantfile.
Instead of patching the plugin, prepare the machine for the Ansible run beforehand.
Before the ansible_local provisioner, run a shell provisioner as follows:
Vagrant.configure("2") do |node|
# ...
node.vm.provision :shell, path: "fix_repo_and_add_packages.sh"
node.vm.provision "ansible_local" do |ansible|
ansible.playbook = ansible_playbook
ansible.verbose = true
ansible.install = true
## This alone does not solve the problem, because pip requires other packages
if i == 100
ansible.install_mode = "pip"
ansible.version = "2.9"
#ansible.ansible_rpm_install = Foo
end
end
end
Your fix_repo_and_add_packages.sh sets up the missing repos, and you can install packages there as well.
For more information about shell provisioning, see the Vagrant documentation.
I am trying to install mecab and the ipadic dictionary as outlined here: http://taku910.github.io/mecab/#install-unix
I was able to download and install mecab and to download ipadic, but I get stuck on the second line of the instructions below:
% tar zxfv mecab-ipadic-2.7.0-XXXX.tar.gz
% mecab-ipadic-2.7.0-XXXX
% ./configure
% make
% su
# make install
I am getting:
mecab-ipadic-2.7.0-20070801: command not found
I tried chmod -x on it and ran it again, but got the same result.
Any help is appreciated.
Edit (result of cat /etc/mecabrc)
;
; Configuration file of MeCab
;
; $Id: mecabrc.in,v 1.3 2006/05/29 15:36:08 taku-ku Exp $;
;
dicdir = /usr/local/lib/mecab/dic/mecab-ipadic-neologd
; userdic = /home/foo/bar/user.dic
; output-format-type = wakati
; input-buffer-size = 8192
; node-format = %m\n
; bos-format = %S\n
; eos-format = EOS\n
There is no reason to compile from source on Ubuntu 16.04.
Simply do:
$ sudo apt-get update
$ sudo apt install mecab mecab-ipadic-utf8
Then test it with
$ echo "日本語です" | mecab
日本 ニッポン ニッポン 日本 名詞-固有名詞-地名-国
語 ゴ ゴ 語 名詞-普通名詞-一般
です デス デス です 助動詞 助動詞-デス 終止形-一般
EOS
If things don't work, you may need to link /etc/mecabrc to the installed dictionary by setting dicdir=SOMEPATH_TO_IPADIC
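To see which dictionary a given mecabrc currently points at, the dicdir entry can be read programmatically. This is a minimal sketch that assumes the file format shown in the question (lines starting with ; are comments, settings are key = value); the function name read_dicdir is hypothetical, and the sample is an excerpt of the file shown above.

```python
# Excerpt of the /etc/mecabrc contents shown in the question.
SAMPLE_MECABRC = """;
; Configuration file of MeCab
;
dicdir = /usr/local/lib/mecab/dic/mecab-ipadic-neologd
; userdic = /home/foo/bar/user.dic
; output-format-type = wakati
"""


def read_dicdir(mecabrc_text):
    """Return the active dicdir value, or None if it is missing or commented out."""
    for raw_line in mecabrc_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        key, separator, value = line.partition("=")
        if separator and key.strip() == "dicdir":
            return value.strip()
    return None


if __name__ == "__main__":
    print(read_dicdir(SAMPLE_MECABRC))
```

In the file from the question, dicdir points at a mecab-ipadic-neologd directory, so it is worth confirming that this directory actually exists before changing anything else.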
ANSIBLE VERSION
ansible --version
ansible 2.0.2.0
config file =
configured module search path = Default w/o overrides
OS / ENVIRONMENT
MacOS 10.9
SUMMARY
I want to run a shell command that builds a Docker image asynchronously. The build succeeds, but this error occurs:
fatal: [192.168.0.1]: FAILED! => {"changed": false, "failed": true, "msg": "The async task did not return valid JSON: No JSON object could be decoded"}
STEPS TO REPRODUCE
I manually executed the python snippet ansible executes:
/root/.ansible/tmp/ansible-tmp-1465081459.8-258691745536723/command
/root/.ansible/tmp/ansible-tmp-1465081459.8-258691745536723/arguments
The result is
{
"changed": true,
"end": "2016-06-05 07:12:38.323820",
"stdout": "Sending build context to Docker daemon 12.8 kB\r\r
Step 1 : FROM centos:6.6
---> 87dd25f5ba5c
Step 2 : RUN yum groupinstall -y development
---> Using cache
---> dc1b778350d5
Step 3 : RUN yum install -y vim bzip2-devel hostname tar zlib-dev
---> Using cache
---> d4538794340e
Step 4 : RUN curl https://bootstrap.pypa.io/get-pip.py | python -
---> Using cache
---> cd3cd13b11db
Step 5 : RUN pip install envtpl
---> Using cache
---> 574514eb6d91
Step 6 : WORKDIR /tmp
---> Using cache
---> 3750674900e6
Step 7 : ENV SCALA_VERSION 2.11
---> Using cache
---> 8316460b0264
Step 8 : ENV KAFKA_VERSION 0.9.0.1
---> Using cache
---> aa8f9cf12288
Step 9 : ENV KAFKA_HOME /usr/share/kafka_\"$SCALA_VERSION\"-\"$KAFKA_VERSION\"
---> Using cache
---> 38c1243614ac
Step 10 : RUN yum install -y tar libcurl libcurl-devel rrdtool rrdtool-devel perl-devel libgcrypt-devel gcc make gcc-c++ yajl-devel libxml2-devel libxml-2.0 java-1.7.0-openjdk java-1.7.0-openjdk-devel
---> Using cache
---> 8350d7ba311a
Step 11 : RUN curl -L -O http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/\"$KAFKA_VERSION\"/kafka_\"$SCALA_VERSION\"-\"$KAFKA_VERSION\".tgz && tar xfz kafka_\"$SCALA_VERSION\"-\"$KAFKA_VERSION\".tgz -C /usr/share/
---> Using cache
---> 485916280855
Step 12 : RUN mkdir -p /usr/local/bin/
---> Using cache
---> 2a3427511ef0
Step 13 : ADD start-kafka.sh /usr/local/bin/start-kafka.sh
---> Using cache
---> 6cd1ae0c2abc
Step 14 : RUN chmod +x /usr/local/bin/start-kafka.sh
---> Using cache
---> 4ba3e949839e
Step 15 : ADD zookeeper.properties.j2 \"$KAFKA_HOME\"/config/
---> Using cache
---> db8bfa4cce0d
Step 16 : ADD server.properties.j2 \"$KAFKA_HOME\"/config/
---> Using cache
---> b2d93fd06892
Step 17 : EXPOSE 2181 9092
---> Using cache
---> 25f1e2136c2c
Step 18 : RUN mv /etc/localtime /etc/localtime.bak
---> Using cache
---> 1cf4fc533f3a
Step 19 : RUN ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
---> Using cache
---> b6bae747bc32
Step 20 : CMD /usr/local/bin/start-kafka.sh
---> Using cache
---> 9920101ececa
Successfully built 9920101ececa",
"cmd": "docker build -t elk/kafka /tmp/dockerfile/kafka",
"rc": 0,
"start": "2016-06-05 07:12:38.253244",
"stderr": "",
"delta": "0:00:00.070576",
"invocation": {
"module_args": {"warn": true,
"executable": null,
"chdir": null,
"_raw_params": "docker build -t elk/kafka /tmp/dockerfile/kafka",
"removes": null,
"creates": null,
"_uses_shell": true
}
},
"warnings": []
}
The task I execute is
- name: build or check kafka docker image
shell: docker build -t {{ image_name }} /tmp/dockerfile/kafka
async: 2400 # wait seconds
poll: 5 # poll wait seconds
After some struggle I finally solved this problem. I added code to print the result at line 445 of executor/task_executor.py:
return dict(failed=True, msg=u"The async task did not return valid JSON: error:%s result:%s" % (to_unicode(e), result))
Then I found some text in the result that is not valid JSON: /etc/profile.d/lang.sh: line 19: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory\\r\\n
After I fixed this locale problem with the help of this link, the error disappeared.
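The effect of that stray warning can be reproduced outside Ansible. This is a small sketch showing that a JSON parser rejects output when a shell warning is printed in front of the JSON document; the shortened module result and both helper names are made up for illustration, and parse_from_first_brace is not how Ansible itself handles the output.

```python
import json

# The locale warning found in the result, followed by CRLF.
LOCALE_WARNING = (
    "/etc/profile.d/lang.sh: line 19: warning: setlocale: LC_CTYPE: "
    "cannot change locale (UTF-8): No such file or directory\r\n"
)

# A shortened, hypothetical stand-in for the module's JSON result.
MODULE_RESULT = '{"changed": true, "rc": 0, "stderr": ""}'


def try_parse(text):
    """Return the decoded JSON value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:
        return None


def parse_from_first_brace(text):
    """Illustration only: skip anything printed before the first '{'."""
    start = text.find("{")
    return try_parse(text[start:]) if start != -1 else None


if __name__ == "__main__":
    polluted = LOCALE_WARNING + MODULE_RESULT
    print(try_parse(polluted))               # warning in front: parsing fails
    print(parse_from_first_brace(polluted))  # JSON part alone parses fine
```

Fixing the locale on the target host, as described above, removes the warning at its source, which is more robust than trying to strip it from the output afterwards.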