Which settings are required to run Ansible on AWS EC2

I've configured a new Amazon EC2 Ubuntu instance and added my public SSH key to the server:
cat ~/.ssh/id_rsa.pub | ssh -i aws.pem ubuntu@<ec2publicDNS> "cat - >> ~/.ssh/authorized_keys2"
I'm now able to access the instance with
ssh ubuntu@<ec2publicIP>
So I added the following to my /etc/ansible/hosts
[webservers]
ubuntu@<ec2publicIP>
When I run ANSIBLE_DEBUG=1 ansible all -m ping, I receive the following:
9264 1486122587.48735: starting run
9264 1486122587.58557: Loading CacheModule 'memory' from /usr/local/lib/python2.7/site-packages/ansible/plugins/cache/memory.py
9264 1486122587.62315: Loading CallbackModule 'minimal' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/minimal.py
9264 1486122587.62373: Loading CallbackModule 'actionable' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/actionable.py (found_in_cache=False, class_only=True)
9264 1486122587.62388: Loading CallbackModule 'context_demo' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/context_demo.py (found_in_cache=False, class_only=True)
9264 1486122587.62401: Loading CallbackModule 'debug' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/debug.py (found_in_cache=False, class_only=True)
9264 1486122587.62420: Loading CallbackModule 'default' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/default.py (found_in_cache=False, class_only=True)
9264 1486122587.62450: Loading CallbackModule 'foreman' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/foreman.py (found_in_cache=False, class_only=True)
9264 1486122587.63003: Loading CallbackModule 'hipchat' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/hipchat.py (found_in_cache=False, class_only=True)
9264 1486122587.63048: Loading CallbackModule 'jabber' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/jabber.py (found_in_cache=False, class_only=True)
9264 1486122587.63064: Loading CallbackModule 'json' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/json.py (found_in_cache=False, class_only=True)
9264 1486122587.63096: Loading CallbackModule 'junit' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/junit.py (found_in_cache=False, class_only=True)
9264 1486122587.63121: Loading CallbackModule 'log_plays' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/log_plays.py (found_in_cache=False, class_only=True)
9264 1486122587.63173: Loading CallbackModule 'logentries' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/logentries.py (found_in_cache=False, class_only=True)
9264 1486122587.63266: Loading CallbackModule 'mail' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/mail.py (found_in_cache=False, class_only=True)
9264 1486122587.63273: Loading CallbackModule 'minimal' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/minimal.py (found_in_cache=False, class_only=True)
9264 1486122587.63288: Loading CallbackModule 'oneline' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/oneline.py (found_in_cache=False, class_only=True)
9264 1486122587.63304: Loading CallbackModule 'osx_say' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/osx_say.py (found_in_cache=False, class_only=True)
9264 1486122587.63321: Loading CallbackModule 'profile_tasks' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/profile_tasks.py (found_in_cache=False, class_only=True)
9264 1486122587.63648: Loading CallbackModule 'skippy' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/skippy.py (found_in_cache=False, class_only=True)
9264 1486122587.63678: Loading CallbackModule 'slack' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/slack.py (found_in_cache=False, class_only=True)
9264 1486122587.63755: Loading CallbackModule 'syslog_json' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/syslog_json.py (found_in_cache=False, class_only=True)
9264 1486122587.63772: Loading CallbackModule 'timer' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/timer.py (found_in_cache=False, class_only=True)
9264 1486122587.63789: Loading CallbackModule 'tree' from /usr/local/lib/python2.7/site-packages/ansible/plugins/callback/tree.py (found_in_cache=False, class_only=True)
9264 1486122587.63795: in VariableManager get_vars()
9264 1486122587.63812: done with get_vars()
9264 1486122587.64662: Loading StrategyModule 'linear' from /usr/local/lib/python2.7/site-packages/ansible/plugins/strategy/linear.py
9264 1486122587.64819: getting the remaining hosts for this loop
9264 1486122587.64824: done getting the remaining hosts for this loop
9264 1486122587.64832: building list of next tasks for hosts
9264 1486122587.64838: getting the next task for host ubuntu#<ec2publicIP>
9264 1486122587.64846: done getting next task for host ubuntu#<ec2publicIP>
9264 1486122587.64852: ^ task is: TASK: meta (flush_handlers)
9264 1486122587.64859: ^ state is: HOST STATE: block=1, task=1, rescue=0, always=0, role=None, run_state=ITERATING_TASKS, fail_state=FAILED_NONE, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
9264 1486122587.64863: done building task lists
9264 1486122587.64868: counting tasks in each state of execution
9264 1486122587.64872: done counting tasks in each state of execution:
num_setups: 0
num_tasks: 1
num_rescue: 0
num_always: 0
9264 1486122587.64876: advancing hosts in ITERATING_TASKS
9264 1486122587.64881: starting to advance hosts
9264 1486122587.64885: getting the next task for host ubuntu#<ec2publicIP>
9264 1486122587.64892: done getting next task for host ubuntu#<ec2publicIP>
9264 1486122587.64896: ^ task is: TASK: meta (flush_handlers)
9264 1486122587.64901: ^ state is: HOST STATE: block=1, task=1, rescue=0, always=0, role=None, run_state=ITERATING_TASKS, fail_state=FAILED_NONE, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
9264 1486122587.64907: done advancing hosts to next task
9264 1486122587.65149: done queuing things up, now waiting for results queue to drain
9264 1486122587.65157: results queue empty
9264 1486122587.65161: checking for any_errors_fatal
9264 1486122587.65164: done checking for any_errors_fatal
9264 1486122587.65168: checking for max_fail_percentage
9264 1486122587.65171: done checking for max_fail_percentage
9264 1486122587.65175: checking to see if all hosts have failed and the running result is not ok
9264 1486122587.65180: done checking to see if all hosts have failed
9264 1486122587.65186: getting the remaining hosts for this loop
9264 1486122587.65190: done getting the remaining hosts for this loop
9264 1486122587.65198: building list of next tasks for hosts
9264 1486122587.65202: getting the next task for host ubuntu#<ec2publicIP>
9264 1486122587.65208: done getting next task for host ubuntu#<ec2publicIP>
9264 1486122587.65212: ^ task is: TASK: ping
9264 1486122587.65216: ^ state is: HOST STATE: block=2, task=1, rescue=0, always=0, role=None, run_state=ITERATING_TASKS, fail_state=FAILED_NONE, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
9264 1486122587.65220: done building task lists
9264 1486122587.65224: counting tasks in each state of execution
9264 1486122587.65228: done counting tasks in each state of execution:
num_setups: 0
num_tasks: 1
num_rescue: 0
num_always: 0
9264 1486122587.65232: advancing hosts in ITERATING_TASKS
9264 1486122587.65235: starting to advance hosts
9264 1486122587.65238: getting the next task for host ubuntu#<ec2publicIP>
9264 1486122587.65244: done getting next task for host ubuntu#<ec2publicIP>
9264 1486122587.65248: ^ task is: TASK: ping
9264 1486122587.65252: ^ state is: HOST STATE: block=2, task=1, rescue=0, always=0, role=None, run_state=ITERATING_TASKS, fail_state=FAILED_NONE, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
9264 1486122587.65256: done advancing hosts to next task
9264 1486122587.65263: getting variables
9264 1486122587.65269: in VariableManager get_vars()
9264 1486122587.65297: done with get_vars()
9264 1486122587.65308: done getting variables
9264 1486122587.65313: sending task start callback, copying the task so we can template it temporarily
9264 1486122587.65318: done copying, going to template now
9264 1486122587.65324: done templating
9264 1486122587.65329: here goes the callback...
9264 1486122587.65336: sending task start callback
9264 1486122587.65342: entering _queue_task() for ubuntu#<ec2publicIP>/ping
9264 1486122587.65349: Creating lock for ping
9264 1486122587.65468: worker is 1 (out of 1 available)
9264 1486122587.65510: exiting _queue_task() for ubuntu#<ec2publicIP>/ping
9264 1486122587.65575: done queuing things up, now waiting for results queue to drain
9264 1486122587.65582: waiting for pending results...
9267 1486122587.65922: running TaskExecutor() for ubuntu#<ec2publicIP>/TASK: ping
9267 1486122587.65987: in run()
9267 1486122587.66061: calling self._execute()
9267 1486122587.67436: Loading Connection 'ssh' from /usr/local/lib/python2.7/site-packages/ansible/plugins/connection/ssh.py
9267 1486122587.67554: Loading ShellModule 'csh' from /usr/local/lib/python2.7/site-packages/ansible/plugins/shell/csh.py
9267 1486122587.67589: Loading ShellModule 'fish' from /usr/local/lib/python2.7/site-packages/ansible/plugins/shell/fish.py
9267 1486122587.67632: Loading ShellModule 'powershell' from /usr/local/lib/python2.7/site-packages/ansible/plugins/shell/powershell.py
9267 1486122587.67649: Loading ShellModule 'sh' from /usr/local/lib/python2.7/site-packages/ansible/plugins/shell/sh.py
9267 1486122587.67672: Loading ShellModule 'sh' from /usr/local/lib/python2.7/site-packages/ansible/plugins/shell/sh.py (found_in_cache=True, class_only=False)
9267 1486122587.67693: in VariableManager get_vars()
9267 1486122587.67736: done with get_vars()
9267 1486122587.67764: Loading ActionModule 'normal' from /usr/local/lib/python2.7/site-packages/ansible/plugins/action/normal.py
9267 1486122587.67774: starting attempt loop
9267 1486122587.67783: running the handler
9267 1486122587.67827: ANSIBALLZ: Using lock for ping
9267 1486122587.67831: ANSIBALLZ: Acquiring lock
9267 1486122587.67837: ANSIBALLZ: Lock acquired: 4559072080
9267 1486122587.67841: ANSIBALLZ: Creating module
9267 1486122587.75433: ANSIBALLZ: Writing module
9267 1486122587.75461: ANSIBALLZ: Renaming module
9267 1486122587.75472: ANSIBALLZ: Done creating module
9267 1486122587.75528: _low_level_execute_command(): starting
9267 1486122587.75537: _low_level_execute_command(): executing: /bin/sh -c '( umask 77 && mkdir -p "` echo ~/.ansible/tmp/ansible-tmp-1486122587.76-200107609248376 `" && echo ansible-tmp-1486122587.76-200107609248376="` echo ~/.ansible/tmp/ansible-tmp-1486122587.76-200107609248376 `" ) && sleep 0'
9267 1486122590.52707: stdout chunk (state=2):
>>>ansible-tmp-1486122587.76-200107609248376=/home/ubuntu/.ansible/tmp/ansible-tmp-1486122587.76-200107609248376
<<<
9267 1486122590.52765: stdout chunk (state=3):
>>><<<
9267 1486122590.52775: stderr chunk (state=3):
>>><<<
9267 1486122590.52795: _low_level_execute_command() done: rc=0, stdout=ansible-tmp-1486122587.76-200107609248376=/home/ubuntu/.ansible/tmp/ansible-tmp-1486122587.76-200107609248376
, stderr=
9267 1486122590.52808: transferring module to remote /home/ubuntu/.ansible/tmp/ansible-tmp-1486122587.76-200107609248376/ping.py
9267 1486122590.53337: Sending initial data
9267 1486122590.53347: Sent initial data (139 bytes)
9267 1486122590.54550: stderr chunk (state=3):
>>>ssh: Could not resolve hostname <ec2publicIP>]: nodename nor servname provided, or not known
<<<
9267 1486122590.54583: stderr chunk (state=3):
>>>Connection closed
<<<
9267 1486122590.54612: stdout chunk (state=3):
>>><<<
9267 1486122590.54618: stderr chunk (state=3):
>>><<<
[WARNING]: sftp transfer mechanism failed on [ubuntu#<ec2publicIP>]. Use ANSIBLE_DEBUG=1 to see detailed information
9267 1486122590.54711:
9267 1486122590.54718: ssh: Could not resolve hostname <ec2publicIP>]: nodename nor servname provided, or not known
Connection closed
9267 1486122590.56466: stderr chunk (state=2):
>>>ssh: Could not resolve hostname <ec2publicIP>]: nodename nor servname provided, or not known
<<<
9267 1486122590.56501: stderr chunk (state=3):
>>>lost connection
<<<
9267 1486122590.56525: stdout chunk (state=3):
>>><<<
9267 1486122590.56534: stderr chunk (state=3):
>>><<<
[WARNING]: scp transfer mechanism failed on [ubuntu#<ec2publicIP>]. Use ANSIBLE_DEBUG=1 to see detailed information
9267 1486122590.56573:
9267 1486122590.56577: ssh: Could not resolve hostname <ec2publicIP>]: nodename nor servname provided, or not known
lost connection
9267 1486122590.56621: done running TaskExecutor() for ubuntu#<ec2publicIP>/TASK: ping
9267 1486122590.56628: sending task result
9267 1486122590.56669: done sending task result
9267 1486122590.56674: WORKER PROCESS EXITING
9264 1486122590.56785: in VariableManager get_vars()
9264 1486122590.56925: done with get_vars()
9264 1486122590.56939: marking ubuntu#<ec2publicIP> as failed
9264 1486122590.56947: marking host ubuntu#<ec2publicIP> failed, current state: HOST STATE: block=2, task=1, rescue=0, always=0, role=None, run_state=ITERATING_TASKS, fail_state=FAILED_NONE, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
9264 1486122590.56952: ^ failed state is now: HOST STATE: block=2, task=1, rescue=0, always=0, role=None, run_state=ITERATING_COMPLETE, fail_state=FAILED_TASKS, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
9264 1486122590.57203: getting the next task for host ubuntu#<ec2publicIP>
9264 1486122590.57211: host ubuntu#<ec2publicIP> is done iterating, returning
ubuntu#<ec2publicIP> | FAILED! => {
"failed": true,
"msg": "failed to transfer file to /home/ubuntu/.ansible/tmp/ansible-tmp-1486122587.76-200107609248376/ping.py:\n\nssh: Could not resolve hostname <ec2publicIP>]: nodename nor servname provided, or not known\r\nlost connection\n"
}
9264 1486122590.57242: no more pending results, returning what we have
9264 1486122590.57251: results queue empty
9264 1486122590.57255: checking for any_errors_fatal
9264 1486122590.57259: done checking for any_errors_fatal
9264 1486122590.57262: checking for max_fail_percentage
9264 1486122590.57265: done checking for max_fail_percentage
9264 1486122590.57269: checking to see if all hosts have failed and the running result is not ok
9264 1486122590.57272: done checking to see if all hosts have failed
9264 1486122590.57275: getting the remaining hosts for this loop
9264 1486122590.57279: done getting the remaining hosts for this loop
9264 1486122590.60734: building list of next tasks for hosts
9264 1486122590.60741: getting the next task for host ubuntu#<ec2publicIP>
9264 1486122590.60748: host ubuntu#<ec2publicIP> is done iterating, returning
9264 1486122590.60752: done building task lists
9264 1486122590.60755: counting tasks in each state of execution
9264 1486122590.60759: done counting tasks in each state of execution:
num_setups: 0
num_tasks: 0
num_rescue: 0
num_always: 0
9264 1486122590.60768: all hosts are done, so returning None's for all hosts
9264 1486122590.60773: done queuing things up, now waiting for results queue to drain
9264 1486122590.60777: results queue empty
9264 1486122590.60780: checking for any_errors_fatal
9264 1486122590.60785: done checking for any_errors_fatal
9264 1486122590.60789: checking for max_fail_percentage
9264 1486122590.60793: done checking for max_fail_percentage
9264 1486122590.60796: checking to see if all hosts have failed and the running result is not ok
9264 1486122590.60802: done checking to see if all hosts have failed
9264 1486122590.60809: getting the next task for host ubuntu#<ec2publicIP>
9264 1486122590.60813: host ubuntu#<ec2publicIP> is done iterating, returning
9264 1486122590.60818: running handlers
9264 1486122590.60893: RUNNING CLEANUP
Do I have to expose some extra ports in my security group in AWS? So far only port 22 is open.

Try to change your hosts file:
[webservers]
<ec2publicIP> ansible_user=ubuntu

Ansible uses SSH, so port 22 is enough; you won't need to open any additional ports in the security group for your EC2 instance. What you might do is modify your inventory and, instead of <username>@<ipaddress>, use only <ipaddress> (or a resolvable name). You can then set remote_user in your playbook, or specify it with ansible_user in your inventory as Konstantin pointed out.
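For example, a minimal playbook sketch using the [webservers] group from the inventory above (the playbook file name and key path below are hypothetical):
---
- hosts: webservers
  remote_user: ubuntu   # connects as the ubuntu user instead of prefixing the host with ubuntu@
  tasks:
    - name: verify connectivity
      ping:
You could then run it with something like ansible-playbook site.yml --private-key ~/aws.pem, or set ansible_ssh_private_key_file next to ansible_user in the inventory instead of passing the key on the command line.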

Related

Horizon: context deadline exceeded

I ran stellar-horizon with captive-core, following the configuration from the official documentation. After catching up with the history and applying all the checkpoints, when ingesting live data Horizon tries to get info from http://localhost:11626/info and it times out.
Versions
horizon: 2.18.0
stellar-core: 19.1.0
go: 1.17.9
Horizon logs output
INFO[2022-07-05T15:17:16.644+01:00] Ledger: Got consensus: [seq=41625941, prev=a0d052, txs=430, ops=953, sv: [ SIGNED#lobstr_2_europe txH: f81623, ct: 1657030634, upgrades: [ ] ]] pid=306990 service=ingest subservice=stellar-core
INFO[2022-07-05T15:17:16.644+01:00] Tx: applying ledger 41625941 (txs:430, ops:953, base_fee:100) pid=306990 service=ingest subservice=stellar-core
INFO[2022-07-05T15:17:16.664+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
INFO[2022-07-05T15:17:18.664+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
INFO[2022-07-05T15:17:20.664+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
ERRO[2022-07-05T15:17:21.645+01:00] failed to load the stellar-core info err="http request errored: Get \"http://localhost:11626/info\": context deadline exceeded" pid=306990 stack="[main.go:43 client.go:67 app.go:230 app.go:442 asm_amd64.s:1581]"
WARN[2022-07-05T15:17:21.645+01:00] could not load stellar-core info: http request errored: Get "http://localhost:11626/info": context deadline exceeded pid=306990
WARN[2022-07-05T15:17:21.646+01:00] error ticking app: context deadline exceeded pid=306990
INFO[2022-07-05T15:17:22.663+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
INFO[2022-07-05T15:17:23.340+01:00] Processing ledger entry changes pid=306990 processed_entries=600000 progress="2.20%" sequence=41625919 service=ingest source=historyArchive
INFO[2022-07-05T15:17:24.663+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
INFO[2022-07-05T15:17:25.174+01:00] Processing ledger entry changes pid=306990 processed_entries=650000 progress="2.42%" sequence=41625919 service=ingest source=historyArchive
ERRO[2022-07-05T15:17:26.646+01:00] failed to load the stellar-core info err="http request errored: Get \"http://localhost:11626/info\": context deadline exceeded" pid=306990 stack="[main.go:43 client.go:67 app.go:230 app.go:442 asm_amd64.s:1581]"
WARN[2022-07-05T15:17:26.646+01:00] could not load stellar-core info: http request errored: Get "http://localhost:11626/info": context deadline exceeded pid=306990
WARN[2022-07-05T15:17:26.647+01:00] error ticking app: context deadline exceeded pid=306990
INFO[2022-07-05T15:17:26.663+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
INFO[2022-07-05T15:17:28.664+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
INFO[2022-07-05T15:17:30.663+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
INFO[2022-07-05T15:17:31.543+01:00] Processing ledger entry changes pid=306990 processed_entries=700000 progress="2.65%" sequence=41625919 service=ingest source=historyArchive
WARN[2022-07-05T15:17:31.647+01:00] could not load stellar-core info: http request errored: Get "http://localhost:11626/info": context deadline exceeded pid=306990
ERRO[2022-07-05T15:17:31.647+01:00] failed to load the stellar-core info err="http request errored: Get \"http://localhost:11626/info\": context deadline exceeded" pid=306990 stack="[main.go:43 client.go:67 app.go:230 app.go:442 asm_amd64.s:1581]"
WARN[2022-07-05T15:17:31.648+01:00] error ticking app: context deadline exceeded pid=306990
INFO[2022-07-05T15:17:32.663+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
INFO[2022-07-05T15:17:33.367+01:00] Processing ledger entry changes pid=306990 processed_entries=750000 progress="2.87%" sequence=41625919 service=ingest source=historyArchive
INFO[2022-07-05T15:17:34.663+01:00] waiting for ingestion system catchup pid=306990 service=ingest status="{false false 0 41457386 41457386}"
WARN[2022-07-05T15:17:36.648+01:00] could not load stellar-core info: http request errored: Get "http://localhost:11626/info": context deadline exceeded pid=306990
ERRO[2022-07-05T15:17:36.648+01:00] failed to load the stellar-core info err="http request errored: Get \"http://localhost:11626/info\": context deadline exceeded" pid=306990 stack="[main.go:43 client.go:67 app.go:230 app.go:442 asm_amd64.s:1581]"

How to check if registered variable has a particular pattern in ansible?

So I want to check if nginx syntax is ok
- name: check if syntax is ok on nginx
  shell: nginx -t
  register: result
Then I want to use that with the service module, with a when: conditional, so that the service module reloads nginx when result contains "syntax is ok".
Would something like this work?
when: result.stdout_lines == "syntax is ok"
You can use something like this:
when: result.stdout | search("syntax is ok")
It searches for the string "syntax is ok" in result.stdout; if it finds it, the condition succeeds and the desired task runs, otherwise the task is skipped.
I have added a few more ways to achieve the same result:
when: result.stdout is search("syntax is ok")
Use regex_search
when: result.stdout | regex_search("syntax is ok")
And the one suggested by @TinaC can also be used.
Tested on Ansible version 2.8.0.
Test ansible playbook
---
- hosts: all
  gather_facts: no
  vars:
    - error_msg: "nginx: [alert] could not open error log file: open() \"/var/log/nginx/error.log\" failed (13: Permission denied) 2019/09/12 16:19:28 [warn] 5343#0: the \"user\" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:5 nginx: the configuration file /etc/nginx/nginx.conf syntax is ok 2019/09/12 16:19:28 [emerg] 5343#0: open() \"/run/nginx.pid\" failed (13: Permission denied) nginx: configuration file /etc/nginx/nginx.conf test failed"
  tasks:
    - debug:
        msg: "{{ error_msg }}"
    - debug:
        msg: "{{ error_msg }}"
      when: error_msg | search("syntax is ok")
    - debug:
        msg: "{{ error_msg }}"
      when: error_msg is search("syntax is ok")
    - debug:
        msg: "{{ error_msg }}"
      when: error_msg | regex_search("syntax is ok")
Output
PLAY [all] ***********************************************************************************************************************************************
TASK [debug] *********************************************************************************************************************************************
ok: [192.168.100.101] => {
"msg": "nginx: [alert] could not open error log file: open() \"/var/log/nginx/error.log\" failed (13: Permission denied) 2019/09/12 16:19:28 [warn] 5343#0: the \"user\" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:5 nginx: the configuration file /etc/nginx/nginx.conf syntax is ok 2019/09/12 16:19:28 [emerg] 5343#0: open() \"/run/nginx.pid\" failed (13: Permission denied) nginx: configuration file /etc/nginx/nginx.conf test failed"
}
TASK [debug] *********************************************************************************************************************************************
ok: [192.168.100.101] => {
"msg": "nginx: [alert] could not open error log file: open() \"/var/log/nginx/error.log\" failed (13: Permission denied) 2019/09/12 16:19:28 [warn] 5343#0: the \"user\" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:5 nginx: the configuration file /etc/nginx/nginx.conf syntax is ok 2019/09/12 16:19:28 [emerg] 5343#0: open() \"/run/nginx.pid\" failed (13: Permission denied) nginx: configuration file /etc/nginx/nginx.conf test failed"
}
TASK [debug] *********************************************************************************************************************************************
ok: [192.168.100.101] => {
"msg": "nginx: [alert] could not open error log file: open() \"/var/log/nginx/error.log\" failed (13: Permission denied) 2019/09/12 16:19:28 [warn] 5343#0: the \"user\" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:5 nginx: the configuration file /etc/nginx/nginx.conf syntax is ok 2019/09/12 16:19:28 [emerg] 5343#0: open() \"/run/nginx.pid\" failed (13: Permission denied) nginx: configuration file /etc/nginx/nginx.conf test failed"
}
TASK [debug] *********************************************************************************************************************************************
ok: [192.168.100.101] => {
"msg": "nginx: [alert] could not open error log file: open() \"/var/log/nginx/error.log\" failed (13: Permission denied) 2019/09/12 16:19:28 [warn] 5343#0: the \"user\" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:5 nginx: the configuration file /etc/nginx/nginx.conf syntax is ok 2019/09/12 16:19:28 [emerg] 5343#0: open() \"/run/nginx.pid\" failed (13: Permission denied) nginx: configuration file /etc/nginx/nginx.conf test failed"
}
PLAY RECAP ***********************************************************************************************************************************************
192.168.100.101 : ok=4 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
When the string syntax is ok is changed to syntax is not ok, the output is:
PLAY [all] ***********************************************************************************************************************************************
TASK [debug] *********************************************************************************************************************************************
ok: [192.168.100.101] => {
"msg": "nginx: [alert] could not open error log file: open() \"/var/log/nginx/error.log\" failed (13: Permission denied) 2019/09/12 16:19:28 [warn] 5343#0: the \"user\" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:5 nginx: the configuration file /etc/nginx/nginx.conf syntax is not ok 2019/09/12 16:19:28 [emerg] 5343#0: open() \"/run/nginx.pid\" failed (13: Permission denied) nginx: configuration file /etc/nginx/nginx.conf test failed"
}
TASK [debug] *********************************************************************************************************************************************
skipping: [192.168.100.101]
TASK [debug] *********************************************************************************************************************************************
skipping: [192.168.100.101]
TASK [debug] *********************************************************************************************************************************************
skipping: [192.168.100.101]
PLAY RECAP ***********************************************************************************************************************************************
192.168.100.101 : ok=1 changed=0 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
Try with:
when: "'syntax is ok' in result.stdout_lines"

Pact provider doesn't send verification to Pact Broker

I am new to Pact (consumer-driven contract testing) and Gradle. I used this well-known workshop to try Pact with Java and a Pact Broker, https://github.com/Mikuu/Pact-JVM-Example, but the final part never works: the provider never sends the verification results to the Pact Broker. It works manually via the REST API, but from the project the verifications are never sent. Please help me understand what happens (probably something is missing, some library or annotation?).
I am attaching the debug log from when the provider tries to send the verification to the Pact Broker (a local broker in Docker, run via Gradle with ./gradlew :example-provider:pactVerify). I guess that the body of the POST request is missing.
14:22:59.469 [DEBUG] [org.apache.http.impl.execchain.MainClientExec] Opening connection {}->http://localhost:80
14:22:59.469 [DEBUG] [org.apache.http.impl.conn.DefaultHttpClientConnectionOperator] Connecting to localhost/127.0.0.1:80
14:22:59.470 [DEBUG] [org.apache.http.impl.conn.DefaultHttpClientConnectionOperator] Connection established 127.0.0.1:55770<->127.0.0.1:80
14:22:59.470 [DEBUG] [org.apache.http.impl.execchain.MainClientExec] Executing request POST /pacts/provider/ExampleProvider/consumer/JunitRuleMultipleInteractionsConsumer/pact-version/e66d465478e1934bca0ad9b905a2f83835af481d/verification-results HTTP/1.1
14:22:59.470 [DEBUG] [org.apache.http.impl.execchain.MainClientExec] Target auth state: UNCHALLENGED
14:22:59.470 [DEBUG] [org.apache.http.impl.execchain.MainClientExec] Proxy auth state: UNCHALLENGED
14:22:59.492 [DEBUG] [org.apache.http.impl.execchain.MainClientExec] Connection can be kept alive indefinitely
14:22:59.493 [DEBUG] [org.apache.http.impl.conn.DefaultManagedHttpClientConnection] http-outgoing-255: Close connection
14:22:59.493 [DEBUG] [org.apache.http.impl.execchain.MainClientExec] Connection discarded
14:22:59.493 [DEBUG] [org.apache.http.impl.conn.PoolingHttpClientConnectionManager] Connection released: [id: 255][route: {}->http://localhost:80][total kept alive: 0; route allocated: 0 of 5; total allocated: 0 of 10]
14:22:59.493 [DEBUG] [org.gradle.internal.progress.DefaultBuildOperationExecutor] Completing Build operation 'Execute verifyPact for :example-provider:pactVerify_ExampleProvider'
14:22:59.493 [DEBUG] [org.gradle.api.internal.tasks.execution.ResolveTaskArtifactStateTaskExecuter] Removed task artifact state for {} from context.
14:22:59.493 [DEBUG] [org.gradle.api.internal.tasks.execution.ExecuteAtMostOnceTaskExecuter] Finished executing task ':example-provider:pactVerify_ExampleProvider'
14:22:59.493 [DEBUG] [org.gradle.internal.progress.DefaultBuildOperationExecutor] Completing Build operation 'Task :example-provider:pactVerify_ExampleProvider'
14:22:59.493 [null] [org.gradle.internal.progress.DefaultBuildOperationExecutor]
14:22:59.493 [DEBUG] [org.gradle.internal.progress.DefaultBuildOperationExecutor] Build operation 'Task :example-provider:pactVerify_ExampleProvider' completed
14:22:59.493 [INFO] [org.gradle.execution.taskgraph.DefaultTaskPlanExecutor] :example-provider:pactVerify_ExampleProvider (Thread[Task worker for ':',5,main]) completed. Took 0.96 secs.
The Pact example seems to be missing one important step: adding the @PactBroker annotation.
Below the @Pact annotation, there should be an annotation
@PactBroker(port = PORT_NUM, host = HOST_NAME)
You can find an example at the official repo.

vagrant hangs up after fixed port collision

I am trying to bring up vagrant on a Windows machine.
It hangs up after
Fixed port collision for 22 => 2222. Now on port 2200.
A part of the debug log is below:
DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 32000
DEBUG subprocess: Exit status: 0
INFO warden: Calling IN action: #<Vagrant::Action::Builtin::HandleForwardedPortCollisions:0x44e7760>
DEBUG environment: Attempting to acquire process-lock: fpcollision
DEBUG environment: Attempting to acquire process-lock: dotlock
INFO environment: Acquired process lock: dotlock
INFO environment: Released process lock: dotlock
INFO environment: Acquired process lock: fpcollision
INFO handle_port_collisions: Detecting any forwarded port collisions...
DEBUG handle_port_collisions: Extra in use: []
DEBUG handle_port_collisions: Remap: {}
DEBUG handle_port_collisions: Repair: true
INFO handle_port_collisions: Attempting to repair FP collision: 2222
INFO handle_port_collisions: Repaired FP collision: 2222 to 2200
INFO interface: info: Fixed port collision for 22 => 2222. Now on port 2200.
INFO interface: info: ==> vlad: Fixed port collision for 22 => 2222. Now on port 2200.
==> vlad: Fixed port collision for 22 => 2222. Now on port 2200.
INFO environment: Released process lock: fpcollision
DEBUG environment: Attempting to acquire process-lock: dotlock
INFO environment: Acquired process lock: dotlock
INFO environment: Released process lock: dotlock
INFO warden: Calling IN action: #<VagrantPlugins::ProviderVirtualBox::Action::PrepareNFSValidIds:0x44431c8>
INFO subprocess: Starting process: ["C:/Program Files/Oracle/VirtualBox/VBoxManage.exe", "list", "vms"]
INFO subprocess: Command not in installer, restoring original environment...
DEBUG subprocess: Selecting on IO
DEBUG subprocess: stdout: "vlad_vlad" {efce349f-2b2e-40db-9a14-2298d3024638}
DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 32000
DEBUG subprocess: Exit status: 0
INFO warden: Calling IN action: #<VagrantPlugins::SyncedFolderNFS::ActionCleanup:0x435c3a8>
DEBUG host: Searching for cap: nfs_prune
DEBUG host: Checking in: windows
INFO nfs: Host doesn't support pruning NFS. Skipping.
INFO warden: Calling IN action: #<Vagrant::Action::Builtin::SyncedFolderCleanup:0x4263ca8>
INFO subprocess: Starting process: ["C:\\Windows\\System32\\WindowsPowerShell\\v1.0/powershell.EXE", "-NoProfile", "-ExecutionPolicy", "Bypass", "$PSVersionTable.PSVersion.Major"]
INFO subprocess: Command not in installer, restoring original environment...
DEBUG subprocess: Selecting on IO
DEBUG subprocess: stdout: 2
I do not have the slightest idea of how to proceed. Any help is appreciated.
After installing PowerShell, Vagrant continued with the setup of the VM.

Reducer stuck due to dead host

I noticed my reducer is stuck due to a dead host. The logs show a lot of retry messages. Is it possible to tell the jobtracker to give up on the dead node and resume the work? There were 323 mappers and only 1 reducer. I am on hadoop-1.0.3.
2012-08-08 11:52:19,903 INFO org.apache.hadoop.mapred.ReduceTask: 192.168.1.23 Will be considered after: 65 seconds.
2012-08-08 11:53:19,905 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201207191440_0203_r_000000_0 Need another 63 map output(s) where 0 is already in progress
2012-08-08 11:53:19,905 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201207191440_0203_r_000000_0 Scheduled 0 outputs (1 slow hosts and0 dup hosts)
2012-08-08 11:53:19,905 INFO org.apache.hadoop.mapred.ReduceTask: Penalized(slow) Hosts:
2012-08-08 11:53:19,905 INFO org.apache.hadoop.mapred.ReduceTask: 192.168.1.23 Will be considered after: 5 seconds.
2012-08-08 11:53:29,906 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201207191440_0203_r_000000_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
2012-08-08 11:53:47,907 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201207191440_0203_r_000000_0 copy failed: attempt_201207191440_0203_m_000001_0 from 192.168.1.23
2012-08-08 11:53:47,907 WARN org.apache.hadoop.mapred.ReduceTask: java.net.NoRouteToHostException: No route to host
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
at java.net.Socket.connect(Socket.java:546)
at sun.net.NetworkClient.doConnect(NetworkClient.java:173)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:409)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:530)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:240)
at sun.net.www.http.HttpClient.New(HttpClient.java:321)
at sun.net.www.http.HttpClient.New(HttpClient.java:338)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:935)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:876)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:801)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1618)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1575)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1483)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1394)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1326)
2012-08-08 11:53:47,907 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201207191440_0203_r_000000_0: Failed fetch #18 from attempt_201207191440_0203_m_000001_0
2012-08-08 11:53:47,907 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201207191440_0203_r_000000_0 adding host 192.168.1.23 to penalty box, next contact in 1124 seconds
2012-08-08 11:53:47,907 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201207191440_0203_r_000000_0: Got 1 map-outputs from previous failures
2012-08-08 11:54:22,909 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201207191440_0203_r_000000_0 Need another 63 map output(s) where 0 is already in progress
2012-08-08 11:54:22,909 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201207191440_0203_r_000000_0 Scheduled 0 outputs (1 slow hosts and0 dup hosts)
2012-08-08 11:54:22,909 INFO org.apache.hadoop.mapred.ReduceTask: Penalized(slow) Hosts:
2012-08-08 11:54:22,909 INFO org.apache.hadoop.mapred.ReduceTask: 192.168.1.23 Will be considered after: 1089 seconds.
I left it alone; it retried for a while, then gave up on the dead host, reran the mapper, and succeeded. It was caused by the host having two IP addresses, and I had intentionally turned off the one Hadoop was using.
My question is whether there is a way to tell Hadoop to give up on the dead host without retrying.
From your log you can see that one of the tasktrackers that ran a map task cannot be connected to. The tasktracker on which the reducer runs is trying to retrieve the intermediate map results over HTTP, and it fails because the tasktracker holding the results is dead.
The default behaviour for tasktracker failure is like this:
The jobtracker arranges for map tasks that were run and completed successfully on the failed tasktracker to be rerun if they belong to incomplete jobs, since their intermediate output residing on the failed tasktracker’s local filesystem may not be accessible to the reduce task. Any tasks in progress are also rescheduled.
The problem is that if a task (be it a map or a reduce) fails too many times (4 by default, I think), it will not be rescheduled anymore and the job will fail.
In your case, the maps seem to complete successfully, but the reducer is unable to connect to the mapper and retrieve the intermediate results. It tries 4 times, and after that the job fails.
A failed task cannot be completely ignored: it is part of the job, and unless all the tasks that make up the job succeed, the job itself doesn't succeed.
Try to find the link the reducer is trying to access and copy it in a browser to see the error you get.
You can also blacklist and completely exclude a node from the list of nodes Hadoop uses:
In conf/mapred-site.xml
<property>
  <name>mapred.hosts.exclude</name>
  <value>/full/path/of/host/exclude/file</value>
</property>
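The exclude file referenced above is just a plain text file listing one hostname or IP address per line; for example (hypothetical path and address), /full/path/of/host/exclude/file could contain:
192.168.1.23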
Then, to make Hadoop re-read the node configuration, run:
/bin/hadoop mradmin -refreshNodes
