Until loop module fails with "dict object has no attribute state" - ansible

Here is my Ansible task
- name: wait until response has keyword "PIPELINE_STATE_SUCCEEDED"
  uri:
    url: https://abcd.com/response
    method: GET
  register: ABCD
  until: ABCD.json.state == "PIPELINE_STATE_SUCCEEDED"
  retries: 30
  delay: 600
When I run this playbook (retries × delay adds up to 300 minutes for the task to pass), after a few retries it suddenly fails with the error below:
'dict object' has no attribute 'state'
I also tried decreasing the delay and increasing the retries, but the problem persists. I have several other tasks in the same playbook that use a similar pattern, except their total wait is much shorter (retries × delay adds up to around 30 minutes).
Any idea why this could be happening?
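A common cause: on one of the retried attempts the endpoint returns an error page or a body without a state key, so ABCD.json.state is undefined and the until expression itself blows up. A defensive condition that treats such responses as "not yet succeeded" could look like this (a sketch of the question's task, not a confirmed fix; return_content is an addition that some Ansible versions need before .json is populated):

```yaml
- name: wait until response has keyword "PIPELINE_STATE_SUCCEEDED"
  uri:
    url: https://abcd.com/response
    method: GET
    return_content: yes
  register: ABCD
  # guard the attribute lookup so a non-JSON or error response
  # fails this attempt instead of crashing the whole loop
  until: ABCD.json is defined and ABCD.json.state | default('') == "PIPELINE_STATE_SUCCEEDED"
  retries: 30
  delay: 600
```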

Generate exactly 1 Flowfile

I'm using the GenerateFlowFile processor in Apache NiFi. When I start it, I want the processor to create exactly 1 FlowFile.
Right now I use the REST API via Python to change the state to RUNNING, wait 0.5 seconds and change the state to STOPPED. This results in 1 FlowFile being added to the queue to the next processor.
I tested a bit: waiting 1.5 seconds gives me 2 FlowFiles, 2.5 seconds gives me 3 FlowFiles - I'm guessing the processor generates one FlowFile for each second it is running.
How can I ensure that exactly 1 FlowFile is generated? The above method obviously depends on the network connection and round-trip times. Worst case: the connection drops while I wait, I cannot stop the processor anymore, and x FlowFiles are generated.
My current configs are:
Settings:
Yield duration: 1 sec
Penalty Duration: 30sec
Bulletin Level: WARN
Scheduling:
Scheduling Strategy: CRON driven
Concurrent Tasks: 1
Run Schedule: * * * * * ?
Execution: All nodes
Run duration: 0ms
Properties:
File Size: 0B
Batch Size: 1
Data Format: Text
Unique FlowFiles: false
Custom Text: No value set
Character Set: UTF-8
Mime Type: No value set
You'll want to set the GenerateFlowFile processor's Execution to Primary node only (assuming you have more than 1 node) to ensure each node is not generating its own FlowFile.
Set the Scheduling Strategy to Timer driven and raise the Run Schedule to something like 604800 sec (1 week). That way, even if you leave the processor running, it only fires once a week - plenty of time to fix a connectivity issue if your script can't connect to tell the processor to stop.
Keep Concurrent Tasks at 1.
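On the scripting side, the stop call can go in a finally block so it is attempted even if the wait is interrupted. A stdlib-only sketch, assuming NiFi's 1.x run-status REST endpoint; the host, processor id, and revision handling are hypothetical simplifications (authentication and error handling omitted):

```python
import json
import time
import urllib.request

# hypothetical values; point these at your own NiFi instance
NIFI_API = "https://nifi.example.com/nifi-api"
PROCESSOR_ID = "0123-abcd"

def run_status_payload(version, state):
    # body for PUT /processors/{id}/run-status; the revision version
    # must match what GET /processors/{id} last returned
    return json.dumps({"revision": {"version": version}, "state": state})

def set_state(version, state):
    req = urllib.request.Request(
        f"{NIFI_API}/processors/{PROCESSOR_ID}/run-status",
        data=run_status_payload(version, state).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return urllib.request.urlopen(req)

def run_once(version):
    set_state(version, "RUNNING")
    try:
        time.sleep(0.5)
    finally:
        # stop even if the wait is interrupted; combined with the
        # week-long Run Schedule above, a failed stop costs at most one
        # extra FlowFile instead of one per second
        set_state(version + 1, "STOPPED")
```

The version + 1 bump is a simplification: in practice you would re-fetch the current revision with GET /processors/{id} before each state change.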

Ansible WinRM The maximum number of concurrent operations for this user has been exceeded

We are using Ansible playbooks to automate long-running scripts on many systems within our network. Some of those systems are Windows 10, while the others are Windows 7. The long-running operations are launched using the async mechanism, and the async_status module is used to poll the results of the tasks every 30 seconds.
- name: Long running operation
  win_command: python long_running_script.py
  async: 2140000
  poll: 0
  register: async_sleeper

- name: Status poll
  async_status:
    jid: "{{ async_sleeper.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 100001
  delay: 30
The Windows 10 systems have the following default configuration for WinRM:
MaxConcurrentOperations = 4294967295
MaxConcurrentOperationsPerUser = 1500
Every ~12.5 hours or so, the playbook errors out with "maximum number of concurrent operations for this user has been exceeded", which corresponds neatly to 1500 polls at our 30-second interval (1500 × 30 s = 12.5 hours).
But clearly async_status is not a concurrent operation. It is supposed to be a short-lived check of whether the process is still running, exiting right after. So at any given point the number of concurrent operations should not exceed 2. Task Manager on the client machine does not show any lingering processes. So what is happening? Does "concurrent operations" refer to a count of operations and not really to concurrency? We know we can increase the quota, but we do not want to do that on production systems without getting to the root of this problem.
It would help to know:
What do Concurrent Operation really mean?
What is the industry best practice to overcome this problem?
What has changed in Windows 10, given that this error does not occur on the older version of the OS?
We ran some experiments, and it turns out that the value of MaxConcurrentOperationsPerUser is indeed a counter and does not have to be "concurrent".
This behavior differs between Windows 7, where it behaves as its name implies, and Windows 10, where it is a counter.
So if we set the variable to 30 and poll the status of a long-running operation every 30 seconds, the operation will error out in 15 minutes.
This may be addressed or fixed in the future, but I'm leaving it here for anyone else who runs into it.
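The arithmetic behind both observations can be checked directly: if the per-user quota behaves as a running counter, exhaustion time is simply quota × poll interval (numbers taken from the thread):

```python
quota = 1500            # MaxConcurrentOperationsPerUser (Windows 10 default)
poll_interval_s = 30    # the async_status `delay` from the playbook

# default quota: one poll every 30 s exhausts 1500 operations in...
hours_until_error = quota * poll_interval_s / 3600
print(hours_until_error)    # 12.5, matching the observed ~12.5 hours

# the experiment above: quota lowered to 30, same 30 s poll
minutes_until_error = 30 * poll_interval_s / 60
print(minutes_until_error)  # 15.0, the "errors in 15 minutes" case
```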

Set infinite session timeout but limited per request timeout

I'm trying to quickly connect to a couple thousand sites (some up, some down), but it seems that setting aiohttp.ClientTimeout(total=60) and passing it into ClientSession means only 60 seconds are allowed in total for all sites. This means that after about a minute, they all quickly fail with concurrent.futures._base.TimeoutError. I tried raising that timeout, which fixes the failures, but then all of the tasks end up hung on non-responding sites for the entire duration.
Is it possible to disable the total timeout, however have a per-request timeout of 60 seconds? (Edited) - There does seem to be a timeout parameter on session.get(...), however it seems that overrides the session timeout and causes the entire session to timeout upon expiration, and not just that request. If I set my ClientSession timeout to 600 but the session.get timeout to 15, all requests fail after 15 seconds
I want to be able to get through my full list of a couple thousand sites only waiting 60 second max for each connection, but have no total time limit. Is the only way to do this being to create a new session for each request?
import asyncio
import aiohttp

timeout = aiohttp.ClientTimeout(total=60)
connector = aiohttp.TCPConnector(limit=40)
dummy_jar = aiohttp.DummyCookieJar()
tasks = []
async with aiohttp.ClientSession(connector=connector, timeout=timeout, cookie_jar=dummy_jar) as session:
    for site in sites:
        task = asyncio.ensure_future(make_request(session, site, connection_pool))
        tasks.append(task)
    await asyncio.wait(tasks)
connection_pool.close()
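One way to get a per-request cap without a session-wide limit is to leave the session timeout unbounded (aiohttp also accepts ClientTimeout(total=None, sock_connect=60, sock_read=60) for exactly this) and wrap each request coroutine in asyncio.wait_for, so only the offending task is cancelled. A stdlib-only sketch of the pattern, with a sleep standing in for session.get (site names, latencies, and the short timeout are made up for illustration):

```python
import asyncio

PER_REQUEST_TIMEOUT = 0.2  # seconds; would be 60 in the question's scenario

async def fetch(site, latency):
    # stand-in for `await session.get(site)`
    await asyncio.sleep(latency)
    return (site, "up")

async def fetch_with_timeout(site, latency):
    # the timeout applies to this request only; other tasks keep running
    try:
        return await asyncio.wait_for(fetch(site, latency), timeout=PER_REQUEST_TIMEOUT)
    except asyncio.TimeoutError:
        return (site, "timeout")

async def main():
    sites = [("fast.example", 0.01), ("slow.example", 1.0)]
    tasks = [fetch_with_timeout(s, lat) for s, lat in sites]
    # gather preserves input order, so results line up with `sites`
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(results)  # [('fast.example', 'up'), ('slow.example', 'timeout')]
```

This keeps a single session (and its connection pool) for the whole run, so there is no need to create a new session per request.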

Ansible retry get attempt number

I am using a task that connects via SSH to a device. Latency is not always constant, and sometimes, when the prompt is not displayed in time, the task fails.
Assuming it is possible to control the timeout value for this task, is it possible to dynamically increase the timeout in proportion to the number of the attempt being performed?
Something like this
- name: task_name
  connection: local
  task_module:
    args...
  timeout: 10 * "{{ attempt_number }}"
  retries: 3
  delay: 2
  register: result
  until: result | success
I don't think it's possible to get the current attempt number while the task is running, and it's quite unclear why you're trying to achieve such a thing.
Can you elaborate a little more?
Yes, it's possible; here are the docs:
When you run a task with until and register the result as a variable, the registered variable will include a key called “attempts”, which records the number of the retries for the task.
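So the counter is not visible inside the attempt itself, but it is available afterwards in the registered variable. A minimal sketch based on the question's placeholder task (task_module and args... are the question's stand-ins):

```yaml
- name: task_name
  task_module:
    args...
  register: result
  until: result is success
  retries: 3
  delay: 2

- name: show how many attempts the loop took
  debug:
    msg: "Task succeeded after {{ result.attempts }} attempt(s)"
```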

Jmeter - I have run 2 test cases but result seems odd

I have run load tests for a website, but when I increased the number of users, the throughput increased instead of decreasing.
Test Case 1 :
No. of Threads : 15
Ramp up time : 450 [As I want to put delay of 30 seconds between 2 users]
Loop count : Forever
Scheduler : 1800 Seconds [As I want to run test for 30 minutes]
In HTTP requests I have added 10 pages, and each request has a constant timer of 30000 milliseconds, as I need a delay of 30 seconds between 2 requests.
Now when I look at the Aggregate Report, it shows me a Throughput of 3/min for each request.
Test Case 2 :
No. of Threads : 30
Ramp up time : 900 [As I want to put delay of 30 seconds between 2 users]
Loop count : Forever
Scheduler : 1800 Seconds [As I want to run test for 30 minutes]
In HTTP requests I have added 10 requests/pages, and each request has a constant timer of 30000 milliseconds, as I need a delay of 30 seconds between 2 requests.
Now when I look at the Aggregate Report, it shows me a Throughput of 6/min for each request.
I am confused: how is this possible? If the users increased from 15 to 30, there should be more load on the server, and throughput should decrease to something like 1/min or 2/min.
Please let me know what I am doing wrong here.
Throughput is the number of completions per unit time. (A completion can be an HTTP request, a DB request - in short, anything that needs to be executed and takes >0 execution time.)
E.g. requests per second or requests per minute.
By definition, JMeter calculates throughput as total number of requests / total time.
In your first case, the 15 users generated some number of requests, say x, in the 1800 seconds, with a 30-second delay before every request. The reported throughput is x/30 min = 3/min, so x ≈ 90 requests (verify this in the Aggregate Report or another listener).
In your second case everything else is the same, but the number of users is doubled, which creates roughly double the number of requests in the same 1800 seconds.
So, by the same formula (number of requests generated / total time):
throughput in the 2nd case = 2x/30 = 2 × throughput in the 1st case,
which is 6/min (correctly shown by JMeter).
The key is to check the number of requests generated in both cases.
I hope this clears up your confusion; let me know if you need further clarification. BTW, "when I increase the number of users, throughput increases" is not always true.
Throughput increased by factor of 2.
Test Case 1: - 3 requests per minute - 1 request each 20 seconds
Test Case 2: - 6 requests per minute - 1 request each 10 seconds
As per JMeter Glossary:
Throughput is calculated as requests/unit of time. The time is calculated from the start of the first sample to the end of the last sample. This includes any intervals between samples, as it is supposed to represent the load on the server.
The formula is: Throughput = (number of requests) / (total time).
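The numbers in both test cases can be reproduced from that formula: each thread runs a ~300-second loop (10 requests, each preceded by a 30-second timer; response times ignored for simplicity), so each labelled request completes about 6 times per thread in the 30 minutes:

```python
duration_min = 30                    # scheduler: 1800 seconds
loop_s = 10 * 30                     # 10 requests x 30 s constant timer
loops_per_thread = 1800 // loop_s    # 6 iterations per thread

# per-request throughput = completions of that request / total minutes
tp_case1 = 15 * loops_per_thread / duration_min  # 15 threads
tp_case2 = 30 * loops_per_thread / duration_min  # 30 threads
print(tp_case1, tp_case2)  # 3.0 6.0 - the values the Aggregate Report shows
```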
You may also be interested in the following plugins:
Server Hits Per Second
Transactions Per Second
or alternatively the Loadosophia.org service, which can convert your JMeter .jtl results files into an easy-to-understand professional load report
