How do I run a specific task after all other tasks in a playbook have completed? The problem is that this needs to happen in every playbook, and adding the task to each playbook individually is not a good idea; I need to make it common to all of them. There is one common role in every playbook, but it runs at the beginning. Is it possible to add a task to it that would run at the very end? Or is there some other way to do this, so that it is always done at the end without editing each playbook?
You could do it by writing a Callback Plugin. This is Python code that executes pre-defined functions when an (Ansible-internal) event occurs.
The method of interest for you is v2_playbook_on_stats, which is one of the last steps executed.
For this, please check out the basic Developer Guide of Ansible:
https://docs.ansible.com/ansible/latest/dev_guide/index.html
But more importantly the Plugins Guide:
https://docs.ansible.com/ansible/latest/dev_guide/developing_plugins.html
The basic structure as outlined in the document is:
from ansible.plugins.callback import CallbackBase

class CallbackModule(CallbackBase):
    pass
They even provide a proper example that implements the v2_playbook_on_stats method:
# Make coding more python3-ish, this is required for contributions to Ansible
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type

# not only visible to ansible-doc, it also 'declares' the options the plugin requires and how to configure them.
DOCUMENTATION = '''
  callback: timer
  callback_type: aggregate
  requirements:
    - whitelist in configuration
  short_description: Adds time to play stats
  version_added: "2.0"
  description:
    - This callback just adds total play duration to the play stats.
  options:
    format_string:
      description: format of the string shown to user at play end
      ini:
        - section: callback_timer
          key: format_string
      env:
        - name: ANSIBLE_CALLBACK_TIMER_FORMAT
      default: "Playbook run took %s days, %s hours, %s minutes, %s seconds"
'''

from datetime import datetime

from ansible.plugins.callback import CallbackBase


class CallbackModule(CallbackBase):
    """
    This callback module tells you how long your plays ran for.
    """
    CALLBACK_VERSION = 2.0
    CALLBACK_TYPE = 'aggregate'
    CALLBACK_NAME = 'namespace.collection_name.timer'

    # only needed if you ship it and don't want to enable by default
    CALLBACK_NEEDS_WHITELIST = True

    def __init__(self):
        # make sure the expected objects are present, calling the base's __init__
        super(CallbackModule, self).__init__()
        # start the timer when the plugin is loaded, the first play should start a few milliseconds after.
        self.start_time = datetime.now()

    def _days_hours_minutes_seconds(self, runtime):
        ''' internal helper method for this callback '''
        minutes = (runtime.seconds // 60) % 60
        r_seconds = runtime.seconds - (minutes * 60)
        return runtime.days, runtime.seconds // 3600, minutes, r_seconds

    # this is only event we care about for display, when the play shows its summary stats; the rest are ignored by the base class
    def v2_playbook_on_stats(self, stats):
        end_time = datetime.now()
        runtime = end_time - self.start_time

        # Shows the usage of a config option declared in the DOCUMENTATION variable. Ansible will have set it when it loads the plugin.
        # Also note the use of the display object to print to screen. This is available to all callbacks, and you should use this over printing yourself
        self._display.display(self._plugin_options['format_string'] % (self._days_hours_minutes_seconds(runtime)))
I also want to highlight the importance of the DOCUMENTATION string. At first I thought it was only used to generate the documentation page, but that is not the case. Check out this example:
options:
  format_string:
    description: format of the string shown to user at play end
    ini:
      - section: callback_timer
        key: format_string
    env:
      - name: ANSIBLE_CALLBACK_TIMER_FORMAT
    default: "Playbook run took %s days, %s hours, %s minutes, %s seconds"
In there you have ini, env, and default sections; these are actually used to inject options into your callback plugin, which you can then read via self._plugin_options['format_string'] or self.get_option("format_string").
For a list of all callback methods that can be overridden, please refer to https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/callback/__init__.py
The methods starting with v2_ are the interesting ones for you, because those are for Ansible 2+.
Check out https://github.com/ansible/ansible/tree/devel/lib/ansible/plugins/callback for more examples.
But it seems that they are cleaning up quite a lot at the moment.
Therefore, I would suggest checking out a version tag instead, like:
https://github.com/ansible/ansible/tree/v2.9.6/lib/ansible/plugins/callback
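To tie this back to the original question, here is a minimal sketch of a callback plugin that runs your "always at the end" logic. This is an illustration, not a drop-in solution; the plugin name, the final logic, and the callback_plugins/ location are assumptions you would adapt:

# callback_plugins/run_at_end.py -- assumed location: next to your playbooks, or on any
# path configured for callback plugins in ansible.cfg
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type

from ansible.plugins.callback import CallbackBase


class CallbackModule(CallbackBase):
    """Runs once, after all plays of a playbook have finished."""
    CALLBACK_VERSION = 2.0
    CALLBACK_TYPE = 'aggregate'
    CALLBACK_NAME = 'run_at_end'       # assumed name
    CALLBACK_NEEDS_WHITELIST = True    # enable it via callback_whitelist in ansible.cfg

    def v2_playbook_on_stats(self, stats):
        # stats holds the per-host summary; put whatever "final task" logic you need here,
        # e.g. sending a notification or writing a report file.
        failed = [h for h in stats.processed if stats.summarize(h)['failures']]
        self._display.display("Playbook finished, hosts with failures: %s" % failed)

Because enabled callback plugins are loaded for every playbook run, this executes at the end of each playbook without editing any of them; you only have to whitelist it once (e.g. callback_whitelist = run_at_end in ansible.cfg).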
Hope this helps.
With Python 3.10 and the latest version of Hugging Face transformers, for simple code like this:
from transformers import pipeline
input_list = ['How do I test my connection? (Windows)', 'how do I change my payment method?', 'How do I contact customer support?']
classifier = pipeline('sentiment-analysis')
results = classifier(input_list)
the program hangs and raises this error:
File ".......env/lib/python3.10/multiprocessing/spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
but if I replace the list input with a string, it works:
from transformers import pipeline
classifier = pipeline('sentiment-analysis')
result = classifier('How do I test my connection? (Windows)')
A main function needs to be defined to run the multiprocessing work that the list input depends on. The following update works:
from transformers import pipeline

def main():
    input_list = ['How do I test my connection? (Windows)',
                  'how do I change my payment method?',
                  'How do I contact customer support?']
    classifier = pipeline('sentiment-analysis')
    results = classifier(input_list)

if __name__ == '__main__':
    main()
The question then reduces to: where should freeze_support() go in a Python script?
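According to the multiprocessing documentation, freeze_support() belongs directly under the if __name__ == '__main__': guard of the main module; it only has an effect when the script is frozen into a Windows executable and is a no-op otherwise. A minimal sketch combining it with the main() fix above (the input list is just the example from the question):

from multiprocessing import freeze_support
from transformers import pipeline

def main():
    input_list = ['How do I test my connection? (Windows)',
                  'how do I change my payment method?',
                  'How do I contact customer support?']
    classifier = pipeline('sentiment-analysis')
    print(classifier(input_list))

if __name__ == '__main__':
    freeze_support()  # must come first under the __main__ guard
    main()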
I'm working on a little proof of concept about Airflow on Google Cloud.
Essentially, I want to create a workflow that downloads data from a REST API (HTTPS), transforms this data into JSON format, and uploads it to a Google Cloud Storage bucket.
I've already done this with pure Python code and it works. Pretty straightforward! But because I want to schedule this and there are some dependencies, Airflow should be the ideal tool for the job.
After carefully reading the Airflow documentation, I've seen that the SimpleHttpOperator and/or HttpHook can do the trick for the download part.
I've created my HTTP connection in the web UI, with my email/password for authorization, as follows:
{Conn Id: "atlassian_marketplace", Conn Type: "HTTP", Host: "https://marketplace.atlassian.com/rest/2", Schema: None/Blank, Login: "my username", Password: "my password", Port: None/Blank, Extra: None/Blank}
First question:
-When to use the SimpleHttpOperator versus the HttpHook?
Second question:
-How do we use SimpleHttpOperator or HttpHook with HTTPs calls?
Third question:
-How do we access the data returned by the API call?
In my case, the XCom feature will not do the trick, because these API calls can return a lot of data (100-300 MB)!
I've looked on Google for example code on how to use the operator/hook for my use case, but I haven't found anything useful yet.
Any ideas?
I put here the skeleton of my code so far.
# Usual Airflow import
# Dag creation
dag = DAG(
    'get_reporting_links',
    default_args=default_args,
    description='Get reporting links',
    schedule_interval=timedelta(days=1))

# Task 1: Dummy start
start = DummyOperator(task_id="Start", retries=2, dag=dag)

# Task 2: Connect to Atlassian Marketplace
get_data = SimpleHttpOperator(
    http_conn_id="atlassian_marketplace",
    endpoint="/vendors/{vendorId}/reporting".format({vendorId: "some number"}),
    method="GET")
# Task 3: Save JSON data locally
# TODO: transform_json: transform to JSON get_data.json()?
# Task 4: Upload data to GCP
# TODO: upload_gcs: use Airflow GCS connection
# Task 5: Stop
stop = DummyOperator(task_id="Stop", retries=2, dag=dag)
# Dependencies
start >> get_data >> transform_json >> upload_gcs >> stop
Look at the following example:
# Usual Airflow import
# Dag creation
dag = DAG(
    'get_reporting_links',
    default_args=default_args,
    description='Get reporting links',
    schedule_interval=timedelta(days=1))

# Task 1: Dummy start
start = DummyOperator(task_id="Start", retries=2, dag=dag)

# Task 2: Connect to Atlassian Marketplace
get_data = SimpleHttpOperator(
    task_id="get_data",
    http_conn_id="atlassian_marketplace",
    endpoint="/vendors/{vendorId}/reporting".format(vendorId="some number"),
    method="GET",
    xcom_push=True,
    dag=dag,
)

def transform_json(**kwargs):
    ti = kwargs['ti']
    pulled_value_1 = ti.xcom_pull(key=None, task_ids='get_data')
    ...
    # transform the json here and save the content to a file

# Task 3: Save JSON data locally
save_and_transform = PythonOperator(
    task_id="save_and_transform",
    python_callable=transform_json,
    provide_context=True,
    dag=dag,
)
# Task 4: Upload data to GCP
upload_to_gcs = FileToGoogleCloudStorageOperator(...)
# Task 5: Stop
stop = DummyOperator(task_id="Stop", retries=2, dag=dag)
# Dependencies
start >> get_data >> save_and_transform >> upload_to_gcs >> stop
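Regarding the first and third questions: SimpleHttpOperator is essentially a thin wrapper around HttpHook that pushes the response body to XCom, so it is convenient for small payloads; when you need control over what happens to the response, call HttpHook yourself from a PythonOperator. Since your responses can be 100-300 MB, XCom is a poor fit, and one option is to stream the payload straight to a local file that the GCS upload task then picks up. A rough sketch under those assumptions (the /tmp path and the reuse of your atlassian_marketplace connection are placeholders; HTTPS is used automatically because the connection host starts with https://):

from airflow.hooks.http_hook import HttpHook
from airflow.operators.python_operator import PythonOperator

def download_report(**kwargs):
    # same connection as above; HttpHook returns a requests.Response object
    hook = HttpHook(method='GET', http_conn_id='atlassian_marketplace')
    response = hook.run('/vendors/{vendorId}/reporting'.format(vendorId='some number'))
    # write the payload to disk instead of pushing 100-300 MB through XCom
    with open('/tmp/reporting.json', 'w') as f:
        f.write(response.text)

download_report_task = PythonOperator(
    task_id='download_report',
    python_callable=download_report,
    dag=dag,
)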
Using discord.py rewrite, how can I diagnose my_background_task to find out why its print statement is not printing every 3 seconds?
Details:
The problem that I am observing is that "print('inside loop')" is printed once in my logs, but not the expected 'every three seconds'. Could there be an exception somewhere that I am not catching?
Note: I do see print(f'Logged in as {bot.user.name} - {bot.user.id}') in the logs so on_ready seems to work, so that method cannot be to blame.
I tried following this example: https://github.com/Rapptz/discord.py/blob/async/examples/background_task.py
however, I did not use its client = discord.Client() statement because I think I can achieve the same thing using "bot", similar to what is explained here: https://stackoverflow.com/a/53136140/6200445
import asyncio
import discord
from discord.ext import commands
token = open("token.txt", "r").read()
def get_prefix(client, message):
    prefixes = ['=', '==']
    if not message.guild:
        prefixes = ['==']  # Only allow '==' as a prefix when in DMs, this is optional

    # Allow users to #mention the bot instead of using a prefix when using a command. Also optional
    # Do `return prefixes` if u don't want to allow mentions instead of prefix.
    return commands.when_mentioned_or(*prefixes)(client, message)

bot = commands.Bot(  # Create a new bot
    command_prefix=get_prefix,  # Set the prefix
    description='A bot for doing cool things. Commands list:',  # description for the bot
    case_insensitive=True  # Make the commands case insensitive
)
# case_insensitive=True is used as the commands are case sensitive by default

cogs = ['cogs.basic', 'cogs.embed']

@bot.event
async def on_ready():  # Do this when the bot is logged in
    print(f'Logged in as {bot.user.name} - {bot.user.id}')  # Print the name and ID of the bot logged in.
    for cog in cogs:
        bot.load_extension(cog)
    return

async def my_background_task():
    await bot.wait_until_ready()
    print('inside loop')  # This prints one time. How to make it print every 3 seconds?
    counter = 0
    while not bot.is_closed:
        counter += 1
        await bot.send_message(channel, counter)
        await channel.send(counter)
        await asyncio.sleep(3)  # task runs every 3 seconds

bot.loop.create_task(my_background_task())
bot.run(token)
From a cursory inspection, it would seem your problem is that the print statement only executes once because it sits before the loop. Your method my_background_task is not called once every three seconds; rather, it is the body of its while loop (the send_message call) that runs every three seconds. For the intended behavior, place the print statement inside your while loop.
Although I am using rewrite, I found both of these resources helpful.
https://github.com/Rapptz/discord.py/blob/async/examples/background_task.py
https://github.com/Rapptz/discord.py/blob/rewrite/examples/background_task.py
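For completeness, here is a rough sketch of what the corrected loop could look like on the rewrite branch; the channel id is a placeholder, and the old async-branch calls (bot.send_message, bot.is_closed as an attribute) are replaced with their rewrite equivalents:

async def my_background_task():
    await bot.wait_until_ready()
    channel = bot.get_channel(123456789)  # placeholder: your channel id
    counter = 0
    while not bot.is_closed():            # is_closed() is a method in rewrite
        print('inside loop')              # now runs on every iteration
        counter += 1
        await channel.send(counter)       # rewrite sends through the channel object
        await asyncio.sleep(3)            # task runs every 3 seconds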
I wrote a basic program to test the Ruby metriks gem:
require 'metriks'
require 'metriks/reporter/logger'
@registry = Metriks::Registry.new
@logger = Logger.new('/tmp/metrics.log')
@reporter = Metriks::Reporter::Logger.new(:logger => @logger)

@reporter.start
@registry.meter('tasks').mark
print "Hello"
@registry.meter('tasks').mark
@reporter.stop
After I execute the program, there is nothing in the log other than the line noting it was created.
$ cat /tmp/metrics.log
# Logfile created on 2015-06-15 14:23:40 -0700 by logger.rb/44203
You should either pass in your own registry while instantiating Metriks::Reporter::Logger, or use the default registry (Metriks::Registry.default) if you are using a logger to log metrics.
Also, the default log write interval is 60 seconds, and your code completes before that, so even if everything is set up correctly nothing would get recorded. Since you want to use your own registry, the following should work for you (I'm adding a little sleep since I'm going to use an interval of 1 second):
require 'metriks'
require 'metriks/reporter/logger'

@registry = Metriks::Registry.new
@logger = Logger.new('/tmp/metrics.log')
@reporter = Metriks::Reporter::Logger.new(:logger => @logger,
                                          :registry => @registry,
                                          :interval => 1)

@reporter.start
@registry.meter('tasks').mark
print "Hello"
@registry.meter('tasks').mark

# Just giving it a little time so the metrics will be recorded.
sleep 2

@reporter.stop
But I don't really think short intervals are good.
UPDATE: I also think @reporter.write will write the metrics out immediately, regardless of the time interval, so you don't have to use sleep (better).
I want to make a link that is valid for only 24 hours, for validation purposes, so my question is simple:
How do I make this link valid only for that time? I have a hint:
Get the epoch time.
Make a link using only this value: something.com/time/1359380374
When the user clicks on the link, extract this value and compare it against the current time.
I have heard about hash values. Why use them? We can't get the time back from a hash value (invert the process), so how is this done?
Your best bet is to have the user's email sent as an argument and then query the database to see if their link has expired:
Requested-link query: update users set locked_stamp = now();
Request URL: http://yourdomain.com/?email=useremail
Query: select true from users where email = '$email' and locked_stamp between now() - interval 1 hour and now() limit 1
Result: you have a person requesting within the hour with email: $email.
I have a script that uses base64 to encode the timestamp... but it's not secure by any means.
import tornado.ioloop
import tornado.web
import base64, re, time
import sys

def get_time():
    """Method used to get the current time in b64"""
    return base64.b64encode(str(int(time.mktime(time.localtime()))))

class WebHandler(tornado.web.RequestHandler):
    def get(self, _time):
        timecheck = base64.b64decode(_time)
        try:
            # require it to be all digits
            assert re.match('^\d+$', timecheck) is not None
            # Must be within 1 hour: greater than 1 hour ago and less than now
            assert int(timecheck) > int(time.mktime(time.localtime())) - 3600 and \
                   int(timecheck) < int(time.mktime(time.localtime()))
        except AssertionError:
            raise tornado.web.HTTPError(401, 'Woops! Unauthorized.')
        else:
            self.write('Pass')

# Route
application = tornado.web.Application([
    (r"/([^\/]+)/?", WebHandler),
])

if __name__ == "__main__":
    application.listen(8889)
    tornado.ioloop.IOLoop.instance().start()
You can have Tornado sign the value for you, the same way it sets secure cookies:
signed_message = self.create_signed_value(secret, name, value)
Then you can check it:
message = self.decode_signed_value(secret, name, value, max_age_days=31, clock=None,min_version=None)
Secret should be a long random number, but you only need one per app. min_version could be DEFAULT_SIGNED_VALUE_VERSION (which is currently 2).
Don't roll your own solution. Use the one in the library. It's there. It works.
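To answer the "why a hash?" part of the question directly: you never invert the hash. The timestamp goes into the link in plain text, together with a signature (an HMAC) computed over it with a server-side secret; on every click you recompute the signature, compare, and only then check the age. A minimal standalone sketch of the same idea (the secret and the 24-hour window are placeholders):

import hashlib
import hmac
import time

SECRET = b'replace-with-a-long-random-secret'  # placeholder, kept server-side only
MAX_AGE = 24 * 3600                             # link validity window: 24 hours

def make_token():
    ts = str(int(time.time()))
    sig = hmac.new(SECRET, ts.encode(), hashlib.sha256).hexdigest()
    return '%s-%s' % (ts, sig)   # embed in the link, e.g. something.com/validate/<token>

def check_token(token):
    try:
        ts, sig = token.split('-', 1)
    except ValueError:
        return False
    expected = hmac.new(SECRET, ts.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False             # signature mismatch: the timestamp was tampered with
    return time.time() - int(ts) <= MAX_AGE   # genuine signature, so only the age matters

This is essentially what Tornado's create_signed_value/decode_signed_value do for you, with versioning and key handling built in, which is why the advice above to use the library stands.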