Many Ansible modules are designed to accept file paths as a parameter, but lack the possibility to supply the contents of the file directly. In cases where the input data actually comes from something other than a file, this forces one to create a temporary file somewhere on disk, write the intended parameter value into it, and then supply the path of this temporary file to the Ansible module.
A real-life example for illustration: the java_cert Ansible module takes the parameter pkcs12_path, the path to a PKCS12 keystore containing a keypair to be imported into a given Java keystore. Now say this data is retrieved through a Vault lookup: in order to supply the module with the path it demands, we must write the Vault lookup result into a temporary file, use that file's path as the parameter, and then handle the secure deletion of the temporary file, seeing as the data is likely confidential.
When a situation like this arises in shell/bash scripting, namely a command-line tool's flag only supporting interaction with a file, the magic of process substitution (e.g. --file=<(echo $FILE_CONTENTS)) allows the tool's input and output data to be linked with other commands by transparently providing a named pipe that acts as if it were a (mostly) normal file on disk.
Within Ansible, is there any comparable mechanism to replace file-based parameters with more flexible constructs that allow the use of data from variables or other commands? If there is no built-in method to achieve this, are there maybe third-party solutions that allow for it, or that simplify workflows like the one I described? For example, something like a custom lookup plugin which is supplied with the file content data, transparently handles the file management in the background (i.e. creation, writing the data, and ultimately deletion), and provides the temporary path as its return value, without the user necessarily ever having to know it.
Example usage of such a plugin could be:
...
pkcs_path: "{{ lookup('as_file', '-----BEGIN PRIVATE KEY-----...-----END PRIVATE KEY----- ') }}"
...
with the plugin then creating a file under e.g. /tmp/as_file.sg7N3bX containing the textual key from the second parameter and returning this file path as the lookup result. I am however unsure how exactly the continued management of the file (especially the timely deletion of sensitive data) could be realized in such a context.
Disclaimer:
- I am (obviously!) the author of the below collection, which was created as a reaction to the above question.
- The lookup plugin was not thoroughly tested and might fail with particular modules.
Since this was a pretty good idea and nothing existed, I decided to give it a try. This all ended up in a collection now called thoteam.var_as_file, which is available in a GitHub repo. I won't paste all the files in this answer as they are all available in the mentioned repo, with full README documentation to install, test and use the collection.
The global idea was the following:
Create a lookup plugin responsible for pushing new temporary files with a given content and returning a path to use them.
Clean up the created files at the end of the playbook run. For this step, I created a callback plugin which launches the cleanup action listening to v2_playbook_on_stats events.
I still have some concerns about concurrency (files yet to be cleaned are stored in a static JSON file on disk) and reliability (not sure that the stats stage happens in all situations, especially on crashes). I'm also not entirely sure that using a callback for this is good practice / the best choice.
Meanwhile, this was quite fun to code and it does the job. I will see if this work gets used by others and might very well enhance all this in the coming weeks (and if you have PRs to fix the already known issues, I'm happy to accept them).
Once installed and the callback plugin enabled (see https://github.com/ansible-ThoTeam/thoteam.var_as_file#installing-the-collection), the lookup can be used anywhere to get a file path containing the passed content. For example:
- name: Get a filename with the given content for later use
  ansible.builtin.set_fact:
    my_tmp_file: "{{ lookup('thoteam.var_as_file.var_as_file', some_variable) }}"

- name: Use in place in a module where a file is mandatory and you have the content in a var
  community.general.java_cert:
    pkcs12_path: "{{ lookup('thoteam.var_as_file.var_as_file', pkcs12_store_from_vault) }}"
    cert_alias: default
    keystore_path: /path/to/my/keystore.jks
    keystore_pass: changeit
    keystore_create: yes
    state: present
These are the relevant parts of the two plugin files. I removed the Ansible documentation vars (for conciseness), which you can find in the git repo directly if you wish.
plugins/lookup/var_as_file.py
from ansible.errors import AnsibleError
from ansible.plugins.lookup import LookupBase
from ansible.module_utils.common.text.converters import to_native
from ansible_collections.thoteam.var_as_file.plugins.module_utils.var_as_file import VAR_AS_FILE_TRACK_FILE
from hashlib import sha256
import tempfile
import json
import os


def _hash_content(content):
    """
    Returns the hex digest of the sha256 sum of content
    """
    return sha256(content.encode()).hexdigest()


class LookupModule(LookupBase):

    created_files = dict()

    def _load_created(self):
        if os.path.exists(VAR_AS_FILE_TRACK_FILE):
            with open(VAR_AS_FILE_TRACK_FILE, 'r') as jfp:
                self.created_files = json.load(jfp)

    def _store_created(self):
        """
        Serialize the created files as JSON in the tracking file
        """
        with open(VAR_AS_FILE_TRACK_FILE, 'w') as jfp:
            json.dump(self.created_files, jfp)

    def run(self, terms, variables=None, **kwargs):
        '''
        terms contains the content to be written to the temporary file
        '''
        try:
            self._load_created()
            ret = []

            for content in terms:
                content_sig = _hash_content(content)
                file_exists = False

                # Check if a file was already created for this content and verify it
                if content_sig in self.created_files:
                    if os.path.exists(self.created_files[content_sig]):
                        with open(self.created_files[content_sig], 'r') as efh:
                            if content_sig == _hash_content(efh.read()):
                                file_exists = True
                                ret.append(self.created_files[content_sig])
                            else:
                                os.remove(self.created_files[content_sig])

                # Create / replace the file
                if not file_exists:
                    temp_handle, temp_path = tempfile.mkstemp(text=True)
                    with os.fdopen(temp_handle, 'a') as temp_file:
                        temp_file.write(content)
                    self.created_files[content_sig] = temp_path
                    ret.append(temp_path)

            self._store_created()
            return ret

        except Exception as e:
            raise AnsibleError(to_native(repr(e)))
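The module_utils file imported at the top of both plugins is not reproduced here; it essentially just defines the path of the tracking file. A plausible minimal version for reference (the actual constant value lives in the repo, so treat this as an assumption):

# plugins/module_utils/var_as_file.py (sketch; see the repo for the real one)
import os
import tempfile

# Static JSON file tracking the temporary files created by the lookup
VAR_AS_FILE_TRACK_FILE = os.path.join(tempfile.gettempdir(), 'var_as_file_track.json')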
plugins/callback/clean_var_as_file.py
from ansible.plugins.callback import CallbackBase
from ansible_collections.thoteam.var_as_file.plugins.module_utils.var_as_file import VAR_AS_FILE_TRACK_FILE
from ansible.module_utils.common.text.converters import to_native
from ansible.errors import AnsibleError
import os
import json


def _make_clean():
    """Clean all files listed in VAR_AS_FILE_TRACK_FILE"""
    try:
        with open(VAR_AS_FILE_TRACK_FILE, 'r') as jfp:
            files = json.load(jfp)
        for f in files.values():
            os.remove(f)
        os.remove(VAR_AS_FILE_TRACK_FILE)
    except Exception as e:
        raise AnsibleError(to_native(repr(e)))


class CallbackModule(CallbackBase):
    ''' This Ansible callback plugin cleans up files created by the thoteam.var_as_file.var_as_file lookup '''

    CALLBACK_VERSION = 2.0
    CALLBACK_TYPE = 'utility'
    CALLBACK_NAME = 'thoteam.var_as_file.clean_var_as_file'
    CALLBACK_NEEDS_WHITELIST = False
    # This one doesn't work for a collection plugin.
    # Needs to be enabled anyway in the ansible.cfg callbacks_enabled option.
    CALLBACK_NEEDS_ENABLED = False

    def v2_playbook_on_stats(self, stats):
        _make_clean()
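For reference, and as hinted in the comment above, enabling the callback goes through the callbacks_enabled option; a sketch of what the relevant ansible.cfg entry should look like (see the linked README for the authoritative instructions):

[defaults]
callbacks_enabled = thoteam.var_as_file.clean_var_as_file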
I'll be happy to get any feedback if you give it a try.
Related
I use pyinvoke which has a task decorator that works like this:
@task
def mycommand(
    # MUST include context param even if it's not used
    ctx: Context
):
    # Do stuff, but don't use ctx
Even if I don't use ctx, I must include it for pyinvoke to work correctly. Pylint throws Unused argument 'ctx' (W0613: unused-argument).
From what I have read in GitHub issues it seems like it would be unreasonable to expect pylint to dig into decorators and figure them all out automatically.
I also don't want to turn off this pylint rule for the entire function.
Is there a way I can tell pylint that, if the @task decorator is used, it should not apply the W0613 rule to the first argument of the function?
When code is too dynamic and impossible for pylint to parse, it's possible to create a "brain", i.e. a simpler version of the code that explains to astroid (pylint's internal code representation) what the code does. Generally this is what a pylint plugin does (for example pylint-django does it for view functions that need request, which is similar to your issue with ctx). Here's an example of a brain for signal directly in astroid, and the documentation. It's possible that a pylint plugin already exists, so you don't have to do this yourself.
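If you do end up writing one yourself, a plugin can also simply suppress a specific message rather than providing a full brain. Below is a minimal, untested sketch using the pylint-plugin-utils helper package; note two assumptions: the checker method that emits unused-argument (VariablesChecker.leave_functiondef here) may move between pylint versions, and this suppresses the message for all arguments of a @task function, not only the first one:

# pylint_invoke_plugin.py - load with: pylint --load-plugins=pylint_invoke_plugin
from pylint.checkers.variables import VariablesChecker
from pylint_plugin_utils import suppress_message


def _is_invoke_task(node):
    """Return True when the function is decorated with invoke's @task."""
    if not node.decorators:
        return False
    # Match bare `@task`, namespaced `@invoke.task`, and `@task(...)` calls
    return any(dec.as_string().split('.')[-1].startswith('task')
               for dec in node.decorators.nodes)


def register(linter):
    """Standard pylint plugin entry point."""
    # unused-argument is emitted when the checker leaves a function
    # definition, so test the FunctionDef node with our predicate
    suppress_message(linter, VariablesChecker.leave_functiondef,
                     'unused-argument', _is_invoke_task)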
I want to write an action plugin (specifically, a variation of 'assert') that logs to a file the name of the role calling it, without including the role name as an argument to the plugin.
I can see (per this question) that "{{role_name}}" is a well-defined variable. But I have no idea how to access it in Python.
I don't want to have to do:
- name: example assert
  custom_assert:
    that: 1 > 0
    msg: "Basic maths has broken"
    role: "{{role_name}}"
I've tried out the following method (based on the email exchange here)
from ansible.inventory.manager import InventoryManager
from ansible import constants as C
inventory = InventoryManager(self._loader, C.DEFAULT_HOST_LIST)
return inventory.get_host(self._connection.host).vars
But all that I can access through there is some variables set in my hosts file, not the full range of variables set with "register" or "setup" or otherwise known to Ansible (such as role_name).
(Additionally, I would like to access the task name as well: although the 'that' and 'msg' arguments nominally include all the info I need, I foresee benefits from being able to log the task name too.)
I think that the solution provided under Custom Ansible Callback not receiving group_vars/host_vars is what you are looking for.
Basically, you need to access the play variable manager as follows:
def v2_playbook_on_play_start(self, play):
    variable_manager = play.get_variable_manager()
    hostvars = variable_manager.get_vars()['hostvars']
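That example is for a callback plugin; in an action plugin the variables are handed to you directly through the task_vars argument of run(), and the task name is available on the task object. A minimal, untested sketch of such a custom assert wrapper (the result keys at the end are hypothetical, and role_name is only defined when the task actually runs inside a role):

# plugins/action/custom_assert.py
from ansible.plugins.action import ActionBase


class ActionModule(ActionBase):

    def run(self, tmp=None, task_vars=None):
        result = super(ActionModule, self).run(tmp, task_vars)
        task_vars = task_vars or {}

        # role_name is one of the magic variables merged into task_vars
        role = task_vars.get('role_name', '<no role>')
        # the Task object backing this invocation holds the task name
        task_name = self._task.name

        # ... evaluate the assertion from self._task.args here and
        # log `role` and `task_name` to your file ...
        result['role'] = role
        result['task_name'] = task_name
        return result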
I have a models.ImageField which I sometimes populate with the corresponding forms.ImageField. Sometimes, instead of using a form, I want to update the image field with an ajax POST. I am passing both the image filename and the image content (base64 encoded), so that in my api view I have everything I need. But I do not really know how to do this manually, since I have always relied on form processing, which automatically populates the models.ImageField.
How can I manually populate the models.ImageField having the filename and the file contents?
EDIT
I have reached the following status:
instance.image.save(file_name, File(StringIO(data)))
instance.save()
And this is updating the file reference, using the right value configured in upload_to in the ImageField.
But it is not saving the image. I would have imagined that the first .save call would:
Generate a file name in the configured storage
Save the file contents to the selected file, including handling of any kind of storage configured for this ImageField (local FS, Amazon S3, or whatever)
Update the reference to the file in the ImageField
And the second .save would actually save the updated instance to the database.
What am I doing wrong? How can I make sure that the new image content is actually written to disk, in the automatically generated file name?
EDIT2
I have a very unsatisfactory workaround, which is working but is very limited. This illustrates the problems that using the ImageField directly would solve:
# TODO: workaround because I do not yet know how to correctly populate the ImageField
# This is very limited because:
# - it only uses the local filesystem (no AWS S3, ...)
# - it does not provide the advanced splitting provided by upload_to
local_file = os.path.join(settings.MEDIA_ROOT, file_name)
with open(local_file, 'wb') as f:
    f.write(data)
instance.image = file_name
instance.save()
EDIT3
So, after some more playing around, I have discovered that my first implementation is doing the right thing, but silently failing if the passed data has the wrong format (I was mistakenly passing the base64-encoded instead of the decoded data). I'll post this as a solution.
Just save the file and the instance:
instance.image.save(file_name, File(StringIO(data)))
instance.save()
No idea where the docs for this use case are.
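If you are on Python 3, where StringIO will not accept the decoded binary data, Django's ContentFile can stand in; a minimal sketch, assuming data holds the already base64-decoded bytes:

from django.core.files.base import ContentFile

# ContentFile wraps the raw bytes so that .save() routes them through
# whatever storage backend the field is configured with, honoring upload_to
instance.image.save(file_name, ContentFile(data))
instance.save()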
You can use InMemoryUploadedFile directly to save data:
import base64
import cStringIO  # Python 2; use io.BytesIO on Python 3

from django.core.files.uploadedfile import InMemoryUploadedFile

data = base64.b64decode(request.POST['file'])
file = cStringIO.StringIO(data)

image = InMemoryUploadedFile(file,
                             field_name='file',
                             name=request.POST['name'],
                             content_type="image/jpeg",
                             size=len(data),  # length of the decoded payload, not sys.getsizeof(file)
                             charset=None)
instance.image = image
instance.save()
I'm trying to write some unit tests which involve Roo reading Excel 2007 files. I have the Excel file in my unit test file as a hex string, which in turn is fed into a StringIO instance. I can't simply pass the StringIO object to Roo::Spreadsheet.open, since that function actually checks if the passed object is a File instance:
def open(file, options = {})
  file = File === file ? file.path : file
  # ...
and if it isn't, proceeds to assume it's a string. Manually specifying the extension doesn't help:
doc = Roo::Spreadsheet.open(file, extension: :xlsx)
Are there any clever ways of getting Roo to use the StringIO instance as a file?
It looks like this version of roo has support for this. Instead of checking explicitly if it's a File class, it checks in duck-typing style if it's a stream based on whether it responds to #seek. The relevant code is here and here.
Per this question: Setting up rake-pipeline for use with handlebars alongside Google App Engine
I'm using a MinispadeFilter as my dependency management system via rake-pipeline.
The weird thing I'm seeing is that the coffeescript and handlebars files have their minispade identifiers set to a tmp directory (where, I'm assuming, the work is being done): screencast.com/t/wIXmREcreW
Is there a way to set that to a root path such that it is normalized? Likewise, my js files, while not pointing to a tmp path, are pointing to the original assets path instead of the public path. I know it's just an identifier, but should I expect them to reference the public path? screencast.com/t/k9kZNcPo
The MinispadeFilter is pretty dumb about generating module identifiers by default. It just names them after the path of the input files. You're seeing the tmp dirs in there from handlebars and coffeescript because the minispade filter is getting the module id from the place where the pipeline turns them into javascript.
The filter takes a :module_id_generator option which allows you to customize the generation of module ids. If you're not familiar with Ruby, this may be a little heavy for you, so bear with me. The module_id_generator option takes a Ruby proc, which is like an anonymous function in JS. The filter then takes this proc that you pass in and executes it for each input file, passing your proc a FileWrapper object representing the input file, and your proc should return a string that will be used as the module id for that file.
Here's a match block from one of my projects:
match "**/*.js" do
minispade :module_id_generator => proc { |input| input.path.sub(/lib\//, 'timelog/').sub(/\.js$/, '') }
concat "js/app.js"
end
The :module_id_generator is a proc which takes a FileWrapper named input and turns it into the module id I want. The input file's path is available as the path method on input. In this case, my JS files are in a lib/ directory, so I use Ruby's sub method to replace the beginning lib/ part of the path with timelog (the name of the project) then again to remove the .js extension. So a js file named lib/models.js would get a module id of timelog/models.