Is there a way to avoid managing selected Puppet resource properties? - windows

The Problem:
I'm trying to have Puppet manage some of the details of several scheduled tasks without managing whether those tasks are enabled. To that end, I declare scheduled_task resources without any explicit enabled attributes, with the intention of communicating that whether the tasks are enabled is not to be altered by Puppet. Puppet, however, is persistent in re-enabling the tasks on every run, just as if I had specified enabled => true for each of them. How can I make it stop doing that?
Already Tried:
I've considered setting the attributes for each datacenter via Hiera, but the reality is that this makes failovers and switches more complicated than necessary. I don't want to change my Puppet code every time that needs to happen. I also can't shut off the puppet-agent runs; I need to maintain the integrity of our rolling deployment system.
Providers?
I've read a little about providers, and it seems like I could handle the behavior there. However, I'm having a hard time finding documentation (if any exists) that explains how to use them to override specific resource properties.
Notify/Subscribe:
I've thought about using notify/subscribe to set the triggers to enabled only on creation. I don't think this is the right solution, because it's not one resource subscribing to/notifying another; it's properties being set on a single resource. If there's some magical way of doing this or something similar, I'd love to know.
Bottom line: I just need Puppet to create the tasks disabled, and then let me turn them on/off without it changing that state on subsequent runs.

Is there a way to ignore resource attribute defaults in Puppet entirely?
Only by explicitly declaring a value for every attribute of every resource.
But that doesn't seem to be the question you really wanted to ask. You seem to be exploring ways to assign attribute values without specifying them as literals in your resource declarations, and I guess you're looking for some kind of layered approach, with the bottom layer replacing or overriding types' and providers' built-in defaults.
As you thought, you could conceivably do this at least in part via providers. You would need to write a custom provider for your resource type and specify that it be used by each resource instance. But don't. This way is complicated to implement and confusing to anyone who later has to read your manifests (maybe including future you).
Notify / subscribe, on the other hand, are simply the wrong tools for the job. They do not do what you seem to think they do, or perhaps you just haven't thought that idea through.
I think you're selling Hiera short and/or inappropriately minimizing the complexity of the task. Probably a mixture of both -- I'm inclined to guess that Hiera can do more for you than you appreciate, but also that the complications you envision will manifest to some extent, in some form, no matter what you do.
Nevertheless, there is an approach that seems to match pretty closely what I think you want: resource default declarations (this feature is present, in substantially the same form, in every Puppet version released in at least the last nine years). Thus, you might cause all scheduled_task resources to be disabled unless you explicitly say otherwise by putting this resource default declaration at an appropriate place:
Scheduled_task {
  enabled => false,
}
When choosing where to put such a statement, do note that, unlike anything else in modern Puppet, resource default statements have dynamic scope. The manual discusses that in somewhat more detail.
Revisited:
In light of the clarification of the question, I'll clarify that resource types' providers are where resource management behavior lives. Therefore, if Puppet's behavior on target machines is not what you want in the event of an altogether missing property declaration, then modifying the provider or writing your own are pretty much your only alternatives. Of course, if you have a support contract then perhaps you don't have to do that in-house.
If you don't want to hack on providers -- an altogether reasonable position -- then you're left with managing the properties to their desired values. Supposing that you employ Hiera effectively, however, you should not need to modify any manifests to control which servers have their tasks enabled.

Related

Best approach to map global resources in terraform provider

I'm writing a Terraform provider for a piece of software which has a large set of instance-specific global configuration settings (approximately 300 of them). When you use the provider, you define your endpoint and credentials and then operate within that instance. What I'm struggling to decide is how exactly to manage this config. It's not a resource that is created or destroyed, so I'm not sure that creating a global_config resource would be the best approach: all the values will already have been initialised during the setup of the system and can only be overridden, the config cannot be destroyed, and you can't have more than two config resources. Since you should be able to override all entries, it can't be a data source either.
I haven't managed to find any relevant documentation (or even similar examples) so far, so I would be very grateful if someone could point me to anything relevant, or could suggest how best to achieve this. Thanks.
Terraform's provider model is designed primarily for objects that Terraform itself can create or destroy. There is no built-in support for automatically "adopting" an existing object to be under Terraform's management, because Terraform generally assumes that each object is managed by exactly one declared resource instance and Terraform aims to preserve that assumption by being the one to have created the object.
However, there are some existing examples in other systems of this sort of "singleton" object that is implicitly created but can have its settings changed. Key examples for study are the resource types for default VPCs and their default public subnets in AWS.
There are currently two broad ways to represent this situation in Terraform, neither of which is perfect, so each has some advantages and disadvantages to consider:
Mandatory terraform import: you can potentially build your resource type so that its "create" action always fails immediately, telling the user to import the existing object, and then implement the "import" action to allow users to explicitly bind their existing object to their Terraform resource instance using the terraform import command.
This is the more explicit of the two options in that it requires the user to intentionally declare that the existing object should be managed by this Terraform configuration, in the same way that users normally take that action in Terraform. This means that the user remains in control and can (as they must always do when importing) take care to import that object into only one resource instance in one Terraform configuration, thereby preserving Terraform's uniqueness assumption.
However, it also adds a mandatory extra setup step to any Terraform configuration which uses this resource type. That extra step does not fit well into typical automation around Terraform, and so that step will often need to be taken in an exceptional way outside of a team's normal workflow.
Treat "create" as if it were "adopt": since the actions a provider is expected to implement for a resource type are just a matter convention, there's no technical reason that your "create" action cannot just verify that the configured object exists and return success without creating anything. I call that "adopting" here to represent the idea that Terraform will then assume that this existing object is now under the exclusive management of whatever resource instance claimed to have created it, but "adopting" is not actually a formal part of Terraform's workflow.
This has the advantage of fitting well into an existing Terraform workflow, requiring no unusual additional steps on the part of the operator.
However, it also means that it's easier to accidentally adopt the same object into two different resource instances, either in the same configuration or in separate configurations. The consequences of doing that will vary depending on what the object represents, but at minimum it will likely result in the different resource instances "fighting" one another, constantly undoing each other's work on each new Terraform run and thus never converging on a stable desired state.
The second of these is the more convenient of the two and so is the one that existing providers have typically chosen as long as the consequences of incorrect multiple-adoption are just the risk of a non-converging system: that situation is confusing and kinda annoying, but also often not super harmful.
The first is the safer of the two because it guards against the accidental multiple-adoption problem. It could be appropriate if two configurations fighting to control a single object may have more significant consequences, such as one configuration breaking the other one by changing its settings in a way that is invalid for the other use-case.

Does this "overly general" type of programming have a name?

Anyone who has experience with the Salesforce platform will know it can essentially be used as a backend for a lot of web applications. It lets the end user define custom objects and the fields on those objects. So, for instance, rather than having some entity as a strongly-typed class in the code, they have a generic "custom object", whose behaviour and data are defined by the fields you choose and the triggers and rules you apply to it. That way they don't have to update the code, recompile and redeploy every time a user adds one (which, given that it is a web service, would be impractical and would cause a lot of downtime).
I was thinking about how this could be implemented. Salesforce may do it in a very complex way, but I'm specifically thinking about how I could implement something similar myself. So far I've come up with this:
An "object defintion", which contains all the metadata for a specific record type. Equivalent to a hardcoded class definition.
A generic "record", probably with some sort of dictionary/map tying values to field identifiers that exist in the object definition.
When operating on user data, both the record and the object definition need to be in memory so that the integrity of the data can be checked. Behaviour normally provided by methods can be applied using some kind of trigger system (again, I'm using a Salesforce example here because it's the best example I know of) with defined actions/events.
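A minimal sketch of that shape, to make the idea concrete (all type names, field names and value kinds here are purely illustrative):

#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// The kinds of values a field may hold; a real system would support more.
using FieldValue = std::variant<std::string, double, bool>;

// "Object definition": metadata describing which fields a record type has
// and which FieldValue alternative each field must hold.
struct ObjectDefinition {
    std::string name;
    std::map<std::string, std::size_t> fields;  // field name -> expected variant index
};

// Generic "record": a bag of values validated against its definition.
struct Record {
    const ObjectDefinition* definition;
    std::map<std::string, FieldValue> values;

    void set(const std::string& field, FieldValue value) {
        auto it = definition->fields.find(field);
        if (it == definition->fields.end())
            throw std::runtime_error("unknown field: " + field);
        if (value.index() != it->second)
            throw std::runtime_error("wrong type for field: " + field);
        values[field] = std::move(value);
    }
};

int main() {
    ObjectDefinition appointment{"Appointment",
        {{"Subject", 0}, {"DurationHours", 1}, {"Confirmed", 2}}};

    Record r{&appointment, {}};
    r.set("Subject", std::string("Initial consultation"));
    r.set("DurationHours", 1.5);
    r.set("Confirmed", true);
    // r.set("Colour", std::string("blue"));  // would throw: not in the definition
}

Triggers of the kind described above would then just be callbacks registered against the object definition and invoked from set().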
This whole system seems very clunky, slow (without serious optimisation), and like it would be prone to problems which wouldn't plague 99% of software projects, so I'd like to learn more about it, but I have no idea where to start looking.
Is the idea I've laid out above already an existing paradigm and if so what is it called?
You have encountered custom fields. The design is to enable tenant-specific fields against a fixed entity. Since multi-tenancy at the highest level demands that a single codebase/database be used for all tenants, with the option of full customization, this design is the best approach. The link below points to a patent that was granted for managing custom fields per tenant.
https://www.google.com/patents/US7779039

Extending functionality of existing program I don't have source for

I'm working on a third-party program that aggregates data from a bunch of different, existing Windows programs. Each program has a mechanism for exporting the data via the GUI. The most brain-dead approach would have me generate extracts by using AutoIt or some other GUI manipulation program to generate the extractions via the GUI. The problem with this is that people might be interacting with the computer when, suddenly, some automated program takes over. That's no good. What I really want to do is somehow have a program run once a day and silently (i.e. without popping up any GUIs) export the data from each program.
My research is telling me that I need to hook each application (assume these applications are always running) and inject a custom DLL to trigger each export. Am I remotely close to being on the right track? I'm a fairly experienced software dev, but I don't know a whole lot about reverse engineering or hooking. Any advice or direction would be greatly appreciated.
Edit: I'm trying to manage the availability of a certain type of professional. Their schedules are stored in proprietary systems. With their permission, I want to install an app on their system that extracts their schedule from whichever system they are using and uploads the information to a central server so that I can present that information to potential clients.
I am aware of four ways of extracting the information you want, each with its own advantages and disadvantages. Before you do anything, you need to be aware that any solution you create is not guaranteed, and in fact is very unlikely, to continue working should the target application ever update. The reason is that in each case you are relying on an implementation detail rather than a pre-defined interface through which to export your data.
Hooking the GUI
The first way is to hook the GUI, as you have suggested. What you are doing in this case is simply reading off what an actual user would see. This is in general easier, since you are hooking the WinAPI, which is clearly defined. One danger is that what the program displays may be inconsistent or incomplete in comparison to the internal data it is supposed to be representing.
Typically, there are two common ways to perform WinAPI hooking:
DLL Injection. You create a DLL which you load into the other program's virtual address space. This means that you have read/write access (write access can be gained with VirtualProtect) to the target's entire memory. From here you can trampoline the functions which are called to set UI information. For example, to check if a window has changed its text, you might trampoline the SetWindowText function (a minimal sketch follows this list). Note that every control has a different interface for setting what it displays. In this case, you are hooking the functions called by the code to set the display.
SetWindowsHookEx. Under the covers, this works similarly to DLL injection and in this case is really just another method for you to extend/subvert the control flow of messages received by controls. What you want to do in this case is hook the window procedures of each child control. For example, when an item is added to a ComboBox, it would receive a CB_ADDSTRING message. In this case, you are hooking the messages that are received when the display changes.
One caveat with this approach is that it will only work if the target is using or extending WinAPI controls.
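To make the trampoline idea concrete, here is a minimal sketch of an injected DLL that hooks SetWindowTextW using the Microsoft Detours library (mentioned again at the end of this answer); where the captured text goes is up to you, and OutputDebugStringW is only a stand-in:

#include <windows.h>
#include <detours.h>

// Pointer to the original SetWindowTextW so the hook can call through.
static decltype(&SetWindowTextW) RealSetWindowTextW = SetWindowTextW;

static BOOL WINAPI HookedSetWindowTextW(HWND hWnd, LPCWSTR lpString)
{
    // Capture the text the target is about to display, then pass the call on.
    OutputDebugStringW(lpString ? lpString : L"(null)");
    return RealSetWindowTextW(hWnd, lpString);
}

BOOL APIENTRY DllMain(HMODULE, DWORD reason, LPVOID)
{
    if (reason == DLL_PROCESS_ATTACH) {
        DetourTransactionBegin();
        DetourUpdateThread(GetCurrentThread());
        DetourAttach(&(PVOID&)RealSetWindowTextW, HookedSetWindowTextW);
        DetourTransactionCommit();
    } else if (reason == DLL_PROCESS_DETACH) {
        DetourTransactionBegin();
        DetourUpdateThread(GetCurrentThread());
        DetourDetach(&(PVOID&)RealSetWindowTextW, HookedSetWindowTextW);
        DetourTransactionCommit();
    }
    return TRUE;
}

You would still need a separate injector (for example CreateRemoteThread plus LoadLibrary, or Detours' own helpers) to get this DLL into the target process.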
Reading from the GUI
Instead of hooking the GUI, you can alternatively use the WinAPI to read directly from the target windows. However, in some cases this may not be allowed; there is not much to do then but to try it and see if it works. This may in fact be the easiest approach. Typically, you will send messages such as WM_GETTEXT to query the target window for what it is currently displaying. To do this, you will need to work out the exact window hierarchy containing the control you are interested in. For example, if you want to read an edit control, you will need to see what parent window(s) are above it in the window hierarchy in order to obtain its window handle.
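A minimal sketch of that approach; the window class names are hypothetical, and you would discover the real hierarchy with a tool such as Spy++:

#include <windows.h>
#include <iostream>
#include <string>

int main()
{
    // Hypothetical top-level window class and a child Edit control;
    // substitute the real hierarchy of the target application.
    HWND top  = FindWindowW(L"SchedulerMainWindowClass", nullptr);
    HWND edit = top ? FindWindowExW(top, nullptr, L"Edit", nullptr) : nullptr;
    if (!edit) {
        std::wcerr << L"target control not found\n";
        return 1;
    }

    // Ask the control how long its text is, then ask for the text itself.
    LRESULT len = SendMessageW(edit, WM_GETTEXTLENGTH, 0, 0);
    std::wstring text(static_cast<size_t>(len) + 1, L'\0');
    SendMessageW(edit, WM_GETTEXT, static_cast<WPARAM>(text.size()),
                 reinterpret_cast<LPARAM>(&text[0]));

    std::wcout << text.c_str() << L"\n";
    return 0;
}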
Reading from memory (Advanced)
This approach is by far the most complicated but if you are able to fully reverse engineer the target program, it is the most likely to get you consistent data. This approach works by you reading the memory from the target process. This technique is very commonly used in game hacking to add 'functionality' and to observe the internal state of the game.
Consider that as well as storing information in the GUI, programs often hold their own internal model of all the data. This is especially true when the controls used are virtual and simply query subsets of the data to be displayed. This is an example of a situation where the first two approaches would not be of much use. This data is often held in some sort of abstract data type such as a list or perhaps even an array. The trick is to find this list in memory and read the values off directly. This can be done externally with ReadProcessMemory or internally through DLL injection again. The difficulty lies mainly in two prerequisites:
Firstly, you must be able to reliably locate these data structures. The problem is that code is not guaranteed to be in the same place, especially with features such as ASLR. Colloquially, this is sometimes referred to as code-shifting. ASLR can be defeated by using the offset from a module base and dynamically getting the module base address with functions such as GetModuleHandle. Besides ASLR, another reason this occurs is dynamic memory allocation (e.g. through malloc). In such cases, you will need to find a heap address storing the pointer (which would, for example, be the return value of malloc), dereference that and find your list. That pointer would be prone to ASLR, and instead of a pointer, it might be a double-pointer, triple-pointer, etc.
The second problem you face is that it would be rare for each list item to be a primitive type. For example, instead of a list of character arrays (strings), it is likely that you will be faced with a list of objects. You would need to further reverse engineer each object type and understand its internal layout (at least be able to determine the offsets, relative to the object base, of the primitive values you are interested in). More advanced methods revolve around reverse engineering the vtables of objects and calling their 'API'.
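A minimal sketch of the external variant, reading through a multi-level pointer chain with ReadProcessMemory; the process ID, module base and every offset below are placeholders that would come from your own reverse engineering of the target:

#include <windows.h>
#include <cstdint>
#include <iostream>
#include <vector>

// Follow a pointer chain in another process: read a pointer at 'address',
// add the next offset, and repeat until the final address is reached.
static uintptr_t FollowPointerChain(HANDLE process, uintptr_t address,
                                    const std::vector<uintptr_t>& offsets)
{
    for (uintptr_t offset : offsets) {
        uintptr_t next = 0;
        if (!ReadProcessMemory(process, reinterpret_cast<LPCVOID>(address),
                               &next, sizeof(next), nullptr))
            return 0;
        address = next + offset;
    }
    return address;
}

int main()
{
    DWORD pid = 1234;  // hypothetical PID of the target process
    HANDLE process = OpenProcess(PROCESS_VM_READ, FALSE, pid);
    if (!process) return 1;

    // The module base would normally be resolved at runtime (e.g. via a
    // Toolhelp snapshot) because of ASLR; the offsets come from reversing.
    uintptr_t moduleBase = 0x140000000;  // 64-bit example value
    uintptr_t itemAddress = FollowPointerChain(process, moduleBase + 0x1A2B30,
                                               { 0x18, 0x250, 0x8 });

    int value = 0;
    ReadProcessMemory(process, reinterpret_cast<LPCVOID>(itemAddress),
                      &value, sizeof(value), nullptr);
    std::cout << "value = " << value << "\n";

    CloseHandle(process);
    return 0;
}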
You might notice that I am not able to give information here which is specific. The reason is that by its nature, using this method requires an intimate understanding of the target's internals and as such, the specifics are defined only by how the target has been programmed. Unless you have knowledge and experience of reverse engineering, it is unlikely you would want to go down this route.
Hooking the target's internal API (Advanced)
As with the above solution, instead of digging for data structures, you dig for the internal API. I briefly touched on this when discussing vtables earlier. Here, you would be attempting to find the internal functions that are called when the GUI is modified. Typically, when a view/UI is modified, instead of directly calling the WinAPI to update it, a program will have its own wrapper function, which in turn calls the WinAPI. You simply need to find this function and hook it. Again this is possible, but requires reverse engineering skills. You may find that you discover functions which you want to call yourself. In that case, as well as locating the function, you have to reverse engineer the parameters it takes and its calling convention, and you will need to ensure that calling it has no side effects.
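As a rough sketch only: suppose reversing the target showed an internal "update view" function at module base plus some offset. The offset, name, signature and calling convention below are pure assumptions standing in for what your own analysis would reveal, and the hook again uses Detours:

#include <windows.h>
#include <detours.h>
#include <cstdint>

// Hypothetical internal function: void UpdateScheduleView(void* self, const wchar_t* entry)
using UpdateScheduleFn = void(__fastcall*)(void* self, const wchar_t* entry);
static UpdateScheduleFn RealUpdateScheduleView = nullptr;

static void __fastcall HookedUpdateScheduleView(void* self, const wchar_t* entry)
{
    OutputDebugStringW(entry);              // capture the data being displayed
    RealUpdateScheduleView(self, entry);    // then let the program carry on
}

BOOL APIENTRY DllMain(HMODULE, DWORD reason, LPVOID)
{
    if (reason == DLL_PROCESS_ATTACH) {
        // Resolve the (assumed) offset against the main module's base address.
        auto base = reinterpret_cast<uintptr_t>(GetModuleHandleW(nullptr));
        RealUpdateScheduleView = reinterpret_cast<UpdateScheduleFn>(base + 0x4F2A0);

        DetourTransactionBegin();
        DetourUpdateThread(GetCurrentThread());
        DetourAttach(&(PVOID&)RealUpdateScheduleView, HookedUpdateScheduleView);
        DetourTransactionCommit();
    }
    return TRUE;
}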
I would consider this approach to be advanced. It can certainly be done and is another common technique used in game hacking to observe internal states and to manipulate a target's behaviour, but is difficult!
The first two methods are well suited to reading data from WinAPI programs and are by far the easier. The latter two methods allow greater flexibility: with enough work you are able to read anything and everything encapsulated by the target, but they require a lot of skill.
Another point of concern, which may or may not apply to your case, is how easy it will be to update your solution should the target ever be updated. With the first two methods, it is more likely that no changes, or only small changes, will have to be made. With the second two methods, even a small change in the source code can cause a relocation of the offsets you are relying upon. One method of dealing with this is to use byte signatures to dynamically generate the offsets; I wrote another answer some time ago which addresses how this is done.
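A minimal sketch of such a signature scan, run from inside the target (for example from an injected DLL); the pattern bytes are made up, and -1 marks a wildcard byte that is allowed to differ between builds:

#include <windows.h>
#include <cstdint>
#include <vector>

// Scan a byte range for a pattern; entries of -1 match any byte.
static uint8_t* FindSignature(uint8_t* begin, size_t size,
                              const std::vector<int>& pattern)
{
    for (size_t i = 0; i + pattern.size() <= size; ++i) {
        bool match = true;
        for (size_t j = 0; j < pattern.size(); ++j) {
            if (pattern[j] != -1 && begin[i + j] != static_cast<uint8_t>(pattern[j])) {
                match = false;
                break;
            }
        }
        if (match) return begin + i;
    }
    return nullptr;
}

int main()
{
    // Illustrative pattern only; a real signature comes from disassembly.
    std::vector<int> pattern = { 0x48, 0x8B, 0x05, -1, -1, -1, -1, 0x48, 0x85, 0xC0 };

    // Scan the main module's image in the current process.
    auto base = reinterpret_cast<uint8_t*>(GetModuleHandleW(nullptr));
    auto dos  = reinterpret_cast<IMAGE_DOS_HEADER*>(base);
    auto nt   = reinterpret_cast<IMAGE_NT_HEADERS*>(base + dos->e_lfanew);

    uint8_t* hit = FindSignature(base, nt->OptionalHeader.SizeOfImage, pattern);
    return hit ? 0 : 1;
}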
What I have written is only a brief summary of the various techniques that can be used for what you want to achieve. I may have missed approaches, but these are the most common ones I know of and have experience with. Since these are large topics in themselves, I would advise you to ask a new question if you want more detail about any particular one. Note that none of the approaches I have discussed involves any interaction that is visible to the user, so you would have no problem with anything popping up. It would be, as you describe, 'silent'.
This is relevant information about detouring/trampolining which I have lifted from a previous answer I wrote:
If you are looking for ways that programs detour execution of other processes, it is usually through one of two means:
Dynamic (Runtime) Detouring - This is the more common method and is what is used by libraries such as Microsoft Detours. Here is a relevant paper where the first few bytes of a function are overwritten to unconditionally branch to the instrumentation.
(Static) Binary Rewriting - This is a much less common method for rootkits, but is used by research projects. It allows detouring to be performed by statically analysing and overwriting a binary. An old (not publicly available) package for Windows that performs this is Etch. This paper gives a high-level view of how it works conceptually.
Although Detours demonstrates one method of dynamic detouring, there are countless methods used in the industry, especially in the reverse engineering and hacking arenas. These include the IAT and breakpoint methods I mentioned above. To 'point you in the right direction' for these, you should look at 'research' performed in the fields of research projects and reverse engineering.

`global` assertions?

Are there any languages with the possibility of declaring global assertions - that is, an assertion that should hold during the whole program execution? So that it would be possible to write something like:
global assert (-10 < speed < 10);
and this assertion will be checked every time speed changes state?
Eiffel supports all the different kinds of contracts: preconditions, postconditions, invariants... you may want to use that.
On the other hand, why do you have a global variable? Why not create a class which modifies the speed? Doing so, you can easily check your condition every time the value changes.
I'm not aware of any languages that truly do such a thing, and I would doubt that there exist any since it is something that is rather hard to implement and at the same time not something that a lot of people need.
It is often better to simply assert that the inputs are valid and that modifications are only done when allowed and in a defined, sane way. That removes the need for "global asserts".
You can get this effect "through the backdoor" in several ways, though none is truly elegant, and two are rather system-dependent:
If your language allows operator overloading (as e.g. C++ does), you can make a class that overloads any operator which modifies the value. It is considerable work, but conceptually trivial, to do the assertions in there (a sketch follows this list).
On pretty much every system, you can change the protection of memory pages that belong to your process. You could put the variable (and any other variables that you want to assert) separately and set the page to readonly. This will cause a segmentation fault when the value is written to, which you can catch (and verify that the assertion is true). Windows even makes this explicitly available via "guard pages" (which are really only "readonly pages in disguise").
Most modern processors support hardware breakpoints. Unless your program is to run on some very exotic platform, you can exploit these to have more fine-grained control in a similar way as by tampering with protections. See for example this article on another site, which describes how to do it under Windows on x86. This solution will require you to write a kind of "mini-debugger" and implies that you may possibly run into trouble when running your program under a real debugger.
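A minimal sketch of the operator-overloading route from the first bullet: a wrapper type whose mutating operators re-check the invariant. The name, the bounds and the choice of operators are only illustrative; a full version would cover every mutating operator you actually use:

#include <cassert>

class BoundedSpeed {
public:
    explicit BoundedSpeed(double v = 0.0) : value_(v) { check(); }

    // Every mutating operator funnels through check().
    BoundedSpeed& operator=(double v)  { value_ = v;  check(); return *this; }
    BoundedSpeed& operator+=(double v) { value_ += v; check(); return *this; }
    BoundedSpeed& operator-=(double v) { value_ -= v; check(); return *this; }

    operator double() const { return value_; }  // reads stay cheap

private:
    void check() const { assert(-10.0 < value_ && value_ < 10.0); }
    double value_;
};

int main() {
    BoundedSpeed speed(3.0);
    speed += 5.0;   // fine: 8.0 is within range
    speed += 5.0;   // fires the assertion: 13.0 is out of range
}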

What are the drawbacks to merging the Task and Bug Work Items and only use one of them in TFS 2010?

I was thinking that I’d rather only use the Task Work Item and ignore the Bug Work Item. This is my thinking as I set things up for my team. I’m on a quest to see why I shouldn’t do this. From my perspective a Task is either a new item or a bug item. There is no need to use two distinct Work Item Types. To make this happen in TFS I’ll start with the Bug Work Item and create a custom field (“Item Type”) to distinguish the two task types: new/bug. Both new tasks and bugs will share the same fields. Anyone see any major drawbacks to this approach?
The main reason Tasks/Issues/Bugs/etc. are different work item types is that the individual fields of each type can be configured differently.
For example, by default, Bugs have a Triage property, Issues have a Due date, and Tasks have a Discipline. The States of a Bug (Active/Resolved/Closed) are different from those of an Issue (Active/Closed).
By merging them into a single work item type you would lose the ability to configure each one uniquely.
Also, the rules followed when a Bug and a Task are closed, for example, are generally different. Segregating them into separate work item types allows a simpler rule set.
Work item type is also a standard column in all queries.
Overall, it depends on how extensively you are using Team Foundation. If your project is small, and the above don't matter, it's not going to hurt. Though I don't see much gain either.
I would suggest keeping Bug and dropping Task if you want to merge them. By default when you check in code and Resolve with a bug, it sets the status to Resolved and assigns it to whoever created it - usually a tester, but in your case possibly a PM. That person can then test to confirm the work is done and close it. You can set up alerts on their work items so they get an email and know that progress has happened. Alternatively if you use Task, when you Resolve at check in it is just closed. No alerts, no further testing. YMMV but on some of our projects we use Bug for things like "user would like to add a new report" and it fits our process well. (For others we keep the distinction for reporting purposes.)
It all boils down to 3 things:
Creation / prioritization
Reporting / Notifications
Completion workflow
Typically creation of a Task involves different fields than a Bug. For a bug you'll want to know things like environment found in, who notified you, severity, priority, etc.
For tasks you usually want to know the requestor, reason behind it, business unit impacted and iteration it is scheduled for. Tasks might be long term goals that result in new or enhanced functionality.
Reporting and Notifications of the two are generally different as well. PM's are going to track tasks to ensure deliverables are met, your tech support area is going to track bugs.
Next, bugs will generally result in hotfixes and service packs. Depending on severity, this might involve a high-priority push through QA and a release as quickly as possible. Tasks are more laid back and will go through all forms of regression and regular testing, with a period of acceptance by the impacted business unit.
Finally, bugs may impact previous versions of your software. Tasks will almost always be for either the version currently under development or the one after that.
In short, they are fundamentally different things. They might share most fields in common, however by combining them you are restricting yourself in both reporting and workflows. Today this might be okay; however within the next month or next year this could seriously restrict you.
Considering that maintenance of work item types is an incredibly easy thing, there is almost no benefit to merging them.
