Best approach to map global resources in a Terraform provider (Go)

I'm writing a Terraform provider for a piece of software that has a large set of instance-specific global configuration entries (approximately 300 of them). When you use the provider, you define your endpoint and credentials and then operate within that instance. What I'm struggling to decide is how exactly to manage this config. It's not a resource that is created or destroyed, so I'm not sure a global_config resource would be the best approach: all the values will already have been initialised during setup of the system and can only be overridden, the config cannot be destroyed, and you can't have more than one config resource. Since you should be able to override all entries, it can't be a data source either.
I haven't managed to find any relevant documentation (or even similar examples) so far, so I would be very grateful if someone could point me to anything relevant or suggest how best to achieve this. Thanks.

Terraform's provider model is designed primarily for objects that Terraform itself can create or destroy. There is no built-in support for automatically "adopting" an existing object to be under Terraform's management, because Terraform generally assumes that each object is managed by exactly one declared resource instance and Terraform aims to preserve that assumption by being the one to have created the object.
However, there are some existing examples in other systems of this sort of "singleton" object that is implicitly created but can have its settings changed. Key examples for study are the resource types for default VPCs and their default public subnets in AWS.
There are currently two broad ways to represent this situation in Terraform, neither of which is perfect and so each of which has some advantages and disadvantages to consider:
Mandatory terraform import: you can potentially build your resource type so that its "create" action always fails immediately, telling the user to import the existing object, and then implement the "import" action to allow users to explicitly bind their existing object to their Terraform resource instance using the terraform import command.
This is the more explicit of the two options in that it requires the user to intentionally declare that the existing object should be managed by this Terraform configuration, in the same way that users normally take that action in Terraform. This means that the user remains in control and can (as they must always do when importing) take care to import that object into only one resource instance in one Terraform configuration, thereby preserving Terraform's uniqueness assumption.
However, it also adds a mandatory extra setup step to any Terraform configuration which uses this resource type. That extra step does not fit well into typical automation around Terraform, and so that step will often need to be taken in an exceptional way outside of a team's normal workflow.
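For illustration, here is a rough sketch of that first option built on the terraform-plugin-sdk/v2 helper/schema package; the resource layout, the single catch-all "settings" attribute, and the empty read/update bodies are placeholders rather than anything from a real provider:

package provider

import (
	"context"

	"github.com/hashicorp/terraform-plugin-sdk/v2/diag"
	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// resourceGlobalConfig models the singleton configuration object.
func resourceGlobalConfig() *schema.Resource {
	return &schema.Resource{
		// The config always exists already, so "create" refuses to run and
		// points the user at terraform import instead.
		CreateContext: func(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
			return diag.Errorf("the global config already exists; adopt it with: terraform import <resource address> global")
		},
		ReadContext: func(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
			// Fetch the current values from the API and d.Set them here.
			return nil
		},
		UpdateContext: func(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
			// Push only the overridden values back to the API here.
			return nil
		},
		// "Delete" can only make Terraform forget the object; the real config
		// cannot be destroyed.
		DeleteContext: func(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
			d.SetId("")
			return nil
		},
		Importer: &schema.ResourceImporter{
			StateContext: schema.ImportStatePassthroughContext,
		},
		Schema: map[string]*schema.Schema{
			// Placeholder: in practice, one attribute per overridable setting.
			"settings": {
				Type:     schema.TypeMap,
				Optional: true,
				Elem:     &schema.Schema{Type: schema.TypeString},
			},
		},
	}
}

The only unusual parts are the create function, which always fails with an import instruction, and the delete function, which merely removes the object from state.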
Treat "create" as if it were "adopt": since the actions a provider is expected to implement for a resource type are just a matter convention, there's no technical reason that your "create" action cannot just verify that the configured object exists and return success without creating anything. I call that "adopting" here to represent the idea that Terraform will then assume that this existing object is now under the exclusive management of whatever resource instance claimed to have created it, but "adopting" is not actually a formal part of Terraform's workflow.
This has the advantage of fitting well into an existing Terraform workflow, requiring no unusual additional steps on the part of the operator.
However, it also means that it's easier to accidentally adopt the same object into two different resource instances, either in the same configuration or in separate configurations. The consequences of doing that will vary depending on what the object represents, but at minimum it will likely result in the different resource instances "fighting" one another, constantly undoing each other's work on each new Terraform run and thus never converging on a stable desired state.
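Under the same assumptions as the sketch above, the second option changes only the create behaviour; everything it would call into is deliberately elided:

// A drop-in replacement for the CreateContext field in the earlier sketch:
// "create" adopts the existing object instead of creating anything.
func resourceGlobalConfigCreate(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
	// 1. Verify via the provider's API client that the existing config is
	//    reachable (the details depend entirely on the software's API).

	// 2. There is only ever one config per endpoint, so a fixed ID is enough
	//    for Terraform to track the adopted object.
	d.SetId("global")

	// 3. Adopting also means immediately enforcing the declared overrides,
	//    which is the same work the update function already does, so in
	//    practice you would delegate to it here.
	return nil
}

The schema, read, update, and delete behaviour can stay exactly as in the import-based variant.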
The second of these is the more convenient of the two and so is the one that existing providers have typically chosen as long as the consequences of incorrect multiple-adoption are just the risk of a non-converging system: that situation is confusing and kinda annoying, but also often not super harmful.
The first is the safer of the two because it guards against the accidental multiple-adoption problem. It could be appropriate if two configurations fighting to control a single object may have more significant consequences, such as one configuration breaking the other one by changing its settings in a way that is invalid for the other use-case.

Related

NiFi Controller Service Reuse and Schema Registry Architecture

When creating Apache NiFi controller services, I'm interested in hearing about when it makes sense to create new ones and when to re-share existing ones.
Currently I have a CsvReader and CSVRecordSetWriter at the root process group and they are reused heavily in child process groups. I have tried to set them up to be as dynamic and flexible as possible, to cover the widest range of use cases. I am currently setting the Schema Text property in each like this:
Reader Schema Text: ${avro.schema:notNull():ifElse(${avro.schema}, ${avro.schema.reader})}
Writer Schema Text: ${avro.schema:notNull():ifElse(${avro.schema}, ${avro.schema.writer})}
A very common pattern I have is to map files with different fields from different sources into a common format (common schema). So one thought is to use the ConvertRecord or UpdateRecord processors with the avro.schema.reader and avro.schema.writer attributes set to the input and output schemas. Then I would have the writer always set the avro.schema attribute, so any time I read records again further along in a flow it would default to using avro.schema. It feels dirty to leave the reader and writer schema attributes hanging around. Is there a better way from an architecture standpoint? Why have tons of controller services hanging around at different levels? Aside from some settings that may need to be different for some use cases, am I missing anything?
I'm also curious to hear how others organize their schemas. I don't have a need to reuse them in disparate locations across different processor blocks or to reference different versions, so it seems like a waste to centralize them or maintain a schema registry server (which would also require upgrades and maintenance) when I can just use AvroSchemaRegistry.
In the end, I decided it made more sense to split the controller service into two: one for conversions from Schema A to Schema B, and another that uses the same avro.schema attribute that normal/default readers and writers use when adding new ones. This allows for explicitly choosing the right pattern at processor block configuration time rather than relying on the implicit configuration of a single processor. Plus you get the added benefit of not stopping all flows (just a subset) when you only need to tweak settings for one of those two patterns.

SSIS Get List of all OLE DB Destinations in Data Flow

I have several SSIS packages that we use to load data from multiple different OLE DB data sources into our DB. Inside each package we have several Data Flow tasks that hold a large number of OLE DB Sources and Destinations. What I'm looking for is a way to get a text output that holds all of the Destinations' flow configurations (Sources would be good too, but not top of my list).
I'm trying to make sure that all my OLE DB Destination flows are pointed at the right table, as I've found a few hiccups, without having to double-click on each Data Flow task and check that way; it just becomes tedious and is still prone to missing things.
I'm viewing the packages in Visual Studio 2013. Any help is appreciated!
I am not aware of any programmatic ways to discover this data, other than building an application to read the XML within the *.dtsx package. Best advice, pack a lunch and have at it. I know for sure that there is nothing with respect to viewing and setting database tables (only server connections).
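As a starting point for that "read the XML yourself" route, here is a rough command-line sketch in Go (the language used elsewhere on this page). The element and attribute names it looks for ("component", "property", "OpenRowset") and the package file name are assumptions based on how dtsx packages are typically laid out, and may need adjusting for your SSIS version:

package main

import (
	"encoding/xml"
	"fmt"
	"log"
	"os"
)

func main() {
	f, err := os.Open("Package.dtsx") // placeholder path to one package
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	dec := xml.NewDecoder(f)
	currentComponent := ""

	for {
		tok, err := dec.Token()
		if err != nil {
			break // io.EOF (or a parse error) ends the walk
		}
		se, ok := tok.(xml.StartElement)
		if !ok {
			continue
		}
		switch se.Name.Local {
		case "component":
			// Remember which data-flow component we are currently inside.
			for _, a := range se.Attr {
				if a.Name.Local == "name" {
					currentComponent = a.Value
				}
			}
		case "property":
			// OLE DB Destinations typically keep their target table in an
			// OpenRowset property.
			for _, a := range se.Attr {
				if a.Name.Local == "name" && a.Value == "OpenRowset" {
					var table string
					if err := dec.DecodeElement(&table, &se); err == nil {
						fmt.Printf("%s -> %s\n", currentComponent, table)
					}
				}
			}
		}
	}
}

Since OLE DB Sources that read straight from a table also carry an OpenRowset property, a sketch like this would list them too, which covers the secondary wish as well.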
Though, one suggestion once you have determined the list: create a variable (or variables) to store the unique connection strings and then set those connection strings inside the source/destination components. This will make it easier to manage going forward. In fact, you can take it one step further by setting the same values as parameters, as opposed to variables, which have the added benefit of being exposed on the server. This allows either you or the DBA to set the values as you promote through environments or change server nodes.
Also, I recommend rationalizing that solution into smaller solutions if possible. In my opinion, there is nothing worse than one giant solution that tries to do it all. I am not sure if any of this is helpful, but for what it's worth, I do hope it helps.
You can use the SSIS Object Model for your needs. An example can be found here. Look in the method IterateAllDestinationComponentnsInPackage for the exact details. To start understanding the code, start in the Start method and follow the path.
Caveats: Make sure you use the appropriate Monikers and Class IDs for the Data Flow Tasks and your Destination Components. You can also use this for other Control Flow Tasks and Data Flow Components (for example, Source Components as your other need seems to be). Just keep in mind the appropriate Monikers and Class IDs.

Is there a way to avoid managing selected Puppet resource properties?

The Problem:
I'm trying to have Puppet manage some of the details of several scheduled tasks without managing whether those tasks are enabled. To that end, I declare scheduled_task resources without any explicit enabled attributes, with the intention of communicating that whether the tasks are enabled is not to be altered by Puppet. Puppet, however, is persistent in re-enabling the tasks on every run, just as if I had specified enabled => true for each of them. How can I make it stop doing that?
Already Tried:
I've considered setting the attributes for each datacenter via Hiera, but the reality is that this makes failovers and switches more complicated than necessary. I don't want to change my Puppet code every time that needs to happen. I also can't shut off the puppet-agent runs; I need to maintain the integrity of our rolling deployment system.
Providers?
I've read a little about providers, and it seems like I could handle the behavior there. However, I'm having a hard time finding any documentation that explains how to use them to override specific resource properties.
Notify/Subscribe:
I've thought about using notify/subscribe to only set the triggers to enabled on creation. I'm not thinking this is the right solution, because it's not one resource subscribing to/notifying another, it's properties being set on a single resource. If there's some magical way of doing this or something similar, I'd love to know.
Bottom-line: I just need puppet to create disabled tasks, and let me turn them on/off without changing the state in subsequent runs.
Is there a way to ignore resource attribute defaults in Puppet entirely?
Only by explicitly declaring a value for every attribute of every resource.
But that doesn't seem to be the question you really wanted to ask. You seem to be exploring ways to assign attribute values without specifying them as literals in your resource declarations, and I guess you're looking for some kind of layered approach, with the bottom layer replacing or overriding types' and providers' built-in defaults.
As you thought, you could conceivably do this at least in part via providers. You would need to write a custom provider for your resource type and specify that it be used by each resource instance. But don't. This way is complicated to implement and confusing to anyone who later has to read your manifests (maybe including future you).
Notify / subscribe, on the other hand, are simply the wrong tools for the job. They do not do what you seem to think they do, or perhaps you just haven't thought that idea through.
I think you're selling Hiera short and / or inappropriately minimizing the complexity of the task. Probably a mixture of both -- I'm inclined to guess that Hiera can do more for you than you appreciate, but also that the complications you envision will manifest to some extent, in some form, no matter what you do.
Nevertheless, there is an approach that seems to match pretty closely what I think you want: resource default declarations (I link to the latest docs, but this feature is present, in substantially the same form, in every Puppet version released in at least the last nine years). Thus, you might cause all scheduled_task resources to be disabled unless you explicitly say otherwise by putting this resource default declaration at an appropriate place:
Scheduled_task {
  enabled => false,
}
When choosing where to put such a statement, do note that, unlike anything else in modern Puppet, resource default statements have dynamic scope. The manual discusses that in somewhat more detail.
Revisited:
In light of the clarification of the question, I'll clarify that resource types' providers are where resource management behavior lives. Therefore, if Puppet's behavior on target machines is not what you want in the event of an altogether missing property declaration, then modifying the provider or writing your own are pretty much your only alternatives. Of course, if you have a support contract then perhaps you don't have to do that in-house.
If you don't want to hack on providers -- an altogether reasonable position -- then you're left with managing the properties to their desired values. Supposing that you employ Hiera effectively, however, you should not need to modify any manifests to control which servers have their tasks enabled.

Does this "overly general" type of programming have a name?

Anyone who has experience with the Salesforce platform will know it can essentially be used as a backend for a lot of web applications. It lets the end user define custom objects and the fields on those objects. So, for instance, rather than having some entity as a strongly-typed class in the code, they have a generic "custom object" whose behaviour and data are defined by the fields you choose and the triggers and rules you apply to it. That way they don't have to update the code, recompile and redeploy every time a user adds one, which happens a lot and which, given that it's a web service, would be both impractical and cause serious downtime.
I was thinking about how this could be implemented. Salesforce may well do it in a very complex way, but I'm specifically thinking about how I could implement it myself. So far I've come up with this:
An "object defintion", which contains all the metadata for a specific record type. Equivalent to a hardcoded class definition.
A generic "record", probably with some sort of dictionary/map tying values to field identifiers that exist in the object definition.
When operating on user data, both the record and the object definition need to be in memory so that the integrity of the data can be checked. Behaviour normally provided by methods can be applied using some kind of trigger system (again, I'm using a Salesforce example here because it's the best example I know of) with defined actions/events.
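What you've sketched is often described as a metadata-driven or entity-attribute-value (EAV) style of design. Here is a minimal sketch of those two pieces in Go, with all names invented for illustration; the trigger system mentioned above would hang off the definition as callbacks keyed by event name, which is omitted here:

package main

import "fmt"

// FieldDef describes one field of a user-defined object type.
type FieldDef struct {
	Name     string
	Type     string // e.g. "string", "int"
	Required bool
}

// ObjectDefinition is the runtime equivalent of a hardcoded class definition.
type ObjectDefinition struct {
	Name   string
	Fields map[string]FieldDef
}

// Record is a generic instance: a bag of values keyed by field name.
type Record struct {
	Definition *ObjectDefinition
	Values     map[string]interface{}
}

// Validate checks a record's values against its object definition, doing the
// integrity checking that a compiler would otherwise do for a typed class.
func (r *Record) Validate() error {
	for name, def := range r.Definition.Fields {
		v, ok := r.Values[name]
		if !ok {
			if def.Required {
				return fmt.Errorf("%s: missing required field %q", r.Definition.Name, name)
			}
			continue
		}
		switch def.Type {
		case "string":
			if _, ok := v.(string); !ok {
				return fmt.Errorf("%s.%s: expected string, got %T", r.Definition.Name, name, v)
			}
		case "int":
			if _, ok := v.(int); !ok {
				return fmt.Errorf("%s.%s: expected int, got %T", r.Definition.Name, name, v)
			}
		}
	}
	// Reject values for fields that are not in the definition at all.
	for name := range r.Values {
		if _, ok := r.Definition.Fields[name]; !ok {
			return fmt.Errorf("%s: unknown field %q", r.Definition.Name, name)
		}
	}
	return nil
}

func main() {
	invoice := &ObjectDefinition{
		Name: "Invoice",
		Fields: map[string]FieldDef{
			"number": {Name: "number", Type: "string", Required: true},
			"amount": {Name: "amount", Type: "int"},
		},
	}
	rec := &Record{Definition: invoice, Values: map[string]interface{}{"number": "INV-1", "amount": 42}}
	fmt.Println(rec.Validate()) // <nil>
}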
This whole system seems very clunky, slow (without serious optimisation), and like it would be prone to problems which wouldn't plague 99% of software projects, so I'd like to learn more about it, but I have no idea where to start looking.
Is the idea I've laid out above already an existing paradigm and if so what is it called?
You have encountered custom fields. The design goal is to enable tenant-specific fields on top of a fixed entity. Since multi-tenancy at the highest level demands that a single codebase/database be used for all tenants, with the option of full customization, this design is the best approach. The link below points to a patent that was granted for managing custom fields per tenant.
https://www.google.com/patents/US7779039

VB6: Is it possible to implement the Singleton design pattern?

Within VB6 is it possible to implement the Singleton design pattern?
Currently the legacy system I work on has a large amount of IO performed by multiple instances of a particular class. It is desirable to clean up all these instances and have the IO performed by one instance only. This would allow us to add meaningful logging and monitoring to the IO routines.
There are so many ways to do this, and it depends on whether this is a multi-project application with different DLLs or a single project.
If it is a single project and there is a large amount of code that you are worried about changing/breaking, then I suggest the following:
Given a class clsIOProvider that is instantiated all over the place, create a module modIOProvider in the same project.
For each method / property defined in clsIOProvider, create the same set of methods in modIOProvider.
The implementation of those methods, as well as the instance data of the class, should be cloned from clsIOProvider to modIOProvider.
All methods and properties in clsIOProvider should be changed to forward to the implementation in modIOProvider. The class should no longer have any instance data.
(Optional) If the class requires the use of the constructor and destructor (Initialize/Terminate), forward those to modIOProvider as well. Add an instance counter within modIOProvider to track the number of instances. Run your initialization code when the instance counter goes from 0 to 1, and your termination code when it goes from 1 to 0.
The advantage of this is that you do not have to change the code in the scores of places that are utilizing the clsIOProvider class. They are happily unaware that the object is now effectively a singleton.
If coding a project from scratch I would do this a bit differently, but as a refactoring approach, what I've outlined should work well.
It's easy enough to only create and use one instance of an object. How you do it depends on your code, what it does, and where it's called from.
In one process, you can just have a single variable in a global module with an instance, maybe with a factory function that creates it on first use.
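For comparison only, since the other sketches on this page are in Go: the "single variable plus a factory that creates it on first use" idea looks like the snippet below, and the VB6 equivalent is a module-level variable plus a Get function that creates the class instance the first time it is asked for.

package ioprovider

import "sync"

// IOProvider stands in for the legacy clsIOProvider class.
type IOProvider struct{ /* shared IO state, logging counters, etc. */ }

var (
	instance *IOProvider
	once     sync.Once
)

// Get returns the one shared instance, creating it on first use.
func Get() *IOProvider {
	once.Do(func() {
		instance = &IOProvider{}
	})
	return instance
}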
If it's shared by multiple processes, it complicates things, but it can be done with an ActiveX EXE and the Running Object Table.
