Significance of RECORD_ACTION and RECORD_IS_ROOT in TIBCO MDM - tibco

How reliable is RECORD_ACTION and RECORD_IS_ROOT in rulebase constraints? In our application, we have a couple of business requirements which are driven by these values. But recently it was noticed that there are occurrences where these variables become null during a workflow execution which leads to nullifying the entire rulebase constraint. Do we have any alternatives for these?

RECORD_ACTION and RECORD_IS_ROOT are system variable in TIBCO MDM.
User does not have control on these variables.
So, will suggest to use user define attribute for any Rulebase operations.

Related

How to validate from aggregate

I am trying to understand validation from the aggregate entity on the command side of the CQRS pattern when using eventsourcing.
Basically I would like to know what the best practice is in handling validation for:
1. Uniqueness of say a code.
2. The correctness/validation of an eternal aggregate's id.
My initial thoughts:
I've thought about the constructor passing in a service but this seems wrong as the "Create" of the entity should be the values to assign.
I've thought about validation outside the aggregate, but this seem to put logic somewhere that I assume should be responsibility of the aggregate itself.
Can anyone give me some guidance here?
Uniqueness of say a code.
Ensuring uniqueness is a specific example of set validation. The problem with set validation is that, in effect, you perform the check by locking the entire set. If the entire set is included within a single "aggregate", that's easily done. But if the set spans aggregates, then it is kind of a mess.
A common solution for uniqueness is to manage it at the database level; RDBMS are really good at set operations, and are effectively serialized. Unfortunately, that locks you into a database solution with good set support -- you can't easily switch to a document database, or an event store.
Another approach that is sometimes appropriate is to have the single aggregate check for uniqueness against a cached copy of the available codes. That gives you more freedom to choose your storage solution, but it also opens up the possibility that a data race will introduce the duplication you are trying to avoid.
In some cases, you can encode the code uniqueness into the identifier for the aggregate. In effect, every identifier becomes a set of one.
Keep in mind Greg Young's question
What is the business impact of having a failure?
Knowing how expensive a failure is tells you a lot about how much you are permitted to spend to solve the problem.
The correctness/validation of an eternal aggregate's id.
This normally comes in two parts. The easier one is to validate the data against some agreed upon schema. If our agreement is that the identifier is going to be a URI, then I can validate that the data I receive does satisfy that constraint. Similarly, if the identifier is supposed to be a string representation of a UUID, I can test that the data I receive matches the validation rules described in RFC 4122.
But if you need to check that the identifier is in use somewhere else? Then you are going to have to ask.... The main question in this case is whether you need the answer to that right away, or if you can manage to check that asynchronously (for instance, by modeling "unverified identifiers" and "verified identifiers" separately).
And of course you once again get to reconcile all of the races inherent in distributed computing.
There is no magic.

Is there a process for munging data from many different formats in RapidMiner?

I'm trying to help my team streamline a data ingestion process that is taking up a substantial amount of time. We receive data in multiple formats and with attributes arranged differently. Is there a way using RapidMiner to create a process that:
Processes files on a schedule that are dropped into a folder (this
one I think I know but I'd love tips on this as scheduled processes
are new to me)
Automatically identifies input filetype and routes to the correct operator ("Read CSV" for example)
Recognizes a relatively small number of attributes and arranges them accordingly. In some cases, attributes are named the same way as our ingestion format and in others they are not (phone vs phone # vs Phone for example)
The attributes we process mostly consist of name, id, phone, email, address. Also, in some cases names are split first/last and in some they are full name.
I recognize that munging files for such simple attributes shouldn't be that hard but the number of files we receive and lack of order makes it very difficult to streamline a process without a bit of automation. I'm also going to move to a standardized receiving format but for a number of reasons that's on the horizon and not an immediate solution.
I appreciate any tips or guidance you can share.
Your question is relative broad, so unfortunately I can't give you complete answer. But here are some ideas on how I would tackle the points you mentioned:
For a full process scheduling RapidMiner Server is what you are
looking for. In that case you can either define a schedule (e.g.,
check regularly for new files) or even define a web service to
trigger the process.
For selecting the correct operator depending on file type, you could
use a combination of "Loop Files" and macro extraction to get the
correct type and the use either "Branch" or "Select Subprocess" for
switching to different input routes.
The "Select Attributes" operator has some very powerful options to
select specific subsets only. In your example I would go for a
regular expression akin to [pP]hone.* to get the different spelling
variants. Also very helpful in that case would be the "Reorder
Attributes" operator and "Rename by Replacing" to create a common
naming schema.
A general tip when building more complex process pipelines is to organize your different tasks in sub-processes and use the "Execute Process" operator. This makes everything much more readable and maintainable. Also a good error handling strategy is important to handle unforeseen data formats.
For more elaborate answers and tips from many adavanced RapidMiner users, I also highly recommend the RapidMiner community.
I hope this gives a good starting point for your project.

Does this "overly general" type of programming have a name?

Anyone who has experience with the Salesforce platform will know it can essentially be used as a backend for a lot of web applications. They let the end user define custom objects and the fields on those objects. So for instance, rather than having some entity as a strongly-typed class in the code, they have a generic "custom object", whose behaviour and data is defined by the fields you choose and the triggers and rules you apply to it. So they don't have to update the code, recompile and redeploy every time a user adds one (which, given they are a web service would be both impractical and cause serious downtime, a lot).
I was thinking how this could be implemented, and I think Salesforce may do it in a very complex way but I'm specifically thinking how I can implement this. So far I've come up with this:
An "object defintion", which contains all the metadata for a specific record type. Equivalent to a hardcoded class definition.
A generic "record", probably with some sort of dictionary/map tying values to field identifiers that exist in the object definition.
When operating on user data, both the record and the object defintion need to be in memory so that the integrity of the data can be checked. Behaviour normally provided by methods can be applied using some kind of trigger system (again, I'm using a Salesforce example here because it's the best example I know of) with defined actions/events.
This whole system seems very clunky, slow (without serious optimisation), and like it would be prone to problems which wouldn't plague 99% of software projects, so I'd like to learn more about it, but I have no idea where to start looking.
Is the idea I've laid out above already an existing paradigm and if so what is it called?
You have encountered the custom-fields. The design is to enable tenant specific fields against a fixed entity. Since multi-tenancy at the highest level demand That a single codebase / database be used for all tenants with the options to full Customization. This design is the best approach. The below link points to a patent That was granted for managing the custom-fields per tenant.
https://www.google.com/patents/US7779039

design an extendable database model

Currently I'm doing a project whose specifications are unclear - well who doesn't. I wonder what's the best development strategy to design a DB, that's going to be extended sooner or later with additional tables and relations. I want to include "changeability".
My main concern is that I want to apply design patterns (it's a university project) and I want to separate the constant factors from those, that change by choosing appropriate design patterns - in my case MVC and a set of sub-patterns at model level.
When it comes to the DB however, I may have to resdesign my model in my MVC approach, because my domain model at a later stage my require a different set of classes representing the DB tables. I use Hibernate as an abstraction layer between DB and application.
Would you start with a very minimal DB, just a few tables and relations? And what if I want an efficient DB, too? I wonder what strategies are applied in the real world. Stakeholder analysis for example isn't a sufficient planing solution when it comes to changing requirements. I think - at a DB level - my design pattern ends. So there's breach whose impact I'd like to minimize with a smart strategy.
In unclear situations I prefer a minimalistic DB design, supporting the needs known right now. My experience is that any effort to be clever, to model for future needs makes the model more complex. When the new needs arise, they are often in unforseen areas. The extra modeling for future needs doesn't fit the new needs, but rather makes the needed refactoring even harder.
As you already have chosen Hibernate to be able to decouple the DB design and the OO model, I think that sticking with an as simple DB as possible is a good choice.
What you describe is typical for almost every project. There are a few things you can do however.
Try to isolate the concepts (not their realizations) of your problem domain. Remember: Extending a data model is almost always easy (add a new table, a new column etc.) but changing your data model is hard and requires data migration.
I advocate using an Agile development process: Implement only what you need right now, but make sure you understand the complete problem before modeling it.
Another thing you should check before starting to hack away your code is wether your chosen infrastructure is appropriate. Using a relational database when you want to change your schema's very often is usually a bad match. Document databases are schema-less and hence more flexible. I think you should evaluate wether using a relational database is really appropriate for you application.
"Currently I'm doing a project whose specifications are unclear"
Given the 'database' tag, I assume you are asking this question in a database context.
Remember that a database is a set of ASSERTIONS OF FACT (capitalization intended).
If it is unclear what kind of "assertions of fact" your user wants to be registered by the database, then you simply cannot define (the structure of) your database.
And you will be helping both yourself and your user by first trying to clear out everything that is unclear.
In a simple answer: BE MINIMALISTIC.
Try to figure out the main entities. Don´t worry about the properties, you will fill them later. Then, create the relations between the entities. Create a test application using wour favorite ORM (Hibernate?), build some unit tests, and voilà, you have your minimal DB operational. :)
No project begins with requirements entirely known and fixed for all time. Use an agile, iterative approach to the database design so you that you can accommodate change during development.
All database designs are extensible and subject to change during their lifetime. Don't try to avoid change. Just make sure you have the right people and processes in place to manage change effectively when it happens.

What is a use-case? How to identify a use-case? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
The question is quite generic. What are the points that should be kept in mind to identify a valid use case? How to approach a use case?
A use case identifies, with specificity, a task or goal that users will be able to accomplish using a program. It should be written in terms that users can understand.
Wikipedia's description is overly formal. I'll dig through my other texts shortly.
In contrast, the original wiki's article is much more accessible.
An early article by Alastair Cockburn, cited positively by The Pragmatic Programmer, contains a good template.
This question, from just a few days ago, is very closely related, but slightly more specific.
The definition of use case is simple:
An actor's interactions with a system to create something of business value.
More formally:
a sequence of transactions performed
by a system that yield a measurable
set of values for a particular actor.
They're intended to be very simple: Actor, Interaction, Value. You can add some details, but not too many.
Using use cases is easy. Read this: http://www.gatherspace.com/static/use_case_example.html
The biggest mistake is overlooking the interaction between actor and system. A use case is not a place to write down long, detailed, technical algorithm designs. A use case is where an actor does something.
People interact with systems so they can take actions (place orders, approve billing, reject an insurance claim, etc.) To take an action, they first make a decision. To make a decision, they need information
Information
Decision
Action
These are the ingredients in the "Interaction" portion of a use case.
A valid use case could describe:
intended audience / user
pre-requisites (ie must have logged in, etc)
expected outcome(s)
possible points of failure
workflow of user
From Guideline: Identify and Outline Actors and Use Cases by the Eclipse people:
Identifying actors
Find the external entities with which
the system under development must
interact. Candidates include groups of
users who will require help from the
system to perform their tasks and run
the system's primary or secondary
functions, as well as external
hardware, software, and other systems.
Define each candidate actor by naming
it and writing a brief description.
Includes the actor's area of
responsibility and the goals that the
actor will attempt to accomplish when
using the system. Eliminate actor
candidates who do not have any goals.
These questions are useful in
identifying actors:
Who will supply, use, or remove
information from the system?
Who will
use the system?
Who is interested in a
certain feature or service provided by
the system?
Who will support and
maintain the system?
What are the
system's external resources?
What
other systems will need to interact
with the system under development?
Review the list of stakeholders that
you captured in the Vision statement.
Not all stakeholders will be actors
(meaning, they will not all interact
directly with the system under
development), but this list of
stakeholders is useful for identifying
candidates for actors.
Identifying use cases
The best way to find use cases is to
consider what each actor requires of
the system. For each actor, human or
not, ask:
What are the goals that the actor will
attempt to accomplish with the system?
What are the primary tasks that the
actor wants the system to perform?
Will the actor create, store, change,
remove, or read data in the system?
Will the actor need to inform the
system about sudden external changes?
Does the actor need to be informed
about certain occurrences, such as
unavailability of a network resource,
in the system?
Will the actor perform
a system startup or shutdown?
Understanding how the target
organization works and how this
information system might be
incorporated into existing operations
gives an idea of system's
surroundings. That information can
reveal other use case candidates.
Give a unique name and brief
description that clearly describes the
goals for each use case. If the
candidate use case does not have
goals, ask yourself why it exists, and
then either identify a goal or
eliminate the use case.
Outlining Use Cases
Without going into details, write a
first draft of the flow of events of
the use cases identified as being of
high priority. Initially, write a
simple step-by-step description of the
basic flow of the use case. The
step-by-step description is a simple
ordered list of interactions between
the actor and the system. For example,
the description of the basic flow of
the Withdraw Cash use case of an
automated teller machine (ATM) would
be something like this:
The customer inserts a bank card.
The system validates the card and prompts
the person to enter a personal
identification number (PIN).
The customer enters a PIN.
The system
validates the PIN and prompts the
customer to select an action.
The customer selects Withdraw Cash.
The system prompts the customer to choose
which account.
The customer selects
the checking account.
The system
prompts for an amount.
The customer
enters the amount to withdraw.
The
system validates the amount (assuming
sufficient funds), and then issues
cash and receipt.
The customer takes the cash and receipt, and then
retrieves the bank card.
The use case ends.
As you create this step-by-step
description of the basic flow of
events, you can discover alternative
and exceptional flows. For example,
what happens if the customer enters an
invalid PIN? Record the name and a
brief description of each alternate
flow that you identified.
Representing relationships between
actors and use cases
The relationship between actors and
use cases can be captured, or
documented. There are several ways to
do this. If you are using a use-case
model on the project, you can create
use-case diagrams to show how actors
and use cases relate to each other.
Refer to Guideline: Use-Case
Model for more information.
If you are not using a use-case model
for the project, make sure that each
use case identifies the associated
primary and secondary actors.

Resources