Implementing row-level identification in all existing mappings (automation) - ETL

I am trying to implement row-level identification in the target in Informatica.
The mappings are already present and the links are already made, but I want this change implemented in more than 1,000 mappings. How can I do this without affecting the current mappings? It becomes very tedious to add a column to the Expression transformation in 1,000 mappings manually. Is there a better way to do this? If my question is not clear, please ask.

This is a very complex task. There is no out-of-the-box solution to accomplish it, I'm afraid. What you could do is export all the mappings to XML and work on the exported definitions to add the required feature, then import the definitions back. This, however, requires some analysis to understand the way the mappings are built as well as the details of your requirement.
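As a rough sketch of that export/modify/import route (not an out-of-the-box feature), a script could walk the exported mapping XML files and append the new port to each Expression transformation before you import the definitions back. The element and attribute names below (TRANSFORMATION, TRANSFORMFIELD, TYPE="Expression") and the port settings are assumptions used to illustrate the idea and must be checked against your own repository export:

# Sketch only: add a hypothetical row-level identifier port to every
# Expression transformation in a folder of exported Informatica mapping XML.
# Verify element/attribute names against your actual export before using.
import glob
import xml.etree.ElementTree as ET

NEW_PORT_NAME = "ROW_ID"  # hypothetical name for the new output port

for path in glob.glob("exported_mappings/*.xml"):
    tree = ET.parse(path)
    changed = False
    for trans in tree.getroot().iter("TRANSFORMATION"):
        if trans.get("TYPE") != "Expression":
            continue
        # Skip transformations that already carry the port.
        if any(f.get("NAME") == NEW_PORT_NAME for f in trans.findall("TRANSFORMFIELD")):
            continue
        field = ET.SubElement(trans, "TRANSFORMFIELD")
        field.set("NAME", NEW_PORT_NAME)
        field.set("PORTTYPE", "OUTPUT")
        field.set("DATATYPE", "string")
        field.set("PRECISION", "50")
        field.set("EXPRESSION", "...")  # your row-level identification logic
        changed = True
    if changed:
        tree.write(path, encoding="utf-8", xml_declaration=True)

Linking the new port through to the target still has to be reflected in the connector definitions, which is the part that needs the most analysis of how your mappings are built.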

Related

How do I access h2o xgb model input features after saving a model to disk and reloading it?

I'm using h2o's xgboost implementation in Python. I've saved a model to disk and I'm trying to load it later for analysis and prediction. I'm trying to access the list of input features, or, even better, the feature list actually used by the model, which excludes the features it decided not to use. The usual advice is to use the varimp function to get the variable importance, and while this does drop features that aren't used in the model, it actually gives you the importance of the intermediate features created by one-hot encoding (OHE) the categorical features, not the original categorical feature names.
I've searched for how to do this and so far I've found the following but no concrete way to do this:
Someone asking something very similar to this and being told the feature has been requested in Jira
Said Jira ticket, which has been marked resolved, but which I believe says this was implemented but not made customer-visible.
A similar ticket requesting this feature (original categorical feature importance) for variable importance heatmaps but it is still open.
Someone else who found an unofficial way to access the columns with model._model_json['output']['names'], but that doesn't give the features that weren't used by the model, and they were told to use a different method that doesn't work if you have saved the model to disk and reloaded it (which I am doing).
The only option I see is to take the varimp features, split on the period character to break up the OHE feature names, take the first part of each split, and then run a set over everything to get the unique column names. But I'm hoping there's a better way to do this.
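For what it's worth, a sketch of that workaround in Python could look like the following; it assumes the OHE names follow the "original_column.level" pattern, which you should verify against your own model's varimp output:

# Sketch of the workaround above: recover original categorical column names
# from the one-hot-encoded names returned by varimp().
import h2o

h2o.init()
model = h2o.load_model("/path/to/saved_model")  # hypothetical path

# varimp(use_pandas=False) returns a list of
# (variable, relative_importance, scaled_importance, percentage) tuples.
ohe_names = [row[0] for row in model.varimp(use_pandas=False)]

# Split on the first period and deduplicate to get the original columns.
original_columns = sorted({name.split(".", 1)[0] for name in ohe_names})
print(original_columns)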

Generate lang files dynamically

I have a situation where I have to keep an array in sync with the language files, so every time I have to generate and translate it.
I was looking for a package like laravel-langman, which has an option to sync. But now that I am looking at it, it doesn't allow me to create a key with a value directly via an Artisan command, without asking for input.
Any Help will be appreciated.
You should maybe check out this page; it mentions multiple packages that solve your problem. We currently use a combination of two packages. I think the first one has what you want.
We use two packages to solve this issue. One is for the basic translations that don't get added dynamically; for this we used waavi/translation.
You still need it to work for dynamically created or removed translations, which you need if you want your models to contain multi-language descriptions or something similar. For this we used dimsav/laravel-translatable.
With both of those you are all set, but you can also see if you prefer another package over the ones I listed.

Application To Help Translate XSL Transformations

There must be some application to do the following, but I am not even sure how to Google for it.
The dilemma is that we have to backtrace defects, and doing so requires seeing how certain fields in the output XML were generated by the XSL. The hard part is spending hours in the XSL and XML trying to figure out where a field was even generated. Even debugging is difficult when you are working with multiple XSL transformations and edits, as you still need to find the primary keys that lead to the specific scenario for that transform.
Is there some software program that could take an XSL and perhaps do one of two things:
Feed it an output field name and have it generate a list of all the possible criteria that would generate this field, so you can figure out which one of a dozen in the XSL meets your criteria, or
Somehow convert the XSL into some more readable if/then type format (kind of like how you can use Javadoc to produce readable documentation)
You don't say what tools you are currently using. Tools like oXygen and Stylus Studio have some quite sophisticated XSLT debugging capabilities. oXygen's output mapping tool (see http://www.oxygenxml.com/xml_editor/working_with_xslt_debugger.html#xsltOutputMapping) sounds very much like the thing you are asking for.
Using schema-aware stylesheets can greatly ease debugging. At least in the Saxon implementation, if you declare in your stylesheet that you want the output to be valid against a particular schema, then if it isn't, Saxon will tell you what instruction in the stylesheet caused invalid output to be generated. Sometimes it will show you the error at stylesheet compile time, before you even supply a source document. This capability is greatly under-used, in my view. More details here: http://www.stylusstudio.com/schema_aware.html
It's an interesting question. Your suggestions are also interesting but would be quite challenging to develop; I know of no COTS or FOSS solution to either, but here are some thoughts:
Your first possibility is essentially data-flow analysis from compiler design. I know of no tools that expose this to the user, but you might ask XSLT processor developers if they have ever considered externalizing such an analysis in a manner that would be useful to XSLT developers.
Your second possibility is essentially a documentation generator against XSLT source. I have actually helped to complete one for a client in financial services in the past (see Document XSLT Automatically), but the solution was the property of the client and was never released publicly as far as I know. It would be possible to recreate such a meta-transformation between XSLT input and HTML or Docbook output, but it's not simple to do in the most general case.
There's another approach that you might consider:
Tighten up your interface definition. In your comment, you mention uncertainty as to whether a problem's source is bad data from the sender or a bug in the XSLT. You would be well-served by a stricter interface definition. You could implement this via better typing in XSD, addition of xsd:assertion statements if XSD 1.1 is an option, or adding a Schematron-based interface checking level, which would allow you the full power of XPath-based assertions over the input. Having such an improved and more specific interface definition would help both you and your clients know what should and should not be sent into your systems.
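Purely as an illustration of that last point (outside the XSLT toolchain, and in Python only for brevity), a pre-validation step along these lines would reject bad sender data with a precise message before it ever reaches the transform, so data problems stop masquerading as stylesheet bugs. The file names are placeholders:

# Sketch: validate incoming documents against the interface schema before
# the XSLT step runs. File names are placeholders.
from lxml import etree

schema = etree.XMLSchema(etree.parse("interface.xsd"))

def check_input(path):
    doc = etree.parse(path)
    if not schema.validate(doc):
        # Each entry pinpoints the offending element and line in the input.
        for error in schema.error_log:
            print(f"{path}:{error.line}: {error.message}")
        raise ValueError(f"{path} does not satisfy the interface definition")

check_input("incoming_message.xml")

A Schematron layer would play the same role, but with XPath-based rules instead of (or on top of) the XSD types.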

CSV import with user correction

I'm looking for general UI advice on importing a CSV file. The UI is done in ASP.NET MVC3.
When the user uploads the file I need to validate it and allow them to manually correct any errors in the browser before I store it in the database. There are so many potential errors to check for, and I'm really not sure what the best way is to achieve this. Another constraint is that I only have a few days to implement it, so it can't be too complicated. I'm fine with regular expressions and programming, and I already have the posted file stream available, but I just can't think of a good and practical way to present this functionality to the user.
Hope someone can inspire me. Many thanks.
There are some suggestions here:
Reading a CSV file in .NET?
Of these, we chose to use Linq2CSV in our MVC projects.
http://www.codeproject.com/KB/linq/LINQtoCSV.aspx
It is fairly easy to use, and validation is nice. You define a simple class that lays out the structure (columns) of the csv file. It will do basic validation, and if that passed, we sent it through a Validator that used DataAnnotation attributes to validate against more complex rules. We found it reliable, and we were able to add some features to it that we wanted.
If the file was pathologically bad, we'd fail the whole thing and present a single error message. If the file was reasonably sound, we would display the rows in error along with the error messages for each row, so they could see the problem in context. In our case, this was a display grid only - we did not allow editing through the website - because the CSVs were being generated out of their data system, and we needed them to edit the source data in their system and regenerate the CSV. To do in-place editing, you would need to stage all the column values as strings so they can fix numbers that don't parse, etc.
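As a language-neutral illustration of that flow (basic structural check first, then per-row validation that collects errors for display in context), a sketch in Python might look like this; the column layout and rules are hypothetical:

# Sketch of the validation flow described above.
import csv
import io

EXPECTED_COLUMNS = ["name", "email", "amount"]  # hypothetical layout

def validate_csv(file_bytes):
    text = file_bytes.decode("utf-8", errors="replace")
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        # Pathologically bad file: fail the whole upload with one message.
        return None, ["File does not have the expected columns: " + ", ".join(EXPECTED_COLUMNS)]
    rows, errors = [], []
    for line_no, row in enumerate(reader, start=2):  # header is line 1
        row_errors = []
        if "@" not in (row["email"] or ""):
            row_errors.append("email looks invalid")
        try:
            float(row["amount"])
        except (TypeError, ValueError):
            row_errors.append("amount is not a number")
        if row_errors:
            # Keep raw string values so the user can correct them in the grid.
            errors.append((line_no, row, row_errors))
        rows.append(row)
    return rows, errors

The return value keeps every row plus a list of (line number, row, messages), which maps directly onto the kind of error grid described above.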

FITS Export with custom Metadata

Does anybody have experience with exporting data as a FITS file with custom Metadata (FITS header) information? So far I have only been able to generate FITS files with the standard Mathematica FITS header template. The documentation gives no hint on whether custom Metadata export is supported or how it might be done.
The following suggestions from comp.soft-sys.math.mathematica do not work:
header = Import[<some FITS file>, "Metadata"];
Export["test.fits", data, "Metadata" -> header]
or
Export["test.fits", {"Data" -> data, "Metadata" -> header}]
What is the proper way to export my own Metadata to a FITS file ?
Cheers,
Markus
Update: response from Wolfram Support:
"Mathematica does not yet support Export of metadata for FITS file. The
example are referring to importing of this data. We do plan to support
this in the future..."
"There are also plans to include binary tables into FITS import
functionality."
I will try to come up with some workaround.
According to the documentation for v.7 and v.8, there are a couple of ways of accomplishing what you want, and you almost have the rule form correct:
Export["test.fits", {"Data" -> data, "Metadata" -> header}, "Rules"]
The other ways are
Export["test.fits", header, "Metadata"]
Export["test.fits", {data, header}, {{"Data", "Metadata"}}]
Note the double brackets around the element labels in the second method.
Edit: After some testing, prompted by @belisarius, whenever I include the "Metadata" element I get an error stating that it is not a valid export element. Also, you can't export a "RawData" element either. So I'd submit a bug, for two reasons. First, the metadata isn't user-settable, which is vitally important for any serious application; at a minimum, the user should be able to augment the default Mathematica metadata. Second, the documentation is woefully inadequate in describing what is a "valid" export element vs. import element. Of course, I'd describe all of the documentation for v.6 and beyond as woefully inadequate, so this is par for the course.
Mathematica 9 now allows export of metadata (header) entries, which are additive to the standard required entries. In the Help browser, search "FITS" and there is an example that shows this (with Export followed by Import to verify).
