Is there a standard for defining mappings/transformations? - etl

Are there any standards for defining data transformations in a tool and format agnostic manner?
There are some obvious candidates like XPath transformations, but they're specific to XML. There are hundreds of ETL tools on the market, but they use proprietary syntaxes and often rely on low-code/no-code WYSIWYG formats.
Have there been any attempts to define an agnostic data transformation standard/format?

Related

How to customize a serialization

I'm a newbie with GraphQL and SPQR. I would like to serialize my dates in a custom format. How can I do it?
The best answer I'd offer is: don't! SPQR serializes all temporal scalars as ISO 8601 strings in the UTC zone for a reason: it is the most portable format, one that any client can easily parse and understand, and any conversion and display logic is better left to the client itself.
If this is for some reason impossible (e.g. backwards compatibility with a legacy client), your best bet is to provide your own scalar implementations. In the future there might be a feature to avoid this, but currently you have to implement your own scalars and a TypeMapper that will map the desired Java types to those scalars. See the existing ScalarMapper for inspiration. Once you have the mapper, register it via generator.withTypeMappers.
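To illustrate the "leave it to the client" advice: a Java client can parse the ISO 8601 UTC string exactly as SPQR sends it and render it in whatever local format it likes. This is only a sketch; the sample timestamp, display pattern, and time zone below are made-up examples.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class ClientSideDates {
    // Parse the ISO 8601 UTC string as sent over the wire, then
    // reformat it for display; the pattern and zone are just examples.
    public static String display(String iso8601) {
        Instant instant = Instant.parse(iso8601);
        DateTimeFormatter fmt = DateTimeFormatter
                .ofPattern("dd.MM.yyyy HH:mm")
                .withZone(ZoneId.of("Europe/Berlin"));
        return fmt.format(instant);
    }

    public static void main(String[] args) {
        // June is CEST (UTC+2), so 12:30Z displays as 14:30 local time.
        System.out.println(display("2021-06-15T12:30:00Z")); // 15.06.2021 14:30
    }
}
```

This keeps the wire format uniform for every client while each front end stays free to localize.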

Semantic patches for POSIX shell script

Is there a tool for refactoring using semantic patches for shell scripts, just like Coccinelle for C?
An example modification would be to switch from
command > file
syntax to "sticky" one
command >file
Is there a tool for refactoring shell scripts? I doubt it.
However, you could build one using a general program transformation system (PTS).
These are tools that accept language descriptions (you'd need a grammar for POSIX shell scripts), will parse said language to build ASTs, and then allow you to apply transformations to those ASTs, finally pretty-printing the ASTs back to valid source text.
Good PTSes let you express code changes using source code patterns (Coccinelle is not a general-purpose PTS since it only works for C, but it falls into this category of source-pattern-driven tools) rather than writing procedural code to modify the trees.
A problem with most of them is that they do not go beyond matching on (context-free) ASTs, while real constraints require the tool to understand "context" (e.g., how information from far away in the source text affects the meaning of a particular point in the text). Coccinelle also does this, which is why it is an interesting tool; this kind of capability is necessary to transform traditional programming languages.
Our DMS Software Reengineering Toolkit is a general PTS that provides support for context analysis (symbol tables, control and data flow analysis, ...). I think (Unix) shell scripting languages like POSIX shell all have various macro-like capabilities that make processing them much harder than macro-free languages such as Java. DMS provides built-in support for capturing and handling preprocessor conditionals and macros; it presently uses these to handle C and C++.
But a POSIX shell transformer is not available out of the box. You would have to define the grammar and the various context analyses to DMS. At that point you can start to write context-dependent transformations using source patterns. This work is doable with DMS, but it isn't a weekend exercise. So the real question is:
how much automated patching do you intend to do? Is it enough to justify configuring a PTS?
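For contrast, here is what a purely textual (non-semantic) version of the `command > file` → `command >file` rewrite looks like, sketched in Java. The comment marks where it goes wrong, which is precisely the context-sensitivity the answer describes.

```java
public class NaiveShellPatch {
    // Collapse whitespace after the ">" redirection operator.
    // This is textual, not semantic: it will happily rewrite "> "
    // inside quoted strings or heredocs too, which is exactly why
    // an AST-based PTS with context analysis is needed.
    public static String patch(String line) {
        return line.replaceAll(">\\s+", ">");
    }

    public static void main(String[] args) {
        System.out.println(patch("grep foo log > out.txt")); // grep foo log >out.txt
        System.out.println(patch("echo \"a > b\""));         // wrongly becomes echo "a >b"
    }
}
```

The second call shows the failure mode: the regex cannot tell a redirection from a ">" inside a quoted string, whereas a parser operating on the shell grammar can.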

XML parser interface to different languages

I'm working on writing a parser for a specific XML-based document, which has a lot of rules and a complicated interface.
I was going to write the parser in Ruby to parse it to JSON. Then I realized that a lot of other people who use different languages would like to use it. So I'm thinking of somehow creating a central rule system, where each language can wrap it and create its own parser.
Any idea how to go about it?
It's unlikely to be productive for you to write your own XML parser from scratch.
As you anticipated, there has indeed been a need for parsing XML in every major language. You can likely find libraries that implement multiple parsing models in any language you need. Be aware of tree-based models such as DOM, stream-based models such as SAX, and pull-based models such as StAX. Also consider XML processing models above the parsing level: declarative transformations (e.g. XSLT) and data binding (e.g. JAXB).
The "central rule system" you envision has also already been realized in schemas (e.g., XSD, RELAX NG, Schematron, ...).
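As a concrete illustration of the pull-based model mentioned above, here is a minimal StAX sketch in Java using the standard `javax.xml.stream` API; the sample document is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxDemo {
    // Pull events from the stream one at a time, collecting element names;
    // unlike DOM, the whole document is never held in memory at once.
    public static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
        } catch (XMLStreamException e) {
            throw new RuntimeException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<doc><a/><b/></doc>")); // [doc, a, b]
    }
}
```

With a pull model the application drives the parse loop, which makes it easy to stop early or skip subtrees; SAX inverts this by pushing events at callbacks you register.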

non-declarative markup language

Wikipedia states:
Many markup languages such as HTML, MXML, XAML, XSLT or other user-interface markup languages are often declarative. HTML, for example, only describes what should appear on a webpage - it does not specify the possible interactions with that webpage.
This implies (by the use of the word "often") that there are markup languages which are non-declarative. I suspect this is not the case. Are there any non-declarative markup languages?
CFML (ColdFusion Markup Language) and o:XML (object-oriented XML) are two non-declarative markup languages:
CFML tags are essentially much more powerful versions of Java Tag Libraries, and with CFML's ECMAScript-like syntax you'll feel right at home.
o:XML is a complete object oriented programming language, with features including polymorphism, function overloading, exception handling, threads and more. The syntax is fully compliant XML. With o:XML, object-oriented paradigms can be leveraged to the maximum, while data and code remains in a standard format. With o:XML there is no 'impedance mismatch' when developing XML web-applications, tools and systems.
References
o:XML - object-oriented XML
Schema for Object-Oriented XML
Introducing o:XML
Why CFML? - Open CFML Foundation
Object Oriented Programming And ColdFusion - What's The Point?

Unifying enums across multiple languages

I have one large project with components in multiple languages that each depend on some of the same enum values. What solutions have you come up with to unify enums across multiple arbitrary languages? I can think of a few, but I'm looking for the best solution.
(In my implementation, I'm using PHP, Java, JavaScript, and SQL.)
You can put all of the enums in a text file, then use a code generator to write out the appropriate syntax for each language from that common file so that each component has the enums. Make that text file the authoritative source of information.
You can express the text file in XML but I'd think a tab-delimited flat file would work just fine.
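A minimal sketch of such a generator in Java, where the enum name and values are placeholders: the same map could be read from the authoritative text file, then fed to one emitter per target language (sibling emitters would produce the PHP, JavaScript, and SQL versions).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EnumGenerator {
    // Emit Java enum source text from the shared name/value definition.
    // A LinkedHashMap preserves declaration order across all targets.
    public static String emitJava(String name, Map<String, Integer> values) {
        List<String> constants = new ArrayList<>();
        for (Map.Entry<String, Integer> e : values.entrySet()) {
            constants.add("    " + e.getKey() + "(" + e.getValue() + ")");
        }
        return "public enum " + name + " {\n"
                + String.join(",\n", constants) + ";\n\n"
                + "    public final int value;\n"
                + "    " + name + "(int value) { this.value = value; }\n"
                + "}\n";
    }

    public static void main(String[] args) {
        Map<String, Integer> status = new LinkedHashMap<>();
        status.put("ACTIVE", 1);
        status.put("SUSPENDED", 2);
        System.out.print(emitJava("Status", status));
    }
}
```

Run as part of the build so every component regenerates its enums from the one authoritative file, and a drifted copy can never ship.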
Make the file a format that every language can understand or has a library for. I am using JSON for this at the moment.
Then you can include it in two ways:
For development: load it from a file/URL at runtime - good for small changes you want to see immediately, but slow.
For production use: include it in the files using a build script - fast, but no instant feedback.
I would apply the DRY principle and use a code generator; that way you can add a new language easily, even if it has no native enum support.
