Some say scripted pipelines are more powerful than declarative; others say declarative is the recommended one. For a complex pipeline, which type should I choose?
Declarative is the future.
Declarative pipeline is a newer feature that supports the pipeline-as-code concept. It improves the readability of pipeline code. The code is written in a Jenkinsfile, which can be checked into a source control management system such as GitHub or Bitbucket.
Scripted pipeline is the traditional way of writing the code. It uses plain Groovy-based syntax, since it was the first pipeline type built on the Groovy foundation. It has one main benefit: scripted pipeline allows greater control over the pipeline. One can manipulate the flow of the script extensively in this approach.
Currently, declarative pipeline has a stricter, pre-defined structure; future releases are expected to incorporate more of the Groovy syntax. If your CI pipeline is simple and short, go for declarative. If your requirements are complex and big, go for scripted.
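To make the difference concrete, here is a minimal sketch of the same two-stage job in both styles (the stage names and shell commands are placeholders):

// Declarative: strict, pre-defined block structure
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh './build.sh'
            }
        }
        stage('Test') {
            steps {
                sh './test.sh'
            }
        }
    }
}

// Scripted: plain Groovy, so arbitrary logic can steer the flow
node {
    stage('Build') {
        sh './build.sh'
    }
    stage('Test') {
        // any Groovy conditional or loop is allowed here
        if (env.BRANCH_NAME == 'main') {
            sh './test.sh'
        }
    }
}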
While building operators using the Operator SDK Go framework, we end up creating Kubernetes resources such as Deployments, Services, etc. programmatically, by leveraging structs from the k8s modules/packages. Compared to creating these manifests in YAML/JSON formats, this is quite cumbersome and requires a fair bit of coding. And any change to a manifest requires code changes, after which a new version of the operator needs to be rolled out.
I am wondering whether existing templating/overlay tools such as Helm or Kustomize can be used for building these k8s resources within the operator code. This would also enable you to externalise the manifest/template files from the operator code. I couldn't find any good examples of how these tools can be used as modules/libraries within a Go program. Please provide any pointers, suggestions or alternate approaches.
Related question: Kubernetes operator create Deployment using yaml template
This talks about how you can read a YAML file and unmarshal it into a Deployment object. Here, I would still need to code the templating/overlay logic within the operator.
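For reference, the pattern from that related question looks roughly like the sketch below, assuming a manifest file on disk (the path is a placeholder) and the sigs.k8s.io/yaml package:

package main

import (
    "os"

    appsv1 "k8s.io/api/apps/v1"
    "sigs.k8s.io/yaml" // converts YAML to JSON, then delegates to encoding/json
)

// loadDeployment reads a manifest file and unmarshals it into a typed
// Deployment struct that the operator can then patch and create.
func loadDeployment(path string) (*appsv1.Deployment, error) {
    raw, err := os.ReadFile(path)
    if err != nil {
        return nil, err
    }
    var d appsv1.Deployment
    if err := yaml.Unmarshal(raw, &d); err != nil {
        return nil, err
    }
    return &d, nil
}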
You can use the Helm engine programmatically by calling engine.Render:
func Render(chrt *chart.Chart, values chartutil.Values) (map[string]string, error)
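For example, here is a minimal sketch using the Helm v3 Go packages (the chart path, release name, and values are placeholders):

package main

import (
    "fmt"

    "helm.sh/helm/v3/pkg/chart/loader"
    "helm.sh/helm/v3/pkg/chartutil"
    "helm.sh/helm/v3/pkg/engine"
)

func main() {
    // Load the chart (a directory or packaged .tgz) from disk.
    chrt, err := loader.Load("./mychart")
    if err != nil {
        panic(err)
    }

    // Values you would normally pass via values.yaml or --set.
    vals := map[string]interface{}{"replicaCount": 3}

    // Wrap the values with the release metadata the templates expect
    // (.Release.Name, .Capabilities, and so on).
    opts := chartutil.ReleaseOptions{Name: "my-release", Namespace: "default", IsInstall: true}
    renderVals, err := chartutil.ToRenderValues(chrt, vals, opts, chartutil.DefaultCapabilities)
    if err != nil {
        panic(err)
    }

    // Render returns a map of template file name -> rendered manifest text,
    // which you can then unmarshal into typed structs as in the related question.
    manifests, err := engine.Render(chrt, renderVals)
    if err != nil {
        panic(err)
    }
    for name, content := range manifests {
        fmt.Println("---", name)
        fmt.Println(content)
    }
}

This keeps the manifests as external, templated files inside a chart, while the operator only supplies the values at runtime.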
We have a quite big and old SFCC project, which started on Pipelines and should now finally be migrated to Controllers. For this we need to identify which Pipelines are easy candidates for an initial migration.
We are doing it based on the following criteria:
which other Pipelines are calling this pipeline
how many Pipeline.execute calls are in existing controllers to this pipeline
is the Pipeline using a custom hooking mechanism (identified by a certain include pipeline)? I guess this can be subsumed under "which other Pipelines is this Pipeline calling"
Is there already something out there that does something close to this?
So in case somebody else faces this challenge of eventually getting rid of Pipelines in a legacy project, you can also use my Node analyser script:
https://github.com/Andreas-Schoenefeldt/SFCCAnalyser
After installation, start it with npm run.
I see broad adoption of DataWeave, which I feel is more of a transformation library, just like FreeMarker or Velocity.
With DW, a change in transformation logic requires a change in code. The very purpose for which template engines got popular in the first place was to separate logic from code, so that we can change transformation logic without needing to rebuild/repackage our code (less deployment hassle).
Can anyone help me by pointing out a few reasons why one would prefer DW?
TL;DR: If you're looking for a template engine for things like static websites, DataWeave definitely isn't the right choice. Use the right tool for the job. Also, while you can use DataWeave outside of Mule, I don't think I've seen anyone adopt DataWeave who hasn't adopted MuleSoft.
A few things to consider (and most of these I'm stating in the context of developing Mule applications):
These template engines are, typically, for outputting static text. If you're using one to output structured data rather than something like an HTML page, you're probably doing it wrong. They aren't going to return structured data; they are going to return text. If you're at the very end of your flow and you're going to output that back out of the API or to a file, that's fine, I suppose. But if you want to actually work with that output, you're going to have to convert the plain text to an actual object, introducing a lot of extra steps when you could have just used DataWeave in the first place.

DataWeave is especially beneficial when you want to do things like streaming because you're processing large payloads. DataWeave can understand JSON, XML, and CSV (the three most common data types I see) in a streamed format without any additional work, making it very easy to create efficient applications.

The big difference between a template engine and a data transformation language is that one is for outputting text using structured data as input, and the other is for working with structured data on the input and outputting structured data that you can continue to work with. There is a reason that almost all of the template engine docs talk about building websites and not things like integrations.
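As a trivial sketch of that difference (the field names here are hypothetical), a DataWeave script takes structured input and yields structured output you can keep working with, rather than rendered text:

%dw 2.0
output application/json
---
// payload is structured input (JSON, XML, CSV, ...); the result is a
// structured value that later steps can consume, not a string
payload.orders map (order) -> {
    id: order.orderId,
    total: order.price * order.quantity
}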
The DataWeave engine is, as Aled indicated, built into the Mule runtime. Deeply so. You can use DataWeave in any field in any connector by default, even fields that don't have the f(x) button, because it's built into the runtime. This makes DataWeave what you could consider a first-class citizen within Mule, unlike something you would only be able to utilize via connectors or by invoking Java bridges/libraries, which you do via DataWeave or a long series of connector operations.
The benefits you listed are also not things you can't do with DataWeave. You can VERY easily templatize and externalize DataWeave; for example, I have several DataWeave libraries in my Maven repo that I can include as dependencies. I've built several transformation services that use databases with DataWeave to do transformations, allowing me to change those transformations without modifying the app. You can also use dynamic DataWeave, where you use a template system to load specific parts of the script before running it. I've even taken it a step further and written a generic DataWeave script that I can use to do basic mappings without writing DataWeave; this allowed me to wrap a web UI around things pretty easily.
I wouldn't use DataWeave outside of MuleSoft unless you're a MuleSoft shop. If you are a MuleSoft shop, using the CLI to run your scripts, the same way you do with most interpreted languages, works fairly nicely - especially since you likely already have in-house expertise in DataWeave. The language is still niche enough that unless you've already adopted it for use in Mule applications I don't see any advantage in using it.
Docs / basic examples:
https://github.com/mulesoft-labs/data-weave-native
https://docs.mulesoft.com/mule-runtime/4.3/parse-template-reference
https://docs.mulesoft.com/mule-runtime/4.3/dataweave-create-module
https://github.com/mikeacjones/transform-system-api
Because it is the expression and transformation language embedded in the Mule runtime. If you are using Mule, it is also integrated with the Anypoint Studio IDE.
Outside Mule applications I don't think you can use DataWeave easily. You might want to go with the alternatives.
I use DMSDK to ingest data, and I have multiple custom flows to run following data ingestion. Instead of manually running the flows one by one, what is the best way to orchestrate MarkLogic Data Hub flows?
Gradle, triggers, or other scheduling tools?
I concur with Dave Cassel that NiFi, or perhaps something like MuleSoft, or maybe even Camel is a great way to manage running your flows. Particularly if you are talking about operational management.
To comment on the other mechanisms:
Crontab doesn't connect to MarkLogic itself. You'd have to write scripts or code to make something actually happen. You won't get much control or logging either, unless you add that yourself.
We have great plugins for Gradle that make running flows really easy (see the example after this list). Great during development and such, but perhaps less suited for scheduling or operational tasking.
Triggers inside MarkLogic only respond to insertion of data, so you'd still have to initiate an update from outside anyhow.
Scheduled Tasks inside MarkLogic have similar limitations to Crontab and Gradle. They don't do much by themselves, so you have to write code anyhow, and they provide no logging by themselves, nor ways to operationally manage the tasks, other than through the Admin UI.
JAR package: it depends on what JAR package you actually mean. You can create a JAR of your ml-gradle project, but that doesn't give you much gain over calling Gradle itself.
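To give an idea of the Gradle route (assuming the Data Hub Gradle plugin; the flow name is a placeholder), running a flow is a one-liner:

./gradlew hubRunFlow -PflowName=MyFlow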
Personally, I'd have a close look at the operational requirements. Think of, for instance: the need to get a status overview, interrupt schedules, loop to retry on failure, built-in logging, and facilities to send notifications when attention is needed.
HTH!
There are a variety of answers that will work, of course; my preference is NiFi. This keeps any scheduling overhead outside of MarkLogic, with the trade-off that you'll need to have NiFi running.
Between Apache Oozie, Spotify/Luigi and airbnb/airflow, what are the pros and cons for each of them?
I have used Oozie and Airflow in the past for building a data ingestion pipeline using Pig and Hive. Currently, I am in the process of building a pipeline that looks at logs, extracts useful events, and puts them on Redshift.
I found that Airflow was much easier to use/test/set up. It has a much cooler UI and lets users perform actions from the UI itself, which is not the case with Oozie. Any information about Luigi, or other insights regarding stability and issues, is welcome.
Azkaban: Nice UI, relatively simple, accessible for non-programmers. Has a longish history at LinkedIn.
Check out the Azkaban CLI project for programmatic job creation. I have an Azkaban example workflows project on GitHub.
Airflow: Decent UI, Python-ish job definition, semi-accessible for non-programmers, dependency declaration syntax is weird (see the sketch after this list).
Luigi: OK UI, workflows are pure Python, requires solid grasp of Python coding and object oriented concepts, hence not suitable for non-programmers.
Oozie: Insane XML-based job definitions. Here be dragons. ;-)
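To illustrate the Airflow point: task dependencies are declared by bit-shifting task objects, which is concise but takes some getting used to. A minimal sketch with placeholder task names:

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG("example_dag", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    transform = BashOperator(task_id="transform", bash_command="echo transform")
    load = BashOperator(task_id="load", bash_command="echo load")

    # The overloaded >> operator declares ordering: extract, then transform, then load.
    extract >> transform >> load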
IMHO, Azkaban enforces simplicity (can’t use features that don’t exist) and the others subtly encourage complexity.
Simpler pipelines are better than complex pipelines: easier to create, easier to understand (especially when you didn't create them), and easier to debug/fix.
When complex actions are needed you want to encapsulate them in a way that either completely succeeds or completely fails.
If you can make it idempotent (running it again creates identical results) then that’s even better.
This post will give you an initial idea about the different possible workflow tools:
http://bytepawn.com/luigi-airflow-pinball.html