Chaining of XQuery / XSLT Transformation to improve performance


I have developed some XQuery scripts which I call in a chain via the Saxon CLI (bat files).
My problem now is that the Saxon CLI is quite slow (because Java is slow, and Java on .NET is even slower).
The problem is the startup time, which takes a few seconds (not the query execution itself). So my idea is to avoid creating new processes over and over again and instead use a single XSLT or XQuery process that loads the scripts and executes them.
But how do I load and execute an XQuery file in Saxon XSLT? Is it possible?

Certainly, command-line scripts that involve firing up a new Java VM for each step are NOT the way to do it!
XProc is certainly a good candidate. But I have to confess I still do a lot of this in Ant: it's old but it works.
It's also possible to control a sequence of queries and transformations from within XSLT (you can invoke queries using a Saxon extension function, but it needs Saxon-PE or higher). I don't think that's the preferred way, but it's one less technology to learn about.
There are also quite a few pipeline processors out there such as Orbeon.
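As a rough illustration of keeping the whole chain inside one JVM, here is a minimal Java sketch that runs two stylesheets back to back through the standard JAXP API (javax.xml.transform), which Saxon also implements. It is not the Saxon extension function mentioned above, and it covers XSLT steps only, since JAXP has no XQuery interface. The class name, the helper runChain, and the two inlined stylesheets are hypothetical and exist only for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltChain {
    // Hypothetical stylesheets, inlined so the sketch is self-contained.
    static final String STEP1 =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='/in'><mid><xsl:value-of select='.'/></mid></xsl:template>"
      + "</xsl:stylesheet>";

    static final String STEP2 =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='/mid'><out><xsl:value-of select='.'/></out></xsl:template>"
      + "</xsl:stylesheet>";

    // Feeds the output of each stylesheet into the next one, all in the same process.
    static String runChain(String xml, String... stylesheets) {
        try {
            // One factory for the whole chain; the JVM startup cost is paid once.
            TransformerFactory factory = TransformerFactory.newInstance();
            String current = xml;
            for (String xslt : stylesheets) {
                // Compile the stylesheet, then apply it to the current document.
                Templates compiled =
                    factory.newTemplates(new StreamSource(new StringReader(xslt)));
                StringWriter out = new StringWriter();
                compiled.newTransformer().transform(
                    new StreamSource(new StringReader(current)), new StreamResult(out));
                current = out.toString();
            }
            return current;
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation step failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runChain("<in>hello</in>", STEP1, STEP2));
    }
}
```

Passing each intermediate result as a string keeps the sketch short; a real pipeline would more likely read the stylesheets from files and hand results between steps without re-serializing them.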

Related

Why do Java-based serverless functions have cold starts if the JVM uses a JIT compiler?

Late Friday night thoughts after reading through material on how Cloudflare's V8-based "no cold start" Workers function (in short, because of the V8 engine's just-in-time compilation of JavaScript code): I'm wondering why this no-cold-start type of serverless function seems to exist only for JavaScript.
Is this just because architecturally when AWS Lambda / Azure Functions were launched, they were designed as a kind of even more simplified Kubernetes model, where each function exists in its own container? I would assume that was a simpler model for keeping different clients' code separate than whatever magic sauce V8 isolates provide under the hood.
So given that Java is compiled into bytecode for the JVM, which uses JIT compilation (as far as I know, it optimises and compiles certain high-usage functions to machine code), is it therefore also technically possible to have no-cold-start Java serverless functions? As long as there is some way to load in each client's bytecode as they are invoked, on the cloud provider's server.
What are the practical challenges for this to become a reality? I'm not a big expert on all this, but can imagine perhaps:
The compiled bytecode isn't designed to be loaded in this way - it expects to be the only code being executed in a JVM
JVM optimisations aren't written to support loading multiple short-lived functions, and treat all loaded code as one massive program
The JVM, once started, doesn't support loading additional bytecode.

In principle, you could probably develop a Java-centric serverless runtime in which individual functions are dynamically loaded on-demand, and you might be able to achieve pretty good cold-start time this way. However, there are two big reasons why this might not work as well as JavaScript:
While Java is designed for JIT compiling, it has not been optimized for startup time nearly as intensely as V8 has. Today, the JVM is most commonly used in large always-on servers, where startup speed is not that important. V8, on the other hand, has always focused on a browser environment where code is downloaded and executed while a user is waiting, so minimizing startup latency is critical. (It might actually be interesting to look at an alternative Java runtime like Android's Dalvik, which has had much more reason to prioritize startup speed. Maybe it could be the basis of a really fast Java serverless environment!)
Security. V8 and other JavaScript runtimes have been designed with hostile code in mind from the beginning, and have had a huge amount of security research done on them. Java tried to target this too, in the very early days, with "applets", but that usage of Java never caught on. These days, secure sandboxing is not a major concern of Java. Because of this, it is probably too risky to run multiple Java apps that don't trust each other within the same container. And so, you are back to starting a separate container for each application.
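To make the "dynamically loaded on-demand" idea concrete, here is a minimal Java sketch showing that a running JVM can load code it has not seen before, using the standard javax.tools compiler API and a URLClassLoader. The class ClientFn and its handle method are hypothetical stand-ins for a client's function, and the sketch assumes a full JDK is available at runtime. It says nothing about the startup-time or sandboxing concerns above; it only shows the loading mechanism.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class DynamicLoadDemo {
    // Compiles a hypothetical client function at runtime, loads it into
    // the already-running JVM, and invokes it through reflection.
    static String loadAndInvoke(String input) {
        try {
            Path dir = Files.createTempDirectory("fn");
            Path src = dir.resolve("ClientFn.java");
            String source =
                "public class ClientFn {"
              + "  public static String handle(String s) { return \"handled:\" + s; }"
              + "}";
            Files.write(src, source.getBytes(StandardCharsets.UTF_8));

            // getSystemJavaCompiler() returns null when no compiler is available.
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("a JDK (not just a JRE) is required");
            }
            int rc = compiler.run(null, null, null, "-d", dir.toString(), src.toString());
            if (rc != 0) {
                throw new IllegalStateException("compilation failed");
            }

            // A fresh class loader per function keeps its classes separate
            // from the host application's classes (this is not a security boundary).
            try (URLClassLoader loader =
                     new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
                Class<?> fn = loader.loadClass("ClientFn");
                return (String) fn.getMethod("handle", String.class).invoke(null, input);
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAndInvoke("request-1"));
    }
}
```

A separate class loader gives namespace isolation only. Code loaded this way still shares the heap, threads, and process privileges with everything else in the JVM, which is the risk the security point above describes.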

How do I increase the performance of Jython code?

I'm currently working in an environment within a JVM which only allows two scripting languages: Groovy and Jython. The scripts I write rely heavily on JDBC queries (querying and iterating over result sets) and nested loops.
I usually write each script first in Jython, then in Groovy, in order to compare performance. Groovy always beats Jython (makes sense, since Groovy is essentially Java source code, right?), despite the code performing the same tasks.
I would, however, prefer to use Jython. So I researched parameters that could speed up Jython code in general. I tweaked the -Xmx and -Xms JVM parameters to no avail. I am also in the process of tuning the garbage collector.
I was wondering if you could provide me with some Jython-specific JVM tuning advice to improve performance. I'm grateful for any lessons you might have learnt improving Jython performance.

In fact, Kotlin can also be used as a scripting language on the JVM. Kotlin has a scripting mode, and it is also used as a DSL for Gradle.
Because it appeared later, its syntax and performance should be better designed. You could try it.

Creating an installer for consultingware

At the company where I work, we have a product that for all intents and purposes could be called consultingware. It's a platform for EDI with quite a few moving parts. The back-end is an ESB written in Java SE, the front-end is a Java EE application running on GlassFish, the database is typically on an MSSQL server and RabbitMQ is used as queueing middleware. It's domain-agnostic in the sense that different message models and mappings can be deployed. Setting up a new environment tends to take quite a while, but a lot of it consists of mundane tasks that could easily be automated by filling in the right parameters and running scripts. The database is set up with T-SQL, GlassFish with asadmin scripts, and the ESB configs are XML, so an XSLT transformation on a template would do the job.
This is never going to become a simple installation, but having an "installer" that does most of the work for you, lists prerequisite steps, presents the user with a convenient way of supplying necessary parameters, generating some scripts and putting things in place would be nice; even if only the devs ever use it, it would make life easier. Although the software is technically platform-independent, it tends to be run on Windows Server.
Just making a Java application that does the above wouldn't be very difficult, but rather than reinvent the wheel (and make a probably very ugly GUI) I'd like to see if any existing solutions fit the bill. InstallShield and Inno Setup look promising. So the question is, which existing tool could provide the following, or alternatively, is making something from scratch worth it?
Call other executables or installers (for GlassFish, for example).
Run shell scripts (for the asadmin setup).
Connect to a (MSSQL) database and run scripts.
Perform XSLT transformations (could be via a Java method call/jar execution).
Set up services.
Maybe have some way of checking if prerequisites are fulfilled (check if GlassFish is installed, RabbitMQ, DB is accessible...)
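For the XSLT step in the list above, a minimal Java sketch of filling installer parameters into a config template through the standard JAXP API might look like the following. The stylesheet, the config and queue element names, and the queueHost parameter are all hypothetical; the real ESB config format is not described here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ConfigGenerator {
    // Hypothetical stylesheet: copies a parameter into an attribute of the output.
    static final String TEMPLATE_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:param name='queueHost'/>"
      + "<xsl:template match='/config'>"
      + "<config><queue host='{$queueHost}'/></config>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the template document with user-supplied parameters.
    static String generate(String templateXml, Map<String, String> params) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(TEMPLATE_XSLT)));
            // Each entry becomes a top-level xsl:param value.
            for (Map.Entry<String, String> p : params.entrySet()) {
                transformer.setParameter(p.getKey(), p.getValue());
            }
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(templateXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("config generation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(generate("<config/>", Map.of("queueHost", "mq.example.local")));
    }
}
```

An installer front-end would only need to collect the parameter values and pass them in as the map, so the same method could be called from a GUI or from a script.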

FWIW, you can do all of those things from an MSI. There are a number of tools out there that make the process easier. I use a free one called MAKEMSI that is excellent: http://dennisbareis.com/makemsi.htm

In MongoDB, is V8 as fast as using a driver?

We are testing a load harness, and we wrote it as shell scripts calling JavaScript files via the mongo shell. I just wanted to get your opinion as to whether running the algorithm this way might not give us correct results, given that the mongo shell is meant for administration and relies on a JavaScript engine. Would running the algorithm written as an app in Java or some other language give us different results?
I guess the short question is: will an algorithm that puts load on MongoDB show performance differences when run from the shell vs. a driver? Also, is there anything we should keep in mind as we compare MMAP vs WiredTiger in 3.0.7?

Quartz.NET vs Windows Scheduled Tasks. How different are they?

I'm looking for some comparison between Quartz.NET and Windows Scheduled Tasks?
How different are they? What are the pros and cons of each one? How do I choose which one to use?
TIA

With Quartz.NET I could contrast some of the earlier points:
Code to write - You can express your intent in .NET language, write unit tests and debug the logic
Integration with the event log: you have Common.Logging, which can even write to a database
Robust and reliable too
Even richer API
It's mostly a question of what you need. Windows Scheduled Tasks might give you all you need. But if you need clustering (distributed workers), fine-grained control over triggering or misfire handling rules, you might like to check what Quartz.NET has to offer in these areas.
Take the simplest option that meets your requirements, but keep it abstract enough to allow change.

My gut reaction would be to try to get the built-in Windows Task Scheduler to work with your needs first before installing yet another scheduler - reasoning:
no installation required - installed and enabled by default
no code to write - jobs expressed as metadata
integration with event log etc.
robust and reliable - good enough for MSFT, Google etc.
reasonably rich API - create jobs, check status etc.
integrated with remote management tools
security integration - run jobs in different credentials
monitoring tooling
Then reach for Quartz if it doesn't meet your needs. Quartz certainly has many of these features too, but resist adding yet another service to own and manage if you can.

One important distinction, for me, that is not included in the other answers is what gets executed by the scheduler.
Windows Task Scheduler can only run executable programs and scripts. The code written for use within Quartz can directly interact with your project's .NET components.
With Task Scheduler, you'll have to write a shell executable or script. Inside of that shell, you can interact with your project's components. While writing this shell code is not a difficult process, you do have to consider deploying the extra files.
If you anticipate adding more scheduled tasks over the lifetime of the project, you may end up needing to create additional executable shells or script files, which requires updates to the deployment process. With Quartz, you don't need these files, which reduces the total effort needed to create and deploy additional tasks.

Unfortunately, Quartz.NET job assemblies can't be updated without restarting the process/host/service. That's a pretty big one for some folks (including myself).
It's entirely possible to build a framework for jobs running under Task Scheduler. MEF-based assemblies can be called by a single console app, with everything managed via a configuration UI. Here's a popular managed wrapper:
https://github.com/dahall/taskscheduler
https://www.nuget.org/packages/TaskScheduler
I did enjoy my brief time working with Quartz.NET, but the restart requirement was too big a problem to overcome. Marko has done a great job with it over the years, and he's always been helpful and responsive. Perhaps someday the project will get multiple AppDomain support, which would address this. (That said, it promises to be a lot of work. Kudos to him and his contributors if they decide to take it on.)
To paraphrase Marko, if you need:
Clustering (distributed workers)
Fine-grained control over triggering or misfire handling rules
...then Quartz.NET is what you need.
