Parallel Processing with Starting New Task - front end screen timeout

I am running an ABAP program that processes a huge amount of data. According to the SAP documentation, I should use
remote-enabled function modules with the addition STARTING NEW TASK to process the data in parallel.
So my program first selects all the data, breaks it into packages, and calls a function module with one package at a time for further processing.
Here is my pseudocode:
SELECT KEYFIELD FROM MYSAP_TABLE INTO TABLE KEY_TABLE PACKAGE SIZE 500.
  APPEND KEY_TABLE TO ALL_KEYS_TABLE.
ENDSELECT.
LOOP AT ALL_KEYS_TABLE ASSIGNING <fs_table>.
  CALL FUNCTION 'Z_MASS_PROCESSING'
    STARTING NEW TASK 'TEST'
    DESTINATION IN GROUP DEFAULT
    EXPORTING
      IT_DATA = <fs_table>.
ENDLOOP.
But I was surprised to see that the calls of my function module run in dialog work processes instead of background work processes.
As a result, one of my dialog processes was killed after 60 minutes because of a timeout.
It seems to me that STARTING NEW TASK is not the right solution for parallel processing of mass data.
What would be the alternative?

As already mentioned, that's not an easy topic that can be handled with a few lines of code. The general steps you have to carry out carefully to gain the desired benefit are:
1) Get free work processes available for parallel processing
2) Slice your data in packages to be processed
3) Call an RFC enabled function module asynchronously for each package with the available work processes. Handle waiting for free work processes, if packages > available processes
4) Receive your results asynchronously
5) Wait until everything is processed, merge the data together again, and ensure that every package was handled properly
Although it is bad practice to just post links, the code is very long and would make this answer very messy, so take a look at the following links:
Example1-aRFC
Example2-aRFC
Example3-aRFC
Other RFC variants (e.g. qRFC, tRFC etc.) can be found here with a short description, but sadly I cannot give you further insight into them.
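The five steps listed above are not specific to ABAP. As a rough, language-neutral illustration (Python here, not aRFC code), the sketch below uses a bounded thread pool in place of the free work processes. The names `slice_into_packages`, `process_package` and `run_in_parallel` are made up for this example, and `process_package` is only a stand-in for the RFC-enabled function module.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def slice_into_packages(keys, package_size):
    """Step 2: split the full key list into fixed-size packages."""
    return [keys[i:i + package_size] for i in range(0, len(keys), package_size)]


def process_package(package):
    """Hypothetical stand-in for the RFC-enabled function module."""
    return [key * 2 for key in package]


def run_in_parallel(keys, package_size, max_workers):
    packages = slice_into_packages(keys, package_size)
    results = {}
    failed = []
    # Steps 1 and 3: the pool size caps concurrency; surplus packages
    # wait in the pool's queue until a worker becomes free.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(process_package, p): i
                   for i, p in enumerate(packages)}
        # Step 4: receive results asynchronously, in completion order.
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = future.result()
            except Exception:
                failed.append(index)
    # Step 5: merge in the original package order and report failures.
    merged = [row for i in sorted(results) for row in results[i]]
    return merged, sorted(failed)
```

A caller would check that the returned list of failed package indexes is empty before trusting the merged result.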
EDIT:
Regarding process type of aRFC:
In parallel processing, a job step is started as usual in a background
processing work process. (...)While the job itself runs in a
background process, the parallel processing tasks that it starts run
in dialog work processes. Such dialog work processes may be located on
any SAP server.
The server is specified with the GROUP addition (default: parallel_generators); see transaction RZ12. A group can have its own resources reserved just for parallel processing. If your process times out, you have to slice your data into smaller packages.

I think the best way to do parallel processing in SAP is the Bank Parallel Processing framework, as Jagger mentioned. Unfortunately it's rarely mentioned in any resource and it's not well documented.
Actually, the best documentation I found was in this book:
https://www.sap-press.com/abap-performance-tuning_2092/
Yes, it's tricky. It cost me about 5 or 6 days to get it going. But the results were good.
Everything is located in package BANK_PP_JOBCTRL, and you can use that name for googling.
The main idea is to divide all your work into steps (simplified):
1. Preparation
2. Parallel processing
2.1. Processing preparation
2.2. Processing
(Actually there are more steps than these.)
The first step is not parallelized. Here you should prepare all your data for parallel processing and divide it into 'pieces' which will be processed in parallel.
The content of a piece, in turn, can be IDs or preloaded data.
After that, you can run step 2 in parallel.
The great benefit of all this is that an error in one piece of parallel work won't crash the whole run.
I recommend you check the demo in function group BANK_API_PP_DEMO.
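To make the "an error in one piece does not crash the rest" idea concrete, here is a minimal Python sketch. It is not the Bank Parallel Processing API: `prepare`, `process_piece`, `safe_process` and `run` are hypothetical names, and the failure on negative IDs exists only to simulate an error.

```python
from concurrent.futures import ThreadPoolExecutor


def prepare(ids, piece_size):
    """Preparation (not parallel): divide the work into pieces of IDs."""
    return [ids[i:i + piece_size] for i in range(0, len(ids), piece_size)]


def process_piece(piece):
    """Hypothetical processing step; fails on negative IDs to simulate an error."""
    if any(i < 0 for i in piece):
        raise ValueError("negative ID in piece")
    return sum(piece)


def safe_process(piece):
    """Record an error for this piece instead of letting it propagate."""
    try:
        return ("ok", process_piece(piece))
    except Exception as exc:
        return ("error", str(exc))


def run(ids, piece_size=2, workers=4):
    pieces = prepare(ids, piece_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the statuses in the same order as the pieces.
        return list(zip(pieces, pool.map(safe_process, pieces)))
```

With the input `[1, 2, -3, 4, 5, 6]`, only the middle piece is reported as an error; the other two finish normally, and the failed piece could be rerun on its own.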

To implement parallel processing, you need to do a bit more than just add that clause. The information is contained in this help topic. A lot of design effort needs to be devoted to ensure that the communication and result merging overhead of the parallel processing does not negate the performance advantage gained by the parallel processing in the first place and that referential integrity of the data is maintained even when some of the parallel tasks fail. Do not under-estimate the complexity of this task.

You could make use of the bgRFC technique. This is a newer method of background processing provided by SAP.
In addition to what the existing IN BACKGROUND TASK offers, bgRFC makes it possible to configure and monitor all calls that run through it.
You can read more about the differences between the various options here. This all depends, of course, on your SAP version.

Related

Control parallelism in Apache Beam Dataflow pipeline

We are experimenting with Apache Beam (using the Go SDK) and Dataflow to parallelize one of our time-consuming tasks. For a little more context, we have a caching job which takes some queries, runs them against the database, and caches the results. Each database query may take from a few seconds to many minutes, and we want to run them in parallel for quicker task completion.
We created a simple pipeline that looks like this:
// Create initial PCollection.
startLoad := beam.Create(s, "InitialLoadToStartPipeline")
// Emits a unit of work along with query and date range.
cachePayloads := beam.ParDo(s, &getCachePayloadsFn{Config: config}, startLoad)
// Emits a cache response which includes errCode, errMsg, time etc.
cacheResponses := beam.ParDo(s, &cacheQueryDoFn{Config: config}, cachePayloads)
...
The number of units which getCachePayloadsFn emits is not large: mostly in the hundreds, and at most a few thousand in production.
Now the issue is that cacheQueryDoFn is not executed in parallel; the queries are executed sequentially, one by one. We confirmed this by logging the goroutine ID, process ID, start and end time etc. in StartBundle and ProcessElement of the caching function: there is no overlap in execution.
We want the queries to always run in parallel, even if there are just 10 of them. From our understanding of the documentation, the runner creates bundles from the overall input; those bundles run in parallel, and the elements within a bundle run sequentially. Is there a way to control the number of bundles, or any other way to increase parallelism?
Things we tried:
Keeping num_workers=2 and autoscaling_algorithm=None. It starts two VMs but runs Setup method to initialize DoFn on only one VM and uses that for entire load.
Found the sdk_worker_parallelism option here, but we are not sure how to set it correctly. We tried setting it with beam.PipelineOptions.Set("sdk_worker_parallelism", "50"). No effect.
By default, the Create is not parallel and all the DoFns are being fused into the same stage as the Create, so they also have no parallelism. See https://beam.apache.org/documentation/runtime/model/#dependent-parallellism for some more info on this.
You can explicitly force a fusion break with the Reshuffle transform (beam.Reshuffle in the Go SDK), for example between getCachePayloadsFn and cacheQueryDoFn.

Sustainable solution using JMeter for a big functional flow

I have a huge flow to test using APIs. There are 3 endpoints: one starts a process (a DB migration) that can last ~2-3 days, one returns the status of the currently running process (in progress, success, fail), and the last one returns all the failed processes (as a list).
The whole flow should be:
Start the first process
Call the second endpoint until the first process ends (should get Fail or Success)
If the process failed, call the first endpoint again, if not, go to the next process.
The problem is that one process can last around 2-3 days and we have around 20k processes to check, so this will take a lot of time. I do have a dedicated VM just for this.
My question: is it worth trying to implement a solution for this using JMeter?
It is not worth implementing in JMeter unless you want to use the tool as a workload automation engine, replacing functionality provided by products such as UC4 AppWorx or Control-M. Based on what you describe, this does not appear to be a load test, except perhaps the second part that continuously queries the service for success/failure. I do not know the architecture behind that implementation, so I cannot say whether even that would qualify as a load test.
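For reference, the flow described in the question is a plain start/poll/restart loop that can be scripted without a load-testing tool. The Python sketch below assumes two hypothetical callables, `start` and `get_status`, that wrap the first two endpoints. The status strings are the ones named in the question, and the restart limit is an added assumption so that the loop cannot run forever.

```python
import time


def run_until_success(start, get_status, max_restarts=3,
                      poll_interval=0.0, sleep=time.sleep):
    """Start a process, poll until it finishes, and restart it on failure.

    `start` and `get_status` are hypothetical wrappers around the first two
    endpoints; get_status must return "in progress", "success" or "fail".
    Returns the number of starts that were needed.
    """
    for attempt in range(1, max_restarts + 2):
        start()
        while True:
            status = get_status()
            if status != "in progress":
                break
            sleep(poll_interval)  # wait before asking again
        if status == "success":
            return attempt
    raise RuntimeError("process still failing after %d restarts" % max_restarts)
```

For processes that run for days, the poll interval would be minutes rather than seconds, and the loop would be called once per process.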

How to keep webserver responsive while executing many asynchronous background tasks

I am working on a web application that lets its users optionally execute long-running processes 'in the background'. An example would be a long-running report generation, or deleting thousands of objects at once.
I've implemented this using an ExecutorService defined as a FixedThreadPool with a ThreadFactory. The ThreadFactory is built like this:
new ThreadFactoryBuilder()
    .setNameFormat(clientId + "-BackgroundTask-%d")
    .setDaemon(true)
    .setPriority(Thread.MIN_PRIORITY)
    .build();
I execute the task like this:
Future<TaskStatus> future = clientExecutors.get(clientId).submit(
backgroundTask::execute);
taskFutures.put(backgroundTask.getTaskId(), future);
How can I force my webserver to always prioritize handling new incoming requests (as fast as possible) over executing background tasks?
In other words: it should never happen that a user has to wait a long time while browsing the site just because a lot of background tasks are executing. As you can see above, I tried to achieve this by setting .setPriority(Thread.MIN_PRIORITY). However, that does not seem to be sufficient.
Furthermore, as for now, I've set some arbitrary value for the FixedThreadPool size (10) and use it globally for the entire background-handling of the application (and all its customers).
Instead I would like to define a threadpool for each customer, to make sure each customer has the same privilege to run a certain amount of tasks in the background. Say, each customer has a FixedThreadPool of size 5, and on the server I'll have a max. of 50 different customers. That would add up to 250 running background tasks at the same time.
The most important requirement here is: it does not matter, how long these background-tasks need to execute (say 2 minutes, or 20 minutes). What is important, is that each customer has the ability to send 5 tasks to be executed in background, and each of those are worked on equally.
I've tested running 30 cpu-intensive background tasks and it turns out that while these are running and cpu is near 100%, new incoming requests take a very long time to be handled.
So obviously, I am doing it wrong.
Update 12.09.2017
I've read about microservices and while it sounds great I see a great challenge in splitting the necessary parts from our monolithic application. Mostly because nearly every operation might turn into a long running process given a big enough data selection.
Furthermore, wouldn't I run into the same problem with my microservice, i.e. the server running the microservice would suffer the same performance degradation? The only good thing would be that the rest of the web app would no longer suffer from it.
I've read some posts about introducing Thread.sleep(1), or Thread.sleep in general, into CPU-heavy operations to reduce the amount of CPU they use. I've also read about someone who introduced this as an aspect, so that he could even change the wait time dynamically and thereby control how much CPU is used.
However, my gut tells me that ain't right either. What do you think about introducing Thread.sleep to lower the amount of CPU used for a task? Is this common practice? If not, what would be the right approach?
I would strongly consider changing your system architecture to offload these long-running requests to a separate instance instead of running them in-process with the general request-serving application. In general I think it is an anti-pattern to handle both batch and online (or long-running and short-running) processing in the same application instance.
Ideally you'd build a standalone microservice to handle these requests, but you could also simply deploy X instances of your existing application and configure your load balancer to route requests for the long-running invocation paths (e.g. POST /myapp/longrunningjob) only to the instances dedicated to running these long-running processes.

Grinder - how to distribute invocation of urls from file

We have a huge file of different urls (~500K - ~1M urls).
We want to use Grinder 3 to distribute these urls to the Workers in such a way that every Worker invokes a single, different url.
In the Jython script we could:
Read the file one time per Agent
Allocate line-number-ranges per Agent
Every Worker would get a line/url according to its run-id from its Agent's line-number-range.
This still means loading a huge file into memory and writing code for a problem that is probably common to many.
Any ideas for a simpler/ready-made solution?
I used Grinder in a similar fashion a while back, and wrote a utility for multi-threaded, one-time ingestion of URLs from a large file.
See https://bitbucket.org/travis_bear/file_util -- in particular, the sequential reader.
I'd recommend using the split command-line utility (or similar) to give separate chunks of the master file to each agent prior to executing your Grinder run.
I would take a different approach, since it's a huge file.
How many threads are you planning to spawn? I believe you already know that you can use grinder.threadNumber to get the number of the currently executing thread.
You can divide the file with a pre-processor into as many files as there are threads, each with an equal number of records, and name them 0, 1, 2 etc. to match the thread numbers.
The reason I suggest this is that processing the file is a preliminary task; what matters is its contents. File processing should not interfere while the threads are executing.
Each thread will then have its own file, with no collisions.
For example, 20 threads means 20 files. However, the number of threads should be chosen carefully, perhaps peak + 50%.
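A pre-processor of the kind suggested above could look roughly like the following Python sketch. It streams the master file and deals its lines round-robin into files named 0, 1, 2 and so on, one per thread. The function names and the example URLs are made up for illustration, and the sketch is meant to run as a separate step before the Grinder run, not inside the Grinder script.

```python
import os
import tempfile


def split_round_robin(source_path, thread_count, out_dir):
    """Write line i of the source to the file named str(i % thread_count)."""
    paths = [os.path.join(out_dir, str(n)) for n in range(thread_count)]
    outputs = [open(path, "w") for path in paths]
    try:
        with open(source_path) as source:
            # Iterating the file streams it; it is never loaded whole.
            for index, line in enumerate(source):
                outputs[index % thread_count].write(line)
    finally:
        for output in outputs:
            output.close()
    return paths


def read_lines(path):
    with open(path) as handle:
        return handle.read().split()


def demo():
    """Split a 7-line example file into 3 chunks and return their contents."""
    with tempfile.TemporaryDirectory() as tmp:
        master = os.path.join(tmp, "urls.txt")
        with open(master, "w") as handle:
            handle.writelines("http://example.com/%d\n" % n for n in range(7))
        return [read_lines(p) for p in split_round_robin(master, 3, tmp)]
```

Each thread would then open only the file whose name matches its own thread number, so no two threads read the same url.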

how to implement custom cloud worker

I am designing a cloud app and need a worker process which scours my database looking for work, and then performs it.
Most of the info I seem to find on the subject of background tasks in the cloud involves some kind of scheduler and/or queuing system.
What I have doesn't quite fit into the "run this task every 5 minutes" or "add this to the queue to be executed later" models. I think the main difference to my problem is that the workers themselves find work to do, rather than being assigned it by a periodic scheduler or an external process that generates work.
What I have is basically a giant table where each entry has three fields:
job: a small task to be performed, let's say it gets the last message from a twitter account and stores it in the database
the interval at which to perform that job: say every 5 minutes, N.B. the interval is arbitrary and different for each entry in the table
the last date when the job was performed
The way I would implement this is to have a worker with an infinite loop. In each iteration, it a) scours the database for items whose date + interval < currentTime, b) when it finds one, sets date = currentTime, and c) then executes the job. If there is no work at the moment, it sleeps for a few seconds, then tries again.
I will have many parallel workers scouring the database simultaneously, which is why I do b) first and then c) in the paragraph above. Since there are parallel workers, actions a) and b) together form one atomic operation on the database, to prevent work from being duplicated. If a worker crashes after a) and b) but before it manages to finish the work, it's no big deal, and the workers can just do it at the next interval. The reason is that the work is not performed in a time-invariant system, so keeping a backlog of failed jobs has no benefit: the tasks have to be performed at their exact intervals, so it's better to skip one interval than to have uneven intervals between executions.
My question is whether that is a reasonable implementation strategy? If so, how do I bring this process to life on the cloud (I am using Heroku, but may switch to EC2 in the future)? I still haven't written any code so I would welcome other suggestions (maybe I misunderstood the use cases/applications for queue systems).
This sounds so close to using something like a scheduled job that you might as well tread the well beaten path and do it the more conventional way. There's no reason why you can't schedule a job to run once every few seconds.
However, this idea of looking for work sounds dodgy. What happens if two workers find the same task to run at the same time for instance? Also, are there not triggers in the application which can indicate that work needs doing? It seems strange that you have code 'looking for work'.
You can go a very long way with simple periodic background tasks, so I would exhaust all possibilities in that area before rolling your own.

Resources