Parallel Processing in docx4j - thread-safety

Is it possible to process multiple paragraphs simultaneously using docx4j? If so, is it thread-safe?
I tried to split every character into individual runs. When I process the paragraphs sequentially, it works fine. But when I run the same process in parallel, I do not get the expected output.

Conceptually, docx4j assumes one thread per WordprocessingMLPackage (i.e. one thread per docx).
If you want to work for a while with one thread per paragraph, that should be fine, provided you aren't modifying objects higher in the hierarchy than P.
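The rule above can be illustrated with a toy model. This is a minimal Python sketch, not docx4j code: the dictionaries standing in for the document and its paragraphs, and the function split_into_single_char_runs, are hypothetical. The only point is that each task mutates its own paragraph and merely reads the parent container.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_single_char_runs(paragraph):
    """Rewrite one paragraph's runs in place; touches nothing outside it."""
    text = "".join(paragraph["runs"])
    paragraph["runs"] = list(text)

# Hypothetical stand-in for a document: a parent holding paragraphs.
document = {"paragraphs": [{"runs": ["Hello ", "world"]}, {"runs": ["docx4j"]}]}

with ThreadPoolExecutor(max_workers=2) as pool:
    # Each task receives exactly one paragraph; the parent list is only read.
    list(pool.map(split_into_single_char_runs, document["paragraphs"]))
```

If a task instead added or removed paragraphs in the parent list, the workers would be sharing mutable state, which is the situation the answer warns about.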

Related

How to stop the concurrency between two transaction controller in two different thread groups

Here is my problem: I have two different thread groups, each containing a transaction controller (named Upload a CSV file and Upload a XLS file) that uploads to my target application. Is there any way to prevent these two controllers from running concurrently? The objective is that two or more files (CSV and XLS) should never be uploaded to my system at the same moment.
At present I use Random Timers: the first thread group generates a delay between 1 and 5 seconds, and the second thread group uses 6 to 10 seconds.
[Image: Think Timer for CSV]
[Image: Think Timer for XLS]
Is there a better approach that guarantees no two file uploads happen at the same time?
Note: I am using completely different sets of users for these two requests.
Do you realize that, given your requirement, you will only be able to upload one file at a time? It therefore doesn't make much sense to use more than one thread, or more than one thread group.
Whatever.
There is a Critical Section Controller, which ensures that its child(ren) are run by only one thread at a time.
You can also consider using the Inter-Thread Communication Plugin so that, for example, the XLS upload waits until the CSV upload is done, or vice versa.

Using JMeter is it possible to wait for all threads to finish before ending the test?

I need to run a JMeter test with N users over a fixed time period. I am planning to use an "Ultimate Thread Group" for this, as it meets my requirements. However, at the end of the time period and during the ramp-down it simply kills threads even if they have not finished. This causes problems because I end up with half-completed records left lying around. Is there any way, using this type of thread group or any other, to do what I require?
I have already got my test script ready, and have been exploring different thread groups and UTG seems like the best option, apart from the fact it kills threads without waiting for completion.
I would recommend using the Stepping Thread Group instead of the Ultimate Thread Group; it should help with your scenario. You can adjust the Stepping Thread Group parameters according to your needs.

Parallel Processing with Starting New Task - front end screen timeout

I am running an ABAP program that works with a huge amount of data. The SAP documentation says that I should use remote function modules with the addition STARTING NEW TASK to process the data.
So my program first selects all the data, breaks the data into packages and calls a function module with a package of data for further processing.
So that's my pseudo code:
    SELECT KEYFIELD FROM MYSAP_TABLE INTO TABLE KEY_TABLE PACKAGE SIZE 500.
      APPEND KEY_TABLE TO ALL_KEYS_TABLE.
    ENDSELECT.
    LOOP AT ALL_KEYS_TABLE ASSIGNING <fs_table>.
      CALL FUNCTION 'Z_MASS_PROCESSING'
        STARTING NEW TASK 'TEST'
        DESTINATION IN GROUP DEFAULT
        EXPORTING
          IT_DATA = <fs_table>.
    ENDLOOP.
But I was surprised to see that the calls to my function module use dialog work processes instead of background work processes.
As a result, one of my dialog processes was killed after 60 minutes because of a timeout.
To me, it seems that STARTING NEW TASK is not the right solution for parallel processing of mass data.
What is the alternative?
As already mentioned, that's not an easy topic that can be handled with a few lines of code. The general steps you have to carry out carefully to gain the desired benefit are:
1) Get the free work processes available for parallel processing
2) Slice your data into packages to be processed
3) Call an RFC-enabled function module asynchronously for each package with the available work processes. Handle waiting for free work processes if packages > available processes
4) Receive your results asynchronously
5) Wait until everything is processed, merge the data together again and ensure that every package was handled properly
Although it is bad practice to just post links, the code is very long and would make this answer very messy, so take a look at the following links:
Example1-aRFC
Example2-aRFC
Example3-aRFC
Other RFC variants (e.g. qRFC, tRFC) can be found here with a short description, but sadly I cannot give you further insight into them.
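The five steps listed above can be sketched in a language-neutral way. The following is a minimal Python sketch, not ABAP and not taken from the linked examples: process_package is a hypothetical stand-in for the RFC-enabled function module, and a thread pool of fixed size stands in for the free work processes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def slice_into_packages(keys, package_size):
    """Step 2: cut the full key list into fixed-size packages."""
    return [keys[i:i + package_size] for i in range(0, len(keys), package_size)]

def process_package(package):
    """Hypothetical stand-in for the RFC-enabled function module."""
    return [key * 2 for key in package]

def run_in_parallel(keys, package_size=500, free_workers=4):
    packages = slice_into_packages(keys, package_size)
    results, failed = {}, []
    # Steps 1 and 3: the pool size caps concurrency; extra packages queue
    # until a worker becomes free.
    with ThreadPoolExecutor(max_workers=free_workers) as pool:
        futures = {pool.submit(process_package, p): i for i, p in enumerate(packages)}
        # Step 4: receive results as they complete, in any order.
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = future.result()
            except Exception:
                failed.append(index)
    # Step 5: merge in the original order and report packages that failed.
    merged = [item for i in sorted(results) for item in results[i]]
    return merged, sorted(failed)
```

Keeping the package index alongside each asynchronous call is what makes it possible to check afterwards that every package was handled, and to re-run only the ones that were not.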
EDIT:
Regarding process type of aRFC:
In parallel processing, a job step is started as usual in a background
processing work process. (...)While the job itself runs in a
background process, the parallel processing tasks that it starts run
in dialog work processes. Such dialog work processes may be located on
any SAP server.
The server is specified with the GROUP (default: parallel_generators); see transaction RZ12. The group can have its own resources reserved just for parallel processing. If your process times out, you have to slice your packages into a different size.
I think the best way to do parallel processing in SAP is the Bank Parallel Processing framework, as Jagger mentioned. Unfortunately, it is rarely mentioned in any resource and is not well documented.
Actually, the best documentation I found was in this book:
https://www.sap-press.com/abap-performance-tuning_2092/
Yes, it's tricky. It cost me about 5 or 6 days to get it going. But the results were good.
Everything is located in package BANK_PP_JOBCTRL, and you can use that name when searching the web.
The main idea is to divide all your work into steps (simplified):
1. Preparation
2. Parallel processing
2.1. Processing preparation
2.2. Processing
(Actually there are more steps than these.)
The first step is not parallelized. Here you should prepare all your data for parallel processing and divide it into 'pieces' which will be processed in parallel.
The content of a piece can be either IDs or preloaded data.
After that, you can run step 2 in parallel.
The great benefit of all this is that an error in one piece of parallel work won't crash the whole run.
I recommend you check the demo in function group BANK_API_PP_DEMO.
To implement parallel processing, you need to do a bit more than just add that clause. The information is contained in this help topic. A lot of design effort needs to be devoted to ensure that the communication and result merging overhead of the parallel processing does not negate the performance advantage gained by the parallel processing in the first place and that referential integrity of the data is maintained even when some of the parallel tasks fail. Do not under-estimate the complexity of this task.
You could make use of the bgRFC technique. This is a newer method of background processing provided by SAP.
In addition to what the existing IN BACKGROUND TASK offers, bgRFC lets you configure and monitor all calls that run through it.
You can read more about the differences between the options here. This all depends, of course, on your SAP version.

Grinder - how to distribute invocation of urls from file

We have a huge file of different urls (~500K - ~1M urls).
We want to use Grinder 3 to distribute these URLs to the Workers so that every worker invokes a single, different URL.
In the Jython script we could:
Read the file once per Agent
Allocate line-number ranges per Agent
Have every Worker get a line/URL from its Agent's line-number range according to its run ID
This still means loading a huge file into memory and writing code for a problem that is probably common to many.
Any ideas for a simpler or ready-made solution?
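For reference, the line-number-range idea described in the question comes down to a little arithmetic. This is a minimal sketch in plain Python under stated assumptions: agent_line_range and line_index_for_run are hypothetical helper names, and the agent and run numbers are passed in as ordinary integers rather than read from Grinder at runtime.

```python
def agent_line_range(total_lines, agent_count, agent_number):
    """Return the half-open range [start, stop) of line indexes for one agent.

    Lines are spread as evenly as possible; the first agents absorb
    any remainder.
    """
    base, extra = divmod(total_lines, agent_count)
    start = agent_number * base + min(agent_number, extra)
    stop = start + base + (1 if agent_number < extra else 0)
    return start, stop

def line_index_for_run(start, stop, run_number):
    """Map a run number to one line index inside the agent's range.

    Returns None once the range is exhausted, so no URL is used twice.
    """
    index = start + run_number
    return index if index < stop else None
```

With 10 lines and 3 agents this gives the ranges (0, 4), (4, 7) and (7, 10), so every line belongs to exactly one agent.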
I used Grinder in a similar fashion a while back, and wrote a utility for multi-threaded, one-time ingestion of URLs from a large file.
See https://bitbucket.org/travis_bear/file_util -- in particular, the sequential reader.
I'd recommend using the split command-line utility (or similar) to give separate chunks of the master file to each agent prior to executing your Grinder run.
I would take a different approach, since it's a huge file.
How many threads are you planning to spawn? I believe you already know that you can use grinder.threadNumber to get the number of the currently executing thread.
You can use a pre-processor to divide the file into as many files as there are threads, each with an equal number of records, and name them 0, 1, 2, etc. so that each name matches a thread number.
I suggest this because processing the file is a preliminary task; what matters is its contents. File processing should not interfere while the threads are executing.
Each thread then has its own file, and there are no collisions.
For example, 20 threads means 20 files. However, the number of threads should be chosen carefully, perhaps peak + 50%.
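A pre-processor of this kind could look roughly like the following. It is a minimal sketch in plain Python, run before the test rather than inside the Grinder script; split_round_robin and the example file names are hypothetical. It streams the master file once, so the whole file is never held in memory, and deals lines out in turn so that the per-thread files differ in size by at most one record.

```python
import os
import tempfile

def split_round_robin(master_path, out_dir, thread_count):
    """Write line i of master_path to the file named str(i % thread_count)."""
    os.makedirs(out_dir, exist_ok=True)
    paths = [os.path.join(out_dir, str(n)) for n in range(thread_count)]
    outputs = [open(p, "w") for p in paths]
    try:
        with open(master_path) as master:
            for i, line in enumerate(master):
                outputs[i % thread_count].write(line)
    finally:
        for handle in outputs:
            handle.close()
    return paths

# Usage on a throwaway file of 10 example URLs, split for 3 threads.
work_dir = tempfile.mkdtemp()
master_file = os.path.join(work_dir, "urls.txt")
with open(master_file, "w") as handle:
    for n in range(10):
        handle.write("http://example.com/page/%d\n" % n)
chunk_paths = split_round_robin(master_file, os.path.join(work_dir, "chunks"), 3)
```

Each thread would then open only the file whose name matches its own thread number.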

MFC CEvent class member function SetEvent , difference with Thread Lock() function?

What is the difference between SetEvent() and the Thread Lock() function? Can anyone please help me?
Events are used when you want to start/continue processing once a certain task is completed i.e. you want to wait until that event occurs. Other threads can inform the waiting thread about the completion of this task using SetEvent.
On the other hand, critical section is used when you want only one thread to execute a block of code at a time i.e. you want a set of instructions to be executed by one thread without any other thread changing the state at that time. For example, you are inserting an item into a linked list which involves multiple steps, at that time you don't want another thread to come and try to insert one more object into the list. So you block the other thread until first one finishes using critical sections.
Events can be used for inter-process communication, ie synchronising activity amongst different processes. They are typically used for 'signalling' the occurrence of an activity (e.g. file write has finished). More information on events:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686915%28v=vs.85%29.aspx
Critical sections can only be used within a process for synchronizing threads and use a basic lock/unlock concept. They are typically used to protect a resource from multi-threaded access (e.g. a variable). They are very cheap (in CPU terms) to use. The inter-process variant is called a Mutex in Windows. More info:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms682530%28v=vs.85%29.aspx
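The distinction can be seen side by side in a short example. This is a minimal sketch using Python's threading.Event and threading.Lock as analogues; it is not MFC code, and the names producer, consumer and shared are hypothetical. The event signals that something has happened, while the lock only guards access to shared state.

```python
import threading

done = threading.Event()   # signalling: "the task has finished"
lock = threading.Lock()    # mutual exclusion around shared state
shared = []

def producer():
    with lock:             # only one thread mutates the list at a time
        shared.append("file written")
    done.set()             # analogue of SetEvent(): wake any waiter

def consumer(out):
    done.wait()            # blocks until the event is signalled
    with lock:
        out.extend(shared)

received = []
waiter = threading.Thread(target=consumer, args=(received,))
worker = threading.Thread(target=producer)
waiter.start()
worker.start()
waiter.join()
worker.join()
```

The consumer is started first on purpose: it blocks on the event without holding the lock, so the producer can still take the lock, update the list and then signal.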

Resources