I'm wondering if size of AAE file has an impact on efficiency, speed of BOT. I mean when I wrote first version of my BOT it had like 6.7 MB but after adding some additional error handling and additional if statements. it grew up to 11.5 MB. It did not change the speed of BOT but I'm wondering what is the maximal BOT size which wont affect it's speed.
Size of the atmx files will not have an impact on the speed at which your BOT would run. You could have a file that is as big as you want it to be, although that is not the best practice, and still have your bot work fine.
However, its always a good practice to Break down a chunk of task to small sub tasks and run those tasks via Run Task command in the command library.
Having said that, what might impact the performance is the logic that goes behind it. For example a bot that is stuck in a loop can take up system resources which would eventually crash the Device when all its resources has been consumed by the endless loop.
P.S: Snap shot of the .atmx file that is about 4x times bigger and works just fine in terms of performance
Related
I have an application that takes files from one place and moves them to another place - pretty much all this application does is checks if files are in s3 and downloads ones that are not to another s3. Currently application uses very low amounts of provided CPU. From this post, my understanding that it is to be expected (seeing as my app is pretty much I/O and nothing more).
My initial idea was to lower the number of CPUs provided to the app. However, providing less and less negatively impacted the speed at which my app performs its duties (which according to this article kind of make sense - less CPU means less total clock speed). This is not an option as it needs to run somewhat fast.
I am using kafka messages to start my app. So another idea of mine was to increase the number of partitions in my topic from which my app consumes the messages (so that I can increase the no. of threads that can run concurrently). That allowed me to reduce the number of CPU that I provide to my app (while maintaining the desired processing speed) but my app still uses very low amounts of CPUs.
My app runs in kubernates whose cluster is deployed to EC2s, if that is of any difference. My app is springBoot java. I tried to only give it a minimum number of CPUs, while maxing out the no. of concurrent threads in my app, but again I can see a lot of wasted CPU there.
My question is then as follows: is it possible to somehow make an application to use all available CPU (thus making it more efficient) in this scenario? Is there a config or a method or something that does that? Or for an app that checks data is present and downloads data somewhere else, this is an expected behavior - increasing the number of available resource would improve speed at which my app runs but as a cons of that, there will be waster CPU? (so I am in the classic "good comes with the bad" sort of situation here?)
I am trying to design an Online Programming Contest Judge, and one of the things that I need to ensure is that when the same code is compiled (assuming the requirement),
given the same input, it should take exactly the same amount of time for the program to execute, each time this is done.
Currently, I am using a simple python script that
has 2 threads, one of which invokes a blocking system call that starts the execution of the test code, and the other keeps track of time and sends a kill signal to the
child process after the time limit expires. Incidentally, I am doing this inside a virtual machine for reason of security, and convenience (setting up a proper chroot is
way too complicated, and more risky).
However, given identical conditions (ie, when I restore a snapshot), I still get a variation in the time taken for execution in range of approximately 50ms on either side. As this prevents setting strict time limits, is there anyway to eliminate this variation?
I'm not an expert in that field, but I don't think you can do it. Even if you restore the snapshot inside the VM, the state of the "Outside" Machine is going to be pretty different. You have two OSs running, each one which multiple process which are probably going to compete for the resources at some point. If it's a website or a PC with an internet connection, you can get hit by different amounts of connections (or request), and that will make process start running and consume requests etc... If some application tries to access the hard disk, the initial position of the physical disk matters a lot for seek time, etc...
If you want a "deterministic" limit, you might wanna check if you can count how many instructions were executed by a certain process, or something like that.
Anyways, I've participated in several programming contents, and as far as I know, they don't care about the 50 ms differences... If you do a proper algorithm, you can get inside the time with a really big margin. So I'd advise you to live with it, and just include that in the rules.
I have a large (500K lines) .NET CF (C#) program, running on CE6/.NET CF 3.5 (v.3.5.10181.0). This is running on a FreeScale i.Mx31 (ARM) # 400MHz. It has 128MB RAM, with ~80MB available to applications. My app is the only significant one running (this is a dedicated, embedded system). Managed memory in use (as reported by GC.Collect) is about 18MB.
To give a better idea of the app size, here's some stats culled from .NET CF Remote Performance Monitor after staring up the application:
GC:
Garbage Collections 131
Bytes Collected by GC 97,919,260
Managed Bytes in use after GC 17,774,992
Total Bytes in use after GC 24,117,424
GC Compactions 41
JIT:
Native Bytes Jitted: 10,274,820
Loader:
Classes Loaded 7,393
Methods Loaded 27,691
Recently, I have been trying to track down a performance problem. I found that my benchmark after running the app in two different startup configurations would run in approximately 2 seconds (slow case) vs. 1 second (fast case). In the slow case, the time for the benchmark could change randomly from EXE run to EXE run from 1.1 to 2 seconds, but for any given EXE run, would not change for the life of the application. In other words, you could re-run the benchmark and the time for the test stays the same until you restart the EXE, at which point a new time is established and consistent.
I could not explain the 1.1 to 2x slowdown via any conventional mechanism, or by narrowing the slowdown to any particular part of the benchmark code. It appeared that the overall process was just running slower, almost like a thread was spinning and taking away some of "my" CPU.
Then, I randomly discovered that just by switching away from my app (the GUI loses the foreground) to another app, my performance issue disappears. It stays gone even after returning my app to the foreground. I now have a tentative workaround where my app after startup launches an auxiliary app with a 1x1 size window that kills itself after 5ms. Thus the aux app takes the foreground, then relinquishes it.
The question is, why does this speed up my application?
I know that code gets pitched when a .NET CF app loses the foreground. I also notice that when performing a "GC Heap" capture with .NET CF Remote Performance Monitor, a Code Pitch is logged -- and this also triggers the performance improvement in my app. So I suspect somehow that code pitching is related or even responsible for fixing performance. But I'm at a loss as to figure out how to determine if that is really the case, or even to explain why pitching code could help in this way. Does pitching out lots of code somehow significantly help locality of reference of code pages (that are re-JITted, presumably near each other in memory) enough to help this much? (My benchmark spans probably 3 dozen routines and hundreds of lines of code.)
Most importantly, what can I do in my app to reliably avoid this slower condition. Any pointers to relevant .NET CF / JIT / Code pitching information would be greatly appreciated.
Your app going to the background auto-triggers a GC.Collect, which collects, may compact the GC Heap and may pitch code. Have you checked to see if a manual GC.Collect without going to the background gives the same behavior? It might not be pitching that's giving the perf gain, it might be collection or compaction. If a significant number of dead roots are swept up, walking the root tree may be getting faster. Can't say I've specifically seen this issue, so this is all conjecture.
Send your app a wm_hibernate before your performance loop. Will clean up things
We have a similar issue with our .NET CF application.
Over time, our application progressively slows down, eventually to a halt with what I anticipate is due to high CPU load, or as #wil-s says, as if thread is spinning consuming CPU. The only assumption / conclusion I've made to so far is either we have a rogue thread in our code, or there's an under the cover issue in .NET CF, maybe with the JITter.
Closing the application and re-launching returns our application to normal expected performance.
I am yet to implement code change to test issuing WM_HIBERNATE or launch a dummy app which quits itself (as above) to force a code pitch, but am fairly sure this will resolve our issue based on the above comments. (so many thanks for that)
However, I'm subsequently interested to know whether a root cause was ever found to this specific issue?
Incidentally and seemingly off topic (but bear with me), we're using a Freescale i.MX28 processor and are experiencing unpredictable FlashDisk corruption. Seeing 2K blocks of 0xFFs (erased blocks) in random files located on NAND Flash.
I'm mentioning this as I now believe the CPU and FlashDisk corruption issues are linked, due to this article as well as this one:
https://electronics.stackexchange.com/questions/26720/flash-memory-corruption-due-to-electricals
In the article, #jwygralak67 comments:
I recently worked through a flash corruption issue, on a WinCE system,
as part of a development team. We would sporadically find 2K blocks of
flash that were erased. (All bytes 0xFF) For about 6 months we tested
everything from ESD, to dirty power to EMI and RFI interference, we
bought brand new devices and tracked usage to make sure we weren't
exceeding the erase cycle limit and buring out blocks, we went through
our (application level) software with a fine toothed comb.
In the end it turned out to be an obscure bug in the very low level
flash driver code, which only occurred under periods of heavy CPU
load. The driver came from a 3rd party. We informed them of the issue
we found, but I don't know if they ever released a patch.
Unfortunately, we're yet to make contact with him.
With all of this in mind, potentially if we work around the high CPU load, maybe the corruption will no longer be present. Another conjecture case!
On that assumption however, this doesn't give a firm root cause for either situation, which I'm desperately seeking!
Any knowledge or insight, however small, would be very gratefully received.
#ctacke - we've spoken before regarding OpenNETCF via email, so I'm pleased to see your name!
I created a program in C# to work with 2.5 million records in Oracle Express (local instance), parse/split those records and create an additional 5 million records.
I added some code to print times on the screen and it seems fairly fast. It is doing all the processing for 1K records every 9 seconds. Which means it takes more than 6 hours to finish.
Now, with Task Manager I can see the program is using 6% of CPU (max) and around 50MB of memory. I understand the OS, and Oracle itself need resources to operate but..... is there a way to tell this little program "hey, it's ok, go ahead and use at least 50% of CPU, there are 4GB of RAM so knock yourself out"?
Note: One of the reasons I'm using a local instance with Oracle Express is to reduce the network bottleneck. Also I might not run this process quite often but I was intrigued to see if this was at all possible.
Please forgive my noobness,
Thanks!
The operating system will give your program all the resources it needs, the reason your process is not consuming all the CPU is probably because it's waiting for the IO sub system more than the processor.
If you want to see if you can consume more CPU cycles try writing a program that runs a short infinite loop as fast as possible and you will see the difference in CPU usage.
A number of thoughts, not really answers I guess, but.
You could up the priority of the applications thread, however, its possible that the code maybe less efficient than you think, so..
Have you run a profiler on it?
If its currently a single threaded app, you could look to see if you could parse it in batches and therefore run them in parallel.
Without knowing a lot of detail of the splitting of records, is it possible to off hand that more to oracle to do? eg, would matter less about network etc or local or otherwise.
If you're apps drawing/updating a screen or UI then it will almost certainly slow the progress of the work down. An example. I ran an app which sorted about 10k emails into around 250k lines into a database, if I added an item to a listbox each line the time went from short to rediculous eg, crash out got bored. So, again, offloading to a thread to do the work with as few UI updates to do as possible can help.
Any idea how to do performance and scalability testing if no clear performance requirements have been defined?
More information about my application.
The application has 3 components. One component can only run on Linux, the other two components are Java programs so they can run on Linux/Windows/Mac... The 3 components can be deployed to one box or each component can be deployed to one box. Deployment is very flexible. The Linux-only component will capture raw TCP/IP packages over the network, then one Java component will get those raw data from it and assemble them into the data end users will need and output them to hard disk as data files. The last Java component will upload data from data files to my database in batch.
In the absence of 'must be able to perform X iterations within Y seconds...' type requirements, how about these kinds of things:
Does it take twice as long for twice the size of dataset? (yes = good)
Does it take 10x as long for twice the size of dataset? (yes = bad)
Is it CPU bound?
Is it RAM bound (eg lots of swapping to virtual memory)?
Is it IO / Disk bound?
Is there a certain data-set size at which performance suddenly falls off a cliff?
Surprisingly this is how most perf and scalability tests start.
You can clearly do the testing without criteria, you just define the tests and measure the results. I think your question is more in the lines 'how can I establish test passing criteria without performance requirements'. Actually this is not at all uncommon. Many new projects have no clear criteria established. Informally it would be something like 'if it cannot do X per second we failed'. But once you passed X per second (and you better do!) is X the 'pass' criteria? Usually not, what happens is that you establish a new baseline and your performance tests guard against regression: you compare your current numbers with the best you got, and decide if the new build is 'acceptable' as build validation pass (usually orgs will settle here at something like 70-80% as acceptable, open perf bugs, and make sure that by ship time you get back to 90-95% or 100%+. So basically the performance test themselves become their own requirement.
Scalability is a bit more complicated, because there there is no limit. The scope of your test should be to find out where does the product break. Throw enough load at anything and eventually it will break. You need to know where that limit is and, very importantly, find out how does your product break. Does it give a nice error message and revert or does it spills its guts on the floor?
Define your own. Take the initiative and describe the performance goals yourself.
To answer any better, we'd have to know more about your project.
If there has been 'no performance requirement defined', then why are you even testing this?
If there is a performance requirement defined, but it is 'vague', can you indicate in what way it is vague, so that we can better help you?
Short of that, start from the 'vague' requirement, and pick a reasonable target that at least in your opinion meets or exceeds the vague requirement, then go back to the customer and get them to confirm that your clarification meets their requirements and ideally get formal sign-off on that.
Some definitions / assumptions:
Performance = how quickly the application responds to user input, e.g. web page load times
Scalability = how many peak concurrent users the applicaiton can handle.
Firstly perfomance. Performance testing can be quite simple, such as measuring and recording page load times in a development environment and using techniques like applicaiton profiling to identify and fix bottlenecks.
Load. To execute a load test there are four key factors, you will need to get all of these in place to be successfull.
1. Good usage models of how users will use your site and/or application. This can be easy of the application is already in use, but it can be extermely difficult if you are launching a something new, e.g. a Facebook application.
If you can't get targets as requirements, do some research and make some educated assumptions, document and circulate them for feedback.
2. Tools. You need to have performance testing scripts and tools that can excute the scenarios defined in step 1, with the number of expected users in step 1. (This can be quite expensive)
3. Environment. You will need a production like environment that is isolated so your tests can produce repoducible results. (This can also be very expensive.)
4. Technical experts. Once the applicaiton and environment starts breaking you will need to be able to identify the faults and re-configure the environment and or re-code the application once faults are found.
Generally most projects have a "performance testing" box that they need to tick because of some past failure, however they never plan or budget to do it properley. I normally recommend to do budget for and do scalability testing properley or save your money and don't do it at all. Trying to half do it on the cheap is a waste of time.
However any good developer should be able to do performance testing on their local machine and get some good benefits.
rely on tools (fxcop comes to mind)
rely on common sense
If you want to test performance and scalability with no requirements then you should create your own requirements / specs that can be done in the timeline / deadline given to you. After defining the said requirements, you should then tell your supervisor about it if he/she agrees.
To test scalability (assuming you're testing a program/website):
Create lots of users and data and check if your system and database can handle it. MyISAM table type in MySQL can get the job done.
To test performance:
Optimize codes, check it in a slow internet connection, etc.
Short answer: Don't do it!
In order to get a (better) definition write a performance test concept you can discuss with the experts that should define the requirements.
Make assumptions for everything you don't know and document these assumptions explicitly. Assumptions comprise everything that may be relevant to your system's behaviour under load. Correct assumptions will be approved by the experts, incorrect ones will provoke reactions.
For all of those who have read Tom DeMarcos latest book (Adrenaline Junkies ...): This is the strawman pattern. Most people who are not willing to write some specification from scratch will not hesitate to give feedback to your document. Because you need to guess several times when writing your version you need to prepare for being laughed at when being reviewed. But at least you will have better information.
The way I usually approach problems like this is just to get a real or simulated realistic workload and make the program go as fast as possible, within reason. Then if it can't handle the load I need to think about faster hardware, doing parts of the job in parallel, etc.
The performance tuning is in two parts.
Part 1 is the synchronous part, where I tune each "thread", under realistic workload, until it really has little room for improvement.
Part 2 is the asynchronous part, and it is hard work, but needs to be done. For each "thread" I extract a time-stamped log file of when each message sent, each message received, and when each received message is acted upon. I merge these logs into a common timeline of events. Then I go through all of it, or randomly selected parts, and trace the flow of messages between processes. I want to identify, for each message-sequence, what its purpose is (i.e. is it truly necessary), and are there delays between the time of receipt and time of processing, and if so, why.
I've found in this way I can "cut out the fat", and asynchronous processes can run very quickly.
Then if they don't meet requirements, whatever they are, it's not like the software can do any better. It will either take hardware or a fundamental redesign.
Although no clear performance and scalability goals are defined, we can use the high level description of the three components you mention to drive general performance/scalability goals.
Component 1: It seems like a network I/O bound component, so you can use any available network load simulators to generate various work load to saturate the link. Scalability can be measure by varying the workload (10MB, 100MB, 1000MB link ), and measuring the response time , or in a more precise way, the delay associated with receiving the raw data. You can also measure the working set of the links box to drive a realistic idea about your sever requirement ( how much extra memory needed to receive X more workload of packets, ..etc )
Component 2: This component has 2 parts, an I/O bound part ( receiving data from Component 1 ), and a CPU bound part ( assembling the packets ), you can look at the problem as a whole, make sure to saturate your link when you want to measure the CPU bound part, if is is a multi threaded component, you can look for ways to improve look if you don't get 100% CPU utilization, and you can measure time required to assembly X messages, from this you can calculate average wait time to process a message, this can be used later to drive the general performance characteristic of your system and provide and SLA for your users ( you are going to guarantee a response time within X millisecond for example ).
Component 3: Completely I/O bound, and depends on both your hard disk bandwidth, and the back-end database server you use, however you can measure how much do you saturate disk I/O to optimize throughput, how much I/O counts do you require to read X MB of data, and improve around these parameters.
Hope that helps.
Thanks