pywikibot.exceptions.TimeoutError - how to modify the default - pywikibot

in my github actions unit test i am running some pywikibot code (pywikibot 6.6.3) that some times fails due to the site not being responsive or a misconfiguration. The log report used to show the error messages after a few minutes.
Now the code runs some 2 hours and more with hint such as:
... Waiting 120.0 seconds before retrying.
File "/opt/hostedtoolcache/Python/3.9.7/x64/lib/python3.9/site-packages/pywikibot/data/api.py", line 1883, in wait
Error: raise TimeoutError('Maximum retries attempted without success.')
pywikibot.exceptions.TimeoutError: Maximum retries attempted without success.
But there is no mention how to change the retries and timeout values?
I found the source code at
https://github.com/wikimedia/pywikibot/blob/master/pywikibot/data/api.py
But since the api init is called indirectly i need to know how the configuration is to be done to get less retries and quicker timeout
https://stackoverflow.com/a/39062902/1497139
even has a precise hint what to change but no source code example for this so i still don't know what the python code should look to modify outside of the config file during the initialization phase.
How could i change the settings to get a quick fail mode which is appropriate for the tests?

the style of setting the variables is:
import pywikibot
pywikibot.config.max_retries=2
I was just not believing that the wikimedia foundation still uses global variables for configuration ...

Related

Re-run Cypress test in Github Actions does not work

I have a cypress workflow in Github and it runs nicely. But, when the e2e tests fail for some reason and I want to re-run them using the re-run all jobs button (below), the following message appears:
The run you are attempting to access is already complete and will not accept new groups.
The existing run is: https://dashboard.cypress.io/projects/abcdef/runs
When a run finishes all of its groups, it waits for a configurable set of time before finally completing. You must add more groups during that time period.
The --tag flag you passed was:
The --group flag you passed was: core
What should I change in my configuration to make these possible? Sometimes the e2e fails because of a backend error that is fixed later.
I'd like to do this instead of a force e2e commit.
I was facing the same issue before.
I think you can try to pass GITHUB_TOKEN or add a custom build id. It fixed my issue. Hoep it helps.
https://github.com/cypress-io/github-action#custom-build-id
Check your Cypress Dashboard subscription plan. Mine got the free plan full (500 test for free and I was running in 3 different browsers 57 tests, so it got full pretty quick since this is 171 tests in one run) and after that it didn't allowed me to keep running or re running more parallel tests. Test kept running but in 1 machine out of 4 in the first browser and stages for the other 2 browsers started failing, I was able to allow the pipeline to not be failing by passing continueOnError: true in the configuration.
Quick edit, I don't remember where but I read that you could also add a delay to your pipeline and/or reduce the default wait on the Dashboard which is 60s(https://docs.cypress.io/guides/guides/parallelization#Run-completion-delay)

The JVM should have exited but did not

During Distributed testing with Jmeter 3.3 in non gui mode i'm getting error as, how can I fix this :
I'm using same version of JMeter and JDK on Master as well as Slave machines.
The JVM should have exited but did not.
The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[main,5,main],
stackTrace:java.net.SocketInputStream#socketRead0
java.net.SocketInputStream#socketRead
java.net.SocketInputStream#read
java.net.SocketInputStream#read
java.io.BufferedInputStream#fill
java.io.BufferedInputStream#read
java.io.DataInputStream#readByte
sun.rmi.transport.StreamRemoteCall#executeCall
sun.rmi.server.UnicastRef#invoke
java.rmi.server.RemoteObjectInvocationHandler#invokeRemoteMethod
java.rmi.server.RemoteObjectInvocationHandler#invoke
com.sun.proxy.$Proxy19#rrunTest
org.apache.jmeter.engine.ClientJMeterEngine#runTest at line:149
org.apache.jmeter.engine.DistributedRunner#start at line:132
org.apache.jmeter.engine.DistributedRunner#start at line:149
org.apache.jmeter.JMeter#runNonGui at line:1005
org.apache.jmeter.JMeter#startNonGui at line:910
org.apache.jmeter.JMeter#start at line:538
sun.reflect.NativeMethodAccessorImpl#invoke0
sun.reflect.NativeMethodAccessorImpl#invoke
sun.reflect.DelegatingMethodAccessorImpl#invoke
java.lang.reflect.Method#invoke
org.apache.jmeter.NewDriver#main at line:248
I strongly recommend using this jmeter property:
jmeterengine.force.system.exit=true
documented here. These Chinese language web pages link link tipped me off.
You can add -Jjmeterengine.force.system.exit=true on the command line when launching JMeter, or add jmeterengine.force.system.exit=true to JMETER_HOME/bin/jmeter.properties.
How I Confirmed This Fix
With JMeter 5.1 and java version "1.8.0_231" on MS-Win10, we're using a customized version of this JMeter InfluxDB backend Listener.
After my 60 second test run from the command line (jmeter.bat -n -t plan.jtl), the command line hung after displaying this output (very similar to op):
Tidying up ... # Wed Jan 29 14:41:04 CST 2020 (1580330464874)
... end of run
The JVM should have exited but did not.
The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[DestroyJavaVM,5,main], stackTrace:
Thread[pool-2-thread-3,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
Thread[pool-2-thread-4,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
Thread[pool-2-thread-1,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
After modifying my command line as follows, jmeter.bat cleanly exited instead of hanging and all the ugly stack trace went away too:
jmeter.bat -n -Jjmeterengine.force.system.exit=true -t plan.jtl
To confirm that the problem was caused by our customized JMeter InfluxDB backend Listener, I removed it from the .jmx and I also removed the jmeterengine.force.system.exit=true. No hang, no ugly stacktrace (I actually love stacktraces).
I have not taken the next step to discover whether the problem is with the official JMeter InfluxDB backend Listener or with our customized variant, which is not (and will never be) available publicly.
Should mention one gap in this story. I feel this test conclusively points to our customized backend listener (or jmeter's). However, its odd that none of the threads in the above thread dump seem to belong to the backend listener. So I applaud that JMeter did the right thing by dumping the stack trace -- few other apps go to the extent of auto-dumping when appropriate for troubleshooting. But in this case, perhaps that jmeter auto-dump code needs to be enhanced, because it did not point to the culprit backend listener code. Anyone over there at Apache listening in on this?
Good luck.
Most probably your JMeter engine(s) is(are) overloaded therefore cannot gracefully shut down running threads when you request them to do so.
Make sure you follow JMeter Best Practices
The very first "best practice" states Always use latest version of JMeter so consider migrating to JMeter 5.0 or whatever latest version is available at JMeter Downloads page
Make sure your JMeter instances have enough headroom to operate in terms of CPU, RAM and so on. You can use JMeter PerfMon Plugin for this if you don't have other monitoring software in place/in mind.
Take a thread dump and examine it - this way you will know where exactly your test is stuck
Introduce reasonable timeout values in HTTP Request Defaults so in case when server fails to respond JMeter wouldn't wait infinitely but rather fail with an error
And finally (however I wouldn't recommend this) you can suppress this check by adding the next line to user.properties file:
jmeter.exit.check.pause=-1
if you go for this keep in mind that you may run into a situation when JMeter slaves will still be trying to execute something even after your test ends so you will need to kill and restart the processes manually or using a script.

Distributed JMeter test fails with java error but test will run from JMeter UI (non-distributed)

My goal is to run a load test using 4 Azure servers as load generators and 1 Azure server to initiate the test and gather results. I had the distributed test running and I was getting good data. But today when I remote start the test 3 of the 4 load generators fail with all the http transactions erroring. The failed transactions log the following error:
Non HTTP response message: java.lang.ClassNotFoundException: org.apache.commons.logging.impl.Log4jFactory (Caused by java.lang.ClassNotFoundException: org.apache.commons.logging.impl.Log4jFactory)
I confirmed the presence of commons-logging-1.2.jar in the jmeter\lib folder on each machine.
To try to narrow down the issue I set up one Azure server to both initiate the load and run JMeter-server but this fails too. However, if I start the test from the JMeter UI on that same server the test runs OK. I think this rules out a problem in the script or a problem with the Azure machines talking to each other.
I also simplified my test plan down to where it only runs one simple http transaction and this still fails.
I've gone through all the basics: reinstalled jmeter, updated java to the latest version (1.8.0_111), updated the JAVA_HOME environment variable and backed out the most recent Microsoft Security update on the server. Any advice on how to pick this problem apart would be greatly appreciated.
I'm using JMeter 3.0r1743807 and Java 1.8
The Azure servers are running Windows Server 2008 R2
I did get a resolution to this problem. It turned out to be a conflict between some extraneous code in a jar file and a component of JMeter. It was “spooky” because something influenced the load order of referenced jar files and JMeter components.
I had included a jar file in my JMeter script using the “Add directory or jar to classpath” function in the Test Plan. This jar file has a piece of code I needed for my test along with many other components and one of those components, probably a similar logging function, conflicted with a logging function in JMeter. The problem was spooky; the test ran fine for months but started failing at the maximally inconvenient time. The problem was revealed by creating a very simple JMeter test that would load and run just fine. If I opened the simple test in JMeter then, without closing JMeter, opened my problem test, my problem test would not fail. If I reversed the order, opening the problem test followed by the simple test then the simple test would fail too. Given that the problem followed the order in which things loaded I started looking at the jar files and found my suspect.
When I built the script I left the jar file alone thinking that the functions I need might have dependencies to other pieces within the jar. Now that things are broken I need to find out if that is true and happily it is not. So, to fix the problem I changed the extension on my jar file to zip then edited it in 7-zip. I removed all the code except what I needed. I kept all the folders in the path to my needed code, I did this for two reasons; I did not have to update my code that called the functions and when I tried changing the path the functions did not work.
Next I changed the extension on the file back to jar and changed the reference in JMeter’s “Add directory or jar to classpath” function to point to the revised jar. I haven’t seen the failure since.
Many thanks to the folks who looked at this. I hope the resolution will help someone out.

Application.config "disappears" while debugging Azure WebRole if I pause at a breakpoint for too long

I've noticed this off and on. If I'm locally debugging my Azure WebRole, in Visual Studio 2013, and I pause at a break-point for too long, the current request, or the next one, and all subsequent requests, will result in a 500.19 - Internal Server Error.
HTTP Error 500.19 - Internal Server Error
The requested page cannot be accessed because the related configuration data for the page is invalid.
Detailed Error Information:
Module CustomErrorModule
Notification SendResponse
Handler Not yet determined
Error Code 0x80070490
Config Error The configuration section 'system.webServer/httpErrors' cannot be read because it is missing a section declaration
Config File \\?\C:\Users\<user>\AppData\Local\dftmp\Resources\11468ba0-d99a-45d2-bcce-eae28c7b4e2f\temp\temp\RoleTemp\applicationHost.config
Requested URL <request url>
Physical Path
Logon Method Not yet determined
Logon User Not yet determined
Event viewer says:
The worker process for application pool '902aa9af-0ed8-4126-be43-e533339bdeef' encountered an error 'Cannot read configuration file' trying to read global
module configuration data from file '\\?\C:\Users\<user>\AppData\Local\dftmp\Resources\11468ba0-d99a-45d2-bcce-eae28c7b4e2f\temp\temp\RoleTemp\applicationHost.config',
line number '0'. Worker process startup aborted.
And if I go check that directory I see that the file isn't there. I'm guessing the file WAS there at some point, because I was able to make requests prior to that.
Restarting the WebRole seems to fix the problem.
99% of the time this isn't a problem, so I'm sure there isn't actually an error in the config file, it just makes debugging "stressful" since I always feel like I'm under some kind of time limit. :(

ORA-01013 - Weblogic setting error?

I am running a banking program, coded in Oracle PL/SQL. This program runs for 2-3 hours everyday, as part of the End of Day processing.
Till yesterday, it was working fine. Today when I run it today, after around 30 mins, the program terminates with the error ORA-01013: user requested cancel of current operation. I am not terminating the program manually.
I feel this could be a weblogic (where the application is deployed) setting problem. I am not fluent in weblogic and am not sure what parameter can be changed to stop this error. Please help!!!
Oracle version: 11.2.0.3
Oracle weblogic server: 11g
This sounds like a JDBC timeout. From the WebLogic console go to Services->Data Sources and click the name of your data source to see its settings. Click the Connection Pool tab, and expand the Advanced section at the bottom of the page. Look for the Statement Timeout setting.
From the documentation:
When Statement Timeout is set to -1, (the default) statements do not timeout.
The behaviour you're seeing suggests the timeout is set to 1800 if it's timing out after 30 minutes.
However, this won't have changed on its own, and if it was already set then it was being ignored previously, which would need some investigation - has anything else changed?
Another possibility is that your code is making several calls within the 3-4 hour window and one of them is now exceeding the timeout on its own, which might be the case if the timeout is lower than 1800. Without seeing your code or the current timeout value I'm just guessing, obviously.

Resources