Stanford CoreNLP Server: disable logging

I have the feeling that the server's logging is quite verbose. Is there a way to disable or reduce the logging output? It seems that if I send a document to the server, it writes the content to stdout, which might be a performance killer.
Can I do that somehow?
Update
I found a way to suppress the output from the server. My question remains whether (and how) I can do this with a command-line argument for the actual server. As a dirty workaround, however, the following seems to ease the overhead.
Running the server with
java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -prettyPrint false 2&>1 >/dev/null
where >/dev/null redirects the output into nothing. Unfortunately this alone did not help; 2&>1 seems to do the trick here. I confess that I do not know what it's actually doing. However, I compared two runs.
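(As far as I can tell, assuming the shell is bash: 2&>1 is parsed as a stray argument 2 followed by &>1, which sends both stdout and stderr to a file literally named 1; the later >/dev/null then re-points stdout, so it is the stderr logging that ends up in that file. The conventional way to silence both streams would be
java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -prettyPrint false >/dev/null 2>&1
where 2>&1 means "send stderr to wherever stdout currently points", so the order of the two redirections matters.)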
Running with 2&>1 >/dev/null
Processed 100 sentences
Overall time: 2.1797 sec
Time per sentence: 0.0218 sec
Processed 200 sentences
Overall time: 6.5694 sec
Time per sentence: 0.0328 sec
...
Processed 1300 sentences
Overall time: 30.482 sec
Time per sentence: 0.0234 sec
Processed 1400 sentences
Overall time: 32.848 sec
Time per sentence: 0.0235 sec
Processed 1500 sentences
Overall time: 35.0417 sec
Time per sentence: 0.0234 sec
Running without additional arguments
ParagraphVectorTrainer - Epoch 1 of 6
Processed 100 sentences
Overall time: 2.9826 sec
Time per sentence: 0.0298 sec
Processed 200 sentences
Overall time: 5.5169 sec
Time per sentence: 0.0276 sec
...
Processed 1300 sentences
Overall time: 54.256 sec
Time per sentence: 0.0417 sec
Processed 1400 sentences
Overall time: 59.4675 sec
Time per sentence: 0.0425 sec
Processed 1500 sentences
Overall time: 64.0688 sec
Time per sentence: 0.0427 sec
This was a very shallow test, but it appears that this can have quite an impact. The difference here is a factor of 1.828, which is quite a difference over time.
However, this was just a quick test and I cannot guarantee that my results are completely sane!
Further update:
I assume that this has to do with how the JVM optimizes the code over time, but the time per sentence becomes comparable to the one I get on my local machine. Keep in mind that I got the results below using 2&>1 >/dev/null to eliminate the stdout logging.
Processed 68500 sentences
Overall time: 806.644 sec
Time per sentence: 0.0118 sec
Processed 68600 sentences
Overall time: 808.2679 sec
Time per sentence: 0.0118 sec
Processed 68700 sentences
Overall time: 809.9669 sec
Time per sentence: 0.0118 sec

You're now the third person that's asked for this :) -- Preventing Stanford Core NLP Server from outputting the text it receives. In the HEAD of the GitHub repo, and in versions 3.6.1 onwards, there's a -quiet flag that prevents the server from outputting the text it receives. Other logging can then be configured with SLF4J, if it's in your classpath.
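So, with a 3.6.1+ build on the classpath, the launch line from the question should reduce to something like
java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -quiet
and no shell redirection tricks should be needed.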

Related

How do we calculate the time difference between 90% and 95% (Aggregate Report)?

We have run 3000 samples and the total duration is 4 min 45 sec. How is it possible that 99% of requests take only 1 min 20 sec, and only 1% of requests take the rest of the time?
As per JMeter Glossary:
90% Line (90th Percentile) is the value below which 90% of the samples fall. The remaining samples took at least as long as the value. This is a standard statistical measure. See, for example, the Percentile entry at Wikipedia.
As per Request Stats Report
90% line (ms) 90th Percentile. 90% of the samples were smaller than or equal to this time.
95% line (ms) 95th Percentile. 95% of the samples were smaller than or equal to this time.
99% line (ms) 99th Percentile. 99% of the samples were smaller than or equal to this time.
So if you see e.g. 1500 ms as the 90% percentile, it means that 90% of the samplers' response times were 1500 ms or less; the remaining ones were higher.
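As a quick illustration, here is a rough one-liner for estimating the 90th percentile yourself from a one-number-per-line file of response times (times.txt is a hypothetical file, e.g. the elapsed column cut out of a .jtl; this is a simple nearest-rank estimate, so it may differ by a sample or two from JMeter's own figure):
sort -n times.txt | awk '{ a[NR] = $1 } END { print a[int(NR * 0.9 + 0.5)] }'
Everything at or below the printed value covers roughly 90% of the samples.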

Unable to analyse throughput performance of one API which has been migrated to a new platform

I am checking the performance of one API which runs on two systems. As the API has been migrated to a new system, I am comparing its performance against the old system.
The statistics are shown below:
New System:
Threads - 25
Ramp-up - ~25
Avg - 8 sec
Median - 7.8 sec
95th percentile - 8.8 sec
Throughput - 0.39
Old System:
Threads - 25
Ramp-up - ~25
Avg - 10 sec
Median - 10 sec
95th percentile - 10 sec
Throughput - 0.74
Here we can observe that the new system takes less time for 25 threads than the old system, yet the throughput is higher on the old system, even though the old system takes more time per request.
I am confused about the throughput: which system is more efficient?
The one that takes less time should have more throughput, but here the one with the lower response times shows the lower throughput, which I cannot reconcile.
Can anyone help me here?
As per JMeter Glossary:
Throughput is calculated as requests/unit of time. The time is calculated from the start of the first sample to the end of the last sample. This includes any intervals between samples, as it is supposed to represent the load on the server.
The formula is: Throughput = (number of requests) / (total time)
So double-check the total test duration for both test executions; my expectation is that the "new system" test took longer.
As for the reason, I cannot state anything meaningful without seeing the full .jtl results files for both executions. I can only assume that there was one very long request in the "new system" test, or that you have a Timer with random think time somewhere in your test plan.
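If you want to verify the reported throughput yourself, a rough sketch straight from the results file (assuming the default CSV .jtl layout, where the first two columns are timeStamp and elapsed, both in milliseconds; results.jtl is a placeholder name):
awk -F, 'NR > 1 { n++; if (min == "" || $1 < min) min = $1; if ($1 + $2 > max) max = $1 + $2 } END { print n / ((max - min) / 1000), "requests/sec" }' results.jtl
This mirrors the glossary definition: the number of requests divided by the span from the start of the first sample to the end of the last one.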

Calculate Average Response Time in JMeter

(Attached as image)
"In My summary report
Total Samplers = 11944
My total Average response = 2494 mili-second = 2.49 seconds.
What i understand from here 11944 samplers are processed in average of 2.49 seconds.That means my test actually should processed for 11944 x 2.49 Seconds = 82 hours.But it actually ran about 15-20 mints max.
So trying to understand,is it reduced execution time due to JMeter parallel/multiple thread execution or i am understanding it wrong way.
I want to know a single request average response time"
JMeter calculates the average response time as:
the sum of all samplers' response times
divided by the number of samplers
Basically, it's the arithmetic mean of all samplers' response times.
11944 x 2.49 / 3600 gives 8.2 hours, and yes, this is how much time it would take to execute the test with a single user; the time reduces proportionally with the number of threads used.
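A quick way to sanity-check that arithmetic from a shell (bc is just one convenient calculator):
echo '11944 * 2.49 / 3600' | bc -l
which prints roughly 8.26.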
More information:
Calculator class source code
JMeter Glossary
Understanding Your Reports: Part 2 - KPI Correlations
It depends on the number of threads you used.
For example, if you used 50 threads for 12K samples/requests and each one took (an average of) 2.5 seconds:
12000 * 2.5 / 50 / 60 = 10 minutes
i.e. requests x average seconds per request, divided by the number of threads, divided by 60 seconds per minute.

Oracle AWR units of db time per sec (WORKLOAD REPOSITORY COMPARE PERIOD REPORT)

I'm getting a WORKLOAD REPOSITORY COMPARE PERIOD REPORT that says:
Load Profile
1st per sec
DB time: 1.3
I'm confused; DB time should be in time units, shouldn't it?
Below are the context and history of what I've researched about AWR reports and how I eventually came to the answer I posted.
I have an AWR report that says:
Elapsed Time (min): 60
DB time (min): 80
As explained, e.g., here https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:229943800346471782 , DB time can exceed elapsed time. And DB time is time, measured in time units (min = minutes?), so far so good.
Then Load Profile says:
1st per sec
DB time: 1.3
If DB time is 80 minutes within 60 minutes, then by per-second math it should be 80/60/60; where did that division by 60 to get per-second go?
EDIT: my guess now, after the question has been posted, is that this metric is in seconds, although the units are not mentioned in the AWR and I could not find anything about it on the web with an "awr db time in sec" search. Please provide a link where this is confirmed for sure (if it is so).
EDIT 2: the WORKLOAD REPOSITORY report says "DB Time(s): per sec" in the Load Profile section, whereas the WORKLOAD REPOSITORY COMPARE PERIOD REPORT just says "DB time per sec". So now with something like 99% assurance I can guess the compare report uses the same units, but it's still not a 100% certain fact. I actually get the reports via an automated system, so I cannot be sure they weren't mangled along the way...
P.S. By the way, I tried to pretty-format the output by inserting tabs, but could not find out how; e.g. here, Tab space in Markdown, it says that's not possible in Markdown. Please add a comment if it can be done on Stack Overflow.
My guess is that, due to the amount of info that has to fit on one line of the compare AWR, the developers of the report decided to drop the "(s):" that is present in the same place in the ordinary (non-compare) AWR.
I've looked at a WORKLOAD REPOSITORY report: it says "DB Time(s): per sec 1.4" in the Load Profile section, whereas the WORKLOAD REPOSITORY COMPARE PERIOD REPORT just says "Db time per sec" and states "2nd 1.4". So now with something like 99% assurance I can guess the compare report uses the same units, seconds, for the per-second metric. Not 100% sure, but what are the things we are 100% sure about anyway?
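If the units really are seconds, the arithmetic from the question also checks out: 80 minutes of DB time accumulated over a 60-minute window is (80 x 60 s) / (60 x 60 s) ≈ 1.33 seconds of DB time per elapsed second, matching the 1.3 in the Load Profile. The extra division by 60 I was looking for cancels out, because both DB time and elapsed time get converted to seconds.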

How to reduce spring-shell startup time?

I'm using spring-shell and I want to reduce the startup time of the shell.
Right now it takes 8-10 seconds, and I want it to take less.
Do you have any suggestions?
By profiling I can see:
org.python.util.PythonInterpreter.exec(String) takes ~2 sec
org.python.core.imp.importOne(String, PyFrame, int) - importing Lib\_jyio.py from jython-standalone-2.7.0.jar takes ~1 sec
org.python.jsr223.PyScriptEngine.<init>(ScriptEngineFactory) takes ~0.5 sec
Thanks.
