I am accessing the CoreNLP Server from a Python script running in JupyterLab. I am using the full annotator suite to extract quotes from newspaper articles.
request_params={'annotators': "tokenize,ssplit,pos,lemma,ner,depparse,coref,quote",...
Against the recommended 2 GB, I have allocated 4 GB, and yet the quote annotator fails to load. Windows Task Manager shows memory utilization above 94% for long periods.
Where can I get a list of options that I can tune to improve memory use?
The coreference models are probably the main culprit. If you don't care about quote attribution, you can set -quote.attributeQuotes false and drop coref from the annotator list; you will lose quote attributions, but the heavy coref models won't be loaded.
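Expressed as server properties (the same keys can be passed as request parameters, as in the question above), the trimmed configuration would look roughly like this; a sketch, with coref dropped from the annotator list:
annotators = tokenize,ssplit,pos,lemma,ner,depparse,quote
quote.attributeQuotes = false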
I'm not sure of the exact amount, but I think you should be fine in the 6-8 GB range for running the entire pipeline presented in your question. The models used do take up a lot of memory. I don't think the options you have set in your comment ("useSUTime", "applyNumericClassifiers") will affect the memory footprint at all.
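For reference, the server's heap is fixed when you launch it, so the extra memory has to be allocated there. A sketch, assuming you start the server yourself with the stock command and want a 6 GB heap:
java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000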
In a Rails 3.2.x app, using (Re)tire to access an ES cluster, a rake task is going through approx. 1M rows to create a new index (Ruby 1.9.3).
The task is using .to_json with specific attributes and methods listed to limit the resulting hash for each element.
Yet as the task runs, memory is eaten away, and the process usually ends up being killed by the system.
The task is already using find_in_batches. Smaller batch sizes (using find_each) don't help.
checking without index
Removing the index.import call does improve things (obviously): the task goes through the whole collection very fast without a problem. That points to either ES, Tire, or the JSON conversion (and the relations it might call upon).
reducing the scope of the task
Adding back index.import and passing a very limited hash (with string keys) for each item does make things slower, but not too much, and does not eat memory away. So JSON might not be the culprit here.
adding attributes and methods back
The culprit seems to be one of the methods used to grab one of the additional attributes. It's based on a relation between the model and another ... ending up with a lot of models being involved and sifted through.
As pointed out in Index the results of a method in ElasticSearch (Tire + ActiveRecord), adding includes does help a bit, but the task still ends up heavy.
going around
I also tried to go around part of the problem and replace the calls to Tire with direct use of the ES bulk API.
Generating JSON files and sending them with a Ruby HTTP lib can work. Yet the same memory problem arises, since the same requests are made to the DB.
What's left ?
What I don't get is why, even with find_in_batches, Ruby keeps eating away memory. I would expect that after each batch of data, the memory related to that batch would be freed.
Next to try: GC.start calls, deactivating ActiveRecord caching around the task.
Still, unless a solution reduces memory use drastically (to 300 or 500 MB instead of 800+ MB), the underlying issue remains: indexing a lot of instances of a model, including data related to some other models.
Am I missing something about import and includes that would solve the issue?
Would splitting the task into smaller background jobs (Resque, Sidekiq) help? I would suppose so, as each batch would be isolated from the others and, once processed, would really free up the memory(?). (Orchestrating those jobs would be another problem.)
Are there good practices for indexing large quantities of data into ES?
I've been using Rails + Elasticsearch for a while and did this kind of dance a few times.
A few things come to mind, in no particular order.
Have you tried the recent elasticsearch gem (instead of Tire)? I've updated my apps to use it, and I like having more control over what is done.
I would also try to force a GC sweep after each ActiveRecord loop. You could also be extra careful with memory allocation by explicitly resetting local variables each time.
You could use the fork & exec trick to fork a brand-new process at each loop; it would be the most effective GC you can get. There's a little overhead the first time you write it, but the pay-off is great. Take good care to limit the amount of memory used in the outer part of the task. Using a process-based background task would partly achieve the same goal, but you might still get memory bloat.
Can you limit the use of ActiveRecord? If you only need some basic associations, you could use a lower-level/simpler tool like Sequel (or similar) to work with Ruby hashes/arrays instead of full-fledged AR models.
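As a sketch of several of these suggestions combined (batched reads, a minimal hash per row, and a forced GC sweep per loop), using the elasticsearch gem's bulk API; Article and author are hypothetical stand-ins for your models:
require 'elasticsearch'

client = Elasticsearch::Client.new(url: 'http://localhost:9200')

Article.includes(:author).find_in_batches(batch_size: 500) do |batch|
  body = batch.map do |article|
    # keep the indexed document as small and flat as possible
    { index: { _index: 'articles', _id: article.id,
               data: { title: article.title, author: article.author.name } } }
  end
  client.bulk(body: body)
  body = nil  # drop the reference so the whole batch is collectable
  GC.start    # force a sweep after each ActiveRecord loop
end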
I have a script server that runs arbitrary JavaScript code on our servers. At any given time multiple scripts can be running, and I would like to prevent one misbehaving script from eating up all the RAM on the machine. I could do this by having each script run in its own process and having an off-the-shelf monitoring tool watch the RAM usage of each process, killing and restarting the ones that get out of hand. I don't want to do this because I would like to avoid the cost of restarting the binary every time one of these scripts goes crazy. Is there a way in V8 to set a per-context/isolate memory limit that I can use to sandbox the running scripts?
It should be easy to do now
context.EstimatedSize() to get the estimated size of the context
isolate.TerminateExecution() when the context goes beyond acceptable memory/CPU usage/whatever
To regain control when there is an infinite loop (or something else blocking, like a heavy CPU calculation), I think you could use isolate.RequestInterrupt()
A single process can run multiple isolates; if you have a 1-isolate-to-1-context ratio, you can easily:
restrict memory usage per isolate
get heap stats
See some examples in this commit:
https://github.com/discourse/mini_racer/commit/f7ec907547e9a6ea888b2587e4edee3766752dd3
In particular you have:
v8::HeapStatistics stats;
isolate->GetHeapStatistics(&stats);
There are also fancy features like memory allocation callbacks you can use.
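Since the commit above is from mini_racer, here is what this looks like from the Ruby side; a sketch, with option and method names as in recent mini_racer releases (check your version):
require 'mini_racer'

# one isolate per context, with a hard heap cap (bytes) and an eval timeout (ms)
context = MiniRacer::Context.new(max_memory: 200_000_000, timeout: 1_000)

context.eval('var a = [1, 2, 3]')
p context.heap_stats  # used_heap_size, heap_size_limit, etc.

begin
  context.eval('let s = "x"; while (true) { s += s }')  # runaway allocation
rescue MiniRacer::V8OutOfMemoryError
  puts 'script exceeded its memory budget and was killed'
end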
This is not reliably possible.
All JavaScript contexts in this process share the same object heap.
WebKit/Chromium tries some tricks to disable contexts after a context hits OOM.
http://code.google.com/searchframe#OAMlx_jo-ck/src/third_party/WebKit/Source/WebCore/bindings/v8/V8Proxy.cpp&exact_package=chromium&q=V8Proxy&type=cs&l=361
Sources:
http://code.google.com/p/v8/source/browse/trunk/src/heap.h?r=11125&spec=svn11125#280
http://code.google.com/p/chromium/issues/detail?id=40521
http://code.google.com/p/chromium/issues/detail?id=81227
I run Windows 7 RC1, which uses the same Windows Task Manager (WTM) as Vista. When I look at the processes, there are some columns whose differences I'm not sure about:
Memory - working set
Memory - private working set
Memory - commit size
Can anyone tell me what they are?
From the following article, under the section Types of Memory Usage:
There are two main types of memory usage: working set and private working set. The private working set is the amount of memory used by a process that cannot be shared among other processes, while working set includes the memory shared by other processes.
That may sound confusing, so let's try to simplify it a bit. Let's pretend that there are two kids who are coloring, and both of the kids have 5 of their own crayons. They decide to share some of their crayons so that they have more colors to choose from. When each child is asked how many crayons they used, both of them say they used 7 crayons, because they each shared 2 of their crayons.
The point of that metaphor is that one might assume that there were a total of 14 crayons if they didn’t know that the two kids were sharing, but in reality there were only 10 crayons available. Here is the rundown:
Working Set: This includes all of the shared crayons, so the total would be 14.
Private Working Set: This includes only the crayons that each child owns, and doesn’t reflect how many were actually used in each picture. The total is therefore 10.
This is a really good comparison to how memory is measured. Many applications reuse code that you already have on your system, because in the end it helps reduce the overall memory consumption. If you are viewing the working set memory usage you might get confused because all of your running processes might actually add up to more than the amount of RAM you have installed, which is the same problem we had with the crayon metaphor above. Naturally the working set will always be larger than the private working set.
Working set:
The working set is the subset of a process's virtual pages that are resident in physical memory; this will be a partial set of the pages belonging to that process.
Private working set:
The private working set is the amount of memory used by a process that cannot be shared among other processes
Commit size:
Amount of virtual memory that is reserved for use by a process.
And at microsoft.com you can find more details about other memory types.
'Working Set' is the amount of memory that the process currently has in physical RAM. In other words, accessing any pages in the 'Working Set' will not cause a page fault since the page is in RAM.
As for the other two, I'm not 100% sure: probably 'Working Set' contains shareable memory, such as memory-mapped files, and 'Private Working Set' contains only pages that the process can use and that are not shareable.
Have a look at this site and search for the speaker 'Dave Solomon'. There is an excellent webcast he gave that explains Windows memory, and he mentions working set, commit sizes, and other memory terms.
EDIT:
Those site links are indeed dead :(
Instead, you can search Google for
vimeo david solomon windows
Those same videos look to be available on Vimeo now, which is cool.
If you open the Resource Monitor from the WTM, hovering over the various column headings of the process of interest displays a pretty informative tooltip.
e.g.
Commit(KB): Amount of virtual memory reserved by the operating system for the process in KB.
etc.
This article at Microsoft seems to be the most detailed.
Edit Oct 2018: new link
I'm developing a web application with Merb and I'm looking for a safe and stable image-processing library. I used to work with Imagick in PHP, then moved to Ruby and started using RMagick. But there is a problem: long-running scripts cause memory leaks. A couple of solutions exist, but I don't know which one is the most stable. So, what do you think?
Right now, my app uses an internal API that I wrote in PHP to process images. It's running on a separate server along with other applications, so it's not a big problem. But I think it's not a good architecture.
Anyway, I'll consider any practical tips.
I too have encountered this issue - the solution is to force garbage collection.
When you have reassigned the image variable to a new image, simply use GC.start to ensure the old reference is released from memory.
On later versions of RMagick, I believe you can also call destroy! on the image when you have finished processing it.
A combination of the two would probably ensure you are covered, but I'm not sure of the real-life impact on performance (I would assume it is negligible in most cases).
Alternatively, you could use mini-magick, which is a wrapper for the ImageMagick command-line client.
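mini-magick shells out to the command-line tools for each operation, so the pixel data never lives in the Ruby heap at all. A minimal sketch (file names are hypothetical):
require 'mini_magick'

image = MiniMagick::Image.open('input.jpg')  # works on a tempfile copy
image.resize '800x600'
image.format 'png'
image.write 'output.png'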
When using RMagick it's important to remember to destroy the image once you are done; otherwise you will fill up the /tmp dir when working with large sets of images. For example, you must call destroy!
require 'RMagick'

Dir.foreach('/home/tiffs/') do |file|
  next if file == '.' or file == '..'
  # Dir.foreach yields bare file names, so build the full path to read from
  image = Magick::Image.read(File.join('/home/tiffs/', file)).first
  image.format = "PNG"
  image.write("/home/png/#{File.basename(file, '.*')}.png")
  image.destroy!
end
Actually, it isn't really a Ruby-specific problem; other interpreters share it as well. The concrete problem is that Ruby's GC only sees memory that was allocated by Ruby itself, not memory allocated by external libraries (with the notable exception of libraries that use Ruby's memory-management facilities). So an ImageMagick object in Ruby memory space is really small, but the image in the space managed by ImageMagick is large. This is not a leak per se, but it behaves like one.
Ruby's garbage collector never kicks in as long as your process stays under a certain limit (8 MB is the default). As ImageMagick never creates large objects in Ruby space, the GC probably never runs. So either use the proposed method of spawning a new process, or use exec. Another rather nifty option is to have an image-processing service in the backend that forks for every task. Yet another would be to have some kind of monitoring in place that kick-starts the GC every once in a while.
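That last option can be as simple as a background thread that nudges the GC on a timer; a sketch (the interval is arbitrary, tune it to your workload):
Thread.new do
  loop do
    sleep 60  # hypothetical interval
    GC.start  # force a collection even if the Ruby heap looks small
  end
end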
There is another library called MagickWand, by Timothy Paul Hunter (the author of RMagick), that tries to address these issues and provide a nicer API. It's in alpha and requires a rather new release of ImageMagick, though.
Now you can tell ImageMagick which memory space should be used.
I think RMAGICK_ENABLE_MANAGED_MEMORY = true and GC.start is what you need.
MANAGED_MEMORY
If true, RMagick is using Ruby managed memory for all allocations. If false, RMagick allocates memory for objects directly from the operating system. You can enable RMagick to use Ruby managed memory (when built with ImageMagick 6.4.0-11 and later) by setting
RMAGICK_ENABLE_MANAGED_MEMORY = true
before requiring RMagick.
https://rmagick.github.io/constants.html
However, image.destroy! itself is enough to stabilize the memory consumption.
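Putting those two together, a minimal sketch (assuming an RMagick build against ImageMagick 6.4.0-11 or later; the file names are hypothetical):
RMAGICK_ENABLE_MANAGED_MEMORY = true  # must be set before the require
require 'rmagick'  # older releases use require 'RMagick'

image = Magick::Image.read('input.tiff').first
image.format = 'PNG'
image.write('output.png')
image.destroy!  # release the pixel data explicitly
GC.start        # the allocations are now visible to Ruby's GC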
This is not due to ImageMagick; it's due to Ruby itself, and it's a well known problem. My suggestion is to split your program into two parts: a long-running part that allocates little memory and just deals with the control of the system, and a separate program that actually does the processing work. The long-running control process should do just enough to find some work for a child process that it spawns, and the child should do all of the processing for that particular work item.
Another option would be to leave the two combined, but after a work unit is complete, use exec to replace your process with a freshly started version of the same program, which would search for another work item, process it, and exec itself again.
This is assuming that the work items are fairly large, which they almost certainly are if you're using ImageMagick. If they're not, you'll find that the overhead of spawning a new process and having the Ruby interpreter re-parse your entire program starts to get a little too large. You can deal with this by having your program do more work units (say, ten or a hundred) before re-executing itself.
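A sketch of the spawn-per-work-item pattern described above (next_work_item and process_item are hypothetical stand-ins for your own code):
loop do
  item = next_work_item
  break if item.nil?

  pid = fork do
    process_item(item)  # all heavy allocation happens in the child
  end
  Process.wait(pid)  # child exits; the OS reclaims all of its memory at once
end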
Is there any way to set a system-wide memory limit a process can use in Windows XP? I have a couple of unstable apps that work OK most of the time but can hit a bug that eats the whole memory in a matter of seconds (or at least I suppose that's what happens). This results in a hard reset, as Windows becomes totally unresponsive and I lose my work.
I would like to be able to do something like /etc/limits on Linux: setting M90, for instance, to cap a single user at 90% of memory, so the system keeps the remaining 10% no matter what.
Use Windows Job Objects. Jobs are like process groups and can limit memory usage and process priority.
Use the Application Verifier (AppVerifier) tool from Microsoft.
In my case I needed to simulate memory no longer being available, so I did the following in the tool:
Added my application
Unchecked Basic
Checked Low Resource Simulation
Changed TimeOut to 120000 - my application will run normally for 2 minutes before anything goes into effect.
Changed HeapAlloc to 100 - 100% chance of heap allocation error
Set Stacks to true - the stack will not be able to grow any larger
Saved
Started my application
After 2 minutes my program could no longer allocate new memory and I was able to see how everything was handled.
Depending on your applications, it might be easier to limit the memory the language interpreter uses. For example with Java you can set the amount of RAM the JVM will be allocated.
Otherwise it is possible to set it once for each process with the Windows API:
SetProcessWorkingSetSize Function
No way to do this that I know of, although I'm very curious to read if anyone has a good answer. I have been thinking about adding something like this to one of the apps my company builds, but have found no good way to do it.
The one thing I can think of (although not directly on point) is that I believe you can limit the total memory usage for a COM+ application in Windows. It would require the app to be written to run in COM+, of course, but it's the closest way I know of.
The working set stuff is good (Job Objects also control working sets), but that's not total memory usage, only real memory usage (paged in) at any one time. It may work for what you want, but afaik it doesn't limit total allocated memory.
Per process limits
From an end-user perspective, there are some helpful answers (and comments) at the superuser question “Is it possible to limit the memory usage of a particular process on Windows”, including discussions of how to set recursive quota limits on any or all of:
CPU assignment (quantity, affinity, NUMA groups),
CPU usage,
RAM usage (both ‘committed’ and ‘working set’), and
network usage,
… mostly via the built-in Windows ‘Job Objects’ system (as mentioned in @Adam Mitz’s answer and @Stephen Martin’s comment above), using:
the registry (for persistence, when desired) or
free tools, such as the open-source Process Governor.
(Note: nested Job Objects may not have been available under all earlier versions of Windows, but the un-nested version appears to date back to Windows XP)
Per-user limits
As far as overall per-user quotas:
??
It is possible that each user session is automatically assigned to a job group itself; if true, per-user limits could be applied to that job group. Update: nope; Job Objects can only be nested at the time they are created or associated with a specific process, and in some cases a child Job Object is allowed to ‘break free’ from its parent and become independent, so they can’t facilitate ‘per-user’ resource limits.
(NTFS does support per-user file-system storage quotas, though)
Per-system limits
Besides simple BIOS or ‘energy profile’ restrictions:
VM hypervisor or Kubernetes-style container resource limit controls may be the most straightforward (in terms of end-user understandability, at least) option.
Footnotes, regarding per-process and other resource quotas / QoS for non-Windows systems:
‘Classic’ Mac OS (including ‘classic’ applications running on 2000s-era versions of Mac OS X): per-application memory limits can be easily set within the ‘Memory’ section of the Finder ‘Get Info’ window for the target program; since the system used a cooperative multitasking concurrency model, per-process CPU limits were impossible.
BSD: ? (probably has some overlap with Linux and non-proprietary macOS methods?)
macOS (aka ‘Mac OS X’): no user-facing interface; system support includes, depending on version, the ‘Multiprocessing Services API’, Grand Central Dispatch, POSIX threads / pthread, ‘operation objects’, and possibly others.
Linux: ‘Resource Manager’/limits.conf, control groups/‘cgroups’, process priority/‘niceness’/renice, others?
IBM z/OS and other mainframe-style systems: resource controls / allocation was built-in from nearly the beginning