Performance issues mongo on windows

Performance issues mongo on windows - windows

We have built a system where videos are stored in mongodb. The videos are each a couple of hundred megabytes in size. The system is built in python3 using mongoengine. The c extensions of pymongo and bson are installed.
The definition of the mongoengine documents is:
class VideoStore(Document, GeneralMixin):
video = EmbeddedDocumentListField(SingleFrame)
mutdat = DateTimeField()
_collection = 'VideoStore'
def gen_video(self):
for one_frame in self.video:
yield self._get_single_frame(one_frame)
def _get_single_frame(self, one_frame):
if one_frame.frame.tell() != 0:
one_frame.frame.seek(0)
return pickle.loads(one_frame.frame.read())
class SingleFrame(EmbeddedDocument):
frame = FileField()
Reading a video in Linux takes about 3 to 4 seconds. However running the same code in Windows takes 13 to 17 seconds.
Does anyone out there have any experience with this problem and any kind of solution?
I have thought of and tested (to no avail):
increasing the chunksize
reading the video as a single blob without using yield
Storing the file as a single blob (so no storing of separate frames)

Use Linux, Windows is poorly supported. The use of "infinite" virtual memory among other things causes issues with Windows variants. This thread elaborates further:
Why Mongodb performance better on Linux than on Windows?

Related

What can be done to lower UE4Editor startup time?

Status: the problem lowered, but compared to other users reports it persists.
I have moved to UE4.27.0 and the startup time lowered from 11 (v4.26.2) to 6 minutes! (the RAM usage lowered too!) But doesnt compare to the speed other ppl report "almost instantly"...
It is not compiling anything, not even shaders, it is like the 6th time I run it for one project.
Should I try to disable plugins? but Im new with UE and dont want to difficult my usage. Tho, for ex., I have nothing VR related to test so it could really be initially disabled.
HD READ SPEED? NO
I have tested moving UE4Editor whole engine path (100GB) to a 3xSSD(Stripes), but the UE4Editor startup time remained the same. My HD were it is too, is fast but not so fast as the 3xSSD.
CPU USAGE? MAY BE if it could use 4 cores could solve it?
UE4Editor startup uses A SINGLE CORE ONLY, i can confirm with htop and system monitor, it is possible to see only a single core being used 100% and it changes between the 4 cores, so only one is used at 100% per time.
I tested this command line parameter -USEALLAVAILABLECORES after the project URL for UE4Editor, but nothing changed. I read that option is ignored in some machines, so may be if I patch it's usage it could work on mine?
GPU? no?
a report about an integrated graphics card (weak one) says it doesnt interfere with the startup time.
LOG for UE4Editor v4.27.0 with the new biggest intervals ("..." means ommited log lines to make it easier to read; "!(interval in seconds)" is just to easy reading it (no ommitted lines here)):
[2021.09.15-23.38.20:677][ 0]LogHAL: Linux SourceCodeAccessSettings: NullSourceCodeAccessor
!22s
[2021.09.15-23.38.42:780][ 0]LogTcpMessaging: Initializing TcpMessaging bridge
[2021.09.15-23.38.42:782][ 0]LogUdpMessaging: Initializing bridge on interface 0.0.0.0:0 to multicast group 230.0.0.1:6666.
!16s
[2021.09.15-23.38.58:158][ 0]LogPython: Using Python 3.7.7
...
[2021.09.15-23.39.01:817][ 0]LogImageWrapper: Warning: PNG Warning: Duplicate iCCP chunk
!75s
[2021.09.15-23.40.16:951][ 0]SourceControl: Source control is disabled
...
[2021.09.15-23.40.26:867][ 0]LogAndroidPermission: UAndroidPermissionCallbackProxy::GetInstance
!16s
[2021.09.15-23.40.42:325][ 0]LogAudioCaptureCore: Display: No Audio Capture implementations found. Audio input will be silent.
...
[2021.09.15-23.41.08:207][ 0]LogInit: Transaction tracking system initialized
!9s
[2021.09.15-23.41.17:513][ 0]BlueprintLog: New page: Editor Load
!23s
[2021.09.15-23.41.40:396][ 0]LocalizationService: Localization service is disabled
...
[2021.09.15-23.41.45:457][ 0]MemoryProfiler: OnSessionChanged
!13s
[2021.09.15-23.41.58:497][ 0]LogCook: Display: CookSettings for Memory: MemoryMaxUsedVirtual 0MiB, MemoryMaxUsedPhysical 16384MiB, MemoryMinFreeVirtual 0MiB, MemoryMinFreePhysical 1024MiB
SPECS:
I'm using ubuntu 20.04.
My CPU is 4 cores 3.6GHz.
GeForce GT 710 1GB.
Related question but for older UE4: https://answers.unrealengine.com/questions/987852/view.html

Unreal Engine needs a high-end pc with a lot of RAM, fast SSD's, a good CPU and a medium graphic card. First of all there are always some shaders that needs to be compiled from the engine, and a lot of assets to be loaded in the startup time. As I can see you're on Linux you are probably using a self-compiled Unreal Engine version.... not the best thing to do for a newbie, because this may cause several problems on load time, startup, compiling and a lot of other stuff. If it's the first times you're using Unreal, try using it on Windows, it's all easier.

Can I make GhostScript use more than 2 GB of RAM?

I'm running a 64-bit version of GhostScript (9.50) on 64-bit processor with 16gb of RAM under Windows 7.
GhostScript returns a random-ish error message (it will tell me that I have type error in the array command) when I try to allocate one too many arrays totaling more than 2 GBs of RAM.
To be clear, I am seeing how growth of the memory usage in Windows Task Monitor, not from within GhostScript
I'd like to know why this is so.
More importantly, I'd like to know if I can override this behavior.
Edit: This code produces the error --
/TL 25000 def
/TL- TL 1 sub def
/G TL array def
0 1 TL- { dup == flush G exch TL array put }for
The error looks like this: Here's the last bit of the messages I get
5335
5336
5337
5338
5339
5340
5341
5342
5343
5344
5345
Unrecoverable error: typecheck in array
Operand stack: --nostringval-- ---
Begin offending input ---
/TL 25000 def /TL- TL 1 sub def /G TL array def 0 1 TL- { dup == flush G exch TL array put }for --- End offending input --- file offset = 0 gsapi_run_string_continue returns -20

The amount of RAM is almost certainly not the limiting factor, but it would help if you were to post the actual error message. It may be 'random-ish' to you, but it's meaningful to people who program in PostScript.
More than likely you've tripped over some other internal limit, for example the operand stack size but without seeing the PostScript program or the error message I cannot say any more than that. I can say that (64-bit) Ghostscript will happily address more than 2GB of RAM, I was running a file last week which had Ghostscript using 8.1GB.
Note that PostScript itself is basically a 32-bit language; while Ghostscript has extended many of the architectural limitations documented in the PostScript Language Reference Manual (such as 64K elements in arrays and strings) moving beyond 32-bit limits is essentially unspecified.
As to whether you can change the behaviour, that depends on exactly what the problem is, and I can't tell from what's here.
Edit
Here's a screenshot of Ghostscript running the test file to completion, along with the Task Manager display showing the amount of memory the process is using. Not shown is the vmstatus which I ran from the PostScript environment afterwards. This showed that Ghostscript thinks it's using 10,010,729,850 bytes form a maximum of 10,012,037,312. My calculator says that 9,562.8MB comes out at 10,027,322,572.4 bytes, so a pretty close match.
To answer the points in the comments this is (as you can probably tell) on a 64-bit Windows 10 installation with quite a lot of memory.
The difference is, almost certainly, something which has been fixed since the release of 9.52. The 9.52 64-bit binary does exit with a VMerror after (for me) 5360 iterations. Obviously trying to use vast amounts of PostScript memory (as opposed to, say, canvas memory) is not a common occurrence, not least because many PostScript interpreters simply won't allow it, so this doesn't get exercised much.
The Ghostscript Git repository is here if you want to go through the commits and try to figure out which one caused the change. You only have to go back to March this year, anything before about the 19th March would have been in 9.52.
Beyond simple curiosity, is there a reason to try and use up loads of memory in PostScript ?

Train RoBERTa from scratch where dataset is larger than the capacity of RAM?

I have a corpus that is 16 GB large and my ram IS around 16 GB ish. If I load the entire dataset to train the language model RoBERTa from scratch, I am going to have a memory issue. I intend to train my RoBERTa using the script provided from Huggingface's tutorial in their blog post: https://colab.research.google.com/github/huggingface/blog/blob/master/notebooks/01_how_to_train.ipynb
However, their blog post suggests the usage of LineByLineTextDatase. However, this loads the dataset eagerly.
class LineByLineTextDataset(Dataset):
"""
This will be superseded by a framework-agnostic approach
soon.
"""
def __init__(self, tokenizer: PreTrainedTokenizer, file_path: str, block_size: int):
assert os.path.isfile(file_path)
# Here, we do not cache the features, operating under the assumption
# that we will soon use fast multithreaded tokenizers from the
# `tokenizers` repo everywhere =)
logger.info("Creating features from dataset file at %s", file_path)
with open(file_path, encoding="utf-8") as f:
lines = [line for line in f.read().splitlines() if (len(line) > 0 and not line.isspace())]
batch_encoding = tokenizer(lines, add_special_tokens=True, truncation=True, max_length=block_size)
self.examples = batch_encoding["input_ids"]
def __len__(self):
return len(self.examples)
def __getitem__(self, i) -> torch.Tensor:
return torch.tensor(self.examples[i], dtype=torch.long)
Unexpectedly, my kernel crashed on the part where they read the line. I wonder if there is a way to make it read lazily. It will be very desirable if the suggested answer can create minimum code change with the posted tutorial since I'm rather new with Huggingface and afraid I won't be able to debug it on my own.

I would recommend using HuggingFace's own datasets library. The documentation says:
It provides a very efficient way to load and process data from raw files (CSV/JSON/text) or in-memory data (python dict, pandas dataframe) with a special focus on memory efficiency and speed. As a matter of example, loading a 18GB dataset like English Wikipedia allocate 9 MB in RAM and you can iterate over the dataset at 1-2 GBit/s in python.
The quick tour has good explanations and code snippets for creating a dataset object with your own data and it also explains how to train your own model.

what is the size of windows semaphore object?

How to find size of a semaphore object in windows?
I tried using sizeof() but we cannot give name of the sempahore object as an argument to sizeof. It has to be the handle. sizeof(HANDLE) gives us the size of handle and not semaphore.

This what is known as an "opaque handle.". There is no way to know how big it really is, what it contains or how any of the functions work internally. This gives Microsoft the ability to completely rewrite the implementation with each new version of Windows if they want to without worrying about breaking existing code. It's a similar concept to having a public and private interface to a class. Since we are not working on the Windows kernel, we only get to see the public interface.
Update:
It might be possible to get a rough idea of how big they are by creating a bunch and monitoring what happens to your memory usage in Process Explorer. However, since there is a good chance that they live in the kernel and not in user space, it might not show up at all. In any case, there are no guarantees about any other version of Windows, past or future, including patches/service packs.

It's something "hidden" from you. You can't say how big it is. And it's a kernel object, so it probably doesn't even live in your address space. It's like asking "how big is the Process Table?", or "how many MB is Windows wasting?".
I'll add that I have made a small test on my Windows 7 32 bits machine: 100000 kernel semaphores (with name X{number} with 0 <= number < 100000)) : 4 mb of kernel memory and 8 mb of user space (both measured with Task Manager). It's about 40 bytes/semaphore in kernel space and 80 bytes/semaphore in user space! (this in Win32... In 64 bits it'll probably double)

Ruby Memory Management

I have been using Ruby for a while now and I find, for bigger projects, it can take up a fair amount of memory. What are some best practices for reducing memory usage in Ruby?
Please, let each answer have one "best practice" and let the community vote it up.

When working with huge arrays of ActiveRecord objects be very careful... When processing those objects in a loop if on each iteration you are loading their related objects using ActiveRecord's has_many, belongs_to, etc. - the memory usage grows a lot because each object that belongs to an array grows...
The following technique helped us a lot (simplified example):
students.each do |student|
cloned_student = student.clone
...
cloned_student.books.detect {...}
ca_teachers = cloned_student.teachers.detect {|teacher| teacher.address.state == 'CA'}
ca_teachers.blah_blah
...
# Not sure if the following is necessary, but we have it just in case...
cloned_student = nil
end
In the code above "cloned_student" is the object that grows, but since it is "nullified" at the end of each iteration this is not a problem for huge array of students. If we didn't do "clone", the loop variable "student" would have grown, but since it belongs to an array - the memory used by it is never released as long as array object exists.
Different approach works too:
students.each do |student|
loop_student = Student.find(student.id) # just re-find the record into local variable.
...
loop_student.books.detect {...}
ca_teachers = loop_student.teachers.detect {|teacher| teacher.address.state == 'CA'}
ca_teachers.blah_blah
...
end
In our production environment we had a background process that failed to finish once because 8Gb of RAM wasn't enough for it. After this small change it uses less than 1Gb to process the same amount of data...

Don't abuse symbols.
Each time you create a symbol, ruby puts an entry in it's symbol table. The symbol table is a global hash which never gets emptied.
This is not technically a memory leak, but it behaves like one. Symbols don't take up much memory so you don't need to be too paranoid, but it pays to be aware of this.
A general guideline: If you've actually typed the symbol in code, it's fine (you only have a finite amount of code after all), but don't call to_sym on dynamically generated or user-input strings, as this opens the door to a potentially ever-increasing number

Don't do this:
def method(x)
x.split( doesn't matter what the args are )
end
or this:
def method(x)
x.gsub( doesn't matter what the args are )
end
Both will permanently leak memory in ruby 1.8.5 and 1.8.6. (not sure about 1.8.7 as I haven't tried it, but I really hope it's fixed.) The workaround is stupid and involves creating a local variable. You don't have to use the local, just create one...
Things like this are why I have lots of love for the ruby language, but no respect for MRI

Beware of C extensions which allocate large chunks of memory themselves.
As an example, when you load an image using RMagick, the entire bitmap gets loaded into memory inside the ruby process. This may be 30 meg or so depending on the size of the image.
However, most of this memory has been allocated by RMagick itself. All ruby knows about is a wrapper object, which is tiny(1).
Ruby only thinks it's holding onto a tiny amount of memory, so it won't bother running the GC. In actual fact it's holding onto 30 meg.
If you loop over a say 10 images, you can run yourself out of memory really fast.
The preferred solution is to manually tell the C library to clean up the memory itself - RMagick has a destroy! method which does this. If your library doesn't however, you may need to forcibly run the GC yourself, even though this is generally discouraged.
(1): Ruby C extensions have callbacks which will get run when the ruby runtime decides to free them, so the memory will eventually be successfully freed at some point, just perhaps not soon enough.

Measure and detect which parts of your code are creating objects that cause memory usage to go up. Improve and modify your code then measure again. Sometimes, you're using gems or libraries that use up a lot of memory and creating a lot of objects as well.
There are many tools out there such as busy-administrator that allow you to check the memory size of objects (including those inside hashes and arrays).
$ gem install busy-administrator
Example # 1: MemorySize.of
require 'busy-administrator'
data = BusyAdministrator::ExampleGenerator.generate_string_with_specified_memory_size(10.mebibytes)
puts BusyAdministrator::MemorySize.of(data)
# => 10 MiB
Example # 2: MemoryUtils.profile
Code
require 'busy-administrator'
results = BusyAdministrator::MemoryUtils.profile(gc_enabled: false) do |analyzer|
BusyAdministrator::ExampleGenerator.generate_string_with_specified_memory_size(10.mebibytes)
end
BusyAdministrator::Display.debug(results)
Output:
{
memory_usage:
{
before: 12 MiB
after: 22 MiB
diff: 10 MiB
}
total_time: 0.406452
gc:
{
count: 0
enabled: false
}
specific:
{
}
object_count: 151
general:
{
String: 10 MiB
Hash: 8 KiB
BusyAdministrator::MemorySize: 0 Bytes
Process::Status: 0 Bytes
IO: 432 Bytes
Array: 326 KiB
Proc: 72 Bytes
RubyVM::Env: 96 Bytes
Time: 176 Bytes
Enumerator: 80 Bytes
}
}
You can also try ruby-prof and memory_profiler. It is better if you test and experiment different versions of your code so you can measure the memory usage and performance of each version. This will allow you to check if your optimization really worked or not. You usually use these tools in development / testing mode and turn them off in production.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio