Instantiating multiple objects in one frame or one object per frame? - performance

The idea is simple: 'instantiating a map' in Awake with random values.
But the question is:
Should I instantiate the whole map in one frame (using a loop)?
Or is it better to instantiate one object per frame?
I don't want to ruin the player's RAM by instantiating 300 GameObjects in less than a second.

Whether you instantiate all GameObjects in one frame or not, they will end up in RAM the same way. The only way to "ruin" someone's RAM would be to instantiate so many GameObjects that there is no memory left. Considering that a typical prefab in Unity is only a few KB in size and a typical RAM nowadays is a few GB in size, that would take roughly a million GameObjects.
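For scale, here is a minimal Unity sketch of the one-frame approach (MapBuilder, mapPrefab, and mapSize are hypothetical names; assign the prefab in the Inspector):

using UnityEngine;

public class MapBuilder : MonoBehaviour
{
    public GameObject mapPrefab;   // hypothetical prefab, a few KB each
    public int mapSize = 300;

    void Awake()
    {
        // All 300 instantiations happen in a single frame; memory use is
        // the same as spreading them out, only this frame's CPU time grows.
        for (int i = 0; i < mapSize; i++)
        {
            Vector3 pos = new Vector3(Random.Range(-50f, 50f), 0f, Random.Range(-50f, 50f));
            Instantiate(mapPrefab, pos, Quaternion.identity);
        }
    }
}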

Never ever make things depend on frames, never!!
There are some exceptions where this can be good, but most of the time it's not.
Good case:
- Incremental garbage collection (still has drawbacks)
Bad case:
- Your case; loading a map should happen at the beginning
Why should you not make your game frame dependent?
Because PCs have different computational speeds. A good example was Harry Potter II: the game was developed for machines capable of 30 frames per second, and modern machines can run it extremely fast, so the game is basically sped up; you have to manually throttle the CPU to make the game playable.
Another example is Unity's delta time: the reason you use it when moving objects over multiple frames is that it takes the last frame's duration into account, as in the sketch below.
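As a concrete illustration, this is the usual frame-rate-independent movement in Unity; a minimal sketch, where speed is an arbitrary value in units per second:

using UnityEngine;

public class Mover : MonoBehaviour
{
    public float speed = 5f; // units per second, not units per frame

    void Update()
    {
        // Scaling by Time.deltaTime (the duration of the last frame) makes
        // the distance covered per second the same at 30 FPS and 300 FPS.
        transform.Translate(Vector3.forward * speed * Time.deltaTime);
    }
}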
Also, 300 objects is nothing when loading a game. And from a player's point of view:
What is better, 10 seconds of loading, or 30 seconds at 15 FPS and then normal speed?
(The above example is exaggerated, though.)

When loading a map you can do it asynchronously at the start of entering the scene. This way you can show a loading screen during the loading time. This is a good way to do it if you are making a single-player game. If it's a multiplayer game, you need to sync it on the server for every other player as well. The method for loading a scene asynchronously is SceneManager.LoadSceneAsync().
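A minimal sketch of that approach, assuming a scene named "Level1" and a hypothetical loadingScreen object:

using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

public class LevelLoader : MonoBehaviour
{
    public GameObject loadingScreen; // hypothetical loading screen UI

    public void LoadLevel()
    {
        loadingScreen.SetActive(true);
        StartCoroutine(LoadAsync("Level1"));
    }

    IEnumerator LoadAsync(string sceneName)
    {
        AsyncOperation op = SceneManager.LoadSceneAsync(sceneName);
        while (!op.isDone)
        {
            // op.progress goes from 0 to 0.9 while loading; you can drive
            // a progress bar from it here.
            yield return null;
        }
    }
}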
If you're trying to instantiate objects at runtime because you want to randomize certain objects, I would recommend loading every object that doesn't need randomizing with the scene itself (i.e., dropping them into the scene).
This is how I interpreted your question; tell me if I am wrong.

Related

What is the difference between FPS and UPS and should I keep track of UPS on my game loop?

I was searching for ways to improve my game loop and to offer more performance options to the players when I found the term UPS. I know it means updates per second, but how does it affect performance? And should I worry about it?
Let's assume you have an extremely simple game with a single thread and a basic loop, like "while(running) { get_input(); update_world_state(); update_video(); }". In this case you end up with "UPS = FPS" (and no reason to track UPS separately from FPS); and if the GPU is struggling to keep up, the entire game slows down (e.g. if you're getting 15 frames per second, then things that have nothing to do with graphics might take 4 times longer than they should, even when you have 8 CPUs doing nothing while waiting for the GPU to finish).
For one alternative, what if you had 2 threads, where one thread does "while(running) { get_input(); update_world_state(); }" and the other thread does "while(running) { update_video(); }"? In this case there's no reason to expect UPS to have anything to do with FPS. The problem here is that most games aren't smart enough to handle variable timing, so you'd end up with something more like "while(running) { get_input(); update_world_state(); wait_until_next_update_starts(); }" to make sure that the game can't run too fast (e.g. cars that are supposed to be moving at a speed of 20 Km per hour moving at 200 Km per hour because update_world_state() is being called too often). Depending on things and stuff, you might get 60 UPS (regardless of what FPS is); but if the CPU can't keep up the game can/will still slow down and you might get 20 UPS (regardless of what FPS is). Of course there's no point updating the video if the world state hasn't changed; so you'd want the graphics loop to be more like "while(running) { wait_for_world_state_update(); update_video(); }", where wait_for_world_state_update() makes sure FPS <= UPS (and where wait_for_world_state_update() returns immediately without any delay when UPS is keeping up).
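A rough C# sketch of that fixed-rate loop (running, GetInput, UpdateWorldState, and UpdateVideo are placeholders):

// Fixed 60 UPS with rendering decoupled from world updates.
const double updateInterval = 1.0 / 60.0;
double accumulator = 0.0;
var clock = System.Diagnostics.Stopwatch.StartNew();
double previous = clock.Elapsed.TotalSeconds;

while (running)
{
    double now = clock.Elapsed.TotalSeconds;
    accumulator += now - previous;
    previous = now;

    GetInput();
    while (accumulator >= updateInterval) // catch up if the CPU fell behind
    {
        UpdateWorldState();               // runs at (up to) 60 UPS
        accumulator -= updateInterval;
    }
    UpdateVideo();                        // FPS is independent of UPS
}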
The next step beyond this is "tickless". In this case you might have one high priority thread monitoring user input and assigning timestamps to input events (e.g. "at time = 12356 the user fired their main weapon") and storing them in a list. Then you might have a second (lower priority, to avoid messing up the accuracy of the user input timestamps) thread with a main loop like "while(running) { next_frame_time = estimate_when_next_frame_will_actually_be_visible(); update_world_state_until(next_frame_time); update_video(); }", where update_world_state_until() uses a whole pile of maths to predict what the game state will be at a specific point in time (and consumes the list of stored user input events while taking their timestamps into account). In this case UPS doesn't really make any sense (you'd only care about FPS). This is also much more complicated (due to the maths involved in calculating the world state at any point in time); but the end result is like having "infinite UPS" without the overhead of updating the world state more than once per frame; and it allows you to hide any graphics latency (e.g. things seen 16.66 ms later than they should be); which makes it significantly better than other options (much smoother, significantly less likely for performance problems to cause simulation speed variations, etc).

Rendering in DirectX 11

When the frame starts, I do my logical update and render after that.
In my render code I do the usual stuff. I set a few states, buffers, and textures, and end by calling Draw.
m_deviceContext->Draw(nbVertices, 0);
At the end of the frame I call Present to show the rendered frame.
// Present the back buffer to the screen since rendering is complete.
if(m_vsync_enabled)
{
// Lock to screen refresh rate.
m_swapChain->Present(1, 0);
}
else
{
// Present as fast as possible.
m_swapChain->Present(0, 0);
}
Usual stuff. Now, when I call Draw, according to MSDN:
Draw submits work to the rendering pipeline.
Does it mean that the data is sent to the GPU and the main thread (the one that called Draw) continues? Or does it wait for rendering to finish?
In my opinion, only the Present function should make the main thread wait for rendering to finish.
There are a number of calls which can trigger the GPU to start working, Draw being one. Others include Dispatch, CopyResource, etc. What the MSDN docs are trying to say is that calls like PSSetShader, IASetPrimitiveTopology, etc. don't really do anything until you call Draw.
When you call Present, that is taken as an implicit 'end of frame' indicator, but your program can often continue setting up rendering calls for the next frame well before the first frame is done and showing. By default, Windows will let you queue up to 3 frames ahead before blocking your CPU thread on the Present call to let the GPU catch up; in real-time rendering you usually don't want the latency between input and render to be really high.
The fact is, however, that GPU/CPU synchronization is complicated, and the Direct3D runtime is also batching up requests to minimize kernel-call overhead, so the actual work could be happening after many Draws are submitted to the command queue. This old article gives you a flavor of how this works. On modern GPUs, you can also have various memory operations for paging in memory, setting up physical video memory areas, etc.
BTW, all this 'magic' doesn't exist with Direct3D 12, but that means the application has to do everything at the 'right' time to be both efficient and functional. The programmer is much more directly building up command queues, triggering work on the various pixel and compute GPU engines, and doing all the messy stuff that is handled a little more abstractly and automatically by Direct3D 11's runtime. Even then, the video driver is ultimately the one actually talking to the hardware, so it can do other kinds of optimizations as well.
The general rules of thumb to keep in mind:
- Creating resources is expensive, especially runtime shader compilation (by the HLSL compiler) and runtime shader blob optimization (by the driver).
- Copying resources to the GPU (i.e. loading texture data from CPU memory) requires bus bandwidth that is limited in supply: prefer to keep texture, VB, and IB data in static buffers you reuse.
- Copying resources from the GPU (i.e. moving GPU memory to CPU memory) uses a backchannel that is slower than going to the GPU: try to avoid the need for readback from the GPU.
- Submitting larger chunks of geometry per Draw call helps amortize overhead (i.e. calling Draw once for 10,000 triangles with the same state/shader is much faster than calling Draw 10 times for 1,000 triangles each, changing state/shaders between calls).

How to optimize scene loading in Unity3D

I was wondering if there is any way to speed up level loading in Unity3D. I currently have a loading level between both scenes, but it still takes around 5 seconds to load a new level on an iPad 3. That's quite a lot.
I've optimized all Start and Awake functions, so there is really little stuff going on there. However, I have a lot of sprites in each scene and I think they take most of the loading time.
Could I somehow determine which objects need to be loaded at start and which can load during the first 10 seconds of the level? I tried loading the level additively, but that makes my game lag for a second or two.
Any smart way of speeding it up?
The time it takes to load your level is determined by the things that are referenced in your scene. The only way to cut loading time is to remove things.
You mentioned it would be fine to load some things after the level has started. You can do this using Resources.Load. However, this will only benefit you if there are no references to the thing you are loading. For example, if you have 100 trees in your scene, it won't do you any good to reduce that to 10: the tree is still referenced and will have to be loaded before your level can start. If you eliminate all of them, then your level can start without loading the tree. It is then up to you to load it using Resources.Load and start planting trees, maybe over the course of several frames to prevent a hiccup, as in the sketch below.
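A minimal sketch of that idea, assuming a prefab stored at Resources/Tree and arbitrary spawn positions:

using System.Collections;
using UnityEngine;

public class TreePlanter : MonoBehaviour
{
    IEnumerator Start()
    {
        // Nothing in the scene references the tree, so the level can start
        // without it; we load it from Resources afterwards.
        GameObject tree = Resources.Load<GameObject>("Tree");
        for (int i = 0; i < 100; i++)
        {
            Vector3 pos = new Vector3(Random.Range(-50f, 50f), 0f, Random.Range(-50f, 50f));
            Instantiate(tree, pos, Quaternion.identity);
            if (i % 10 == 9)
                yield return null; // plant 10 per frame to avoid a hiccup
        }
    }
}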

Lag with deltaTime

I'm a newbie game developer and I'm having an issue I'm trying to deal with. I am working on a game for Android using Java.
The thing is, I'm using deltaTime to have smooth movement on any device, but I came across a problem. At a specific moment, the game performs a quite expensive operation which increases the deltaTime for the next iteration. Because of this, the next iteration lags a bit, and on old, slow devices it can be really bad.
To fix this, I have thought of a solution I would like to share with you and get a bit of feedback about what could happen with it. The algorithm is the following:
1) Every iteration, the deltaTime is added to an "average deltaTime" variable which keeps an average over all the iterations
2) If in an iteration the deltaTime is at least twice the value of the average variable, I reassign its value to the average
With this, the game will adapt to the actual performance of the device and will not lag in one concrete iteration.
What do you think? I just made it up; I suppose more people have come across this and there is a better solution... I need tips! Thanks
There is a much simpler and more accurate method than storing averages. I don't believe your proposal will ever get you the results that you want.
Take the total span of time (including the fraction) since the previous frame began; this is your delta time. It is often measured in milliseconds or seconds.
Multiply your move speed by the delta time before you apply it.
This gives you frame rate independence. You will want to experiment until your speeds are correct.
Let's consider the example from my comment above:
If you have one frame that takes 1 ms, an object that moves 10 units per frame is moving at a speed of 10 units per millisecond. However, if a frame takes 10 ms, your object slows to 1 unit per millisecond.
In the first frame, we multiply the speed (10) by 1 (the delta time). This gives us a movement of 10.
In the second frame, our delta is 10: the frame was ten times slower. If we multiply our speed (10) by the delta (10), we get 100. This is the same speed the object was moving at in the 1 ms frame.
We now have consistent movement speeds in our game, regardless of how often the screen updates.
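The same arithmetic in code form; a standalone sketch using the numbers from the quote above (speed in units per frame at the 1 ms baseline):

// Displacement = speed * deltaTime, so movement per real millisecond is
// identical whether a frame took 1 ms or 10 ms.
double speed = 10.0;           // tuned so a 1 ms frame moves 10 units

double fastFrameDelta = 1.0;   // a fast 1 ms frame
double slowFrameDelta = 10.0;  // a slow 10 ms frame

double fastMove = speed * fastFrameDelta; // 10 units over 1 ms
double slowMove = speed * slowFrameDelta; // 100 units over 10 ms
// Both work out to 10 units per millisecond of wall-clock time.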
EDIT:
In response to your comments.
A faster computer is the answer ;) There is no easy fix for framerate consistency, and it can manifest itself in a variety of ways, screen tearing being the grimmest dilemma.
What are you doing in the frames with wildly inconsistent deltas? Consider optimizing that code. The following operations can really kill your framerate:
AI routines like Pathing
IO operations like disk/network access
Generation of procedural resources
Physics!
Anything else that isn't rendering code...
These will all cause the delta to increase by some amount, depending on the order of the algorithms and the quantity of data being processed. Consider performing these long-running operations on a separate thread and acting on/displaying the results when they are ready, as sketched below.
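A rough C# sketch of that, with Path, FindPath, and ApplyPath as hypothetical names:

using System.Threading.Tasks;

// Kick off the expensive work without blocking the game loop.
Task<Path> pathJob = Task.Run(() => FindPath(start, goal));

// ...then, once per frame:
if (pathJob.IsCompleted)
{
    ApplyPath(pathJob.Result); // act on the result only when it is ready
}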
More edits:
What you are effectively doing in your solution is slowing everything back down to avoid the jump in on-screen position, regardless of the game rules.
Consider a shooter, where reflexes are everything and estimating velocity is hugely important. What happens if the frame rate doubles and you halve the rotation speed of the player for a frame? Now the player has experienced a spike in frame rate AND their crosshair moved more slowly than they expected. Worse, because you are using a running average, subsequent frames will also have their movement slowed.
That is quite a knock-on effect for one slow frame. If you had a physics engine, that slow frame might even have a very real impact on the game world.
Final thought: the idea of delta time is to disconnect the game rules from the hardware you are running on; your solution reconnects them.

How much time is too much?

Given that the standard number of ticks for a cycle in a WP7 app is 333,333 ticks (or it is if you set it as such; at 100 ns per tick, that is about 33.3 ms, i.e. 30 frames per second), how much of this time slice does someone have to work with?
To put it another way, how many ticks do the standard processes eat up (drawing the screen, clearing buffers, etc.)?
I worked out a process for doing something in a spike (as I often do), but it is eating up about 14 ms of time right now (about half the time slice I have available), and I am concerned about what will happen if it runs past that point.
The conventional way of doing computationally intensive things is to do them on a background thread; this means the UI thread(s) don't block while the computations are occurring. Typically the UI threads are scheduled ahead of the background threads, so screen drawing continues smoothly even when the CPU is 100% busy. This approach allows you to queue as much work as you want.
If you need to do the computational work within the UI thread, e.g. because it's part of the game mechanics or part of the per-frame update/drawing logic, then conventionally the game frame rate slows down a bit because the phone is waiting on your logic before it can draw.
If your question is "what is a decent frame rate?", then that depends a bit on the type of app/game, but generally (at my age...) I think anything at 30 Hz and above is OK, so up to 33 ms for each frame, and it is important that the frame rate is smooth, i.e. each frame takes about the same time.
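To check whether your per-frame work stays inside that ~33 ms budget, a simple Stopwatch measurement is enough (a sketch; running and UpdateAndDraw are placeholders):

using System.Diagnostics;

var frameTimer = new Stopwatch();
while (running)
{
    frameTimer.Restart();
    UpdateAndDraw();           // your per-frame logic and drawing
    frameTimer.Stop();
    if (frameTimer.ElapsedMilliseconds > 33)
    {
        // This frame blew the 30 Hz budget; log it and investigate.
    }
}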
I hope that approximately answers your question... I wasn't entirely sure I understood it!
