Is there a way to optimize skia::flush time cost? - cobalt

We have two different platforms with the same CPU frequency setting, and we found that the time cost of canvas->flush() on the rasterizer thread differs hugely at YT start time: the fast platform mostly takes 1.632 ms, while the slow one mostly takes 7.292 ms. Is there a way to find the root cause of the difference and to optimize it?
Cobalt version: Cobalt 11.132145, on ARM Linux with OpenGL.
Code of canvas->flush():
void HardwareRasterizer::Impl::Submit(
    const scoped_refptr<render_tree::Node>& render_tree,
    const scoped_refptr<backend::RenderTarget>& render_target,
    const Options& options) {
  DCHECK(thread_checker_.CalledOnValidThread());

  scoped_refptr<backend::RenderTargetEGL> render_target_egl(
      base::polymorphic_downcast<backend::RenderTargetEGL*>(
          render_target.get()));

  // Skip rendering if we lost the surface. This can happen just before suspend
  // on Android, so now we're just waiting for the suspend to clean up.
  if (render_target_egl->is_surface_bad()) {
    return;
  }

  backend::GraphicsContextEGL::ScopedMakeCurrent scoped_make_current(
      graphics_context_, render_target_egl);

  // Make sure the render target's framebuffer is bound before continuing.
  // Skia will usually do this, but it is possible for some render trees to
  // have non-skia draw calls only, in which case this needs to be done.
  GL_CALL(glBindFramebuffer(GL_FRAMEBUFFER,
                            render_target_egl->GetPlatformHandle()));

  // First reset the graphics context state for the pending render tree
  // draw calls, in case we have modified state in between.
  gr_context_->resetContext();

  AdvanceFrame();

  // Get a SkCanvas that outputs to our hardware render target.
  SkCanvas* canvas = GetCanvasFromRenderTarget(render_target);

  canvas->save();

  if (options.flags & Rasterizer::kSubmitFlags_Clear) {
    canvas->clear(SkColorSetARGB(0, 0, 0, 0));
  } else if (options.dirty) {
    // Only a portion of the display is dirty. Reuse the previous frame
    // if possible.
    if (render_target_egl->ContentWasPreservedAfterSwap()) {
      canvas->clipRect(CobaltRectFToSkiaRect(*options.dirty));
    }
  }

  // Rasterize the passed in render tree to our hardware render target.
  RasterizeRenderTreeToCanvas(render_tree, canvas, kBottomLeft_GrSurfaceOrigin);

  {
    TRACE_EVENT0("cobalt::renderer", "Skia Flush");
    canvas->flush();
  }

  graphics_context_->SwapBuffers(render_target_egl);

  canvas->restore();
}

The Skia flush() call is the function in which all of the OpenGL calls are actually made; prior to it being called, the drawing functions are simply serialized and queued in an internal Skia format.
Therefore, I would investigate your GL driver implementation in this case. It could be that your CPU is waiting on your GPU to consume some of the draw commands sent to it by GLES.
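One way to test that hypothesis is to bracket the flush with an explicit glFinish() and time the two separately: if the glFinish() dominates, the CPU is stalling on the GPU/driver rather than on Skia's own command recording. A sketch (the Starboard timer and the logging here are illustrative; any monotonic clock works):

{
  TRACE_EVENT0("cobalt::renderer", "Skia Flush");
  SbTimeMonotonic start = SbTimeGetMonotonicNow();
  canvas->flush();      // issues the queued GLES commands to the driver
  SbTimeMonotonic after_flush = SbTimeGetMonotonicNow();
  GL_CALL(glFinish());  // blocks until the GPU has consumed them
  SbTimeMonotonic after_finish = SbTimeGetMonotonicNow();
  LOG(INFO) << "flush: " << (after_flush - start) << "us, GPU drain: "
            << (after_finish - after_flush) << "us";
}

If most of the 7 ms shows up in glFinish() on the slow platform, the difference between the two platforms is in the GL driver or GPU load, not in Skia.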

Related

Chaining animations in SwiftUI

I'm working on a relatively complex animation in SwiftUI and am wondering what's the best / most elegant way to chain the various animation phases.
Let's say I have a view that first needs to scale, then wait a few seconds and then fade (and then wait a couple of seconds and start over - indefinitely).
If I try to use several withAnimation() blocks on the same view/stack, they end up interfering with each other and messing up the animation.
The best I could come up with so far is to call a custom function from the initial view's .onAppear() modifier and, in that function, have withAnimation() blocks for each stage of the animation, with delays between them. So it basically looks something like this:
func doAnimations() {
    withAnimation(...)
    DispatchQueue.main.asyncAfter(...)
    withAnimation(...)
    DispatchQueue.main.asyncAfter(...)
    withAnimation(...)
    ...
}
It ends up being pretty long and not very "pretty". I'm sure there has to be a better/nicer way to do this, but everything I tried so far didn't give me the exact flow I want.
Any ideas/recommendations/tips would be highly appreciated. Thanks!
As mentioned in the other responses, there is currently no mechanism for chaining animations in SwiftUI, but you don't necessarily need to use a manual timer. Instead, you can use the delay function on the chained animation:
withAnimation(Animation.easeIn(duration: 1.23)) {
    self.doSomethingFirst()
}
withAnimation(Animation.easeOut(duration: 4.56).delay(1.23)) {
    self.thenDoSomethingElse()
}
withAnimation(Animation.default.delay(1.23 + 4.56)) {
    self.andThenDoAThirdThing()
}
I've found this to result in more consistently smoother chained animations than using a DispatchQueue or Timer, possibly because it is using the same scheduler for all the animations.
Juggling all the delays and durations can be a hassle, so an ambitious developer might abstract the calculations out into some global withChainedAnimation function that handles it for you.
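A minimal sketch of what such a helper might look like (the name withChainedAnimation comes from the sentence above; the signature and easing choice are just one possibility):

import SwiftUI

// Plays each step's changes with an accumulated delay so the steps
// run back to back. Illustrative only, not a standard API.
func withChainedAnimation(_ steps: [(duration: Double, body: () -> Void)]) {
    var delay = 0.0
    for step in steps {
        withAnimation(Animation.easeInOut(duration: step.duration).delay(delay)) {
            step.body()
        }
        delay += step.duration
    }
}

It would be called as withChainedAnimation([(duration: 1.23, body: doSomethingFirst), (duration: 4.56, body: thenDoSomethingElse)]), keeping all the delay arithmetic in one place.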
Using a timer works. This is from my own project:
@State private var isShowing = true
@State private var timer: Timer?
...
func askQuestion() {
    withAnimation(Animation.easeInOut(duration: 1).delay(0.5)) {
        isShowing.toggle()
    }
    timer = Timer.scheduledTimer(withTimeInterval: 1.6, repeats: false) { _ in
        withAnimation(.easeInOut(duration: 1)) {
            self.isShowing.toggle()
        }
        self.timer?.invalidate()
    }
    // code here executes before the timer is triggered
}
I'm afraid that, for the time being, there is no support for something like keyframes. At least they could have added an onAnimationEnd()... but there is no such thing.
Where I did manage to have some luck, is animating shape paths. Although there aren't keyframes, you have more control, as you can define your "AnimatableData". For an example, check my answer to a different question: https://stackoverflow.com/a/56885066/7786555
In that case, it is basically an arc that spins, but grows from zero to some length, and at the end of the turn it progressively shrinks back to zero length. The animation has three phases: at first, one end of the arc moves but the other does not; then they both move together at the same speed; and finally the second end catches up with the first. My first approach was to use the DispatchQueue idea, and it worked, but I agree: it is terribly ugly. I then figured out how to properly use AnimatableData. So... if you are animating paths, you're in luck. Otherwise, it seems we'll have to wait for the possibility of more elegant code.
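For a flavor of the AnimatableData approach, here is a sketch (the SpinningArc shape and its fraction properties are made up for illustration, not taken from the linked answer):

import SwiftUI

// A shape whose two arc endpoints are animated independently via
// animatableData, giving keyframe-like control over each end.
struct SpinningArc: Shape {
    var startFraction: Double // 0...1 position of the trailing end
    var endFraction: Double   // 0...1 position of the leading end

    var animatableData: AnimatablePair<Double, Double> {
        get { AnimatablePair(startFraction, endFraction) }
        set {
            startFraction = newValue.first
            endFraction = newValue.second
        }
    }

    func path(in rect: CGRect) -> Path {
        var p = Path()
        p.addArc(center: CGPoint(x: rect.midX, y: rect.midY),
                 radius: min(rect.width, rect.height) / 2,
                 startAngle: .degrees(360 * startFraction),
                 endAngle: .degrees(360 * endFraction),
                 clockwise: false)
        return p
    }
}

SwiftUI interpolates both fractions on every frame and the path decides how the two ends move, which is how a multi-phase effect falls out of a single withAnimation call.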

Controlling animations in an animator via parameters, in sequences

So I am animating an avatar, and this avatar has its own animator with states and such.
When interacting with props, the prop itself has an animator with states in it. In both cases, I transition to some animations through parameters in the animator (bool type).
For example, for a door, the character will have "isOpeningDoor", while the door will have "isOpen".
Now the question: when I change the value on an animator on GO1, and then change the bool on GO2, does the first animation finish and then the second start? Because in my case it does not happen; they start almost at the same time.
void OnTriggerEnter(Collider door)
{
    if (door.gameObject.tag == "door")
    {
        GOAnimator1.SetBool("isOpeningDoor", true);
        GOAnimator2.SetBool("isOpen", true);
    }
}
I believe that I am doing it wrong, since I change the parameter on the animator but do not check for the animation to end; is this even possible, or am I doing something not kosher?
I really think it might be doable!
As you have it in your code now, the animations on GO1 and GO2 start at almost the same time because that's how it's written: the OnTriggerEnter() function completes its execution in the frame it is called and returns control to Unity.
What I think might help you are coroutines and SendMessage between GameObjects:
http://docs.unity3d.com/Manual/Coroutines.html
http://docs.unity3d.com/ScriptReference/GameObject.SendMessage.html
The idea is to:
- Create a coroutine in GO2 that waits an amount of time and then sets the GOAnimator2 parameter to activate the door animation.
- Create a function in GO2 that calls the aforementioned coroutine.
- From the OnTriggerEnter(), send a message to GO2 to execute the newly created function.
It reads complicated, but it's fairly simple. The execution would be like this:
1. Code for the coroutine:
IEnumerator GO2coroutine() {
    float timeToWait = 0.5f; // Tweak this
    for (float t = 0f; t < timeToWait; t += Time.deltaTime)
        yield return null;
    GetComponent<Animator>().SetBool("isOpen", true);
}
2. Code for the function calling it:
void callCoroutine() {
    StartCoroutine("GO2coroutine");
}
3. And the code modification for your OnTriggerEnter():
void OnTriggerEnter(Collider door)
{
    if (door.gameObject.tag == "door")
    {
        GOAnimator1.SetBool("isOpeningDoor", true);
        GO2.SendMessage("callCoroutine");
    }
}
I didn't have a chance to test the code, so please don't just copy-paste it; there might be slight changes to make.
There is another way, but I don't like it much: making the door animation longer, with an idle section at the start, to wait for the first game object's animation to end... but that will be a hassle if you ever have to shorten the animation, or add any other models or events.
Anyway, I think the way to go is with the coroutine! Good Luck!
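If you'd rather wait for the avatar's actual animation instead of a hand-tuned delay, the coroutine could poll the first Animator's state. A sketch (the "OpeningDoor" state name, layer index 0, and class name are illustrative):

using UnityEngine;
using System.Collections;

public class DoorOpener : MonoBehaviour {
    public Animator GOAnimator1; // the avatar's animator

    IEnumerator OpenDoorWhenReady() {
        // Wait until the avatar has entered its door-opening state...
        while (!GOAnimator1.GetCurrentAnimatorStateInfo(0).IsName("OpeningDoor"))
            yield return null;
        // ...then wait for that (non-looping) state to play to the end.
        while (GOAnimator1.GetCurrentAnimatorStateInfo(0).normalizedTime < 1f)
            yield return null;
        GetComponent<Animator>().SetBool("isOpen", true);
    }
}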

Unity2d game shooting and animation sync issue

I'm new to Unity and making my first 2D game. I've seen several topics on this forum about this issue, but I haven't found the solution.
So I have a lovely shooting animation and the bullet generation. My problem: I have to generate the bullet somewhere around the middle of the animation, but the character shoots the bullet and starts the animation at the same time, which is killing the UX :)
I attached an image about the issue: this is the moment when the bullet should be initialized, but as you can see it's already on its way.
Please find my code:
The GameManager update method calls the attackPlayer function:
public void Awake() {
    animator = GetComponent<Animator>();
    animator.SetTrigger("enemyIdle");
}

// if the enemy passes this point, they stop shooting and just go off the screen
private float shootingStopLimit = -6f;

public override void attackPlayer() {
    //animator.SetTrigger ("enemyIdle");
    if (!isAttacking && gameObject.transform.position.y > shootingStopLimit) {
        isAttacking = true;
        animator.SetTrigger("enemyShoot");
        StartCoroutine(doWait());
        gameObject.GetComponentInChildren<Gun>().fireBullet(); // "Gun" is a stand-in: the generic type argument was lost in formatting
        StartCoroutine(Reload());
    }
}

private IEnumerator doWait() {
    yield return new WaitForSeconds(5);
}

private IEnumerator Reload() {
    animator.SetTrigger("enemyIdle");
    int reloadTime = Random.Range(4, 7);
    yield return new WaitForSeconds(reloadTime);
    isAttacking = false;
}
...
My questions:
- How can I sync the animation and the bullet generation?
- Why doesn't doWait() work? :)
- Is it okay to call the attackPlayer method from the GameManager update?
- The enemies fly in from the right side of the screen to the left; when they reach the right edge of the screen, they become visible to the user. I don't know why, but they first do a shooting animation (with no bullet generation), and only after it do they do the idle. Any idea why?
Thanks,
K
I would suggest checking out animation events. Using animation events, you can call a method to instantiate your bullet.
To use Mecanim Animation Events you need to write the name of the function you want to call at the selected frame in the "Function" area of the "Edit Animation Event" window.
The other boxes are for any variables that you want to pass to that function to trigger whatever you have in mind.
Triggering/blending between different animations can be done in many different ways. The event area is more for other things that you want to trigger that are not related to animation (e.g. audio, particle fx, etc).
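As a sketch of the receiving side (the class, field, and method names are illustrative, not from the question's code), the animation event placed at the mid-animation frame just names a public method on a script attached to the animated GameObject:

using UnityEngine;

public class EnemyShooter : MonoBehaviour {
    public GameObject bulletPrefab;
    public Transform muzzle;

    // Enter "SpawnBullet" in the Function field of an animation event
    // placed at the exact frame where the shot should appear.
    public void SpawnBullet() {
        Instantiate(bulletPrefab, muzzle.position, muzzle.rotation);
    }
}

This also answers the sync question directly: the bullet is created by the animation itself, so it can never drift out of step with the clip.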

How to DEBUG OpenGL a gray/black texture box?

I'm altering someone else's code. They used PNGs, which are loaded via BufferedImage. I need to load a TGA instead, which is simply an 18-byte header followed by BGR codes. I have the textures loaded and running, but I get a gray box instead of the texture. I don't even know how to DEBUG this.
Textures are loaded in a ByteBuffer:
final static int datasize = (WIDTH*HEIGHT*3) *2; // Double buffer size for OpenGL // not +18 no header
static ByteBuffer buffer = ByteBuffer.allocateDirect(datasize);
FileInputStream fin = new FileInputStream("/Volumes/RAMDisk/shot00021.tga");
FileChannel inc = fin.getChannel();
inc.position(18); // skip header
buffer.clear(); // prepare for read
int ret = inc.read(buffer);
fin.close();
I've followed this: how-to-manage-memory-with-texture-in-opengl ... because I am updating the texture once per frame, like video.
Called once:
GL11.glBindTexture(GL11.GL_TEXTURE_2D, textureID);
GL11.glTexParameteri(GL11.GL_TEXTURE_2D, GL11.GL_TEXTURE_WRAP_S, GL11.GL_CLAMP);
GL11.glTexParameteri(GL11.GL_TEXTURE_2D, GL11.GL_TEXTURE_WRAP_T, GL11.GL_CLAMP);
GL11.glTexParameteri(GL11.GL_TEXTURE_2D, GL11.GL_TEXTURE_MAG_FILTER, GL11.GL_NEAREST);
GL11.glTexParameteri(GL11.GL_TEXTURE_2D, GL11.GL_TEXTURE_MIN_FILTER, GL11.GL_NEAREST);
GL11.glTexImage2D(GL11.GL_TEXTURE_2D, 0, GL11.GL_RGB, width, height, 0, GL11.GL_RGB, GL11.GL_UNSIGNED_BYTE, (ByteBuffer) null);
assert(GL11.GL_NO_ERROR == GL11.glGetError());
Called repeatedly:
GL11.glBindTexture(GL11.GL_TEXTURE_2D, textureID);
GL11.glTexSubImage2D(GL11.GL_TEXTURE_2D, 0, 0, 0, width, height, GL11.GL_RGB, GL11.GL_UNSIGNED_BYTE, byteBuffer);
assert(GL11.GL_NO_ERROR == GL11.glGetError());
return textureID;
The render code hasn't changed and is based on:
GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, this.vertexCount);
Make sure you set the texture sampling mode, especially the min filter: glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR). The default setting is mipmapped (GL_NEAREST_MIPMAP_LINEAR), so unless you upload mipmaps the texture is incomplete and sampling it returns a blank result instead of your data.
So either set the texture to a non-mipmapped filter or generate the mipmaps. One way to do the latter is to call glGenerateMipmap after the glTexImage2D call.
(see https://www.khronos.org/opengles/sdk/docs/man/xhtml/glTexParameter.xml).
It's a very common GL pitfall and something people just tend to know after getting bitten by it a few times.
There is no easy way to debug stuff like this. There are good GL debugging tools, for example in Xcode, but they will not tell you about this case.
Debugging GPU code is always a hassle. I would bet my money on big industry progress in this area as more companies discover the power of GPUs. Until then, I'll share my two best GPU debugging friends:
1) Define a function to print OGL errors:
int printOglError(const char *file, int line)
{
    /* Returns 1 if an OpenGL error occurred, 0 otherwise. */
    GLenum glErr;
    int retCode = 0;

    glErr = glGetError();
    while (glErr != GL_NO_ERROR) {
        printf("glError in file %s @ line %d: %s\n", file, line, gluErrorString(glErr));
        retCode = 1;
        glErr = glGetError();
    }
    return retCode;
}

#define printOpenGLError() printOglError(__FILE__, __LINE__)
And call it after your render draw calls (possible earlier errors will also show up):
GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, this.vertexCount);
printOpenGLError();
This alerts if you make some invalid operations (which might just be your case) but you usually have to find where the error occurs by trial and error.
2) Check out gDEBugger, free software with tons of GPU memory information.
[Edit]:
I would also recommend using the open-source lib DevIL; it's quite competent at loading various image formats.
Thanks to Felix: by not calling glTexSubImage2D (leaving the memory valid, but uninitialized), I noticed a remnant pattern left by the default memory. This indicated that the texture was being displayed, but that the load was most likely the problem.
UPDATE:
The problem with the code above is essentially the buffer. The buffer is 1024*1024, but it is only partially filled in by the read, leaving the limit marker of the ByteBuffer at 2359296 (1024*768*3) instead of 3145728 (1024*1024*3). This gives the error:
Number of remaining buffer elements is ..., must be at least ...
I thought that OpenGL needed space to return data, so I had doubled the size of the buffer to compensate:
final static int datasize = (WIDTH*HEIGHT*3) *2; // Double buffer size for OpenGL // not +18 no header
This is wrong. What is needed is the flip() function (big THANKS to Reto Koradi for the small hint about the buffer rewind) to put the ByteBuffer in read mode. Since the buffer is only semi-full, the OpenGL buffer check gives an error. The correct fix is not to double the buffer size, but to use buffer.position(buffer.capacity()) to mark the buffer as filled before doing a flip():
final static int datasize = (WIDTH*HEIGHT*3); // not +18 no header
buffer.clear(); // prepare for read
int ret = inc.read(buffer);
fin.close();
buffer.position(buffer.capacity()); // make sure buffer is completely FILLED!
buffer.flip(); // flip buffer to read mode
To figure this out, it is helpful to hardcode the memory of the buffer to make sure the OpenGL calls are working, isolating the load problem. Then when the OpenGL calls are correct, concentrate on the loading of the buffer. As suggested by Felix K, it is good to make sure one texture has been drawn correctly before calling glTexSubImage2D repeatedly.
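For example, a hardcoded test pattern could look like this (illustrative; it reuses the buffer and constants from the question):

// Solid red test pattern; the upload format is GL_RGB, so the first
// byte of each pixel is the red channel.
buffer.clear();
for (int i = 0; i < WIDTH * HEIGHT; i++) {
    buffer.put((byte) 0xFF); // R
    buffer.put((byte) 0x00); // G
    buffer.put((byte) 0x00); // B
}
buffer.flip(); // back to read mode for glTexSubImage2D

If the quad then renders solid red, the GL calls are correct and the remaining suspect is the TGA load.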
Some ideas which might cause the issue:
- Your texture is disposed somewhere. I don't know the whole code, but I guess somewhere there is a glDeleteTextures, and this could cause some issues if called at the wrong time.
- Are the texture width and height powers of two? If not, this might be an issue depending on your hardware; old hardware sometimes won't support non-power-of-two images.
- The texture parameters changed between the draw calls at some other point (make a debug check of the parameters with glGetTexParameter).
- There could be a loading issue when loading the next image (edit: or even the first image). Check if the first image is displayed without loading the next images. If so, it must be one of the cases above.

Direct2D API calls stall at specific intervals

I am working on migrating the drawing code of an application from GDI/GDI+ to Direct2D. So far things have been going well - however, while testing the new code, I have noticed some bizarre performance. The flow of execution I have been investigating is as follows (I have done my best to remove irrelevant code):
Create D2D Factory (on creation of app)
HRESULT hr = S_OK;
hr = D2D1CreateFactory(D2D1_FACTORY_TYPE_MULTI_THREADED, &m_pD2DFactory);
if (hr == S_FALSE) {
    ASSERT(FALSE);
    throw Exception(CExtString(_T("Failed to create Direct2D factory")));
}
OnDraw Callback
HWND hwnd = GetSafeHwnd();
RECT rc;
GetClientRect(&rc);

D2D1_SIZE_U size = D2D1::SizeU(rc.right - rc.left, rc.bottom - rc.top);

// Create a render target if it has been destroyed
if (!m_pRT) {
    D2D1_RENDER_TARGET_PROPERTIES props = D2D1::RenderTargetProperties(
        D2D1_RENDER_TARGET_TYPE_DEFAULT,
        D2D1::PixelFormat(
            DXGI_FORMAT_B8G8R8A8_UNORM,
            D2D1_ALPHA_MODE_IGNORE),
        0,
        0,
        D2D1_RENDER_TARGET_USAGE_NONE,
        D2D1_FEATURE_LEVEL_DEFAULT);

    GetD2DFactory()->CreateHwndRenderTarget(props,
        D2D1::HwndRenderTargetProperties(hwnd, size),
        &m_pRT);
}

m_pRT->Resize(size);
m_pRT->BeginDraw();

// Begin drawing the layers, given the
// transformation matrix and some geometric information
Draw(m_pRT, matrixD2D, rectClipWorld, rectClipDP);

HRESULT hr = m_pRT->EndDraw();
if (hr == D2DERR_RECREATE_TARGET) {
    SafeRelease(m_pRT);
}
The contents of the Draw method
The draw method does a lot of fluff that is largely irrelevant to this test (as I have turned all extraneous layers off), but it eventually draws a layer that executes this method several thousand times:
void DrawStringWithEffects(ID2D1RenderTarget* m_pRT, const CString& text,
                           const D2D1_POINT_2F& point, const COLORREF rgbFore,
                           const COLORREF rgbBack, IDWriteTextFormat* pfont) {
    // The text will be vertically centered around point.y, with point.x on the left hand side
    // Create a TextLayout for the string
    IDWriteTextLayout* textLayout = NULL;
    GetDWriteFactory()->CreateTextLayout(text,
        text.GetLength(),
        pfont,
        std::numeric_limits<float>::infinity(),
        std::numeric_limits<float>::infinity(),
        &textLayout);

    DWRITE_TEXT_METRICS metrics = {0};
    textLayout->GetMetrics(&metrics);

    D2D1_RECT_F rect = D2D1::RectF(point.x, point.y - metrics.height/2,
                                   point.x + metrics.width, point.y + metrics.height/2);
    D2D1_POINT_2F pointDraw = point;
    pointDraw.y -= metrics.height/2;

    ID2D1SolidColorBrush* brush = NULL;
    m_pRT->CreateSolidColorBrush(ColorD2DFromCOLORREF(rgbBack), &brush);

    m_pRT->FillRectangle(rect, brush);
    // ^^ this is sometimes very slow!

    brush->SetColor(ColorD2DFromCOLORREF(rgbFore));
    m_pRT->DrawTextLayout(pointDraw, textLayout, brush, D2D1_DRAW_TEXT_OPTIONS_NONE);
    // ^^ this is also sometimes very slow!

    SafeRelease(&brush);
    SafeRelease(&textLayout);
}
The vast majority of the time, the Direct2D calls are executing ~3-4 times faster than the GDI+ equivalents, which is great (generally 0.1ms compared to ~0.35ms). For some reason, though, the function calls will occasionally stall for a long period of time - upwards of 200ms combined. The offending calls are straight from the Direct2D API - FillRectangle and DrawTextLayout. Strangely, these stalls appear in the same location every time I run the application - the 73rd occurrence of the loop, then the 218th, then the 290th and so on (there is somewhat of a pattern in the differences, alternating between every ~73rd and every ~145th cycle). This is independent of the data that it draws (when I told it to skip drawing the 73rd cycle, the next cycle simply becomes the 73rd and thus stalls).
I thought this may be a GPU/CPU communication issue, so I set the render target (I am using an HWnd target) to software mode (D2D1_RENDER_TARGET_TYPE_SOFTWARE), and the results were even more strange. The stall times dropped from ~200ms to ~20ms (still not great, but hey), but there were two instances that stalled for over 2500ms! (These two, like the rest of the stalls, are completely reproducible in terms of being the n'th API call).
This is rather frustrating, as 99% of the loop runs several times faster than the old implementation, but the remaining (less than) 1% hangs for an abnormally long time.
To any Direct2D experts out there - what type of problem might this stalling be a symptom of? What, in general, could be causing this disconnect between my code and what D2D is doing in the background?
Direct2D buffers drawing commands (presumably to optimize them). You can't look at the performance of an individual drawing command; you must look at the total time between BeginDraw() and EndDraw(). If you want to force each drawing command to execute immediately, you must follow each one with a call to Flush(). That's probably a bad idea for performance, though.
https://msdn.microsoft.com/en-us/library/windows/desktop/dd371768(v=vs.85).aspx
After BeginDraw is called, a render target will normally build up a batch of rendering commands, but defer processing of these commands until either an internal buffer is full, the Flush method is called, or until EndDraw is called.
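So if you want to know what an individual call really costs, you have to drain the batch on both sides of it. A sketch (the QueryPerformanceCounter plumbing is illustrative, and as noted above this is for profiling only, not for production):

// Drain any previously batched commands so the timing brackets
// only the one FillRectangle below.
m_pRT->Flush();

LARGE_INTEGER freq, t0, t1;
QueryPerformanceFrequency(&freq);
QueryPerformanceCounter(&t0);

m_pRT->FillRectangle(rect, brush);
m_pRT->Flush(); // force this call to be processed now

QueryPerformanceCounter(&t1);
double ms = 1000.0 * (t1.QuadPart - t0.QuadPart) / freq.QuadPart;

Measured this way, the cost of the periodic stalls should move to whichever Flush() lands on a full internal batch, which would be consistent with the every-~73rd/~145th-call pattern described in the question.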
