Timing issues with Tango image frames

It appears that Tango is dropping image frames when I try to get depth data, image data, and pose data at the same time.
I am trying to capture depth and image frames and synchronize them with pose data. Using the C point-cloud-jni-example, I added code to dump point cloud data to memory buffers and then to files, and I added a callback for onFrameAvailable() that copies image data to buffers and then to files. Since the image data arrives at 30 Hz and the depth data at ~5 Hz, I naively expected the latest image to match fairly closely with the latest depth frame. The timestamps were not very close; in some cases they differed by more than 100 milliseconds. So I started to investigate the timing of the onXYZijAvailable(), onFrameAvailable(), and onPoseAvailable() callbacks and the corresponding data timestamps.
I added logcat dumps to each callback and printed out the system time (std::chrono::system_clock::now()) and the Tango system timestamp of the returned data, whether depth, image, or pose. Some of this was described in "exactly how do we compute timestamp differentials?".
Here is some pose timing. The sys time is the current clock time when the callback is executed. The pose timestamp is from the actual pose struct.
sys time : pose timestamp
TM CLK Pose 10.008419 245.976464
TM CLK Pose 10.025983 246.009791
TM CLK Pose 10.124470 246.043119
TM CLK Pose 10.133542 246.076447
TM CLK Pose 10.147136 246.109774
TM CLK Pose 10.192470 246.143102
TM CLK Pose 10.200370 246.176430
TM CLK Pose 10.225367 246.209757
TM CLK Pose 10.300509 246.243085
TM CLK Pose 10.311827 246.276413
TM CLK Pose 10.335946 246.309740
TM CLK Pose 10.399209 246.343068
TM CLK Pose 10.407704 246.376396
TM CLK Pose 10.426889 246.409723
TM CLK Pose 10.504403 246.443051
The corresponding differences from pose to pose are shown here. The pose timing is rock solid at 33 msec based on the recorded timestamps. The callback times vary quite a bit, presumably due to the load of the application.
time: 0.017564 pose: 0.033327
time: 0.098487 pose: 0.033328
time: 0.009072 pose: 0.033328
time: 0.013594 pose: 0.033327
time: 0.045334 pose: 0.033328
time: 0.007900 pose: 0.033328
time: 0.024997 pose: 0.033327
time: 0.075142 pose: 0.033328
time: 0.011318 pose: 0.033328
time: 0.024119 pose: 0.033327
time: 0.063263 pose: 0.033328
time: 0.008495 pose: 0.033328
time: 0.019185 pose: 0.033327
time: 0.077514 pose: 0.033328
time: 0.011892 pose: 0.033328
Here is some depth timing and corresponding differences. The timestamps are very stable at about 0.2 seconds.
sys time : xyz timestamp
TM CLK XYZ 10.161695 246.017013
TM CLK XYZ 10.363448 246.216639
TM CLK XYZ 10.595306 246.438693
TM CLK XYZ 10.828368 246.668223
TM CLK XYZ 11.025787 246.890277
TM CLK XYZ 11.233364 247.097379
TM CLK XYZ 11.433941 247.297005
TM CLK XYZ 11.633176 247.496631
TM CLK XYZ 11.830650 247.696257
time: 0.201753 depth: 0.199626
time: 0.231858 depth: 0.222054
time: 0.233062 depth: 0.229530
time: 0.197419 depth: 0.222054
time: 0.207577 depth: 0.207102
time: 0.200577 depth: 0.199626
time: 0.199235 depth: 0.199626
time: 0.197474 depth: 0.199626
time: 0.196935 depth: 0.199626
Here is some image timing. The lines marked "---" are problem frames.
sys time : img timestamp
TM CLK Img 10.041056 246.005896
TM CLK Img 10.074105 246.105709 -----
TM CLK Img 10.106492 246.105709
TM CLK Img 10.142581 246.138980
TM CLK Img 10.176176 246.172251
TM CLK Img 10.241146 246.205522
TM CLK Img 10.274909 246.305335 -----
TM CLK Img 10.317819 246.305335
TM CLK Img 10.361682 246.345225
TM CLK Img 10.397533 246.390139
TM CLK Img 10.472859 246.430886
TM CLK Img 10.514923 246.538175 -----
TM CLK Img 10.551663 246.545651
TM CLK Img 10.585960 246.586398
TM CLK Img 10.626671 246.620526
TM CLK Img 10.705709 246.656249
TM CLK Img 10.734324 246.767705 -----
TM CLK Img 10.774233 246.768562
TM CLK Img 10.808848 246.804285
TM CLK Img 10.847230 246.842580
TM CLK Img 10.927872 246.878303
TM CLK Img 10.957309 246.989759 -----
TM CLK Img 10.991136 246.990616
Here are the corresponding time differences for the above list.
time: 0.033049 image: 0.099813
time: 0.032387 image: 0.000000
time: 0.036089 image: 0.033271
time: 0.033595 image: 0.033271
time: 0.064970 image: 0.033271
time: 0.033763 image: 0.099813
time: 0.042910 image: 0.000000
time: 0.043863 image: 0.039890
time: 0.035851 image: 0.044914
time: 0.075326 image: 0.040747
time: 0.042064 image: 0.107289
time: 0.036740 image: 0.007476
time: 0.034297 image: 0.040747
time: 0.040711 image: 0.034128
time: 0.079038 image: 0.035723
time: 0.028615 image: 0.111456
time: 0.039909 image: 0.000857
time: 0.034615 image: 0.035723
time: 0.038382 image: 0.038295
time: 0.080642 image: 0.035723
time: 0.029437 image: 0.111456
time: 0.033827 image: 0.000857
Notice that every 4 frames there is a big delay in the image time, roughly 100 msec. This is followed by two frames with the same or nearly the same timestamp. Even in cases where the timestamp on two successive images is identical the callback still fires to indicate a new frame. The result is that I am missing every fifth frame of video. That stinks for an application trying to match depth and image data.
I have stripped any extra processing out of the code. In the callbacks the only thing that happens is the data gets copied out to static buffers. The rendering of the point cloud is still being done in the normal rendering thread.
So, what gives? Can the Tango device not handle depth, image, and pose callbacks all running at the same time? Do I need to use UpdateTexture() instead of onFrameAvailable()?

In the current version of the Project Tango Tablet, the RGB-IR camera is used for both depth and color images, and it can only do one or the other for each frame. So in the stream we get four RGB frames followed by one depth frame, producing the pattern you observed. This is more of a hardware limitation.
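Given that interleave, one workaround on the application side is to pair each depth frame with the image frame whose timestamp is closest. Here is a minimal sketch of that idea in plain Python (not Tango API code; depth_frames and image_frames are hypothetical lists of (timestamp, data) tuples filled from the callbacks, and the 0.05 s gap is an assumption):

def match_depth_to_images(depth_frames, image_frames, max_gap=0.05):
    """Pair each depth frame with the nearest image frame by timestamp,
    keeping only pairs that are closer than max_gap seconds."""
    pairs = []
    for d_ts, d_data in depth_frames:
        # nearest image frame by absolute timestamp difference
        i_ts, i_data = min(image_frames, key=lambda f: abs(f[0] - d_ts))
        if abs(i_ts - d_ts) < max_gap:
            pairs.append(((d_ts, d_data), (i_ts, i_data)))
    return pairs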


Crop large GeoJson layer using a standard machine

I need to extract a portion of a large GeoJSON layer. I've tried multiple times with QGIS on a machine with 16 GB RAM and an Intel i7 11th-gen processor, with no result. After about an hour, if the program does not crash, I am able to load the file in QGIS, but there is no way to crop it using the Clip vector function.
The layer in question is publicly available at https://github.com/microsoft/KenyaNigeriaBuildingFootprints (the file is kenya.geojson.zip)
... and the extent I need is 36.753446978,36.821255765,-1.330521423,-1.295234032 [EPSG:4326]
Here is how you can approach that with R.
Get the data
options(timeout=max(6000, getOption("timeout")))
url <- "https://minedbuildings.blob.core.windows.net/africa/kenya.geojsonl.zip"
download.file(url, basename(url), mode="wb")
f <- unzip("kenya.geojsonl.zip")
f
#"./kenya.geojsonl"
Read a subset by specifying an extent. This takes a while because geojson is a rather inefficient format.
library(terra)
e <- ext(36.753446978, 36.821255765, -1.330521423, -1.295234032)
## with timer
system.time(v <- vect("kenya.geojsonl", extent=e))
# user system elapsed
# 1390.03 8.28 1409.11
v
# class : SpatVector
# geometry : polygons
# dimensions : 18159, 0 (geometries, attributes)
# extent : 36.75303, 36.8228, -1.331356, -1.29447 (xmin, xmax, ymin, ymax)
# source : kenya.geojsonl
# coord. ref. : lon/lat WGS 84 (EPSG:4326)
And then perhaps:
writeVector(v, "buildings.shp")
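For reference, a similar extent-based extraction could be sketched in Python with geopandas. This is an assumption on my part that your environment has geopandas with a GDAL/Fiona backend that understands newline-delimited GeoJSON, and it is untested on a file of this size:

import geopandas as gpd

# geopandas wants (minx, miny, maxx, maxy); terra's ext() above took
# (xmin, xmax, ymin, ymax), so the coordinates are reordered here.
bbox = (36.753446978, -1.330521423, 36.821255765, -1.295234032)
v = gpd.read_file("kenya.geojsonl", bbox=bbox)
v.to_file("buildings.shp")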

Number of augmented images on the fly in Keras

I have code that augments images, like the following:
# Augmentation
train_datagen = ImageDataGenerator(rotation_range=5,          # rotation
                                   width_shift_range=0.2,     # horizontal shift
                                   zoom_range=0.2,            # zoom
                                   horizontal_flip=True,      # horizontal flip
                                   brightness_range=[0.2, 0.8])  # brightness

# Epochs
epochs = 25
# Batch size
batch_size = 32

history = model.fit(train_datagen.flow(x_train, y_train,
                                       batch_size=batch_size,
                                       seed=27,
                                       shuffle=False),
                    epochs=epochs,
                    steps_per_epoch=x_train.shape[0] // batch_size,
                    validation_data=(x_test, y_test),
                    verbose=1)
I am trying to understand exactly how many extra images will be created in the training process as a result of augmentation.
The second question is how can I create extra 50K images on the fly for the training?
The data generator doesn't create new images; it just transforms the existing ones, differently for each epoch.
If your entire training set is x_train = [x1, x2, x3], then in each epoch the model sees the same x_train. Because your x_train is so small (just 3 images), what happens is that for each epoch the generator feeds the model the whole x_train slightly transformed (according to the parameters you set in ImageDataGenerator), e.g.:
for epoch 1 x_train: [x1,x2,x3]
for epoch 2 x_train: [x1_t1,x2_t1,x3_t1]
for epoch 3 x_train: [x1_t2,x2_t2,x3_t2]
etc...
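To make that concrete, here is a small self-contained sketch (dummy data; the shapes are assumed purely for illustration) showing that each pass over the generator yields the same number of images, just transformed differently, so no extra images are created:

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

x_train = np.random.rand(3, 32, 32, 3)   # three dummy images
y_train = np.array([0, 1, 0])

datagen = ImageDataGenerator(rotation_range=5, horizontal_flip=True)
flow = datagen.flow(x_train, y_train, batch_size=3, shuffle=False)

for epoch in range(2):
    xb, yb = next(flow)      # one batch == the whole 3-image "epoch" here
    print(epoch, xb.shape)   # (3, 32, 32, 3) both times: same count,
                             # different random transforms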

Which methodology for calculating SNR is appropriate for this image?

I was trying to find the SNR for a set of images that I have, but my two methodologies produce two different answers and I'm not sure which is right. Is one of them just straight up the wrong way of doing this, or is neither correct?
I am trying to characterize the SNR of a set of images that I'm processing. I have one set of data with images and darkfields. From these pieces of data I subtracted the darkfield from the image to get "corrected_images".
Since SNR is (mean of signal)/(std of noise), in my first methodology I worked with the corrected image and the background-noise image: I took the mean of every pixel on the spectrum (from the corrected image) with a value greater than 1 as the signal, and the overall std of the background-noise image as the noise. The plot for this methodology is in blue.
In my second methodology I used a single uncorrected image and considered every pixel above 50 as signal and every pixel below 50 as noise. This gives the orange SNR values.
# -*- coding: utf-8 -*-
"""
Spyder Editor
This is a temporary script file.
"""
from PIL import Image
from matplotlib import pyplot as plt
import numpy as np
import os

name = r"interpolating_streaks/corrected"
name2 = r"interpolating_streaks/averages"
file = os.listdir(name)
file2 = os.listdir(name2)
wv1 = []
signal = []
snr = []
noise = []

# Methodology 1: corrected images for signal, darkfield for noise
for fname in file:                      # renamed from `i`, which was shadowed below
    wv = fname[:3]                      # wavelength prefix of the filename
    wv1.append(wv)
    corrected_image = Image.open(name + "/" + fname)  # opens the image
    streak = np.array(corrected_image)
    dark_image = Image.open(name2 + '/d' + wv + '_averaged.tif')
    dark = np.array(dark_image)
    darkavg = dark.mean(axis=0)         # column means (dark[:][:] is just dark)
    avg = streak.mean(axis=0)
    for px in avg:
        if px >= 1:                     # pixels above 1 count as signal
            signal.append(px)
    noiser = np.std(darkavg)
    signalr = np.mean(signal)           # note: `signal` accumulates across files
    snr.append(signalr / noiser)
plt.plot(wv1, snr)

# Methodology 2: single uncorrected image, threshold at 50
signal = []
noise = []
snr = []
for fname in file2:
    if fname[0] != 'd':                 # skip the darkfield files
        image = Image.open(name2 + '/' + fname)
        im = np.array(image)
        im_avg = im.mean(axis=0)
        for px in im_avg:
            if px <= 50:
                noise.append(px)
            else:
                signal.append(px)
        snr.append(np.mean(signal) / np.std(noise))
plt.plot(wv1, snr)
I would expect the SNR values to be the same, and I know for my camera the SNR has to be below 45 dB (though I'm pretty sure this methodology doesn't output decibels).
Here are my current results: https://imgur.com/a/Vgecyp1
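On the decibel point: if you do want to compare the plain ratio against the camera's 45 dB figure, the usual conversion would be something like the sketch below (assuming pixel values are treated as amplitudes, hence 20*log10 rather than 10*log10 for power):

import numpy as np

def snr_db(mean_signal, std_noise):
    # amplitude ratio converted to decibels
    return 20.0 * np.log10(mean_signal / std_noise)

print(snr_db(100.0, 2.0))   # a ratio of 50 is ~34 dB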

How to prevent "animated" bokeh plot from stuttering

I'm trying to create a kind of dashboard for monitoring ECG data and relevant features. When "animating" the ECG by constantly updating two ColumnDataSources, the plot very soon starts stuttering and growing slow. At the moment I'm running this in a notebook, but it's the same with a Bokeh server.
The ECG itself shows roughly the latest second, some 400 data points in a line. Left on its own, this runs almost smoothly, but that is not really helpful, dashboard-wise. Another plot shows the heart rate variability: maybe a dozen circles, but derived from the last minute of data, hence the two sources.
It seems that the more data is gathered in the sources and the faster the updates happen, the slower the plotting becomes. The rollover parameter of ColumnDataSource.stream() caps the overall length of what is collected, but ultimately does not prevent the stuttering.
There is probably some rookie mistake in my code, as I feel Bokeh should be well equipped for visualizing this amount of data. So, here is what I do for plotting:
# Assumed context: `record` is a pandas DataFrame of ECG samples;
# `sampling_rate` and `record_channels` are defined elsewhere.
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource
from bokeh.layouts import row, column

dashboard_source = ColumnDataSource(record[:1])  # initialize with first row
ecg_source = ColumnDataSource(record[:1])        # initialize with first row

# some options
time_window = 12  # seconds to keep in view
ecg_length = 1    # seconds to keep in view in ECG plot
update_rate = 1000 / sampling_rate  # number of milliseconds between each plot update

# update function in which source data is fed from the record dataframe
current_record_pos = 1

def update_dashboard_source():
    global current_record_pos
    new_row = record.iloc[current_record_pos]
    dashboard_source.stream(new_row, rollover=sampling_rate * time_window)
    ecg_source.stream(new_row, rollover=sampling_rate * ecg_length)
    current_record_pos += 1

def ecg_dashboard(doc):
    # dashboard element: ECG plot/s
    ecg_plot = figure(width=800, height=400, title='ECG',
                      x_axis_label='time in ms', y_range=(-1, 1.5))
    # plot ECG channels
    for record_channel, color in zip(record_channels, ['green', 'blue']):
        ecg_plot.line(source=ecg_source, x='time_ms', y=record_channel,
                      alpha=.3, legend=record_channel + ' ', color=color)
    # dashboard element: heart rate variability
    hrv_plot = figure(width=400, height=400, title='heart rate variability',
                      x_axis_label="r'r''", y_axis_label="r''r'''")
    hrv_plot.circle(source=dashboard_source, x='r_diff_1', y='r_diff_2',
                    size=10, alpha=.23)
    # gather everything in a dashboard element and add it to the document
    ecg_row = row(ecg_plot)
    feature_row = row(hrv_plot)
    dashboard = column(ecg_row, feature_row)
    doc.add_root(dashboard)
    doc.add_periodic_callback(update_dashboard_source, update_rate)

show(ecg_dashboard)
I did not find bokeh's user guide very helpful regarding updated plotting. Is there maybe a collection of best practices somewhere?
Concluding the conversation in the comments: each time a single point is added to the plot, the entire canvas area in the browser is redrawn. This is how browsers work. With a sampling_rate of 250 you get 250 plot updates per second, that is, one update every 4 ms. This makes the browser run slower and slower as the number of points to be rendered (re-drawn) every 4 ms increases.
I would advise increasing the update period from 4 ms to about 100 ms (possibly with larger data packets).
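A minimal sketch of that suggestion, reusing the names from the question (the chunk size and the index handling are my assumptions): stream roughly 100 ms worth of rows per callback instead of one row every 4 ms.

chunk = max(1, sampling_rate // 10)   # ~100 ms worth of samples per update

def update_dashboard_source():
    global current_record_pos
    new_rows = record.iloc[current_record_pos:current_record_pos + chunk]
    # reset_index() keeps the 'index' column that ColumnDataSource(record[:1])
    # created automatically, so the streamed columns match the source's
    new_data = new_rows.reset_index().to_dict(orient="list")
    dashboard_source.stream(new_data, rollover=sampling_rate * time_window)
    ecg_source.stream(new_data, rollover=sampling_rate * ecg_length)
    current_record_pos += chunk

# and in ecg_dashboard(), register it with a 100 ms period:
# doc.add_periodic_callback(update_dashboard_source, 100)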

QML Performance issue when updating an item in presence of many non-overlapping items

In the following QML, the only dynamic part is the blinking rectangle. Although it has no relation to the generated items, the blinking rectangle causes a heavy load and slows down the system (e.g. 100% CPU load on the i.MX6 processor I am using), even though there is no overlap/binding between it and the other items. Removing the Repeater solves the issue and the rectangle blinks smoothly.
import QtQuick 2.3

Rectangle {
    id: root
    anchors.fill: parent

    Repeater {
        model: 10000
        delegate: Rectangle {
            width: 5
            height: 5
            x: (index % 200) * 6
            y: 50 + Math.floor(index / 200) * 6
            color: "blue"
            border.color: "black"
        }
    }

    Rectangle {
        property bool blinker: false
        width: 20
        height: 20
        color: blinker ? "green" : "red"

        Timer {
            running: true
            interval: 100
            repeat: true
            onTriggered: { parent.blinker = !parent.blinker }
        }
    }
}
Here is the output (the red rectangle will blink in the actual application):
The model: 10000 parameter of the Repeater may need to be set to a higher value in case you have a better-specced device and don't experience the slowdown. The code was tested on Qt 5.3.2 and Qt 5.5.0, and the problem was present in both.
In my actual application I have far fewer model items (~100), but with a more complex delegate. So the CPU (GPU?) usage depends on the complexity of the delegate plus the number of model items in the Repeater.
Why does having a high number of items (or complex items) generated by a Repeater affect the performance of the application when they have no relation/overlap with the other dynamic object(s)?
Update 1
I've replaced the Repeater with the following JavaScript code to generate the same number of objects with the same properties:
Component.onCompleted: {
    var objstr = 'import QtQuick 2.0;Rectangle{id:sample;width:5; height:5;color:"blue";border.color: "black"}';
    for (var i = 0; i < 200; i++) {
        for (var j = 0; j < 50; j++) {
            var obj = Qt.createQmlObject(objstr, root);
            obj.x = i * 6
            obj.y = 50 + j * 6
        }
    }
}
But the performance issue was still present.
Update 2
I've done some examinations based on this article.
QSG_RENDERER_DEBUG=render
Setting this flag outputs some debugging information about rendering and batching. The output for the test application:
isaac@ubuntu:~$ QSG_RENDERER_DEBUG=render ./qml-test
QML debugging is enabled. Only use this in a safe environment.
Batch thresholds: nodes: 64 vertices: 1024
Using buffer strategy: static
Renderer::render() QSGAbstractRenderer(0x93b9570) "rebuild: full"
Rendering:
-> Opaque: 14002 nodes in 2 batches...
-> Alpha: 0 nodes in 0 batches...
- 0x8f0a698 [ upload] [ clip] [opaque] [ merged] Nodes: 14000 Vertices: 168000 Indices: 224000 root: 0xb3e2a90 sets: 3
- 0x8f0b310 [ upload] [noclip] [opaque] [ merged] Nodes: 2 Vertices: 8 Indices: 12 root: 0x0
Renderer::render() QSGAbstractRenderer(0x93b9570) "rebuild: none"
Rendering:
-> Opaque: 14002 nodes in 2 batches...
-> Alpha: 0 nodes in 0 batches...
- 0x8f0a698 [retained] [ clip] [opaque] [ merged] Nodes: 14000 Vertices: 168000 Indices: 224000 root: 0xb3e2a90 sets: 3
- 0x8f0b310 [retained] [noclip] [opaque] [ merged] Nodes: 2 Vertices: 8 Indices: 12 root: 0x0
Renderer::render() QSGAbstractRenderer(0x93b9570) "rebuild: none"
This tells us that the items are batched in two groups: one with 14000 nodes and one with 2 nodes. This seems to be what we expect.
QSG_VISUALIZE=batches flag
This switch visualizes the batches on the UI. Running it shows a solid color covering the whole UI, which means the blinking rectangle and the small rectangles are being rendered in one batch:
Setting clip: true didn't help to force the batches apart. By setting opacity: 0.5 on the blinking rectangle, I finally succeeded in forcing the QML engine to put it into another batch:
Interestingly, the blinking was still affected and slowed down by the high number of small rectangles!
QSG_RENDER_TIMING=1
The last flag I tried was QSG_RENDER_TIMING, which reports some timing information for rendering. Based on the output, the actual time is spent in render in the render loop. Based on the Qt documentation, render time is
Total time spent rendering the frame, including preparing and
uploading all the necessary data to the GPU. This is the gross render
time. Do not confuse it with the net Render Render time below.
but this wasn't helpful to me. So far, I haven't been able to find the root cause of this issue.
isaac@ubuntu:~$ QSG_RENDER_TIMING=1 ./qml-test
QML debugging is enabled. Only use this in a safe environment.
qt.scenegraph.time.compilation: shader compiled in 3ms
qt.scenegraph.time.renderer: time in renderer: total=27ms, preprocess=0, updates=5, binding=0, rendering=21
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 107ms, polish=0, sync=65, render=27, swap=1, frameDelta=0
qt.scenegraph.time.renderer: time in renderer: total=1ms, preprocess=0, updates=0, binding=0, rendering=1
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 1ms, polish=0, sync=0, render=1, swap=0, frameDelta=2
qt.scenegraph.time.renderer: time in renderer: total=8ms, preprocess=0, updates=0, binding=0, rendering=8
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 255ms, polish=0, sync=0, render=8, swap=24, frameDelta=255
qt.scenegraph.time.renderer: time in renderer: total=1ms, preprocess=0, updates=0, binding=0, rendering=1
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 290ms, polish=0, sync=0, render=1, swap=28, frameDelta=297
qt.scenegraph.time.renderer: time in renderer: total=0ms, preprocess=0, updates=0, binding=0, rendering=0
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 296ms, polish=0, sync=0, render=0, swap=29, frameDelta=303
qt.scenegraph.time.renderer: time in renderer: total=298ms, preprocess=0, updates=0, binding=0, rendering=298
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 300ms, polish=0, sync=0, render=298, swap=0, frameDelta=306
qt.scenegraph.time.renderer: time in renderer: total=592ms, preprocess=0, updates=0, binding=0, rendering=592
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 593ms, polish=0, sync=0, render=592, swap=0, frameDelta=600
qt.scenegraph.time.renderer: time in renderer: total=292ms, preprocess=0, updates=0, binding=0, rendering=292
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 298ms, polish=0, sync=0, render=295, swap=0, frameDelta=305
qt.scenegraph.time.renderer: time in renderer: total=286ms, preprocess=0, updates=0, binding=0, rendering=286
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 291ms, polish=0, sync=0, render=286, swap=0, frameDelta=298
qt.scenegraph.time.renderer: time in renderer: total=291ms, preprocess=0, updates=0, binding=0, rendering=291
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 296ms, polish=0, sync=0, render=294, swap=0, frameDelta=305
qt.scenegraph.time.renderer: time in renderer: total=286ms, preprocess=0, updates=0, binding=0, rendering=286
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 292ms, polish=0, sync=0, render=286, swap=0, frameDelta=298
qt.scenegraph.time.renderer: time in renderer: total=290ms, preprocess=0, updates=0, binding=0, rendering=290
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 295ms, polish=0, sync=0, render=291, swap=0, frameDelta=301
qt.scenegraph.time.renderer: time in renderer: total=297ms, preprocess=0, updates=0, binding=0, rendering=297
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 302ms, polish=0, sync=0, render=298, swap=0, frameDelta=310
qt.scenegraph.time.renderer: time in renderer: total=290ms, preprocess=0, updates=0, binding=0, rendering=290
qt.scenegraph.time.renderloop: Frame rendered with 'basic' renderloop in 293ms, polish=0, sync=0, render=290, swap=0, frameDelta=316
This is an old question, but it looks like there's no real resolution here, so I'll do my best to chime in with some useful bits and pieces.
So, you're definitely partly on the right track with looking at batching; a great start. I imagine the reason you didn't see any effect from setting clip: true is that you may have been setting it in the wrong place -- you need to either set it on the bottom Rectangle (containing the Timer), or contain the Repeater in something else that you can clip, like:
Item {
    anchors.fill: parent
    clip: true

    Repeater {
        ...
    }
}
This is because, while the Repeater inherits the Item type, it is a bit of a special item. The children it creates are parented to the parent of the repeater, not the repeater itself, so the repeater would have clipping set – but no visual children to apply that clipping to in your case.
The ideal solution here would be to set clip: true both on something containing the Repeater (as done above), and on the bottom Rectangle to ensure that neither of the two subtrees affect the performance of the other.
However, you note that this didn't directly solve your problem, so let's move on from batching to other things.
A quick observation: I notice that you are using the 'basic' renderloop instead of the 'threaded' one. Is there a reason for this? It won't buy you much with the example you have here (as you don't have many bindings evaluating and no other application to speak of), but in a real world case, it should be quite a bit better, so I would recommend trying to use it if at all possible.
Once you get past that, you need to know that the QtQuick scenegraph expects to run with a blocking vsync. Animations and everything else all tie in to the vsync of your display. When you're working at this level, you need to know how your graphics setup works, and pay special care to make sure that you are enabling that to happen.
So, now let's talk about the hardware side of the picture. I don't know precisely what your setup is on the i.MX6, but I'm assuming you're using Linux and Vivante drivers on fbdev, and the eglfs QPA plugin from Qt. First things first, you should play around with the FB_MULTI_BUFFER environment variable to ensure you are tied to the vsync of the display (i.e. you probably want FB_MULTI_BUFFER=2 or FB_MULTI_BUFFER=3). I don't know if this is now set automatically, but it wasn't when I last had to work on such a system.
Assuming you are using fbdev, the mechanism for waiting on the display is an ioctl. You want to look at your display driver in the kernel, and see if it's respecting the FBIO_WAITFORVSYNC ioctl, and compile Qt to use that (grep qtbase for FBIO_WAITFORVSYNC – it should be somewhere in the eglfs platform plugin). You'll also note that it's "hidden" behind an environment variable: QT_QPA_EGLFS_FORCEVSYNC, so you'll want to export QT_QPA_EGLFS_FORCEVSYNC=1 once you have ensured it's built to issue that ioctl. While you're at it, you should check that the FBIOGET_VSCREENINFO ioctl is returning useful and correct information, as eglfs will use the returned information from that to determine the refresh rate of the display (see q_refreshRateFromFb in the eglfs plugin).
After all that, things may improve for you. If they don't, I can say that on a similar setup I've run into cases where there was no way to force-throttle rendering (where FBIO_WAITFORVSYNC was effectively unusable), which means that you're left to do this yourself. I don't know how universal this problem is, but it may well apply to you, so:
If you are in such a situation, you can tweak the QT_QPA_UPDATE_IDLE_TIME=x environment variable to tell Qt to wait for a minimum duration of at least x ms before drawing another frame, for instance, export QT_QPA_UPDATE_IDLE_TIME=32 would wait 32ms between frames at a minimum, giving you roughly 30 FPS. You should treat this with some caution though, as it is far from an ideal scenario, and it's not really what I would call a widely "supported" thing.
