OpenGL ES / Vulkan: Per-fragment stencil write/test (on Qualcomm Snapdragon XR2)

I would like to render two meshes, the first one writing into the stencil buffer and the second one testing against it.
I want to do that on a per fragment level though (the fragment shader of the first object should define which value to write into the stencil buffer and the fragment shader of the second object should define whether and against which stencil value the fragments of the second object should be clipped).
My target platform is the Oculus Quest 2, which has a Qualcomm Snapdragon XR2.
If the platform supported GL_ARM_shader_framebuffer_fetch_depth_stencil, I could use that, but that's only supported on some Mali GPUs.
The reason I want to use stencils is that I want to render everything in a single forward rendering pass for performance reasons. Since I'm already forced to use fragment discard in my shaders, early z-rejection is off the table anyway, so that's not a concern.
How can I achieve per fragment stencil writing/testing on Qualcomm Snapdragon XR2 in either OpenGL ES 3.0 or Vulkan?
Any pointers are appreciated.

I had to print out all available extensions on the Quest 2 recently for a project and can confirm that GL_ARM_shader_framebuffer_fetch_depth_stencil is supported.
To be clear though, this extension only enables reading the stencil value, not writing to it.
If it helps, these are the supported extensions:
GL_OES_EGL_image_external
GL_OES_EGL_sync
GL_OES_vertex_half_float
GL_OES_framebuffer_object
GL_OES_rgb8_rgba8
GL_OES_compressed_ETC1_RGB8_texture
GL_AMD_compressed_ATC_texture
GL_KHR_texture_compression_astc_ldr
GL_KHR_texture_compression_astc_hdr
GL_OES_texture_compression_astc
GL_OES_texture_npot
GL_EXT_texture_filter_anisotropic
GL_EXT_texture_format_BGRA8888
GL_EXT_read_format_bgra
GL_OES_texture_3D
GL_EXT_color_buffer_float
GL_EXT_color_buffer_half_float
GL_QCOM_alpha_test
GL_OES_depth24
GL_OES_packed_depth_stencil
GL_OES_depth_texture
GL_OES_depth_texture_cube_map
GL_EXT_sRGB
GL_OES_texture_float
GL_OES_texture_float_linear
GL_OES_texture_half_float
GL_OES_texture_half_float_linear
GL_EXT_texture_type_2_10_10_10_REV
GL_EXT_texture_sRGB_decode
GL_EXT_texture_format_sRGB_override
GL_OES_element_index_uint
GL_EXT_copy_image
GL_EXT_geometry_shader
GL_EXT_tessellation_shader
GL_OES_texture_stencil8
GL_EXT_shader_io_blocks
GL_OES_shader_image_atomic
GL_OES_sample_variables
GL_EXT_texture_border_clamp
GL_EXT_EGL_image_external_wrap_modes
GL_EXT_multisampled_render_to_texture
GL_EXT_multisampled_render_to_texture2
GL_OES_shader_multisample_interpolation
GL_EXT_texture_cube_map_array
GL_EXT_draw_buffers_indexed
GL_EXT_gpu_shader5
GL_EXT_robustness
GL_EXT_texture_buffer
GL_EXT_shader_framebuffer_fetch
GL_ARM_shader_framebuffer_fetch_depth_stencil
GL_OES_texture_storage_multisample_2d_array
GL_OES_sample_shading
GL_OES_get_program_binary
GL_EXT_debug_label
GL_KHR_blend_equation_advanced
GL_KHR_blend_equation_advanced_coherent
GL_QCOM_tiled_rendering
GL_ANDROID_extension_pack_es31a
GL_EXT_primitive_bounding_box
GL_OES_standard_derivatives
GL_OES_vertex_array_object
GL_EXT_disjoint_timer_query
GL_KHR_debug
GL_EXT_YUV_target
GL_EXT_sRGB_write_control
GL_EXT_texture_norm16
GL_EXT_discard_framebuffer
GL_OES_surfaceless_context
GL_OVR_multiview
GL_OVR_multiview2
GL_EXT_texture_sRGB_R8
GL_KHR_no_error
GL_EXT_debug_marker
GL_OES_EGL_image_external_essl3
GL_OVR_multiview_multisampled_render_to_texture
GL_EXT_buffer_storage
GL_EXT_external_buffer
GL_EXT_blit_framebuffer_params
GL_EXT_clip_cull_distance
GL_EXT_protected_textures
GL_EXT_shader_non_constant_global_initializers
GL_QCOM_texture_foveated
GL_QCOM_texture_foveated2
GL_QCOM_texture_foveated_subsampled_layout
GL_QCOM_shader_framebuffer_fetch_noncoherent
GL_QCOM_shader_framebuffer_fetch_rate
GL_EXT_memory_object
GL_EXT_memory_object_fd
GL_EXT_EGL_image_array
GL_NV_shader_noperspective_interpolation
GL_KHR_robust_buffer_access_behavior
GL_EXT_EGL_image_storage
GL_EXT_blend_func_extended
GL_EXT_clip_control
GL_OES_texture_view
GL_EXT_fragment_invocation_density
GL_QCOM_motion_estimation
GL_QCOM_validate_shader_binary
GL_QCOM_YUV_texture_gather
GL_IMG_texture_filter_cubic
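When checking a list like this at runtime, a plain substring search can give false positives, because some names are prefixes of others (GL_OVR_multiview and GL_OVR_multiview2, for example). Below is a minimal C++ sketch of an exact-token check. It assumes the extensions arrive as one space-separated string; hasExtension is a hypothetical helper name, not part of any GL API.

```cpp
#include <sstream>
#include <string>

// Returns true only if `name` appears as a whole, space-delimited token.
// A substring search would also find "GL_OVR_multiview" inside
// "GL_OVR_multiview2", so tokens are compared exactly instead.
bool hasExtension(const std::string& extensionList, const std::string& name) {
    std::istringstream tokens(extensionList);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

Called with the string returned by the driver and "GL_ARM_shader_framebuffer_fetch_depth_stencil", this would report whether stencil fetch is available before a shader that depends on it is compiled.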

You can write a per-invocation stencil reference value from the fragment shader with VK_EXT_shader_stencil_export. However, that extension is not widely supported.
I am not sure what you are trying to do, but it seems you will need to find another way.

Related

Memory function of the boost::gil library

I am currently trying to run TensorFlow inference (C backend) using Boost::GIL (challenging). I need a few things. I have been able to load my PNG image (rgb8_image_t)
and convert it to rgb32f_image_t.
I still need three things: the raw pointer to the data, the size of the allocated memory, and the dimensions.
For the allocated memory, unfortunately the function total_allocated_size_in_bytes() is private, so I did this:
boost::gil::view(dest).size() * boost::gil::view(dest).num_channels() * sizeof(value_type);
This is valid as long as there is no extra padding for alignment. Is there a nicer alternative?
For the dimensions, I need to match NumPy (from Pillow), so I hope both libraries use the same memory layout. From my understanding, the data is interleaved and contiguous by default, so it should be fine.
Lastly, the raw pointer _memory is a private data member of the image class with no dedicated accessor. boost::gil::view(dest).row_begin(0) returns an iterator to the first pixel, but I am not sure how to get the _memory data pointer from it. Any suggestions?
Thank you very much,
++t
PS: TensorFlow offers a C++ backend; however, it cannot be installed from any package manager, and working with Bazel is beyond me.
GIL documentation pretty accurately documents the various memory layouts.
The point of the library, though, is to abstract away the memory layouts. If you require some representation (planar/interleaved, packed or unpacked) you are doing things "the hard way" for the library interface.
So, I think you can read and convert in one go, e.g. for a jpeg:
gil::rgb32f_image_t img;
gil::image_read_settings<gil::jpeg_tag> settings;
read_and_convert_image("input.jpg", img, settings);
Now getting the raw data is possible:
auto* raw_data = gil::interleaved_view_get_raw_data(view(img));
It happens to be the case that the preferred implementation storage is interleaved, which is likely what you're expecting. If your particular image storage is planar, the call will not compile (and you'd probably want planar_view_get_raw_data(vw, plane_index) instead).
Note that you'll have to reinterpret_cast to float [const]* if you need that, because there is no public interface to get a reference to the scoped_channel_value<>::value_, but the BaseChannelValue type is indeed float and you can assert that the wrapper doesn't add additional weight:
static_assert(sizeof(float) == sizeof(raw_data[0]));
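For the size and dimension parts of the question, the arithmetic can be sketched independently of GIL. The following assumes a tightly packed, interleaved float image with no row padding, which corresponds to a C-contiguous NumPy array of shape (height, width, channels). InterleavedLayout is a hypothetical helper, not a GIL type; if the view's row stride is larger than width * channels * sizeof(float), the rows are padded and this sketch does not apply.

```cpp
#include <cstddef>

// Hypothetical description of a tightly packed, interleaved float image,
// laid out like a NumPy array of shape (height, width, channels).
struct InterleavedLayout {
    std::size_t width;
    std::size_t height;
    std::size_t channels;

    // Total buffer size in bytes, assuming no padding between rows.
    std::size_t sizeInBytes() const {
        return width * height * channels * sizeof(float);
    }

    // Offset (counted in floats) of channel c of the pixel at column x, row y.
    std::size_t index(std::size_t x, std::size_t y, std::size_t c) const {
        return (y * width + x) * channels + c;
    }
};
```

With the float pointer obtained from the raw data above, element index(x, y, c) would then address the same value that array[y, x, c] addresses on the NumPy side.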
Alternative Approach:
Conversely, you can set up your own raw pixel buffer, mount a mutable view onto it, and use that view as the target of the initial read/convert:
// get dimension
gil::image_read_settings<gil::jpeg_tag> settings;
auto info = gil::read_image_info("input.jpg", settings).get_info();
// setup raw pixel buffer & view
using pixel = gil::rgb32f_pixel_t;
auto data = std::make_unique<pixel[]>(info._width * info._height);
auto vw = gil::interleaved_view(info._width, info._height, data.get(),
info._width * sizeof(pixel));
// load into buffer
read_and_convert_view("input.jpg", vw, settings);
I've actually checked that it works correctly by writing out the resulting view:
//// just for test - doesn't work for 32f, so choose another pixel format
//gil::write_view("output.png", vw, gil::png_tag());

How to get updated buffer attributes in Threejs

When I apply a matrix to a BufferGeometry,
I want to get the updated position attributes quickly; I am dealing with 1,000,000+ vertices.
I have tried Matrix4.applyToBufferAttribute(), but the buffer attribute is still the same.
What is the proper way to do this?
I have tried Matrix4.applyToBufferAttribute(), but the buffer attribute is still the same.
Then it seems you are doing something wrong in your application. Matrix4.applyToBufferAttribute() does apply the matrix to the given attribute. The method is used multiple times in the core of three.js for example in BufferGeometry.applyMatrix():
https://github.com/mrdoob/three.js/blob/9f7f38b543c8a51d5614b72c04d657a4cfad68da/src/core/BufferGeometry.js#L141-L142
Make sure to set BufferAttribute.needsUpdate to true after calling the method. And yes, it's the intended way to apply a 4x4 transformation matrix to a buffer attribute.

optional vertexbufferobjects in directx11

I have some models (geometries) that carry per-vertex information such as position, normal, color, and texcoord; each of these has its own vertex buffer. Some of these models have texture coordinates and some do not.
To manage these differences, I wrote a vertex shader that checks whether the flag "hasTextureCoordinates" in the constant buffer equals 1, and uses the texcoord parameter only if it does.
But DirectX does not really like this workaround and prints:
D3D11 INFO: ID3D11DeviceContext::Draw: Element [3] in the current Input Layout's declaration references input slot 3, but there is no Buffer bound to this slot. This is OK, as reads from an empty slot are defined to return 0. It is also possible the developer knows the data will not be used anyway. This is only a problem if the developer actually intended to bind an input Buffer here. [ EXECUTION INFO #348: DEVICE_DRAW_VERTEX_BUFFER_NOT_SET]
I'm not sure whether all hardware handles this correctly, and it's not nice to see these "warnings" in the output every frame.
I know I could write two shaders, one with and one without texcoords, but texcoords are not the only parameter that may be missing: some models have color and others do not, some have both color and texture coordinates, and so on. Writing a shader for each combination of vertex buffer inputs is extremely redundant, which is bad because any change to one shader has to be repeated in all the others. I could also split parts of the shader into separate files and include them, but that gets confusing.
Is there a way to tell DirectX that a specific vertex buffer is optional?
Or does anyone know a better solution to this problem?
You can suppress this specific message programmatically. As it's an INFO rather than an ERROR or CORRUPTION message, it's safe to ignore.
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;
ComPtr<ID3D11Debug> d3dDebug;
if ( SUCCEEDED( device.As(&d3dDebug) ) )
{
    ComPtr<ID3D11InfoQueue> d3dInfoQueue;
    if ( SUCCEEDED( d3dDebug.As(&d3dInfoQueue) ) )
    {
#ifdef _DEBUG
        d3dInfoQueue->SetBreakOnSeverity( D3D11_MESSAGE_SEVERITY_CORRUPTION, true );
        d3dInfoQueue->SetBreakOnSeverity( D3D11_MESSAGE_SEVERITY_ERROR, true );
#endif
        D3D11_MESSAGE_ID hide[] =
        {
            D3D11_MESSAGE_ID_SETPRIVATEDATA_CHANGINGPARAMS,
            D3D11_MESSAGE_ID_DEVICE_DRAW_VERTEX_BUFFER_NOT_SET, // <--- Your message here!
            // Add more message IDs here as needed
        };
        D3D11_INFO_QUEUE_FILTER filter = {};
        filter.DenyList.NumIDs = _countof(hide);
        filter.DenyList.pIDList = hide;
        d3dInfoQueue->AddStorageFilterEntries( &filter );
    }
}
In addition to suppressing 'noise' messages, in debug builds this also causes the debug layer to trigger a breakpoint if you do hit an ERROR or CORRUPTION message, as those really need to be fixed.
See Direct3D SDK Debug Layer Tricks
Note I'm using ComPtr here to simplify the QueryInterface chain, and I assume you are keeping your device as a ComPtr<ID3D11Device> device as I do in Anatomy of Direct3D 11 Create Device
I also assume you are using VS 2013 or later so that D3D11_INFO_QUEUE_FILTER filter = {}; is sufficient to zero-fill the structure.

Need help understanding Tango's functions related to coordinate systems

I am confused by the parameters of the functions related to coordinate systems, for example:
TangoSupport_getMatrixTransformAtTime(double timestamp,
TangoCoordinateFrameType base_frame,
TangoCoordinateFrameType target_frame,
TangoSupportEngineType base_engine,
TangoSupportEngineType target_engine,
TangoSupportDisplayRotation display_rotation_type,
TangoMatrixTransformData *matrix_transform)
(1) Base_engine: If I choose COORDINATE_FRAME_START_OF_SERVICE as base_frame, then, as described in the documentation, the coordinate system will be "Right Hand Local Level". What, then, is the purpose of the base_engine parameter? Is it meaningful here to choose something other than TANGO_SUPPORT_ENGINE_TANGO?
(2) Target_engine: I choose COORDINATE_FRAME_START_OF_SERVICE as base_frame, DEVICE as the target, and OPENGL for base_engine. Whatever value I then choose for target_engine, the result is always the same.
(1) Base_engine: If I choose COORDINATE_FRAME_START_OF_SERVICE as base_frame, then, as described in the documentation, the coordinate system will be "Right Hand Local Level". What, then, is the purpose of the base_engine parameter? Is it meaningful here to choose something other than TANGO_SUPPORT_ENGINE_TANGO?
This really depends on your use case. It is rare to use a Tango coordinate frame as the base frame, unless you have another set of transformations that maps start of service to your local origin.
Let's say you make a query like this: TangoSupport_getMatrixTransformAtTime(0.0, START_SERVICE, DEVICE, TANGO, TANGO,...); it is equivalent to a TangoService_getPoseAtTime query with the start-of-service and device frame pair.
A more common case is that you want to transform something (e.g. a depth point) into your local origin (e.g. the OpenGL origin) for rendering. What you would do is: TangoSupport_getMatrixTransformAtTime(0.0, START_SERVICE, DEPTH, OPENGL, TANGO,...);. The result of this call is opengl_T_depth_camera, which you can then multiply with the depth point returned from the depth camera: P_opengl = opengl_T_depth_camera * P_depth_camera;. P_opengl is the point you can render directly in OpenGL.
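The multiplication P_opengl = opengl_T_depth_camera * P_depth_camera can be sketched in plain C++ as follows. This assumes the 16 matrix floats are stored in column-major order, which should be checked against the TangoMatrixTransformData documentation. Mat4, Vec3, and transformPoint are hypothetical names used only for illustration.

```cpp
#include <array>

// Assumption: 16 floats in column-major order, so the element at
// (row, col) is stored at m[col * 4 + row].
using Mat4 = std::array<float, 16>;
using Vec3 = std::array<float, 3>;

// Computes m * (x, y, z, 1) and returns the first three components,
// i.e. a rigid transform of a point expressed in homogeneous coordinates.
Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    Vec3 out{};
    for (int row = 0; row < 3; ++row) {
        out[row] = m[0 * 4 + row] * p[0] +
                   m[1 * 4 + row] * p[1] +
                   m[2 * 4 + row] * p[2] +
                   m[3 * 4 + row];  // translation column, multiplied by w = 1
    }
    return out;
}
```

Each point of the depth cloud would go through transformPoint once, with the matrix copied out of the structure filled in by the support call.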
(2) Target_engine: I choose COORDINATE_FRAME_START_OF_SERVICE as base_frame, DEVICE as the target, and OPENGL for base_engine. Whatever value I then choose for target_engine, the result is always the same.
This should be true for OPENGL and TANGO. By a happy coincidence, the OpenGL coordinate convention is the same as the device frame convention, so if you pass TANGO or OPENGL as target_engine, the result will be the same. But if you pass UNITY as the target engine type, the result will be different.

GL_EXT_packed_pixels vs GL_APPLE_packed_pixels

My application checks for the GL_EXT_packed_pixels extension before using packed pixel formats such as UNSIGNED_INT_8_8_8_8_EXT. On my MacBook, my code can't find this extension, even though packed pixel formats still appear to work.
OpenGL Extension Viewer seems to suggest that it has a special name on OS X:
What's the difference? Should I just check for either GL_EXT_packed_pixels or GL_APPLE_packed_pixels when assessing if UNSIGNED_INT_8_8_8_8_EXT is supported?
EXT_packed_pixels has these definitions:
UNSIGNED_BYTE_3_3_2_EXT 0x8032
UNSIGNED_SHORT_4_4_4_4_EXT 0x8033
UNSIGNED_SHORT_5_5_5_1_EXT 0x8034
UNSIGNED_INT_8_8_8_8_EXT 0x8035
UNSIGNED_INT_10_10_10_2_EXT 0x8036
While APPLE_packed_pixels has these:
UNSIGNED_BYTE_3_3_2 0x8032
UNSIGNED_BYTE_2_3_3_REV 0x8362
UNSIGNED_SHORT_5_6_5 0x8363
UNSIGNED_SHORT_5_6_5_REV 0x8364
UNSIGNED_SHORT_4_4_4_4 0x8033
UNSIGNED_SHORT_4_4_4_4_REV 0x8365
UNSIGNED_SHORT_5_5_5_1 0x8034
UNSIGNED_SHORT_1_5_5_5_REV 0x8366
UNSIGNED_INT_8_8_8_8 0x8035
UNSIGNED_INT_8_8_8_8_REV 0x8367
UNSIGNED_INT_10_10_10_2 0x8036
UNSIGNED_INT_2_10_10_10_REV 0x8368
Comparing the two, EXT_packed_pixels is a subset of APPLE_packed_pixels, and the shared values are the same. Therefore, if APPLE_packed_pixels is supported, you can safely use all definitions from EXT_packed_pixels.
As your screen shot of the extension viewer already suggests, GL_EXT_packed_pixels has been core functionality since OpenGL 1.2. So in most cases, you should not have to test for any of these in the extension string. If you check the version first, and it's at least 1.2, you already know that the functionality is available. The test logic could look like this:
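/* glGetString returns const GLubyte *, so cast before using the C string functions. */
if (strcmp((const char *)glGetString(GL_VERSION), "1.2") >= 0 ||
    strstr((const char *)glGetString(GL_EXTENSIONS), "_packed_pixels") != NULL)
{
    // supported
}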
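Because strcmp compares the version string lexicographically, a variant that parses the major and minor numbers may be easier to reason about. The sketch below takes the two strings as parameters so it can be tried without a GL context; packedPixelsSupported is a hypothetical helper name, and it assumes a desktop OpenGL version string that starts with "<major>.<minor>".

```cpp
#include <cstdio>
#include <cstring>

// Returns true if packed pixel formats can be used, given the strings a
// desktop OpenGL context reports for GL_VERSION and GL_EXTENSIONS.
bool packedPixelsSupported(const char* version, const char* extensions) {
    int major = 0;
    int minor = 0;
    // Parse the leading "<major>.<minor>" and compare numerically.
    if (version != nullptr &&
        std::sscanf(version, "%d.%d", &major, &minor) == 2) {
        if (major > 1 || (major == 1 && minor >= 2)) {
            return true;  // core functionality since OpenGL 1.2
        }
    }
    // Fall back to the extension string; the suffix matches both the
    // EXT and the APPLE variant of the extension name.
    return extensions != nullptr &&
           std::strstr(extensions, "_packed_pixels") != nullptr;
}
```

In an application, the two arguments would be the results of glGetString(GL_VERSION) and glGetString(GL_EXTENSIONS), cast to const char *.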
