How to create a 10-bit EGLImage? How to use 10-bit YUV in OpenGL? - opengl-es

I can create an 8-bit YUV EGLImage and use it normally, but the NDK doesn't seem to support 10-bit P010: I can't find AHARDWAREBUFFER_FORMAT_YCbCr_P010 defined in the latest NDK version.
There is also no description of AHARDWAREBUFFER_FORMAT_YCbCr_P010 in the AHardwareBuffer documentation.
Maybe the current device doesn't support 10-bit? I did find this test name, though:
TEST(AHardwareBufferTest, PlanarLockAndUnlockYuvP010Succeed)
How can I use 10-bit YUV P010 data with an EGLImage and OpenGL?
I did some research, but nothing good came of it.
Below is my code for creating the image; comments and pointers in the right direction are welcome.
int ret = 0;
AHardwareBuffer_Desc desc { };
{
    desc.width = (uint32_t)m_nWidth;
    desc.height = (uint32_t)m_nHeight;
    desc.usage = AHARDWAREBUFFER_USAGE_CPU_READ_OFTEN | AHARDWAREBUFFER_USAGE_CPU_WRITE_NEVER
               | AHARDWAREBUFFER_USAGE_GPU_COLOR_OUTPUT | AHARDWAREBUFFER_USAGE_GPU_SAMPLED_IMAGE;
    /**
     * YUV P010 format.
     * Must have an even width and height. Can be accessed in OpenGL
     * shaders through an external sampler. Does not support mip-maps,
     * cube-maps or multi-layered textures.
     */
    // AHARDWAREBUFFER_FORMAT_YCbCr_P010 = 0x36,
    desc.format = AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420;
    desc.layers = 1;
    ret = AHardwareBuffer_isSupported(&desc);
    printf("AHardwareBuffer_isSupported: %d", ret);
}
AHardwareBuffer *dstHBuf = nullptr;
{
    ret = AHardwareBuffer_allocate(&desc, &dstHBuf);
    AHardwareBuffer_describe(dstHBuf, &desc);
}
auto eglDisplay = eglGetDisplay(EGL_DEFAULT_DISPLAY); // sCurEnv.GetDisplay();
EGLClientBuffer dst_cbuf = eglGetNativeClientBufferANDROID(dstHBuf);
EGLint attrs[] = { EGL_IMAGE_PRESERVED_KHR, EGL_TRUE, EGL_NONE };
EGLImageKHR egl_image = eglCreateImageKHR(eglDisplay, EGL_NO_CONTEXT,
                                          EGL_NATIVE_BUFFER_ANDROID, dst_cbuf, attrs);
if (egl_image == EGL_NO_IMAGE_KHR) {
    EGLint error = eglGetError();
    printf("error creating EGLImage: %#x", error);
}
GLuint nSrcTexture = 0;
glGenTextures(1, &nSrcTexture);
glBindTexture(GL_TEXTURE_EXTERNAL_OES, nSrcTexture);
glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_EXTERNAL_OES, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glBindTexture(GL_TEXTURE_EXTERNAL_OES, nSrcTexture);
glEGLImageTargetTexture2DOES(GL_TEXTURE_EXTERNAL_OES, egl_image);

How can I use 10-bit YUV P010 data in OpenGL? Two approaches I'm considering:
Use GL_RGBA16UI and read the 10-bit data as 16-bit values.
Convert P010 to RGB10A2 and then use GL_RGB10_A2.
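Before that, it is probably worth checking whether the device can allocate P010 at all, even if the installed NDK header lacks the constant. Here is a minimal sketch under that assumption; kFormatYCbCrP010 and deviceSupportsP010 are my own illustrative names, and whether allocation and the external-sampler path actually work depends on the device's gralloc and EGL drivers.
#include <android/hardware_buffer.h>
#include <cstdio>

// AHARDWAREBUFFER_FORMAT_YCbCr_P010 as defined in newer platform headers (0x36, per the
// comment in the snippet above); declared locally because the installed NDK header may lack it.
constexpr uint32_t kFormatYCbCrP010 = 0x36;

// Asks gralloc whether a GPU-sampleable P010 buffer of the given size can be allocated.
static bool deviceSupportsP010(uint32_t width, uint32_t height)
{
    AHardwareBuffer_Desc desc{};
    desc.width  = width;   // P010 requires an even width and height
    desc.height = height;
    desc.layers = 1;
    desc.format = kFormatYCbCrP010;
    desc.usage  = AHARDWAREBUFFER_USAGE_CPU_WRITE_OFTEN |
                  AHARDWAREBUFFER_USAGE_GPU_SAMPLED_IMAGE;
    const int supported = AHardwareBuffer_isSupported(&desc);
    printf("P010 AHardwareBuffer supported: %d\n", supported);
    return supported != 0;
}
If this reports support, the rest of the pipeline (eglGetNativeClientBufferANDROID, eglCreateImageKHR, glEGLImageTargetTexture2DOES with a samplerExternalOES) can presumably stay the same as in the 8-bit code above.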

Related

OpenGL drops performance when writing to nonzero FBO attachment on AMD

I noticed that my 3D engine runs very slowly on AMD hardware. After some investigation, the slow code boiled down to creating an FBO with several attachments and writing to any nonzero attachment. In all tests I compared AMD performance against the same AMD GPU writing to the unaffected GL_COLOR_ATTACHMENT0, and against Nvidia hardware whose performance relative to my AMD device is well known.
Writing fragments to nonzero attachments is 2-3 times slower than expected.
This code is equivalent to how I create a framebuffer and measure performance in my test apps:
// Create a framebuffer
static const auto attachmentCount = 6;
GLuint fb, att[attachmentCount];
glGenTextures(attachmentCount, att);
glGenFramebuffers(1, &fb);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fb);
for (auto i = 0; i < attachmentCount; ++i) {
    glBindTexture(GL_TEXTURE_2D, att[i]);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i, GL_TEXTURE_2D, att[i], 0);
}
GLuint dbs[] = {
    GL_NONE,
    GL_COLOR_ATTACHMENT1,
    GL_NONE,
    GL_NONE,
    GL_NONE,
    GL_NONE};
glDrawBuffers(attachmentCount, dbs);
// Main loop
while (shouldWork) {
    glClear(GL_COLOR_BUFFER_BIT);
    for (int i = 0; i < 100; ++i) glDrawArrays(GL_TRIANGLES, 0, 6);
    glfwSwapBuffers(window);
    glfwPollEvents();
    showFps();
}
Is anything wrong with it?
Fully reproducible minimal tests can be found here. I tried many other writing patterns and OpenGL states and described some of them on the AMD Community forum.
I suppose the problem is in AMD's OpenGL driver, but if it's not, or if you have faced the same problem and found a workaround (a vendor extension?), please share.
UPD: moving the problem details here.
I prepared a minimal test pack in which the application creates an FBO with six RGBA UNSIGNED_BYTE attachments and renders 100 fullscreen rects per frame to it. There are four executables with four writing patterns (a sketch of the two routing setups follows this list):
1. Writing shader output 0 to attachment 0. Only output 0 is routed to the framebuffer with glDrawBuffers; all other outputs are set to GL_NONE.
2. Same as 1, but with output and attachment 1.
3. Writing output 0 to attachment 0, but all six shader outputs are routed to attachments 0..5 respectively, and all draw buffers except 0 are masked with glColorMaski.
4. Same as 3, but for attachment 1.
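To make the routing difference concrete, here is a minimal sketch of how patterns 1 and 3 set up the draw buffers (this assumes the six attachments from the snippet above are already attached; dbs0 and dbsAll are illustrative names, not the actual test code):
// Pattern 1: route only shader output 0 to attachment 0; all other outputs go nowhere.
const GLenum dbs0[6] = { GL_COLOR_ATTACHMENT0, GL_NONE, GL_NONE, GL_NONE, GL_NONE, GL_NONE };
glDrawBuffers(6, dbs0);

// Pattern 3: route all six shader outputs to attachments 0..5, then mask color writes
// on every draw buffer except 0 with glColorMaski.
const GLenum dbsAll[6] = {
    GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1, GL_COLOR_ATTACHMENT2,
    GL_COLOR_ATTACHMENT3, GL_COLOR_ATTACHMENT4, GL_COLOR_ATTACHMENT5 };
glDrawBuffers(6, dbsAll);
for (GLuint i = 1; i < 6; ++i)
    glColorMaski(i, GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);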
I ran all tests on two machines with nearly identical CPUs and the following GPUs:
AMD Radeon RX550, driver version 19.30.01.16
Nvidia GeForce GTX 650 Ti, which is roughly half as powerful as the RX550
and got these results:
Geforce GTX 650 Ti:
attachment0: 195 FPS
attachment1: 195 FPS
attachment0 masked: 195 FPS
attachment1 masked: 235 FPS
Radeon RX550:
attachment0: 350 FPS
attachment1: 185 FPS
attachment0 masked: 330 FPS
attachment1 masked: 175 FPS
Pre-built test executables are attached to the post or can be downloaded from Google Drive.
Test sources (with an MSVS-friendly CMake build system) are available here on GitHub.
All four programs show a black window and a console with an FPS counter.
We see that when writing to a nonzero attachment, the AMD GPU is much slower than the less powerful Nvidia GPU, and much slower than itself writing to attachment 0. Global masking of a draw buffer's output also costs some FPS.
I also tried using renderbuffers instead of textures, other image formats (the formats in the tests are the most compatible ones), and rendering to a power-of-two-sized framebuffer. The results were the same.
Explicitly turning off scissor, stencil and depth tests does not help.
If I decrease the number of attachments, or reduce framebuffer coverage by multiplying the vertex coords by a value less than 1, test performance increases proportionally, until finally the RX550 outperforms the GTX 650 Ti.
glClear calls are also affected, and their performance under various conditions fits the above observations.
A teammate ran the tests on a Radeon HD 3000 under Linux, both natively and in Wine. Both runs showed the same huge difference between the attachment0 and attachment1 tests. I can't say exactly which driver version he has, but it's the one provided by the Ubuntu 19.04 repos.
Another teammate tried the tests on a Radeon RX590 and got the same 2x difference.
Finally, let me copy-paste two almost equivalent test examples here. This one works fast:
#include <iostream>
#include <cassert>
#include <string>
#include <sstream>
#include <chrono>
#include "GL/glew.h"
#include "GLFW/glfw3.h"
#include <vector>
static std::string getErrorDescr(const GLenum errCode)
{
// English descriptions are from
// https://www.opengl.org/sdk/docs/man/docbook4/xhtml/glGetError.xml
switch (errCode) {
case GL_NO_ERROR: return "No error has been recorded. THIS message is the error itself.";
case GL_INVALID_ENUM: return "An unacceptable value is specified for an enumerated argument.";
case GL_INVALID_VALUE: return "A numeric argument is out of range.";
case GL_INVALID_OPERATION: return "The specified operation is not allowed in the current state.";
case GL_INVALID_FRAMEBUFFER_OPERATION: return "The framebuffer object is not complete.";
case GL_OUT_OF_MEMORY: return "There is not enough memory left to execute the command.";
case GL_STACK_UNDERFLOW: return "An attempt has been made to perform an operation that would cause an internal stack to underflow.";
case GL_STACK_OVERFLOW: return "An attempt has been made to perform an operation that would cause an internal stack to overflow.";
default:;
}
return "No description available.";
}
static std::string getErrorMessage()
{
const GLenum error = glGetError();
if (GL_NO_ERROR == error) return "";
std::stringstream ss;
ss << "OpenGL error: " << static_cast<int>(error) << std::endl;
ss << "Error string: ";
ss << getErrorDescr(error);
ss << std::endl;
return ss.str();
}
[[maybe_unused]] static bool error()
{
const auto message = getErrorMessage();
if (message.length() == 0) return false;
std::cerr << message;
return true;
}
static bool compileShader(const GLuint shader, const std::string& source)
{
unsigned int linesCount = 0;
for (const auto c: source) linesCount += static_cast<unsigned int>(c == '\n');
const char** sourceLines = new const char*[linesCount];
int* lengths = new int[linesCount];
int idx = 0;
const char* lineStart = source.data();
int lineLength = 1;
const auto len = source.length();
for (unsigned int i = 0; i < len; ++i) {
if (source[i] == '\n') {
sourceLines[idx] = lineStart;
lengths[idx] = lineLength;
lineLength = 1;
lineStart = source.data() + i + 1;
++idx;
}
else ++lineLength;
}
glShaderSource(shader, linesCount, sourceLines, lengths);
glCompileShader(shader);
GLint logLength;
glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &logLength);
if (logLength > 0) {
auto* const log = new GLchar[logLength + 1];
glGetShaderInfoLog(shader, logLength, nullptr, log);
std::cout << "Log: " << std::endl;
std::cout << log;
delete[] log;
}
GLint compileStatus;
glGetShaderiv(shader, GL_COMPILE_STATUS, &compileStatus);
delete[] sourceLines;
delete[] lengths;
return bool(compileStatus);
}
static GLuint createProgram(const std::string& vertSource, const std::string& fragSource)
{
const auto vs = glCreateShader(GL_VERTEX_SHADER);
if (vs == 0) {
std::cerr << "Error: vertex shader is 0." << std::endl;
return 2;
}
const auto fs = glCreateShader(GL_FRAGMENT_SHADER);
if (fs == 0) {
std::cerr << "Error: fragment shader is 0." << std::endl;
return 2;
}
// Compile shaders
if (!compileShader(vs, vertSource)) {
std::cerr << "Error: could not compile vertex shader." << std::endl;
return 5;
}
if (!compileShader(fs, fragSource)) {
std::cerr << "Error: could not compile fragment shader." << std::endl;
return 5;
}
// Link program
const auto program = glCreateProgram();
if (program == 0) {
std::cerr << "Error: program is 0." << std::endl;
return 2;
}
glAttachShader(program, vs);
glAttachShader(program, fs);
glLinkProgram(program);
// Get log
GLint logLength = 0;
glGetProgramiv(program, GL_INFO_LOG_LENGTH, &logLength);
if (logLength > 0) {
auto* const log = new GLchar[logLength + 1];
glGetProgramInfoLog(program, logLength, nullptr, log);
std::cout << "Log: " << std::endl;
std::cout << log;
delete[] log;
}
GLint linkStatus = 0;
glGetProgramiv(program, GL_LINK_STATUS, &linkStatus);
if (!linkStatus) {
std::cerr << "Error: could not link." << std::endl;
return 2;
}
glDeleteShader(vs);
glDeleteShader(fs);
return program;
}
static const std::string vertSource = R"(
#version 330
layout(location = 0) in vec2 v;
void main()
{
gl_Position = vec4(v, 0.0, 1.0);
}
)";
static const std::string fragSource = R"(
#version 330
layout(location = 0) out vec4 outColor0;
void main()
{
outColor0 = vec4(0.5, 0.5, 0.5, 1.0);
}
)";
int main()
{
// Init
if (!glfwInit()) {
std::cerr << "Error: glfw init failed." << std::endl;
return 3;
}
static const int width = 800;
static const int height= 600;
glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
GLFWwindow* window = nullptr;
window = glfwCreateWindow(width, height, "Shader test", nullptr, nullptr);
if (window == nullptr) {
std::cerr << "Error: window is null." << std::endl;
glfwTerminate();
return 1;
}
glfwMakeContextCurrent(window);
if (glewInit() != GLEW_OK) {
std::cerr << "Error: glew not OK." << std::endl;
glfwTerminate();
return 2;
}
// Shader program
const auto shaderProgram = createProgram(vertSource, fragSource);
glUseProgram(shaderProgram);
// Vertex buffer
GLuint vao;
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);
GLuint buffer;
glGenBuffers(1, &buffer);
glBindBuffer(GL_ARRAY_BUFFER, buffer);
float bufferData[] = {
-1.0f, -1.0f,
1.0f, -1.0f,
1.0f, 1.0f,
-1.0f, -1.0f,
1.0f, 1.0f,
-1.0f, 1.0f
};
glBufferData(GL_ARRAY_BUFFER, std::size(bufferData) * sizeof(float), bufferData, GL_STATIC_DRAW);
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, (GLvoid*)(0));
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
// Framebuffer
GLuint fb, att[6];
glGenTextures(6, att);
glGenFramebuffers(1, &fb);
glBindTexture(GL_TEXTURE_2D, att[0]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[1]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[2]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[3]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[4]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[5]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fb);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, att[0], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, att[1], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT2, GL_TEXTURE_2D, att[2], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT3, GL_TEXTURE_2D, att[3], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT4, GL_TEXTURE_2D, att[4], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT5, GL_TEXTURE_2D, att[5], 0);
GLuint dbs[] = {
GL_COLOR_ATTACHMENT0,
GL_NONE,
GL_NONE,
GL_NONE,
GL_NONE,
GL_NONE};
glDrawBuffers(6, dbs);
if (GL_FRAMEBUFFER_COMPLETE != glCheckFramebufferStatus(GL_DRAW_FRAMEBUFFER)) {
std::cerr << "Error: framebuffer is incomplete." << std::endl;
return 1;
}
if (error()) {
std::cerr << "OpenGL error occured." << std::endl;
return 2;
}
// Fpsmeter
static const uint32_t framesMax = 50;
uint32_t framesCount = 0;
auto start = std::chrono::steady_clock::now();
// Main loop
while (!glfwWindowShouldClose(window)) {
if (glfwGetKey(window, GLFW_KEY_ESCAPE) == GLFW_PRESS) glfwSetWindowShouldClose(window, GLFW_TRUE);
glClear(GL_COLOR_BUFFER_BIT);
for (int i = 0; i < 100; ++i) glDrawArrays(GL_TRIANGLES, 0, 6);
glfwSwapBuffers(window);
glfwPollEvents();
if (++framesCount == framesMax) {
framesCount = 0;
const auto now = std::chrono::steady_clock::now();
const auto duration = now - start;
start = now;
const float secsPerFrame = (std::chrono::duration_cast<std::chrono::microseconds>(duration).count() / 1000000.0f) / framesMax;
std::cout << "FPS: " << 1.0f / secsPerFrame << std::endl;
}
}
// Shutdown
glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindVertexArray(vao);
glUseProgram(0);
glDeleteProgram(shaderProgram);
glDeleteBuffers(1, &buffer);
glDeleteVertexArrays(1, &vao);
glDeleteFramebuffers(1, &fb);
glDeleteTextures(6, att);
glfwMakeContextCurrent(nullptr);
glfwDestroyWindow(window);
glfwTerminate();
return 0;
}
And this one runs just as fast as the first example on Nvidia and Intel GPUs, but 2-3 times slower on AMD GPUs:
#include <iostream>
#include <cassert>
#include <string>
#include <sstream>
#include <chrono>
#include "GL/glew.h"
#include "GLFW/glfw3.h"
#include <vector>
static std::string getErrorDescr(const GLenum errCode)
{
// English descriptions are from
// https://www.opengl.org/sdk/docs/man/docbook4/xhtml/glGetError.xml
switch (errCode) {
case GL_NO_ERROR: return "No error has been recorded. THIS message is the error itself.";
case GL_INVALID_ENUM: return "An unacceptable value is specified for an enumerated argument.";
case GL_INVALID_VALUE: return "A numeric argument is out of range.";
case GL_INVALID_OPERATION: return "The specified operation is not allowed in the current state.";
case GL_INVALID_FRAMEBUFFER_OPERATION: return "The framebuffer object is not complete.";
case GL_OUT_OF_MEMORY: return "There is not enough memory left to execute the command.";
case GL_STACK_UNDERFLOW: return "An attempt has been made to perform an operation that would cause an internal stack to underflow.";
case GL_STACK_OVERFLOW: return "An attempt has been made to perform an operation that would cause an internal stack to overflow.";
default:;
}
return "No description available.";
}
static std::string getErrorMessage()
{
const GLenum error = glGetError();
if (GL_NO_ERROR == error) return "";
std::stringstream ss;
ss << "OpenGL error: " << static_cast<int>(error) << std::endl;
ss << "Error string: ";
ss << getErrorDescr(error);
ss << std::endl;
return ss.str();
}
[[maybe_unused]] static bool error()
{
const auto message = getErrorMessage();
if (message.length() == 0) return false;
std::cerr << message;
return true;
}
static bool compileShader(const GLuint shader, const std::string& source)
{
unsigned int linesCount = 0;
for (const auto c: source) linesCount += static_cast<unsigned int>(c == '\n');
const char** sourceLines = new const char*[linesCount];
int* lengths = new int[linesCount];
int idx = 0;
const char* lineStart = source.data();
int lineLength = 1;
const auto len = source.length();
for (unsigned int i = 0; i < len; ++i) {
if (source[i] == '\n') {
sourceLines[idx] = lineStart;
lengths[idx] = lineLength;
lineLength = 1;
lineStart = source.data() + i + 1;
++idx;
}
else ++lineLength;
}
glShaderSource(shader, linesCount, sourceLines, lengths);
glCompileShader(shader);
GLint logLength;
glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &logLength);
if (logLength > 0) {
auto* const log = new GLchar[logLength + 1];
glGetShaderInfoLog(shader, logLength, nullptr, log);
std::cout << "Log: " << std::endl;
std::cout << log;
delete[] log;
}
GLint compileStatus;
glGetShaderiv(shader, GL_COMPILE_STATUS, &compileStatus);
delete[] sourceLines;
delete[] lengths;
return bool(compileStatus);
}
static GLuint createProgram(const std::string& vertSource, const std::string& fragSource)
{
const auto vs = glCreateShader(GL_VERTEX_SHADER);
if (vs == 0) {
std::cerr << "Error: vertex shader is 0." << std::endl;
return 2;
}
const auto fs = glCreateShader(GL_FRAGMENT_SHADER);
if (fs == 0) {
std::cerr << "Error: fragment shader is 0." << std::endl;
return 2;
}
// Compile shaders
if (!compileShader(vs, vertSource)) {
std::cerr << "Error: could not compile vertex shader." << std::endl;
return 5;
}
if (!compileShader(fs, fragSource)) {
std::cerr << "Error: could not compile fragment shader." << std::endl;
return 5;
}
// Link program
const auto program = glCreateProgram();
if (program == 0) {
std::cerr << "Error: program is 0." << std::endl;
return 2;
}
glAttachShader(program, vs);
glAttachShader(program, fs);
glLinkProgram(program);
// Get log
GLint logLength = 0;
glGetProgramiv(program, GL_INFO_LOG_LENGTH, &logLength);
if (logLength > 0) {
auto* const log = new GLchar[logLength + 1];
glGetProgramInfoLog(program, logLength, nullptr, log);
std::cout << "Log: " << std::endl;
std::cout << log;
delete[] log;
}
GLint linkStatus = 0;
glGetProgramiv(program, GL_LINK_STATUS, &linkStatus);
if (!linkStatus) {
std::cerr << "Error: could not link." << std::endl;
return 2;
}
glDeleteShader(vs);
glDeleteShader(fs);
return program;
}
static const std::string vertSource = R"(
#version 330
layout(location = 0) in vec2 v;
void main()
{
gl_Position = vec4(v, 0.0, 1.0);
}
)";
static const std::string fragSource = R"(
#version 330
layout(location = 1) out vec4 outColor1;
void main()
{
outColor1 = vec4(0.5, 0.5, 0.5, 1.0);
}
)";
int main()
{
// Init
if (!glfwInit()) {
std::cerr << "Error: glfw init failed." << std::endl;
return 3;
}
static const int width = 800;
static const int height= 600;
glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
GLFWwindow* window = nullptr;
window = glfwCreateWindow(width, height, "Shader test", nullptr, nullptr);
if (window == nullptr) {
std::cerr << "Error: window is null." << std::endl;
glfwTerminate();
return 1;
}
glfwMakeContextCurrent(window);
if (glewInit() != GLEW_OK) {
std::cerr << "Error: glew not OK." << std::endl;
glfwTerminate();
return 2;
}
// Shader program
const auto shaderProgram = createProgram(vertSource, fragSource);
glUseProgram(shaderProgram);
// Vertex buffer
GLuint vao;
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);
GLuint buffer;
glGenBuffers(1, &buffer);
glBindBuffer(GL_ARRAY_BUFFER, buffer);
float bufferData[] = {
-1.0f, -1.0f,
1.0f, -1.0f,
1.0f, 1.0f,
-1.0f, -1.0f,
1.0f, 1.0f,
-1.0f, 1.0f
};
glBufferData(GL_ARRAY_BUFFER, std::size(bufferData) * sizeof(float), bufferData, GL_STATIC_DRAW);
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, (GLvoid*)(0));
glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
// Framebuffer
GLuint fb, att[6];
glGenTextures(6, att);
glGenFramebuffers(1, &fb);
glBindTexture(GL_TEXTURE_2D, att[0]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[1]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[2]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[3]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[4]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindTexture(GL_TEXTURE_2D, att[5]);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fb);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, att[0], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, att[1], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT2, GL_TEXTURE_2D, att[2], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT3, GL_TEXTURE_2D, att[3], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT4, GL_TEXTURE_2D, att[4], 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT5, GL_TEXTURE_2D, att[5], 0);
GLuint dbs[] = {
GL_NONE,
GL_COLOR_ATTACHMENT1,
GL_NONE,
GL_NONE,
GL_NONE,
GL_NONE};
glDrawBuffers(6, dbs);
if (GL_FRAMEBUFFER_COMPLETE != glCheckFramebufferStatus(GL_DRAW_FRAMEBUFFER)) {
std::cerr << "Error: framebuffer is incomplete." << std::endl;
return 1;
}
if (error()) {
std::cerr << "OpenGL error occured." << std::endl;
return 2;
}
// Fpsmeter
static const uint32_t framesMax = 50;
uint32_t framesCount = 0;
auto start = std::chrono::steady_clock::now();
// Main loop
while (!glfwWindowShouldClose(window)) {
if (glfwGetKey(window, GLFW_KEY_ESCAPE) == GLFW_PRESS) glfwSetWindowShouldClose(window, GLFW_TRUE);
glClear(GL_COLOR_BUFFER_BIT);
for (int i = 0; i < 100; ++i) glDrawArrays(GL_TRIANGLES, 0, 6);
glfwSwapBuffers(window);
glfwPollEvents();
if (++framesCount == framesMax) {
framesCount = 0;
const auto now = std::chrono::steady_clock::now();
const auto duration = now - start;
start = now;
const float secsPerFrame = (std::chrono::duration_cast<std::chrono::microseconds>(duration).count() / 1000000.0f) / framesMax;
std::cout << "FPS: " << 1.0f / secsPerFrame << std::endl;
}
}
// Shutdown
glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindVertexArray(vao);
glUseProgram(0);
glDeleteProgram(shaderProgram);
glDeleteBuffers(1, &buffer);
glDeleteVertexArrays(1, &vao);
glDeleteFramebuffers(1, &fb);
glDeleteTextures(6, att);
glfwMakeContextCurrent(nullptr);
glfwDestroyWindow(window);
glfwTerminate();
return 0;
}
The only difference between these examples is the color attachment used.
I deliberately composed two nearly identical copy-pasted programs to avoid any possible side effects of framebuffer deletion and recreation.
UPD2: I also tried an OpenGL 4.6 debug context with my test examples on both Nvidia and AMD, and got no performance warnings.
UPD3: RX470 results:
attachment0: 775 FPS
attachment1: 396 FPS
UPD4: I built the attachment0 and attachment1 tests for WebGL via Emscripten and ran them on the Radeon RX550. The full source is in the problem's GitHub repo; the build command lines are:
emcc --std=c++17 -O3 -s WASM=1 -s USE_GLFW=3 -s USE_WEBGL2=1 ./FillRate_attachment0_webgl.cpp -o attachment0.html
emcc --std=c++17 -O3 -s WASM=1 -s USE_GLFW=3 -s USE_WEBGL2=1 ./FillRate_attachment1_webgl.cpp -o attachment1.html
Both test programs issue a single drawcall: glDrawArraysInstanced(GL_TRIANGLES, 0, 6, 1000);
First test: Firefox with default config, i.e. DirectX-backed ANGLE.
Unmasked Vendor: Google Inc.
Unmasked Renderer: ANGLE (Radeon RX550/550 Series Direct3D11 vs_5_0 ps_5_0)
attachment0: 38 FPS
attachment1: 38 FPS
Second test: Firefox with ANGLE disabled (about:config -> webgl.disable-angle = true), using native OpenGL:
Unmasked Vendor: ATI Technologies Inc.
Unmasked Renderer: Radeon RX550/550 Series
attachment0: 38 FPS
attachment1: 19 FPS
We see that the DirectX path is not affected by the problem, and that the OpenGL issue is reproducible through WebGL. This is an expected result, as gamers and developers have complained only about OpenGL performance.
P.S. My issue is probably the root cause of this and this performance drop.
The problem has been fixed by AMD as of (at least) the December 2019 driver. The fix is confirmed by the above-mentioned test programs and by our game engine's FPS.
See also this thread.
Dear AMD OpenGL driver team, thank you very much!

Puzzles about glTexImage2D using GL_LUMINANCE

I've run into a problem using GL_LUMINANCE to define an FBO. Here is the code I used:
generateRenderToTexture(GL_RGBA, GL_RGBA, GL_UNSIGNED_BYTE, _maskTexture, _imageWidth, _imageHeight, false);
The related code is as follows:
TextureBuffer _maskTexture;

class TextureBuffer {
public:
    GLuint texture;
    GLuint frameBuffer;
    GLenum internalformat;
    GLenum format;
    GLenum type;
    int w, h;
    TextureBuffer() : texture(0), frameBuffer(0) {}
    void release()
    {
        if (texture)
        {
            glDeleteTextures(1, &texture);
            texture = 0;
        }
        if (frameBuffer)
        {
            glDeleteFramebuffers(1, &frameBuffer);
            frameBuffer = 0;
        }
    }
};

void generateRenderToTexture(GLint internalformat, GLenum format, GLenum type,
                             TextureBuffer &tb, int w, int h, bool linearInterp)
{
    glGenTextures(1, &tb.texture);
    glBindTexture(GL_TEXTURE_2D, tb.texture);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, linearInterp ? GL_LINEAR : GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, linearInterp ? GL_LINEAR : GL_NEAREST);
    glTexImage2D(GL_TEXTURE_2D, 0, internalformat, w, h, 0, format, type, NULL);
    glGenFramebuffers(1, &tb.frameBuffer);
    glBindFramebuffer(GL_FRAMEBUFFER, tb.frameBuffer);
    glClear(_glClearBits);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tb.texture, 0);
    GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
    if (status != GL_FRAMEBUFFER_COMPLETE)
        printf("Framebuffer status: %x", (int)status);
    tb.internalformat = internalformat;
    tb.format = format;
    tb.type = type;
    tb.w = w;
    tb.h = h;
}
The question is: when I use
generateRenderToTexture(GL_RGBA, GL_RGBA, GL_UNSIGNED_BYTE, _maskTexture, _imageWidth, _imageHeight, false);
the code works fine. But if I use GL_LUMINANCE instead,
generateRenderToTexture(GL_LUMINANCE, GL_LUMINANCE, GL_UNSIGNED_BYTE, _maskTexture, _imageWidthOriginal,
I don't know why I can't use GL_LUMINANCE to define the FBO. Does anyone have useful suggestions for solving this?
The only formats that are guaranteed to work as color FBO attachments in ES 2.0 are, according to table 4.5 in the spec document:
GL_RGBA4
GL_RGB5_A1
GL_RGB565
Support for rendering to GL_RGBA, which works for you, is not required by the standard. Many implementations support it, though. The OES_rgb8_rgba8 extension adds support for GL_RGB8 and GL_RGBA8 formats as render targets.
GL_LUMINANCE is not supported as a color-renderable format by the standard, and I can't find an extension for it either. It's possible that some implementations could support it, but you certainly can't count on it.
ES 3.0 lists GL_R8 as a color-renderable format. In ES 3.0, the RED/R formats replace the LUMINANCE/ALPHA formats from ES 2.0. So if you can move to ES 3.0, you have support to render to 1-component texture formats.
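For example, here is a minimal sketch (assuming an ES 3.0 context; width and height are placeholders) of creating a single-channel GL_R8 texture and using it as a color render target:
// ES 3.0: GL_R8 is color-renderable, so a 1-component render target is straightforward.
GLuint tex = 0, fbo = 0;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_R8, width, height, 0, GL_RED, GL_UNSIGNED_BYTE, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE)
    printf("GL_R8 framebuffer is incomplete on this implementation\n");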
You're using the non-extension FBO functions, which were introduced only with OpenGL 3. So unlike the FBO extension functions (those with an ARB or EXT suffix), these functions are available only with an OpenGL 3 context. In OpenGL 3 the GL_LUMINANCE and GL_ALPHA texture internal formats are deprecated and not available in the core profile; they have been replaced by the GL_RED texture format. You can use an appropriately written shader or texture swizzle parameters to make a GL_RED texture behave just like GL_LUMINANCE (swizzle dst.rgb = texture.r) or GL_ALPHA (swizzle dst.a = texture.r).
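A small sketch of the swizzle approach (the GL_TEXTURE_SWIZZLE_* parameters are available in desktop GL 3.3+ and in ES 3.0, and apply to the currently bound texture):
// Make a GL_RED texture sample like legacy GL_LUMINANCE:
// .r, .g and .b all read the red channel, .a is forced to 1.
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_R, GL_RED);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_G, GL_RED);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_B, GL_RED);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_A, GL_ONE);

// Or make it sample like legacy GL_ALPHA: rgb = 0, alpha reads the red channel.
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_R, GL_ZERO);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_G, GL_ZERO);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_B, GL_ZERO);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_A, GL_RED);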
I have solved it by using GL_RG_EXT or GL_RED_EXT instead.

OpenGL FBO performance issue

I'm working on a school project and I've come across an issue with my FBO.
The game is rendered in 2 passes:
1) I render to the shadow map texture using an FBO.
2) I render scene normally to the default FBO.
The issue is that, for some reason, binding the FBO during the first pass slows my game down by roughly 200+ fps. I really don't know what could be wrong with the FBO, since it's as barebones as possible.
If I render the shadow map without binding the FBO, i.e. directly to the screen, it's 200 fps faster, so I believe it's not an issue of rendering but an issue of binding the FBO.
Anyways, here's the first pass function.
//renders to the depth buffer texture.
void firstPass()
{
    static GLuint shadowID = Resources::getInstance().shadowShader;
    shadowBuffer->bindForWriting(); // make the active framebuffer the shadow buffer
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glViewport(0, 0, shadowBuffer->TEXTURE_WIDTH, shadowBuffer->TEXTURE_HEIGHT);
    // render to the shadow map from the point of view of the sunlight.
    glUseProgram(shadowID);
    ... set some uniforms
    scene->render(shadowID);
    shadowBuffer->unBindForWriting(); // set frame buffer back to the default
    glUseProgram(0);
}
And here is my FBO class in its entirety.
ShadowMapFBO()
{
    init();
}

void bindForWriting()
{
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, FBO);
}

void unBindForWriting()
{
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
}

void bindForReading(GLenum TextureUnit, GLuint texture)
{
    glActiveTexture(TextureUnit);
    glBindTexture(GL_TEXTURE_2D, texture);
}

void BindTexture(GLuint textureID, GLuint location)
{
    glUniform1i(textureID, location);
}

void unBindForReading()
{
    glBindTexture(GL_TEXTURE_2D, 0);
}

GLuint shadowTexture;
int TEXTURE_WIDTH = 2048;
int TEXTURE_HEIGHT = 2048;
GLuint FBO;

void init()
{
    glGenFramebuffers(1, &FBO);
    glBindFramebuffer(GL_FRAMEBUFFER, FBO);
    shadowTexture = Util::createTexture(TEXTURE_WIDTH, TEXTURE_HEIGHT, true);
    glFramebufferTexture(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, shadowTexture, 0);
    glDrawBuffer(GL_NONE);
    glReadBuffer(GL_NONE);
    GLenum Status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
    if (Status != GL_FRAMEBUFFER_COMPLETE)
        printf("FB error, status: 0x%x\n", Status);
    else
        printf("Shadow Buffer created successfully.\n");
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
}
Here's how I create my depth buffer texture.
static GLuint createTexture(int width, int height)
{
    GLuint textureId;
    glGenTextures(1, &textureId);
    glBindTexture(GL_TEXTURE_2D, textureId);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, width, height, 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP);
    int i;
    i = glGetError();
    if (i != 0)
    {
        std::cout << "Error happened while loading the texture: " << i << std::endl;
    }
    glBindTexture(GL_TEXTURE_2D, 0);
    return textureId;
}

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_BORDER, 0); Throws a GL_INVALID_ENUM?

I am creating a 2D OpenGL texture from an NSImage (Cocoa desktop).
I grabbed some code from the internet and it seems to work.
However, I started to insert calls to glGetError() here and there to track other rendering bugs, and thus I traced a GL_INVALID_ENUM error to the call mentioned above. As per the documentation on glTexParameteri, both GL_TEXTURE_2D and GL_TEXTURE_BORDER seem to be valid parameters...?
Any clue?
This is the code:
- (void) createOpenGLTexture
{
[_openGLContext makeCurrentContext];
if (_textureID != 0) {
glDeleteTextures(1, &_textureID);
_textureID = 0;
}
NSSize imageSize = [_originalImage size];
// .............................................................
// Flip Image Vertically
NSImage* image = [[NSImage alloc] initWithSize:imageSize];
NSRect rect = NSMakeRect(0, 0, imageSize.width, imageSize.height);
[image lockFocus];
NSAffineTransform* transform = [NSAffineTransform transform];
[transform translateXBy:0 yBy:rect.size.height];
[transform scaleXBy:+1.0 yBy:-1.0];
[transform concat];
[_originalImage drawAtPoint:NSZeroPoint
fromRect:rect
operation:NSCompositeCopy
fraction:1];
[image unlockFocus];
// Then we grab the raw bitmap data from the NSBitmapImageRep that is
// the ‘source’ of an NSImage
NSBitmapImageRep* bitmap = [[NSBitmapImageRep alloc] initWithData:[image TIFFRepresentation]];
[image release];
GLenum imageFormat = GL_RGBA;
imageSize = [bitmap size]; // Needed?
long sourceRowBytes = [bitmap bytesPerRow];
// This SHOULD be enough but nope
GLubyte* sourcePic = (GLubyte*) [bitmap bitmapData];
// We have to copy that raw data one row at a time….yay
GLubyte* pic = malloc( imageSize.height * sourceRowBytes );
GLuint i;
GLuint intHeight = (GLuint) (imageSize.height);
for (i = 0; i < imageSize.height; i++) {
memcpy(pic + (i * sourceRowBytes),
sourcePic + (( intHeight - i - 1 )*sourceRowBytes),
sourceRowBytes);
}
[bitmap release];
sourcePic = pic;
// .........................................................................
// Create the texture proper
glEnable(GL_TEXTURE_2D);
checkOpenGLError();
glEnable(GL_COLOR_MATERIAL);
checkOpenGLError();
glGenTextures (1, &_textureID);
checkOpenGLError();
glBindTexture (GL_TEXTURE_2D, _textureID);
checkOpenGLError();
glPixelStorei(GL_UNPACK_ALIGNMENT,1);
checkOpenGLError();
// Here we set the basic properties such as how it looks when resized and if it's bordered
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_BORDER, 0);
checkOpenGLError();
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
checkOpenGLError();
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
checkOpenGLError();
// Not really necessary all the time but we can also define if it can be tiled (repeating)
// Here GL_TEXTURE_WRAP_S = horizontal and GL_TEXTURE_WRAP_T = vertical
// This defaults to GL_REPEAT so we're going to prevent that by using GL_CLAMP
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP);
checkOpenGLError();
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP);
checkOpenGLError();
// And.....CREATE
glTexImage2D(GL_TEXTURE_2D,
0,
imageFormat,
rect.size.width,
rect.size.height,
0,
imageFormat,
GL_UNSIGNED_BYTE,
sourcePic);
checkOpenGLError();
glDisable(GL_TEXTURE_2D);
checkOpenGLError();
glDisable(GL_COLOR_MATERIAL);
checkOpenGLError();
}
GL_TEXTURE_BORDER is not a valid enum for glTexParameter(). Reference: http://www.opengl.org/sdk/docs/man/xhtml/glTexParameter.xml
You cannot (and don't need to) specify the border size this way. GL_TEXTURE_BORDER is only a valid enum for glGetTexLevelParameter(), to retrieve the border size the OpenGL driver assigned to a specific texture level. As far as I am aware this is not used any more, and the query returns 0 on all newer systems.
Don't confuse it with GL_TEXTURE_BORDER_COLOR, which is used to set the color sampled when using the GL_CLAMP_TO_BORDER wrapping mode.
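To make the distinction concrete, here is a small sketch showing each enum in the call where it is actually valid (the border-size query is a legacy/compatibility-profile feature):
// Query the border size of mip level 0 (the legacy glGetTexLevelParameter use of GL_TEXTURE_BORDER).
GLint border = 0;
glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, GL_TEXTURE_BORDER, &border);

// Set the border color used by the GL_CLAMP_TO_BORDER wrap mode
// (the glTexParameter use of GL_TEXTURE_BORDER_COLOR).
const GLfloat borderColor[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
glTexParameterfv(GL_TEXTURE_2D, GL_TEXTURE_BORDER_COLOR, borderColor);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_BORDER);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_BORDER);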
GL_TEXTURE_BORDER isn't a valid value for pname when calling glTexParameteri. The border is specified in the call to glTexImage2D.
http://www.opengl.org/sdk/docs/man/xhtml/glTexParameter.xml

Render YUV video in OpenGL of ffmpeg using CVPixelBufferRef and Shaders

I'm rendering YUV frames from ffmpeg with the iOS 5.0 method "CVOpenGLESTextureCacheCreateTextureFromImage".
I'm following the Apple sample GLCameraRipple.
My result on the iPhone screen is this: iPhone Screen
I need to know what I'm doing wrong.
I'm posting part of my code to help find the errors.
ffmpeg frame configuration:
ctx->p_sws_ctx = sws_getContext(ctx->p_video_ctx->width,
ctx->p_video_ctx->height,
ctx->p_video_ctx->pix_fmt,
ctx->p_video_ctx->width,
ctx->p_video_ctx->height,
PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL);
// Framebuffer for RGB data
ctx->p_frame_buffer = malloc(avpicture_get_size(PIX_FMT_YUV420P,
ctx->p_video_ctx->width,
ctx->p_video_ctx->height));
avpicture_fill((AVPicture*)ctx->p_picture_rgb, ctx->p_frame_buffer,PIX_FMT_YUV420P,
ctx->p_video_ctx->width,
ctx->p_video_ctx->height);
My render method:
if (NULL == videoTextureCache) {
NSLog(#"displayPixelBuffer error");
return;
}
CVPixelBufferRef pixelBuffer;
CVPixelBufferCreateWithBytes(kCFAllocatorDefault, mTexW, mTexH, kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, buffer, mFrameW * 3, NULL, 0, NULL, &pixelBuffer);
CVReturn err;
// Y-plane
glActiveTexture(GL_TEXTURE0);
err = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
videoTextureCache,
pixelBuffer,
NULL,
GL_TEXTURE_2D,
GL_RED_EXT,
mTexW,
mTexH,
GL_RED_EXT,
GL_UNSIGNED_BYTE,
0,
&_lumaTexture);
if (err)
{
NSLog(#"Error at CVOpenGLESTextureCacheCreateTextureFromImage %d", err);
}
glBindTexture(CVOpenGLESTextureGetTarget(_lumaTexture), CVOpenGLESTextureGetName(_lumaTexture));
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
// UV-plane
glActiveTexture(GL_TEXTURE1);
err = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
videoTextureCache,
pixelBuffer,
NULL,
GL_TEXTURE_2D,
GL_RG_EXT,
mTexW/2,
mTexH/2,
GL_RG_EXT,
GL_UNSIGNED_BYTE,
1,
&_chromaTexture);
if (err)
{
NSLog(#"Error at CVOpenGLESTextureCacheCreateTextureFromImage %d", err);
}
glBindTexture(CVOpenGLESTextureGetTarget(_chromaTexture), CVOpenGLESTextureGetName(_chromaTexture));
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glBindFramebuffer(GL_FRAMEBUFFER, defaultFramebuffer);
// Set the view port to the entire view
glViewport(0, 0, backingWidth, backingHeight);
static const GLfloat squareVertices[] = {
1.0f, 1.0f,
-1.0f, 1.0f,
1.0f, -1.0f,
-1.0f, -1.0f,
};
GLfloat textureVertices[] = {
1, 1,
1, 0,
0, 1,
0, 0,
};
// Draw the texture on the screen with OpenGL ES 2
[self renderWithSquareVertices:squareVertices textureVertices:textureVertices];
// Flush the CVOpenGLESTexture cache and release the texture
CVOpenGLESTextureCacheFlush(videoTextureCache, 0);
CVPixelBufferRelease(pixelBuffer);
[moviePlayerDelegate bufferDone];
RenderWithSquareVertices method
- (void)renderWithSquareVertices:(const GLfloat*)squareVertices textureVertices:(const GLfloat*)textureVertices
{
// Use shader program.
glUseProgram(shader.program);
// Update attribute values.
glVertexAttribPointer(ATTRIB_VERTEX, 2, GL_FLOAT, 0, 0, squareVertices);
glEnableVertexAttribArray(ATTRIB_VERTEX);
glVertexAttribPointer(ATTRIB_TEXTUREPOSITON, 2, GL_FLOAT, 0, 0, textureVertices);
glEnableVertexAttribArray(ATTRIB_TEXTUREPOSITON);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
// Present
glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
[context presentRenderbuffer:GL_RENDERBUFFER];
}
My fragment shader:
uniform sampler2D SamplerY;
uniform sampler2D SamplerUV;
varying highp vec2 _texcoord;
void main()
{
mediump vec3 yuv;
lowp vec3 rgb;
yuv.x = texture2D(SamplerY, _texcoord).r;
yuv.yz = texture2D(SamplerUV, _texcoord).rg - vec2(0.5, 0.5);
// BT.601, which is the standard for SDTV is provided as a reference
/* rgb = mat3( 1, 1, 1,
0, -.34413, 1.772,
1.402, -.71414, 0) * yuv;*/
// Using BT.709 which is the standard for HDTV
rgb = mat3( 1, 1, 1,
0, -.18732, 1.8556,
1.57481, -.46813, 0) * yuv;
gl_FragColor = vec4(rgb, 1);
}
Many thanks.
I imagine the issue is that YUV420 (or I420) is a tri-planar image format: I420 is an 8-bit Y plane followed by 8-bit 2x2-subsampled U and V planes. The code from GLCameraRipple expects the NV12 format: an 8-bit Y plane followed by an interleaved U/V plane with 2x2 subsampling. Given this, I expect you will need three textures: luma_tex, u_chroma_tex, v_chroma_tex.
Also note that GLCameraRipple may be expecting 'video range' data. In other words, the values for the planar format are luma=[16,235], chroma=[16,240].
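If you switch to three separate single-channel textures (one per I420 plane), the fragment shader only changes slightly. Here is a sketch under that assumption, written as a C string; SamplerU and SamplerV are hypothetical uniform names, and the video-range scaling mentioned above is not applied:
// Hypothetical three-plane (I420) fragment shader: SamplerU and SamplerV are the two
// separate chroma planes instead of the single interleaved SamplerUV texture above.
static const char *kI420FragmentShader =
    "uniform sampler2D SamplerY;\n"
    "uniform sampler2D SamplerU;\n"
    "uniform sampler2D SamplerV;\n"
    "varying highp vec2 _texcoord;\n"
    "void main()\n"
    "{\n"
    "    mediump vec3 yuv;\n"
    "    yuv.x = texture2D(SamplerY, _texcoord).r;\n"
    "    yuv.y = texture2D(SamplerU, _texcoord).r - 0.5;\n"
    "    yuv.z = texture2D(SamplerV, _texcoord).r - 0.5;\n"
    "    // Same BT.709 matrix as the original shader.\n"
    "    lowp vec3 rgb = mat3(1.0, 1.0, 1.0,\n"
    "                         0.0, -0.18732, 1.8556,\n"
    "                         1.57481, -0.46813, 0.0) * yuv;\n"
    "    gl_FragColor = vec4(rgb, 1.0);\n"
    "}\n";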
