deterministic dithering with libsoxr - resampling

I have an issue with some unit test code which is giving different results for every execution.
I tracked it back to libsoxr (0.1.3) and discovered that it's down to the dithering option:
That is, if soxr_create() is invoked with:
soxr_io_spec_t soxIoSpec = soxr_io_spec(SOXR_INT16_I, SOXR_INT16_I);
soxIoSpec.flags |= SOXR_NO_DITHER;
then the output of soxr_process() is deterministic.
But without adding the SOXR_NO_DITHER flag, the output is slightly different for each execution.
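To illustrate, here is a minimal sketch of creating a resampler with dithering disabled (the rates, channel count and surrounding setup are placeholders, not taken from my actual test):

#include <soxr.h>

soxr_error_t error;
soxr_io_spec_t soxIoSpec = soxr_io_spec(SOXR_INT16_I, SOXR_INT16_I);
soxIoSpec.flags |= SOXR_NO_DITHER;  // disables the time-seeded dither PRNG

// 44.1 kHz -> 48 kHz, 2 channels: placeholder values for illustration
soxr_t resampler = soxr_create(44100.0, 48000.0, 2, &error, &soxIoSpec, NULL, NULL);

// ... identical input fed through soxr_process() now produces identical output on every run ...
soxr_delete(resampler);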
There is another thing about the library which surprises me here.
soxr_oneshot() does not suffer from this problem (the non-determinism).
What is going on here?

Looking into the code, I see that in soxr.c the dither uses a pseudo-random number generator, but the seed is an implementation detail generated from the time by:
p->seed = (unsigned long)time(0) ^ (unsigned long)(size_t)p;
The seed does not seem to be exposed by the library, which prevents you from setting a particular seed to get the same result each time the test is run.
I have suggested a minor enhancement to the API, like the one below, to facilitate this,
though someone with more knowledge of the library may be able to suggest a better way.
In soxr.h add:
typedef unsigned long soxr_seed_t;
// set or retrieve the random seed used by the dithering function
void soxr_setseed(soxr_t resampler, soxr_seed_t new_seed);
soxr_seed_t soxr_getseed(soxr_t resampler);
In soxr.c add:
void soxr_setseed(soxr_t resampler, soxr_seed_t new_seed)
{
  resampler->seed = new_seed;
}

soxr_seed_t soxr_getseed(soxr_t resampler)
{
  return resampler->seed;
}
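With a function like that, the unit test could pin the seed before processing; for example (soxr_setseed() here is the proposed addition above, not part of the released 0.1.3 API, and the create parameters are again placeholders):

soxr_t resampler = soxr_create(44100.0, 48000.0, 2, &error, &soxIoSpec, NULL, NULL);
soxr_setseed(resampler, 12345);  // fixed seed, so the dither noise is repeatable
// ... run soxr_process() as usual; every test execution now sees the same output ...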
As noted above, it still surprises me that soxr_oneshot() does not suffer from this non-determinism: I can't see how the seed is fixed, or SOXR_NO_DITHER set, by its internal call to soxr_create(). I have obviously missed something here which someone with more knowledge of the library may be able to explain.

Related

Parameters for dlib::find_min_bobyqa

I'm working on the C++ version of Matt Zucker's Page dewarping. So far everything works fine, but I have a problem with the optimization. In line 748 of the GitHub repo, Matt uses the optimize function from SciPy. My C++ equivalent is find_min_bobyqa from dlib.net. The code is:
auto f = [&](const column_vector& ppts) { return objective(dstpoints, ppts, keypoint_index); };
dlib::find_min_bobyqa(f,
    params,
    2 * params.nr() + 1,                              // npt - number of interpolation points: x.size() + 2 <= npt && npt <= (x.size()+1)*(x.size()+2)/2
    dlib::uniform_matrix<double>(params.nr(), 1, -2), // lower bound constraint
    dlib::uniform_matrix<double>(params.nr(), 1, 2),  // upper bound constraint
    1,                                                // initial trust region radius
    1e-5,                                             // stopping trust region radius
    4000                                              // max number of objective function evaluations
);
In my concrete example, params is a dlib::column_vector with double values and length = 189. Every element of params is less than 2.0 and greater than -2.0. Function objective() returns a double value, and on its own it works properly because I get the same value as in the Python version. But after running the find_min_bobyqa function I usually get the message:
Terminate called after throwing an instance of 'dlib::bobyqa_failure', return from BOBYQA because the objective function has been called max_f_evals times.
I set max_f_evals to quite a big value to see if it optimizes at all, but it doesn't. I did some tweaking with the parameters but without good results. How should I set the parameters of find_min_bobyqa to get the right solution?
I am very interested in this issue as well. Zucker's work, with very minor tweaks, is ideal for straightening sheet music images, and I was looking for ways to implement it on a mobile platform when I came across your question.
My research so far suggests that BOBYQA is not the equivalent of Powell's method in scipy. BOBYQA is constrained, and the one in scipy is not.
See these links for more information, and a possible way to compile the right supporting library - I would try UOBYQA or NEWUOA.
https://github.com/jacobwilliams/PowellOpt
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#rdd2e1855725e-3
(See the Notes section)
EDIT: see C version here:
https://github.com/emmt/Algorithms/tree/master/newuoa
I wanted to post this as a comment, but I don't have enough points for that.
I am very interested in your progress. If you're willing, please keep me posted.
I finally solved this problem. I used the PRAXIS library, because it doesn't need derivative information and is fast.
I modified the code a little to fit my needs, and now it is a few seconds faster than the original version written in Python.
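Roughly, the glue code looks like the sketch below. This is not my exact code: it assumes the Brent/Burkardt-style C++ translation of PRAXIS (a praxis.hpp exposing double praxis(double t0, double h0, int n, int prin, double x[], double f(double x[], int n))), and the objective/data types are placeholders standing in for the ones in the question:

#include <vector>
#include "praxis.hpp"  // assumed: a C++ translation of Brent's PRAXIS with the signature above

// The question's cost function, assumed here (for illustration only) to take
// std::vector arguments instead of dlib types:
double objective(const std::vector<double>& dstpoints,
                 const std::vector<double>& ppts,
                 const std::vector<int>& keypoint_index);

// PRAXIS takes a plain function pointer, so the extra data the objective needs
// has to be reachable from file scope (or some equivalent).
static const std::vector<double>* g_dstpoints = nullptr;
static const std::vector<int>* g_keypoint_index = nullptr;

static double praxis_objective(double x[], int n)
{
    std::vector<double> ppts(x, x + n);
    return objective(*g_dstpoints, ppts, *g_keypoint_index);
}

double run_praxis(std::vector<double>& params,
                  const std::vector<double>& dstpoints,
                  const std::vector<int>& keypoint_index)
{
    g_dstpoints = &dstpoints;
    g_keypoint_index = &keypoint_index;

    const double t0 = 1e-5;  // tolerance, comparable to the stopping trust region radius above
    const double h0 = 1.0;   // maximum step size, comparable to the initial trust region radius
    const int prin = 0;      // no diagnostic printing

    // minimizes praxis_objective starting from (and overwriting) params
    return praxis(t0, h0, static_cast<int>(params.size()), prin,
                  params.data(), praxis_objective);
}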

how to include text file as string at compile time without adding c++11 string literal prefix and suffix in the text file

I'm aware of many similar questions on this site. I really like the solution mentioned in the following link:
https://stackoverflow.com/a/25021520/884553
With some modification, you can include a text file at compile time, for example:
constexpr const char* s =
#include "file.txt"
BUT to make this work you have to add a raw string literal prefix and suffix to your original file, for example:
R"(
This is the original content,
and I don't want this file to be modified. but i
don't know how to do it.
)";
My question is: is there a way to make this work but not modifying file.txt?
(I know I can use command line tools to make a copy, prepend and append to the copy, and remove the copy after compiling. I'm looking for a more elegant solution than this, hopefully with no need for other tools.)
Here's what I've tried (but not working):
#include <iostream>

int main() {
  constexpr const char* s =
#include "bra.txt"  // R"(
#include "file.txt" // original file without R"( and )";
#include "ket.txt"  // )";
  std::cout << s << "\n";
  return 0;
}
/opt/gcc8/bin/g++ -std=c++1z a.cpp
In file included from a.cpp:5:
bra.txt:1:1: error: unterminated raw string
R"(
^
a.cpp: In function ‘int main()’:
a.cpp:4:27: error: expected primary-expression at end of input
constexpr const char* s =
^
a.cpp:4:27: error: expected ‘}’ at end of input
a.cpp:3:12: note: to match this ‘{’
int main() {
^
No, this cannot be done.
There is a C++2a proposal, std::embed, to allow inclusion of such resources at compile time.
The motivation section of the p1040r1 proposal:
Motivation
Every C and C++ programmer -- at some point -- attempts to #include large chunks of non-C++ data into their code. Of course, #include expects the format of the data to be source code, and thusly the program fails with spectacular lexer errors. Thusly, many different tools and practices were adapted to handle this, as far back as 1995 with the xxd tool. Many industries need such functionality, including (but hardly limited to):
Financial Development
representing coefficients and numeric constants for performance-critical algorithms;
Game Development
assets that do not change at runtime, such as icons, fixed textures and other data
Shader and scripting code;
Embedded Development
storing large chunks of binary, such as firmware, in a well-compressed format
placing data in memory on chips and systems that do not have an operating system or file system;
Application Development
compressed binary blobs representing data
non-C++ script code that is not changed at runtime; and
Server Development
configuration parameters which are known at build-time and are baked in to set limits and give compile-time information to tweak performance under certain loads
SSL/TLS Certificates hard-coded into your executable (requiring a rebuild and potential authorization before deploying new certificates).
In the pursuit of this goal, these tools have proven to have inadequacies and contribute poorly to the C++ development cycle as it continues to scale up for larger and better low-end devices and high-performance machines, bogging developers down with menial build tasks and trying to cover-up disappointing differences between platforms.
MongoDB has been kind enough to share some of their code below. Other companies have had their example code anonymized or simply not included directly out of shame for the things they need to do to support their workflows. The author thanks MongoDB for their courage and their support for std::embed.
The request for some form of #include_string or similar dates back quite a long time, with one of the oldest stack overflow questions asked-and-answered about it dating back nearly 10 years. Predating even that is a plethora of mailing list posts and forum posts asking how to get script code and other things that are not likely to change into the binary.
This paper proposes <embed> to make this process much more efficient, portable, and streamlined. Here’s an example of the ideal:
#include <embed>

int main (int, char*[]) {
    constexpr std::span<const std::byte> fxaa_binary = std::embed( "fxaa.spirv" );

    // assert this is a SPIRV file, compile-time
    static_assert( fxaa_binary[0] == 0x03 && fxaa_binary[1] == 0x02
        && fxaa_binary[2] == 0x23 && fxaa_binary[3] == 0x07
        , "given wrong SPIRV data, check rebuild or check the binaries!" );

    auto context = make_vulkan_context();

    // data kept around and made available for binary
    // to use at runtime
    auto fxaa_shader = make_shader( context, fxaa_binary );

    for (;;) {
        // ...
        // and we're off!
        // ...
    }

    return 0;
}
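Until something like std::embed is available, the usual workaround is to have the build generate a header from the file, for example with the xxd tool mentioned in the motivation above. This does use an extra tool (which you hoped to avoid) and gives you an ordinary byte array rather than a constexpr const char*, but it leaves file.txt untouched; a sketch:

// generated at build time with:  xxd -i file.txt > file_txt.h
// which emits something like:
//   unsigned char file_txt[] = { 0x54, 0x68, 0x69, ... };
//   unsigned int file_txt_len = ...;
#include <iostream>
#include <string>

#include "file_txt.h"

int main() {
    // build a string from the embedded bytes (the array is not null-terminated)
    std::string s(reinterpret_cast<const char*>(file_txt), file_txt_len);
    std::cout << s << "\n";
    return 0;
}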

OpenACCArray swap function

While trying to create an object-oriented OpenACC implementation, I stumbled upon this question.
From there I took the code provided by Mat Colgrove at GTC15 (code available at http://www.pgroup.com/lit/samples/gtc15_S5233.tar).
Since I am interested in how to use objects to manage data with OpenACC, I posted another question.
I was quite impressed by the ease of the OpenACCArray::swap function, so I created a small example to test it (see gist).
First I tried to just swap, hoping that it would be sufficient to swap the pointers on the host, but this ends in a fatal memory error (presumably because the size and capacity members are not updated on the device).
A safer approach, which I assumed would work, is to update the host, swap the arrays, and update the device. This runs but produces wrong results.
I am compiling for NVIDIA accelerators.
Looks like this is my fault since I didn't test the swap routine.
The problem here is that while the code is swapping the data on the host, the device copies of the objects still point to the old arrays. The fix is to re-attach the lists (i.e. set the objects' device pointers to the correct arrays).
void swap(OpenACCArray<type>& x)
{
    type* tmp_list = list;
    int tmp_size = _size;
    int tmp_capacity = _capacity;
    list = x.list;
    _size = x._size;
    _capacity = x._capacity;
    x.list = tmp_list;
    x._size = tmp_size;
    x._capacity = tmp_capacity;
#ifdef _OPENACC
#pragma acc update device(_size,_capacity,x._size,x._capacity)
    acc_attach((void**)&list);
    acc_attach((void**)&x.list);
#endif
}
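For reference, what those acc_attach calls amount to can be sketched with the standard OpenACC runtime routines acc_deviceptr and acc_memcpy_to_device; this is just my conceptual illustration of the re-attach (it assumes the object and the array it points to are already present on the device), not the actual implementation:

#include <openacc.h>

void attach_by_hand(void** host_ptr_addr)
{
    // device address of the array the host pointer currently points to
    void* dev_target = acc_deviceptr(*host_ptr_addr);
    // device address of the pointer member itself, inside the device copy of the object
    void* dev_ptr_slot = acc_deviceptr((void*)host_ptr_addr);
    // overwrite the device copy of the pointer so it refers to the right device array
    acc_memcpy_to_device(dev_ptr_slot, &dev_target, sizeof(void*));
}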
"acc_attach" is a PGI extension that hopefully will be adopted in the OpenACC 3.0 standard.
Thanks for trying things out and let me know if you encounter other issues.
- Mat

seeding default_random_engine?

I'm using Visual Studio 2010, which doesn't support <chrono>, so I can't use it to seed default_random_engine. Thus, I've decided to seed it with rand() as follows:
srand((unsigned int)time(NULL));
std::default_random_engine engine(rand());
std::normal_distribution<double> randn(0.0, 0.3);
instead of the following
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::default_random_engine engine(seed);
std::normal_distribution<double> randn(0.0, 0.3);
I'm getting what I expect with both methods. My question is: are there any considerations I should pay attention to when using rand()? (Note: using <chrono> is not an option for me.)
I recommend grabbing a seed from std::random_device:
std::default_random_engine engine(std::random_device{}());
which should provide you with significantly more entropy than std::time.
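For completeness, a minimal self-contained version of this suggestion, using the same normal_distribution(0.0, 0.3) as in your question:

#include <iostream>
#include <random>

int main() {
    // one non-deterministic value from the OS, used once as the seed
    std::default_random_engine engine(std::random_device{}());
    std::normal_distribution<double> randn(0.0, 0.3);

    for (int i = 0; i < 5; ++i)
        std::cout << randn(engine) << "\n";
}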
According to http://www.cplusplus.com/reference/random/random_device/, they recommend that you don't use std::random_device, as it isn't portable:
Notice that random devices may not always be available to produce
random numbers (and in some systems, they may even never be
available).
On a related page (http://www.cplusplus.com/reference/random/linear_congruential_engine/linear_congruential_engine/), they give the following as an example of creating a seed:
unsigned seed1 = std::chrono::system_clock::now().time_since_epoch().count();

Validation testing: how to validate a UI?

I have been asked to implement validation tests for the JavaScript part of our website. I've been considering using Selenium WebDriver. One of the things we want to test is the UI: check whether it "looks" good (things that must be aligned are aligned, boxes are in the right position).
For the moment, the only option I found was to take a snapshot using Selenium, and either compare it to a test snapshot manually taken beforehand, or check the snapshots manually. The snapshot comparison is not very maintainable, as any change in the layout requires all the test snapshots to be taken again, and the manual check is very time consuming.
Does anyone know of any way (in Selenium or other) to achieve this?
It's not nice, but it can be done to some extent.
For positioning, you can use WebElement's getLocation() (Java doc, but the same method exists in all Selenium bindings). Note that most browsers render slightly differently, so do not expect things to be pixel-perfect when working with older IE. Also, things might be positioned slightly differently when e.g. the first font defined in CSS was not found and an alternative was used. Don't rely heavily on this method. But if you are able to make your tests sane and your environment stable, it will work.
For aligning, I wrote a simple Java method for WebDriver that asserts that an element is visually inside of another element.
There should be no false negatives, but there could be some false positives in the case when the inner element is visually inside, but its (invisible) actual borders "peek out". I haven't bumped into this problem, however, in my real experience, since nice websites behave nicely and don't need such hacks :). Still, it's kinda hackish and Selenium wasn't designed for this type of work, so it might be harder to implement more complex checks.
public static void assertContains(WebElement outerElem, WebElement innerElem) {
    // get borders of outer element
    Point outerLoc = outerElem.getLocation();
    Dimension outerDim = outerElem.getSize();
    int outerLeftX = outerLoc.getX();
    int outerRightX = outerLeftX + outerDim.getWidth();
    int outerTopY = outerLoc.getY();
    int outerBottomY = outerTopY + outerDim.getHeight();

    // get borders of inner element
    Point innerLoc = innerElem.getLocation();
    Dimension innerDim = innerElem.getSize();
    int innerLeftX = innerLoc.getX();
    int innerRightX = innerLeftX + innerDim.getWidth();
    int innerTopY = innerLoc.getY();
    int innerBottomY = innerTopY + innerDim.getHeight();

    // assures the inner borders don't cross the outer borders
    final String errorMsg = "ughh, some error message";
    final boolean contains = (outerLeftX <= innerLeftX)
            && (innerRightX <= outerRightX)
            && (outerTopY <= innerTopY)
            && (innerBottomY <= outerBottomY);
    assertTrue(errorMsg, contains);
}
If you use the term validation in the sense of "test that we have built the right thing", I would say it is nearly impossible to automate this. How will you judge whether it looks pleasing or is easy to use, if not by having some people really use it?
This kind of visual check is also something humans are good at. If you use the website at all while developing it, you will notice quite easily if there is something fishy with the layouts and such.
For functionality, automated tests are a good idea.
