DSP - Converting a sampled signal from real samples to complex samples and vice versa - algorithm

How can I convert a baseband sampled signal from real-valued samples to complex-valued samples (real, imaginary) and vice versa?
My samples are integers, and I'm looking for a fast (but accurate) conversion algorithm.
C++ sample code (real, not complex ;-) would be more than welcome.
Edit: IPP code would be very welcome.
Edit: I'm looking for a method that will convert n real-samples to n/2 complex-samples and vice-versa, without affecting the bandwidth.

Adding zeros as the imaginary part is conceptually the first step in what you want to do. Initially you have a real-only signal that looks like this in the frequency domain:
[r0, r1, r2, r3, ...]
DC +Fs/2
If you stuff it with zeros for the imaginary value, you'll see that you really have both positive and negative frequencies as mirror images:
[r0 + 0i, r1 + 0i, r2 + 0i, r3 + 0i, ...]
/--------~-\ /-~--------\
-Fs/2 DC +Fs/2
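Next, you multiply that signal in the time domain by a complex tone at -Fs/4 (tuning the signal). Your signal will look like
----~-\ /-~--------\ /------
So now, you low-pass filter to keep only the center half of the spectrum (-Fs/4 to +Fs/4), which is what used to be the positive-frequency half of the original signal.
Then you decimate by two, and that half ends up filling the whole band from -Fs/2 to +Fs/2 at the new, halved sample rate.
Which is what you want.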
All of these steps can be performed efficiently in the time domain. If you pay attention to all of the intermediate steps, you'll notice that there are many places where you're multiplying by 0, +1, -1, +i, or -i. Furthermore, the half band low pass filter will have a lot of zeros and some symmetry to exploit. Since you know you're going to decimate by 2, you only have to calculate the samples you intend to keep. If you work through the algebra, you'll find a lot of places to simplify it for a clean and fast implementation.
Ultimately, this is all equivalent to a Hilbert transform, but I think it's much easier to understand when you decompose it into pieces like this.
Converting back to real from complex is similar. You'll stuff it with zeroes for every other sample to undo the decimation. You'll filter the complex signal to remove an alias you just introduced. You'll tune it up by Fs/4, and then throw away the imaginary component. (Sorry, I'm all ascii-arted out... :-)
Note that this conversion is lossy near the boundaries. You'd have to use an infinite length filter to do it perfectly.
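As a rough illustration of the steps above (tune by -Fs/4, half-band low-pass, decimate by 2), here is a minimal C++ sketch. It works on doubles rather than integers, uses a Hamming-windowed sinc as the half-band filter, and the names make_halfband and real_to_complex are made up for this example; none of these choices come from the answer itself. It does not exploit the zero taps or the symmetry, but it does compute only the samples that are kept.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hamming-windowed sinc low-pass with cutoff Fs/4 (an assumed filter design).
// 'taps' must be odd so the filter has a center tap; the DC gain is 1.
std::vector<double> make_halfband(std::size_t taps) {
    assert(taps >= 3 && taps % 2 == 1);
    const double pi = std::acos(-1.0);
    const int mid = static_cast<int>(taps / 2);
    std::vector<double> h(taps);
    double sum = 0.0;
    for (std::size_t k = 0; k < taps; ++k) {
        const int m = static_cast<int>(k) - mid;
        const double ideal = (m == 0) ? 0.5 : std::sin(pi * m / 2.0) / (pi * m);
        const double window = 0.54 - 0.46 * std::cos(2.0 * pi * k / (taps - 1));
        h[k] = ideal * window;
        sum += h[k];
    }
    for (double& v : h) v /= sum;  // normalize to unity gain at DC
    return h;
}

// Converts n real samples into n/2 complex samples.
std::vector<std::complex<double>> real_to_complex(const std::vector<double>& x,
                                                  const std::vector<double>& h) {
    assert(h.size() % 2 == 1);
    // exp(-j*pi/2*n) cycles through 1, -j, -1, +j: the complex tone at -Fs/4.
    static const std::complex<double> tone[4] = {
        {1.0, 0.0}, {0.0, -1.0}, {-1.0, 0.0}, {0.0, 1.0}};
    const std::ptrdiff_t len = static_cast<std::ptrdiff_t>(x.size());
    const std::ptrdiff_t taps = static_cast<std::ptrdiff_t>(h.size());
    const std::ptrdiff_t mid = taps / 2;
    std::vector<std::complex<double>> out;
    out.reserve(x.size() / 2);
    // Stepping n by 2 is the decimation: skipped outputs are never computed.
    for (std::ptrdiff_t n = 0; n + 1 < len; n += 2) {
        std::complex<double> acc(0.0, 0.0);
        for (std::ptrdiff_t k = 0; k < taps; ++k) {
            const std::ptrdiff_t i = n + mid - k;  // centered filter
            if (i < 0 || i >= len) continue;       // treat the edges as zeros
            acc += (h[k] * x[i]) * tone[i & 3];    // tune, then filter
        }
        out.push_back(acc);
    }
    return out;
}
```

A real cosine of amplitude 1 should come out as a single complex tone of amplitude about 0.5, because only one of its two mirror-image halves survives the filter.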

If you want to create a complex vector with a strictly real spectrum, just add an imaginary component of 0.0 to every sample. Depending on your data format, this may be as easy as creating a double length memory array, zeroing it, and copying from every element of the source into every other element of the destination.
If you want to convert a complex vector containing complex data (non-zero imaginary components above your required minimum noise floor) into a real vector, you will need to double your bandwidth in order to not lose information, which may or may not make sense, unless you are modulating, demodulating or filtering the signal.
If you want to produce a complex signal with a one-sided spectrum from a real vector, you can use a Hilbert transform (or filter) to create an imaginary component with the same magnitude spectrum but with the phase shifted by 90 degrees (except for DC). This would probably not be both fast and accurate.

I'm not sure if that is what you're looking for but you might want to check the Hilbert Transform, which can be used to find the analytic representation of real-valued signals, i.e., a signal with the same amount of information but with no negative frequency components.
Such representation is mostly useful in Digital Signal Processing techniques employing Spectral Shifting such as Single Sideband Modulation, an efficient form of Amplitude Modulation (AM) that uses half the bandwidth used by the raw AM.

I don't have enough points to vote zml up yet, but his is clearly the right answer. The Hilbert transform essentially converts your real-valued signal into its more natural domain, where the components of sound are complex "phasors" rather than sine waves. It does this by essentially chopping off half of the Fourier spectrum, which involves a single choice of "helicity" (i.e. cw vs ccw) but allows you to do things like perfectly pitch shift by multiplying by a single phasor. The possibilities are endless, and I hope this complex representation of audio catches on!

Intel Performance Primitives (IPP) has a function which does exactly this.
From their documentation:
The ippsHilbert function computes a complex analytic signal,
which contains the original real signal as its real part and
computed Hilbert transform as its imaginary part.



I'm working on a guitar effects "pedal" using the Nexys A7 board.
For this purpose, I've purchased the I2S2 PMOD and successfully got it up and running using the example code provided by Digilent.
Currently, the design is a "pass-through", meaning that audio comes into the FPGA and goes immediately back out.
I'm wondering what the correct way would be to store the data, perform some DSP on it to create the effects, and then transmit the modified data back to the I2S2 PMOD.
Maybe it's unnecessary to store the data?
Maybe I can pass it through an RTL block that's responsible for applying the effect and then simply transmit the modified data out?
Collated from comments and extended.
For a live performance pedal you don't want to store much data; usually tens of milliseconds or less. Start with something simple: store 50 or 100 ms of data in a ring (read old data, store new data, increment the address modulo the memory size). Output = Newdata = (incoming sample * 0.n + olddata * (1 - 0.n)) for variable n. Very crude reverb or echo.
Yes, ring = ring buffer FIFO. And you'll see my description is a very crude implementation of a ring buffer FIFO.
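To make the ring idea concrete, here is a small software model in C++ (a sketch only; on the FPGA this would be a block RAM plus an address counter). The class name RingEcho and the use of doubles are assumptions made for the example; "mix" plays the role of 0.n above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Crude echo: one shared read/write position in a ring of delayed samples.
class RingEcho {
public:
    RingEcho(std::size_t delay_samples, double mix)
        : buf_(delay_samples, 0.0), pos_(0), mix_(mix) {
        assert(delay_samples > 0);
        assert(mix >= 0.0 && mix <= 1.0);
    }

    double process(double in) {
        const double old = buf_[pos_];                      // read old data
        const double out = in * mix_ + old * (1.0 - mix_);  // Output = Newdata
        buf_[pos_] = out;                                   // store new data
        pos_ = (pos_ + 1) % buf_.size();                    // inc address modulo size
        return out;
    }

private:
    std::vector<double> buf_;
    std::size_t pos_;
    double mix_;
};
```

At a 48 kHz sample rate (an assumed figure), RingEcho(4800, 0.5) would give a 100 ms delay; feeding it a single impulse produces a train of echoes that halve in level each time around the ring.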
Now extend it to separate read and write pointers. Now read and write at different, harmonically related rates ... you have a pitch changer. With glitches when the pointers cross.
Think of ways to hide the glitches, and soon you'll be able to make the crappy noises Autotune adds to most all modern music from that bloody Cher song onwards. (This takes serious DSP : something called interpolating filters is probably the simplest way. Live with the glitches for now)
btw if I'm interested in a distortion effect, can it be accomplished by simply multiplying the incoming data by a constant?
Multiplying by a constant is ... gain.
Multiplying a signal by itself is squaring it ... aka second harmonic distortion or 2HD (which produces components on the octave of each tone in the input).
Multiplying a signal by the 2HD is cubing it ... aka 3HD, producing components a perfect fifth above the octave.
Multiplying the 2HD by the 2HD is the fourth power ... aka 4HD, producing components 2 octaves higher, or a perfect fourth above that fifth.
Multiply the 4HD by the signal to produce 5HD ... and so on to probably the 7th. Also note that these components will decrease dramatically in level; you probably want to add gain beyond 2HD, multiply by 4 (= shift left 2 bits) as a starting point, and increase or decrease as desired.
Now multiply each of these by a variable gain and mix them (mixing is simple addition) to add as many distortion components you want as loud as you want ... don't forget to add in the original signal!
There are other approaches to adding distortion. Try simply saturating all signals above 0.25 to 0.25, and all signals below -0.25 to -0.25, aka clipping. Sounds nasty but mix a bit of this into the above, for a buzz.
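A minimal C++ sketch of the mixing and clipping described above, assuming samples normalized to the range -1..+1. The function names and the gain parameters g2, g3 and gclip are invented for the example; only the 0.25 clip level comes from the text.

```cpp
#include <algorithm>
#include <cassert>

// Clipping: saturate everything outside [-limit, +limit].
double hard_clip(double x, double limit) {
    assert(limit > 0.0);
    return std::max(-limit, std::min(limit, x));
}

// Original signal plus variable amounts of 2HD, 3HD and a clipped copy.
double distort(double x, double g2, double g3, double gclip) {
    const double h2 = x * x;   // signal times itself: second harmonic
    const double h3 = h2 * x;  // 2HD times the signal: third harmonic
    return x + g2 * h2 + g3 * h3 + gclip * hard_clip(x, 0.25);
}
```

In fixed-point hardware the same structure is a couple of multipliers and an adder; the output of the mix still needs its own saturation, since the sum can exceed the input range.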
Learn how to make white noise (pseudo-random numbers, usually from an LFSR).
Multiply this by the input signal, and mix or match with the above, for some fuzz.
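For reference, a 16-bit Fibonacci LFSR can be modelled in a few lines of C++. The tap positions used here (16, 14, 13, 11) are one common maximal-length choice and are not something the answer specifies; the helper names and the "amount" parameter are likewise made up for the sketch.

```cpp
#include <cassert>
#include <cstdint>

// One step of a 16-bit Fibonacci LFSR with taps at bits 16, 14, 13 and 11.
// The state must never be zero, or the register stays stuck at zero.
std::uint16_t lfsr_step(std::uint16_t s) {
    assert(s != 0);
    const std::uint16_t bit = ((s >> 0) ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1u;
    return static_cast<std::uint16_t>((s >> 1) | (bit << 15));
}

// Map the register contents to a noise sample in [-1, +1).
double lfsr_to_noise(std::uint16_t s) {
    return s / 32768.0 - 1.0;
}

// "Fuzz": the input multiplied by noise, mixed back into the input.
double fuzz(double x, std::uint16_t& state, double amount) {
    state = lfsr_step(state);
    return x + amount * x * lfsr_to_noise(state);
}
```

Successive states of a shift register are strongly correlated, so this is only approximately white; taking one bit per step, or several steps per audio sample, is a common refinement.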
Learn digital filtering (low pass, high pass, band pass for EQ) and how to control filters with noise or the input signal, and the world of sound is open to you.

How does choosing between pre and post zero padding of sequences impact results

I'm working on an NLP sequence labelling problem. My data consists of variable length sequences (w_1, w_2, ..., w_k) with corresponding labels (l_1, l_2, ..., l_k) (in this case the task is named entity extraction).
I intend to solve the problem using Recurrent Neural Networks. As the sequences are of variable length I need to pad them (I want batch size >1). I have the option of either pre zero padding them, or post zero padding them. I.e. either I make every sequence (0, 0, ..., w_1, w_2, ..., w_k) or (w_1, w_2, ..., w_k, 0, 0, ..., 0) such that the length of each sequence is the same.
How does the choice between pre- and post padding impact results?
It seems like pre padding is more common, but I can't find an explanation of why it would be better. Due to the nature of RNNs it feels like an arbitrary choice for me, since they share weights across time steps.
Commonly in RNNs, we take the final output or hidden state and use this to make a prediction (or do whatever task we are trying to do).
If we send a bunch of 0's to the RNN before taking the final output (i.e. 'post' padding as you describe), then the hidden state of the network at the final word in the sentence would likely get 'flushed out' to some extent by all the zero inputs that come after this word.
So intuitively, this might be why pre-padding is more popular/effective.
This paper (https://arxiv.org/pdf/1903.07288.pdf) studied the effect of padding types on LSTM and CNN. They found that post-padding achieved substantially lower accuracy (nearly half) compared to pre-padding in LSTMs, although there wasn't a significant difference for CNNs (post-padding was only slightly worse).
A simple/intuitive explanation for RNNs is that post-padding seems to add noise to what has been learned from the sequence through time, and there aren't more timesteps for the RNN to recover from this noise. With pre-padding, however, the RNN is better able to adjust to the added noise of zeros at the beginning as it learns from the sequence through time.
I think more thorough experiments are needed in the community for more detailed mechanistic explanations on how padding affects performance.
I always recommend using pre-padding over post-padding, even for CNNs, unless the problem specifically requires post-padding.
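For clarity, the two padding layouts from the question can be written as one small helper. This is a plain C++ sketch with an invented function name, working on integer token ids; for sequence labelling the label sequence would presumably be padded the same way so that positions stay aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pads 'seq' with 'pad_value' up to 'length'.
// pre = true  -> (0, 0, ..., w_1, ..., w_k)
// pre = false -> (w_1, ..., w_k, 0, ..., 0)
std::vector<int> pad_sequence(const std::vector<int>& seq, std::size_t length,
                              bool pre, int pad_value = 0) {
    assert(seq.size() <= length);
    std::vector<int> out(length, pad_value);
    const std::size_t offset = pre ? length - seq.size() : 0;
    for (std::size_t i = 0; i < seq.size(); ++i) out[offset + i] = seq[i];
    return out;
}
```

With pre-padding the last element of the padded sequence is always the last real token, which is the position whose hidden state the answers above are concerned about.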

What is the difference if I take max(abs(m)) or max(m)?

What is the difference if I take max(abs(m)) or max(m) in Matlab, where m is the speech signal used in pulse coding modulation to find delta?
delta=2.0001*max(abs(m))/L and
Summary of my comments (hopefully more comprehensible):
You have a signed signal (possibly with a small zero bias),
so you should use max(abs(m))
to avoid overflow errors caused by computing the signal magnitude incorrectly.
This is the case even if the signal is symmetrical.
Let's see:
The green area is the audio buffer actually being processed.
The first example shows a buffer that is too small.
In this case you can miss the peaks even with max(abs(m)).
The result is that the detected peak shifts up and down,
so the bit count/quantization step constants are falsely computed too low.
That leads to overflows and signal distortion (glitches in the sound and weird echo-like or underwater sounds).
The second example shows a big enough buffer (at least one whole period of the carrier signal).
In this case max(m) should work for symmetrical signals, but you should add a small margin just to be sure.
Of course, if any zero bias is present then you are screwed (unless you know its value).
The red and blue lines represent the obtained dynamic range (your delta without scaling).
So as you can see, if you use max(abs(m)) then the buffer size can be half of what is needed for the max(m) case (of course only for symmetrical signals).
Magenta is red+blue.
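A small numeric illustration of the difference, written in C++ rather than Matlab and using invented function names. It assumes a mid-rise quantizer with L levels, i.e. L/2 steps on each side of zero, which is what the factor 2 in the delta formula suggests.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// delta = 2.0001 * max(abs(m)) / L
double delta_from_abs_peak(const std::vector<double>& m, int L) {
    assert(!m.empty() && L > 0);
    double peak = 0.0;
    for (double v : m) peak = std::max(peak, std::fabs(v));
    return 2.0001 * peak / L;
}

// Same formula with max(m): the negative peak is ignored.
double delta_from_max(const std::vector<double>& m, int L) {
    assert(!m.empty() && L > 0);
    const double peak = *std::max_element(m.begin(), m.end());
    return 2.0001 * peak / L;
}
```

For m = {0.2, -0.8, 0.5} and L = 16, the max(abs(m)) version gives a step for which the -0.8 sample needs just under 8 steps, so it fits. The max(m) version gives a smaller step, for which -0.8 needs almost 13 steps and overflows the 8 available on the negative side.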

Can we convert every algorithm to fixed point?

I have developed an algorithm in MATLAB using floating-point variables. In my algorithm I perform eigenvalue decomposition, rotation, transformation of matrices, matrix inversion, division, addition and multiplication of matrices several times (so it is a kind of signal processing). I tried to convert it to fixed point but was unable to, because my variables and matrices change their values every time. So it is very difficult for me to handle the overflow problem, as I cannot write any routine to handle the overflow. Can anyone tell me how to handle this problem, or is it not possible to convert the algorithm to fixed point?
I need a concrete reason to justify that I cannot convert my algorithm to fixed point (as it is my master's thesis!).
P.S. This algorithm is developed for the controller of an analog-to-digital converter, which utilizes the statistics of the signal and gives the effective decision threshold. I have just written the mathematical operations.
The answer is YES and NO. It depends on the dynamic range of the processed data:
if you are processing numbers/signals in a specified range then YES,
but if the numbers/signals have a very high dynamic range then NO.
You should use several fixed-point formats for the different stages of signal processing.
For example, an ADC gives you values in an exactly defined range,
so you have to use a fixed-point format that does not lose precision and does not have many unused bits.
After that you apply some filter or whatever, and the range changes,
so you need to get the bounds of the possible number ranges per stage and use the best-suited fixed-point format you have at your disposal.
This means you need some number of fixed-point formats
and also the operations between them.
You can have a fixed number of bits and just change the position of the decimal point...
To be more specific, you need to add the block diagram of your processing pipeline
with the number ranges included
and a list of the operations used.
Matrix operations and integrals/sums are tricky because they can change the dynamic range considerably.
The real question always remains whether such an implementation is faster than floating point...
because sometimes the transitions between different fixed-point stages can be slower than a direct floating-point implementation...
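To show what "a fixed number of bits with a movable point" can look like, here is a hedged C++ sketch of a multiply between two fixed-point formats. The function name fx_mul, the 32-bit container and the choice to saturate rather than wrap on overflow are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Multiplies a (with a_frac fractional bits) by b (with b_frac fractional
// bits) and returns the product with out_frac fractional bits, rounded to
// nearest and saturated to the 32-bit range.
std::int32_t fx_mul(std::int32_t a, int a_frac,
                    std::int32_t b, int b_frac, int out_frac) {
    const int shift = a_frac + b_frac - out_frac;
    assert(shift >= 0 && shift < 32);
    // The raw product has a_frac + b_frac fractional bits.
    std::int64_t p = static_cast<std::int64_t>(a) * b;
    // Move the point. Note: >> on a negative value is an arithmetic shift on
    // mainstream compilers, but is only guaranteed to be one from C++20 on.
    if (shift > 0) p = (p + (std::int64_t{1} << (shift - 1))) >> shift;
    const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
    const std::int64_t lo = std::numeric_limits<std::int32_t>::min();
    if (p > hi) p = hi;  // saturate instead of wrapping around
    if (p < lo) p = lo;
    return static_cast<std::int32_t>(p);
}
```

For example, 3.0 with 8 fractional bits (768) times 0.5 with 15 fractional bits (16384), requested with 12 fractional bits, gives 6144, which is 1.5. Choosing out_frac per stage from the known value bounds is exactly the bookkeeping described above.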

Ising 2D Optimization

I have implemented a MC-Simulation of the 2D Ising model in C99.
Compiling with gcc 4.8.2 on Scientific Linux 6.5.
When I scale up the grid the simulation time increases, as expected.
The implementation simply uses the Metropolis–Hastings algorithm.
I tried to find a way to speed up the algorithm, but I don't have any good ideas.
Are there any tricks for doing so?
As jimifiki wrote, try to do a profiling session.
In order to improve on the algorithmic side only, you could try the following:
Lookup Table:
When calculating the energy difference for the Metropolis criterion you need to evaluate the exponential exp[-K / T * dE], where K is your scaling constant (in units of Boltzmann's constant) and dE is the energy difference between the original state and the one after a spin flip.
Calculating exponentials is expensive.
So you simply build a table beforehand in which to look up the possible values for dE. There are 2^4 = 16 possible configurations of the four nearest neighbours (four choose zero plus four choose one plus four choose two plus four choose three plus four choose four), but if you exploit the problem's symmetry you get only five values for dE: 8, 4, 0, -4, -8. Instead of calling the exp function, use the precalculated table.
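A sketch of such a table, written here in C++ (the same idea carries over to C99 directly). It assumes spins of +1/-1 and a coupling of 1, so that flipping spin s with neighbour sum nb costs dE = 2 * s * nb; the function names are invented for the example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Precompute exp(-K/T * dE) for dE = -8, -4, 0, 4, 8 (index = (dE + 8) / 4).
std::array<double, 5> make_boltzmann_table(double k_over_t) {
    std::array<double, 5> table{};
    for (int i = 0; i < 5; ++i) {
        const int dE = 4 * i - 8;
        table[i] = std::exp(-k_over_t * dE);
    }
    return table;
}

// Boltzmann factor for flipping spin s (+1 or -1) whose four nearest
// neighbours sum to nb_sum (-4, -2, 0, 2 or 4). Values >= 1 mean "always accept".
double flip_weight(const std::array<double, 5>& table, int s, int nb_sum) {
    assert(s == 1 || s == -1);
    assert(nb_sum >= -4 && nb_sum <= 4 && nb_sum % 2 == 0);
    const int dE = 2 * s * nb_sum;
    return table[(dE + 8) / 4];
}
```

The table only has to be rebuilt when the temperature changes, so inside the Monte Carlo loop the exp call is replaced by one array read.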
As mentioned before, it is possible to parallelize the algorithm. To preserve physical correctness, you have to use a so-called checkerboard concept. Consider the two-dimensional grid as a checkerboard and compute all the white cells in parallel at once, then the black ones. The reason should be clear when you consider the nearest-neighbour interaction, which introduces dependencies between the values.
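The update order of the checkerboard scheme can be sketched as follows. This C++ version is deliberately sequential and uses a flat array with periodic boundaries, +1/-1 spins and a coupling of 1, all of which are assumptions for the example, as is the function name. Within one colour no cell reads another cell of the same colour, so the two inner loops are the part that could be handed to threads or a GPU (each worker would then need its own random number generator).

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// One Metropolis sweep over an L x L lattice, white cells first, then black.
// L must be even so the colouring stays consistent across the periodic edges.
void checkerboard_sweep(std::vector<int>& spin, int L, double beta,
                        std::mt19937& rng) {
    assert(L >= 2 && L % 2 == 0);
    assert(spin.size() == static_cast<std::size_t>(L) * L);
    std::array<double, 5> table{};  // exp(-beta * dE) for dE = -8 .. 8
    for (int i = 0; i < 5; ++i) table[i] = std::exp(-beta * (4 * i - 8));
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    for (int color = 0; color < 2; ++color) {
        for (int y = 0; y < L; ++y) {
            for (int x = (y + color) % 2; x < L; x += 2) {
                const int idx = y * L + x;
                // All four neighbours have the other colour.
                const int nb = spin[y * L + (x + 1) % L]
                             + spin[y * L + (x + L - 1) % L]
                             + spin[((y + 1) % L) * L + x]
                             + spin[((y + L - 1) % L) * L + x];
                const int dE = 2 * spin[idx] * nb;
                if (dE <= 0 || uniform(rng) < table[(dE + 8) / 4]) {
                    spin[idx] = -spin[idx];
                }
            }
        }
    }
}
```

Two quick sanity checks: at beta = 0 every proposed flip is accepted, and at large beta an aligned lattice stays aligned.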
You can also implement the simulation on a GPGPU, e.g. using CUDA, if you're already working in C99.
Some tips:
- Don't forget to align C99 structs properly.
- Use linear arrays, not nested ones. Aligned memory is normally faster to access, if done properly.
- Try to let the compiler do loop unrolling, etc. (special gcc options, not enabled by default at -O2)
Some more information:
If you are looking for an efficient method to calculate the critical point of the system, the method of choice would be finite-size scaling, where you simulate at different system sizes and different temperatures, then calculate a quantity which is system-size independent at the critical point and therefore gives an intersection point of the corresponding curves (please see the theory for a detailed explanation).
I hope I was helpful.
It's normal that your simulation times scale at least with the square of the size, isn't it?
Here are some suggestions:
If you are concerned with thermalization issues, try to use parallel tempering. It can be of help.
The Metropolis-Hastings algorithm can be made parallel. You could try to do it.
Check that you are not pessimizing the code.
Are your spin arrays made of ints? You could pack many spins into the same int. It's a lot of work.
Moreover, remember what Donald taught us:
premature optimisation is the root of all evil
Before optimising you should first understand where your program is slow. This is called profiling.
