Data to audio and back. Modulation / demodulation with source code

Data to audio and back. Modulation / demodulation with source code - algorithm

I have a stream of binary data and want to convert it to raw waveform sound data, which I can send to the speakers.
This is what the old-school modems did in order to transfer binary data over the phone line (producing the typical modemish sound). It is called modulation.
Then I need a reverse process - from the raw waveform samples, I want to obtain the exact binary data. This is called demodulation.
Any bitrate will work for a start.
The sound is played using computer speakers and sampled using a microphone.
Bandwidth would be quite low (low quality microphone).
There is some background noise but not much.
I found one particular way to do this - Frequency shift keying. The problem is I can't find any source code.
Can you please point me to an implementation of FSK in any language?
Or offer any alternative encoding binary<->sound with available source code?

The simplest modulation scheme would be amplitude modulation (technically for the digital realm this would be called Amplitude Shift Keying). Take a fixed frequency (let's say 10Khz), your "carrier wave", and use the bits in your binary data to turn it on and off. If your data rate is 10 bits per second you will be toggling the 10KHz signal on and off at that rate. The demodulation would be an (optional) 10KHz filter followed by comparing with a threshold. This is a fairly simple scheme to implement. Generally the higher the signal frequency and your available bandwidth, the faster you can switch that signal on and off.
A very cool/fun application here would be to encode/decode as morse code and see how fast you can go.
FSK, shifting between two frequencies is more efficient in bandwidth and more immune to noise but will make the demodulator more complex as you need to distinguish between the two frequencies.
Advanced modulation scheme such as Phase Shift Keying are good at getting the highest bit rate for a given bandwidth and signal to noise ratio but they are more complicated to implement. Analog phone modems needed to deal with certain bandwidth (e.g. as little as 3Khz) and noise limitations. If you need to get the highest possible bitrate given bandwidth and noise limitations then is the way to go.
For actual code samples of advanced modulation schemes I would investigate application notes from DSP vendors (such as TI and Analog Devices) as those were common applications for DSPs.
Implementing a PI/4 Shift D-QPSK Baseband Modem Using the TMS320C50
QPSK modulation demystified
V.34 Transmitter and Receiver Implementation on the TMS320C50 DSP
Another very simple and not so efficient method is to use DTMF. Those are the tones generated by phone keypads where each symbol is a combination of two frequencies. If you Google you'll find a lot of source code. Depending on your application/requirements this may be a simple solution.
Let's dive in to some simple scheme implementation details, something like the morse code I mentioned earlier. We can use "dot" for 0 and "dash' for 1. An advantage of a morse like scheme is that it also solves the framing problem as you can resynchronize your sampling after every space. For simplicity let's pick the "Carrier Wave" frequency at 11KHz and assume your wave output is 44Khz, 16 bit, mono. We'll also use a square wave which will create harmonics but we don't really care. If 11KHz is beyond your microphone's frequency response then just divide all frequencies by 2 e.g. We'll pick some arbitrary level 10000 and so our "on" waveform looks like this:
{10000, 10000, 0, 0, 10000, 10000, 0, 0, 10000, 0, 0, ...} // 4 samples = 11Khz period
and our "off" waveform is just all zeros. I leave the coding of this part as an excersize to the reader.
And so we have something like:
const int dot_samples = 400; // ~10ms - speed up later
const int space_samples = 400; // ~10ms
const int dash_samples = 800; // ~20ms
void encode( uint8_t* source, int length, int16_t* target ) // assumes enough room in target
{
for(int i=0; i<length; i++)
{
for(int j=0; j<8; j++)
{
if((source[i]>>j) & 1) // If data bit is 1 we'll encode a dot
{
generate_on(&target, dash_samples); // Generate ON wave for n samples and update target ptr
}
else // otherwise a dash
{
generate_on(&target, dot_samples); // Generate ON wave for n samples and update target ptr
}
generate_off(&target, space_samples); // Generate zeros
}
}
}
The decoder is a bit more complicated but here's an outline:
Optionally band-pass filter the sampled signal around 11Khz. This will improve performance in a noisy enviornment. FIR filters are pretty simple and there are a few online design applets that will generate the filter for you.
Threshold the signal. Every value above 1/2 maximum amplitude is 1 every value below is 0. This assumes you have sampled the entire signal. If this is in real time you either pick a fixed threshold or do some sort of automatic gain control where you track the maximum signal level over some time.
Scan for start of dot or dash. You probably want to see at least a certain number of 1's in your dot period to consider the samples a dot. Then keep scanning to see if this is a dash. Don't expect a perfect signal - you'll see a few 0's in the middle of your 1's and a few 1's in the middle of your 0's. If there's little noise then differentiating the "on" periods from the "off" periods should be fairly easy.
Then reverse the above process. If you see dash push a 1 bit to your buffer, if a dot push a zero.

One purpose of modulation/demodulation is to adapt to channel characteristics. For instance, the channel might not be able to pass DC. Another purpose is to overcome a given amount and type of noise in the channel, while still transferring data above some given error rate.
For FSK, you simply want routines than can generate sine waves at two different frequencies on the transmit end, and filter and detect two different frequencies on the receiving end. The length of each segment of sine waves, the separation in frequency, and the amplitude will depend on the data rate and amount of noise you need to overcome.
In the simplest case, zero noise, simply produce N or 2N sine waves within successive fixed time frames. Something like:
x[i] = amplitude * sin( i * 2 * pi * (data[j] ? 1.0 : 2.0) * freq) / sampleRate )
On the receiving end, you can sample the signal at well above twice the max frequency and measure the distance between zero crossings, and see if you find an bunch of short period or long period waveforms. Much fancier methods using digital signal processing filters (IIR, FIR, etc.) and various statistic detectors can be used in the presence of non-zero noise.

Related

NEXSYS A7 Board - I2S2 PMOD

I'm working on a guitar effects "pedal" using the NEXSYS A7 Board.
For this purpose, I've purchased the I2S2 PMOD and successfully got it up and running using the example code provided by Digilent.
Currently, the design is a "pass-through", meaning that audio comes into the FPGA and immediately out.
I'm wondering what would be the correct way to store the data, make some DSP on this data to create the effects, and then transmit the modified data back to the I2S2 PMOD.
Maybe it's unnecessary to store the data?
maybe I can pass it through an RTL block that's responsible for applying the effect and then simply transmit the modified data out?

Collated from comments and extended.
For a live performance pedal you don't want to store much data; usually 10s of ms or less. Start with something simple : store 50 or 100ms of data in a ring (read old data, store new data, inc address modulo memory size). Output = Newdata = ( incoming sample * 0.n + olddata * (1 - 0.n)) for variable n. Very crude reverb or echo.
Yes, ring = ring buffer FIFO. And you'll see my description is a very crude implementation of a ring buffer FIFO.
Now extend it to separate read and write pointers. Now read and write at different, harmonically related rates ... you have a pitch changer. With glitches when the pointers cross.
Think of ways to hide the glitches, and soon you'll be able to make the crappy noises Autotune adds to most all modern music from that bloody Cher song onwards. (This takes serious DSP : something called interpolating filters is probably the simplest way. Live with the glitches for now)
btw if I'm interested in a distortion effect, can it be accomplished by simply multiplying the incoming data by a constant?
Multiplying by a constant is ... gain.
Multiplying a signal by itself is squaring it ... aka second harmonic distortion or 2HD (which produces components on the octave of each tone in the input).
Multiplying a signal by the 2HD is cubing it ... aka 3HD, producing components a perfect fifth above the octave.
Multiplying the 2HD by the 2HD is the fourth power ... aka 4HD, producing components 2 octaves higher, or a perfect fourth above that fifth.
Multiply the 4HD by the signal to produce 5HD ... and so on to probably the 7th. Also note that these components will decrease dramatically in level; you probably want to add gain beyond 2HD, multiply by 4 (= shift left 2 bits) as a starting point, and increase or decrease as desired.
Now multiply each of these by a variable gain and mix them (mixing is simple addition) to add as many distortion components you want as loud as you want ... don't forget to add in the original signal!
There are other approaches to adding distortion. Try simply saturating all signals above 0.25 to 0.25, and all signals below -0.25 to -0.25, aka clipping. Sounds nasty but mix a bit of this into the above, for a buzz.
Learn how to make white noise (pseudo-random number, usually from a LFSR).
Multiply this by the input signal, and mix or match with the above, for some fuzz.
Learn digital filtering (low pass, high pass, band pass for EQ), and how to control filters with noise or the input signal, the world of sound is open to you.

what is the difference if I take max(abs(m) or max(m)?

What is the difference if I take max(abs(m)) or max(m) in Matlab, where m is the speech signal used in pulse coding modulation to find delta?
delta=2.0001*max(abs(m))/L and
delta=2.0001*max(m)/L

Summary answer from mine comments (to make it more comprehensible I hope)
You got signed signal (possibly with small zero bias)
so you should use the max(abs(m))
to avoid overflow errors due to invalid signal magnitude computation
this is the case even if the signal is symmetrical
let see:
the green area is actually processed audio buffer
first example shows too small buffer
in this case you can miss the peaks even with max(abs(m))
the result is shifting of peak up and down
resulting in falsely compute too low bit count/step quantization constants
that leads to overflows and signal distortions (glitches in sound and weird echo like or underwater sounds)
The second example is big enough buffer size (have at least one whole period of carrior signal)
in this case for symmetrical signals the max(m) should work but you should add some small gap just to be sure
of coarse if any zero bias is present then you are screwed (unless you know its value)
the Red,Blue lines represents obtained dynamic range (your delta without scaling)
so as you can see the if you use max(abs(m)) then the buffer size can be half of what it needs to be for max(m) case (of coarse only for symmetrical signals)
magenta is red+blue

DCF77 decoder vs. noisy signal

I have almost completed my open source DCF77 decoder project. It all started out when I noticed that the standard (Arduino) DCF77 libraries perform very poorly on noisy signals. Especially I was never able to get the time out of the decoders when the antenna was close to the computer or when my washing machine was running.
My first approach was to add a (digital) exponential filter + trigger to the incoming signal.
Although this improved the situation significantly, it was still not really good. Then I started to read some standard books on digital signal processing and especially the original works of Claude Elwood Shannon. My conclusion was that the proper approach would be to not "decode" the signal at all because it is (except for leap seconds) completely known a priori. Instead it would be more appropriate to match the received data to a locally synthesized signal and just determine the proper phase. This in turn would reduce the effective bandwidth by some orders of magnitude and thus reduce the noise significantly.
Phase detection implies the need for fast convolution. The standard approach for efficient convolution is of course the fast Fourier transform. However I am implementing for the Arduino / Atmega 328. Thus I have only 2k RAM. So instead of the straightforward approach with FFT, I started stacking matched phase locked loop filters. I documented the different project stages here:
First try: exponential filter
Start of the better apprach: phase lock to the signal / seconds ticks
Phase lock to the minutes
Decoding minute and hour data
Decoding the whole signal
Adding a local clock to deal with signal loss
Using local synthesized signal for faster lock reacquisition after signal loss
I searched the internet quite extensively and found no similar approach. Still I wonder if there are similar (and maybe better) implementations. Or if there exist research on this kind of signal reconstruction.
What I am not searching for: designing optimized codes for getting close to the Shannon limit. I am also not searching for information on the superimposed PRNG code on DCF77. I also do not need hints on "matched filters" as my current implementation is an approximation of a matched filter. Specific hints on Viterbi Decoders or Trellis approaches are not what I am searching for - unless they address the issue of tight CPU and RAM constraints.
What I am searching for: are there any descriptions / implementations of other non-trivial algorithms for decoding signals like DCF77, with limited CPU and RAM in the presence of significant noise? Maybe in some books or papers from the pre internet era?

Have you considered using a chip matched filter to perform your convolution?
http://en.wikipedia.org/wiki/Matched_filter
They are almost trivially easy to implement, as each chip / bit period can be implemented as a n add subtract delay line ( use a circular buffer )
A simple one for a square wave (will also work, but less optimal with other waveforms) of unknown sequence (but known frequency) can be implemented something like this:
// Filter class
template <int samples_per_bit>
class matchedFilter(
public:
// constructor
matchedFilter() : acc(0) {};
// destructor
~matchedFilter() {};
int filterInput(int next_sample){
int temp;
temp = sample_buffer.insert(nextSample);
temp -= next_sample;
temp -= result_buffer.insert(temp);
return temp;
};
private:
int acc;
CircularBuffer<samples_per_bit> sample_buffer;
CircularBuffer<samples_per_bit> result_buffer;
);
// Circular buffer
template <int length>
class CircularBuffer(
public:
// constructor
CircularBuffer() : element(0) {
buffer.fill(0);
};
// destructor
~CircularBuffer(){};
int insert(int new_element){
int temp;
temp = array[element_pos];
array[element_pos] = new_element;
element_pos += 1;
if (element_pos == length){
element_pos = 0;
};
return temp;
}
private:
std::array<int, length> buffer;
int element_pos;
);
As you can see, resource wise, this is relatively trivial. It there is a specific waveform you're after, you can cascade these together to give a longer correlation.

The reference to matched filters by Ollie B. is not what I was asking for. I already covered this before in my blog.
However by now I received a very good hint by private mail. There exists a paper "Performance Analysis and
Receiver Architectures of DCF77 Radio-Controlled Clocks" by Daniel Engeler. This is the kind of stuff I am searching for.
With further searches starting from the Engeler paper I found the following German patents DE3733966A1 - Anordnung zum Empfang stark gestoerter Signale des Senders dcf-77 and DE4219417C2 - Schmalbandempfänger für Datensignale.

Technique for balancing controller input and output

I have a system where I use RS232 to control a lamp that takes an input given in float representing voltage (in the range 2.5 - 7.5). The control then gives a output in the range 0 to 6000 which is the brightness a sensor picks up.
What I want is to be able to balance the system so that I can specify a brightness value, and the system should balance in on a voltage value that achieves this.
Is there some standard algorithm or technique to find what the voltage input should be in order to get a specific output? I was thinking of an an algorithm which iteratively tries values and from each try it determines some new value which should be better in order to achieve the determined output value. (in my case that is 3000).
The voltage values required tend to vary between different systems and also over the lifespan of the lamp, so this should preferably be done completely automatic.
I am just looking for a name for a technique or algorithm, but pseudo code works just as well. :)

Calibrate the system on initial run by trying all voltages between 2.5 and 7.5 in e.g. 0.1V increments, and record the sensor output.
Given e.g. 3000 as a desired brightness level, pick the voltage that gives the closest brightness then adjust up/down in small increments based on the sensor output until the desired brightness is achieved. From time to time (based on your calibrated values becoming less accurate) recalibrate.

After some more wikipedia browsing I found this:
Control loop feedback mechanism:
previous_error = setpoint - actual_position
integral = 0
start:
error = setpoint - actual_position
integral = integral + (error*dt)
derivative = (error - previous_error)/dt
output = (Kp*error) + (Ki*integral) + (Kd*derivative)
previous_error = error
wait(dt)
goto start
[edit]
By removing the "integral" component and tweaking the weights (Ki and Kd), the loop works perfectly.

I am not at all into physics, but if you can assume that the relationship between voltage and brightness is somewhat close to linear, you can use a standard binary search.
Other than that, this reminds me of the inverted pendulum, which is one of the standard examples for the use of fuzzy logic.

DSP - Converting a sampled signal from real samples to complex samples and vice versa

How can I convert baseband sampled signal from real-valued samples to complex-valued samples (real,imaginary) and vice-versa.
My samples are integers, and I'm looking for a fast (but accurate) conversion algorithms.
A C++ sample code (real, not complex ;-) would be more than welcome.
Edit: IPP code will be much welcome.
Edit: I'm looking for a method that will convert n real-samples to n/2 complex-samples and vice-versa, without affecting the bandwidth.

Adding zeros as the imaginary is conceptually the first step in what you want to do. Initially you have a real only signal that looks like this in the frequency domain:
[r0, r1, r2, r3, ...]
/-~--------\
DC +Fs/2
If you stuff it with zeros for the imaginary value, you'll see that you really have both positive and negative frequencies as mirror images:
[r0 + 0i, r1 + 0i, r2 + 0i, r3 + 0i, ...]
/--------~-\ /-~--------\
-Fs/2 DC +Fs/2
Next, you multiply that signal in the time domain by a complex tone at -Fs/4 (tuning the signal). Your signal will look like
----~-\ /-~--------\ /------
DC
So now, you filter out the center half and you get:
________/-~--------\________
DC
Then you decimate by two and you end up with:
/-~--------\
Which is what you want.
All of these steps can be performed efficiently in the time domain. If you pay attention to all of the intermediate steps, you'll notice that there are many places where you're multiplying by 0, +1, -1, +i, or -i. Furthermore, the half band low pass filter will have a lot of zeros and some symmetry to exploit. Since you know you're going to decimate by 2, you only have to calculate the samples you intend to keep. If you work through the algebra, you'll find a lot of places to simplify it for a clean and fast implementation.
Ultimately, this is all equivalent to a Hilbert transform, but I think it's much easier to understand when you decompose it into pieces like this.
Converting back to real from complex is similar. You'll stuff it with zeroes for every other sample to undo the decimation. You'll filter the complex signal to remove an alias you just introduced. You'll tune it up by Fs/4, and then throw away the imaginary component. (Sorry, I'm all ascii-arted out... :-)
Note that this conversion is lossy near the boundaries. You'd have to use an infinite length filter to do it perfectly.

If you want to create a complex vector with a strictly real spectrum, just add an imaginary component of 0.0 to every sample. Depending on your data format, this may be as easy as creating a double length memory array, zeroing it, and copying from every element of the source into every other element of the destination.
If you want to convert a complex vector containing complex data (non-zero imaginary components above your required minimum noise floor) into a real vector, you will need to double your bandwidth in order to not lose information, which may or may not make sense, unless you are modulating, demodulating or filtering the signal.
If you want to produce a one-sided signal with a complex spectrum from a real vector, you can use a Hilbert transform (or filter) to create an imaginary component with the same spectrum but conjugate phase (except for DC). This would probably not be both fast and accurate.

I'm not sure if that is what you're looking for but you might want to check the Hilbert Transform, which can be used to find the analytic representation of real-valued signals, i.e., a signal with the same amount of information but with no negative frequency components.
Such representation is mostly useful in Digital Signal Processing techniques employing Spectral Shifting such as Single Sideband Modulation, an efficient form of Amplitude Modulation (AM) that uses half the bandwidth used by the raw AM.

I don't have enough points to vote zml up yet, but his is clearly the right answer. The Hilbert transform essentially converts your real-valued signal into its more natural domain, where the components of sound are complex "phasors" rather than sine waves. It does this by essentially chopping of half of the Fourier spectrum, which involves a single choice of "helicity" (i.e. cw vs ccw) but allows you to do things like perfectly pitch shift by multiplying by a single phasor. The possibilities are endless, and I hope this complex representation of audio catches on!

Intel Performance Primitives (IPP) has a function which does exactly this.
From their documentation:
The ippsHilbert function computes a complex analytic signal,
which contains the original real signal as its real part and
computed Hilbert transform as its imaginary part.
https://software.intel.com/content/www/us/en/develop/documentation/ipp-dev-reference/top/volume-1-signal-and-data-processing/transform-functions/hilbert-transform-functions/hilbert.html#hilbert

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio