Hardware: Shortest delay to send an output signal using Windows

I would like to set an output from my computer high or low. This will change roughly 5 times a second. I will be measuring the output on an oscilloscope. The important thing is to make the time between requesting the change in software and the output changing state as short as possible. The shorter the delay, the more accurate my result.
Does anyone know which of the following options has the shortest delay in a Windows environment (it has to be Windows)?
USB
Serial
Parallel Port
Something else?
I could try all three and measure the difference, but presumably someone else has done this already?
Thanks!

Serial and Parallel will be lower latency than USB (I would expect), as the "height" of the stack between you and the port is smaller. Measuring the latency would be challenging - how will you know when the bit of code which writes to the port is executed?
However, even with a low potential latency, the jitter induced by Windows is likely to be quite large as well.
Vaguely related and quite interesting...
https://superuser.com/questions/419070/transatlantic-ping-faster-than-sending-a-pixel-to-the-screen/419167#419167
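To make the "how will you know when the code executed" point a bit more concrete: below is a minimal sketch (not something from the thread) of the serial-port approach on Windows. It toggles the RTS handshake line with EscapeCommFunction and timestamps each request with QueryPerformanceCounter; the port name is a placeholder. The oscilloscope probe on the RTS pin then shows when the level actually changes, and the gap to the printed timestamp is the latency in question.

```c
/* Minimal sketch (not a benchmark): toggle the RTS line of a serial port on
 * Windows and timestamp each request with QueryPerformanceCounter. The port
 * name is a placeholder; a scope probe on the RTS pin shows when the level
 * actually changes. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE h = CreateFileA("\\\\.\\COM1", GENERIC_READ | GENERIC_WRITE,
                           0, NULL, OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "cannot open COM1\n");
        return 1;
    }

    LARGE_INTEGER freq, t;
    QueryPerformanceFrequency(&freq);

    for (int i = 0; i < 100; i++) {
        QueryPerformanceCounter(&t);                      /* software-side timestamp */
        EscapeCommFunction(h, (i & 1) ? SETRTS : CLRRTS); /* drive the RTS pin */
        printf("%d %lld us\n", i & 1,
               t.QuadPart * 1000000LL / freq.QuadPart);
        Sleep(200);                                       /* ~5 state changes per second */
    }

    CloseHandle(h);
    return 0;
}
```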

USB can transmit data the fastest by far.

Related

Auto detect UART baud rate of a working device

I have a device which is continuously sending data through UART.
I'm trying to read it using a terminal application on Windows-based PC. The problem is that I don't know at which baud rate the device is sending the data.
The data I'm getting at higher baud rates doesn't make any sense, so I have narrowed it down to less than or equal to 600 among the standard baud rates available in the terminal.
Is there any software to detect the baud rate, or a method using any microcontroller?
No, not if you want to get done quickly. Ten years doing this type of task says an oscilloscope or even an inexpensive USB-based logic analyzer is your best solution here. This isn't a software problem yet, it's a signal detection problem. You should be able to clear this up in a few minutes with the right instrument.
I'm assuming you're only doing this exercise because the transmitting part is one for which you cannot find a datasheet. If you had a datasheet in hand, that would clear this up, or at least suggest the possible baud rates you should try.
I tried the Realterm software and found out that the data is coming at 300 baud, no parity, 8 data bits, and 1 or 2 stop bits. With the remaining options I get framing errors and break conditions in the software. Thank you all...
Old thread but I thought it might be useful.
Something I did was write a Python/pyserial script that kept cycling through the different baud rates (300 - 115200), listened at each one, and filtered out strings that were garbage. It assumes the traffic is plain, readable text, so when the output decodes cleanly you know you have the right rate.
It worked well enough and found the right rate quickly enough for me to hit Esc and drop into the bootloader of my AP.
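The answer above describes the pyserial version; a rough C sketch of the same brute-force idea, using POSIX termios on a Linux host, could look like the following. The device path and the 90% "looks like text" threshold are placeholders, not values from the thread.

```c
/* Sketch of the brute-force approach described above: try each standard baud
 * rate and keep the one whose input is mostly printable ASCII. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <termios.h>
#include <ctype.h>

int main(void)
{
    const speed_t rates[] = { B300, B600, B1200, B2400, B4800,
                              B9600, B19200, B38400, B57600, B115200 };
    const int labels[]    = { 300, 600, 1200, 2400, 4800,
                              9600, 19200, 38400, 57600, 115200 };

    int fd = open("/dev/ttyUSB0", O_RDONLY | O_NOCTTY);  /* placeholder device */
    if (fd < 0) { perror("open"); return 1; }

    for (unsigned i = 0; i < sizeof rates / sizeof rates[0]; i++) {
        struct termios tio;
        tcgetattr(fd, &tio);
        cfmakeraw(&tio);                     /* raw 8N1, no flow control */
        tio.c_cflag |= (CREAD | CLOCAL);     /* enable the receiver */
        cfsetispeed(&tio, rates[i]);
        cfsetospeed(&tio, rates[i]);
        tcsetattr(fd, TCSANOW, &tio);
        tcflush(fd, TCIFLUSH);               /* drop bytes received at the old rate */

        unsigned char buf[256];
        int n = read(fd, buf, sizeof buf), printable = 0;
        for (int j = 0; j < n; j++)
            if (isprint(buf[j]) || isspace(buf[j]))
                printable++;

        if (n > 0 && printable * 10 >= n * 9)   /* >= 90% looks like text */
            printf("%d baud looks plausible (%d/%d printable)\n",
                   labels[i], printable, n);
    }
    close(fd);
    return 0;
}
```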

Using usb cable for random number generation

I have a thought, but am unsure how to execute it. I want to take a somewhat long USB cable and plug both ends into the same machine. Then I would like to send a signal from one end and time how long it takes to reach the other end. I think the signal should arrive after a slightly different delay each time, and that variation would give me random numbers.
Can someone suggest a language in which I could do this the quickest? I have zero experience in sending signals over usb and don't know where to start or how to start. Any help will be greatly appreciated.
I simply want to do this as a fun in home project, so I don't need anything official and just would like to see if this idea can work.
EDIT: What if I store the usb cable in liquid nitrogen or a substance just as cold in order to slow down the signal as much as possible (I have access to liquid nitrogen).
Sorry I can't comment (not enough rep), but the delay should always be the same through the wire. This might limit the true randomness of your numbers. Plus, the actual delay time in the wire might be shorter than even a CPU cycle.
If your operating system is Windows, you may run into this type of issue:
Why are .NET timers limited to 15 ms resolution?
Apparently the minimum time resolution on Windows is around 15ms.
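(As an aside, not something from the linked question: the ~15 ms figure corresponds to the default Windows timer tick of roughly 15.6 ms. A quick way to see it is to watch how GetTickCount64() advances, as in this small probe.)

```c
/* Quick probe of the default Windows tick granularity: record the first few
 * distinct values returned by GetTickCount64() and print the gaps between
 * them. At the default 64 Hz tick the gaps come out around 15-16 ms, unless
 * some process has raised the system timer resolution. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    ULONGLONG prev = GetTickCount64();
    for (int changes = 0; changes < 10; ) {
        ULONGLONG now = GetTickCount64();
        if (now != prev) {
            printf("tick advanced by %llu ms\n", now - prev);
            prev = now;
            changes++;
        }
    }
    return 0;
}
```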
EDIT: In response to your liquid nitrogen edit, according to these graphs, you may have more luck with heat! Interestingly enough...
Temperature vs Conductivity http://www.emeraldinsight.com/content_images/fig/1740240120008.png
I want to take a somewhat long usb cable and plug both ends into the same machine.
Won't work. A USB connection is always host -> device, and a PC can only be a host. And the communication is scheduled in predictable 1 ms intervals - bad for randomness.
Some newer microcontrollers have both RNG and USB on chip, that way you can make a real USB RNG.
What if I store the usb cable in liquid nitrogen or a substance just as cold in order to slow down the signal
The signal would travel a tiny bit faster, as the resistance of the cable is lower.

Cycle accurate emulation

I'm currently learning C for my next emulation project, a cycle accurate 68000 core (my last project being a non-cycle accurate Sega Master System emulator written in Java which is now on its third release). My query regards cycle level accuracy as taking things to this level is a new thing for me.
To break things down to a granularity of 1 CPU cycle, presumably I need to know how long memory accesses take and so on, but my question is: for instructions that take multiple cycles in their memory fetch/write stages, what is the CPU doing each cycle - e.g. is a certain number of bits copied per cycle?
With my SMS emulator I didn't have to worry too much about M1 stages etc, as it just used a cycle count for each instruction - in other words it is only accurate to an instruction level, not a cycle level. I'm not looking for architecture specific details, merely an idea of what sort of things I should look out for when going to this level of granularity.
68k details are welcome, however. Basically I'm wondering what is supposed to happen if a video chip reads from an area of memory while the CPU is still writing data to it, midway through that phase of an instruction, and other similar situations. I hope I've made it clear enough, thank you.
For a really cycle accurate emulation you first have to decide on a master clock to use as a reference. That should be the fastest clock at whose granularity the running software can detect differences in order of occurrence. This could be the CPU clock, but in most cases the bus cycle time decides at which granularity events can be discerned (and that is often only a fraction of the CPU clock).
Then you need to find out the precedence order the different devices (ICs) connected to that bus have (if there is more than one bus master). An example would be if (and how) video DMA can delay the CPU.
Generally there are no "at the same time" events. Either the CPU writes before the DMA reads, or the other way around (that is still true in the case of dual-ported devices, you just need to consider the device's inherent precedence mechanism).
Once you have a solid understanding of which clock effectively controls the granularity of discernible events, you can think about how to structure the emulator to reproduce that behaviour exactly.
This way you can create a 100% cycle exact emulation, provided you have enough information about the behaviour of all the devices.
Sorry I can't give you more detailed info, I know nothing about the specifics of the Sega's hardware.
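As an illustration of the structure described above (nothing here is 68k- or Sega-specific; the devices, their event lengths, and the precedence order are made-up placeholders), an emulator built around a single master clock can look roughly like this: every device exposes the tick of its next event, the scheduler always runs the earliest one, and ties are broken by a fixed bus precedence.

```c
/* Illustrative skeleton of a "single master clock" emulator loop. The
 * devices, their step lengths, and the arbitration order are placeholders. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    const char *name;
    uint64_t    next_tick;              /* master-clock tick of next event  */
    uint64_t  (*step)(uint64_t now);    /* run one event, return its length */
} device_t;

static uint64_t cpu_step(uint64_t now)   { (void)now; return 4; }  /* e.g. one bus cycle */
static uint64_t video_step(uint64_t now) { (void)now; return 2; }  /* e.g. one DMA fetch */

int main(void)
{
    /* Order in this array is the bus precedence: on a tie the earlier entry
     * (here the video DMA) gets the bus first and the CPU is delayed. */
    device_t dev[] = {
        { "video", 0, video_step },
        { "cpu",   0, cpu_step   },
    };
    const int ndev = sizeof dev / sizeof dev[0];

    for (uint64_t master = 0; master < 64; ) {
        /* Find the device with the earliest pending event; ties resolve by
         * array order, i.e. the fixed precedence. */
        int who = 0;
        for (int i = 1; i < ndev; i++)
            if (dev[i].next_tick < dev[who].next_tick)
                who = i;

        master = dev[who].next_tick;             /* advance the master clock */
        uint64_t len = dev[who].step(master);    /* run exactly one event    */
        dev[who].next_tick = master + len;

        printf("t=%3llu %s\n", (unsigned long long)master, dev[who].name);
    }
    return 0;
}
```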
My guess is that you don't have to get into excruciating detail to get good enough timing results for this sort of thing - which you can't do anyway if you don't want to get into the specifics of the architecture.
Your main question seemed to be "what is supposed to happen if a video chip reads from an area of memory whilst a CPU is still writing the data to it". Generally on these older chips, the bus protocols are pretty simple (they're not packetized) and there is usually a pin that indicates that the bus is busy. So if the CPU is writing to memory, the video chip will simply have to wait until the CPU is done. Because of these sorts of limitations, dual ported ram was popular for a while so that the frame buffer could be simultaneously written by the CPU and read by the RAMDAC.

Map device driver code to Logic Analyzer waveform

As per SDIO specification, the sequence of operations (for write transaction) take place as:
Command53 -- CommandLatency -- Command53Response -- ResponseLatency -- startbit -- write-number-of-bytes -- CRC -- endbit -- WriteLatency -- startbit -- CRC -- endbit -- busybit.
During benchmarking of the SDIO UART driver, the time values I got were higher than expected. A lot of latency was found, especially during write transactions.
Reasons for latency could be scheduler allocating processor time to other processes, delay in work queues, etc.
I would like to analyze and understand the latency. Maybe understanding the mapping between the device driver code and the Logic Analyzer waveform can provide some clues.
Can somebody shed some light on this?
Thank you.
EDIT 1:
Sorry! I assumed a few things.
In sdio_uart_transmit_chars() there is a call to sdio_out(), which in turn calls sdio_writeb(), and this call writes byte-wise (one byte at a time) to the SDIO UART device. I modified the driver to use sdio_writesb() instead, i.e. multi-byte mode (see the sketch below). This reduced the time taken to write X bytes relative to before. Interestingly, as the size of the write data increased, the WriteLatency (as mentioned above) increased exponentially.
This latency could be because of many reasons. I would like to understand these reasons.
Setup: I am using a Linux (v2.6.32) laptop and a loadable kernel module (a modified sdio_uart.c).
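For reference, the byte-wise versus multi-byte change described in EDIT 1 looks roughly like this against the in-kernel SDIO API (sdio_writeb()/sdio_writesb() from include/linux/mmc/sdio_func.h). The helper names, register offset, buffer, and length below are placeholders rather than actual sdio_uart.c code, and the caller is assumed to hold the host claim via sdio_claim_host().

```c
#include <linux/mmc/sdio_func.h>

/* Hypothetical helpers contrasting the two call patterns; tx_reg, buf and
 * len stand in for whatever the real driver uses. Caller holds the host
 * claim (sdio_claim_host). */
static int tx_bytewise(struct sdio_func *func, unsigned int tx_reg,
                       const u8 *buf, int len)
{
    int err = 0;
    int i;

    for (i = 0; i < len && !err; i++)
        sdio_writeb(func, buf[i], tx_reg, &err);   /* one transfer per byte */
    return err;
}

static int tx_block(struct sdio_func *func, unsigned int tx_reg,
                    u8 *buf, int len)
{
    return sdio_writesb(func, tx_reg, buf, len);   /* one multi-byte transfer */
}
```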
EDIT 2:
Maybe adding 'SDIO' to this question is misleading (not sure at the moment). The reasons for the delay could be generic to any device driver interacting with hardware, and may be independent of the SDIO write process.
If somebody can point me to related online resource, I would be happy to explore and update the result here.
Hope I added more clarity this time. Please comment if the question is still not clear.
Thank you for your time.
EDIT 3:
Yes, I am looking at the signals on a Logic Analyzer (LA), and there are longer delays during and between writes than I expected.
To give an idea about time values:
For a 512 byte transfer: at the hardware level the write should theoretically take 50 microseconds (us); in reality, however, I got 200 us.
This gap of 150 us is what I want to understand.
Note:
1) I am rounding off the time values to simplify the case.
2) All the time values are calculated at Kernel level and no user space issue is involved here.
One thing worth looking at is whether your SD interface uses DMA, such that the driver can program the controller's state machine and the transfer then runs by itself, or whether getting the message out requires repeated servicing by the driver, which might be delayed by other kernel obligations.
You could also see if there may be an I/O bottleneck, for example is the SD interface or whatever bus it hangs off of also used for something else?
Finally you could search for ways to increase the priority. At an extreme, you could switch to a real time SD driver rather than a normal one.

Linux: Timing during recording/playing sound

I have a more general question regarding timing on a standard Linux OS when playing sound and receiving data over a serial port.
At the moment, I'm reading a PCM signal arriving over a USB-to-serial bridge (pl2303); the signal is recorded, encoded and sent by an FPGA.
Now, I need to create "peaks" at a known position in the recorded sound stream, and plan to play a sound file from the same machine that is recording, at a known moment. The peak has to begin and end inside windows of at most 50 ms; its length could be ~200 ms...
Now, my question is: how precise can I expect the timing to be? I know that several components add unknown lag and jitter:
The USB-to-serial bridge collects ~20 bytes from the serial side before sending them to the USB side (at 230400 baud with 10 bits per byte, 20 bytes take about 20 x 10 / 230400 s, i.e. roughly 1 ms)
If I call "`sleep 1; mpg123 $MP3FILE` &" directly before my recording software, the Linux kernel will schedule them differently (maybe this causes a few tens of ms, depending on system load?)
The soundcard/driver will maybe add some more unknown lag...
Will tricks like "nice" or "sched_setscheduler" add value in my case?
I could build an additional thread inside my recording software which plays the sound. Doing this, the timing may be more precise, but it is a lot more work...
Thanks a lot.
I will try it anyway, but I'm looking for some background thoughts to understand and solve my upcoming problems better.
I am not 100% sure, but I would imagine that your kernel would need to be rebuilt to let the scheduler reduce the latency of task switching. In the 2.6.x kernel series there's an option to make the kernel smoother by making it preemptible:
Go to Processor Type and features
Pre-emption Model
Select Preemptible kernel (low latency desktop)
This should streamline the timing and make the sounds appear smoother as a result of less jitter.
Try that and recompile the kernel. There are, of course, plenty of kernel patches that reduce the timeslice for each task switch to make it even smoother; your mileage may vary depending on:
Processor speed - what processor is used?
Memory - how much RAM?
Disk input/output - the faster, the merrier
Those three factors combined have an influence on the scheduler and the multitasking behaviour. The lower the latency, the more fine-grained the task switching.
Incidentally, there is a specialized Linux distribution that caters for capturing sound in real time; I cannot remember its name, but the kernel in that distribution was heavily patched to make sound capture very smooth.
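Regarding the "will nice or sched_setscheduler add value?" part of the question: a small Linux-only sketch like the one below (priority 50 and the 10 ms period are arbitrary choices, not values from this thread) requests SCHED_FIFO and then measures how late each periodic wakeup lands relative to its target, which is essentially the jitter being asked about. It needs root (or CAP_SYS_NICE) for the policy change to succeed.

```c
/* Sketch: ask for SCHED_FIFO and measure how late periodic wakeups are.
 * The 10 ms period and priority 50 are arbitrary; run as root (or with
 * CAP_SYS_NICE) or sched_setscheduler() will fail with EPERM. */
#include <sched.h>
#include <time.h>
#include <stdio.h>

int main(void)
{
    struct sched_param sp = { .sched_priority = 50 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler (continuing with the default policy)");

    struct timespec target;
    clock_gettime(CLOCK_MONOTONIC, &target);

    for (int i = 0; i < 100; i++) {
        /* next wakeup 10 ms after the previous target, as an absolute time */
        target.tv_nsec += 10 * 1000 * 1000;
        if (target.tv_nsec >= 1000000000L) {
            target.tv_nsec -= 1000000000L;
            target.tv_sec += 1;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &target, NULL);

        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        long late_us = (now.tv_sec - target.tv_sec) * 1000000L
                     + (now.tv_nsec - target.tv_nsec) / 1000L;
        printf("wakeup %3d late by %ld us\n", i, late_us);
    }
    return 0;
}
```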
It's me again... After one restless night, I solved my strange timing problems. My first edit was not completely correct, since what I posted was not 100% reproducible. After running some more tests, I came up with the following plot, showing timing accuracy:
Results from analysis http://mega2000.de/~mzenzes/pics4web/2010-05-28_13-37-08_timingexperiment.png
I tried two different ubuntu-kernels: 2.6.32-21-generic and 2.6.32-10-rt
I tried to achieve RT-scheduling: sudo chrt --fifo 99 ./experimenter.sh
And I tried to change powersaving-options: echo {performance,conservative} | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
This resulted in 8 different tests, with 50 runs each. Here are the numbers:
                                     mean(peakPos)  std(peakPos)
rt-kernel-fifo99-ondemand                 0.97         0.0212
rt-kernel-fifo99-performance              0.99         0.0040
rt-kernel-ondemand                        0.91         0.1423
rt-kernel-performance                     0.96         0.0078
standard-kernel-fifo99-ondemand           0.68         0.0177
standard-kernel-fifo99-performance        0.72         0.0142
standard-kernel-ondemand                  0.69         0.0749
standard-kernel-performance               0.69         0.0147

Resources