I'd like to be able to poison a signal in VHDL so that any subsequent consumers of it get a poisoned value too, so I can find unintended links. It's fine if the poisoning works in simulation only.
Background: I've made a CPU in VHDL and want to pipeline it. Before I do that, I want to make sure that no part of the pipeline is sampling outputs from other parts when it shouldn't. So all the parts in the pipeline create and hold their valid output for exactly one clock cycle and set the various outputs to zero at other times. And for those times when the output is zero, the next stage in the pipeline shouldn't even be looking at those values. But I want to confirm that under simulation.
So I tried setting the outputs to 'U' or 'X' during the times they don't represent data, and these show up in the simulator as orange or red, but if I force them to be erroneously sampled, the bad values don't propagate. So it seems like neither 'U' nor 'X' is what I want. Without changing the consumers of these signals (e.g. calling is_x(..) in hundreds of places), is there any easy way to poison those values when running under the simulator?
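For reference, this is roughly what each stage's output logic looks like at the moment (signal names are placeholders, not my real code):

process(clk)
begin
  if rising_edge(clk) then
    if stage_valid = '1' then
      result_out <= computed_result;   -- real data, held for exactly one cycle
    else
      result_out <= (others => 'X');   -- attempted poison: shows up red in the waveform
    end if;
  end if;
end process;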
If it matters, this is on a Spartan7 based Arty board with the free version of Vivado.
This is an architectural question regarding gnuradio(-companion), and since I am not sure how to tackle this problem in the first place, I first describe what I want to achieve and then how I think I would do it.
Problem
I am implementing a special form of RFID reader with an Ettus X310 SDR: the transmitter sends an OOK/AM-modulated (PIE-encoded) request, followed by a pure sine wave. The RFID tag backscatters its response onto this sine wave using OOK/AM modulation in FM0 or "Miller subcarrier" coding (a form of differential Manchester coding). I want to receive its response, translate it into bits (and form a PDU), buffer different responses in a FIFO, and send them on for further processing. The properties of the tag response are:
It is asynchronous. I do not know when the response will arrive and, once it does, what the proper sampling instants are: I cannot simply filter, sample, and decimate the signal and use a simple slicer, because I do not know where the sample points are.
The response comes in very small "bursts" (say, 100 bits). Hence I cannot afford to perform timing recovery on the bits and waste them (unless I buffer the entire signal somehow, which I do not think is the right way to do it).
The signal starts with a small preamble (UHF RFID Gen2 preamble) which is 6 bits (~8 bit transitions). This may not be enough for timing recovery, but it can be used to identify the start of a response somehow.
It uses the mentioned FM0 encoding, so I have a guaranteed transition every bit. For that reason, I do not have to sample the bits; I could instead detect the transitions and convert them into bits. I would not need conventional clock recovery (e.g. M&M) either.
My Thoughts
"Ordinary" gnuradio preprocessing brings me to the received oversampled bits: Downconversion, filtering; possibly a slicer which uses a lowpass filter to subtract the mean value and a comparator (note that even this may be challenging because the lowpass filter may have a large settling time of few bits until it obtains the right mean value).
In order to detect the actual transmission, I do not think I have much choice other than a simple squelch that detects a signal level above the noise floor (is this true, or is there a way to detect the transmission using only the preamble?).
Once the squelch block detects a transmission, I could use a differentiator (or similar) to get the edges. But this is where my understanding of the transition between "baseband land" and "bits/PDU land" ends: I would need a block that triggers asynchronously rather than sampling at fixed intervals (I sketch what I imagine at the end of this section). In an actual system, the edges from the described detector could act as the clock input of a flip-flop. However, I do not see which standard gnuradio block would allow me to do this.
Once in "bits land", the bits (or PDUs) would be processed at a much lower rate. However, two clock domains are crossed: the normal baseband sampling rate, an irregular rate by which the transitions are detected and the rate at which the bits are read. For that reason, I would be looking for a FIFO or shift register, in which the detected bits are shifted in at whichever edge transition rate they come in and read out at the regular bit rate on the other side.
Question
What is the correct architecture/approach to implement this in gnuradio?
I could imagine implementing this with my own blocks. But as much as possible I would like to use standard blocks and gnuradio-companion. I would like to resort to my own blocks (in particular C++) only as a last resort, either because it is not possible otherwise or because it would really not be the right way to do it otherwise.
This question is an extension of another one shown here: VHDL Process Confusion with Sensitivity Lists.
However, having fewer than 50 reputation points, I was unable to comment there to ask for further explanation.
So, I ran into the same problem as in the link, and I accept the answer shown there. However, I am now interested in what the recommended approach is for matching simulation with post-synthesis behavior. The accepted answer in the link states that level-sensitive latches are not recommended as a solution because they cause more problems. So my question is: what is the recommended approach? Is there one?
In other words, I want to achieve what that post was trying to achieve, but in a way that won't cause more problems. I need my sensitivity list not to be ignored by my synthesis tools.
Also, I am new to VHDL, so it may be that using processes is not the right way to achieve the result I want. I am using a DE2-115 with Quartus Prime 16.0. Any information will be greatly appreciated.
If you are using VHDL to program your FPGA-based prototyping board, you are interested in the synthesis semantics of the language. It is rather different from the simulation semantics described in the Language Reference Manual (LRM). Even worse: it is not standardized and varies between synthesis tools. Anyway, synthesis means translation from VHDL code to digital hardware. The only recommended approach here, for a beginner who does not yet clearly understand the synthesis semantics, is:
Think hardware first, code next.
In other words, draw a nice block diagram of the hardware you want on a sheet of paper. And use the following 10 rules. Strictly. No exceptions. Never. And do not forget to carefully check the last one: it is as essential as the others but a bit more difficult to verify.
Surround your drawing with a large rectangle. This is the boundary of your circuit. Everything that crosses this boundary is an input or output port. The VHDL entity will describe this boundary.
Clearly separate edge-triggered registers (e.g. square blocks) from combinatorial logic (e.g. round blocks).
Do not use level-triggered latches.
Use only rising-edge-triggered registers and use the same single clock for all of them. Its name is clock. It comes from the outside and is an input of all square blocks and only them. Do not even represent the clock; it is the same for all square blocks and you can leave it implicit in your diagram.
Represent the communications between blocks with named and oriented arrows. For the block an arrow comes from, the arrow is an output signal. For the block an arrow goes to, the arrow is an input signal.
Arrows have one single origin but they can have several destinations. If an arrow has several destinations, fork the arrow as many times as needed.
Some arrows come from outside the large rectangle. These are the input ports of the entity. An input arrow cannot also be the output of any of your blocks.
Some arrows go outside. These are the output ports. An output arrow has one single origin and one single destination: the outside. No forks on output arrows. So, an output arrow cannot be also the input of one of your blocks. If you want to use an output arrow as an input for some of your blocks, insert a new round block to split it in two parts: the input of the new block, with as many forks as you wish, and the output arrow that comes from the new block and goes outside. The new block will become a simple continuous assignment in VHDL. A kind of transparent renaming.
All arrows that do not come or go from/to the outside are internal signals. You will declare them all in the architecture.
Every cycle in the diagram must comprise at least one square block.
If you cannot find a way to describe the function you want with this approach, the problem is with the function you want. Not with VHDL or the synthesizer. It means that the function you want is not digital hardware. Implement it using another technology.
The VHDL coding becomes a detail:
one synchronous process per square block,
one combinatorial process per round block.
A synchronous process looks like this:
process(clock)
begin
  if rising_edge(clock) then
    o1 <= i1;
    ...
    on <= in;
  end if;
end process;
where i1, i2, ..., in are all the arrows that enter the corresponding square block of your diagram and o1, o2, ..., on are all the arrows that leave it. Do not change anything except the names of the signals. Nothing. Not even a single character. OK?
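For instance, if a square block of your diagram has input arrows next_state and sum_in and output arrows state and sum_out (hypothetical names), the process is simply:

process(clock)
begin
  if rising_edge(clock) then
    state   <= next_state;
    sum_out <= sum_in;
  end if;
end process;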
A combinatorial process looks like this:
process(i1, i2, ..., in)
  <declarations>
begin
  o1 <= <default_value_for_o1>;
  ...
  om <= <default_value_for_om>;
  <statements>
end process;
where i1, i2, ..., in are all the arrows that enter the corresponding round block of your diagram. All of them and no more. Do not forget a single arrow and do not add anything else. There are no exceptions. Never. And o1, ..., om are all the arrows that leave the corresponding round block of your diagram. All of them and no more. Do not change anything except <declarations>, the names of the inputs, the names of the outputs, the <default_value_for_oi> values, and <statements>. Do not forget a single default value assignment. If you had to create a new round block to split a primary output arrow, the corresponding process is just:
process(i)
begin
  o <= i;
end process;
which you can simplify as:
o <= i;
without the enclosing process declaration. It is the equivalent concurrent signal assignment.
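As a concrete instance of the combinatorial template, a 2-to-1 multiplexer round block with input arrows a, b and sel and output arrow y (hypothetical names) could be written:

process(a, b, sel)
begin
  y <= a;            -- default value for the output
  if sel = '1' then
    y <= b;
  end if;
end process;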
Once you are comfortable with this coding style, and only then, you will:
Skip the drawing for simple designs. But continue thinking hardware first. Draw in your head instead of on a sheet of paper but continue drawing.
Use asynchronous resets:
process(clock, reset)
begin
  if reset = '1' then
    o <= reset_value_for_o;
  elsif rising_edge(clock) then
    o <= i;
  end if;
end process;
Merge several combinatorial processes in one single combinatorial process. This is trivial and is just a simple reorganization of the block diagram.
Merge some combinatorial processes with synchronous processes. But in order to do this you must go back to your block diagram and add an eleventh rule:
Group several round blocks and at least one square block by drawing an enclosure around them. Also enclose the arrows that can be enclosed. Do not let an arrow cross the boundary of the enclosure unless it comes from or goes to the outside of the enclosure. Once this is done, look at all the output arrows of the enclosure. If any of them comes from a round block of the enclosure or is also an input of the enclosure, you cannot merge these processes into a single synchronous process.
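For example, a square block fed by a round block that simply gates an increment could be merged into one synchronous process like this (the signal names are hypothetical):

process(clock)
begin
  if rising_edge(clock) then
    if enable = '1' then       -- the round block's logic, folded into the clocked process
      count <= count + 1;
    end if;
  end if;
end process;

The combinatorial logic is now evaluated inside the clocked process, but its result is still stored only on the rising edge, so the synthesized hardware is the same as with two separate processes.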
And later on you will also start using latches, falling clock edges, multiple clocks, and resynchronizers between clock domains... But we will discuss these when the time comes.
I have a large, rather complicated procedural content generation Lua project. One thing I want to be able to do, for debugging purposes, is use a fixed random seed so that I can re-run the system and get the same results.
To that end, I print out the seed at the start of a run. The problem is, I still get completely different results each time I run it. Assuming the seed doesn't change anywhere else, this shouldn't be possible, right?
My question is: what other ways are there to influence the output of Lua's math.random()? I've searched through all the code in the project, and there's only one place where I call math.randomseed(), and I do that before I do anything else. I don't use the time or date for any calculations, so that wouldn't be influencing the results... What else could I be missing?
Updated on 2/22/16: monkey-patching math.random and math.randomseed has, oftentimes (but not always), output the same sequence of random numbers, but still not the same results. So I guess the real question is now: what behavior in Lua is indeterminate and could produce different output when the same code is run repeatedly? Noting where the output diverges, when it does, is helping me narrow it down, but I still haven't found it. (This code does NOT use coroutines, so I don't think it's a threading / race condition issue.)
randomseed uses the srandom/srand function, which "sets its argument as the seed for a new sequence of pseudo-random integers to be returned by random()".
I can offer several possible explanations:
you think you call randomseed, but you do not (random will initialize the sequence for you in this case).
you think you call randomseed once, but you call it multiple times (or some other part of the code calls randomseed as well, possibly at different times in your sequence).
some other part of the code calls random (some number of times), which generates different results for your part of the code.
there is nothing wrong with the generated sequence, but you are misinterpreting the results.
your version of Lua has a bug in srandom/random processing.
there is something wrong with srandom or random function in your system.
Having some information about your version of Lua and your system (in addition to the small example demonstrating the issue) would help in figuring out what's causing this.
Updated on 2016/2/22: It should be fairly easy to check; monkey-patch both math.randomseed and math.random and log all the calls and the values returned by the functions for two subsequent runs, then compare the results. If the results differ, you should be able to isolate why they differ and reproduce the issue in a smaller example. You can also look at where the functions are called from using debug.traceback.
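A minimal sketch of such a monkey-patch (the log format is arbitrary):

local orig_seed, orig_random = math.randomseed, math.random

function math.randomseed(seed)
  -- Log every (re)seeding together with where it was called from.
  print(("randomseed(%s) called from:%s"):format(tostring(seed), debug.traceback("", 2)))
  return orig_seed(seed)
end

function math.random(...)
  -- Log every draw so two runs can be diffed line by line.
  local v = orig_random(...)
  print("random ->", v)
  return v
end

Run the program twice with the same seed, redirect each run's output to a file, and diff the two files; the first differing line points at the first diverging call.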
Correct, as stated in the documentation, 'equal seeds produce equal sequences of numbers.'
Immediately after setting the seed to a known constant value, output the result of a call to math.random; if this varies across runs, you know something is seriously wrong (corrupt library download, whack install, gamma ray hit your drive, etc.).
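For example (the seed value is arbitrary):

math.randomseed(12345)   -- any known constant
print(math.random())     -- note this value; it must be identical on every run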
Assuming that the first value matches across runs, add another output midway through the code. From there, you can use a binary search to zero in on where things go wrong (i.e. the first half or the second half of the code block in question).
While you can & should use some intuition to find the error as you go, keep in mind that if intuition alone was enough, you would have already found it, thus a bit of systematic elimination is warranted.
Revision to cover comment regarding array order:
If possible, use debugging tools. This SO post on detecting when the value of a Lua variable changes might help.
In the absence of tools, here's one way to roll your own for this problem:
A full debugging dump of any sizable array quickly becomes a mess that makes it tough to spot changes. Instead, I'd use a few extra variables & a test function to keep things concise.
Make two deep copies of the array. Let's call them debug01 & debug02 & call the original array original. Next, deliberately swap the order of two elements in debug02.
Next, build a function to compare two arrays, test whether their elements match up, and return/print the index of the first mismatch if they do not (a sketch of such a function follows this list). Immediately after initializing the arrays, test them to ensure:
original & debug01 match
original & debug02 do not match
original & debug02 mismatch where you changed them
I cannot stress enough the insanity of using an unverified (and thus, potentially bugged) test function to track down bugs.
Once you've verified the function works, you can again use a binary search to zero in on where things go off the rails. As before, balance the use of a systematic search with your intuition.
I am doing a VLSI project in which I am implementing a barrel shifter using a tool called DSCH. The schematic is realized using transmission gates.
What the circuit does is ROTATE the 8-bit word (it is an 8-bit shifter) by however many positions are selected via a decoder, all in one clock cycle.
But I want to know the use of a rotator, and why it is still called a shifter even though it rotates.
Also, please suggest some applications of a rotator that could be added to the present circuit to show its use.
Rotation is just shifting with the bit exiting from one end fed back into the input at the other end, possibly by way of the carry flag bit. At the level of a simple implementation, it makes sense to have one circuit for both operations, with some additional control lines to select the input-side source from among the bit exiting the other side, 0, or 1. Sign extension during right shifts of 2's-complement numbers would be another selectable option that is often built in.
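For reference, the whole rotate operation can also be described behaviourally in one line of VHDL using numeric_std's rotate_right (the entity and signal names below are placeholders, not your transmission-gate design):

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity rotate8 is
  port (
    data_in  : in  std_logic_vector(7 downto 0);
    amount   : in  unsigned(2 downto 0);   -- 0 to 7 positions, e.g. from your decoder
    data_out : out std_logic_vector(7 downto 0)
  );
end entity;

architecture rtl of rotate8 is
begin
  -- Rotate right: the bit falling off bit 0 re-enters at bit 7.
  data_out <= std_logic_vector(rotate_right(unsigned(data_in), to_integer(amount)));
end architecture;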
The Stack Exchange sites aren't really suited to "list" questions, including ones about applications, but a couple come to mind:
If you want a vector to test every bit of another value in turn, and to do so repeatedly, you can just keep rotating an initial one-bit-active value over and over, never having to re-initialize it.
You could swap a two-part (typically double-byte) value to imitate the opposite endianness of encoding by rotating it half way. Or, putting it another way, it can be a single-operation swap of the values of two pairable but also independently accessible registers (think AL and AH together making up AX in real-mode x86). But this will not endian-swap a four-part value, such as a 32-bit value on a byte-addressable machine.
Various coding, checksum, and hashing schemes may wish to transform a value in ways that include bit rotations among their mixing steps.
I have a function that I use to look up a value based on an index. The value takes some time to calculate, so I want to compute it with ParallelMap. It references another, similar function that returns a list of expressions, also based on an index.
However, when I set it all up in a seemingly reasonable fashion, I see some very bizarre behaviour. First, I see that the function appears to work, albeit very slowly. For large indices, however, the processor activity in Taskmangler stays entirely at zero for an extended period of time (i.e. 2-4 minutes) during which all instances of Mathematica are seemingly inert. Then, without the slightest blip of CPU use, a result appears. Is this another case of Mathematica spukhafte Fernwirkung (spooky action at a distance)?
That is, I want to create a variable/function that stores an expression, here a list of integers (ListOfInts), and then on the parallel workers I want to perform some function on that expression (here I apply a set of replacement rules and take the Min). I want the result of that function to also be indexed by the same index under another variable/function (IndexedFunk), whose result is then available back on the main instance of Mathematica:
(*some arbitrary rules that will convert some of the integers to negative values:*)
rulez=Dispatch[Thread[Rule[Range[222],-Range[222]]]];
maxIndex = 333;
Clear[ListOfInts]
Scan[(ListOfInts[#]=RandomInteger[{1,999},55])&,Range[maxIndex ]]
(*just for safety's sake:*)
DistributeDefinitions[rulez, ListOfInts]
Clear[IndexedFunk]
(*I believe I have to have at least one value of IndexedFunk defined before I Share the definition to the workers:*)
IndexedFunk[1]=Min[ListOfInts[1]/.rulez]
(*... and this should let me retrieve the values back on the primary instance of MMA:*)
SetSharedFunction[IndexedFunk]
(*Now, here is the mysterious part: this just sits there on my multiprocessor machine for many minutes until suddenly a result appears. If I up maxIndex to say 99999 (and of course re-execute the above code again) then the effect can more clearly be seen.*)
AbsoluteTiming[Short[ParallelMap[(IndexedFunk[#]=Min[ListOfInts[#]/.rulez])&, Range[maxIndex]]]]
I believe this is some bug, but then I am still trying to figure out Mathematica Parallel, so I can't be too confident in this conclusion. Despite its being depressingly slow, it is nonetheless impressive in its ability to perform calculations without actually requiring a CPU to do so.
I thought perhaps it was due to whatever communications protocol is being used between the master and slave processes; perhaps it is so slow that it just appears that the processors are doing nothing, when in fact they are just waiting to send the next bit of some definition or other. In that case I thought ParallelMap[..., Method->"CoarsestGrained"] would be of some use. But no, that doesn't work either.
A question: "Am I doing something obviously wrong, or is this a bug?"
I am afraid you are. The problem is with the shared definition of a variable. Mathematica maintains a single coherent value in all copies of the variable across kernels, and therefore that variable becomes a single point of huge contention. The CPU is idle because the kernels line up in a queue waiting for the variable IndexedFunk, and most of the time is spent in interprocess or inter-machine communication. Go figure.
By the way, there is no function SetSharedDefinition in any Mathematica version I know of. You probably intended to write SetSharedVariable. But remove that evil call anyway! To avoid contention, return results from the parallelized computation as a list of pairs, and then assemble them into downvalues of your variable at the main kernel:
Clear[IndexedFunk]
Scan[(IndexedFunk[#[[1]]] = #[[2]]) &,
ParallelMap[{#, Min[ListOfInts[#] /. rulez]} &, Range[maxIndex]]
]
ParallelMap takes care of distributing definitions automagically, so the call to DistributeDefinitions is superfluous. (As a minor note, it is not correct as written, since it omits the maxIndex variable, but the omission is automatically taken care of by ParallelMap in this particular case.)
EDIT, NB!: The automatic distribution applies only to version 8 of Mathematica. Thanks @MikeHoneychurch for the correction.