System Verilog, how to sum array values? - for-loop

I'm trying to sum array values using System Verilog.
My data are declared like this:
reg signed [23:0] n2 [31:0];
reg signed [15:0] w2 [195:0];
w2 is a reg with values stock in it.
for(int i2=0; i2<32; i2++) begin
for(int j2=0; j2<196; j2++) begin
n2[i2] <= n2[i2] + w2[j2];
end
end
end
I need to sum 196 * 16bit (*32), so it requires a 24 bit(*32)?
I tried to simulate my design, and I only have X in n2 reg.
Also, I run the RTL Analysis and open the elaborated design, and I have a warning like this:
[Synth 8-324] index 10 out of range
It's pointing to the line:
for(int j2=0; j2<196; j2++) begin
but I don't know why.

I only have X in n2 reg.
Of course you have.
At the start n2 is not initialized so it contains all X-es. Now you add numbers to X-es which still give you X-es. It is the same as if you would add to an uninitialized variable in most languages.
Try this:
for(int i2=0; i2<32; i2++) begin
n2[i2] <= w2[0];
for(int j2=1; j2<196; j2++) begin
n2[i2] <= n2[i2] + w2[j2];
end
end
(I have kept your <= as I don't know the context. But I have doubts this is correct)
I have a warning ...
I can't help you with your synth warning. I don't see anything obvious wrong but then again you provided only a very small snippet of code.

Related

Why does this error in indexing BCD adder appear?

I am not sure, what exactly the error is. I think, my indexing in the for-loop is not Verilog-compatible, but I might be wrong.
Is it allowed to index like this (a[(4*i)+3:4*i]) in a for-loop just like in C/C++?
Here is a piece of my code, so the for-loop would make more sense
module testing(
input [399:0] a, b,
input cin,
output reg cout,
output reg [399:0] sum );
// bcd needs 4 bits + 1-bit carry --> 5 bits [4:0]
reg [4:0] temp_1;
always #(*) begin
for (int i = 0; i < 100; i++) begin
if (i == 0) begin // taking care of cin so the rest of the loop works smoothly
temp_1[4:0] = a[3:0] + b[3:0] + cin;
sum[3:0] = temp_1[3:0];
cout = temp_1[4];
end
else begin
temp_1[4:0] = a[(4*i)+3:4*i] + b[(4*i)+3:4*i] + cout;
sum[(4*i)+3:4*i] = temp_1[3:0];
cout = temp_1[4];
end
end
end
endmodule
This might seem obvious. I'm doing the exercises from:
HDLBits and got stuck on this one in particular for a long time (This solution isn't the one intended for the exercise).
Error messages Quartus:
Error (10734): Verilog HDL error at testing.v(46): i is not a constant File: ../testing.v Line: 46
Error (10734): Verilog HDL error at testing.v(47): i is not a constant File: ../testing.v Line: 47
But I tried the same way in indexing and got the same error
The error appears because Verilog does not allow variables at both indices of a part select (bus slice indexes).
The most dynamic thing that can be done involves the indexed part select.
Here is a related but not duplicate What is `+:` and `-:`? SO question.
Variations of this question are common on SO and other programmable logic design forums.
I took your example and used the -: operator rather than the : and changed the RHS of this to a constant. This version compiles.
module testing(
input [399:0] a, b,
input cin,
output reg cout,
output reg [399:0] sum );
// bcd needs 4 bits + 1-bit carry --> 5 bits [4:0]
reg [4:0] temp_1;
always #(*) begin
for (int i = 0; i < 100; i++) begin
if (i == 0) begin // taking care of cin so the rest of the loop works smoothly
temp_1[4:0] = a[3:0] + b[3:0] + cin;
sum[3:0] = temp_1[3:0];
cout = temp_1[4];
end
else begin
temp_1[4:0] = a[(4*i)+3-:4] + b[(4*i)+3-:4] + cout;
sum[(4*i)+3-:4] = temp_1[3:0];
cout = temp_1[4];
end
end
end
endmodule
The code will not behave as you wanted it to using the indexed part select.
You can use other operators that are more dynamic to create the behavior you need.
For example shifting, and masking.
Recommend you research what others have done, then ask again if it still is not clear.

Johnson Counter Syntax error. Unexpected token: generate

My college teacher asked for me to implement a Johnson Counter and it's test bench, with an width<=32 (he calls it an N parameter), and the implementation has to use generate/for structures. Although I had learned a little about Johnson Counter, I don't know how to use generate in this case, and I had some errors when I tried to run the test bench. Here is my implementation so far:
module johnsonCounter #(parameter N = 32)
(
input clk,
input rstn,
output reg [N-1:0] out
);
always # (posedge clk) begin
if (!rstn)
out <= 1;
else begin
out[N-1] <= ~out[0];
generate
for (int i = 0; i < N-1; i=i+1) begin
out[i] <= out[i+1];
end
endgenerate
end
end
endmodule
Here is the test bench:
module tb;
parameter N = 32;
reg clk;
reg rstn;
wire [N-1:0] out;
johnsonCounter u0 (.clk (clk),
.rstn (rstn),
.out (out));
always #10 clk = ~clk;
initial begin
{clk, rstn} <= 0;
$monitor ("T=%0t out=%b", $time, out);
repeat (2) #(posedge clk);
rstn <= 1;
repeat (15) #(posedge clk);
$finish;
end
initial begin
$dumpvars;
$dumpfile("dump.vcd");
end
endmodule
These are the errors:
ERROR VCP2000 "Syntax error. Unexpected token: generate[_GENERATE]. This is a Verilog keyword since IEEE Std 1364-2001 and cannot be used as an identifier. Use -v95 argument for compilation." "design.sv" 13 7
ERROR VCP2020 "begin...end pair(s) mismatch detected. 2 <end> tokens are missing." "design.sv" 17 7
ERROR VCP2020 "module/macromodule...endmodule pair(s) mismatch detected. 1 <endmodule> tokens are missing." "design.sv" 17 7
ERROR VCP2000 "Syntax error. Unexpected token: endgenerate[_ENDGENERATE]. This is a Verilog keyword since IEEE Std 1364-2001 and cannot be used as an identifier. Use -v95 argument for compilation." "design.sv" 17 7
Any help is welcome =)
It is illegal to use generate in that way.
For your code, just a for loop is needed (without generate):
always # (posedge clk) begin
if (!rstn)
out <= 1;
else begin
out[N-1] <= ~out[0];
for (int i = 0; i < N-1; i=i+1) begin
out[i] <= out[i+1];
end
end
end
For generate syntax, refer to the IEEE Std 1800-2017, section 27. Generate constructs.
I tried implementing it using the generate construct. I am also new at this, so if anybody sees any problem or error, or could provide any suggestion to improve performance, I would appreciate it.
Regarding your question, I always use generate to instantiate several modules, I think it makes my code cleaner and easier to understand. So what I did is to define a simple D flip-flop module, which I will use to instantiate it. If you want to use generate, you have to define an iterative variable with genvar. Also, you should use generate outside an always block (I don't know if there is a situation where you could use it inside the always block). Below, you can see the code.
module ff
(
input clk,
input rstn,
input d,
output reg q,
output reg qn
);
always #(posedge clk)
begin
if(!rstn)
begin
q <= 0;
qn <= 1;
end
else
begin
q <= d;
qn <= ~d;
end
end
endmodule
module johnsonCounter #(parameter N = 4)
(
input clk,
input rstn,
output [N-1:0] out,
output [N-1:0] nout
);
genvar i;
generate
for (i = 0; i < N-1; i=i+1) begin
ff flip (.clk(clk), .rstn(rstn), .d(out[i+1]), .q(out[i]), .qn(nout[i]));
end
endgenerate
ff lastFlip (.clk(clk), .rstn(clk), .d(nout[0]), .q(out[N-1]), .qn(nout[N-1]));
endmodule
Here you have the testbench, too. One thing I changed from your code is the dumpfile line. It should go before dumpvar.
module tb;
parameter N = 4;
reg clk;
reg rstn;
wire [N-1:0] out;
johnsonCounter u0 (.clk (clk),
.rstn (rstn),
.out (out));
always #10 clk = ~clk;
initial begin
{clk, rstn} <= 0;
$monitor ("T=%0t out=%b", $time, out);
repeat (2) #(posedge clk);
rstn <= 1;
repeat (15) #(posedge clk);
$finish;
end
initial begin
$dumpfile("dump.vcd");
$dumpvars;
end
endmodule
This code was tested using EDA Playground and it worked fine but, as I said, I am not an expert, so if anybody finds any error or have any suggestion, it is welcome.

Finding the maximum value of an array in log2 time in Verilog?

I have designed a 'sorter' that finds the maximum value of its input, which is 16 31-bit words. In simulation, it works, but I am not sure if it will work in hardware (as it doesn't seem to be working on the FPGA as planned). Can someone please let me know if this will work? I am trying to save on resources, that is why I am trying to reuse the same register. Thank you...
module para_sort(clk, ready, array_in, out_max)
input clk, ready;
input [16*31-1:0] array_in;
output reg [30:0] out_max;
reg [30:0] temp_reg [0:15]
integer i, j;
always # (posedge clk)
begin
if(ready)
begin
for(j=0; j<16; j=j+1)
begin
temp_reg[j] <= array_in[31*(j+1)-1 -: 31];
end
i<=0;
done<=0;
end
else
begin
if(i<4)
begin
for(j=0; j<16; j=j+1)
if(temp_reg[j+1] > temp_reg[j]
temp_reg[((j+2)>>1)-1] <= temp_reg[j+1]
else
temp_reg[((j+2)>>1)-1] <= temp_reg[j]
i<=i+1;
end
end
if(i == 4)
begin
out_max <= temp_reg[0];
done <=1;
i <= i + 1;
end
if(i == 5)
done <=0;
end
endmodule
Sorry for the long code. If you have any questions about the code, please let me know.
Before answering the question, I assume that you do not have any problem with code syntax and semantics, and the module is part of the design so that you do not have problem with number of I/O pins, as it seems to be the case.
First of all, the algorithm that has been used in this posted code is incorrect. Maybe you meant something different, but this one is incorrect. The following code worked for me:
//The same code as you posted, but slight changes are made
//Maybe you have tried compiling this code and missed some points while posting
module para_sort(clk, ready, array_in, out_max, done);
input clk, ready;
input [16*31-1:0] array_in;
output reg [30:0] out_max;
output reg done;
reg [30:0] temp_reg [0:16];
integer i, j;
always # (posedge clk)
begin
if(ready)
begin
for(j=0; j<16; j=j+1)
begin
temp_reg[j] <= array_in[31*(j+1)-1 -: 31];
end
i<=0;
done<=0;
end
else
begin
if(i<4)
begin
for(j=0; j<16; j=j+2)
begin
if(temp_reg[j+1] > temp_reg[j])
temp_reg[((j+2)>>1)-1] <= temp_reg[j+1];
else
temp_reg[((j+2)>>1)-1] <= temp_reg[j];
end // end of the for loop
i<=i+1;
end
end
if(i == 4)
begin
out_max <= temp_reg[0];
done <=1;
i <= i + 1;
end
if(i == 5)
done <=0;
end
endmodule
If the above code does not solve the problem, then:
The problem might be with timing. In Quartus Prime software, there are netlist viewer tools (The same is true for Vivado). When you check the generated circuit, you can see paths with many combinational blocks and feedback (caused mostly by the second for loop). If the second for loop does not have enough time to complete execution in a single clock, you will loose synchronization and the results will be unpredictable.
So,
Try the above code first
If the code does not solve your problem, then try FSM (with case and if - else - if statements ) even the code becomes longer or looks uglier (not user friendly), it is more FPGA hardware friendly.

vhdl code (for loop)

Description:
I want to write vhdl code that finds the largest integer in the array A which is an array of 20 integers.
Question:
what should my algorithm look like, to input where the sequential statements are?
my vhdl code:
highnum: for i in 0 to 19 loop
i = 0;
i < 20;
i<= i + 1;
end loop highnum;
This does not need to be synthesizable but I dont know how to form this for loop a detailed example explaining how to would be appreciated.
Simply translating the C loop to VHDL, inside a VHDL clocked process, will work AND be synthesisable. It will generate a LOT of hardware because it has to generate the output in a single clock cycle, but that doesn't matter if you are just simulating it.
If that is too much hardware, then you have to implement it as a state machine with at least two states, Idle and Calculating, so that it performs only one loop iteration per clock cycle while Calculating, and returns to the Idle state when done.
First of all you should know how have you defined the array in vhdl.
Let me define an array for you.
type array_of_integer array(19 downto 0) of integer;
signal A : array_of_integer :=(others => 0);
signal max : integer;
-- Now above is the array in vhdl of integers all are initialized to value 0.
A(0) <= 1;
A(1) <= 2;
--
--
A(19)<= 19;
-- Now the for loop for calculating maximum
max <= A(0);
for i in 0 to 19 loop
if (A(i) > max) then
max <= A(i);
end if;
end loop;
-- Now If you have problems in understating that where to put which part of code .. in a ----vhdl entity format .. i.e process, ports, etc... you can reply !

Verilog two dimensional array syntax

I would like to instantiate an array of registers, and declare them all according to a certain function. This is for a multiplier block that I'm hoping to construct.
The code I'm working with is below, but this is the line that the compiler does not appreciate:
q[i][7:0] = {8{a[i]}} & b[7:0];
As the code is written out, I hope to make the registers q[0],q[1],....q[7] all store the 8-bit value define by the RHS above. Can anyone tell me what would be the proper way to do this?
Entire code:
`timescale 1ns / 1ps
module multiplier_2(
input [7:0] A,
input [7:0] B,
output reg [15:0] P,
input start,
output stop
);
reg [7:0] q[7:0];
reg P = 0;
//create 8 bit vectors q[i]
genvar i;
generate
for (i = 0; i < 8;i = i+1)
begin: loop
q[i][7:0] = {8{a[i]}} & b[7:0];
end
endgenerate
always # (*)
begin
if (start == 1'b1)
begin
for (i = 0; i < 8; i = i+1)
begin
P = P + (q[i] << i);
end
end
end
endmodule
EDIT: this code also doesn't work:
`timescale 1ns / 1ps
module multiplier_2(
input [7:0] a,
input [7:0] b,
output reg [15:0] P = 16'd0,
input start,
output stop
);
reg [7:0] q[7:0];
//create 8 bit vectors q[i]
genvar i;
generate
always begin
for (i = 0; i < 8;i = i+1)
begin: loop
q[i] = {8{a[i]}} & b[7:0];
end
end
endgenerate
always # (*)
begin
stop = 1'b0;
if (start == 1'b1)
begin
for (i = 0; i < 8; i = i+1)
begin
P = P + (q[i] << i);
end
end
stop = 1'b1;
end
endmodule
Error message:
"Line 16: Procedural assignment to a non-register i is not permitted, left-hand side should be reg/integer/time/genvar"
I do not think this require a generate statement. A standard for loop will work:
reg [7:0] q [0:7];
integer i;
always #* begin
for (i = 0; i < 8; i=i+1) begin: loop
q[i] = {8{a[i]}} & b[7:0];
end
end
Beware of what hardware you are implying though. For loops like generate statements imply parallel hardware.
NB: it is more common to list memories with the depth from 0 to x ie: reg [7:0] q [0:7];
You've got all sorts of issues here. First off, you're getting confused about what a generate statement is, and what you're trying to generate. Are you (1) trying to generate a single always block, which must contain sequential/procedural code, or are you (2) trying to generate/replicate 8 continuous assignments?
You're presumably not doing (1), since there's no point in generating a single always block; the generate is redundant. That leaves (2). So, get rid of the always begin after the generate. The i in your loop is now the 'genvar', or generation variable, and you're replicating 8 assignments; so far, so good. Get rid of the begin:loop and end; you're replicating a single statement, so they're pointless verbiage.
Next problem: the generate loop is now creating concurrent, or parallel, statements; in Verilog-speak, they're module-level statements. They means that they must be continuous assignments, ie they must have an assign in front of them, and not just ordinary procedural assignments, as you've written them. That also means that q must be declared as a wire, and not a reg. There's no good reason for this; it's just how Verilog is.
You now have a second always block, which is a concurrent (module-level) statement, which must contain sequential/procedural code. The i you're referring to in this block is the original genvar, which doesn't work. A genvar can only be used in specific generation-related circumstances; this isn't inside a generate, and you need an ordinary variable here as your index. you can do this by naming your outer begin/end, and declaring a variable inside it, or any other way. You'll now find out that you're creating a procedural assignment to net stop; this is illegal, so change stop's declaration to a reg. This should be enough to get your code to compile.
BTW, #(*) is verbose and unnecessary, and has historically confused at least one tool. #* is more concise.
You've got other issues. Your second always contains a loop. It looks like it might be logically correct, but your synthesiser has to unroll this, and carry out 8 additions, and set stop. This isn't going to work in real life. Think about making these additions concurrent and putting them in a generate, or creating a clocked pipeline, and some more robust (clocked) way of creating stop.

Resources