I am trying to build a simple UART receive parser for a command line, based on the example from the Tang Nano 9K repo here. Here is my modified version. It uses a memory to hold the received values, and that part works. Once I have received 5 characters I would like to send them back to the host, by looping through the memory items and sending the next one each time tx_data_ready is set.
    module uart_test(
        input            clk,
        input            rst_n,
        input            uart_rx,
        output           uart_tx,
        output reg [5:0] led
    );
        parameter  CLK_FRE    = 27;      //MHz
        parameter  UART_FRE   = 115200;  //bps
        localparam IDLE       = 0;
        localparam SEND       = 1;
        localparam WAIT       = 2;
        localparam PROCESS_RX = 3;
        parameter  MSG_SIZE   = 256;

        reg [7:0]  tx_data;
        reg [7:0]  tx_str;
        reg        tx_data_valid;
        wire       tx_data_ready;
        reg [7:0]  tx_cnt;
        wire [7:0] rx_data;
        wire       rx_data_valid;
        wire       rx_data_ready;
        reg [31:0] wait_cnt;
        reg [3:0]  state;
        reg [7:0]  message [MSG_SIZE - 1:0];
        reg [7:0]  rx_index;
        integer    i;
        localparam DATA_NUM2 = 3; // just for testing

        assign rx_data_ready = 1'b1; //always can receive data,

        always@(posedge clk or negedge rst_n)
        begin
            if(rst_n == 1'b0)
            begin
                led           <= ~6'b100000;
                wait_cnt      <= 32'd0;
                tx_data       <= 8'd0;
                state         <= IDLE;
                tx_cnt        <= 8'd0;
                tx_data_valid <= 1'b0;
                rx_index      <= 8'd0;
                for (i = 0; i < 256; i = i+1) begin
                    message[i] <= 0;
                end
            end
            else
                case(state)
                    IDLE:
                    begin
                        if(rx_data_valid == 1'b1)
                        begin
                            message[rx_index] <= rx_data; // send uart received data
                            if(rx_index >= 8'd4)
                            begin
                                rx_index <= 8'd0;
                                tx_cnt   <= 8'd0;
                                state    <= PROCESS_RX;
                            end
                            else begin
                                led      <= ~rx_data[5:0];
                                rx_index <= rx_index + 8'd1;
                            end
                        end
                    end
                    PROCESS_RX:
                    begin
                        tx_data       <= tx_str;
                        tx_data_valid <= 1'b1;
                        state         <= WAIT;
                    end
                    WAIT:
                    begin
                        if(tx_data_valid == 1'b1 && tx_data_ready == 1'b1) begin
                            tx_data_valid <= 1'b0;
                            tx_cnt        <= tx_cnt + 8'd1; //Send data counter
                            if(tx_cnt <= (DATA_NUM2 - 1)) begin
                                state <= PROCESS_RX;
                                led   <= ~6'b000110;
                            end
                            else begin
                                led   <= ~6'b000100;
                                state <= IDLE;
                            end
                        end
                    end
                    default:
                        state <= IDLE;
                endcase
        end

        always@(tx_cnt)
            tx_str <= message[tx_cnt];

        uart_rx#
        (
            .CLK_FRE(CLK_FRE),
            .BAUD_RATE(UART_FRE)
        ) uart_rx_inst
        (
            .clk           (clk),
            .rst_n         (rst_n),
            .rx_data       (rx_data),
            .rx_data_valid (rx_data_valid),
            .rx_data_ready (rx_data_ready),
            .rx_pin        (uart_rx)
        );

        uart_tx#
        (
            .CLK_FRE(CLK_FRE),
            .BAUD_RATE(UART_FRE)
        ) uart_tx_inst
        (
            .clk           (clk),
            .rst_n         (rst_n),
            .tx_data       (tx_data),
            .tx_data_valid (tx_data_valid),
            .tx_data_ready (tx_data_ready),
            .tx_pin        (uart_tx)
        );
    endmodule
The design fails to synthesize because it runs out of SLICEs.
If I just comment out the line that increments the index variable used for indexing the memory, i.e. tx_cnt <= tx_cnt + 8'd1;, then it builds OK.
I have tried different things without success. A Verilog expert can probably see the problem quickly, but I am not getting it.
What am I doing wrong?

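The problem is that the synthesis tool is trying to build a 256-location, 8-bit-wide memory out of logic fabric/slices/registers, and the physical FPGA does not have the resources to implement it. That also explains why commenting out the tx_cnt increment lets the design build: tx_cnt then stays at 0, only message[0] is ever read, and the tool can optimize the rest of the memory away.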
Here are two solutions:
1. Reduce the memory size. 256 elements seems like a lot when only 5 are needed. Changing the size to 16 reduces the utilization burden and still leaves margin over the 5 that are needed.
Change the parameter to
    parameter MSG_SIZE = 16;
and change the for loop bound to
    for (i = 0; i < 16; i = i+1) begin
2. Code the memory in a style from which the tool will infer a BRAM primitive (rather than logic fabric/slices/registers). FPGA BRAMs do not have a reset input that clears their stored contents, so it is not possible to reset or initialize them using a reset signal in RTL. Most applications don't need such a reset, and there does not seem to be a reason this application needs one. Make sure the design writes at least one value to the memory before reading one.
If the tools detect a coding style that attempts to reset the memory (like the posted code), they will build the memory out of logic fabric/slices/registers. Removing the code that attempts to reset the memory allows the synthesis tool to infer a BRAM and use 0 slices for the memory. For this solution, keep the memory size at 256, because if the memory is smaller than some threshold the tool may decide not to spend a BRAM on what it considers a very small memory.
Another coding style issue is that the read side of the memory is modeled without a clock (always@(tx_cnt)), which behaves like a transparent latch in simulation. BRAMs only provide clocked reads, so this also prevents BRAM inference. Latches are generally not recommended for RTL design. Here is a small rewrite that moves the memory into its own synchronous process, which helps the tools infer a BRAM.
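As a minimal sketch of that idea (not a drop-in replacement), the memory could be pulled out into a small module with a clocked write and a clocked read. The module name msg_ram, its ports, and the ADDR_WIDTH/DATA_WIDTH parameters are hypothetical and do not come from the posted code. Whether a given tool actually maps this to a BRAM should be confirmed in the synthesis report.

```verilog
// Hypothetical helper module: the name, ports, and parameters are
// illustrative only and are not part of the posted design.
module msg_ram #(
    parameter ADDR_WIDTH = 8,   // 2**8 = 256 locations
    parameter DATA_WIDTH = 8
)(
    input  wire                  clk,
    input  wire                  we,     // write enable
    input  wire [ADDR_WIDTH-1:0] waddr,  // write address
    input  wire [DATA_WIDTH-1:0] wdata,  // write data
    input  wire [ADDR_WIDTH-1:0] raddr,  // read address
    output reg  [DATA_WIDTH-1:0] rdata   // updated on the clock edge that samples raddr
);
    reg [DATA_WIDTH-1:0] mem [0:(1 << ADDR_WIDTH) - 1];

    // The memory has its own process: clocked only by clk, no reset
    // branch, and no loop that clears every location.
    always @(posedge clk) begin
        if (we)
            mem[waddr] <= wdata;
        rdata <= mem[raddr];    // registered (synchronous) read
    end
endmodule
```

In the posted design, one plausible hookup would be we driven by rx_data_valid while in IDLE, waddr from rx_index, wdata from rx_data, raddr from tx_cnt, and rdata taking the place of tx_str. The message array and its reset loop would then be removed from the state machine process.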
Please run the simulations to make sure it behaves as desired. It may need a read enable if the data does not come out of the memory at the right time.
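The read-timing concern can be checked in simulation before going to hardware. The self-checking testbench below is a sketch under assumptions: it models the memory inline with the same clocked write and read style, and the names sync_read_tb, mem, raddr, and rdata are hypothetical. It writes five bytes and then checks that each one appears on rdata only after the clock edge that follows its address being applied.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench: all names here are illustrative only.
module sync_read_tb;
    reg        clk   = 1'b0;
    reg        we    = 1'b0;
    reg  [7:0] waddr = 8'd0;
    reg  [7:0] wdata = 8'd0;
    reg  [7:0] raddr = 8'd0;
    reg  [7:0] rdata;
    reg  [7:0] mem [0:255];
    integer    k;
    integer    errors;

    always #5 clk = ~clk;               // free-running clock for simulation

    // Inline memory model: clocked write, clocked read, no reset.
    always @(posedge clk) begin
        if (we)
            mem[waddr] <= wdata;
        rdata <= mem[raddr];
    end

    initial begin
        errors = 0;

        // Write 5 bytes ("A".."E") to addresses 0..4, one per clock.
        for (k = 0; k < 5; k = k + 1) begin
            @(negedge clk);
            we    = 1'b1;
            waddr = k;
            wdata = 8'h41 + k;
        end
        @(negedge clk);
        we = 1'b0;

        // Read them back: apply the address, wait for one rising edge,
        // then check the registered output.
        for (k = 0; k < 5; k = k + 1) begin
            raddr = k;
            @(posedge clk);             // memory samples raddr here
            @(negedge clk);             // rdata has settled by now
            if (rdata !== 8'h41 + k) begin
                errors = errors + 1;
                $display("mismatch at address %0d: got %h", k, rdata);
            end
        end

        $display("read-back finished with %0d mismatches", errors);
        $finish;
    end
endmodule
```

With a registered read like this, a state machine that changes the read address and captures the data on the very next clock edge would pick up the previous location's value. Adding a one-cycle wait state (or a read enable) between updating tx_cnt and loading tx_data is one way to line the two up.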