The encoded message contains negative integers. Also the buffer of the BitWriter doesn't change.
ChatGPT helped me to fix errors regarding operator precedence.
import java.util.Arrays;
public class RansEncoder {
private final int[] symbolFrequencies;
private final int totalFrequency;
private final int[] cumulativeDistributionFunction;
private final int numberOfQuantizationBits;
private final int numberOfNormalizationBits;
private final int numberOfBitsToReadAndWrite;
private int state;
public RansEncoder(int numberOfQuantizationBits, int numberOfNormalizationBits, int numberOfBitsToReadAndWrite, int[] symbolFrequencies) {
this.numberOfQuantizationBits = numberOfQuantizationBits;
this.numberOfNormalizationBits = numberOfNormalizationBits;
this.numberOfBitsToReadAndWrite = numberOfBitsToReadAndWrite;
this.totalFrequency = 1 << numberOfQuantizationBits;
this.symbolFrequencies = quantizeSymbolFrequencies(symbolFrequencies);
this.cumulativeDistributionFunction = getCumulativeDistributionFunction();
this.state = 1 << numberOfNormalizationBits;
}
private int[] quantizeSymbolFrequencies(int[] symbolFrequencies) {
final int totalFrequencyBeforeQuantization = Arrays.stream(symbolFrequencies).sum();
return Arrays.stream(symbolFrequencies)
.mapToDouble(Double::valueOf)
.map(symbolFrequency -> symbolFrequency / totalFrequencyBeforeQuantization)
.map(symbolProbability -> symbolProbability * totalFrequency)
.mapToLong(Math::round)
.mapToInt(Math::toIntExact).toArray();
}
private int[] getCumulativeDistributionFunction() {
int[] cumulativeDistributionFunction = new int[symbolFrequencies.length + 1];
for (int i = 0; i < symbolFrequencies.length + 1; i++) {
cumulativeDistributionFunction[i] = Arrays.stream(symbolFrequencies).limit(i).sum();
}
return cumulativeDistributionFunction;
}
public void write(int symbol, BitWriter writer) {
normalizeOverflowedStateByWritingBits(symbol, writer);
encode(symbol);
}
private void normalizeOverflowedStateByWritingBits(int symbol, BitWriter writer) {
while (state >= symbolFrequencies[symbol] << numberOfNormalizationBits) {
writer.writeBits(state & ((1 << numberOfBitsToReadAndWrite) - 1), numberOfBitsToReadAndWrite);
state >>= numberOfBitsToReadAndWrite;
}
}
private void encode(int symbol) {
int nextState = getNextState(symbol);
state = nextState;
}
private int getNextState(int symbol) {
int nextState = (Math.floorDiv(state, symbolFrequencies[symbol]) << numberOfQuantizationBits) + cumulativeDistributionFunction[symbol] + state % symbolFrequencies[symbol];
return nextState;
}
}
The utility classes can be found here and here. I don't think they cause the issue because they are used in a big open source project.
A potential error could be the initialization of the state or the implementation of the normalizeOverflowedStateByWritingBits method.
The test uses 8, 16, 8 for the number of quantization, normalization and read/write bits. The symbolFrequencies are {5000, 2000, 2000, 1000} and the message to be encoded is an array consisting of 4000 random integers between 0 and 4 exclusive.
I used this implementation as a reference.