Running sysbench against EBS volumes, I'm finding that the gp3 volumes are much slower than gp2. Even when I provision a volume with 16000 IOPS, after waiting for optimization I'm getting a cap at about 1000 IOPS when monitoring through Percona PMM, New Relic, and CloudWatch.
Instance:
- C5.4xlarge Ubuntu 18.04
Volumes:
- gp2, 3000 GB (gives 9000 IOPS)
- gp3, 3000 GB, 9000 IOPS, 250 MiB/s throughput
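For reference, the 9000 IOPS figure for the gp2 volume follows from gp2's size-based baseline of 3 IOPS per GiB, with a floor of 100 and a ceiling of 16000. A minimal sketch of that arithmetic (the function name is only illustrative, and the 3000 GB size is treated as GiB):

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume: 3 IOPS per GiB, min 100, max 16000."""
    return min(max(3 * size_gib, 100), 16000)


# The 3000 GB gp2 volume from the list above.
print(gp2_baseline_iops(3000))
```

A gp3 volume has no such coupling: IOPS and throughput are provisioned independently of size, which is why the gp3 line lists them explicitly.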
Sysbench results below:
sysbench --file-total-size=15G --file-num=16 fileio prepare
sysbench --file-total-size=15G --file-num=16 --file-test-mode=rndrw --time=600 fileio run
sysbench --file-total-size=15G --file-num=16 fileio cleanup
gp3 (9000 IOPS, 3000 GB, 250 MiB/s):
File operations:
reads/s: 576.37
writes/s: 384.24
fsyncs/s: 153.70
Throughput:
read, MiB/s: 9.01
written, MiB/s: 6.00
General statistics:
total time: 600.0333s
total number of events: 668612
Latency (ms):
min: 0.00
avg: 0.90
max: 337.40
95th percentile: 3.89
sum: 599693.33
Threads fairness:
events (avg/stddev): 668612.0000/0.00
execution time (avg/stddev): 599.6933/0.00
gp2 (9000 IOPS, 3000 GB):
File operations:
reads/s: 1523.68
writes/s: 1015.79
fsyncs/s: 406.33
Throughput:
read, MiB/s: 23.81
written, MiB/s: 15.87
General statistics:
total time: 600.0064s
total number of events: 1767487
Latency (ms):
min: 0.00
avg: 0.34
max: 70.10
95th percentile: 1.06
sum: 599390.12
Threads fairness:
events (avg/stddev): 1767487.0000/0.00
execution time (avg/stddev): 599.3901/0.00
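As a quick sanity check on the two reports, the read and write rates can be added up and compared with the provisioned 9000 IOPS. The sketch below only reuses figures copied from the output above; the dictionary layout and function name are illustrative.

```python
# Figures copied from the two sysbench reports above.
results = {
    "gp3": {"reads": 576.37, "writes": 384.24, "read_mib": 9.01},
    "gp2": {"reads": 1523.68, "writes": 1015.79, "read_mib": 23.81},
}


def summarize(r):
    """Return (read+write ops/s, approx. KiB per read, read/write ratio)."""
    iops = r["reads"] + r["writes"]
    block_kib = r["read_mib"] * 1024 / r["reads"]
    rw_ratio = r["reads"] / r["writes"]
    return iops, block_kib, rw_ratio


for name, r in results.items():
    iops, block_kib, rw_ratio = summarize(r)
    print(f"{name}: {iops:.0f} IOPS, ~{block_kib:.0f} KiB per I/O, "
          f"read/write ratio {rw_ratio:.2f}")

speedup = summarize(results["gp2"])[0] / summarize(results["gp3"])[0]
print(f"gp2 / gp3 = {speedup:.1f}x")
```

The gp3 total comes out just under 1000 read+write operations per second, which matches the cap seen in monitoring, and both runs sit far below the 9000 IOPS each volume should be able to deliver.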
Percona PMM gp3 > gp2 comparison:

My initial enthusiasm for gp3 was dampened by its inferior performance compared to gp2. However, when I set the size, IOPS, and bandwidth the same on C5, M5, and M5A instance types, I got similar performance measurements from fio: https://github.com/axboe/fio
I'm testing on CentOS 7.8, fio-3.7-2.el7.x86_64. Region us-east-1.
Are you sure sysbench is the right benchmarking tool? The fileio options don't seem well documented, and I can get wildly different results when changing the sync mode, flags, etc.
Your example command is equivalent to this because of the default values:
When I run a different sysbench command, notably using direct access and fsync=off, I get equivalent performance from gp2 and gp3. (Actually, gp3 is slightly better.)
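As a rough illustration of the two variants discussed here, the sketch below builds a sysbench command line with the fileio defaults spelled out, and one using direct I/O with periodic fsync disabled. The specific flags and default values are assumptions based on sysbench 1.0 (`--threads=1`, 16 KiB blocks, synchronous I/O, an fsync every 100 requests, read/write ratio 1.5) and are not taken from the post; check them against `sysbench fileio help` on your version.

```python
import shlex

# Options taken from the command in the question.
BASE = ["sysbench", "--file-total-size=15G", "--file-num=16",
        "--file-test-mode=rndrw", "--time=600"]

# ASSUMPTION: sysbench 1.0 fileio defaults, written out explicitly.
ASSUMED_DEFAULTS = [
    "--threads=1",
    "--file-block-size=16384",
    "--file-io-mode=sync",
    "--file-fsync-freq=100",
    "--file-rw-ratio=1.5",
]

# ASSUMPTION: one way to request direct access (O_DIRECT) with no periodic fsync.
DIRECT_NO_FSYNC = ["--file-extra-flags=direct", "--file-fsync-freq=0"]


def build(extra):
    """Return the full argv for a `fileio run` with the extra options."""
    return BASE + list(extra) + ["fileio", "run"]


default_cmd = build(ASSUMED_DEFAULTS)
direct_cmd = build(DIRECT_NO_FSYNC)
print(shlex.join(default_cmd))
print(shlex.join(direct_cmd))
```

The script only prints the command lines; running them still requires the `prepare` step from the question to have created the test files on the volume under test.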
Here's a summary of my fio data, using the command
fio --rw=TEST --direct=1 --ioengine=libaio --bs=16k --numjobs=8 --size=1G --group_reporting
with TEST={read,write,randread,randwrite,randrw}
gp2:
gp3:
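The fio sweep described above can be scripted so that the same five tests are run against each volume. This is a minimal sketch that only builds and prints the command lines. The `--name` option and the `ebs-test` job name are additions for illustration, since fio expects a job name when a job is defined on the command line; everything else is copied from the command quoted above.

```python
import shlex

# The five access patterns substituted for TEST.
TESTS = ["read", "write", "randread", "randwrite", "randrw"]


def fio_command(test, name="ebs-test"):
    """Return the fio argv for one access pattern (job name is hypothetical)."""
    return [
        "fio",
        f"--name={name}-{test}",
        f"--rw={test}",
        "--direct=1",
        "--ioengine=libaio",
        "--bs=16k",
        "--numjobs=8",
        "--size=1G",
        "--group_reporting",
    ]


commands = [fio_command(t) for t in TESTS]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, cwd=...)` with `cwd` set to a directory on the gp2 or gp3 volume, so that fio creates its test files on the volume being measured.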