I'm new to DynamoDB, so if someone could explain what I'm doing wrong or what I'm missing in my understanding, that would be great. I'm trying to find the most efficient way of searching for a row that contains some value. I'm playing around with some test data to see how it works and how to design everything.
I have a table with about 1700 rows. Some rows hold quite a lot of data. The PK is Id, and there are other attributes such as Name, Nationality, Description, etc. I also added a GSI on 'Name' with projection type 'KEYS_ONLY'.
Now, my scenario is to find a person whose name contains a given string. Let's say Name is 'Pablo Picasso' and I want to find any 'Picasso'. My assumption was that scanning the GSI should be pretty fast. I understand that a Scan can only go through 1 MB of data per call, but I assumed that my GSI looked something like this:
| Name | Id |
|---|---|
| A Hopper | 2 |
| Timoty c | 3 |
| Donald Duck | 14 |
With that in mind, I was sure it would find my row on the first scan. Unfortunately, my first scan only went through about 340 rows. I was able to find my row after 4 calls to DynamoDB. When I ran a similar scan against the table itself instead of the GSI, it took 5 calls, which doesn't seem that different.
Am I doing something wrong? Or have I misunderstood something?
For testing purposes I'm using C# code like this:
```csharp
var result = await _dynamoDb.ScanAsync(new ScanRequest(DynamoConstants.ArtistsTableName)
{
    IndexName = "NameIndex",
    FilterExpression = "contains(#Name, :name)",
    ExpressionAttributeNames = new Dictionary<string, string>() { { "#Name", "name" } },
    ExpressionAttributeValues = new Dictionary<string, AttributeValue>()
        { { ":name", new AttributeValue("Picasso") } }
});
```
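The repeated calls mentioned above can be sketched as a paging loop roughly like the one below. This is only a minimal sketch and not necessarily the exact code used: the names `ScanPaging` and `ScanAllPagesAsync` and the call counter are hypothetical and exist only for illustration, and the client and table name are assumed to be passed in by the caller. Each call resumes from the previous response's `LastEvaluatedKey` until no key is returned.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

public static class ScanPaging
{
    // Hypothetical helper: scans NameIndex page by page and counts the calls made.
    public static async Task<(List<Dictionary<string, AttributeValue>> Items, int Calls)> ScanAllPagesAsync(
        IAmazonDynamoDB dynamoDb, string tableName, string fragment)
    {
        var items = new List<Dictionary<string, AttributeValue>>();
        Dictionary<string, AttributeValue> startKey = null;
        var calls = 0;

        do
        {
            var request = new ScanRequest(tableName)
            {
                IndexName = "NameIndex",
                FilterExpression = "contains(#Name, :name)",
                ExpressionAttributeNames = new Dictionary<string, string> { { "#Name", "name" } },
                ExpressionAttributeValues = new Dictionary<string, AttributeValue>
                    { { ":name", new AttributeValue(fragment) } }
            };

            // Resume where the previous page stopped (not set on the first call).
            if (startKey != null)
            {
                request.ExclusiveStartKey = startKey;
            }

            var response = await dynamoDb.ScanAsync(request);
            calls++;

            // The filter is applied after a page is read, so a page may match nothing.
            if (response.Items != null)
            {
                items.AddRange(response.Items);
            }

            // A null or empty key means there are no more pages.
            startKey = response.LastEvaluatedKey;
        } while (startKey != null && startKey.Count > 0);

        return (items, calls);
    }
}
```

Comparing `response.ScannedCount` with `response.Count` on each page shows how many items were read versus how many passed the filter.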
My index looks like this:
```csharp
var nameIndex = new GlobalSecondaryIndex
{
    IndexName = "NameIndex",
    ProvisionedThroughput = new ProvisionedThroughput
    {
        ReadCapacityUnits = 5,
        WriteCapacityUnits = 5
    },
    Projection = new Projection { ProjectionType = "KEYS_ONLY" },
    KeySchema = new List<KeySchemaElement> {
        new() { AttributeName = "name", KeyType = "HASH"}
    }
};
```
EDIT: I did some more digging and found out that the GSI is in fact the same size as the whole table.
```json
...
"TableSizeBytes": 5435537,
"ItemCount": 1792,
"TableArn": "arn:aws:dynamodb:ddblocal:000000000000:table/artists",
"GlobalSecondaryIndexes": [
    {
        "IndexName": "NameIndex",
        "KeySchema": [
            {
                "AttributeName": "name",
                "KeyType": "HASH"
            }
        ],
        "Projection": {
            "ProjectionType": "KEYS_ONLY"
        },
        "IndexStatus": "ACTIVE",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5
        },
        "IndexSizeBytes": 5435537,
        "ItemCount": 1792,
        .....
```
But why? Is there anything wrong with how I create the index?