Amazon DynamoDB uses the partition as its unit of storage (each partition is automatically replicated across three Availability Zones). When it comes to sizing, there are two considerations:
- data size
- network throughput
A partition can hold a maximum of 10 GB of data, so if you have 50 GB of data to store, the number of partitions needed is 50/10 = 5.
This criterion alone is only sufficient if you are happy with the default throughput a partition provides. Let's define throughput using DynamoDB's terminology:
Read Capacity Unit (RCU) – one strongly consistent read per second, for an item up to 4 KB
Write Capacity Unit (WCU) – one write per second, for an item up to 1 KB
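To make the definitions concrete, here is a minimal sketch of the capacity arithmetic. The 4 KB and 1 KB rounding follows the RCU/WCU definitions above; the workload numbers and function names are my own illustration:

```python
import math

def rcus_needed(item_size_kb: float, reads_per_sec: int) -> int:
    """Each strongly consistent read costs ceil(item size / 4 KB) RCUs."""
    return math.ceil(item_size_kb / 4) * reads_per_sec

def wcus_needed(item_size_kb: float, writes_per_sec: int) -> int:
    """Each write costs ceil(item size / 1 KB) WCUs."""
    return math.ceil(item_size_kb / 1) * writes_per_sec

# Reading a 6 KB item 100 times/sec: ceil(6/4) = 2 RCUs per read -> 200 RCUs
print(rcus_needed(6, 100))   # 200
# Writing a 2.5 KB item 10 times/sec: ceil(2.5/1) = 3 WCUs per write -> 30 WCUs
print(wcus_needed(2.5, 10))  # 30
```

Note the rounding: a 4.1 KB item costs 2 RCUs per read, not 1.025, which is why small items are the most capacity-efficient.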
Each partition provides up to 3,000 RCUs or 1,000 WCUs, or a linear combination of the two. So, in short, the formula to calculate the number of partitions needed for throughput is
ceil(RCUs needed/3000 + WCUs needed/1000)
Now this assumes that your data size is less than 10 GB. If the data size is more than 10 GB you will need more than one partition even if a single partition meets your read/write needs; the actual partition count is the greater of the size-based and throughput-based estimates.
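Putting the two criteria together, the estimate can be sketched as follows (the function name is mine; the 10 GB, 3,000 RCU, and 1,000 WCU limits are the per-partition numbers from above):

```python
import math

def partition_count(size_gb: float, rcus: int = 0, wcus: int = 0) -> int:
    """Partitions needed = max(size-based estimate, throughput-based estimate)."""
    by_size = math.ceil(size_gb / 10)                 # 10 GB per partition
    by_throughput = math.ceil(rcus / 3000 + wcus / 1000)
    return max(by_size, by_throughput, 1)             # a table has at least 1

print(partition_count(50))                   # 5  (size-bound: 50 GB / 10 GB)
print(partition_count(8, rcus=7500))         # 3  (throughput-bound: 7500/3000)
```

The second call shows why the max matters: 8 GB fits in one partition, but 7,500 RCUs forces three.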
In DynamoDB a row is called an item. A partition contains one or more items, which means it can contain just one item. Even so, the size limit on an individual item (including its attribute names and values) is 400 KB, not 10 GB; the 10 GB limit applies to the partition as a whole.
An item in DynamoDB can have one of two types of primary key. Think of them as the primary key and the composite key of the relational world.
The first type is a hash key alone, and the second type is a combination of a hash key and a range key. Items sharing the same hash key are stored sorted by their range key.
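A small sketch of that layout, using made-up items (the keys and values here are hypothetical, not a DynamoDB API call):

```python
# Hypothetical items as (hash key, range key, payload). DynamoDB groups items
# by hash key and keeps each group physically sorted by range key.
items = [
    ("user#1", "2024-03-01", "login"),
    ("user#2", "2024-01-15", "signup"),
    ("user#1", "2024-01-05", "signup"),
    ("user#1", "2024-02-10", "purchase"),
]

# Emulate the storage order: group by hash key, sort by range key within a group.
layout = sorted(items, key=lambda it: (it[0], it[1]))
for hash_key, range_key, payload in layout:
    print(hash_key, range_key, payload)
```

This ordering is what makes range queries ("all events for user#1 between two dates") cheap: the matching items are physically adjacent.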
Now let's say a partition has 10 items in it. DynamoDB assumes that the load across these items will be evenly distributed. What if one or two keys get more load than the others? Those are called hot keys. The way to resolve a hot-key issue is to redesign the hash key so that the load spreads more evenly.
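One common redesign is write sharding: append a suffix to the hot hash key so writes for that key spread across several partitions, at the cost of having to query every shard on read. A minimal sketch, assuming a fixed shard count and names of my own choosing:

```python
import random

SHARDS = 10  # assumption: spread one hot key over 10 logical shards

def sharded_key(hot_key: str) -> str:
    """On write, append a random suffix so items land on different partitions."""
    return f"{hot_key}#{random.randrange(SHARDS)}"

def all_shard_keys(hot_key: str) -> list:
    """On read, the application must query every shard and merge the results."""
    return [f"{hot_key}#{i}" for i in range(SHARDS)]

print(sharded_key("popular-item"))     # e.g. "popular-item#7"
print(all_shard_keys("popular-item"))  # "popular-item#0" ... "popular-item#9"
```

A calculated suffix (say, a hash of another attribute) works too, and lets you read back a single shard when you know which one to target.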