Deleting and Rebuilding a DynamoDB Table

Sometimes you might want to update a large DynamoDB table that holds terabytes of data. This could involve updating a certain field with a new value, or adding a new field to each record. If the table needs to be online at all times, and if the data cannot be rebuilt from another source, then this might be an issue.

If the table can go offline for a while, and if it can be rebuilt from another source, then one option that comes to mind is a transformation job, run on AWS EMR or AWS Glue, that scans the table and updates each record. Although this option is straightforward, it can be costly: you pay for reading the whole table and for writing every updated record. A transformation that runs for a long time while consuming a large number of read capacity units (RCU) and write capacity units (WCU) can incur cost quickly.
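To get a rough sense of that cost, here is a minimal sketch that estimates the read and write units for a scan-and-update pass. It assumes the documented unit sizes (one read unit per 4 KB for a strongly consistent read, half that for an eventually consistent read, one write unit per 1 KB written), treats every item as having the same average size, and ignores per-page rounding in the scan. The function names are my own, and the prices are placeholder inputs, not real pricing; look up the current rates for your region and billing mode.

```python
def ceil_div(a, b):
    """Integer division rounded up, avoiding float rounding on large numbers."""
    return -(-a // b)


def scan_and_update_units(item_count, avg_item_bytes, strongly_consistent=False):
    """Roughly estimate (read_units, write_units) for rewriting every item."""
    total_bytes = item_count * avg_item_bytes
    # A scan is charged on the data it reads: one unit per 4 KB when strongly
    # consistent, half of that when eventually consistent.
    read_units = ceil_div(total_bytes, 4096)
    if not strongly_consistent:
        read_units = ceil_div(read_units, 2)
    # Every write is rounded up to the next 1 KB of item size.
    write_units = item_count * ceil_div(avg_item_bytes, 1024)
    return read_units, write_units


def estimate_cost(read_units, write_units, price_per_million_reads, price_per_million_writes):
    """Turn unit counts into a cost, using prices supplied by the caller."""
    return (read_units / 1_000_000) * price_per_million_reads + (
        write_units / 1_000_000
    ) * price_per_million_writes


# Hypothetical table: one billion items of 2,000 bytes each (about 2 TB).
reads, writes = scan_and_update_units(1_000_000_000, 2000)
print(reads, writes)  # 244140625 2000000000
```

In this example the writes dominate: each 2,000-byte item costs two write units, so even a cheap scan still leaves two billion write units to pay for.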

Here is an alternative: you can delete the table and build it again from the source. This is a less costly approach, since DynamoDB does not charge for deleting a table and there is no transformation job involved. The only cost is the WCU consumed when rebuilding the table. Of course, if rebuilding the table is itself a costly operation, then deleting the table might be a terrible option.
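The delete-and-build flow could look something like the sketch below. It assumes a boto3 DynamoDB client and Table resource are passed in, and that you already have the table definition as the keyword arguments for create_table. The function names, the table name "orders", and the spec in the usage comment are hypothetical; how you read items back from your source is up to you.

```python
def delete_and_recreate(client, table_spec):
    """Delete the table named in table_spec, then create it again.

    client is expected to be a boto3 DynamoDB client, and table_spec the
    keyword arguments you would pass to create_table.
    """
    name = table_spec["TableName"]
    client.delete_table(TableName=name)
    # Deletion is asynchronous, so wait until the table is really gone.
    client.get_waiter("table_not_exists").wait(TableName=name)
    client.create_table(**table_spec)
    # Creation is asynchronous too; wait before writing to the new table.
    client.get_waiter("table_exists").wait(TableName=name)


def reload_items(table, items):
    """Write items into a boto3 Table resource and return how many were sent."""
    count = 0
    # batch_writer groups the puts into batches and resends unprocessed items.
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
            count += 1
    return count


# Usage sketch (hypothetical names, requires boto3 and AWS credentials):
#   import boto3
#   spec = {
#       "TableName": "orders",
#       "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
#       "AttributeDefinitions": [{"AttributeName": "pk", "AttributeType": "S"}],
#       "BillingMode": "PAY_PER_REQUEST",
#   }
#   delete_and_recreate(boto3.client("dynamodb"), spec)
#   reload_items(boto3.resource("dynamodb").Table("orders"), items_from_source)
```

Passing the client and table in, rather than creating them inside the functions, makes it easy to point the same code at a non-prod account first.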

With this option, however, if you deploy with an infrastructure-as-code tool, you want to ensure that it is not affected by the delete-and-build operation. After testing the delete-and-build operation in your non-prod environment, run your infrastructure-as-code deployment against that environment to ensure that the table is detected correctly.

In my case, I use CDK to manage my infrastructure, and I noticed that rebuilding the table does not cause CDK runs to fail. On the other hand, CDK does not update the rebuilt table to match the specification defined in the CDK code. This is likely because CDK only deploys changes when it detects changes made in the CDK code. To work around this, I had to recreate the table to match the CDK configuration exactly.
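One way to check that the rebuilt table matches what the CDK code expects is to compare the output of describe_table against the expected definition. The sketch below is only an illustration: the function names are my own, it compares just the key schema, attribute definitions, and billing mode, and it assumes that a missing billing mode means provisioned capacity. A real check would probably also cover indexes, streams, and other settings you define in CDK.

```python
def _by_name(entries):
    """Sort schema entries by attribute name so ordering does not matter."""
    return sorted(entries, key=lambda entry: entry["AttributeName"])


def _normalize(key_schema, attribute_definitions, billing_mode):
    return {
        "KeySchema": _by_name(key_schema),
        "AttributeDefinitions": _by_name(attribute_definitions),
        "BillingMode": billing_mode,
    }


def table_drift(described, expected):
    """Return {field: (expected, actual)} for every compared field that differs.

    described is the response of describe_table; expected uses the same
    field names as a create_table call.
    """
    live = described["Table"]
    actual = _normalize(
        live.get("KeySchema", []),
        live.get("AttributeDefinitions", []),
        # Assumption: no billing summary means provisioned capacity.
        live.get("BillingModeSummary", {}).get("BillingMode", "PROVISIONED"),
    )
    wanted = _normalize(
        expected.get("KeySchema", []),
        expected.get("AttributeDefinitions", []),
        expected.get("BillingMode", "PROVISIONED"),
    )
    return {field: (wanted[field], actual[field]) for field in wanted if wanted[field] != actual[field]}


# Usage sketch (hypothetical table name, requires boto3 and AWS credentials):
#   import boto3
#   described = boto3.client("dynamodb").describe_table(TableName="orders")
#   print(table_drift(described, expected_spec) or "table matches the expected spec")
```

An empty result means the compared fields match; anything else tells you which part of the rebuilt table to fix before the next CDK deployment.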

As with everything, test the changes before applying them in production.