Storing relational data in AWS DynamoDB vs AWS RDS

Example data: Cumulus granules/files data

Recently on Cumulus, we wanted to re-examine how we’re storing our granules/files data. For context, on Cumulus a granule is a set of related data science files, so granules are composed of files. The granules and files are stored in separate tables. Granules to files have a one-to-many relationship.

Cumulus is currently using DynamoDB to store this data, which is a NoSQL database, and as a result there is no referential integrity between the file records and the granule record they belong to. This means that when you write a file record to DynamoDB, there’s nothing at the database level enforcing that it must refer to a valid granule record (by that record’s ID usually).

DynamoDB vs RDS

While my first instinct for storing interrelated data is to use RDS/SQL databases, you can model relational data in DynamoDB using a technique called “Global secondary index overloading”. See the section titled “GSI overloading” in this article for an explanation of this technique.

Comparing DynamoDB using GSI overloading to using RDS, these were some takeaways:

  • Using GSI overloading/DynamoDB requires having a very good understanding of the queries that you will run against your data up front so that you optimize your model/indexes to handle them performantly

Conclusion

The biggest takeaway of using DynamoDB vs RDS for relational data is:

DynamoDB can handle/mimic relational data queries when the queries are predictable, but it is not at all the right solution for ad-hoc/unpredictable queries

In other words, the efficiency of using DynamoDB/GSI overloading depends on your queries being well understood up front so that the model/indexes can be designed to accommodate them. However, if you throw a query at DynamoDB that it was not designed to handle well, it may require table scanning and be very slow/costly. With RDS/SQL, the schema and data entity attributes are well defined, so supporting a new query on a given field can usually be achieved efficiently and without much difficulty (by adding new indexes, etc).

Trying to solve more problems than I create

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store