Here we’ll revisit some of the more conceptual differences between using SQL and NoSQL databases, and touch upon the lambda architecture, a model that helps you think about working with large and/or complex datasets.

Much of the material comes from the book by Marz and Warren, “Big Data: Principles and best practices of scalable realtime data systems” (Manning Publications, 2015).

Note: whenever I mention “application” below, I don’t necessarily mean a program with a user interface that you install on your Windows machine as an .exe, or that ends up in your Applications folder if you’re using OS X. “Application” is used more generically here, and also covers cases where you’re investigating data in R or SAS (either interactively or non-interactively).