SQL vs. NoSQL

This blog can be treat as a brief summary of THIS BLOG.

THE DOOR OPENS ON A VIEW OF MOUNTAINS

Projects where SQL is ideal:

  1. logical related discrete data requirements which can be identified up-front
  2. data integrity is essential
  3. standards-based proven technology with good developer experience and support.

Projects where NoSQL is ideal:

  1. unrelated, indeterminate or evolving data requirements
  2. simpler or looser project objectives, able to start coding immediately
  3. speed and scalability is imperative.

   

BACKGROUND

SQL is a abbreviation of Structured Query Language, which usage exploded in the late 1990s with the rising popularity of web applications and open-source options such as MySQL, PostgreSQL and SQLite.

NoSQL databases have existed since the 1960s, but have been recently gaining traction with popular options such as MongoDB, CouchDB, Redis and Apache Cassandra.

SQL ACID vs. NoSQL CAP

ACID

CAP

SQL Tables vs. NoSQL Documents

SQL databases provide a store of related data tables, like this:

IDfnamelnameage
1JackWan20

while NoSQL database provide a store of JSON-like documents, like this:

{
  ID: 1,
  fname: "Jack",
  lname: "Wan",
  age: 20
}

Similar documents can be stored in a Collection in NoSQL, which is analogous to the tables of SQL. Furthermore, you can store any data you like in any document in NoSQL while you can only store certain types of data in SQL.

SQL Schema vs. NoSQL Schemaless

In an SQL database, you can’t add data until you define tables and field types in what’s referred to as a schema. The schema optionally contains other information, such as —

But in a NoSQL database, data can be added anywhere, at any time. There’s no need to specify a document design or even a collection up-front, in MongoDB, you can do something like this:

db.users.insert (
  ID: 1,
  fname: "Jack",
  lname: "Wan",
  age: 20
);
// MongoDB will automatically add a unique _id value to each document in a collection. 

SQL Normalization vs. NoSQL Denormalization

In SQL, if we want to add country information to our users, we can add a foreign key country_id to our users table. This minimizes data redundancy; we’re not repeating the country information for every user — only the reference to it. This technique is known as normalization, and has practical benefits. We can update a single country without changing user data.

In NoSQL, we can also do something like this:

{
  ID: 1,
  fname: "Jack",
  lname: "Wan",
  age: 20,
  country_id: "EA001"
}

However, this is not always practical, whose reasons will be explained later. We may opt to denormalize our documents and repeat country information for every user:

{
  ID: 1,
  fname: "Jack",
  lname: "Wan",
  age: 20,
  country: {
    region: "East Asia",
    name: "China"
  }
}

And this will lead to faster queries, but updating the country info will cost a lot - significantly worse performance.

SQL Relational JOIN vs. NoSQL

In SQL, we can obtain related data in multiple tables by simply using a single SQL JOIN statement:

SELECT users.fname, users.lname, countries.name
FROM users
LEFT JOIN users.country_id ON countries.id;

But NoSQL has no equivalent of JOIN(It may not be necessarily true anymore for MongoDB has developed a $lookup operator), which becomes a reason why denormalization is often essential.

SQL Data Integrity vs. NoSQL

Most SQL databases allow you to enforce data integrity rules using foreign key constraints. And the also schema enforces these rules for the database to follow.

In NoSQL databases, there is no such data integrity options.

Performance

NoSQL is regularly quoted as being faster than SQL. NoSQL’s simpler denormalized store allows you to retrieve all information about a specific item in a single request. There’s no need for related JOINs or complex SQL queries.

That said, your project design and data requirements will have most impact. A well-designed SQL database will almost certainly perform better than a badly designed NoSQL equivalent and vice versa.

Scalability

As your data grows, you may find it necessary to distribute the load among multiple servers. This can be tricky for SQL-based systems. How do you allocate related data? Clustering is possibly the simplest option; multiple servers access the same central store — but even this has challenges.

NoSQL’s simpler data models can make the process easier, and many have been built with scaling functionality from the start.

· 数据库