SQL vs. NoSQL

This blog can be treat as a brief summary of THIS BLOG.

THE DOOR OPENS ON A VIEW OF MOUNTAINS

Projects where SQL is ideal:

logical related discrete data requirements which can be identified up-front
data integrity is essential
standards-based proven technology with good developer experience and support.

Projects where NoSQL is ideal:

unrelated, indeterminate or evolving data requirements
simpler or looser project objectives, able to start coding immediately
speed and scalability is imperative.

BACKGROUND

SQL is a abbreviation of Structured Query Language, which usage exploded in the late 1990s with the rising popularity of web applications and open-source options such as MySQL, PostgreSQL and SQLite.

NoSQL databases have existed since the 1960s, but have been recently gaining traction with popular options such as MongoDB, CouchDB, Redis and Apache Cassandra.

SQL ACID vs. NoSQL CAP

ACID

A — Atomicity
The property which guarantees atomic operations, either a set of queries can complete as a whole or none does. This is the key feature for transactions.
C — Consistency
Data are available as soon as they are completely inserted or updated.
I — Isolation
Implies that, transactions are independent. Therefore data will not be negatively affected by two transactions happening on same set of data.
D — Durability
Committed data after a transaction or any other operation is never lost. Either they get inserted or failure is notified (Failed transactions).

CAP

C — Consistency
This demonstrates the guarantee on the execution of updates and the availability of the updates as soon as they are acknowledged to the updater. In simpler terms if a database is consistent, updates are available as soon as they are completed, which is not a guarantee in a distributed environment.
A — Availability
This demonstrates the property of a database where it is capable to serve a request. Most of the SQL databases drop queries if the load/execution times are greater. Availability is expected to be very high and response times are expected to be very low in NoSQL databases by elimination of transactional properties that are present in SQL databases.
P — Partition tolerance
The property of databases being able to function with failures among nodes due to network issues. For an example a database may contain several nodes (MongoDB nodes) that work together (By a mechanism such as Mapreduce). The property is preserved if the database as a whole can operate even one or more nodes are unreachable in a distributed environment.

SQL Tables vs. NoSQL Documents

SQL databases provide a store of related data tables, like this:

ID	fname	lname	age
1	Jack	Wan	20

while NoSQL database provide a store of JSON-like documents, like this:

{
  ID: 1,
  fname: "Jack",
  lname: "Wan",
  age: 20
}

Similar documents can be stored in a Collection in NoSQL, which is analogous to the tables of SQL. Furthermore, you can store any data you like in any document in NoSQL while you can only store certain types of data in SQL.

SQL Schema vs. NoSQL Schemaless

In an SQL database, you can’t add data until you define tables and field types in what’s referred to as a schema. The schema optionally contains other information, such as —

primary keys — unique identifiers which apply to a single record
indexes — commonly queried fields indexed to aid quick searching
relationships — logical links between data fields
functionality such as triggers and stored procedures.

But in a NoSQL database, data can be added anywhere, at any time. There’s no need to specify a document design or even a collection up-front, in MongoDB, you can do something like this:

db.users.insert (
  ID: 1,
  fname: "Jack",
  lname: "Wan",
  age: 20
);
// MongoDB will automatically add a unique _id value to each document in a collection.

SQL Normalization vs. NoSQL Denormalization

In SQL, if we want to add country information to our users, we can add a foreign key country_id to our users table. This minimizes data redundancy; we’re not repeating the country information for every user — only the reference to it. This technique is known as normalization, and has practical benefits. We can update a single country without changing user data.

In NoSQL, we can also do something like this:

{
  ID: 1,
  fname: "Jack",
  lname: "Wan",
  age: 20,
  country_id: "EA001"
}

However, this is not always practical, whose reasons will be explained later. We may opt to denormalize our documents and repeat country information for every user:

{
  ID: 1,
  fname: "Jack",
  lname: "Wan",
  age: 20,
  country: {
    region: "East Asia",
    name: "China"
  }
}

And this will lead to faster queries, but updating the country info will cost a lot - significantly worse performance.

SQL Relational JOIN vs. NoSQL

In SQL, we can obtain related data in multiple tables by simply using a single SQL JOIN statement:

SELECT users.fname, users.lname, countries.name
FROM users
LEFT JOIN users.country_id ON countries.id;

But NoSQL has no equivalent of JOIN(It may not be necessarily true anymore for MongoDB has developed a $lookup operator), which becomes a reason why denormalization is often essential.

SQL Data Integrity vs. NoSQL

Most SQL databases allow you to enforce data integrity rules using foreign key constraints. And the also schema enforces these rules for the database to follow.

In NoSQL databases, there is no such data integrity options.

Performance

NoSQL is regularly quoted as being faster than SQL. NoSQL’s simpler denormalized store allows you to retrieve all information about a specific item in a single request. There’s no need for related JOINs or complex SQL queries.

That said, your project design and data requirements will have most impact. A well-designed SQL database will almost certainly perform better than a badly designed NoSQL equivalent and vice versa.

Scalability

As your data grows, you may find it necessary to distribute the load among multiple servers. This can be tricky for SQL-based systems. How do you allocate related data? Clustering is possibly the simplest option; multiple servers access the same central store — but even this has challenges.

NoSQL’s simpler data models can make the process easier, and many have been built with scaling functionality from the start.

November 2, 2017 · 数据库