Parallel Computing
Parallel computing partitions a task into identical subtasks each of
which has its own processing power (cpu) but shared memory address space
For example, a code that
computes the sum of the first N integers is very easy to program in parallel.
Each processor will be responsible for computing a fraction of the global sum.
However, at the end of the execution, all processors have to communicate their
local sums to a single processor for example that will add them all up to
obtain the global sum.
Distributed Computing
A distributed system consists of multiple autonomous computers that
communicate through a computer network. These computers interact with each other in
order to achieve a common goal and they have their own local memory.
Distributed computing can
scale better than parallel computing because you can add more nodes if you run
short of memory. Hadoop, ElasticSearch and most NoSQL databases are some
examples of open-source projects which leverage concept of distributed
computing.
Cluster
Cluster is set computers they mostly work on same task and can be viewed
or treated as single system. They are mostly used for load balancing, high
availability and high computing needs.
Node
Each system in a cluster is known as Node. In
other words a cluster is formed with set of Nodes. Each node has its own
operating system, memory etc.
High Availability
High Availability in a cluster means atleast
one node is available for the user to access the cluster. For example if one
node fails, the data in that node has to be replicated to other nodes in short time
so that the user can access that. HA clusters usually use a heartbeat private
network connection which is used to monitor the health and status of each node
in the cluster. Load balancing servers are also one example of high
availability clusters.
Vertically Scalable
Scaling vertically (or scale up) means to add resources to a single node in a
system, typically involving the addition of CPUs or memory to a single computer
when performance degrades. Most of the relational databases are
vertically scalable.
Horizontally Scalable
Horizontal scalability is the capability of a
system (cluster) to accept adding or removing multiple nodes (independent units
of resources) and making them work as a single system. To scale
horizontally (or scale out) usually we add
more nodes to a system. For example we can increase the number of nodes in a
load balancing cluster from two to three.
Single point of failure
In a cluster
if single node fails, the entire cluster will stop working. This kind of
architecture is called single point of failure. High availability clusters
should be architectured in a way to avoid single point of failure.
shared nothing architecture
In shared nothing architecture, each node is
independent and self-sufficient, and there is no single point of failure across the system. None of
the nodes in the cluster shares memory or disk storage.
Sharding
A shard is a horizontal partition of a database
table or data. In most of the NoSQL systems a single table will be split into
multiple shards. These shards will be shared across the nodes in the cluster.
Replication
It is the process of creating redundant copies
of shards to achieve partition tolerance. Usually two copies of same shard will
not be stored in a single node.
Fault Tolerance
Fault-tolerant describes a cluster designed so
that, in the event that a node fails, a backup component or procedure can immediately
take its place with no loss of service.
Elasticity
Elasticity is the ability to add new hardware
to a cluster without any interruption or downtime. It should not require
reconfiguration and supports incremental addition or removal of hardware.