Apache Cassandra is a highly scalable, high-performance distributed database designed to handle large amounts of data on many commodity servers, providing high availability without a single point of failure. It is a type of NoSQL database. Let’s first understand what a NoSQL database does.
A NoSQL database (sometimes read as "Not Only SQL") is a database that provides a mechanism for storing and retrieving data modeled in ways other than the tabular relations used in relational databases. These databases are typically schema-free, support easy replication, have a simple API, are eventually consistent, and can handle large amounts of data.
The primary objectives of a NoSQL database are:
- simplicity of design,
- horizontal scaling (scale-out), and
- finer control over availability.
NoSQL databases use different data structures than relational databases, which makes some operations faster in NoSQL. The suitability of a given NoSQL database depends on the problem it needs to solve.
NoSQL vs. Relational Database
The following table lists the points that differentiate a relational database from a NoSQL database.
| Relational Database | NoSQL Database |
| --- | --- |
| Supports a powerful query language. | Supports a very simple query language. |
| Has a fixed schema. | Has no fixed schema. |
| Follows ACID (atomicity, consistency, isolation, and durability). | Is only “eventually consistent.” |
| Supports transactions. | Does not support transactions. |
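To make the term “eventually consistent” more concrete, here is a minimal Python sketch. It is not how Cassandra or any particular NoSQL database is implemented; the `Replica` class and the `anti_entropy` function are hypothetical names invented for this illustration. Three in-memory replicas accept timestamped writes, one replica misses an update, and a reconciliation step brings all of them to the newest value (a "last write wins" rule is assumed).

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica that maps key -> (timestamp, value)."""
    name: str
    data: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Keep the incoming value only if it is newer than the one we hold.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(replicas):
    """Hypothetical repair step: push every replica's entries to all others."""
    for source in replicas:
        for key, (timestamp, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, timestamp)


replicas = [Replica("a"), Replica("b"), Replica("c")]

# The first write reaches every replica.
for r in replicas:
    r.write("user:1", "alice@old.example", timestamp=1)

# The second write reaches only two replicas; "c" is temporarily stale.
for r in replicas[:2]:
    r.write("user:1", "alice@new.example", timestamp=2)

stale_reads = [r.read("user:1") for r in replicas]

# After reconciliation, all replicas converge on the newest value.
anti_entropy(replicas)
converged_reads = [r.read("user:1") for r in replicas]
```

Before the repair step, a read from replica "c" returns the old value; afterwards every replica returns the same, newest value. That window of disagreement is what a relational database with ACID guarantees would not allow.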
In addition to Cassandra, we have the following NoSQL databases that are quite popular:
Apache HBase – HBase is a distributed, non-relational, open-source database modeled after Google’s BigTable and written in Java. It is developed as part of the Apache Hadoop project and runs on top of HDFS, providing BigTable-like capabilities for Hadoop.
MongoDB – MongoDB is a document-oriented, cross-platform database system that eschews the use of traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas that make integrating data into certain types of applications easier and faster.
What is Apache Cassandra?
Apache Cassandra is an open-source, distributed, and decentralized storage system (database) for managing very large amounts of structured data spread across the world. It provides a highly available service with no single point of failure.
Listed below are some of the notable points of Apache Cassandra:
- It is scalable, fault-tolerant, and consistent.
- It is a column-oriented database.
- Its distribution design is based on Amazon’s Dynamo and its data model on Google’s Bigtable.
- Created at Facebook, it differs sharply from relational database management systems.
- Cassandra implements a Dynamo-style replication model with no single point of failure, but adds a more efficient “column family” data model.
- Cassandra is used by some of the biggest companies, such as Facebook, Twitter, Cisco, Rackspace, eBay, Netflix, and more.
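The Dynamo-style replication mentioned above can be pictured as a ring of nodes, where each key is stored on several neighboring nodes so that no single node is indispensable. The Python sketch below is only an illustration of that idea under simplifying assumptions: the `Ring` class, the `token` helper, the node names, and the use of MD5 with one token per node are all choices made for this example, not Cassandra's actual partitioner or API.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is used only to spread keys)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Sorted list of (token, node) pairs; one token per node for simplicity.
        self.entries = sorted((token(n), n) for n in nodes)

    def replicas_for(self, key: str):
        """Return the nodes that store `key` by walking clockwise from its token."""
        tokens = [t for t, _ in self.entries]
        start = bisect.bisect_right(tokens, token(key)) % len(self.entries)
        count = min(self.replication_factor, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node-1", "node-2", "node-3", "node-4"], replication_factor=3)
owners = ring.replicas_for("user:42")

# If any one owner fails, the key is still held by the remaining replicas.
survivors = [n for n in owners if n != owners[0]]
```

Because every node applies the same rule to find a key's owners, no coordinator or master node is needed, which is the property behind "no single point of failure".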
Cassandra has become popular because of its strong technical characteristics. Below are some of the features of Cassandra:
Elastic scalability: Cassandra is highly scalable; it allows you to add more hardware to accommodate more clients and more data as requirements grow.
Always-on architecture: Cassandra has no single point of failure and is continuously available for business-critical applications that cannot afford a failure.
Fast linear-scale performance: Cassandra is linearly scalable, that is, its throughput increases as the number of nodes in the cluster increases. It therefore maintains a fast response time.
Flexible data storage: Cassandra accommodates structured, semi-structured, and unstructured data formats. It can dynamically accommodate changes to your data structures as your needs change.
Easy data distribution: Cassandra provides the flexibility to distribute data where you need it by replicating data across multiple data centers.
Transaction support: Cassandra provides atomic, isolated, and durable writes at the level of a single partition, with tunable consistency, rather than full relational-style ACID transactions.
Fast writes: Cassandra was designed to run on cheap commodity hardware. It performs very fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.
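The flexible data storage feature can be illustrated with a small conceptual sketch in Python. It models a "column family" as rows that each carry their own set of named columns, so a new column can appear on one row without changing the others. This is an assumption-laden picture of the idea only, not Cassandra's storage format or query language; the `users` structure and the `columns_of` helper are hypothetical.

```python
from collections import defaultdict

# A toy "column family": row key -> {column name -> value}.
# Rows in the same family need not share the same columns.
users = defaultdict(dict)

users["alice"]["email"] = "alice@example.com"
users["alice"]["city"] = "Lisbon"

# A later row adds a column that no earlier row has; nothing else has to change.
users["bob"]["email"] = "bob@example.com"
users["bob"]["last_login"] = "2024-01-01"


def columns_of(family, row_key):
    """Return a row's column names, sorted here only for readable output."""
    return sorted(family.get(row_key, {}))


alice_columns = columns_of(users, "alice")
bob_columns = columns_of(users, "bob")
```

In a fixed-schema relational table, adding `last_login` would mean altering the table for every row; in this sketch it is simply one more column on the row that needs it.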
History of Cassandra
- Cassandra was developed at Facebook for inbox search.
- It was open-sourced by Facebook in July
- Cassandra was accepted into the Apache Incubator in March
- It became a top-level Apache project in February 2010.