Apache Cassandra is an open-source NoSQL database engine that offers fault tolerance, linear scalability, and tunable consistency across multiple nodes. Its distributed architecture manages massive amounts of data using Dynamo-style replication. Because replicas are kept on several nodes in a cluster, Cassandra achieves high availability with no single point of failure. Enterprises in web development, software engineering, and data analysis rely on it for its efficient performance under real-world workloads. Cassandra was created at Facebook by Avinash Lakshman and Prashant Malik and released as open source in 2008. This tutorial will show you how to install and configure Apache Cassandra on your Ubuntu machine. Other tutorials can be found here: How to Install Gradle on Ubuntu, how to install and configure Microsoft Teams on Ubuntu, how to install PostgreSQL on Ubuntu, and how to install and configure Prometheus for Monitoring on a Linux Server.
Benefits of Cassandra
- Open Source: Because it is open source, businesses can use it for free, and you can download it without worrying about your wallet. The benefits don't end there: the Cassandra community brings together specialists to discuss various aspects of this massive open-source project. It can also be used in conjunction with other Apache projects.
- Fault Tolerance: Apache Cassandra stores each piece of data in several locations. Even if one server fails or is compromised, the data can still be read from another replica. It is entirely up to you how many replicas to keep, and Cassandra's backup and recovery capabilities work alongside them.
- Highly Available: High availability is achieved by replicating data across multiple locations and data centers. The architecture is peer-to-peer and masterless, which means that every node can serve reads and writes. Because of this, data can be replicated between data centers and even continents.
- Multi-Data Center and Hybrid Cloud Support: Cassandra is a distributed system that allows large numbers of nodes to be deployed across many data centers, including hybrid cloud setups.
Prerequisites
A computer running Ubuntu and a user account with administrative privileges. Administrative commands in this tutorial are run by prefixing them with sudo.
Step 1: Update Your Computer
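Updating an Ubuntu machine usually means refreshing the package index and then applying pending upgrades. The sketch below is one way to do that; the run helper is a hypothetical dry-run wrapper added for illustration, so the commands are only printed unless RUN=1 is set.

```shell
# run: hypothetical dry-run wrapper. It prints each command and only
# executes it when RUN=1 is set in the environment.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

run sudo apt update        # refresh the package index
run sudo apt upgrade -y    # apply pending upgrades without prompting
```

To actually perform the update, save the snippet as a script and start it with RUN=1, or simply type the two apt commands directly.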
Step 2: Installing Java on Ubuntu
Apache Cassandra requires Java, so the first step is to check whether it is installed. More specifically, Cassandra is intended to run on OpenJDK, and installing a different or unsupported Java version makes configuration issues more likely. To check whether Java is already installed, run java --version (on Java 8, use java -version instead):
root@ubuntu:/home/rdgmh# java --version
openjdk 11.0.14 2022-01-18
OpenJDK Runtime Environment (build 11.0.14+9-Ubuntu-0ubuntu2.20.04)
OpenJDK 64-Bit Server VM (build 11.0.14+9-Ubuntu-0ubuntu2.20.04, mixed mode, sharing)
If Java is missing, or you need a specific version such as OpenJDK 8, you can install it with the command below:
$ sudo apt install openjdk-8-jdk -y
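The Java check can also be scripted. The sketch below reports the major Java version found on the machine, which is handy because the Cassandra 4.0 release installed in the next step is generally run on Java 8 or Java 11. The java_major helper is a hypothetical name introduced here for illustration.

```shell
# java_major: hypothetical helper mapping a version string to its major number.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;   # legacy scheme: 1.8.0_312 -> 8
        *)   echo "$1" | cut -d. -f1 ;;   # modern scheme: 11.0.14 -> 11
    esac
}

if command -v java >/dev/null 2>&1; then
    # `java -version` writes to stderr; the version is the quoted field on line 1.
    version=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
    echo "Java major version: $(java_major "$version")"
else
    echo "Java not found; install OpenJDK first."
fi
```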
Step 3: Install Apache Cassandra on Ubuntu
To allow access to repositories over HTTPS, first install the apt-transport-https package:
sudo apt install apt-transport-https
root@ubuntu:/home/rdgmh# sudo apt install apt-transport-https
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  apt-transport-https
0 upgraded, 1 newly installed, 0 to remove and 257 not upgraded.
Need to get 4,680 B of archives.
After this operation, 162 kB of additional disk space will be used.
Get:1 http://us.archive.ubuntu.com/ubuntu focal-updates/universe amd64 apt-transport-https all 2.0.6 [4,680 B]
Fetched 4,680 B in 1s (3,197 B/s)
Selecting previously unselected package apt-transport-https.
(Reading database ... 184096 files and directories currently installed.)
Preparing to unpack .../apt-transport-https_2.0.6_all.deb ...
Unpacking apt-transport-https (2.0.6) ...
Setting up apt-transport-https (2.0.6) ...
Next, use wget to import the repository's GPG key with the command below:
wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
Add Apache Cassandra’s repository to the system’s sources list file as shown with the command below:
$ echo "deb http://www.apache.org/dist/cassandra/debian 40x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
deb http://www.apache.org/dist/cassandra/debian 40x main
Ensure that the package list is updated:
sudo apt update
Install Apache Cassandra using the command:
sudo apt install cassandra
To verify that the Cassandra service is running, check its status:
sudo systemctl status cassandra.service
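For a quick yes/no answer instead of the full status output, systemctl is-active can be used. The sketch below wraps it in a small function; cassandra_state is a hypothetical helper name, and the unit name cassandra is assumed to match the package installed above.

```shell
# cassandra_state: hypothetical helper reporting whether the unit is active.
cassandra_state() {
    if command -v systemctl >/dev/null 2>&1 &&
       systemctl is-active --quiet cassandra 2>/dev/null; then
        echo "running"
    else
        echo "not running"
    fi
}

echo "Cassandra service is $(cassandra_state)."

# Once the service is up, nodetool gives a per-node view (UN = Up/Normal):
# nodetool status
```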
Step 4: Further Configuration of Apache Cassandra
If you are using Cassandra in a cluster, you can customize the main settings in the cassandra.yaml file. It is advisable to create a backup of your cassandra.yaml file before you edit it. To do so, use this command:
sudo cp /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra.yaml.backup
We used the /etc/cassandra directory as the destination for the backup, but you can change the path as you see fit.
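If you expect to edit the file more than once, a timestamped backup name avoids overwriting an earlier copy. The following is a minimal sketch of that idea; backup_config is a hypothetical helper, and the naming scheme is only an assumption.

```shell
# backup_config: hypothetical helper that copies a file to a timestamped
# backup next to it and prints the path of the new copy.
backup_config() {
    local src=$1
    local dest
    dest="${src}.$(date +%Y%m%d-%H%M%S).backup"
    cp -p "$src" "$dest" && echo "$dest"
}

# Example call (run the script with sudo for files under /etc/cassandra):
# backup_config /etc/cassandra/cassandra.yaml
```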
Rename Apache Cassandra Cluster
Use a text editor of your choice to open the cassandra.yaml file:
sudo vim /etc/cassandra/cassandra.yaml
Locate the line that begins with cluster_name. The default name is Test Cluster. Changing it to a name of your own is one of the first things you should do when you begin working with Cassandra. Save the file and exit if you don't want to make any more modifications.
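The same change can be made without opening an editor. The sketch below uses sed on a scratch file so it is safe to try; set_cluster_name is a hypothetical helper, and it assumes the new name contains no "/", "&", or single-quote characters.

```shell
# set_cluster_name: hypothetical helper that rewrites the cluster_name line
# of a cassandra.yaml-style file in place.
set_cluster_name() {
    local file=$1 name=$2
    sed -i "s/^cluster_name:.*/cluster_name: '$name'/" "$file"
}

# Demonstrate on a scratch file rather than the live configuration.
demo=$(mktemp)
echo "cluster_name: 'Test Cluster'" > "$demo"
set_cluster_name "$demo" "Apache Cassandra Cluster"
cat "$demo"
```

To apply it to the real file, point the helper at /etc/cassandra/cassandra.yaml from a script run with sudo. On a node that has already started once under the old name, the name saved in Cassandra's system tables generally has to be updated as well before the service will restart cleanly, so it is worth checking the Cassandra documentation first.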
Add IP Addresses of Cassandra Nodes
If you're operating a cluster, another item you should add to the cassandra.yaml file is the IP address of each node. Open the configuration file and look for the seeds item under the seed_provider section, where the addresses are given as a comma-separated list.
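As a rough illustration, the sketch below recreates the seed_provider section on a scratch file and replaces the seeds value with two node addresses. The layout is based on the default Cassandra 4.0 configuration file, set_seeds is a hypothetical helper, and the 192.0.2.x addresses are placeholders from the documentation range, not real nodes.

```shell
# set_seeds: hypothetical helper that replaces the seeds list in a
# cassandra.yaml-style file while keeping the line's indentation.
set_seeds() {
    local file=$1 seeds=$2
    sed -i "s/^\( *- seeds:\).*/\1 \"$seeds\"/" "$file"
}

# Scratch copy of the section, so the live configuration is not touched.
sample=$(mktemp)
cat > "$sample" <<'EOF'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1:7000"
EOF

# Placeholder addresses; substitute the IPs of your own seed nodes.
set_seeds "$sample" "192.0.2.10:7000,192.0.2.11:7000"
cat "$sample"
```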
Step 5: Cassandra Command-Line Shell
Cassandra includes a command-line shell, cqlsh, which communicates with the database using the Cassandra Query Language (CQL). Open the terminal and type cqlsh to begin a new shell. The result is as shown below:
root@ubuntu:/home/rdgmh# cqlsh
Connected to Apache Cassandra Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 4.0 | CQL spec 3.4.5 | Native protocol v4]
Use HELP for help.
cqlsh>
The shell prompt appears, displaying the cluster it is connected to. If you modified the cluster_name option, it displays the name you specified in the configuration file; in our case, the cluster was renamed to Apache Cassandra Cluster. The connection to localhost shown above is the default.
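The cluster name can also be confirmed non-interactively. The sketch below runs a single CQL statement through cqlsh and exits; it assumes Cassandra is listening on the default 127.0.0.1:9042 shown above.

```shell
# Query used for the check; system.local holds this node's own metadata.
QUERY="SELECT cluster_name, release_version FROM system.local;"

if command -v cqlsh >/dev/null 2>&1; then
    # -e runs one statement and exits instead of opening the interactive shell.
    cqlsh -e "$QUERY" 127.0.0.1 9042 || echo "Could not connect to Cassandra."
else
    echo "cqlsh not found; is Cassandra installed?"
fi
```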
If you followed these instructions, you should now have a functional Cassandra installation on your Ubuntu machine. In addition, we demonstrated how to change the most essential settings in the Cassandra configuration file. Make a backup of the configuration file before editing it, and you're ready to use the Cassandra database software.