Apache Cassandra is an open-source NoSQL database that offers fault tolerance, linear scalability, and tunable consistency across multiple nodes. Its distributed architecture manages massive amounts of data using Dynamo-style replication. Because replicas are kept on several nodes in a cluster, you get high availability with no single point of failure. Cassandra serves enterprises in web development, software engineering, and data analysis that need dependable performance on real-world workloads. Avinash Lakshman and Prashant Malik created Cassandra at Facebook, which released it as open source in 2008. If you’re wondering how to install and configure Apache Cassandra on Linux, you’re just a step away from learning the process.
This tutorial will show you how to install and configure Apache Cassandra on your Ubuntu machine. Find other tutorials here: How to Install Gradle on Ubuntu, How to Install and Configure Microsoft Teams on Ubuntu, How to Install PostgreSQL on Ubuntu, and How to Install and Configure Prometheus for Monitoring on a Linux Server.
Benefits of Installing Apache Cassandra
- Open Source: Because Cassandra is open source, businesses can download and use it for free. The benefits do not end there: the Cassandra community brings together people from many specialties to discuss various elements of this massive open-source project. You can also use Cassandra in conjunction with other Apache projects.
- Fault Tolerance: Apache Cassandra stores each piece of data in several locations. Even if one server fails or is compromised, the data can still be read from another replica. It is entirely up to you to determine how many replicas to keep. Cassandra’s backup and recovery capabilities rely on these replicas.
- Highly Available: High availability results from replicating data across multiple locations and data centers. The architecture is peer-to-peer and masterless, which means that every node can serve reads and writes. This also allows data to be replicated quickly between data centers and continents.
- Multi-Data Center and Hybrid Cloud Support: Cassandra is a distributed system that can be deployed as a large number of nodes across many data centers, including hybrid cloud setups.
Requirements for Installing Apache Cassandra on Linux
Before installing and configuring Apache Cassandra on Linux, ensure you meet the following requirements: a Linux server (this tutorial uses Ubuntu), a Java Development Kit (covered in Step 2), and a user account with administrative privileges.
On Ubuntu, you can run a command with administrative privileges by prefixing it with sudo.
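Step 1: Update Your Computer
Before installing anything, refresh the package index and upgrade the packages that are already installed:
sudo apt update && sudo apt upgrade -y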
Step 2: Installing Java on Ubuntu
The first step in installing Apache Cassandra is checking whether Java is installed. More specifically, Cassandra requires OpenJDK to run well, and Cassandra 4.0 supports Java 8 and Java 11. If you install a different version, you are more likely to run into configuration issues. You can check whether Java is already installed with the command java --version:
root@ubuntu:/home/rdgmh# java --version
openjdk 11.0.14 2022-01-18
OpenJDK Runtime Environment (build 11.0.14+9-Ubuntu-0ubuntu2.20.04)
OpenJDK 64-Bit Server VM (build 11.0.14+9-Ubuntu-0ubuntu2.20.04, mixed mode, sharing)
If Java is not installed, or you need a different version such as OpenJDK 8, you can install it with the command below:
$ sudo apt install openjdk-8-jdk -y
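Because the Java version matters, it can help to check the major version in a script before continuing. The snippet below is a minimal sketch: java_major_version is a hypothetical helper name (it is not part of Java or Cassandra) that reads the first line of the java -version output and prints the major version, handling both the older 1.8.0 numbering and the newer 11.0.14 numbering.

```shell
#!/usr/bin/env bash
# java_major_version is a hypothetical helper, not a standard tool.
# It reads the first line of `java -version` output on stdin and prints
# the major version (8 for "1.8.0_312", 11 for "11.0.14").
java_major_version() {
  local first_line version
  IFS= read -r first_line
  # Take the first dotted number on the line, e.g. 1.8.0 or 11.0.14.
  version=$(printf '%s\n' "$first_line" | grep -oE '[0-9]+(\.[0-9]+)*' | head -n 1)
  # Legacy scheme: drop the leading "1." so 1.8.0 becomes 8.0.
  case "$version" in
    1.*) version=${version#1.} ;;
  esac
  printf '%s\n' "${version%%.*}"
}

# Only query Java if it is actually installed.
if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr, so redirect it into the pipe.
  echo "Detected Java major version: $(java -version 2>&1 | java_major_version)"
else
  echo "Java is not installed yet."
fi
```

With the Java output shown earlier in this step, the helper would report 11, which is one of the versions Cassandra 4.0 supports.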
Step 3: Install Apache Cassandra on Linux
To install the latest version of Apache Cassandra on a Linux server, you need to add the Apache Cassandra repository. To allow access to repositories over the HTTPS protocol, first install the apt-transport-https package:
sudo apt install apt-transport-https
root@ubuntu:/home/rdgmh# sudo apt install apt-transport-https
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  apt-transport-https
0 upgraded, 1 newly installed, 0 to remove and 257 not upgraded.
Need to get 4,680 B of archives.
After this operation, 162 kB of additional disk space will be used.
Get:1 http://us.archive.ubuntu.com/ubuntu focal-updates/universe amd64 apt-transport-https all 2.0.6 [4,680 B]
Fetched 4,680 B in 1s (3,197 B/s)
Selecting previously unselected package apt-transport-https.
(Reading database ... 184096 files and directories currently installed.)
Preparing to unpack .../apt-transport-https_2.0.6_all.deb ...
Unpacking apt-transport-https (2.0.6) ...
Setting up apt-transport-https (2.0.6) ...
Next, use the wget command to import the repository’s GPG key:
wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
Add Apache Cassandra’s repository to the system’s sources list file as shown with the command below:
$ echo "deb http://www.apache.org/dist/cassandra/debian 40x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
deb http://www.apache.org/dist/cassandra/debian 40x main
Ensure that the package list is updated:
sudo apt update
Then, install Apache Cassandra using the command:
sudo apt install cassandra
To verify that the Cassandra service is running, check its status:
sudo systemctl status cassandra.service
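The systemctl output shows whether the service process is running. To confirm that the node itself is up, you can also look at nodetool status, a utility that ships with Cassandra. The sketch below counts the nodes whose status line starts with UN (Up and Normal). The name count_up_normal is a hypothetical helper introduced here for illustration.

```shell
#!/usr/bin/env bash
# count_up_normal is a hypothetical helper, not part of Cassandra.
# It reads `nodetool status` output on stdin and prints how many node
# lines start with "UN" (Up / Normal).
count_up_normal() {
  # grep -c prints 0 but exits non-zero when nothing matches;
  # "|| true" keeps the function's exit status clean in that case.
  grep -c '^UN[[:space:]]' || true
}

# Only call nodetool if it is available on this machine.
if command -v nodetool >/dev/null 2>&1; then
  up=$(nodetool status 2>/dev/null | count_up_normal)
  echo "Nodes up and normal: $up"
else
  echo "nodetool not found; is Cassandra installed?"
fi
```

On a fresh single-node installation you would expect the count to be 1 once Cassandra has finished starting.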
Step 4: Configure Apache Cassandra
If you are using Cassandra in a cluster, you can customize the main settings using the cassandra.yaml file. However, it is advisable to create a backup of your cassandra.yaml file if you intend to edit it. To do so, use this command:
sudo cp /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra.yaml.backup
We used the /etc/cassandra directory as the destination for the backup, but you can change the path as you see fit.
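If you expect to edit the configuration more than once, a timestamped backup avoids overwriting the previous copy. The following is a minimal sketch under that assumption. The name backup_file is a hypothetical helper, and the demonstration runs on a scratch file rather than on the real configuration.

```shell
#!/usr/bin/env bash
# backup_file is a hypothetical helper: it copies a file to
# "<file>.backup-<timestamp>" and prints the path of the copy.
backup_file() {
  local src=$1
  local dest="${src}.backup-$(date +%Y%m%d-%H%M%S)"
  cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Demonstration on a scratch file. On a real node the argument would be
# /etc/cassandra/cassandra.yaml and the copy would need root privileges.
demo_file=$(mktemp)
printf 'cluster_name: Test Cluster\n' > "$demo_file"
backup_path=$(backup_file "$demo_file")
echo "Backup written to: $backup_path"
```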
Rename Apache Cassandra Cluster
The next step is renaming the Cassandra cluster.
Use a text editor of your choice to open the cassandra.yaml file:
sudo vim /etc/cassandra/cassandra.yaml
Afterward, locate the line that begins with cluster_name. The default name is Test Cluster. Replace it with a name of your choice; renaming the cluster is one of the first things you should do when you begin working with Cassandra. Save the file and exit if you don’t want to make any more modifications.
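The same change can be made without opening an editor. The sketch below is an illustration only: it builds a small scratch file standing in for /etc/cassandra/cassandra.yaml (assuming the real file has a top-level cluster_name line like this one) and rewrites that line with sed. The new name matches the one used later in this tutorial.

```shell
#!/usr/bin/env bash
# Scratch file standing in for /etc/cassandra/cassandra.yaml.
yaml_file=$(mktemp)
cat > "$yaml_file" <<'EOF'
# The name of the cluster.
cluster_name: 'Test Cluster'
num_tokens: 16
EOF

# Assumes the name contains no '/' or '&' characters, which sed would
# treat specially in the replacement text.
new_name='Apache Cassandra Cluster'

# Replace the whole cluster_name line. Writing to a temp file and moving
# it into place avoids relying on sed's non-portable -i option.
tmp_file=$(mktemp)
sed "s/^cluster_name:.*/cluster_name: '${new_name}'/" "$yaml_file" > "$tmp_file"
mv "$tmp_file" "$yaml_file"

# Show the edited line.
grep '^cluster_name:' "$yaml_file"
```

Note that a node that has already been started once also records the cluster name in its own system tables, so the service may refuse to start after a rename until that stored value is updated as well. Renaming before the first start avoids the problem.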
Add IP Addresses of Cassandra Nodes
If you’re operating a cluster, another item you should add to cassandra.yaml is the IP address of each node. Open the configuration file and look for the seeds item under the seed_provider section. Its value is a comma-delimited list of addresses.
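As an illustration of what the seeds entry looks like and how it can be changed, the sketch below recreates the seed_provider section in a scratch file and replaces the default address with two node addresses. The IP addresses are hypothetical placeholders, and the layout of the section is assumed to match the default cassandra.yaml shipped with Cassandra 4.0.

```shell
#!/usr/bin/env bash
# Scratch file containing only the seed_provider section.
yaml_file=$(mktemp)
cat > "$yaml_file" <<'EOF'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1:7000"
EOF

# Hypothetical node addresses; replace them with the IPs of your own nodes.
seeds='192.168.1.10:7000,192.168.1.11:7000'

# Keep the indentation and the "- seeds:" key, replace only the value.
tmp_file=$(mktemp)
sed -E "s/^([[:space:]]*- seeds:).*/\1 \"${seeds}\"/" "$yaml_file" > "$tmp_file"
mv "$tmp_file" "$yaml_file"

# Show the edited line.
grep 'seeds:' "$yaml_file"
```

After changing the real file, Cassandra has to be restarted for the new seed list to take effect.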
Step 5: Cassandra Command-Line Shell
Cassandra includes a command-line shell, cqlsh, which communicates with the database using the Cassandra Query Language (CQL). Open the terminal and type cqlsh to begin a new shell session. The result is shown below:
root@ubuntu:/home/rdgmh# cqlsh
Connected to Apache Cassandra Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 4.0 | CQL spec 3.4.5 | Native protocol v4]
Use HELP for help.
cqlsh>
A shell prompt appears, displaying the default cluster connection. If you modified the cluster_name option, it displays the name you specified in the configuration file; in our case, the cluster was renamed Apache Cassandra Cluster. The connection to localhost shown above is the default.
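cqlsh can also run statements from a file instead of an interactive session, which is handy for checking the cluster name from a script. The sketch below writes a one-line CQL query to a scratch file and passes it to cqlsh with the -f option; it assumes Cassandra is listening on the default localhost address shown above.

```shell
#!/usr/bin/env bash
# Write a small CQL script to a scratch file.
cql_file=$(mktemp)
cat > "$cql_file" <<'EOF'
-- Show the cluster name and Cassandra version of the local node.
SELECT cluster_name, release_version FROM system.local;
EOF

# Run the script only if cqlsh is available on this machine.
if command -v cqlsh >/dev/null 2>&1; then
  cqlsh -f "$cql_file" || echo "cqlsh could not connect; is Cassandra running?"
else
  echo "cqlsh not found; install Cassandra first."
fi
```

If the rename from Step 4 took effect, the cluster_name column should show the name you configured.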
Congratulations! You’ve successfully installed and configured Apache Cassandra on Linux.
You should have a functional Cassandra installation on your Ubuntu machine if you follow these simple instructions. In addition, we demonstrated how to change the most essential settings in the Cassandra configuration file. Make a backup of the configuration file just in case, and you’re ready to use the Cassandra database software. Remember, knowing how to install and configure Apache Cassandra on Linux is just the beginning. Familiarize yourself with how it works.