TechDirectArchive

How to Install Hadoop on Linux

Posted on 30/03/2024 (updated 09/04/2024) by Raphael Gab-Momoh

In this guide, we discuss how to install Hadoop on Linux. The Apache Hadoop software library is a framework for the distributed processing of large data sets across clusters of computers using simple programming models. Please see how to fix MySQL Workbench could not connect to MySQL server, fix “WARNING: The provided hosts list is empty only the localhost is available and note that the implicit localhost does not match all“, and how to perform SSH key-based authentication in Linux.

It is designed to scale from a single server to thousands of machines, each of which provides local computing and storage. Rather than relying on hardware to deliver high availability, the library itself detects and handles failures at the application layer, so it can provide a highly available service on top of a cluster of computers, each of which may be prone to failure.

Also, see how to Associate SSH Public key with Azure Linux VM, and how to install Java Runtime Environment on Mac OS.

Prerequisites for installing Hadoop on Linux

  • Ubuntu 18.04 or higher
  • Access to a command-line tool
  • Sudo or root privileges on local/remote machines

Step 1: Install OpenJDK on Ubuntu

Hadoop's services are written in Java, so a compatible Java Runtime Environment (JRE) and Java Development Kit (JDK) are required. Before beginning a new installation, update your package index with the following command:

sudo apt update

Apache Hadoop 3.x fully supports Java 8. The Ubuntu OpenJDK 8 package includes both the runtime environment and the development kit.

To install OpenJDK 8, enter the following command in your terminal:

sudo apt install openjdk-8-jdk -y

The Java version in use (OpenJDK or Oracle Java) can affect how the components of a Hadoop ecosystem interact. Check the current Java version when the installation is finished:

java -version; javac -version

The output shows which Java version is in use.
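If you want to check the version from a script rather than by eye, the following is a minimal sketch. The helper name java_major is hypothetical (it is not part of Java or Hadoop), and it assumes the version string uses either the legacy 1.x scheme (1.8.0_312) or the newer scheme (11.0.15).

```shell
# Hypothetical helper: reduce a Java version string to its major version.
# Handles the legacy "1.8.0_312" scheme and the newer "11.0.15" scheme.
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;          # drop the legacy "1." prefix
  esac
  printf '%s\n' "${v%%[._-]*}"   # keep everything before the first ., _ or -
}

# Only query the JVM if java is on the PATH.
if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; the version is the first quoted string.
  raw="$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}')"
  echo "Detected Java major version: $(java_major "$raw")"
else
  echo "java not found on PATH"
fi
```

For the OpenJDK 8 package installed above, the reported major version should be 8.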


Step 2: Create a Non-Root User in the Hadoop Environment

It is preferable to create a dedicated non-root user for the Hadoop environment. A separate user makes the cluster easier to manage and improves security.

For Hadoop services to run without interruption, this user must be able to open a passwordless SSH connection to localhost.

Install OpenSSH on Ubuntu

Install the OpenSSH server and client using the following command:

sudo apt install openssh-server openssh-client -y

Open the SSH server configuration file with the command:

sudo nano /etc/ssh/sshd_config

Optionally, you can change the port that the SSH server listens on:

  • Change the port number on the Port line to the value of your choice. Make sure there is no “#” at the beginning of the line.
  • Save the changes and exit the editor.
  • For the changes to take effect, restart the SSH service with this command:
sudo systemctl restart ssh

If you change the port, pass it with the -p option whenever you connect, for example ssh -p <port> localhost.
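To confirm which port the edited file actually sets, a small script can read the uncommented Port lines. This is only a sketch: sshd_ports is a hypothetical helper name, it ignores Include and Match blocks, and the demonstration runs on a scratch file rather than the real /etc/ssh/sshd_config.

```shell
# Hypothetical helper: list the ports set by uncommented Port lines in an
# sshd_config-style file. Prints 22 (the OpenSSH default) if none are set.
sshd_ports() {
  ports="$(awk 'tolower($1) == "port" {print $2}' "$1")"
  printf '%s\n' "${ports:-22}"
}

# Demonstrate on a scratch file so the real configuration is not touched.
demo="$(mktemp)"
printf '#Port 22\nPort 2222\n' > "$demo"
echo "Configured port(s): $(sshd_ports "$demo")"
rm -f "$demo"
```

On a real system you would call sshd_ports /etc/ssh/sshd_config. Running sudo sshd -t before restarting the service checks the file for syntax errors.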

Create Hadoop User

To create a new user for Hadoop, use the adduser command:

sudo adduser hadoop

In this example, the username is hadoop. You can use any username and password you like. Switch to the newly created user and enter the associated password when prompted:

su - hadoop
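If the switch fails, it can help to confirm that the account was actually created. The sketch below uses the standard id command; the helper name user_exists is hypothetical, and hadoop is the username chosen in this guide.

```shell
# Hypothetical helper: succeed if a local account with this name exists.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}

if user_exists hadoop; then
  echo "user hadoop exists"
else
  echo "user hadoop not found; create it with: sudo adduser hadoop"
fi
```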

Next, the user must be able to connect to localhost over SSH without being prompted for a password.

Enable Passwordless SSH for Hadoop User

Create an SSH key pair and specify where it should be kept with the command:

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

To append the public key to the authorized_keys file in the .ssh directory, use the cat command:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Restrict the file so that only your user can read and write it with the chmod command:

chmod 0600 ~/.ssh/authorized_keys

Note: The new user no longer needs to enter a password each time it connects over SSH.
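With its default StrictModes setting, sshd can ignore a key file whose permissions are too open, so it is worth checking the modes if passwordless login does not work. The following sketch assumes GNU stat, as shipped with Ubuntu; has_mode is a hypothetical helper, and the demonstration uses a scratch directory in place of the real ~/.ssh.

```shell
# Hypothetical helper: succeed only if FILE has exactly the octal mode MODE.
# Relies on GNU stat (-c '%a').
has_mode() {
  [ "$(stat -c '%a' "$1")" = "$2" ]
}

# Demonstrate on a scratch directory rather than the real ~/.ssh.
scratch="$(mktemp -d)"
mkdir "$scratch/.ssh"
: > "$scratch/.ssh/authorized_keys"
chmod 700 "$scratch/.ssh"
chmod 600 "$scratch/.ssh/authorized_keys"

has_mode "$scratch/.ssh" 700 && echo ".ssh directory: ok"
has_mode "$scratch/.ssh/authorized_keys" 600 && echo "authorized_keys: ok"
rm -rf "$scratch"
```

On the real account, the equivalent checks are has_mode ~/.ssh 700 and has_mode ~/.ssh/authorized_keys 600.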

Use the hadoop user to SSH to localhost to ensure everything is configured properly.

ssh localhost

On the first connection, SSH asks you to confirm the host key fingerprint. After you accept it, the hadoop user can connect to localhost without entering a password.

Step 3: Download and Install Hadoop on Ubuntu

Download the Hadoop binary release from the official Apache website with wget:

wget https://downloads.apache.org/hadoop/common/hadoop-3.3.3/hadoop-3.3.3.tar.gz

If this release is no longer listed on the download site, older releases are kept at https://archive.apache.org/dist/hadoop/common/.
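Apache publishes a .sha512 file next to each release archive, so you can check the download before extracting it. The sketch below only compares a file's digest with an expected value; verify_sha512 is a hypothetical helper, and copying the expected digest out of the published .sha512 file is left to you. The demonstration uses a scratch file, not the real archive.

```shell
# Hypothetical helper: succeed if FILE's SHA-512 digest equals EXPECTED.
# EXPECTED must be lowercase hex, which is what sha512sum prints.
verify_sha512() {
  actual="$(sha512sum "$1" | awk '{print $1}')"
  [ "$actual" = "$2" ]
}

# Demonstrate on a scratch file; with the real archive, EXPECTED would be
# the digest taken from the published .sha512 file.
sample="$(mktemp)"
printf 'sample data' > "$sample"
expected="$(sha512sum "$sample" | awk '{print $1}')"

if verify_sha512 "$sample" "$expected"; then
  echo "checksum matches"
else
  echo "checksum mismatch: do not use this file"
fi
rm -f "$sample"
```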

Once the download is complete, extract the files to initiate the Hadoop installation:

tar xzf hadoop-3.3.3.tar.gz

The Hadoop binary files are now located in the hadoop-3.3.3 directory.

Step 4: Single Node Hadoop Deployment

Hadoop performs best when deployed in fully distributed mode on a large cluster of networked machines. However, if you are new to Hadoop and want to explore basic commands or test applications, you can set it up on a single node.

In this configuration, known as pseudo-distributed mode, each Hadoop daemon runs as a separate Java process on the same machine. You customize a Hadoop environment by editing the following configuration files:

  1. .bashrc
  2. hadoop-env.sh
  3. core-site.xml
  4. hdfs-site.xml
  5. mapred-site.xml
  6. yarn-site.xml
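Apart from .bashrc, the files in this list live in the etc/hadoop directory of the Hadoop distribution. As a quick sanity check before editing them, the sketch below reports which ones are present. The helper name check_hadoop_conf is hypothetical, and the fallback path assumes the archive was extracted in the hadoop user's home directory.

```shell
# Hypothetical helper: report which of the expected config files exist in DIR.
check_hadoop_conf() {
  for f in hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    if [ -f "$1/$f" ]; then
      echo "found   $f"
    else
      echo "missing $f"
    fi
  done
}

# Assumption: Hadoop was extracted to ~/hadoop-3.3.3 unless HADOOP_HOME says otherwise.
check_hadoop_conf "${HADOOP_HOME:-$HOME/hadoop-3.3.3}/etc/hadoop"
```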

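Configure Hadoop Environment Variables (.bashrc)

As the hadoop user, edit the .bashrc shell configuration file in your home directory using a text editor of your choice (we will be using vim). The file belongs to the hadoop user, so sudo is not needed:

vim ~/.bashrc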
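The original list of variables is not reproduced here, so the following is only a minimal sketch of what such entries typically look like. It assumes the archive was extracted in the hadoop user's home directory, giving ~/hadoop-3.3.3. Adjust the path if your installation lives elsewhere.

```shell
# Assumption: the archive was extracted in the hadoop user's home directory,
# so the installation lives in ~/hadoop-3.3.3. Adjust if yours differs.
export HADOOP_HOME="$HOME/hadoop-3.3.3"

# Put the Hadoop client commands (bin) and the start/stop scripts (sbin) on PATH.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

After saving the file, run source ~/.bashrc so the changes apply to the current shell session.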

I hope you found these steps on how to install Hadoop on Linux useful. Please feel free to leave a comment below.
