TechDirectArchive

Hands-on IT, Cloud, Security & DevOps Insights

How to Install Hadoop on Linux

Posted on 30/03/2024 (updated 09/04/2024) by Raphael Gab-Momoh

In this guide, we discuss how to install Hadoop on Linux. The Apache Hadoop software library is a framework for the distributed processing of large data sets across clusters of computers using simple programming models. Please see how to fix MySQL Workbench could not connect to MySQL server, fix “WARNING: The provided hosts list is empty only the localhost is available and note that the implicit localhost does not match all”, and How to perform SSH key-based authentication in Linux.

It is designed to scale from a small number of servers to thousands of machines, each of which provides local computing and storage. Rather than relying on hardware to deliver high availability, the library itself detects and handles failures at the application layer, so it can provide a highly available service on top of a cluster of computers, each of which may be prone to failure.

Also, see how to Associate SSH Public key with Azure Linux VM, and how to install Java Runtime Environment on Mac OS.

Prerequisites to installing Hadoop on Linux

  • Ubuntu 18.04 or higher
  • Access to a command-line tool
  • Sudo or root privileges on local/remote machines
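
The first prerequisite can be checked from the command line. The following is a minimal sketch, assuming GNU sort (for its -V version ordering) and the standard /etc/os-release file; the helper name version_ge is made up for this example.

```shell
# True when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort), which Ubuntu ships by default.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if [ -r /etc/os-release ]; then
  # Source the file in a subshell so its variables do not leak into this shell.
  os_id=$(. /etc/os-release && printf '%s' "$ID")
  os_ver=$(. /etc/os-release && printf '%s' "$VERSION_ID")
  if [ "$os_id" = "ubuntu" ] && version_ge "$os_ver" "18.04"; then
    echo "OK: Ubuntu $os_ver"
  else
    echo "Note: detected ${os_id:-unknown} ${os_ver:-unknown}; this guide assumes Ubuntu 18.04 or higher"
  fi
fi
```

On other distributions the script only prints a note; it does not stop the installation.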

Step 1: Install OpenJDK on Ubuntu

The Hadoop framework’s services are written in Java, so they require a compatible Java Runtime Environment (JRE) and Java Development Kit (JDK). Before beginning a new installation, use the following command to update your package index:

sudo apt update

Currently, Apache Hadoop 3 fully supports Java 8.x. Both the runtime environment and the development kit are included in the Ubuntu OpenJDK 8 package.

To install OpenJDK 8 in your terminal, enter the following command:

sudo apt install openjdk-8-jdk-headless -y

The OpenJDK or Oracle Java version can affect how the components of a Hadoop ecosystem interact. Once the installation is finished, check the installed Java version:

java -version; javac -version

The output shows which Java version is in use.
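
If you want to script this check, the version string can be reduced to its major number. This is a rough sketch rather than part of the original steps; java_major is a hypothetical helper name, and the parsing assumes the usual quoted version string printed by java -version.

```shell
# Reduce a Java version string to its major number:
# "1.8.0_402" -> 8 (legacy 1.x scheme), "11.0.22" -> 11.
java_major() {
  case "$1" in
    1.*) v=${1#1.}; echo "${v%%.*}" ;;
    *)   echo "${1%%[.+-]*}" ;;
  esac
}

# Only query the JVM when java is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; the version is the first quoted field.
  ver=$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')
  if [ "$(java_major "$ver")" = "8" ]; then
    echo "Java 8 detected ($ver)"
  else
    echo "Warning: expected Java 8, found ${ver:-unknown}"
  fi
fi
```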


Step 2: Create a Non-Root User in the Hadoop Environment

It is preferable to create a dedicated non-root user for the Hadoop environment. A separate user makes the cluster easier to manage and improves security.

The user must be able to create a passwordless SSH connection with localhost in order for Hadoop services to operate without interruption.

Install OpenSSH on Ubuntu

Install the OpenSSH server and client using the following command:

sudo apt install openssh-server openssh-client -y

Configure SSH by editing the sshd configuration file:

sudo nano /etc/ssh/sshd_config

You can optionally change the port that SSH listens on:

  • Change the Port value to the number of your choice. Make sure there is no “#” at the beginning of the line.
  • Save the changes and exit the editor.
  • For the changes to take effect, restart the SSH service with this command:
sudo service ssh restart
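
To confirm which port value is active after editing, you can read it back from the file. This is a simple sketch: sshd_ports is a hypothetical helper, it only looks at the one file you pass in (not at any included drop-in files), and it falls back to 22, the OpenSSH default. Running sudo sshd -t before the restart is also a cheap way to catch syntax errors in the file.

```shell
# Print every active (uncommented) Port value from an sshd_config-style file.
# Prints 22, the OpenSSH default, when no Port line is active.
sshd_ports() {
  awk 'tolower($1) == "port" { print $2; found = 1 }
       END { if (!found) print 22 }' "$1"
}

# Usage against the system file, skipped when it is not readable.
if [ -r /etc/ssh/sshd_config ]; then
  sshd_ports /etc/ssh/sshd_config
fi
```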

Create Hadoop User

To create a new user for Hadoop, use the adduser command:

sudo adduser hadoop
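
If these steps might be run more than once, it can help to check whether the account already exists before calling adduser. This is a small sketch; user_exists and new_user are names invented for this example.

```shell
# True when an account with the given name exists.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}

# new_user is an illustrative variable for this sketch only.
new_user=hadoop
if user_exists "$new_user"; then
  echo "User $new_user already exists; skipping adduser"
else
  echo "User $new_user not found; create it with: sudo adduser $new_user"
fi
```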

In this example, the username is hadoop, but you can choose any username and password. Switch to the newly created user and enter its password when prompted:

su - hadoop

Next, the user must be able to connect to localhost over SSH without being prompted for a password.

Enable Passwordless SSH for Hadoop User

Create an SSH key pair and specify where it should be kept with the command:

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

To add the public key to authorized_keys in the .ssh directory, use the cat command:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Set the permissions for your user with the chmod command:

chmod 0600 ~/.ssh/authorized_keys

Note: A password is no longer required each time the new user wants to SSH.
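
A common reason passwordless login still prompts for a password is loose permissions on the key files. The sketch below checks for the conventional 700 mode on the directory and 600 on authorized_keys. It assumes GNU stat (the -c option), as found on Ubuntu, and ssh_perms_ok is a hypothetical helper name.

```shell
# Check that an SSH directory and its authorized_keys file carry the
# conventional 700 / 600 modes. Uses GNU stat (-c).
ssh_perms_ok() {
  [ "$(stat -c '%a' "$1" 2>/dev/null)" = "700" ] &&
    [ "$(stat -c '%a' "$1/authorized_keys" 2>/dev/null)" = "600" ]
}

if [ -f "$HOME/.ssh/authorized_keys" ]; then
  if ssh_perms_ok "$HOME/.ssh"; then
    echo "SSH permissions look fine"
  else
    echo "Fix with: chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys"
  fi
fi
```

Once the host key has been accepted on a first interactive login, ssh -o BatchMode=yes localhost true is one way to confirm the login works, because BatchMode makes ssh fail instead of asking for a password.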

Use the hadoop user to SSH to localhost to ensure everything is configured properly.

ssh localhost

The hadoop user can now open an SSH connection to localhost without a password. On the first connection, SSH asks once for confirmation of the host key.

Step 3: Download and Install Hadoop on Ubuntu

Use wget to download the Hadoop binary archive from the official Apache website:

wget https://downloads.apache.org/hadoop/common/hadoop-3.3.3/hadoop-3.3.3.tar.gz

Once the download is complete, extract the files to initiate the Hadoop installation:

tar xzf hadoop-3.3.3.tar.gz

The Hadoop binary files are now in the hadoop-3.3.3 directory.
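
It is good practice to verify a downloaded archive before relying on it. Apache publishes a SHA-512 digest alongside each release archive. The sketch below compares a file against a digest you supply; verify_sha512 is a hypothetical helper, and the digest itself is deliberately left as a placeholder for you to copy from the download page.

```shell
# Compare a file's SHA-512 digest with an expected hex string (case-insensitive).
verify_sha512() {
  actual=$(sha512sum "$1" | awk '{ print $1 }')
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  [ "$actual" = "$expected" ]
}

# Usage sketch (not executed here): replace the placeholder with the
# digest published for the release.
# verify_sha512 hadoop-3.3.3.tar.gz "<published sha512>" && echo "checksum OK"
```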

Step 4: Single Node Hadoop Deployment

Hadoop performs best in a fully distributed configuration on a large cluster of networked machines. However, if you are new to Hadoop and want to explore basic commands or test applications, you can set it up on a single node.

In this configuration, known as pseudo-distributed mode, each Hadoop daemon runs as a separate Java process. You customize a Hadoop environment by editing the following configuration files:

  1. .bashrc
  2. hadoop-env.sh
  3. core-site.xml
  4. hdfs-site.xml
  5. mapred-site.xml
  6. yarn-site.xml
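
Apart from .bashrc, which lives in the user's home directory, these files sit in the etc/hadoop directory of the Hadoop installation. The sketch below lists any that are missing. It assumes the archive was extracted in the hadoop user's home directory, and missing_conf_files is a hypothetical helper name.

```shell
# Print the names of the Hadoop configuration files that are absent from a directory.
missing_conf_files() {
  for f in hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    [ -f "$1/$f" ] || echo "$f"
  done
}

# Assumed location: archive extracted in the current user's home directory.
conf_dir="$HOME/hadoop-3.3.3/etc/hadoop"
if [ -d "$conf_dir" ]; then
  missing_conf_files "$conf_dir"
fi
```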

Configure Hadoop Environment Variables (bashrc)

Edit the .bashrc shell configuration file using a text editor of your choice (we will be using vim):

vim ~/.bashrc
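
As an illustration of what such entries might look like, here is a minimal sketch and not the complete list of variables. It assumes the archive was extracted to hadoop-3.3.3 in the hadoop user's home directory; adjust the path if you placed it elsewhere.

```shell
# Assumed install location: archive extracted in the hadoop user's home directory.
export HADOOP_HOME="$HOME/hadoop-3.3.3"
# Directory that holds core-site.xml, hdfs-site.xml and the other Hadoop config files.
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
# Put the hadoop/hdfs/yarn commands (bin) and the start/stop scripts (sbin) on PATH.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

After saving the file, run source ~/.bashrc so the current shell picks up the new variables.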

I hope you found these steps on how to install Hadoop on Linux useful. Please feel free to leave a comment below.

Copyright © 2025 TechDirectArchive

 
