Version: 3.0.0

Vagrant Environment Setup for Apache Ambari

This guide walks you through setting up a basic multi-node Vagrant environment for Apache Ambari development and testing. The environment consists of one Ambari Server node and two Agent nodes, which together form a minimal distributed cluster.

Overview

This guide is part of the Quick Start section and covers:

Setting up a basic three-node Vagrant environment
Configuring network and shared storage
Setting up SSH access between nodes
Configuring security settings (firewall, SELinux)
Preparing the environment for Ambari installation

For complete installation instructions, refer to the Installation Guide.

System Requirements

Minimum Host Machine Resources:
- CPU: 6+ cores (2 cores per VM)
- RAM: 24GB+ (8GB per VM, plus headroom for the host operating system)
- Storage: 100GB+ free space
Software Requirements:
- VirtualBox 6.1+
- Vagrant 2.2+
- Operating System: Linux, macOS, or Windows with virtualization support
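
The host figures above follow directly from the per-VM allocation. As a rough sketch of that arithmetic (the function name vm_totals is only an illustration and is not part of any Ambari or Vagrant tooling):

```shell
# vm_totals <vm_count> <memory_mb_per_vm> <cpus_per_vm>
# Prints the total memory and CPU cores the guest VMs claim on the host.
vm_totals() {
  count=$1; mem_mb=$2; cpus=$3
  echo "$((count * mem_mb)) MB RAM, $((count * cpus)) cores"
}

# Three VMs with 8192 MB and 2 cores each, as configured in this guide
vm_totals 3 8192 2
```

The guests alone account for the full 24GB, so a host with exactly 24GB of RAM will likely be short on memory once its own operating system is counted.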

Important Notes

This configuration meets only the minimum requirements for basic development and testing
Each VM requires 8GB RAM minimum for basic Hadoop services
The shared folder for RPM repository must exist on the host machine
Port 8080 should be available on the host machine for Ambari Web UI
For production environments, refer to the official sizing guide
Additional resources may be required depending on your specific use case

Prerequisites

Install VirtualBox
Install Vagrant

Environment Architecture

The Vagrant environment creates a minimal distributed setup with the following components:

Ambari Server Node (vm1):
- Primary controller node
- Hosts Ambari Server and Web UI
- Manages cluster configuration and operations
- IP: 192.168.56.20
- Web UI accessible at http://localhost:8080
Agent Nodes (vm2, vm3):
- Execute and monitor Hadoop services
- Report status to Ambari Server
- Support service distribution and scaling
- IPs: 192.168.56.21-22
Network Layout:
- Private network for inter-node communication
- Port 8080 forwarded for Ambari Web UI access
- Hosts file entries for name resolution (added manually in step 4)
- Firewall disabled to simplify development (step 3)
Shared Storage:
- RPM repository accessible to all nodes
- Consistent package access across cluster

Setting Up Vagrant Environment

Create a new directory for your Vagrant environment:

mkdir ambari-vagrant
cd ambari-vagrant

Create the RPM repository directory:

mkdir -p ./ambari-repo

Create a Vagrantfile:

# Vagrantfile for Apache Ambari 3-node development environment
# This configuration creates a minimal cluster with one server and two agent nodes
# All manual configuration steps will be performed after VM creation

Vagrant.configure("2") do |config|
  # VM 1 - Primary Ambari Server Node
  # This VM will host the Ambari Server and Web UI
  config.vm.define "vm1" do |vm1|
    # Use Rocky Linux 8 as the base operating system
    vm1.vm.box = "generic/rocky8"
    
    # Set the hostname to vm1 for proper identification
    vm1.vm.hostname = "vm1"

    # Port forwarding for Ambari Web UI
    # This allows accessing the Ambari interface at http://localhost:8080 from your host machine
    vm1.vm.network "forwarded_port", guest: 8080, host: 8080

    # Private network configuration
    # Creates a private network for inter-VM communication with a static IP
    vm1.vm.network "private_network", ip: "192.168.56.20"

    # VirtualBox provider-specific configuration
    vm1.vm.provider "virtualbox" do |vb|
      # Disable GUI mode (headless operation)
      vb.gui = false
      
      # Allocate 8GB RAM to this VM (minimum required for Ambari Server)
      vb.memory = "8192"
      
      # Allocate 2 CPU cores to this VM
      vb.cpus = 2
    end
  end

  # VM 2 - First Agent Node
  # This VM will run Ambari Agent and host Hadoop services
  config.vm.define "vm2" do |vm2|
    # Use Rocky Linux 8 as the base operating system
    vm2.vm.box = "generic/rocky8"
    
    # Set the hostname to vm2 for proper identification
    vm2.vm.hostname = "vm2"

    # Private network configuration
    # Creates a private network for inter-VM communication with a static IP
    vm2.vm.network "private_network", ip: "192.168.56.21"

    # VirtualBox provider-specific configuration
    vm2.vm.provider "virtualbox" do |vb|
      # Disable GUI mode (headless operation)
      vb.gui = false
      
      # Allocate 8GB RAM to this VM (minimum required for Hadoop services)
      vb.memory = "8192"
      
      # Allocate 2 CPU cores to this VM
      vb.cpus = 2
    end
  end

  # VM 3 - Second Agent Node
  # This VM will run Ambari Agent and host additional Hadoop services
  config.vm.define "vm3" do |vm3|
    # Use Rocky Linux 8 as the base operating system
    vm3.vm.box = "generic/rocky8"
    
    # Set the hostname to vm3 for proper identification
    vm3.vm.hostname = "vm3"

    # Private network configuration
    # Creates a private network for inter-VM communication with a static IP
    vm3.vm.network "private_network", ip: "192.168.56.22"

    # VirtualBox provider-specific configuration
    vm3.vm.provider "virtualbox" do |vb|
      # Disable GUI mode (headless operation)
      vb.gui = false
      
      # Allocate 8GB RAM to this VM (minimum required for Hadoop services)
      vb.memory = "8192"
      
      # Allocate 2 CPU cores to this VM
      vb.cpus = 2
    end
  end

  # Shared folder for Ambari RPM repository
  # This maps ./ambari-repo on the host to /vagrant_data on all VMs
  # Used for distributing RPM packages to all nodes
  config.vm.synced_folder "./ambari-repo", "/vagrant_data"

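  # Disable VirtualBox Guest Additions auto-update
  # This prevents potential issues during VM startup
  # These settings belong to the vagrant-vbguest plugin, so apply them only when it is installed
  if Vagrant.has_plugin?("vagrant-vbguest")
    config.vbguest.auto_update = false
    config.vbguest.no_remote = true
  end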
end

Install sshpass (optional; only needed if you script SSH key distribution without interactive password prompts):

# For macOS:
brew install sshpass

# For Linux:
sudo apt-get install sshpass  # Ubuntu/Debian
sudo yum install sshpass      # RHEL/CentOS

Start the Vagrant environment:

vagrant up

Manual Configuration Steps

After starting your VMs, you must perform several important configuration steps to ensure proper cluster operation. These manual steps make it easier to understand the configuration process and troubleshoot issues.

1. Root User Configuration

By default, vagrant ssh vm1 logs you in as the vagrant user. The Ambari installation and configuration steps in this guide are performed as the root user.

Switch to the root user on each VM:

# Connect to each VM
vagrant ssh vm1  # Repeat for vm2, vm3

# Switch to root user
sudo su -

Set a password for the root user:

# While logged in as root
passwd

# Enter and confirm a new password when prompted
# Remember this password for future root access

Note: Root access is required for Ambari installation. The Ambari setup process needs to install packages and modify system configurations that require root privileges. All subsequent steps should be performed as the root user.

2. SSH Configuration

On each VM, modify SSH configuration to allow password authentication and root login:

# Connect to each VM and switch to root
vagrant ssh vm1  # Repeat for vm2, vm3
sudo su -

# Edit sshd_config
vi /etc/ssh/sshd_config

# Make these changes:
# PasswordAuthentication yes
# PermitRootLogin yes

# Restart sshd service
systemctl restart sshd
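
If you prefer not to edit the file by hand on three VMs, the same two changes can be scripted. The following is a minimal sketch, assuming GNU sed (as shipped on Rocky Linux); the helper name set_sshd_option is hypothetical and not part of OpenSSH:

```shell
# set_sshd_option <file> <key> <value>
# Replaces an existing "<key> ..." line (active or commented out with #),
# or appends the setting if the key does not appear in the file at all.
set_sshd_option() {
  file=$1; key=$2; value=$3
  if grep -Eq "^[[:space:]]*#?[[:space:]]*${key}[[:space:]]" "$file"; then
    sed -i -E "s|^[[:space:]]*#?[[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file"
  else
    echo "${key} ${value}" >> "$file"
  fi
}

# Typical use as root on each VM:
#   set_sshd_option /etc/ssh/sshd_config PasswordAuthentication yes
#   set_sshd_option /etc/ssh/sshd_config PermitRootLogin yes
#   sshd -t && systemctl restart sshd
```

Running sshd -t before the restart checks the configuration for syntax errors, which helps avoid locking yourself out with a broken file.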

Generate SSH keys on vm1 as root:

# Connect to vm1 and switch to root
vagrant ssh vm1
sudo su -

# Generate an SSH key pair if one does not already exist
if [ ! -f ~/.ssh/id_rsa ]; then
  ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
fi

Set up passwordless SSH from vm1 to all VMs as root:

# On vm1 as root
# Copy keys to each VM (including vm1 itself)
ssh-copy-id -o StrictHostKeyChecking=no root@vm1
ssh-copy-id -o StrictHostKeyChecking=no root@vm2
ssh-copy-id -o StrictHostKeyChecking=no root@vm3
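
Each ssh-copy-id call above prompts for the root password. If sshpass is available on the machine where you run the commands, the prompts can be avoided. This is a sketch under that assumption; distribute_keys and the DRY_RUN switch are illustrative names, while sshpass -e (read the password from the SSHPASS environment variable) is a standard sshpass option:

```shell
# distribute_keys <host>...
# Copies root's public key to each host without a password prompt.
# With DRY_RUN=1 the commands are printed instead of executed.
distribute_keys() {
  for host in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "sshpass -e ssh-copy-id -o StrictHostKeyChecking=no root@${host}"
    else
      sshpass -e ssh-copy-id -o StrictHostKeyChecking=no "root@${host}"
    fi
  done
}

# Preview the commands without contacting any host
DRY_RUN=1 distribute_keys vm1 vm2 vm3
```

For a real run, export SSHPASS with the root password set in step 1 and call distribute_keys vm1 vm2 vm3 without DRY_RUN.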

Test SSH connectivity as root:

# Test SSH access between nodes as root
ssh root@vm2 echo "Connection to vm2 successful"
ssh root@vm3 echo "Connection to vm3 successful"

3. Security Configuration

Disable SELinux on each VM as root:

# Connect to each VM and switch to root if not already
vagrant ssh vm1  # Repeat for vm2, vm3
sudo su -

# Switch SELinux to permissive mode immediately (lasts until the next reboot)
setenforce 0

# Disable SELinux permanently (takes effect after the next reboot)
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
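
The running mode and the configured mode can differ: setenforce changes only the running mode, while /etc/selinux/config decides the mode after a reboot. To confirm what the next boot will use, a small helper such as the following can read the configured value (the function name is hypothetical):

```shell
# selinux_configured_mode [config-file]
# Prints the mode SELinux will use after the next reboot
# (enforcing, permissive, or disabled), taken from the SELINUX= line.
# The SELINUXTYPE= line is ignored because it does not start with "SELINUX=".
selinux_configured_mode() {
  sed -n 's/^SELINUX=//p' "${1:-/etc/selinux/config}"
}

# Typical use as root on a VM:
#   getenforce                 # running mode
#   selinux_configured_mode    # mode after reboot
```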

Ensure firewall is disabled on each VM as root:

# Connect to each VM and switch to root if not already
vagrant ssh vm1  # Repeat for vm2, vm3
sudo su -

# Stop firewall
systemctl stop firewalld

# Disable firewall on boot
systemctl disable firewalld

4. Hosts File Configuration

Configure /etc/hosts on each VM as root:

# Connect to each VM and switch to root if not already
vagrant ssh vm1  # Repeat for vm2, vm3
sudo su -

# Edit hosts file
vi /etc/hosts

# Remove or comment out any lines with:
# 127.0.0.1 vm1
# 127.0.0.1 vm2
# 127.0.0.1 vm3

# Add these entries if not present:
192.168.56.20 vm1
192.168.56.21 vm2
192.168.56.22 vm3
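
The same edit can be applied non-interactively. The sketch below assumes GNU sed and the three addresses used in this guide; fix_hosts is an illustrative name. It also removes 127.0.1.1 entries for the VM hostnames, which is an assumption about what some boxes add and may not apply to yours:

```shell
# fix_hosts <hosts-file>
# 1. Drops loopback lines (127.0.0.1 or 127.0.1.1) that mention vm1, vm2, or vm3.
# 2. Appends each private-network entry that is not already present,
#    so running the function twice leaves the file unchanged.
fix_hosts() {
  file=$1
  sed -i -E '/^127\.0\.[01]\.1[[:space:]].*vm[123]/d' "$file"
  for entry in "192.168.56.20 vm1" "192.168.56.21 vm2" "192.168.56.22 vm3"; do
    grep -qxF "$entry" "$file" || echo "$entry" >> "$file"
  done
}

# Typical use as root on each VM:
#   fix_hosts /etc/hosts
```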

5. Enable Development Repository

The Rocky Linux development repository needs to be enabled on each VM to install dependencies required for Ambari:

# Connect to each VM and switch to root if not already
vagrant ssh vm1  # Repeat for vm2, vm3
sudo su -

# Edit the Rocky-Devel repository configuration
vi /etc/yum.repos.d/Rocky-Devel.repo

# There are two possible scenarios:
# 1. If all lines are commented (start with #), uncomment all lines
# 2. If you see "enabled=0", change it to "enabled=1"

# After editing, verify the repository is enabled
yum repolist | grep devel

Note: Enabling the development repository is critical for installing dependencies required by Ambari. Without this repository, you may encounter package installation failures during Ambari setup.
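
For the second scenario (an "enabled=0" line), the change can be made with sed instead of an editor. This is a minimal sketch assuming GNU sed; enable_repo_file is a hypothetical helper, and it deliberately does not handle the first scenario where every line is commented out:

```shell
# enable_repo_file <repo-file>
# Flips every "enabled=0" line in the given .repo file to "enabled=1".
# Other lines, including commented ones, are left untouched.
enable_repo_file() {
  sed -i 's/^enabled=0$/enabled=1/' "$1"
}

# Typical use as root on each VM:
#   enable_repo_file /etc/yum.repos.d/Rocky-Devel.repo
#   yum repolist | grep devel
```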

6. Verify Configuration

Check SSH connectivity as root:

# Connect to vm1 and switch to root if not already
vagrant ssh vm1
sudo su -

# Test SSH connections as root
ssh root@vm2 echo "Connection to vm2 successful"
ssh root@vm3 echo "Connection to vm3 successful"

Verify security settings as root:

# Connect to vm1 and switch to root if not already
vagrant ssh vm1
sudo su -

# Check SELinux status on each VM
for i in {1..3}; do
  echo "=== VM$i SELinux Status ==="
  ssh root@vm$i getenforce  # Should show 'Permissive' now, or 'Disabled' after a reboot
done

# Check firewall status on each VM
for i in {1..3}; do
  echo "=== VM$i Firewall Status ==="
  ssh root@vm$i systemctl status firewalld  # Should show 'inactive'
done
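
The two loops above print full command output for every node. If a one-line summary per node is easier to scan, something like the following sketch could be used instead. It assumes passwordless root SSH from vm1 is already working; check_nodes is an illustrative name:

```shell
# check_nodes <host>...
# Queries the SELinux mode and the firewalld state on each host over SSH
# and prints one summary line per host.
# "systemctl is-active" exits non-zero for an inactive unit, hence "|| true".
check_nodes() {
  for host in "$@"; do
    selinux=$(ssh -n "root@${host}" getenforce) || true
    firewall=$(ssh -n "root@${host}" systemctl is-active firewalld) || true
    echo "${host}: selinux=${selinux} firewalld=${firewall}"
  done
}

# Typical use on vm1 as root:
#   check_nodes vm1 vm2 vm3
```

A correctly prepared node reports firewalld=inactive and an SELinux mode of Permissive or Disabled.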

Verify hosts file configuration as root:

# Connect to vm1 and switch to root if not already
vagrant ssh vm1
sudo su -

# Check hosts file on each VM
for i in {1..3}; do
  echo "=== VM$i Hosts File ==="
  ssh root@vm$i cat /etc/hosts
done

Test network connectivity as root:

# Connect to vm1 and switch to root if not already
vagrant ssh vm1
sudo su -

# Test ping between all nodes
for i in {1..3}; do
  echo "=== Testing from VM$i ==="
  for j in {1..3}; do
    [ $i -ne $j ] && ssh root@vm$i ping -c 1 vm$j
  done
done

Troubleshooting

If you encounter issues during the manual configuration:

SSH Issues:

# If SSH connection fails, check sshd configuration
vagrant ssh vm1
sudo su -
grep -E 'PasswordAuthentication|PermitRootLogin' /etc/ssh/sshd_config

# Restart sshd on problem node
systemctl restart sshd

# Manually copy SSH keys if needed
ssh-copy-id -o StrictHostKeyChecking=no root@vm2
ssh-copy-id -o StrictHostKeyChecking=no root@vm3

SELinux/Firewall Issues:

# Connect to vm1 and switch to root
vagrant ssh vm1
sudo su -

# Check SELinux status
ssh root@vm1 getenforce

# Manually disable SELinux (the quotes keep the remote shell from interpreting the sed expression)
ssh root@vm1 setenforce 0
ssh root@vm1 "sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config"

# Check firewall status
ssh root@vm1 systemctl status firewalld

# Manually disable firewall
ssh root@vm1 systemctl stop firewalld
ssh root@vm1 systemctl disable firewalld

Hosts File Issues:

# Connect to vm1 and switch to root
vagrant ssh vm1
sudo su -

# Check hosts file content
ssh root@vm1 cat /etc/hosts

# Manually fix hosts file (the quotes keep the remote shell from expanding the pattern)
ssh root@vm1 "sed -i '/127\.0\.0\.1.*vm[123]/d' /etc/hosts"
ssh root@vm1 "echo '192.168.56.20 vm1' >> /etc/hosts"
ssh root@vm1 "echo '192.168.56.21 vm2' >> /etc/hosts"
ssh root@vm1 "echo '192.168.56.22 vm3' >> /etc/hosts"

Resource Issues:
- If VMs are slow or unresponsive, check host resource usage
- Ensure each VM has at least 8GB RAM allocated
- Verify at least 2 CPU cores per VM
- Check available disk space on host
Network Connectivity:
- Test inter-VM communication with ping
- Verify VirtualBox network settings
- Check for IP conflicts
- Ensure port 8080 is available on host

Next Steps

After setting up your Vagrant environment:

Verify all VMs are running:

vagrant status
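
To check the result in a script rather than by eye, the machine-readable form of the same command can be filtered. This sketch assumes the comma-separated layout timestamp,target,type,data that vagrant status --machine-readable emits; not_running is an illustrative name:

```shell
# not_running
# Reads `vagrant status --machine-readable` output on stdin and prints the
# name of every machine whose state is anything other than "running".
# Assumed line format: timestamp,target,type,data
not_running() {
  awk -F, '$3 == "state" && $4 != "running" { print $2 }'
}

# Typical use on the host, from the ambari-vagrant directory:
#   vagrant status --machine-readable | not_running
```

Empty output means all three VMs report the running state.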

Test SSH access to each VM:

vagrant ssh vm1  # Similarly for vm2, vm3

Proceed to the Installation Guide to install and configure Ambari Server and Agents.

Common Vagrant Commands

vagrant up: Start the VMs
vagrant halt: Stop the VMs
vagrant destroy: Remove the VMs
vagrant status: Check the status of the VMs
vagrant reload: Restart VMs with new Vagrantfile configuration
vagrant ssh vm1: Connect to VM1 (similarly for vm2, vm3)
