Monday, April 21, 2014

Day#1 Hadoop: Install Hadoop on an Ubuntu single-node VM

Set up an Ubuntu virtual machine in Hyper-V.

1) Download Hadoop
After you set up your Ubuntu VM, go to http://hadoop.apache.org/releases.html#Download
and download a stable Hadoop 1.x release. I chose hadoop-1.2.1.tar.gz



2) Install Java
$ sudo apt-get install openjdk-7-jdk
Verify Java with java -version at the shell prompt. Mine reports java version "1.7.0_51".
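The exact JDK install path matters again in step 6 for hadoop-env.sh. A quick way to recover it from the java binary, sketched here with a stand-in path (on the VM you would first resolve the real path with readlink -f "$(which java)"):

```shell
# Sketch: derive JAVA_HOME by stripping the /bin/java suffix from the
# resolved java binary path. The path below is a stand-in so the string
# manipulation itself is visible; on the VM use:
#   JAVA_BIN=$(readlink -f "$(which java)")
JAVA_BIN=/usr/lib/jvm/java-1.7.0-openjdk-amd64/bin/java
JAVA_HOME=${JAVA_BIN%/bin/java}
echo "$JAVA_HOME"   # /usr/lib/jvm/java-1.7.0-openjdk-amd64
```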

3) Set up a passwordless ssh connection to localhost
$ sudo apt-get install openssh-server
$ ssh-keygen

$ cat /home/butik/.ssh/id_rsa.pub >> /home/butik/.ssh/authorized_keys

Then you should be able to ssh to localhost without a password.
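The plain cat >> append adds the key again every time it runs. A guarded variant is sketched below in a temporary directory, with a fake key standing in for ~/.ssh and the real id_rsa.pub:

```shell
# Sketch: append the public key to authorized_keys only if it is not already
# there, so re-running the setup does not create duplicate entries.
# A temp dir and a fake key stand in for ~/.ssh and the real id_rsa.pub.
SSH_DIR=$(mktemp -d)
echo "ssh-rsa AAAAB3FAKEKEY butik@ubuntu" > "$SSH_DIR/id_rsa.pub"

append_key() {
  grep -qxF "$(cat "$SSH_DIR/id_rsa.pub")" "$SSH_DIR/authorized_keys" 2>/dev/null \
    || cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
}

append_key   # adds the key
append_key   # second run finds the key and is a no-op
wc -l < "$SSH_DIR/authorized_keys"
```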
4) Untar Hadoop

Make a directory for Hadoop using sudo privileges.

$ sudo mkdir /usr/local/hadoop

$ sudo mkdir /usr/local/hadoop/tmp  # for tmp files

$ sudo chown -R butik /usr/local/hadoop/

Go to the download folder and untar the gzipped Hadoop binaries.

$ tar -zxvf hadoop-1.2.1.tar.gz

$ cd hadoop-1.2.1

Copy the Hadoop files into the directory.

$ cp -r * /usr/local/hadoop
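The untar-then-copy step can also be done as a single extraction with GNU tar's -C and --strip-components flags. A sketch, using a throwaway tarball and a temp directory standing in for the real download and /usr/local/hadoop:

```shell
# Sketch: extract a hadoop-style tarball straight into the target directory,
# dropping the leading hadoop-1.2.1/ path component. A fabricated tarball
# and temp target stand in for the real download and /usr/local/hadoop.
WORK=$(mktemp -d)
mkdir -p "$WORK/hadoop-1.2.1/bin"
echo '#!/bin/sh' > "$WORK/hadoop-1.2.1/bin/hadoop"
tar -czf "$WORK/hadoop-1.2.1.tar.gz" -C "$WORK" hadoop-1.2.1

TARGET="$WORK/target"    # stand-in for /usr/local/hadoop
mkdir -p "$TARGET"
tar -xzf "$WORK/hadoop-1.2.1.tar.gz" -C "$TARGET" --strip-components=1
ls "$TARGET/bin"
```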

5) Set Hadoop home

.bashrc

Open $HOME/.bashrc in the vi editor and add the following lines:
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin
$ exec bash  # reload the shell so the new variables take effect

6) Set up Hadoop configuration files

The configuration files live in /usr/local/hadoop/conf.

sudo vi hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true

sudo vi core-site.xml

<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/tmp</value>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:10001</value>
</property>
</configuration>

sudo vi mapred-site.xml

<configuration>

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:10002</value>
</property>

</configuration>

sudo vi hdfs-site.xml (optional: the default replication factor is 3, so set it to 1 for a single node)

<configuration>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
</configuration>
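If you prefer not to hand-edit XML in vi, the same files can be generated from a heredoc. A sketch for core-site.xml, written to a temp directory here (on the VM the target would be /usr/local/hadoop/conf/core-site.xml):

```shell
# Sketch: write core-site.xml from a quoted heredoc (quoting 'EOF' prevents
# variable expansion inside the XML). Written to a temp dir here; on the VM
# the target directory is /usr/local/hadoop/conf/.
CONF=$(mktemp -d)
cat > "$CONF/core-site.xml" <<'EOF'
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/tmp</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:10001</value>
</property>
</configuration>
EOF
grep -c '<name>' "$CONF/core-site.xml"
```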

7) Start Hadoop

$hadoop namenode -format
$ start-all.sh

jps shows 5 daemons (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker). Hadoop is up and running.
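That jps check can be scripted. In the sketch below the jps output is stubbed with sample text so the loop itself is visible; on the VM you would use JPS_OUTPUT=$(jps) instead:

```shell
# Sketch: verify the five Hadoop 1.x daemons appear in jps output.
# JPS_OUTPUT is stubbed here; on the VM, set JPS_OUTPUT=$(jps).
JPS_OUTPUT='1234 NameNode
2345 DataNode
3456 SecondaryNameNode
4567 JobTracker
5678 TaskTracker'

missing=0
for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
  echo "$JPS_OUTPUT" | grep -qw "$d" || { echo "missing: $d"; missing=1; }
done
echo "missing=$missing"   # missing=0
```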
8) Transfer a file into HDFS

$ hadoop fs -mkdir /user/butik/data
$ hadoop fs -put test.txt /user/butik/data

Check the file in the NameNode web UI at localhost:50070.