Download the latest version of Hadoop
http://ftp.unicamp.br/pub/apache/hadoop/common/
At the time of writing, the latest version is at: http://ftp.unicamp.br/pub/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
Commands:
$ wget http://ftp.unicamp.br/pub/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
$ tar -zxvf hadoop-1.2.1.tar.gz
$ sudo cp -r hadoop-1.2.1/ /usr/local
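Once the JDK and JAVA_HOME are set up in the steps below, a quick sanity check that the unpacked distribution runs is:
$ /usr/local/hadoop-1.2.1/bin/hadoop version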
Install the Java SDK
On CentOS Linux we can use yum; either of the commands below installs the current version of the Java SDK.
Commands:
$ yum install java-1.7.0-openjdk-devel
or
$ yum install java-devel
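To confirm the JDK is in place, and to find the install path that JAVA_HOME will point at later:
$ java -version
$ readlink -f $(which javac)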
Download the latest version of Mahout
http://ftp.unicamp.br/pub/apache/mahout/
At the time of writing: http://ftp.unicamp.br/pub/apache/mahout/0.8/mahout-distribution-0.8.tar.gz
Commands:
$ wget http://ftp.unicamp.br/pub/apache/mahout/0.8/mahout-distribution-0.8.tar.gz
$ tar -zxvf mahout-distribution-0.8.tar.gz
$ sudo cp -r mahout-distribution-0.8/ /usr/local
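As a quick smoke test, running the Mahout driver with no arguments should print the list of valid program names (it needs JAVA_HOME, which is set in the next step):
$ /usr/local/mahout-distribution-0.8/bin/mahout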
Add the lines below to the ~/.bashrc file:
export HADOOP_PREFIX=/usr/local/hadoop-1.2.1
export PATH=$PATH:$HADOOP_PREFIX/bin
export MAHOUT_HOME=/usr/local/mahout-distribution-0.8
export PATH=$PATH:$MAHOUT_HOME
# match the JDK installed above (adjust if readlink showed a different path)
export JAVA_HOME=/usr/lib/jvm/java-1.7.0
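Reload the file so the variables take effect in the current shell, and check them:
$ source ~/.bashrc
$ echo $HADOOP_PREFIX $MAHOUT_HOME $JAVA_HOME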
Configure ssh so one server can connect to another without prompting for a password:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
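The two commands above enable passwordless ssh to the local machine only. For the cluster, the master's public key must also be authorized on each slave; one way to copy it (once the hostnames slave1 and slave2 from the next section resolve) is:
$ ssh-copy-id -i ~/.ssh/id_dsa.pub slave1
$ ssh-copy-id -i ~/.ssh/id_dsa.pub slave2
$ ssh slave1
The last command should log in without prompting for a password.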
Configuring the Cluster
On the master, add to /etc/hosts:
192.168.0.123 master
192.168.0.131 slave1
192.168.0.132 slave2
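To confirm the names resolve before going on:
$ ping -c 1 slave1
$ ping -c 1 slave2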
Add to $HADOOP_PREFIX/conf/masters:
master
Add to $HADOOP_PREFIX/conf/slaves:
master
slave1
On all machines:
conf/core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310</value>
<description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>
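Note that each <property> block in this and the next two files must sit inside the file's <configuration> root element; a minimal complete core-site.xml would look like this:
<?xml version="1.0"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310</value>
</property>
</configuration>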
conf/mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
<description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.
</description>
</property>
conf/hdfs-site.xml
Note: the <value>2</value> below refers to the number of available nodes.
<property>
<name>dfs.replication</name>
<value>2</value>
<description>
Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.
</description>
</property>
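Since these three files must be identical everywhere, one convenient way is to edit them on the master and copy them to the slaves (a sketch, assuming the hostnames from /etc/hosts above):
$ for host in slave1 slave2; do scp $HADOOP_PREFIX/conf/*-site.xml $host:$HADOOP_PREFIX/conf/; done
Before the first start, the HDFS namenode must also be formatted, once, on the master:
$ hadoop namenode -format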
Immediately afterwards, start the cluster.
Command:
$ start-all.sh
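To check that everything came up, the JDK's jps tool should list the daemons on the master (NameNode, SecondaryNameNode, JobTracker, plus DataNode and TaskTracker, since the master also appears in the slaves file):
$ jps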
Links:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
http://www.higherpass.com/linux/Tutorials/Installing-And-Using-Hadoop/1/
http://wiki.apache.org/hadoop/PoweredBy