Tag: NoSQL

CentOS 6 – Cloudera


So much work and so little time. I won't have time to explain, so this post just keeps a record of my install scripts for Cloudera on CentOS 6.

mkdir /opt/quidquid
mkdir /opt/quidquid/PROGS
yum install -y nmap wget apr apr-devel apr-util apr-util-devel libxml pcre pcre-devel gcc openssl-devel
cd ~
wget -c http://apache.crihan.fr/dist/httpd/httpd-2.2.27.tar.gz
tar zxf httpd-2.2.27.tar.gz
cd httpd-2.2.27
./configure --prefix=/opt/apache-2.2.27 --enable-so --enable-ssl --enable-ssl=shared --enable-rewrite --enable-rewrite=shared --with-z=/usr
make install
ln -s /opt/apache-2.2.27/ /opt/apache
cd ~
rm -Rf ~/httpd-2.*
vi /opt/apache/conf/httpd.conf
groupadd www
useradd -g www www
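cat > /etc/init.d/httpd << "EOF"
#!/bin/bash
# chkconfig: 3 85 15
# description: Apache httpd built from source under /opt/apache
# Note: the header lines above and the APACHEHOME value were missing from
# the original listing; the values used here are assumptions.
. /etc/rc.d/init.d/functions

APACHEHOME=/opt/apache

case "$1" in
  start)
    echo -n "Starting httpd: "
    daemon $APACHEHOME/bin/httpd
    touch /var/lock/subsys/httpd
    ;;
  stop)
    echo -n "Shutting down http: "
    killproc httpd
    rm -f /var/lock/subsys/httpd
    rm -f /var/run/httpd.pid
    ;;
  status)
    status httpd
    ;;
  restart)
    $0 stop
    $0 start
    ;;
  reload)
    echo -n "Reloading httpd: "
    killproc httpd -HUP
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|reload|status}"
    exit 1
    ;;
esac

exit 0
EOF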

chmod 700 /etc/init.d/httpd
/etc/init.d/httpd start
/etc/init.d/httpd stop
/sbin/chkconfig --level 3 httpd on
/sbin/chkconfig --level 06 httpd off
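
The init script above is written through a here-document with a quoted delimiter ("EOF"), which is why `$1`, `$0` and `$APACHEHOME` end up in the file literally rather than being expanded while the file is written. A minimal sketch of the difference, assuming a throwaway temp directory and a hypothetical variable `DEMO_HOME` that is not part of the install:

```shell
#!/bin/bash
# Sketch: quoted vs. unquoted here-document delimiters.
# DEMO_HOME and the temp files are illustrative only.
DEMO_HOME=/opt/apache
tmpdir=$(mktemp -d)

# Quoted delimiter: the text is written verbatim, nothing is expanded.
cat > "$tmpdir/quoted.txt" << "EOF"
daemon $DEMO_HOME/bin/httpd
EOF

# Unquoted delimiter: variables are expanded while the file is written.
cat > "$tmpdir/unquoted.txt" << EOF
daemon $DEMO_HOME/bin/httpd
EOF

quoted=$(cat "$tmpdir/quoted.txt")
unquoted=$(cat "$tmpdir/unquoted.txt")
echo "quoted:   $quoted"
echo "unquoted: $unquoted"
rm -rf "$tmpdir"
```

With the quoted form, any variable the script relies on has to be defined inside the script itself.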

mkdir -p /home/www/html
mkdir -p /home/www/cgi-bin
mkdir -p /home/www/html/CLOUDSME
cat > /home/www/html/robots.txt << "EOF"
User-agent: *
Disallow: /
EOF

cat > /home/www/html/index.html << "EOF"
Bonjour !
EOF

chgrp -R www /home/www
chmod -R 775 /home/www
cp /opt/apache/conf/httpd.conf /opt/apache/conf/httpd.old
vi /opt/apache/conf/httpd.conf

Edit the following lines to match our needs:
- User www
- Group www
- ServerName
- Listen 80
- DocumentRoot "/home/www/html"
- <Directory "/home/www/html">
- ScriptAlias /cgi-bin/ "/home/www/cgi-bin/"
- <Directory "/home/www/cgi-bin">
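
Some of these edits can also be scripted rather than made by hand in vi. The following is only a sketch: `apply_httpd_settings` is a hypothetical helper, it assumes GNU sed (`-i`), and it leaves ServerName, Listen and the two Directory blocks for manual editing.

```shell
#!/bin/bash
# Sketch: apply the User/Group/DocumentRoot/ScriptAlias edits with sed.
# apply_httpd_settings is a hypothetical helper, not part of Apache.
# Commented-out directives (lines starting with #) are left untouched.
apply_httpd_settings() {
    local conf=$1
    sed -i \
        -e 's|^\([[:space:]]*\)User .*|\1User www|' \
        -e 's|^\([[:space:]]*\)Group .*|\1Group www|' \
        -e 's|^\([[:space:]]*\)DocumentRoot .*|\1DocumentRoot "/home/www/html"|' \
        -e 's|^\([[:space:]]*\)ScriptAlias /cgi-bin/ .*|\1ScriptAlias /cgi-bin/ "/home/www/cgi-bin/"|' \
        "$conf"
}

# Usage on a copy first, so the live file stays untouched until reviewed:
#   cp /opt/apache/conf/httpd.conf /tmp/httpd.conf.new
#   apply_httpd_settings /tmp/httpd.conf.new
#   diff /opt/apache/conf/httpd.conf /tmp/httpd.conf.new
```

Working on a copy and reading the diff keeps the scripted edit as easy to review as the manual one.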

/etc/init.d/httpd start

iptables -P INPUT ACCEPT
iptables -F
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -A INPUT -i eth0 -p icmp -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
iptables -L
/sbin/service iptables save
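
The per-port ACCEPT rules all follow the same pattern, so they can be generated from a list of ports. A small sketch, where `print_accept_rules` is a hypothetical helper that only prints the commands for review instead of running them:

```shell
#!/bin/bash
# Sketch: generate the per-port ACCEPT rules from a list of ports.
# print_accept_rules is a hypothetical helper; it prints the iptables
# commands and changes nothing on the system.
print_accept_rules() {
    local port
    for port in "$@"; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
}

print_accept_rules 22 80 443
```

Once the printed rules look right, the output can be piped to `sh` as root to apply them.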


cd ~/
wget -c http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm
yum install hadoop-conf-pseudo

Install Java:

cd /opt/
wget -c http://blog.quidquid.fr/jdk/jdk-7u51-linux-x64.tar.gz
tar zxf /opt/jdk-7u51-linux-x64.tar.gz
chown -R root:root /opt/jdk1.7.0_51
ln -s /opt/jdk1.7.0_51 /opt/jdk

Add JAVA_HOME to the environment variables:
cat >> ~/.bashrc << "EOF"
# -------------------------
export JAVA_HOME=/opt/jdk
EOF

cat >> /etc/bashrc << "EOF"
# -------------------------
export JAVA_HOME=/opt/jdk
EOF
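
Because `cat >>` appends every time it runs, replaying these commands would stack duplicate export lines. One possible guard, sketched with a hypothetical helper `ensure_line`, appends a line only when the file does not already contain it:

```shell
#!/bin/bash
# Sketch: append a line to a file only when it is not already there,
# so re-running the install commands does not duplicate the export.
# ensure_line is a hypothetical helper name.
ensure_line() {
    local file=$1 line=$2
    # -x: whole-line match, -F: literal string, -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage:
#   ensure_line ~/.bashrc 'export JAVA_HOME=/opt/jdk'
#   ensure_line /etc/bashrc 'export JAVA_HOME=/opt/jdk'
```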

Append VMCLOUDERA to the end of each line:
vi /etc/hosts
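
The same edit can be sketched without an editor. `append_host_alias` below is a hypothetical helper; it assumes that comment and blank lines should be left alone and that a line already carrying the alias should not get it twice.

```shell
#!/bin/bash
# Sketch: append a host alias to every address line of a hosts file.
# append_host_alias is a hypothetical helper. It skips blank lines,
# comments, and lines that already list the alias.
append_host_alias() {
    local file=$1 name=$2
    awk -v a="$name" '
        /^[ \t]*(#|$)/ { print; next }                        # comments, blanks
        { for (i = 2; i <= NF; i++) if ($i == a) { print; next } }
        { print $0 " " a }
    ' "$file" > "$file.new" &&
    cat "$file.new" > "$file" &&    # rewrite in place, keeping owner and mode
    rm -f "$file.new"
}

# Usage (as root, after taking a backup):
#   cp /etc/hosts /etc/hosts.bak
#   append_host_alias /etc/hosts VMCLOUDERA
```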

Run sudoedit /etc/sudoers and add:
hdfs	ALL=(ALL)	ALL

Log in as hdfs:
su - hdfs
hdfs namenode -format

vi /etc/hadoop/conf.pseudo/hadoop-env.sh
export JAVA_HOME=/opt/jdk

Start everything:
for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done

sudo service hadoop-hdfs-namenode start
sudo service hadoop-hdfs-secondarynamenode start
sudo service hadoop-hdfs-datanode start
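
The loop relies on the output of `ls`. An alternative sketch uses a shell glob, with a hypothetical helper `list_services` that takes the directory and prefix as parameters so it can be tried against any folder:

```shell
#!/bin/bash
# Sketch: list init scripts by glob instead of parsing "ls" output.
# list_services is a hypothetical helper; it prints one script name per line.
list_services() {
    local dir=$1 prefix=$2 path
    for path in "$dir"/"$prefix"*; do
        [ -e "$path" ] || continue   # the glob matched nothing
        basename "$path"
    done
}

# Usage (as a sudo-capable user):
#   for svc in $(list_services /etc/init.d hadoop-hdfs-); do
#       sudo service "$svc" start
#   done
```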

3. Optional: Start services on boot
sudo chkconfig hadoop-hdfs-namenode on
sudo chkconfig hadoop-hdfs-secondarynamenode on
sudo chkconfig hadoop-hdfs-datanode on

Step 3: Create the /tmp Directory

Remove the old /tmp if it exists:

sudo -u hdfs hadoop fs -rm -r /tmp

Create a new /tmp directory and set permissions:

sudo -u hdfs hadoop fs -mkdir /tmp
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp

Step 4: Create Staging and Log Directories

Create the staging directory and set permissions:

sudo -u hdfs hadoop fs -mkdir /tmp/hadoop-yarn/staging
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp/hadoop-yarn/staging

Create the done_intermediate directory under the staging directory and set permissions:

sudo -u hdfs hadoop fs -mkdir /tmp/hadoop-yarn/staging/history/done_intermediate
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp/hadoop-yarn/staging/history/done_intermediate

Change ownership on the staging directory and subdirectory:

sudo -u hdfs hadoop fs -chown -R mapred:mapred /tmp/hadoop-yarn/staging

Create the /var/log/hadoop-yarn directory and set ownership:

sudo -u hdfs hadoop fs -mkdir /var/log/hadoop-yarn
sudo -u hdfs hadoop fs -chown yarn:mapred /var/log/hadoop-yarn

Step 5: Verify the HDFS File Structure:

Run the following command:

$ sudo -u hdfs hadoop fs -ls -R /

You should see the following directory structure:

drwxrwxrwt - hdfs supergroup 0 2014-04-25 11:29 /tmp
drwxr-xr-x - hdfs supergroup 0 2014-04-25 11:29 /tmp/hadoop-yarn
drwxrwxrwt - mapred mapred 0 2014-04-25 11:30 /tmp/hadoop-yarn/staging
drwxr-xr-x - mapred mapred 0 2014-04-25 11:30 /tmp/hadoop-yarn/staging/history
drwxrwxrwt - mapred mapred 0 2014-04-25 11:30 /tmp/hadoop-yarn/staging/history/done_intermediate
drwxr-xr-x - hdfs supergroup 0 2014-04-25 11:33 /var
drwxr-xr-x - hdfs supergroup 0 2014-04-25 10:53 /var/lib
drwxr-xr-x - hdfs supergroup 0 2014-04-25 11:33 /var/log
drwxr-xr-x - yarn mapred 0 2014-04-25 11:33 /var/log/hadoop-yarn
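
Rather than comparing the listing by eye, individual entries can be checked from a script. This is a sketch only: `check_hdfs_entry` is a hypothetical helper that reads the listing on stdin and assumes the column layout shown in this step (permissions, replication, owner, group, size, date, time, path).

```shell
#!/bin/bash
# Sketch: check one path in the output of "hadoop fs -ls -R /".
# check_hdfs_entry is a hypothetical helper. It succeeds only when the
# path is present with the expected permission string and owner.
check_hdfs_entry() {
    local want_perm=$1 want_owner=$2 want_path=$3
    awk -v perm="$want_perm" -v owner="$want_owner" -v path="$want_path" '
        $NF == path { found = 1; ok = ($1 == perm && $3 == owner) }
        END { if (found && ok) exit 0; exit 1 }
    '
}

# Usage:
#   sudo -u hdfs hadoop fs -ls -R / > /tmp/hdfs-listing.txt
#   check_hdfs_entry drwxrwxrwt hdfs /tmp < /tmp/hdfs-listing.txt && echo "/tmp ok"
```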

Step 6: Start YARN

sudo service hadoop-yarn-resourcemanager start
sudo service hadoop-yarn-nodemanager start
sudo service hadoop-mapreduce-historyserver start

sudo chkconfig hadoop-yarn-resourcemanager on
sudo chkconfig hadoop-yarn-nodemanager on
sudo chkconfig hadoop-mapreduce-historyserver on

sudo -u hdfs hadoop fs -mkdir /user
sudo -u hdfs hadoop fs -mkdir /user/clouduser
sudo -u hdfs hadoop fs -chown clouduser /user/clouduser

Testing that everything works

useradd -g users clouduser
passwd clouduser
su - clouduser
hadoop fs -mkdir input
hadoop fs -put /etc/hadoop/conf/*.xml input

hadoop fs -ls input
Found 3 items:
-rw-r--r-- 1 clouduser users 1348 2014-04-25 11:42 input/core-site.xml
-rw-r--r-- 1 clouduser users 1913 2014-04-25 11:42 input/hdfs-site.xml
-rw-r--r-- 1 clouduser users 1001 2014-04-25 11:42 input/mapred-site.xml

Set HADOOP_MAPRED_HOME for user clouduser:

export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce

Run an example Hadoop job to grep with a regular expression in your input data.

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input output23 'dfs[a-z.]+'

After the job completes, you can find the output in the HDFS directory named output23 because you specified that output directory to Hadoop.

$ hadoop fs -ls
Found 2 items
drwxr-xr-x - clouduser users 0 2014-04-25 11:45 /user/clouduser/input
drwxr-xr-x - clouduser users 0 2014-04-25 11:45 /user/clouduser/output23

You can see that there is a new directory called output23.
List the output files.

$ hadoop fs -ls output23
Found 2 items
drwxr-xr-x - clouduser users 0 2014-04-25 11:45 /user/clouduser/output23/_SUCCESS
-rw-r--r-- 1 clouduser users 1068 2014-04-25 11:45 /user/clouduser/output23/part-r-00000

Read the results in the output file.

hadoop fs -cat output23/part-r-00000 | head
1 dfs.safemode.min.datanodes
1 dfs.safemode.extension
1 dfs.replication
1 dfs.permissions.enabled
1 dfs.namenode.name.dir
1 dfs.namenode.checkpoint.dir
1 dfs.datanode.data.dir
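
The example job counts how often each match of the regular expression occurs in the input files. A rough local approximation with standard tools can serve as a sanity check on the result. `count_matches` is a hypothetical helper, and it assumes a grep that supports `-o` (GNU or BSD):

```shell
#!/bin/bash
# Sketch: local approximation of the Hadoop "grep" example job.
# count_matches is a hypothetical helper; it prints "<count> <match>"
# lines, most frequent first, for every regex match in the given files.
count_matches() {
    local regex=$1
    shift
    grep -ohE -- "$regex" "$@" | sort | uniq -c | sort -k1,1nr -k2 |
        awk '{ print $1, $2 }'
}

# Usage:
#   count_matches 'dfs[a-z.]+' /etc/hadoop/conf/*.xml
```

Run against the same XML files that were uploaded to the input directory, it should report the same property names as the job output.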

iptables -A INPUT -p tcp --dport 631 -j ACCEPT
iptables -A INPUT -p tcp --dport 8031 -j ACCEPT
iptables -A INPUT -p tcp --dport 8042 -j ACCEPT
iptables -A INPUT -p tcp --dport 8080 -j ACCEPT
iptables -A INPUT -p tcp --dport 8088 -j ACCEPT
/sbin/service iptables save

Scala – Scalatra – Salat – MongoDB : test drive


Hello there, a little test with Scala (2.9.0-1) for the backend services, Scalatra for the REST part, and Salat and MongoDB for the persistence layer. I will use sbt (0.10) for project management, and you should use the IntelliJ IDEA IDE (what else?).

What does this test do?

It exposes a REST API (and its documentation) to play with a user (login, email, password). The API is deliberately simplified so that the concepts are easy to follow.
The objective is to:
– add a user,
– get a user (by ID or login),
– delete a user,
– have the API documentation online for each method

Scala – Scalatra – Salat – MongoDB : discovery


Hello, a small test with Scala (2.9.0-1) for the service layer, Scalatra for the REST front end, and Salat and MongoDB for data persistence. I will use sbt (0.10) for project management and IntelliJ IDEA as the IDE (what else?).

What will we see here?

We will expose a REST API (and its associated documentation) to create users (login, email, password). The API is extremely simple, so that it is easy to understand what is going on.
The goal is to be able to:
– create a user,
– find a user (by ID or login),
– delete a user,
– view the online documentation for each API method

CentOS 6 – Install Voldemort


Voldemort is a NoSQL database, a distributed key/value storage system... very roughly speaking, it is a hashmap (for those coming from the Java world).

URLs of the various resources:

Install CentOS 6 (net install, x86_64) as a "Basic server".