Monday, February 21, 2011

R on Amazon EMR

Seems like I found an opening.

This tutorial seems to be very relevant and useful to me.

If I intend to work on the cluster from R, I need to tell R where the cluster is sitting.

Maybe this is the way:

http://jeffreybreen.wordpress.com/

Thursday, February 17, 2011

ssh tunneling blues

I have been wrestling with ssh tunneling issues, and for the first time I have been able to set up an ssh session with the remote server.
The nomenclature is explained in this beautiful tutorial.

The real trick came from this tutorial and another tutorial.

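Run the following on myPC:
$ ssh -t userid@gateway ssh remoteserver

The -t flag forces a pseudo-terminal on the gateway, so the second ssh gets an interactive session.

I still have to figure out how to do sftp through an ssh tunnel.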

Wednesday, February 16, 2011

Hadoop vs Hadoop Streaming

What is the difference between Hadoop and Hadoop streaming?

A native Hadoop job is written in Java: the mapper, the reducer, and typically the job configuration are all defined in one program and compiled together into a single jar.
Hadoop Streaming is a utility that ships with Hadoop. With it, any executable that reads from stdin and writes to stdout can serve as the mapper or the reducer, and the other job details (conf etc.) are specified externally on the command line.
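
To make the streaming idea concrete, here is a minimal word-count sketch in Python. It is only an illustration: the function names (mapper, reducer, run_locally) and the sample input are made up for this note, and the whole thing runs in one process instead of on a cluster. In a real streaming job the mapper and the reducer would be two separate scripts that read stdin and print to stdout, passed to the streaming jar through its -mapper and -reducer options, with Hadoop doing the sort in between.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts per key. Input must already be sorted by key,
    which is what Hadoop provides to a streaming reducer."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_locally(lines):
    """Simulate map -> sort -> reduce in one process (hypothetical helper,
    a stand-in for piping input through mapper, sort, and reducer)."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for record in run_locally(["Hadoop streaming", "hadoop jobs"]):
        print(record)
```

The point of the sketch is that neither function knows anything about Hadoop: they only consume lines and produce tab-separated lines, which is why a mapper or reducer "of any kind" works with streaming.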

Tuesday, February 15, 2011

algorithmic and algorithm

The algorithm environment wraps algorithmic, and it can be styled as boxed, ruled, or plain. However, the style is chosen once, as an option when the algorithm package is loaded, so the setting is global to the whole document.


\usepackage[boxed]{algorithm} % package options go in the square brackets before the name
\usepackage{algorithmic}

and then, whenever you typeset an algorithm:

\begin{algorithm}
\caption{XOXO}
\label{XOXO}
\begin{algorithmic}[1]
.
.
.

\end{algorithmic}
\end{algorithm}

Monday, February 14, 2011

Finally figured out the svn puzzle

Today is my lucky day. SVN has finally shown me some love. This is what I had always wanted to do, and what it took:

1) Keep a data folder "research" backed up and available for checkout from either the lab computer or the home computer.
2) Create two users on the server machine (where you run the svnadmin commands).
3) From either the lab computer or the home machine, use svn co svn+ssh://@/pathtorepos to check out files.
4) The passwd file in /pathtorepos/conf/passwd is of no use here; this file really got me twisted. With svn+ssh, authentication is handled by ssh, so no matter what this file contains, each user who wants to check out data needs a login on the server machine.
5) Last but not least: once the "research" folder is imported, it has to be checked back out on the machine where modifications will be made, because an import does not turn the original folder into a working copy.

Sunday, February 13, 2011

Change the float page fraction


\renewcommand{\dbltopfraction}{0.9} % fit big float above 2-col. text
\renewcommand{\textfraction}{0.07} % allow minimal text w. figs
% Parameters for FLOAT pages (not text pages):
\renewcommand{\floatpagefraction}{0.7} % require fuller float pages
% N.B.: floatpagefraction MUST be less than topfraction !!
\renewcommand{\dblfloatpagefraction}{0.7} % require fuller float pages

Friday, February 11, 2011

svn version control tips and pitfalls

Here is a step-by-step procedure for the following setup.
1) An svn repository is to be created at /svnrepos on the machine labpc (IP address lab_ipaddress), where your login name is mylogin.

The data to be imported is in /media/data.

There are two users on this machine, lab_user and home_user. They modify or update the repository from two different locations, labpc and homepc.

All computers run Ubuntu.

2) Web access is preferred.

I will update the post on web access later. For now, here are the instructions for setting up the repository.

On labpc
1) Run
$ svnadmin create /svnrepos
$ svn import /media/data file:///svnrepos -m "Initial import"
(svn import needs a repository URL as its target, not a bare path.)
2) Change /svnrepos/conf/svnserve.conf to look like this:

[general]
anon-access = none
auth-access = write
password-db = passwd

3) Modify the password file /svnrepos/conf/passwd so that its [users] section reads:

User1 = passw1
User2 = pass2

4) On homepc, edit /etc/hosts and add the following line:

lab_ipaddress mysvn.server.purdue.edu

The name mysvn.server.purdue.edu can be changed to anything.

5) On homepc, add the following lines to .ssh/config:

Host mysvn.server.purdue.edu
User mylogin
Port some_number

6) Finally, try
$ svn list svn+ssh://mylogin@mysvn.server.purdue.edu/svnrepos

Common pitfalls

1) Lines in svnserve.conf must have no leading spaces.
2) The port number specified in step 5 should not be one that is commonly used by another service.

More to come later.

LaTeX table of figures

\begin{figure}
\centering
\begin{tabular}{cc}
\begin{minipage}[c]{0.5\linewidth}
\epsfig{file=myimage.eps,width=\linewidth}
\caption{Image 1}
\end{minipage} &
\begin{minipage}[c]{0.5\linewidth}
\epsfig{file=edgeimage2.eps,width=\linewidth}
\caption{Image 2}
\end{minipage} \\
\begin{minipage}[c]{0.5\linewidth}
\epsfig{file=out3.eps,width=\linewidth}
\caption{Image 3}
\end{minipage} &
\begin{minipage}[c]{0.5\linewidth}
\epsfig{file=out.eps,width=\linewidth}
\caption{Image 4}
\end{minipage} \\
\end{tabular}
\end{figure}