=== modified file 'db-pom.xml'
--- db-pom.xml 2012-03-27 09:18:56 +0000
+++ db-pom.xml 2012-04-09 08:21:39 +0000
@@ -4,7 +4,7 @@
 org.hisp.dhis
 dhis-documentation-docbook
 DHIS2 Documentation
- 2.7-SNAPSHOT
+ 2.8-SNAPSHOT
 DHIS 2 Documentation
 pom

=== modified file 'src/docbkx/en/dhis2_r.xml'
--- src/docbkx/en/dhis2_r.xml 2012-03-21 20:20:36 +0000
+++ src/docbkx/en/dhis2_r.xml 2012-04-09 08:21:39 +0000
@@ -7,11 +7,45 @@
 R is freely available, open source statistical computing environment. R refers to both the computer programming language, as well as the software which can be used to create and run R scripts. There are numerous sources on the web which describe the extensive set of features of R.
 R is a natural extension to DHIS2, as it provides powerful statistical routines, data manipulation functions, and visualization tools. This chapter will describe how to setup R and DHIS2 on the same server, and will provide a simple example of how to retrieve data from the DHIS2 database into an R data frame and perform some basic calculations.
+
+ Installing R
+ If you are installing R on the same server as DHIS, you should consider using the Comprehensive R Archive Network (CRAN) to get the latest distribution of R. All you need to do is to add the following line to your /etc/apt/sources.list file.
+ deb <your R mirror>/bin/linux/ubuntu <your Ubuntu distribution>
+ You will need to replace <your R mirror> with one from the list available here. You will also need to replace <your Ubuntu distribution> with the name of the distribution you are using.
+ Once you have done this, invoke the following commands:
+ sudo apt-get update
+gpg --keyserver pgp.mit.edu --recv-keys 51716619E084DAB9
+gpg --armor --export 51716619E084DAB9 | sudo apt-key add -
+sudo apt-get install r-base r-cran-dbi
+ At this point, you should have a functional R installation on your machine.
+ Next, let's see if everything is working by simply invoking R from the command line.
+ foo@bar:~$ R
+
+R version 2.14.1 (2011-12-22)
+Copyright (C) 2011 The R Foundation for Statistical Computing
+ISBN 3-900051-07-0
+Platform: i686-pc-linux-gnu (32-bit)
+
+R is free software and comes with ABSOLUTELY NO WARRANTY.
+You are welcome to redistribute it under certain conditions.
+Type 'license()' or 'licence()' for distribution details.
+
+R is a collaborative project with many contributors.
+Type 'contributors()' for more information and
+'citation()' on how to cite R or R packages in publications.
+
+Type 'demo()' for some demos, 'help()' for on-line help, or
+'help.start()' for an HTML browser interface to help.
+Type 'q()' to quit R.
+
+>
+
+
 Using ODBC to retrieve data from DHIS2 into R
- In this example, we will use a system-wide ODBC connector which will be used to retrieve data from the DHIS2 database. There are some disadvantages with this approach, as ODBC is slower than other methods and it does raise some security concerns by providing a system-wide connector to all users. However, it is a convenient method to provide a connection to multiple users. The use of the R package RODBC will be used in this case. Other alternatives would be the use of the RPostgreSQL package, which can interface directly through the Postgresql driver.
- First, we will install R and some other required and useful packages. Invoke the following command:
- apt-get install r-base r-cran-odbc r-cran-lattice odbc-postgresql
+ In this example, we will use a system-wide ODBC connector which will be used to retrieve data from the DHIS2 database. There are some disadvantages with this approach, as ODBC is slower than other methods and it does raise some security concerns by providing a system-wide connector to all users. However, it is a convenient method to provide a connection to multiple users. The use of the R package RODBC will be used in this case. Other alternatives would be the use of the RPostgreSQL package, which can interface directly through the Postgresql driver described in
+ Assuming you have already installed R using the procedure in the previous section, invoke the following command to add the required libraries for this example.
+ apt-get install r-cran-odbc r-cran-lattice odbc-postgresql
 Next, we need to configure the ODBC connection. Edit the file to suit your local situation using the following template as a guide. Lets create and edit a file called odbc.ini
 [dhis2]
 Description = DHIS2 Database
@@ -33,30 +67,6 @@
 Debug = 0
 Finally, we need to install the ODBC connection with
 odbcinst -i -d -f odbc.ini
- Next, lets execute R and see if the ODBC connection is working. Invoke R from the command line.
- foo@bar:~$ R
-
-R version 2.14.1 (2011-12-22)
-Copyright (C) 2011 The R Foundation for Statistical Computing
-ISBN 3-900051-07-0
-Platform: i686-pc-linux-gnu (32-bit)
-
-R is free software and comes with ABSOLUTELY NO WARRANTY.
-You are welcome to redistribute it under certain conditions.
-Type 'license()' or 'licence()' for distribution details.
-
-R is a collaborative project with many contributors.
-Type 'contributors()' for more information and
-'citation()' on how to cite R or R packages in publications.
-
-Type 'demo()' for some demos, 'help()' for on-line help, or
-'help.start()' for an HTML browser interface to help.
-Type 'q()' to quit R.
-
-[Previously saved workspace restored]
-
->
-
 This shows that R is working properly. From the R prompt, execute the following commands to connect to the DHIS2 database.
 > library(RODBC)
@@ -76,7 +86,7 @@
 10 Deaths of malaria case provided with anti-malarial treatment 1 to 5 Years
 >
- It seems R is able to retreive data from the DHIS2 database. As an illustrative example, lets say we have been asked to calculate the relative percentage of OPD male and female under 5 attendances for the last twelve months.First, lets create an SQL query which will provide us the basic information which will be required.
+ It seems R is able to retrieve data from the DHIS2 database. As an illustrative example, let's say we have been asked to calculate the relative percentage of OPD male and female under 5 attendances for the last twelve months. First, let's create an SQL query which will provide us the basic information which will be required.
 OPD<-sqlQuery(channel,"SELECT p.startdate, de.name as de, sum(dv.value::double precision)
 FROM datavalue dv
 INNER JOIN period p on dv.periodid = p.periodid