Interactive Heat Maps for R

In any statistical analysis, the first step should be to visualise the data before doing any modelling. In microarray studies, a common visualisation is a heatmap of gene expression data.

In this post I simulate some gene expression data and visualise it using the heatmaply package in R by Tal Galili. This package extends the plotly engine to heatmaps, allowing you to inspect individual values of the data matrix by hovering the mouse over a cell. You can also zoom into a region of the heatmap by drawing a rectangle over an area of your choice.

The following function simulates data from a multivariate normal distribution and allows exposure-dependent correlations between the genes. You will need the mvrnorm function from the MASS package to run this function:
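A minimal sketch of such a simulation, assuming a binary exposure that splits the subjects into two equal groups and a single common correlation between genes within each group. The function name `simulate_expression`, its arguments, and the default correlation values are hypothetical choices for illustration, not the original implementation.

```r
library(MASS)  # provides mvrnorm()

# Hypothetical helper: simulate n subjects (rows) by p genes (columns).
# Unexposed subjects get pairwise gene correlation rho_unexposed,
# exposed subjects get rho_exposed.
simulate_expression <- function(n = 100, p = 50,
                                rho_unexposed = 0.2, rho_exposed = 0.8,
                                seed = 1234) {
  set.seed(seed)
  n0 <- n %/% 2   # number of unexposed subjects
  n1 <- n - n0    # number of exposed subjects

  # Correlation matrix with 1 on the diagonal and rho everywhere else
  make_sigma <- function(rho) {
    sigma <- matrix(rho, nrow = p, ncol = p)
    diag(sigma) <- 1
    sigma
  }

  x0 <- mvrnorm(n0, mu = rep(0, p), Sigma = make_sigma(rho_unexposed))
  x1 <- mvrnorm(n1, mu = rep(0, p), Sigma = make_sigma(rho_exposed))

  x <- rbind(x0, x1)
  rownames(x) <- paste0("Subject", seq_len(n))
  colnames(x) <- paste0("Gene", seq_len(p))

  list(X = x, exposure = rep(c(0, 1), times = c(n0, n1)))
}

sim <- simulate_expression()
dim(sim$X)  # subjects by genes
```

The returned list keeps the exposure indicator next to the data matrix so that the true grouping can later be compared with whatever the clustering finds.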

Next we need to install and load the heatmaply package (this also requires the devtools package):
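One way this step might look is sketched below. It assumes a CRAN release of heatmaply is available. The commented-out line shows a development install through devtools, where the repository path `talgalili/heatmaply` is an assumption.

```r
# Install from CRAN only if the package is not already present
# (assumption: a CRAN release is available).
if (!requireNamespace("heatmaply", quietly = TRUE)) {
  install.packages("heatmaply")
}

# Alternative: development version via devtools
# (assumption: the repository path is talgalili/heatmaply).
# devtools::install_github("talgalili/heatmaply")

library(heatmaply)
```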

The syntax is extremely simple. Let's plot a few different interactive heatmaps of the data. We first plot the expression matrix and specify how many groups we want for the rows (subjects) and columns (genes) of the data. In this case we specify 2 groups for both the rows and columns:
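A sketch of that call, using the `k_row` and `k_col` arguments of `heatmaply()` to request two groups in each dendrogram. The matrix `X` here is a small stand-in made of independent normal draws so that the snippet runs on its own. It is not the simulated data set described above.

```r
library(heatmaply)

set.seed(1)
# Stand-in data: 40 subjects (rows) by 20 genes (columns)
X <- matrix(rnorm(40 * 20), nrow = 40,
            dimnames = list(paste0("Subject", 1:40),
                            paste0("Gene", 1:20)))

# k_row / k_col set how many groups are highlighted in the
# row and column dendrograms
p <- heatmaply(X, k_row = 2, k_col = 2)
p  # printing the object opens the interactive heatmap
```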

We see that the clustering algorithm does not cluster the subjects properly: we would have expected to see two equal-sized groups in the dendrogram on the y-axis.
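The group sizes can also be checked numerically rather than read off the dendrogram. The sketch below uses base R only and assumes Euclidean distance with complete linkage, the defaults of `dist()` and `hclust()`. Whether the heatmap uses the same settings is an assumption. `X` is again stand-in data.

```r
set.seed(1)
# Stand-in data: 40 subjects (rows) by 20 genes (columns)
X <- matrix(rnorm(40 * 20), nrow = 40)

# Hierarchical clustering of the subjects
hc <- hclust(dist(X))

# Cut the tree into 2 groups and count the subjects in each
groups <- cutree(hc, k = 2)
table(groups)
```

Two counts that are far from equal would confirm what the dendrogram suggests.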

Let’s plot the correlation matrix of the genes:
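A sketch of this step: compute the gene-by-gene correlation matrix with `cor()` and pass it to `heatmaply()`. The matrix `X` is stand-in data, and the choice of two groups per dendrogram simply mirrors the earlier plot.

```r
library(heatmaply)

set.seed(1)
# Stand-in data: 40 subjects (rows) by 20 genes (columns)
X <- matrix(rnorm(40 * 20), nrow = 40,
            dimnames = list(paste0("Subject", 1:40),
                            paste0("Gene", 1:20)))

# cor() works column-wise, so this is a 20 x 20 gene-gene matrix
gene_cor <- cor(X)

p_cor <- heatmaply(gene_cor, k_row = 2, k_col = 2)
p_cor
```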