EpiMethEx (Epigenetic Methylation and Expression) is an R package that performs large-scale integrated analysis through cyclic correlation analyses between methylation and gene expression data.
The EpiMethEx package is under submission to Bioconductor (https://github.com/Bioconductor/Contributions/issues). To use it, follow the instructions below.
R CMD build EpiMethEx
install.packages(path_to_file, repos = NULL, type="source")
Annotations <- data.frame(
ID = c("cg11663302","cg01552731", "cg09081385"),
Relation_to_UCSC_CpG_Island = c("Island","N_Shore","N_Shore"),
UCSC_CpG_Islands_Name = c("chr1:18023481-18023792","chr19:46806998-46807617",
"chr12:120972167-120972447"),
UCSC_RefGene_Accession = c("NM_001011722","NM_152794","NM_014868"),
Chromosome_36 = c("1","19","12"),
Coordinate_36 = c("17896255","51498747","119456453"),
UCSC_RefGene_Name = c("ARHGEF10L","HIF3A","RNF10"),
UCSC_RefGene_Group =c("Body","1stExon","TSS200"),
stringsAsFactors=FALSE)
Expressions <- data.frame(
'sample' = c("ARHGEF10L", "HIF3A", "RNF10"),
'TCGA-YD-A89C-06' = c(-0.746592469762, -0.753826336325, 0.4953280),
'TCGA-Z2-AA3V-06' = c(0.578807530238, -2.30662633632, 0.1023280),
'TCGA-EB-A3Y6-01' = c(-0.363492469762, -2.67922633632, -0.6147720),
'TCGA-EE-A3JA-06' = c(-2.97279246976, -3.61932633632, 0.02932801),
'TCGA-D9-A4Z2-01' = c(-0.128492469762, 0.679073663675, 0.4017280),
'TCGA-D3-A51G-06' = c(-0.4299925, -4.0626263, -1.0136720),
stringsAsFactors=FALSE)
Methylation <- data.frame(
'sample' = c("cg11663302", "cg01552731", "cg09081385"),
'TCGA-YD-A89C-06' = c(0.9856, 0.7681, 0.0407),
'TCGA-Z2-AA3V-06' = c(0.9863, 0.8551, 0.0244),
'TCGA-EB-A3Y6-01' = c(0.9876, 0.6473, 0.028),
'TCGA-EE-A3JA-06' = c(0.9826, 0.4587, 0.0343),
'TCGA-D9-A4Z2-01' = c(0.9881, 0.8509, 0.0215),
'TCGA-D3-A51G-06' = c(0.9774, 0.813, 0.0332),
stringsAsFactors=FALSE)
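One detail worth knowing when building these tables by hand: with its default `check.names = TRUE`, `data.frame()` rewrites column names that are not syntactically valid R names, so the hyphens in the TCGA barcodes become dots. The sketch below shows this on a two-probe table and also checks that the probe IDs line up between the annotation and methylation tables. Whether EpiMethEx expects dotted or hyphenated sample names is not stated here, so treat the `check.names = FALSE` variant as an option rather than a requirement. The names `meth_default`, `meth_verbatim`, `annot`, and `missing_probes` are illustrative only.

```r
# Default behaviour: hyphens in column names are replaced by dots
meth_default <- data.frame(
  'sample' = c("cg11663302", "cg01552731"),
  'TCGA-YD-A89C-06' = c(0.9856, 0.7681),
  stringsAsFactors = FALSE)
print(colnames(meth_default))

# check.names = FALSE keeps the barcodes exactly as typed
meth_verbatim <- data.frame(
  'sample' = c("cg11663302", "cg01552731"),
  'TCGA-YD-A89C-06' = c(0.9856, 0.7681),
  stringsAsFactors = FALSE, check.names = FALSE)
print(colnames(meth_verbatim))

# Probes present in the methylation table but absent from the annotations
annot <- data.frame(
  ID = c("cg11663302", "cg01552731"),
  UCSC_RefGene_Name = c("ARHGEF10L", "HIF3A"),
  stringsAsFactors = FALSE)
missing_probes <- setdiff(meth_default$sample, annot$ID)
print(missing_probes)
```

An empty `missing_probes` means every methylation probe has an annotation row.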
4.1 or use the “curatedTCGAData” package:
# Install curatedTCGAData from Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("curatedTCGAData")
library(curatedTCGAData)
library(MultiAssayExperiment)
Methylation <- curatedTCGAData(diseaseCode = "SKCM", assays = "Methylation", dry.run = FALSE)
Expressions <- curatedTCGAData(diseaseCode = "SKCM", assays = "RNASeq2GeneNorm", dry.run = FALSE)
Note that curatedTCGAData does not provide the Annotations dataset, so it must be loaded manually from a CSV file or created ad hoc.
4.2 or use CSV files:
Expressions <- read.csv2("Expressions.csv", header = TRUE, sep = ";", stringsAsFactors = FALSE)
Annotations <- read.csv2("Annotations.csv", header = TRUE, sep = ";", stringsAsFactors = FALSE)
Methylation <- read.csv2("Methylation.csv", header = TRUE, sep = ";", stringsAsFactors = FALSE)
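`read.csv2()` assumes semicolon-separated fields and, by default, a comma as the decimal mark; `write.csv2()` writes the same convention. A minimal round trip through a temporary file shows the layout these calls expect. The file and the column name `S1` are throwaway stand-ins, not files or names shipped with the package.

```r
# Tiny stand-in for a methylation table (two probes, one sample column)
meth <- data.frame(
  sample = c("cg11663302", "cg01552731"),
  S1 = c(0.9856, 0.7681),
  stringsAsFactors = FALSE)

# Write with ';' as separator and ',' as decimal mark
csv_path <- tempfile(fileext = ".csv")
write.csv2(meth, csv_path, row.names = FALSE)

# Raw lines as they appear on disk
print(readLines(csv_path))

# Read back with the same call shape used above
meth_back <- read.csv2(csv_path, header = TRUE, sep = ";", stringsAsFactors = FALSE)
str(meth_back)
```

If a file uses `.` as the decimal mark instead, it is safer to pass `dec = "."` explicitly and then confirm with `str()` that the value columns came back numeric.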
library(EpiMethEx)
epimethex.analysis(Expressions, Annotations, Methylation, 1, 3, 2,TRUE, TRUE, FALSE)
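For intuition about the kind of quantity being correlated, the base-R sketch below computes a single Pearson correlation between one probe's methylation values and its gene's expression values across samples, using the cg11663302 / ARHGEF10L rows from the example data. It illustrates one gene-probe pair only and is not EpiMethEx's internal procedure, which is not described here.

```r
# Expression of ARHGEF10L and methylation of cg11663302,
# in the same sample order as the example data frames
expr <- c(-0.746592469762, 0.578807530238, -0.363492469762,
          -2.97279246976, -0.128492469762, -0.4299925)
meth <- c(0.9856, 0.9863, 0.9876, 0.9826, 0.9881, 0.9774)

# Pearson correlation and its significance test for this one pair
res <- cor.test(meth, expr, method = "pearson")
print(res$estimate)
print(res$p.value)
```

With only six samples the p-value is of limited use; the point is the shape of the computation, which a package would repeat across many gene-probe pairs.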
To further evaluate the biological significance of the methylation hotspots involved in gene regulation mechanisms, it is possible to filter the EpiMethEx output with an additional R script (https://github.com/giupardeb/EpiMethEx-Filter).