16S rDNA V3-V4 amplicon sequencing analysis using dada2, phyloseq, LEfSe, picrust2 and other tools. Demo: https://ycl6.github.io/16S-Demo/
This GitHub repository includes code and scripts that demonstrate the use of dada2 and phyloseq (and associated tools and R packages) to analyze 16S rDNA amplicon sequencing data. A working example is included in the example folder.
Disclaimer:
Do not use any of the provided code and scripts in production without fully understanding their contents. I am not responsible for errors or omissions, or for any consequences arising from use of the contents, and make no warranty with respect to the currency, completeness, or accuracy of the contents of this GitHub repository.
I strongly recommend using conda to manage the software and R packages required for this analysis.
A separate environment is needed for export2graphlan and graphlan, as they do not support Python 3.
If picrust2 2.4 or higher cannot be installed in the main environment with Python 3 due to conflicts, you will need to create a separate environment to run the latest picrust2.
Use conda activate my_env to activate an environment named my_env, and conda deactivate to deactivate an environment. Learn more about managing environments in the conda documentation.
# Create main "16S" env
# Remove `picrust2=2.4` from the list if it causes conflicts
conda create -n 16S -c conda-forge -c bioconda r-base=4.1 python=3 curl libcurl openssl boost-cpp \
r-doparallel r-devtools r-rcurl r-httr r-magick r-png r-ggplot2 r-data.table r-phangorn r-ape r-gridextra \
r-ggbeeswarm r-ggrepel r-vegan r-tidyverse r-gtools r-r.utils bioconductor-dada2 bioconductor-phyloseq \
bioconductor-decipher bioconductor-deseq2 bioconductor-shortread bioconductor-biostrings \
bioconductor-biomformat bioconductor-aldex2 cutadapt raxml raxml-ng lefse picrust2=2.4
# Use "conda activate 16S" to activate this environment
# Create "graphlan" env
conda create -n graphlan export2graphlan graphlan
# Use "conda activate graphlan" to activate this environment
# Create "picrust2" env if it causes conflicts when setting up the "16S" env
conda create -n picrust2 picrust2=2.4
# Use "conda activate picrust2" to activate this environment
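As a quick check after running the commands above, the following sketch reports which of the three environments exist. The helper name env_in_list is hypothetical (it is not part of conda or of this repository). It assumes that `conda env list` prints the environment name in the first column, and the loop is skipped if conda is not on the PATH.

```shell
# Hypothetical helper: succeeds if NAME is the first column of any
# non-comment line read from stdin (the format of `conda env list`).
env_in_list() {
    awk -v name="$1" '!/^#/ && $1 == name { found = 1 } END { exit !found }'
}

# Report the status of the environments created above (only if conda exists).
if command -v conda >/dev/null 2>&1; then
    for env in 16S graphlan picrust2; do
        if conda env list | env_in_list "$env"; then
            echo "$env: present"
        else
            echo "$env: missing"
        fi
    done
fi
```

An environment reported as missing can be created with the matching conda create command above.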
cutadapt in R
Usage: perl run_trimming.pl project_folder fastq_folder forward_primer_sequence reverse_primer_sequence
Example: perl run_trimming.pl PRJEB27564 raw CCTACGGGNGGCWGCAG GACTACHVGGGTATCTAATCC
cutadapt in terminal/console
cutadapt in R by using the system2 function
filterAndTrim
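For running cutadapt directly in the terminal, a minimal sketch of a paired-end call for a single sample is shown below. It is not taken from run_trimming.pl, whose internals are not shown here. The function name trim_primers, the sample file names and the trimmed output folder are hypothetical. The options used (-g, -G, -o, -p, --discard-untrimmed) are standard cutadapt options, but whether the Perl script uses the same ones is an assumption. By default the function only prints the command. Set DRY_RUN=0 to execute it.

```shell
# Hypothetical wrapper: build a paired-end cutadapt command for one sample.
# Arguments: forward primer, reverse primer, R1 file, R2 file, output folder.
trim_primers() {
    fwd=$1; rev=$2; r1=$3; r2=$4; outdir=$5
    # -g / -G: 5' primers on R1 / R2; --discard-untrimmed drops read pairs
    # in which a primer was not found.
    set -- cutadapt -g "$fwd" -G "$rev" --discard-untrimmed \
        -o "$outdir/$(basename "$r1")" -p "$outdir/$(basename "$r2")" \
        "$r1" "$r2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"      # dry run: print the command only
    else
        "$@"           # run cutadapt (the output folder must already exist)
    fi
}

# V3-V4 primers from the example above; file names are placeholders.
trim_primers CCTACGGGNGGCWGCAG GACTACHVGGGTATCTAATCC \
    raw/sample_R1.fastq.gz raw/sample_R2.fastq.gz trimmed
```

The degenerate bases in the primers (N, W, H, V) are IUPAC codes, which cutadapt treats as wildcards in adapter sequences.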
(if picrust2 is installed)
https://zenodo.org/record/4587955#.YNWax3VKhkY
Note: These files are intended for use in classifying prokaryotic 16S sequencing data and are not appropriate for classifying eukaryotic ASVs.
Install BLAST
conda install -c bioconda blast
or
sudo apt-get install ncbi-blast+
Download NCBI’s 16S rRNA BLAST DB
wget ftp://ftp.ncbi.nih.gov/blast/db/16S_ribosomal_RNA.tar.gz
tar zxf 16S_ribosomal_RNA.tar.gz
Convert the 16S_ribosomal_RNA BLAST DB into FASTA format (saved as 16SMicrobial.fa)
blastdbcmd -db 16S_ribosomal_RNA -entry all -out 16SMicrobial.fa
gzip 16SMicrobial.fa
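To confirm that the conversion produced a usable file, one option is to count the FASTA records in the compressed output. The sketch below is an optional sanity check, not a step from the original workflow. The function name count_fasta_records is hypothetical, and it assumes every record starts with a ">" header line.

```shell
# Hypothetical helper: count records (lines starting with ">") in a
# gzip-compressed FASTA file without writing an uncompressed copy.
count_fasta_records() {
    gzip -dc "$1" | grep -c '^>'
}

# Run the check only if the file created above is present.
if [ -f 16SMicrobial.fa.gz ]; then
    count_fasta_records 16SMicrobial.fa.gz
fi
```

A count of zero suggests that the blastdbcmd step did not find the database or wrote an empty file.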
http://evomics.org/phyloseq/taxa_summary-r/
Use gunzip taxa_summary.R.gz to extract the file.