This repository contains the Python package for Helical.
Helical provides a framework for state-of-the-art pre-trained bio foundation models on genomics and transcriptomics modalities.
Helical simplifies the entire application lifecycle when building with bio foundation models. You will be able to:
We will update this repo regularly with new models, benchmarks, modalities and functions, so stay tuned.
Let’s build the most exciting AI-for-Bio community together!
We have integrated TranscriptFormer into our helical package and have made a model card for it in our TranscriptFormer model folder. If you would like to test the model, take a look at our example notebook!
We’re thrilled to announce the release of our first-ever mRNA Bio Foundation Model, designed to:
Check out our blog post to learn more about our approach and read the model card to get started.
We recommend installing Helical within a conda environment. This step is optional; to set one up, run the commands below in your terminal:
conda create --name helical-package python=3.11.8
conda activate helical-package
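Once the environment is active, a quick way to confirm that it picked up the intended interpreter is to compare the running Python version against the one pinned above. The snippet below is a minimal sketch: the helper name `matches_recommended` is hypothetical, and treating the 3.11 series as the reference is an assumption based only on the `python=3.11.8` pin in the conda command; this README does not say which other versions are supported.

```python
import sys

# Assumption: 3.11 mirrors the `python=3.11.8` pin in the conda command above.
# Whether other Python versions also work is not stated in this README.
RECOMMENDED = (3, 11)


def matches_recommended(version_info=sys.version_info, recommended=RECOMMENDED):
    """Return True if the major.minor version equals the recommended one."""
    return tuple(version_info[:2]) == tuple(recommended)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if matches_recommended():
        print(f"Python {current} matches the recommended 3.11 series.")
    else:
        print(f"Python {current} differs from the recommended 3.11 series.")
```

Run it with the `helical-package` environment activated; a mismatch usually means a different interpreter is first on your `PATH`.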
To install the latest pip release of our Helical package, you can run the command below:
pip install helical
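To check that the installation succeeded, you can ask Python which version of the `helical` distribution is present in the current environment. This is a small sketch using only the standard library (`importlib.metadata`); the helper name `installed_version` is hypothetical and not part of the Helical package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    found = installed_version("helical")
    if found is None:
        print("helical is not installed in this environment.")
    else:
        print(f"helical {found} is installed.")
```

Reading the distribution metadata avoids importing the package itself, so the check stays fast even when heavy dependencies are installed.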
To install the latest development version of the Helical package directly from GitHub, you can run the command below:
pip install --upgrade git+https://github.com/helicalAI/helical.git
Alternatively, clone the repo and install it:
git clone https://github.com/helicalAI/helical.git
cd helical
pip install .
[Optional] To install mamba-ssm and causal-conv1d use the command below:
pip install "helical[mamba-ssm]"
or in case you’re installing from the Helical repo cloned locally:
pip install ".[mamba-ssm]"
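Because `causal_conv1d` needs `torch` to be present before it is installed (see the note on installation order in this section), it can help to confirm that `torch` is importable before running the command with `[mamba-ssm]`. The following is a minimal sketch using the standard library; the helper name `missing_modules` is hypothetical, and checking `helical` alongside `torch` is only an illustrative choice.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # `torch` is the prerequisite named in this README; including `helical`
    # here is an illustrative assumption.
    missing = missing_modules(["torch", "helical"])
    if missing:
        print("Install helical without the extra first; missing:", ", ".join(missing))
    else:
        print("torch is available; the [mamba-ssm] extra can be installed next.")
```

`find_spec` only locates the module without importing it, so the check does not load `torch`.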
Note: causal_conv1d requires torch to be installed already. Installing helical on its own first (without [mamba-ssm]) will install torch for you; a second installation (with [mamba-ssm]) then installs the packages correctly.
If you would like to run your code in a Singularity container, you can use the singularity.def file to build an Apptainer sandbox:
apptainer build --sandbox singularity/helical singularity.def
and then shell into the sandbox container (use the --nv flag if you have a GPU available):
apptainer shell --nv --fakeroot singularity/helical/
To run the examples, make sure the Helical package is installed (see Installation) and up to date.
You can browse the examples folder and download the script of your choice, consult our documentation for step-by-step guides, or clone the repository directly using:
git clone https://github.com/helicalAI/helical.git
Within the examples/notebooks folder, open the notebook of your choice. We recommend starting with Quick-Start-Tutorial.ipynb.
Example | Description | Colab |
---|---|---|
Quick-Start-Tutorial.ipynb | A tutorial to quickly get used to the helical package and environment. | |
Helix-mRNA.ipynb | An example of how to use the Helix-mRNA model. | |
Geneformer-vs-TranscriptFormer.ipynb | Zero-shot reference mapping with Geneformer and TranscriptFormer, with a comparison of the outcomes. | |
Hyena-DNA-Inference.ipynb | An example of how to do probing with HyenaDNA by training a neural network on 18 downstream classification tasks. | |
Cell-Type-Annotation.ipynb | An example of how to do probing with scGPT by training a neural network to predict cell type annotations. | |
Cell-Type-Classification-Fine-Tuning.ipynb | An example of how to fine-tune different models on classification tasks. | |
HyenaDNA-Fine-Tuning.ipynb | An example of how to fine-tune the HyenaDNA model on downstream benchmarks. | |
Cell-Gene-Cls-embedding-generation.ipynb | A notebook explaining the different embedding modes of single-cell RNA models. | |
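After cloning the repository, you can list the notebooks that are actually present in your local copy, which is handy if the table above and your checkout have drifted apart. This is a minimal sketch: the helper name `list_notebooks` is hypothetical, and the `helical` path assumes the repository was cloned into the current directory under its default name.

```python
from pathlib import Path


def list_notebooks(repo_root):
    """Return sorted notebook file names under <repo_root>/examples/notebooks."""
    notebooks_dir = Path(repo_root) / "examples" / "notebooks"
    return sorted(path.name for path in notebooks_dir.glob("*.ipynb"))


if __name__ == "__main__":
    # Assumption: the repository was cloned into ./helical.
    for name in list_notebooks("helical"):
        print(name)
```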
We are eager to help you and interact with you:
If you are working (or plan to work) with bio foundation models such as Geneformer or UCE on RNA and DNA data, Helical will be your best buddy! We provide and improve on:
We will continuously upload the latest models, publish benchmarks and make our code more efficient.
Many of our models have been published by talented authors developing these exciting technologies. We sincerely thank the authors of the following open-source projects:
You can find the licenses for each model implementation in the model repositories:
Please use this BibTeX to cite this repository in your publications:
@software{allard_2024_13135902,
author = {Helical Team},
title = {helicalAI/helical: v1.1.0},
month = nov,
year = 2024,
publisher = {Zenodo},
version = {1.1.0},
doi = {10.5281/zenodo.13135902},
url = {https://doi.org/10.5281/zenodo.13135902}
}