---
license: apache-2.0
---

## About

RNA-FM (RNA Foundation Model) is a state-of-the-art **pretrained language model for RNA sequences** that serves as the foundation for an integrated RNA research ecosystem.
Trained on **23+ million non-coding RNA (ncRNA) sequences** via self-supervised learning, RNA-FM extracts structural and functional information from RNA sequences *without* relying on experimental labels.
Consequently, RNA-FM generates **general-purpose RNA embeddings** suitable for a broad range of downstream tasks, including but not limited to secondary and tertiary structure prediction, RNA family clustering, and functional RNA analysis.
**[mRNA‑FM](https://arxiv.org/abs/2204.00300)** is a direct extension of RNA-FM, trained exclusively on 45 million mRNA coding sequences (CDS).
It is specifically designed to capture information unique to mRNA and has demonstrated strong performance on mRNA-related tasks.
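
As a minimal sketch of how such embeddings might feed a downstream task like RNA family clustering, the snippet below mean-pools per-nucleotide vectors into one fixed-size vector per sequence and compares sequences by cosine similarity. It does not call RNA-FM itself: the helper names `mean_pool` and `cosine_similarity` are hypothetical, and the tiny 3-dimensional embeddings are made-up placeholders standing in for real model output.

```python
import math


def mean_pool(token_embeddings):
    """Average per-nucleotide vectors (L x D) into a single D-dimensional vector."""
    length = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(vec[i] for vec in token_embeddings) / length for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Placeholder per-nucleotide embeddings (NOT real RNA-FM output).
# Each inner list stands for one nucleotide; sequences may differ in length.
rna_a = [[1.0, 0.0, 2.0], [3.0, 0.0, 0.0]]
rna_b = [[2.0, 0.0, 1.0], [2.0, 0.0, 1.0], [2.0, 0.0, 1.0]]
rna_c = [[0.0, 5.0, 0.0], [0.0, 3.0, 0.0]]

# Pooling yields one vector per sequence regardless of sequence length.
pooled_a = mean_pool(rna_a)
pooled_b = mean_pool(rna_b)
pooled_c = mean_pool(rna_c)

sim_ab = cosine_similarity(pooled_a, pooled_b)
sim_ac = cosine_similarity(pooled_a, pooled_c)
print(f"a vs b: {sim_ab:.3f}, a vs c: {sim_ac:.3f}")
```

Mean pooling is only one simple way to obtain a sequence-level representation; the resulting vectors can then be passed to any standard clustering or classification method.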


The full code is available on GitHub: https://github.com/ml4bio/RNA-FM.

## Citation

If you use the model in your research, please cite our papers using the following BibTeX entries.

```
@article{chen2022interpretable,
  title={Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions},
  author={Chen, Jiayang and Hu, Zhihang and Sun, Siqi and Tan, Qingxiong and Wang, Yixuan and Yu, Qinze and Zong, Licheng and Hong, Liang and Xiao, Jin and Shen, Tao and others},
  journal={arXiv preprint arXiv:2204.00300},
  year={2022}
}

@article{shen2024accurate,
  title={Accurate RNA 3D structure prediction using a language model-based deep learning approach},
  author={Shen, Tao and Hu, Zhihang and Sun, Siqi and Liu, Di and Wong, Felix and Wang, Jiuming and Chen, Jiayang and Wang, Yixuan and Hong, Liang and Xiao, Jin and others},
  journal={Nature Methods},
  pages={1--12},
  year={2024},
  publisher={Nature Publishing Group US New York}
}

@article{chen2020rna,
  title={RNA secondary structure prediction by learning unrolled algorithms},
  author={Chen, Xinshi and Li, Yu and Umarov, Ramzan and Gao, Xin and Song, Le},
  journal={arXiv preprint arXiv:2002.05810},
  year={2020}
}
```