RNA-Scope

Benchmarking RNA Language Models for RNA Sequence Understanding


Introduction

Pre-trained language models (pLMs) have advanced our understanding of RNA biology. However, current evaluation frameworks capture only part of the inherent complexity of RNA, leading to insufficient and biased assessments that hinder the practical application of these models. Here, we introduce RNA-Scope, a comprehensive benchmarking framework designed to gauge RNA pLMs via structure prediction, interaction classification, and function characterization. This framework includes 1,253 experiments spanning diverse subtasks of varying complexity and enables systematic model comparison with consistent architectural modules. Model assessment shows that generalization of sequence flexibility across RNA families, target contexts, and environmental features remains challenging for existing models. RNA-Scope provides a systematic, robust, and fair evaluation framework to accelerate RNA modeling.

RNA Structure Tasks Performance

Keypoints: RNA sequences determine structures. This panel outlines tasks for inferring RNA structure from one-dimensional (1D) sequence, including (1) secondary structure prediction for base-pairing likelihood, (2) chemical reactivity prediction for structural dynamics and nucleotide accessibility, and (3) contact map prediction for spatial interactions essential to three-dimensional topology. Together, these tasks establish a sub-framework for understanding the structural diversity of RNA across families. Results are reported as the mean ± standard deviation across three independent runs using different random seeds.
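As a minimal sketch of how a "mean ± standard deviation" entry can be produced from three seeded runs: the scores below are hypothetical, the helper name `summarize_runs` is illustrative, and the use of the sample standard deviation (rather than the population form) is an assumption, since the source does not say which one was used.

```python
from statistics import mean, stdev

def summarize_runs(scores, digits=3):
    """Format per-seed scores as 'mean ± std' (sample std; an assumption)."""
    return f"{mean(scores):.{digits}f} ± {stdev(scores):.{digits}f}"

# Hypothetical F1 scores from three random seeds (not taken from the tables).
f1_by_seed = [0.80, 0.81, 0.79]
summary = summarize_runs(f1_by_seed)
print(summary)
```

With only three runs the two standard-deviation conventions differ noticeably, so it is worth checking which one a reported table uses before comparing numbers across sources.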

(1) Secondary Structure Prediction

Secondary Structure Prediction (SSP) is formulated as a binary-classification problem in which the model predicts the pairing state \(y_i \in \{0,1\}\) of each nucleotide \(x_i\) in an RNA sequence, thereby capturing its base-pairing pattern. The benchmark comprises three independent datasets (bpRNA, Set A, and Set B), each split into two evaluation subsets: an intra-family split (training and test sequences drawn from the same RNA family) and an inter-family split (training and test sequences drawn from different RNA families).

Model (Module) bpRNA Set A Set B
Precision (Intra-Family) Recall (Intra-Family) F1 (Intra-Family) Precision (Inter-Family) Recall (Inter-Family) F1 (Inter-Family) Precision (Intra-Family) Recall (Intra-Family) F1 (Intra-Family) Precision (Inter-Family) Recall (Inter-Family) F1 (Inter-Family) Precision (Intra-Family) Recall (Intra-Family) F1 (Intra-Family) Precision (Inter-Family) Recall (Inter-Family) F1 (Inter-Family)
One-hot 0.465 ± 0.008 0.668 ± 0.021 0.548 ± 0.003 0.399 ± 0.002 0.608 ± 0.022 0.482 ± 0.006 0.630 ± 0.023 0.763 ± 0.050 0.689 ± 0.007 0.278 ± 0.010 0.455 ± 0.073 0.343 ± 0.014 0.438 ± 0.022 0.320 ± 0.021 0.369 ± 0.008 0.605 ± 0.008 0.442 ± 0.035 0.510 ± 0.021
Dense 0.433 ± 0.015 0.763 ± 0.039 0.552 ± 0.002 0.387 ± 0.006 0.712 ± 0.047 0.501 ± 0.007 0.628 ± 0.013 0.777 ± 0.029 0.695 ± 0.005 0.277 ± 0.006 0.472 ± 0.047 0.348 ± 0.010 0.435 ± 0.027 0.346 ± 0.022 0.384 ± 0.006 0.600 ± 0.007 0.469 ± 0.044 0.526 ± 0.027
RNABERT (MLP) 0.551 ± 0.003 0.551 ± 0.007 0.551 ± 0.004 0.517 ± 0.000 0.567 ± 0.007 0.541 ± 0.003 0.614 ± 0.000 0.814 ± 0.002 0.700 ± 0.001 0.488 ± 0.000 0.803 ± 0.000 0.607 ± 0.000 0.525 ± 0.001 0.512 ± 0.004 0.518 ± 0.002 0.649 ± 0.001 0.440 ± 0.005 0.524 ± 0.004
RNA-FM (MLP) 0.747 ± 0.001 0.788 ± 0.007 0.766 ± 0.004 0.563 ± 0.003 0.661 ± 0.008 0.608 ± 0.002 0.810 ± 0.002 0.867 ± 0.004 0.837 ± 0.001 0.657 ± 0.009 0.716 ± 0.009 0.685 ± 0.001 0.867 ± 0.002 0.871 ± 0.002 0.869 ± 0.000 0.699 ± 0.005 0.533 ± 0.011 0.605 ± 0.006
3UTRBERT (MLP) 0.600 ± 0.008 0.702 ± 0.022 0.647 ± 0.012 0.529 ± 0.006 0.686 ± 0.017 0.597 ± 0.003 0.683 ± 0.009 0.822 ± 0.006 0.746 ± 0.004 0.500 ± 0.002 0.743 ± 0.016 0.598 ± 0.004 0.670 ± 0.005 0.676 ± 0.014 0.673 ± 0.005 0.616 ± 0.003 0.483 ± 0.014 0.512 ± 0.009
SpliceBERT (MLP) 0.613 ± 0.018 0.696 ± 0.040 0.651 ± 0.024 0.533 ± 0.007 0.652 ± 0.034 0.586 ± 0.011 0.708 ± 0.019 0.822 ± 0.023 0.760 ± 0.001 0.496 ± 0.005 0.718 ± 0.039 0.587 ± 0.010 0.723 ± 0.010 0.712 ± 0.018 0.717 ± 0.005 0.615 ± 0.003 0.495 ± 0.036 0.548 ± 0.022
UTR-LM (MLP) 0.599 ± 0.014 0.646 ± 0.015 0.621 ± 0.007 0.543 ± 0.004 0.634 ± 0.018 0.585 ± 0.005 0.671 ± 0.007 0.785 ± 0.010 0.723 ± 0.005 0.505 ± 0.002 0.729 ± 0.014 0.597 ± 0.005 0.639 ± 0.014 0.669 ± 0.034 0.653 ± 0.016 0.626 ± 0.007 0.499 ± 0.030 0.555 ± 0.017
RiNALMo (MLP) 0.781 ± 0.006 0.814 ± 0.009 0.797 ± 0.004 0.575 ± 0.007 0.683 ± 0.016 0.625 ± 0.006 0.871 ± 0.004 0.892 ± 0.007 0.881 ± 0.002 0.675 ± 0.008 0.756 ± 0.010 0.713 ± 0.003 0.887 ± 0.011 0.901 ± 0.007 0.894 ± 0.003 0.789 ± 0.004 0.673 ± 0.013 0.726 ± 0.006
RNABERT (CNN) 0.587 ± 0.004 0.610 ± 0.011 0.598 ± 0.003 0.542 ± 0.002 0.603 ± 0.013 0.571 ± 0.005 0.652 ± 0.010 0.746 ± 0.018 0.696 ± 0.003 0.507 ± 0.004 0.730 ± 0.031 0.599 ± 0.010 0.548 ± 0.005 0.606 ± 0.023 0.575 ± 0.008 0.588 ± 0.002 0.537 ± 0.028 0.561 ± 0.015
RNA-FM (CNN) 0.763 ± 0.009 0.812 ± 0.012 0.786 ± 0.001 0.541 ± 0.007 0.650 ± 0.015 0.590 ± 0.005 0.813 ± 0.006 0.847 ± 0.009 0.842 ± 0.001 0.649 ± 0.020 0.727 ± 0.016 0.685 ± 0.005 0.879 ± 0.015 0.870 ± 0.014 0.874 ± 0.001 0.682 ± 0.012 0.489 ± 0.017 0.569 ± 0.007
3UTRBERT (CNN) 0.612 ± 0.015 0.748 ± 0.019 0.672 ± 0.005 0.509 ± 0.011 0.697 ± 0.037 0.588 ± 0.016 0.700 ± 0.001 0.837 ± 0.006 0.762 ± 0.002 0.496 ± 0.005 0.733 ± 0.013 0.592 ± 0.008 0.704 ± 0.002 0.704 ± 0.003 0.704 ± 0.001 0.616 ± 0.003 0.401 ± 0.008 0.486 ± 0.005
SpliceBERT (CNN) 0.634 ± 0.003 0.760 ± 0.011 0.691 ± 0.003 0.496 ± 0.005 0.656 ± 0.013 0.565 ± 0.004 0.727 ± 0.007 0.829 ± 0.009 0.774 ± 0.000 0.492 ± 0.003 0.703 ± 0.013 0.579 ± 0.006 0.730 ± 0.004 0.744 ± 0.020 0.737 ± 0.008 0.614 ± 0.004 0.441 ± 0.015 0.513 ± 0.011
UTR-LM (CNN) 0.618 ± 0.006 0.728 ± 0.012 0.668 ± 0.003 0.524 ± 0.007 0.688 ± 0.019 0.595 ± 0.007 0.689 ± 0.005 0.807 ± 0.012 0.743 ± 0.003 0.498 ± 0.002 0.716 ± 0.012 0.587 ± 0.003 0.709 ± 0.008 0.668 ± 0.005 0.688 ± 0.006 0.620 ± 0.003 0.400 ± 0.003 0.486 ± 0.003
RiNALMo (CNN) 0.785 ± 0.006 0.823 ± 0.007 0.803 ± 0.000 0.572 ± 0.014 0.643 ± 0.025 0.605 ± 0.006 0.874 ± 0.002 0.888 ± 0.007 0.881 ± 0.002 0.678 ± 0.003 0.783 ± 0.020 0.706 ± 0.007 0.908 ± 0.005 0.897 ± 0.002 0.903 ± 0.002 0.779 ± 0.010 0.650 ± 0.018 0.708 ± 0.008
RNABERT (ResNet) 0.593 ± 0.026 0.818 ± 0.058 0.685 ± 0.003 0.479 ± 0.011 0.730 ± 0.077 0.576 ± 0.015 0.690 ± 0.014 0.805 ± 0.032 0.743 ± 0.006 0.484 ± 0.004 0.723 ± 0.039 0.579 ± 0.009 0.705 ± 0.005 0.639 ± 0.009 0.670 ± 0.004 0.610 ± 0.007 0.393 ± 0.021 0.478 ± 0.013
RNA-FM (ResNet) 0.734 ± 0.008 0.820 ± 0.012 0.775 ± 0.001 0.540 ± 0.009 0.693 ± 0.016 0.607 ± 0.001 0.783 ± 0.006 0.861 ± 0.006 0.820 ± 0.001 0.585 ± 0.015 0.766 ± 0.005 0.663 ± 0.008 0.851 ± 0.017 0.870 ± 0.004 0.860 ± 0.006 0.665 ± 0.015 0.533 ± 0.045 0.591 ± 0.023
3UTRBERT (ResNet) 0.597 ± 0.010 0.804 ± 0.036 0.685 ± 0.007 0.485 ± 0.004 0.713 ± 0.051 0.576 ± 0.015 0.699 ± 0.009 0.801 ± 0.018 0.746 ± 0.003 0.489 ± 0.006 0.692 ± 0.029 0.573 ± 0.008 0.706 ± 0.022 0.686 ± 0.034 0.695 ± 0.009 0.609 ± 0.004 0.422 ± 0.049 0.498 ± 0.032
SpliceBERT (ResNet) 0.627 ± 0.012 0.781 ± 0.033 0.695 ± 0.006 0.491 ± 0.008 0.692 ± 0.031 0.574 ± 0.005 0.718 ± 0.020 0.835 ± 0.035 0.772 ± 0.004 0.486 ± 0.006 0.719 ± 0.056 0.579 ± 0.014 0.719 ± 0.017 0.706 ± 0.015 0.712 ± 0.003 0.599 ± 0.013 0.528 ± 0.006 0.561 ± 0.003
UTR-LM (ResNet) 0.625 ± 0.012 0.488 ± 0.028 0.697 ± 0.003 0.491 ± 0.006 0.676 ± 0.042 0.568 ± 0.012 0.720 ± 0.012 0.797 ± 0.033 0.756 ± 0.008 0.491 ± 0.005 0.678 ± 0.047 0.569 ± 0.013 0.738 ± 0.028 0.698 ± 0.043 0.716 ± 0.012 0.613 ± 0.008 0.387 ± 0.058 0.472 ± 0.042
RiNALMo (ResNet) 0.788 ± 0.014 0.823 ± 0.010 0.805 ± 0.003 0.573 ± 0.007 0.660 ± 0.036 0.613 ± 0.014 0.861 ± 0.005 0.903 ± 0.009 0.882 ± 0.002 0.664 ± 0.022 0.757 ± 0.044 0.707 ± 0.007 0.902 ± 0.015 0.891 ± 0.016 0.897 ± 0.001 0.773 ± 0.022 0.596 ± 0.035 0.672 ± 0.015
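
To make the SSP metrics concrete, the following sketch computes precision, recall, and F1 over per-nucleotide pairing states \(y_i \in \{0,1\}\). It is not the benchmark's evaluation code: the dot-bracket input format, the helper names, and the example structures are assumptions chosen for illustration.

```python
def pairing_labels(dot_bracket):
    """Map a dot-bracket string to per-nucleotide labels: 1 = paired, 0 = unpaired.

    Dot-bracket notation is an assumed input format for this example.
    """
    return [0 if symbol == "." else 1 for symbol in dot_bracket]

def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1 with 'paired' as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical reference and predicted structures of equal length.
reference = pairing_labels("((..))..")
predicted = pairing_labels("((...)).")
precision, recall, f1 = precision_recall_f1(reference, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because the labels are per nucleotide, a prediction can score well here while still pairing a base with the wrong partner; a pair-level metric would be stricter.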

(2) Chemical Reactivity Prediction

Chemical Reactivity Prediction (CRP) is framed as a regression task in which the model estimates the nucleotide-wise chemical reactivity \(y_i \in [0,1]\) for each position \(x_i\) of an RNA molecule. Reactivity values correlate with secondary structure: unpaired or flexible regions generally exhibit higher chemical reactivity, whereas base-paired segments tend to be less reactive. Accordingly, CRP captures both static base-pairing information and dynamic conformational fluctuations. The benchmark provides two test sets, TestS (short RNAs) and TestL (long RNAs), whose ground-truth reactivities were obtained from high-throughput probing experiments. Models are assessed by the mean absolute error (MAE ↓, lower is better) on each test set.

Model (Module) TestS TestL
MAE ↓ MAE ↓
One-hot 0.179 ± 0.001 0.167 ± 0.002
Dense 0.176 ± 0.001 0.174 ± 0.002
RNABERT (MLP) 0.255 ± 0.000 0.266 ± 0.001
RNA-FM (MLP) 0.214 ± 0.001 0.187 ± 0.003
3UTRBERT (MLP) 0.202 ± 0.001 0.195 ± 0.002
SpliceBERT (MLP) 0.207 ± 0.001 0.207 ± 0.001
UTR-LM (MLP) 0.201 ± 0.001 0.196 ± 0.003
RiNALMo (MLP) 0.183 ± 0.001 0.195 ± 0.002
RNABERT (CNN) 0.228 ± 0.001 0.241 ± 0.002
RNA-FM (CNN) 0.197 ± 0.003 0.176 ± 0.001
3UTRBERT (CNN) 0.192 ± 0.000 0.182 ± 0.002
SpliceBERT (CNN) 0.196 ± 0.001 0.179 ± 0.002
UTR-LM (CNN) 0.191 ± 0.001 0.176 ± 0.007
RiNALMo (CNN) 0.173 ± 0.001 0.179 ± 0.002
RNABERT (ResNet) 0.181 ± 0.002 0.175 ± 0.003
RNA-FM (ResNet) 0.196 ± 0.004 0.166 ± 0.004
3UTRBERT (ResNet) 0.187 ± 0.002 0.172 ± 0.004
SpliceBERT (ResNet) 0.193 ± 0.004 0.177 ± 0.003
UTR-LM (ResNet) 0.180 ± 0.001 0.171 ± 0.005
RiNALMo (ResNet) 0.163 ± 0.000 0.163 ± 0.004
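
The MAE used for CRP can be sketched as follows. Skipping positions without a measured reactivity (marked `None` here) is an assumption of this example, as is the helper name `masked_mae`; the source does not describe how missing measurements are handled.

```python
def masked_mae(y_true, y_pred):
    """Mean absolute error over positions that have a measured reactivity.

    Positions where y_true is None are skipped (an assumption of this sketch).
    """
    errors = [abs(t - p) for t, p in zip(y_true, y_pred) if t is not None]
    if not errors:
        raise ValueError("no measured positions to evaluate")
    return sum(errors) / len(errors)

# Hypothetical per-nucleotide reactivities in [0, 1]; None = not measured.
measured = [0.1, 0.9, None, 0.4]
predicted = [0.2, 0.7, 0.5, 0.4]
mae = masked_mae(measured, predicted)
print(f"MAE = {mae:.3f}")
```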

(3) Contact Map Prediction

Contact Map Prediction (CMP) is framed as a binary-classification task in which the model predicts, for every nucleotide pair \((x_i, x_j)\), whether the two residues are within 8 Å in three-dimensional space, i.e. their contact state \(y_{ij} \in \{0,1\}\). Owing to the intrinsic flexibility of RNA, accurately recovering these long-range spatial relationships remains highly challenging.

Model (Module) Short@L/5 Long@L/5
One-hot 0.158 ± 0.006 0.168 ± 0.007
Dense 0.190 ± 0.015 0.170 ± 0.006
RNABERT (MLP) 0.013 ± 0.001 0.021 ± 0.003
RNA-FM (MLP) 0.137 ± 0.003 0.129 ± 0.002
3UTRBERT (MLP) 0.103 ± 0.002 0.086 ± 0.003
SpliceBERT (MLP) 0.111 ± 0.002 0.121 ± 0.002
UTR-LM (MLP) 0.111 ± 0.002 0.097 ± 0.002
RiNALMo (MLP) 0.173 ± 0.001 0.164 ± 0.002
RNABERT (CNN) 0.117 ± 0.005 0.102 ± 0.002
RNA-FM (CNN) 0.141 ± 0.005 0.150 ± 0.004
3UTRBERT (CNN) 0.138 ± 0.004 0.109 ± 0.005
SpliceBERT (CNN) 0.150 ± 0.007 0.140 ± 0.004
UTR-LM (CNN) 0.152 ± 0.005 0.132 ± 0.001
RiNALMo (CNN) 0.182 ± 0.002 0.146 ± 0.003
RNABERT (ResNet) 0.084 ± 0.003 0.083 ± 0.003
RNA-FM (ResNet) 0.145 ± 0.008 0.150 ± 0.007
3UTRBERT (ResNet) 0.137 ± 0.016 0.122 ± 0.002
SpliceBERT (ResNet) 0.155 ± 0.002 0.131 ± 0.009
UTR-LM (ResNet) 0.075 ± 0.006 0.101 ± 0.002
RiNALMo (ResNet) 0.133 ± 0.002 0.139 ± 0.005
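
The contact definition above (two residues within 8 Å) can be illustrated with a small sketch. Several things here are assumptions rather than facts from the source: that each nucleotide is represented by a single 3D coordinate, that the `@L/5` columns denote precision among the top L/5 highest-scoring pairs, and the `min_sep` parameter, which stands in for whatever sequence-separation ranges define the short and long categories.

```python
import math

def contact_map(coords, threshold=8.0):
    """Binary contact map: 1 if two distinct residues lie within `threshold` Å."""
    n = len(coords)
    return [
        [1 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0
         for j in range(n)]
        for i in range(n)
    ]

def precision_at_l_over_5(scores, contacts, min_sep=1):
    """Fraction of true contacts among the top L/5 scored pairs (i < j).

    `min_sep` is a hypothetical minimum sequence separation between i and j.
    """
    n = len(contacts)
    k = max(1, n // 5)
    candidates = [(scores[i][j], i, j)
                  for i in range(n) for j in range(i + min_sep, n)]
    top = sorted(candidates, reverse=True)[:k]
    if not top:
        return 0.0
    return sum(contacts[i][j] for _, i, j in top) / len(top)

# Hypothetical toy chain: five residues spaced 5 Å apart on a line.
coords = [(5.0 * i, 0.0, 0.0) for i in range(5)]
contacts = contact_map(coords)
# Hypothetical predicted scores that decay with sequence separation.
scores = [[1.0 / (1 + abs(i - j)) for j in range(5)] for i in range(5)]
precision = precision_at_l_over_5(scores, contacts)
print(precision)
```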

RNA Interaction Tasks Performance

Keypoints: Both naturally and artificially evolved RNA sequences adopt defined structural shapes, enabling high-specificity interactions with a wide array of target molecules. This panel outlines tasks for inferring RNA interactions with different targets from 1D sequence, including (1) binary binding prediction for RNA–protein interactions within cells; (2) systematic binding ranking for categorizing RNA interactions with targets of varying molecular sizes; and (3) binding affinity prediction for RNA–protein interactions of varying strengths in vitro. Together, these tasks establish a sub-framework for understanding the interaction diversity of RNA across targets, encompassing both persistent and transient interactions. As before, results are reported as the mean ± standard deviation across three independent runs with different random seeds.

(1) Binary Binding Prediction

Binary Binding Prediction (BBP) defines a binary-classification task in which the model predicts whether an RNA sequence interacts with a specific RNA-binding protein (RBP). The benchmark comprises 22 independent datasets—one for each RBP profiled in K562 and HepG2 cell lines. To eliminate redundancy, any sequences sharing over 80% identity are removed.

Model (Module) Average of 22 RBPs datasets AKAP1_HepG2 BCLAF1_HepG2 DDX24_K562 DDX3X_HepG2 DDX3X_K562 FAM120A_K562 G3BP1_HepG2 GRWD1_HepG2 IGF2BP1_K562 LARP4_HepG2 LIN28B_K562 PABPC4_K562 PPIG_HepG2 PUM2_K562 RBM15_K562 RPS3_HepG2 SND1_HepG2 UPF1_HepG2 UPF1_K562 UCHL5_K562 YBX3_K562 YBX3_K562
F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1
One-hot 0.703 0.753 ± 0.007 0.733 ± 0.009 0.638 ± 0.005 0.850 ± 0.001 0.808 ± 0.002 0.731 ± 0.006 0.608 ± 0.016 0.654 ± 0.007 0.698 ± 0.001 0.650 ± 0.003 0.603 ± 0.006 0.682 ± 0.004 0.694 ± 0.012 0.899 ± 0.002 0.646 ± 0.004 0.675 ± 0.006 0.670 ± 0.005 0.765 ± 0.004 0.717 ± 0.003 0.685 ± 0.003 0.634 ± 0.002 0.679 ± 0.005
Dense 0.704 0.757 ± 0.002 0.743 ± 0.008 0.651 ± 0.009 0.847 ± 0.001 0.808 ± 0.002 0.736 ± 0.004 0.623 ± 0.006 0.657 ± 0.002 0.695 ± 0.004 0.656 ± 0.008 0.593 ± 0.008 0.679 ± 0.003 0.698 ± 0.005 0.901 ± 0.003 0.651 ± 0.006 0.678 ± 0.006 0.657 ± 0.007 0.768 ± 0.001 0.721 ± 0.003 0.679 ± 0.012 0.626 ± 0.005 0.673 ± 0.001
DNABERT2 (MLP) 0.683 0.745 ± 0.005 0.697 ± 0.001 0.599 ± 0.010 0.846 ± 0.001 0.807 ± 0.004 0.717 ± 0.014 0.584 ± 0.007 0.621 ± 0.003 0.691 ± 0.002 0.645 ± 0.008 0.584 ± 0.007 0.655 ± 0.003 0.669 ± 0.003 0.820 ± 0.003 0.658 ± 0.006 0.681 ± 0.006 0.629 ± 0.007 0.755 ± 0.003 0.711 ± 0.002 0.668 ± 0.001 0.591 ± 0.002 0.657 ± 0.005
HyenaDNA (MLP) 0.640 0.722 ± 0.007 0.674 ± 0.005 0.536 ± 0.026 0.846 ± 0.002 0.793 ± 0.007 0.678 ± 0.018 0.527 ± 0.020 0.532 ± 0.013 0.695 ± 0.002 0.580 ± 0.037 0.412 ± 0.008 0.579 ± 0.019 0.659 ± 0.002 0.794 ± 0.005 0.631 ± 0.008 0.665 ± 0.003 0.609 ± 0.014 0.732 ± 0.003 0.684 ± 0.013 0.608 ± 0.004 0.509 ± 0.023 0.607 ± 0.007
NT (MLP) 0.666 0.744 ± 0.002 0.676 ± 0.006 0.572 ± 0.008 0.840 ± 0.006 0.796 ± 0.001 0.700 ± 0.003 0.586 ± 0.001 0.593 ± 0.001 0.680 ± 0.006 0.641 ± 0.009 0.467 ± 0.053 0.647 ± 0.003 0.649 ± 0.011 0.814 ± 0.002 0.638 ± 0.004 0.663 ± 0.003 0.599 ± 0.019 0.743 ± 0.001 0.688 ± 0.008 0.642 ± 0.002 0.635 ± 0.001 0.635 ± 0.001
RNABERT (MLP) 0.539 0.548 ± 0.009 0.617 ± 0.007 0.456 ± 0.012 0.798 ± 0.003 0.689 ± 0.007 0.501 ± 0.021 0.436 ± 0.004 0.457 ± 0.012 0.615 ± 0.007 0.408 ± 0.000 0.406 ± 0.000 0.407 ± 0.000 0.567 ± 0.007 0.726 ± 0.001 0.550 ± 0.011 0.605 ± 0.008 0.522 ± 0.004 0.601 ± 0.014 0.480 ± 0.017 0.538 ± 0.008 0.415 ± 0.000 0.519 ± 0.004
RNA-FM (MLP) 0.698 0.776 ± 0.000 0.715 ± 0.001 0.628 ± 0.002 0.852 ± 0.001 0.811 ± 0.004 0.751 ± 0.003 0.638 ± 0.001 0.638 ± 0.008 0.734 ± 0.006 0.507 ± 0.009 0.602 ± 0.015 0.696 ± 0.001 0.684 ± 0.003 0.819 ± 0.004 0.674 ± 0.002 0.698 ± 0.002 0.669 ± 0.002 0.779 ± 0.003 0.740 ± 0.003 0.679 ± 0.002 0.590 ± 0.018 0.672 ± 0.005
3UTRBERT (MLP) 0.716 0.788 ± 0.002 0.720 ± 0.002 0.641 ± 0.003 0.854 ± 0.001 0.826 ± 0.002 0.755 ± 0.001 0.661 ± 0.003 0.650 ± 0.002 0.722 ± 0.002 0.696 ± 0.001 0.609 ± 0.002 0.695 ± 0.005 0.707 ± 0.002 0.853 ± 0.003 0.681 ± 0.003 0.685 ± 0.001 0.662 ± 0.003 0.783 ± 0.002 0.746 ± 0.003 0.702 ± 0.006 0.606 ± 0.004 0.690 ± 0.003
SpliceBERT (MLP) 0.736 0.789 ± 0.003 0.742 ± 0.002 0.697 ± 0.003 0.863 ± 0.001 0.838 ± 0.001 0.747 ± 0.004 0.689 ± 0.001 0.692 ± 0.004 0.743 ± 0.001 0.709 ± 0.004 0.649 ± 0.001 0.724 ± 0.003 0.746 ± 0.006 0.842 ± 0.004 0.679 ± 0.001 0.708 ± 0.005 0.687 ± 0.003 0.787 ± 0.003 0.744 ± 0.003 0.742 ± 0.002 0.652 ± 0.002 0.727 ± 0.001
UTR-LM (MLP) 0.682 0.760 ± 0.003 0.696 ± 0.007 0.595 ± 0.009 0.850 ± 0.002 0.813 ± 0.002 0.735 ± 0.003 0.590 ± 0.009 0.596 ± 0.030 0.701 ± 0.003 0.662 ± 0.002 0.555 ± 0.055 0.667 ± 0.007 0.673 ± 0.003 0.815 ± 0.003 0.645 ± 0.007 0.671 ± 0.006 0.625 ± 0.018 0.654 ± 0.004 0.766 ± 0.002 0.722 ± 0.003 0.572 ± 0.001 0.650 ± 0.007
RiNALMo (MLP) 0.726 0.797 ± 0.003 0.740 ± 0.003 0.664 ± 0.008 0.861 ± 0.001 0.824 ± 0.004 0.768 ± 0.002 0.663 ± 0.007 0.665 ± 0.004 0.742 ± 0.001 0.690 ± 0.003 0.633 ± 0.004 0.709 ± 0.004 0.715 ± 0.003 0.843 ± 0.001 0.689 ± 0.002 0.704 ± 0.012 0.668 ± 0.007 0.715 ± 0.002 0.790 ± 0.002 0.755 ± 0.005 0.621 ± 0.002 0.714 ± 0.007
DNABERT2 (CNN) 0.685 0.739 ± 0.005 0.697 ± 0.001 0.599 ± 0.010 0.845 ± 0.004 0.799 ± 0.003 0.724 ± 0.002 0.602 ± 0.006 0.614 ± 0.008 0.694 ± 0.002 0.638 ± 0.010 0.589 ± 0.004 0.670 ± 0.010 0.656 ± 0.007 0.843 ± 0.006 0.657 ± 0.004 0.677 ± 0.005 0.652 ± 0.004 0.750 ± 0.009 0.702 ± 0.007 0.651 ± 0.002 0.618 ± 0.002 0.644 ± 0.006
HyenaDNA (CNN) 0.697 0.760 ± 0.002 0.733 ± 0.007 0.614 ± 0.023 0.851 ± 0.006 0.813 ± 0.000 0.755 ± 0.007 0.613 ± 0.002 0.626 ± 0.011 0.738 ± 0.004 0.643 ± 0.006 0.406 ± 0.000 0.670 ± 0.009 0.730 ± 0.003 0.883 ± 0.005 0.704 ± 0.005 0.702 ± 0.008 0.669 ± 0.010 0.767 ± 0.002 0.716 ± 0.004 0.657 ± 0.007 0.605 ± 0.018 0.684 ± 0.007
NT (CNN) 0.674 0.742 ± 0.008 0.689 ± 0.002 0.598 ± 0.005 0.842 ± 0.004 0.796 ± 0.003 0.712 ± 0.005 0.593 ± 0.007 0.608 ± 0.004 0.685 ± 0.005 0.637 ± 0.003 0.453 ± 0.067 0.641 ± 0.006 0.668 ± 0.002 0.835 ± 0.006 0.658 ± 0.003 0.670 ± 0.014 0.625 ± 0.007 0.752 ± 0.001 0.695 ± 0.007 0.647 ± 0.002 0.646 ± 0.002 0.646 ± 0.002
RNABERT (CNN) 0.642 0.704 ± 0.002 0.670 ± 0.002 0.527 ± 0.002 0.835 ± 0.002 0.792 ± 0.002 0.685 ± 0.004 0.515 ± 0.004 0.584 ± 0.006 0.679 ± 0.003 0.598 ± 0.006 0.507 ± 0.009 0.631 ± 0.010 0.607 ± 0.014 0.815 ± 0.002 0.615 ± 0.004 0.642 ± 0.007 0.567 ± 0.005 0.724 ± 0.002 0.684 ± 0.002 0.606 ± 0.001 0.535 ± 0.008 0.613 ± 0.003
RNA-FM (CNN) 0.724 0.788 ± 0.002 0.740 ± 0.002 0.658 ± 0.001 0.859 ± 0.003 0.812 ± 0.001 0.761 ± 0.007 0.654 ± 0.003 0.662 ± 0.001 0.743 ± 0.002 0.662 ± 0.001 0.619 ± 0.003 0.702 ± 0.002 0.722 ± 0.003 0.875 ± 0.006 0.707 ± 0.004 0.709 ± 0.002 0.689 ± 0.006 0.788 ± 0.002 0.755 ± 0.002 0.693 ± 0.005 0.642 ± 0.001 0.697 ± 0.002
3UTRBERT (CNN) 0.728 0.789 ± 0.002 0.755 ± 0.002 0.666 ± 0.003 0.853 ± 0.006 0.826 ± 0.002 0.763 ± 0.002 0.663 ± 0.010 0.675 ± 0.006 0.727 ± 0.002 0.701 ± 0.000 0.628 ± 0.008 0.708 ± 0.011 0.734 ± 0.000 0.890 ± 0.004 0.688 ± 0.007 0.698 ± 0.007 0.678 ± 0.006 0.791 ± 0.001 0.752 ± 0.001 0.718 ± 0.004 0.639 ± 0.008 0.711 ± 0.004
SpliceBERT (CNN) 0.748 0.792 ± 0.002 0.770 ± 0.000 0.712 ± 0.004 0.860 ± 0.000 0.838 ± 0.003 0.759 ± 0.004 0.694 ± 0.004 0.710 ± 0.003 0.746 ± 0.001 0.716 ± 0.001 0.642 ± 0.010 0.729 ± 0.000 0.769 ± 0.004 0.872 ± 0.006 0.708 ± 0.002 0.717 ± 0.003 0.713 ± 0.002 0.789 ± 0.002 0.759 ± 0.003 0.748 ± 0.001 0.676 ± 0.002 0.743 ± 0.006
UTR-LM (CNN) 0.728 0.791 ± 0.004 0.751 ± 0.001 0.666 ± 0.012 0.853 ± 0.002 0.815 ± 0.003 0.759 ± 0.003 0.657 ± 0.003 0.680 ± 0.008 0.740 ± 0.007 0.681 ± 0.006 0.631 ± 0.001 0.690 ± 0.037 0.744 ± 0.004 0.887 ± 0.005 0.704 ± 0.011 0.698 ± 0.006 0.671 ± 0.004 0.709 ± 0.002 0.785 ± 0.003 0.749 ± 0.002 0.645 ± 0.006 0.716 ± 0.004
RiNALMo (CNN) 0.749 0.800 ± 0.004 0.776 ± 0.002 0.720 ± 0.005 0.865 ± 0.002 0.820 ± 0.006 0.778 ± 0.003 0.681 ± 0.007 0.703 ± 0.002 0.751 ± 0.004 0.702 ± 0.002 0.643 ± 0.006 0.720 ± 0.005 0.764 ± 0.001 0.895 ± 0.005 0.729 ± 0.003 0.722 ± 0.005 0.701 ± 0.002 0.740 ± 0.006 0.793 ± 0.001 0.769 ± 0.002 0.666 ± 0.003 0.748 ± 0.004
DNABERT2 (ResNet) 0.669 0.726 ± 0.008 0.697 ± 0.007 0.577 ± 0.001 0.843 ± 0.001 0.795 ± 0.001 0.701 ± 0.021 0.572 ± 0.005 0.599 ± 0.006 0.678 ± 0.008 0.639 ± 0.019 0.561 ± 0.008 0.628 ± 0.004 0.659 ± 0.002 0.823 ± 0.009 0.653 ± 0.005 0.664 ± 0.009 0.620 ± 0.004 0.749 ± 0.004 0.684 ± 0.007 0.626 ± 0.009 0.594 ± 0.005 0.634 ± 0.012
HyenaDNA (ResNet) 0.727 0.771 ± 0.003 0.753 ± 0.001 0.683 ± 0.004 0.853 ± 0.001 0.810 ± 0.001 0.757 ± 0.007 0.633 ± 0.005 0.679 ± 0.005 0.732 ± 0.004 0.658 ± 0.004 0.607 ± 0.009 0.686 ± 0.003 0.739 ± 0.005 0.896 ± 0.004 0.720 ± 0.004 0.699 ± 0.004 0.684 ± 0.007 0.778 ± 0.003 0.737 ± 0.003 0.704 ± 0.001 0.639 ± 0.002 0.710 ± 0.008
NT (ResNet) 0.681 0.738 ± 0.006 0.708 ± 0.003 0.601 ± 0.014 0.841 ± 0.003 0.790 ± 0.016 0.701 ± 0.007 0.601 ± 0.008 0.623 ± 0.003 0.688 ± 0.015 0.624 ± 0.013 0.561 ± 0.003 0.664 ± 0.002 0.672 ± 0.009 0.829 ± 0.004 0.673 ± 0.012 0.678 ± 0.006 0.644 ± 0.003 0.742 ± 0.003 0.700 ± 0.015 0.655 ± 0.006 0.634 ± 0.002 0.634 ± 0.002
RNABERT (ResNet) 0.675 0.732 ± 0.010 0.721 ± 0.002 0.571 ± 0.016 0.838 ± 0.008 0.789 ± 0.008 0.709 ± 0.010 0.559 ± 0.008 0.628 ± 0.004 0.689 ± 0.003 0.624 ± 0.015 0.528 ± 0.005 0.647 ± 0.021 0.684 ± 0.003 0.878 ± 0.005 0.624 ± 0.009 0.657 ± 0.006 0.607 ± 0.006 0.755 ± 0.005 0.708 ± 0.002 0.658 ± 0.008 0.606 ± 0.005 0.643 ± 0.010
RNA-FM (ResNet) 0.719 0.779 ± 0.006 0.746 ± 0.001 0.644 ± 0.005 0.854 ± 0.004 0.811 ± 0.004 0.752 ± 0.009 0.639 ± 0.005 0.665 ± 0.008 0.732 ± 0.005 0.656 ± 0.002 0.609 ± 0.012 0.686 ± 0.002 0.720 ± 0.005 0.886 ± 0.005 0.706 ± 0.006 0.704 ± 0.006 0.676 ± 0.005 0.780 ± 0.003 0.744 ± 0.002 0.691 ± 0.004 0.631 ± 0.001 0.686 ± 0.002
3UTRBERT (ResNet) 0.728 0.792 ± 0.005 0.751 ± 0.001 0.660 ± 0.004 0.853 ± 0.000 0.824 ± 0.002 0.748 ± 0.009 0.656 ± 0.015 0.676 ± 0.005 0.721 ± 0.001 0.683 ± 0.007 0.618 ± 0.004 0.694 ± 0.007 0.732 ± 0.003 0.893 ± 0.004 0.693 ± 0.008 0.685 ± 0.010 0.673 ± 0.014 0.791 ± 0.001 0.743 ± 0.005 0.716 ± 0.003 0.637 ± 0.005 0.704 ± 0.008
SpliceBERT (ResNet) 0.750 0.785 ± 0.002 0.776 ± 0.000 0.730 ± 0.004 0.863 ± 0.001 0.840 ± 0.004 0.756 ± 0.009 0.691 ± 0.013 0.720 ± 0.003 0.743 ± 0.003 0.710 ± 0.005 0.646 ± 0.003 0.716 ± 0.005 0.789 ± 0.005 0.895 ± 0.002 0.713 ± 0.002 0.710 ± 0.008 0.705 ± 0.001 0.788 ± 0.002 0.745 ± 0.006 0.750 ± 0.004 0.675 ± 0.007 0.749 ± 0.006
UTR-LM (ResNet) 0.723 0.785 ± 0.008 0.745 ± 0.009 0.659 ± 0.014 0.850 ± 0.003 0.821 ± 0.007 0.752 ± 0.006 0.649 ± 0.003 0.673 ± 0.013 0.730 ± 0.002 0.663 ± 0.011 0.620 ± 0.003 0.689 ± 0.010 0.736 ± 0.008 0.892 ± 0.005 0.693 ± 0.005 0.690 ± 0.009 0.692 ± 0.007 0.699 ± 0.002 0.781 ± 0.001 0.739 ± 0.002 0.639 ± 0.009 0.700 ± 0.003
RiNALMo (ResNet) 0.750 0.798 ± 0.004 0.781 ± 0.001 0.721 ± 0.010 0.863 ± 0.006 0.823 ± 0.004 0.778 ± 0.006 0.681 ± 0.002 0.711 ± 0.005 0.754 ± 0.004 0.699 ± 0.007 0.641 ± 0.001 0.712 ± 0.005 0.768 ± 0.006 0.893 ± 0.004 0.720 ± 0.006 0.716 ± 0.006 0.707 ± 0.002 0.746 ± 0.005 0.799 ± 0.004 0.766 ± 0.004 0.667 ± 0.003 0.749 ± 0.009

(2) Systematic Binding Ranking

Systematic Binding Ranking (SBR) defines a multi-label classification task in which the model ranks RNA species by their binding preferences across diverse molecular targets. The benchmark comprises three SELEX-derived datasets—DAse, TARDBP, and ISLETS—spanning small molecules, proteins, and (multi)cellular systems.

Model (Module) DAse TARDBP ISLETS
F1 F1 F1
One-hot 0.614 ± 0.005 0.461 ± 0.010 0.415 ± 0.003
Dense 0.622 ± 0.001 0.472 ± 0.003 0.412 ± 0.002
DNABERT2 (MLP) 0.629 ± 0.003 0.425 ± 0.005 0.403 ± 0.002
HyenaDNA (MLP) 0.522 ± 0.004 0.376 ± 0.003 0.317 ± 0.015
NT (MLP) 0.599 ± 0.010 0.391 ± 0.001 0.360 ± 0.035
RNABERT (MLP) 0.528 ± 0.006 0.307 ± 0.006 0.309 ± 0.007
RNA-FM (MLP) 0.605 ± 0.006 0.396 ± 0.006 0.369 ± 0.012
3UTRBERT (MLP) 0.611 ± 0.007 0.401 ± 0.005 0.397 ± 0.005
SpliceBERT (MLP) 0.620 ± 0.012 0.396 ± 0.007 0.367 ± 0.012
UTR-LM (MLP) 0.536 ± 0.015 0.366 ± 0.002 0.299 ± 0.031
RiNALMo (MLP) 0.571 ± 0.002 0.373 ± 0.004 0.332 ± 0.004
DNABERT2 (CNN) 0.640 ± 0.002 0.456 ± 0.001 0.414 ± 0.002
HyenaDNA (CNN) 0.597 ± 0.006 0.438 ± 0.004 0.390 ± 0.007
NT (CNN) 0.622 ± 0.012 0.448 ± 0.006 0.396 ± 0.002
RNABERT (CNN) 0.538 ± 0.002 0.414 ± 0.008 0.341 ± 0.007
RNA-FM (CNN) 0.640 ± 0.016 0.460 ± 0.007 0.410 ± 0.004
3UTRBERT (CNN) 0.642 ± 0.002 0.462 ± 0.007 0.403 ± 0.003
SpliceBERT (CNN) 0.639 ± 0.006 0.466 ± 0.003 0.414 ± 0.003
UTR-LM (CNN) 0.571 ± 0.019 0.432 ± 0.004 0.372 ± 0.012
RiNALMo (CNN) 0.595 ± 0.008 0.441 ± 0.003 0.370 ± 0.009
DNABERT2 (ResNet) 0.617 ± 0.009 0.438 ± 0.002 0.401 ± 0.006
HyenaDNA (ResNet) 0.623 ± 0.005 0.471 ± 0.003 0.411 ± 0.005
NT (ResNet) 0.638 ± 0.013 0.454 ± 0.004 0.413 ± 0.003
RNABERT (ResNet) 0.599 ± 0.003 0.461 ± 0.003 0.404 ± 0.002
RNA-FM (ResNet) 0.634 ± 0.004 0.460 ± 0.008 0.409 ± 0.001
3UTRBERT (ResNet) 0.624 ± 0.009 0.468 ± 0.008 0.414 ± 0.002
SpliceBERT (ResNet) 0.617 ± 0.013 0.456 ± 0.002 0.406 ± 0.003
UTR-LM (ResNet) 0.597 ± 0.005 0.446 ± 0.004 0.372 ± 0.011
RiNALMo (ResNet) 0.564 ± 0.031 0.448 ± 0.004 0.381 ± 0.005

(3) Binding Affinity Prediction

Binding Affinity Prediction (BAP) is a regression task to quantify the binding strength of RNA sequences and to probe how nucleotide mutations modulate interaction affinity. The benchmark comprises two HiTS-RAP datasets—GFP and NELF—which measure affinities for mutagenized aptamers. Models train on wild-type and single-mutant examples and are evaluated on double mutants to rigorously assess their ability to capture mutational effects on binding.

Model (Module) GFP NELF
Spearman ρ Spearman ρ
One-hot 0.215 ± 0.008 0.668 ± 0.130
Dense 0.138 ± 0.041 0.388 ± 0.059
DNABERT2 (MLP) -0.124 ± 0.044 0.148 ± 0.010
HyenaDNA (MLP) 0.092 ± 0.063 0.140 ± 0.005
NT (MLP) 0.139 ± 0.050 0.314 ± 0.004
RNABERT (MLP) -0.049 ± 0.036 0.129 ± 0.005
RNA-FM (MLP) 0.073 ± 0.006 0.331 ± 0.009
3UTRBERT (MLP) -0.043 ± 0.005 0.331 ± 0.009
SpliceBERT (MLP) 0.079 ± 0.010 0.298 ± 0.003
UTR-LM (MLP) 0.028 ± 0.087 0.134 ± 0.072
RiNALMo (MLP) 0.076 ± 0.070 0.157 ± 0.018
DNABERT2 (CNN) -0.030 ± 0.002 0.241 ± 0.021
HyenaDNA (CNN) 0.139 ± 0.011 0.181 ± 0.009
NT (CNN) 0.138 ± 0.017 0.259 ± 0.007
RNABERT (CNN) -0.016 ± 0.003 0.135 ± 0.003
RNA-FM (CNN) 0.224 ± 0.073 0.384 ± 0.002
3UTRBERT (CNN) 0.207 ± 0.184 0.194 ± 0.008
SpliceBERT (CNN) 0.107 ± 0.011 0.340 ± 0.007
UTR-LM (CNN) 0.037 ± 0.041 0.244 ± 0.069
RiNALMo (CNN) 0.067 ± 0.061 0.161 ± 0.011
DNABERT2 (ResNet) 0.267 ± 0.014 0.580 ± 0.029
HyenaDNA (ResNet) 0.376 ± 0.032 0.625 ± 0.143
NT (ResNet) 0.334 ± 0.009 0.626 ± 0.015
RNABERT (ResNet) 0.029 ± 0.165 0.677 ± 0.052
RNA-FM (ResNet) 0.407 ± 0.046 0.646 ± 0.059
3UTRBERT (ResNet) 0.262 ± 0.165 0.498 ± 0.213
SpliceBERT (ResNet) 0.162 ± 0.071 0.510 ± 0.040
UTR-LM (ResNet) 0.233 ± 0.098 0.671 ± 0.050
RiNALMo (ResNet) 0.109 ± 0.093 0.256 ± 0.096
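
The BAP protocol (train on wild type and single mutants, evaluate on double mutants, score with Spearman correlation) can be sketched as below. The sequences, affinities, and predictions are hypothetical, the helper names are illustrative, and counting mutations by Hamming distance to the wild type assumes substitution-only variants of equal length, which the source does not state.

```python
import math

def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def split_by_mutation_count(wild_type, variants):
    """Train on wild type and single mutants; test on double mutants."""
    train = [(s, y) for s, y in variants if hamming(wild_type, s) <= 1]
    test = [(s, y) for s, y in variants if hamming(wild_type, s) == 2]
    return train, test

def average_ranks(values):
    """1-based ranks, with ties sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical aptamer variants with made-up affinities.
wild_type = "GGAUCC"
variants = [("GGAUCC", 1.0), ("GGAUCA", 0.8), ("GAAUCA", 0.3),
            ("GGCUCG", 0.5), ("CCAUGG", 0.1)]
train, test = split_by_mutation_count(wild_type, variants)
predictions = [0.2, 0.6]  # hypothetical model outputs for the test set
rho = spearman([y for _, y in test], predictions)
print(len(train), len(test), rho)
```

Because Spearman correlation depends only on rank order, a model can score well here without predicting affinities on the right scale.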

RNA Function Tasks Performance

Keypoints: Functional RNA demands both a conserved structure and a specific environment. This panel outlines tasks for inferring RNA functions under different spatiotemporal conditions from 1D sequence across pre-mRNA, mRNA, and ncRNA. Together, these tasks establish a sub-framework for understanding the evolutionary function diversity of RNA across conservation levels and contexts.

Pre-mRNA-related Function Tasks

Pre-mRNA-related function tasks include three subtasks:

(1) Splicing Site Prediction (SPS) comprises two binary classification tasks: determining whether a sequence contains a splice donor site, and whether it contains a splice acceptor site. Splicing is central to pre-mRNA processing: it removes non-coding regions and joins coding regions by recognizing donor and acceptor sites.

(2) Splicing Event Prediction (SPE) is a multi-class classification task that maps an RNA sequence \(x\) to one of four splicing event labels (ES, AA, AD, IR), revealing the diversity of splicing mechanisms.

(3) Polyadenylation Signal Prediction (PAS) is a binary classification task that predicts the presence of the polyadenylation signal—a hexamer motif upstream of the RNA 3′-end cleavage site—critical for mRNA maturation.
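
To illustrate what the PAS label refers to, the sketch below scans a window upstream of a cleavage site for a polyadenylation hexamer. This is a motif lookup for illustration only, not how the benchmarked models make predictions. The motif set (the canonical AAUAAA and the common AUUAAA variant), the 40-nt window, and the example sequence are assumptions of this example, not details taken from the benchmark datasets.

```python
# Canonical hexamer and one common variant; an illustrative, non-exhaustive set.
PAS_HEXAMERS = ("AAUAAA", "AUUAAA")

def find_pas(sequence, cleavage_site, window=40):
    """Return start indices of PAS hexamers ending at or before the cleavage site.

    `window` (how far upstream to search) is a hypothetical parameter.
    DNA-style input is accepted by converting T to U.
    """
    rna = sequence.upper().replace("T", "U")
    start = max(0, cleavage_site - window)
    hits = []
    # The hexamer occupies [i, i + 6), so i can be at most cleavage_site - 6.
    for i in range(start, cleavage_site - 5):
        if rna[i:i + 6] in PAS_HEXAMERS:
            hits.append(i)
    return hits

# Hypothetical 3' end region with a hexamer 15 nt upstream of the cleavage site.
sequence = "GGCC" + "AAUAAA" + "C" * 15 + "GCAUU"
hits = find_pas(sequence, cleavage_site=25)
print(hits)
```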

Model (Module) Splicing Site Splicing Event PAS
Donor_human Donor_ara Donor_rice Acceptor_human Acceptor_ara Acceptor_rice human ara rice human mouse_bl mouse_sp
Accuracy Accuracy Accuracy Accuracy Accuracy Accuracy F1 F1 F1 Accuracy Accuracy Accuracy
One-hot 0.903 ± 0.001 0.893 ± 0.012 0.754 ± 0.018 0.900 ± 0.002 0.746 ± 0.013 0.735 ± 0.017 0.585 ± 0.004 0.310 ± 0.027 0.350 ± 0.032 0.752 ± 0.003 0.633 ± 0.010 0.638 ± 0.012
Dense 0.895 ± 0.002 0.762 ± 0.008 0.738 ± 0.006 0.895 ± 0.001 0.738 ± 0.010 0.720 ± 0.015 0.589 ± 0.016 0.330 ± 0.033 0.380 ± 0.025 0.744 ± 0.001 0.629 ± 0.012 0.633 ± 0.014
DNABERT2 (MLP) 0.808 ± 0.001 0.662 ± 0.013 0.624 ± 0.008 0.799 ± 0.001 0.682 ± 0.022 0.671 ± 0.027 0.582 ± 0.006 0.288 ± 0.014 0.364 ± 0.012 0.744 ± 0.002 0.617 ± 0.002 0.628 ± 0.002
HyenaDNA (MLP) 0.780 ± 0.001 0.665 ± 0.004 0.631 ± 0.004 0.797 ± 0.002 0.692 ± 0.007 0.652 ± 0.010 0.533 ± 0.003 0.291 ± 0.016 0.378 ± 0.014 0.738 ± 0.003 0.641 ± 0.007 0.647 ± 0.006
NT (MLP) 0.776 ± 0.001 0.667 ± 0.007 0.597 ± 0.006 0.775 ± 0.004 0.655 ± 0.001 0.607 ± 0.014 0.535 ± 0.003 0.296 ± 0.020 0.367 ± 0.011 0.738 ± 0.001 0.628 ± 0.006 0.637 ± 0.005
RNABERT (MLP) 0.675 ± 0.003 0.600 ± 0.003 0.587 ± 0.003 0.664 ± 0.001 0.612 ± 0.015 0.573 ± 0.010 0.410 ± 0.011 0.215 ± 0.014 0.301 ± 0.008 0.702 ± 0.002 0.624 ± 0.002 0.633 ± 0.002
RNA-FM (MLP) 0.803 ± 0.000 0.694 ± 0.011 0.669 ± 0.009 0.807 ± 0.003 0.708 ± 0.013 0.680 ± 0.007 0.592 ± 0.004 0.344 ± 0.003 0.402 ± 0.007 0.458 ± 0.001 0.643 ± 0.009 0.655 ± 0.008
3UTRBERT (MLP) 0.814 ± 0.006 0.707 ± 0.002 0.669 ± 0.005 0.814 ± 0.004 0.730 ± 0.002 0.725 ± 0.002 0.599 ± 0.003 0.348 ± 0.015 0.384 ± 0.021 0.765 ± 0.004 0.654 ± 0.008 0.664 ± 0.009
SpliceBERT (MLP) 0.886 ± 0.001 0.802 ± 0.011 0.765 ± 0.017 0.887 ± 0.001 0.843 ± 0.004 0.833 ± 0.005 0.646 ± 0.009 0.367 ± 0.006 0.410 ± 0.017 0.763 ± 0.003 0.606 ± 0.008 0.618 ± 0.008
UTR-LM (MLP) 0.768 ± 0.000 0.632 ± 0.007 0.603 ± 0.006 0.766 ± 0.001 0.625 ± 0.009 0.601 ± 0.006 0.526 ± 0.011 0.263 ± 0.030 0.308 ± 0.028 0.738 ± 0.001 0.642 ± 0.003 0.648 ± 0.003
RiNALMo (MLP) 0.868 ± 0.002 0.780 ± 0.001 0.734 ± 0.003 0.858 ± 0.001 0.779 ± 0.002 0.776 ± 0.002 0.640 ± 0.002 0.353 ± 0.006 0.400 ± 0.010 0.760 ± 0.002 0.635 ± 0.016 0.649 ± 0.019
DNABERT2 (CNN) 0.861 ± 0.004 0.726 ± 0.009 0.710 ± 0.015 0.851 ± 0.004 0.725 ± 0.017 0.748 ± 0.003 0.619 ± 0.003 0.306 ± 0.025 0.359 ± 0.020 0.753 ± 0.007 0.610 ± 0.009 0.618 ± 0.009
HyenaDNA (CNN) 0.954 ± 0.001 0.893 ± 0.004 0.890 ± 0.007 0.946 ± 0.001 0.854 ± 0.009 0.855 ± 0.005 0.663 ± 0.010 0.417 ± 0.020 0.464 ± 0.020 0.760 ± 0.010 0.678 ± 0.015 0.682 ± 0.014
NT (CNN) 0.925 ± 0.005 0.837 ± 0.009 0.819 ± 0.014 0.902 ± 0.001 0.796 ± 0.009 0.777 ± 0.033 0.570 ± 0.011 0.346 ± 0.011 0.384 ± 0.006 0.471 ± 0.002 0.632 ± 0.007 0.640 ± 0.006
RNABERT (CNN) 0.631 ± 0.001 0.844 ± 0.006 0.821 ± 0.011 0.904 ± 0.003 0.799 ± 0.009 0.778 ± 0.003 0.511 ± 0.011 0.295 ± 0.012 0.345 ± 0.021 0.727 ± 0.002 0.640 ± 0.003 0.645 ± 0.002
RNA-FM (CNN) 0.958 ± 0.001 0.891 ± 0.002 0.891 ± 0.003 0.945 ± 0.001 0.845 ± 0.002 0.848 ± 0.004 0.717 ± 0.046 0.448 ± 0.017 0.499 ± 0.018 0.773 ± 0.001 0.672 ± 0.015 0.681 ± 0.013
3UTRBERT (CNN) 0.957 ± 0.001 0.889 ± 0.002 0.883 ± 0.010 0.947 ± 0.002 0.841 ± 0.004 0.846 ± 0.009 0.713 ± 0.004 0.440 ± 0.017 0.469 ± 0.011 0.779 ± 0.002 0.639 ± 0.011 0.648 ± 0.012
SpliceBERT (CNN) 0.957 ± 0.000 0.906 ± 0.007 0.894 ± 0.018 0.941 ± 0.001 0.890 ± 0.009 0.892 ± 0.011 0.760 ± 0.004 0.495 ± 0.014 0.535 ± 0.014 0.778 ± 0.005 0.630 ± 0.014 0.638 ± 0.012
UTR-LM (CNN) 0.925 ± 0.001 0.797 ± 0.002 0.800 ± 0.007 0.921 ± 0.001 0.801 ± 0.003 0.798 ± 0.008 0.657 ± 0.010 0.414 ± 0.013 0.447 ± 0.012 0.766 ± 0.006 0.659 ± 0.012 0.663 ± 0.013
RiNALMo (CNN) 0.903 ± 0.002 0.855 ± 0.001 0.831 ± 0.003 0.895 ± 0.002 0.859 ± 0.001 0.850 ± 0.002 0.691 ± 0.002 0.383 ± 0.005 0.450 ± 0.007 0.770 ± 0.004 0.634 ± 0.003 0.647 ± 0.003
DNABERT2 (ResNet) 0.937 ± 0.000 0.866 ± 0.008 0.858 ± 0.010 0.927 ± 0.003 0.853 ± 0.004 0.854 ± 0.006 0.697 ± 0.003 0.366 ± 0.011 0.421 ± 0.011 0.740 ± 0.007 0.616 ± 0.007 0.625 ± 0.010
HyenaDNA (ResNet) 0.964 ± 0.000 0.903 ± 0.009 0.900 ± 0.010 0.954 ± 0.001 0.848 ± 0.013 0.853 ± 0.014 0.621 ± 0.002 0.389 ± 0.041 0.437 ± 0.013 0.781 ± 0.001 0.693 ± 0.007 0.697 ± 0.007
NT (ResNet) 0.954 ± 0.001 0.893 ± 0.012 0.896 ± 0.009 0.937 ± 0.002 0.859 ± 0.002 0.864 ± 0.002 0.797 ± 0.005 0.526 ± 0.012 0.566 ± 0.012 0.741 ± 0.015 0.648 ± 0.008 0.655 ± 0.007
RNABERT (ResNet) 0.965 ± 0.001 0.908 ± 0.005 0.901 ± 0.006 0.852 ± 0.001 0.856 ± 0.011 0.843 ± 0.009 0.725 ± 0.001 0.493 ± 0.026 0.538 ± 0.026 0.738 ± 0.005 0.636 ± 0.007 0.641 ± 0.006
RNA-FM (ResNet) 0.967 ± 0.000 0.913 ± 0.004 0.917 ± 0.005 0.955 ± 0.001 0.859 ± 0.003 0.869 ± 0.004 0.768 ± 0.005 0.511 ± 0.026 0.550 ± 0.033 0.776 ± 0.002 0.671 ± 0.013 0.679 ± 0.013
3UTRBERT (ResNet) 0.967 ± 0.001 0.912 ± 0.007 0.919 ± 0.004 0.954 ± 0.001 0.869 ± 0.004 0.876 ± 0.003 0.727 ± 0.006 0.476 ± 0.013 0.514 ± 0.011 0.775 ± 0.003 0.656 ± 0.016 0.663 ± 0.015
SpliceBERT (ResNet) 0.965 ± 0.001 0.935 ± 0.001 0.940 ± 0.003 0.950 ± 0.001 0.898 ± 0.002 0.904 ± 0.005 0.792 ± 0.005 0.514 ± 0.011 0.562 ± 0.012 0.775 ± 0.002 0.628 ± 0.011 0.641 ± 0.013
UTR-LM (ResNet) 0.945 ± 0.001 0.861 ± 0.006 0.854 ± 0.009 0.941 ± 0.001 0.826 ± 0.007 0.829 ± 0.004 0.678 ± 0.005 0.436 ± 0.004 0.471 ± 0.006 0.766 ± 0.004 0.654 ± 0.011 0.657 ± 0.011
RiNALMo (ResNet) 0.931 ± 0.002 0.871 ± 0.004 0.854 ± 0.003 0.928 ± 0.002 0.864 ± 0.007 0.857 ± 0.015 0.711 ± 0.003 0.453 ± 0.015 0.502 ± 0.011 0.769 ± 0.002 0.661 ± 0.010 0.676 ± 0.012

mRNA-related Function Tasks

mRNA-related function tasks include three subtasks:

(1) Coding Potential Prediction (CPP) is a binary-classification task that distinguishes coding RNAs from non-coding ones, aiming to improve transcript coding potential for mRNA-based therapeutics.

(2) mRNA Subcellular Localization (mSL) is a multi-label classification task that assigns each mRNA sequence one or more subcellular localization labels, predicting spatial distribution patterns that modulate protein synthesis and cellular function.

(3) Ribosome Loading Prediction (RLP) is a regression task that estimates the Mean Ribosome Load (MRL) of a given 5′ untranslated region (5′UTR), quantifying ribosome occupancy to optimize translational efficiency.
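
As a rough illustration of how the two classification subtasks above could be scored, the sketch below computes accuracy for a binary task such as CPP and an F1 score for a multi-label task such as mSL. The function names, the toy labels, and the choice of micro-averaging are assumptions made for this example; the benchmark only states "Accuracy" and "F1" and may use a different averaging scheme.

```python
from typing import Sequence, Set


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of sequences whose predicted class matches the label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 over all (sequence, label) pairs.

    Micro-averaging is an assumption here, not something the source states.
    """
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # labels found
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # spurious labels
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # missed labels
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels, invented for illustration only.
cpp_true = [1, 0, 1, 1]          # 1 = coding, 0 = non-coding
cpp_pred = [1, 0, 0, 1]
msl_true = [{"nucleus"}, {"cytosol", "ribosome"}, {"cytosol"}]
msl_pred = [{"nucleus"}, {"cytosol"}, {"cytosol", "nucleus"}]

print(accuracy(cpp_true, cpp_pred))
print(micro_f1(msl_true, msl_pred))
```

Representing each mSL prediction as a set of labels keeps the multi-label nature explicit: a sequence can be partly right (one localization found, another missed), which accuracy on a single class cannot express.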

Model (Module) Coding Potential mRNA SL Ribosome Loading
Accuracy (human) Accuracy (mouse) Accuracy (zebrafish) Accuracy (fruit_fly) Accuracy (s_cerevisiae) F1
One-hot 0.945 ± 0.001 0.924 ± 0.002 0.900 ± 0.005 0.924 ± 0.009 0.721 ± 0.017 0.583 ± 0.010 0.648 ± 0.012
Dense 0.946 ± 0.001 0.922 ± 0.002 0.900 ± 0.007 0.872 ± 0.074 0.713 ± 0.012 0.588 ± 0.010 0.641 ± 0.027
DNABERT2 (MLP) 0.942 ± 0.001 0.940 ± 0.001 0.895 ± 0.004 0.873 ± 0.007 0.774 ± 0.007 0.516 ± 0.014 0.284 ± 0.019
HyenaDNA (MLP) 0.896 ± 0.004 0.872 ± 0.011 0.886 ± 0.005 0.897 ± 0.000 0.606 ± 0.005 0.494 ± 0.010 0.154 ± 0.019
NT (MLP) 0.895 ± 0.003 0.877 ± 0.003 0.874 ± 0.004 0.914 ± 0.001 0.694 ± 0.006 0.476 ± 0.003 0.113 ± 0.008
RNABERT (MLP) 0.726 ± 0.003 0.744 ± 0.001 0.640 ± 0.009 0.482 ± 0.014 0.165 ± 0.011 0.425 ± 0.000 -0.036 ± 0.010
RNA-FM (MLP) 0.941 ± 0.002 0.942 ± 0.001 0.908 ± 0.007 0.912 ± 0.018 0.779 ± 0.018 0.520 ± 0.011 0.169 ± 0.009
3UTRBERT (MLP) 0.905 ± 0.002 0.918 ± 0.003 0.885 ± 0.003 0.744 ± 0.013 0.827 ± 0.016 0.516 ± 0.010 0.299 ± 0.021
SpliceBERT (MLP) 0.962 ± 0.000 0.964 ± 0.001 0.928 ± 0.002 0.916 ± 0.012 0.790 ± 0.013 0.560 ± 0.004 0.158 ± 0.005
UTR-LM (MLP) 0.899 ± 0.001 0.891 ± 0.002 0.867 ± 0.008 0.864 ± 0.014 0.672 ± 0.018 0.476 ± 0.005 -0.006 ± 0.019
RiNALMo (MLP) 0.969 ± 0.001 0.976 ± 0.000 0.937 ± 0.004 0.948 ± 0.012 0.713 ± 0.010 0.476 ± 0.014 0.230 ± 0.003
DNABERT2 (CNN) 0.955 ± 0.001 0.939 ± 0.001 0.899 ± 0.001 0.885 ± 0.007 0.759 ± 0.001 0.530 ± 0.008 0.335 ± 0.014
HyenaDNA (CNN) 0.927 ± 0.003 0.905 ± 0.004 0.875 ± 0.002 0.911 ± 0.002 0.677 ± 0.008 0.492 ± 0.005 0.453 ± 0.008
NT (CNN) 0.911 ± 0.002 0.880 ± 0.001 0.892 ± 0.002 0.905 ± 0.005 0.719 ± 0.024 0.529 ± 0.015 0.285 ± 0.009
RNABERT (CNN) 0.784 ± 0.005 0.780 ± 0.003 0.752 ± 0.003 0.834 ± 0.003 0.538 ± 0.014 0.425 ± 0.000 0.407 ± 0.022
RNA-FM (CNN) 0.952 ± 0.001 0.948 ± 0.003 0.912 ± 0.001 0.888 ± 0.023 0.713 ± 0.036 0.544 ± 0.009 0.488 ± 0.008
3UTRBERT (CNN) 0.925 ± 0.003 0.921 ± 0.002 0.901 ± 0.003 0.819 ± 0.020 0.807 ± 0.009 0.569 ± 0.029 0.481 ± 0.003
SpliceBERT (CNN) 0.967 ± 0.001 0.964 ± 0.001 0.924 ± 0.004 0.872 ± 0.012 0.719 ± 0.041 0.567 ± 0.021 0.461 ± 0.022
UTR-LM (CNN) 0.953 ± 0.002 0.946 ± 0.001 0.911 ± 0.005 0.869 ± 0.008 0.748 ± 0.026 0.509 ± 0.016 0.401 ± 0.019
RiNALMo (CNN) 0.928 ± 0.003 0.913 ± 0.000 0.906 ± 0.001 0.925 ± 0.002 0.669 ± 0.039 0.594 ± 0.010 0.501 ± 0.004
DNABERT2 (ResNet) 0.941 ± 0.003 0.922 ± 0.005 0.887 ± 0.007 0.895 ± 0.008 0.732 ± 0.014 0.597 ± 0.010 0.427 ± 0.010
HyenaDNA (ResNet) 0.935 ± 0.001 0.908 ± 0.003 0.899 ± 0.005 0.914 ± 0.004 0.643 ± 0.003 0.593 ± 0.012 0.637 ± 0.012
NT (ResNet) 0.919 ± 0.001 0.888 ± 0.002 0.884 ± 0.003 0.913 ± 0.005 0.637 ± 0.007 0.598 ± 0.007 0.419 ± 0.015
RNABERT (ResNet) 0.854 ± 0.003 0.832 ± 0.005 0.790 ± 0.010 0.834 ± 0.009 0.637 ± 0.038 0.523 ± 0.018 0.630 ± 0.011
RNA-FM (ResNet) 0.955 ± 0.001 0.948 ± 0.001 0.911 ± 0.002 0.871 ± 0.037 0.704 ± 0.037 0.591 ± 0.007 0.644 ± 0.023
3UTRBERT (ResNet) 0.916 ± 0.002 0.916 ± 0.001 0.889 ± 0.006 0.840 ± 0.004 0.817 ± 0.014 0.587 ± 0.004 0.608 ± 0.001
SpliceBERT (ResNet) 0.968 ± 0.001 0.965 ± 0.001 0.931 ± 0.007 0.914 ± 0.011 0.737 ± 0.028 0.607 ± 0.022 0.635 ± 0.008
UTR-LM (ResNet) 0.954 ± 0.001 0.948 ± 0.000 0.920 ± 0.004 0.911 ± 0.019 0.745 ± 0.010 0.608 ± 0.004 0.647 ± 0.011
RiNALMo (ResNet) 0.972 ± 0.001 0.977 ± 0.001 0.932 ± 0.000 0.972 ± 0.002 0.843 ± 0.000 0.604 ± 0.004 0.643 ± 0.014

ncRNA-related Function Tasks

ncRNA-related function tasks include three subtasks:

(1) ncRNA Category Classification (NCC) is a multi-label classification task that assigns each ncRNA sequence to one or more of 13 predefined categories, capturing conserved sequence patterns to elucidate non-coding regulatory roles.

(2) microRNA Subcellular Localization (miSL) is a multi-label classification task that predicts the subcellular localization of microRNAs, providing insights into their functional contexts and aiding therapeutic design.

(3) gRNA Efficiency Prediction (gEP) is a regression task that estimates the cleavage efficiency of guide RNAs in the CRISPR–Cas9 system, enabling optimization of on-target genome editing performance.

Model (Module) ncRNA Category miRNA SL gRNA Efficiency
Accuracy subACC HLoss Average of 6 datasets mESC (Koike-Yusa) HEL (Labuhn) A375 (Shalem) HEK293T (Xi Xiang) Zebrafish (Gagnon) Zebrafish (Shkumatava)
Spearman ρ Spearman ρ Spearman ρ Spearman ρ Spearman ρ Spearman ρ
One-hot 0.947 ± 0.005 0.299 ± 0.017 0.308 ± 0.001 0.252 0.262 ± 0.028 0.095 ± 0.011 0.202 ± 0.016 0.457 ± 0.043 0.307 ± 0.033 0.191 ± 0.017
Dense 0.950 ± 0.008 0.322 ± 0.000 0.307 ± 0.000 0.254 0.294 ± 0.026 0.105 ± 0.006 0.207 ± 0.020 0.488 ± 0.011 0.264 ± 0.029 0.166 ± 0.022
DNABERT2 (MLP) 0.839 ± 0.005 0.325 ± 0.022 0.295 ± 0.028 0.130 0.149 ± 0.010 0.019 ± 0.014 0.098 ± 0.016 0.207 ± 0.016 0.206 ± 0.020 0.101 ± 0.024
HyenaDNA (MLP) 0.676 ± 0.017 0.322 ± 0.000 0.307 ± 0.000 0.178 0.140 ± 0.005 0.080 ± 0.008 0.143 ± 0.002 0.324 ± 0.016 0.239 ± 0.022 0.143 ± 0.008
NT (MLP) 0.727 ± 0.009 0.322 ± 0.000 0.307 ± 0.000 0.171 0.146 ± 0.006 0.025 ± 0.004 0.118 ± 0.005 0.325 ± 0.013 0.210 ± 0.011 0.201 ± 0.010
RNABERT (MLP) 0.523 ± 0.007 0.322 ± 0.000 0.307 ± 0.000 0.129 0.115 ± 0.005 0.060 ± 0.003 0.032 ± 0.002 0.278 ± 0.003 0.218 ± 0.007 0.071 ± 0.023
RNA-FM (MLP) 0.965 ± 0.001 0.333 ± 0.008 0.295 ± 0.004 0.200 0.193 ± 0.002 -0.015 ± 0.001 0.146 ± 0.003 0.386 ± 0.007 0.323 ± 0.012 0.168 ± 0.013
3UTRBERT (MLP) 0.829 ± 0.003 0.302 ± 0.004 0.308 ± 0.003 0.157 0.163 ± 0.006 0.048 ± 0.001 0.111 ± 0.006 0.279 ± 0.038 0.206 ± 0.037 0.134 ± 0.010
SpliceBERT (MLP) 0.849 ± 0.008 0.325 ± 0.004 0.300 ± 0.003 0.134 0.142 ± 0.015 0.017 ± 0.012 0.081 ± 0.019 0.233 ± 0.069 0.183 ± 0.039 0.151 ± 0.035
UTR-LM (MLP) 0.576 ± 0.009 0.322 ± 0.000 0.307 ± 0.000 0.167 0.156 ± 0.011 0.014 ± 0.006 0.140 ± 0.003 0.294 ± 0.006 0.267 ± 0.020 0.132 ± 0.011
RiNALMo (MLP) 0.957 ± 0.015 0.328 ± 0.005 0.278 ± 0.006 0.184 0.163 ± 0.005 0.022 ± 0.026 0.127 ± 0.021 0.324 ± 0.029 0.284 ± 0.001 0.183 ± 0.030
DNABERT2 (CNN) 0.900 ± 0.007 0.356 ± 0.012 0.264 ± 0.003 0.137 0.145 ± 0.006 0.065 ± 0.010 0.084 ± 0.010 0.233 ± 0.015 0.212 ± 0.009 0.094 ± 0.032
HyenaDNA (CNN) 0.804 ± 0.018 0.322 ± 0.000 0.307 ± 0.000 0.228 0.218 ± 0.007 0.089 ± 0.004 0.174 ± 0.008 0.392 ± 0.007 0.300 ± 0.007 0.196 ± 0.012
NT (CNN) 0.895 ± 0.022 0.322 ± 0.000 0.307 ± 0.000 0.249 0.241 ± 0.004 0.067 ± 0.012 0.232 ± 0.006 0.485 ± 0.007 0.208 ± 0.014 0.264 ± 0.001
RNABERT (CNN) 0.715 ± 0.009 0.322 ± 0.000 0.307 ± 0.000 0.195 0.222 ± 0.011 0.141 ± 0.009 0.119 ± 0.003 0.395 ± 0.012 0.201 ± 0.012 0.091 ± 0.007
RNA-FM (CNN) 0.967 ± 0.003 0.342 ± 0.020 0.268 ± 0.007 0.256 0.253 ± 0.008 0.072 ± 0.005 0.208 ± 0.014 0.519 ± 0.015 0.293 ± 0.024 0.193 ± 0.020
3UTRBERT (CNN) 0.918 ± 0.003 0.311 ± 0.011 0.301 ± 0.005 0.175 0.207 ± 0.014 0.033 ± 0.003 0.130 ± 0.023 0.316 ± 0.035 0.236 ± 0.014 0.128 ± 0.012
SpliceBERT (CNN) 0.915 ± 0.006 0.322 ± 0.000 0.307 ± 0.000 0.187 0.198 ± 0.004 0.041 ± 0.016 0.102 ± 0.012 0.343 ± 0.021 0.274 ± 0.026 0.166 ± 0.034
UTR-LM (CNN) 0.909 ± 0.004 0.322 ± 0.000 0.307 ± 0.000 0.209 0.197 ± 0.004 0.070 ± 0.017 0.155 ± 0.002 0.348 ± 0.012 0.296 ± 0.036 0.191 ± 0.023
RiNALMo (CNN) 0.974 ± 0.002 0.325 ± 0.010 0.264 ± 0.010 0.253 0.269 ± 0.023 0.073 ± 0.032 0.216 ± 0.013 0.480 ± 0.023 0.302 ± 0.016 0.180 ± 0.010
DNABERT2 (ResNet) 0.883 ± 0.011 0.299 ± 0.062 0.294 ± 0.031 0.103 0.131 ± 0.011 0.043 ± 0.016 0.073 ± 0.020 0.180 ± 0.007 0.131 ± 0.034 0.060 ± 0.053
HyenaDNA (ResNet) 0.941 ± 0.008 0.302 ± 0.017 0.285 ± 0.017 0.216 0.229 ± 0.043 0.111 ± 0.018 0.180 ± 0.025 0.423 ± 0.076 0.207 ± 0.060 0.146 ± 0.046
NT (ResNet) 0.905 ± 0.010 0.291 ± 0.028 0.309 ± 0.020 0.189 0.179 ± 0.022 0.033 ± 0.045 0.162 ± 0.018 0.382 ± 0.015 0.173 ± 0.089 0.206 ± 0.049
RNABERT (ResNet) 0.947 ± 0.004 0.311 ± 0.016 0.308 ± 0.001 0.164 0.188 ± 0.024 0.102 ± 0.026 0.124 ± 0.019 0.259 ± 0.046 0.172 ± 0.015 0.037 ± 0.021
RNA-FM (ResNet) 0.975 ± 0.002 0.362 ± 0.035 0.263 ± 0.011 0.225 0.229 ± 0.015 0.045 ± 0.024 0.175 ± 0.008 0.440 ± 0.044 0.273 ± 0.040 0.188 ± 0.036
3UTRBERT (ResNet) 0.941 ± 0.002 0.285 ± 0.026 0.302 ± 0.020 0.166 0.169 ± 0.022 0.039 ± 0.007 0.126 ± 0.007 0.319 ± 0.014 0.230 ± 0.033 0.110 ± 0.026
SpliceBERT (ResNet) 0.945 ± 0.006 0.308 ± 0.038 0.273 ± 0.011 0.184 0.192 ± 0.033 0.047 ± 0.037 0.131 ± 0.024 0.349 ± 0.061 0.232 ± 0.020 0.153 ± 0.036
UTR-LM (ResNet) 0.958 ± 0.006 0.331 ± 0.009 0.294 ± 0.005 0.246 0.240 ± 0.011 0.074 ± 0.018 0.191 ± 0.010 0.445 ± 0.020 0.345 ± 0.045 0.183 ± 0.030
RiNALMo (ResNet) 0.969 ± 0.002 0.302 ± 0.005 0.266 ± 0.001 0.237 0.233 ± 0.020 0.049 ± 0.049 0.209 ± 0.006 0.468 ± 0.011 0.287 ± 0.017 0.177 ± 0.030
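
The "Average of 6 datasets" column carries no standard deviation and appears to be the unweighted mean of the six per-dataset Spearman means in the same row. The short check below recomputes it for three rows of the table above; the variable and function names are hypothetical, and the reading of the column as a plain mean is an inference from the numbers rather than something the page states.

```python
from statistics import mean

# Per-dataset Spearman means copied from the table above, in column order:
# Koike-Yusa, Labuhn, Shalem, Xi Xiang, Gagnon, Shkumatava.
# The second element is the reported "Average of 6 datasets" value.
rows = {
    "One-hot": ([0.262, 0.095, 0.202, 0.457, 0.307, 0.191], 0.252),
    "Dense": ([0.294, 0.105, 0.207, 0.488, 0.264, 0.166], 0.254),
    "RiNALMo (ResNet)": ([0.233, 0.049, 0.209, 0.468, 0.287, 0.177], 0.237),
}


def recomputed_average(per_dataset):
    """Unweighted mean of the per-dataset scores, rounded like the table."""
    return round(mean(per_dataset), 3)


for name, (per_dataset, reported) in rows.items():
    print(name, recomputed_average(per_dataset), reported)
```

Because the average weights all six datasets equally, a model that does well on a single high-correlation dataset such as HEK293T (Xi Xiang) can rank above one that is more consistent across the lower-correlation datasets.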