Identification of lncRNAs in the two rice genotypes
After eliminating low-quality reads, 111,183,429 and 105,132,619 clean reads in FL478 and IR29 rice genotypes remained for downstream analysis, respectively. Approximately, 40 million paired-end clean reads were mapped to the reference genome and assembled, as described in our previous study22. After mapping based on both indica and japonica genome references, the japonica genome reference was selected due to higher percentage mapping23. We performed a strict/novel pipeline to find high-confident lncRNA transcripts (Fig. 1). Firstly, 382,622 (96.15%) and 357,520 (95.59%) protein-encoding transcripts in FL478 and IR29 were respectively removed. Although more transcripts were mapped to the protein coding genes, 15,131 and 16,256 transcripts with class codes “u,” “o,” “i,” and “x,” which are most likely to be noncoding, were respectively selected in FL478 and IR29 cuffcompare output files (Supplementary Table S2). Then, we identified and removed some transcripts overlapping with other non-coding RNAs (including snRNAs, snoRNAs, and microRNAs etc.) in both genotypes (Supplementary Table S3). Subsequently, 256 (FL478) and 237 (IR29) transcripts with the length > 200 bp were nominated as potential long noncoding RNAs by CPC, and their highest and lowest lengths were detected approximately 3000 bp and 600 bp, respectively (Supplementary Table S4). The highest percent of potential lncRNAs was found in chromosome 1 in both genotypes (Supplementary Fig. S2). We also re-evaluated the selected non-coding transcripts through CPC with Pfam and SwissProt NR databases, resulting in excluding 39.84% and 36.70% of them in FL478 and IR29, respectively (Supplementary Table S4). Thus, 132 lncRNAs were found to be more frequent in FL478 than in IR29 (111 lncRNAs), of which 93 were sense lncRNAs, eight were anti-sense lncRNAs, 27 were lincRNAs, and four were intronic lncRNAs in FL478 (Supplementary Fig. S3). Similarly, sense lncRNAs (85) were the most frequent among the identified lncRNAs, while intronic lncRNAs (2) were the least frequent in IR29 (Supplementary Fig. S3).
Identification of differentially expressed lncRNA
Differentially expressed lncRNAs (DE-lncRNA) were identified by comparing samples collected at different situations (control and stress) in the root tissues to inspect the expression patterns of lncRNAs under salt stress conditions in the salt tolerant genotype (FL478) and its susceptible parent (IR29). Overall, only a few DE-lncRNAs were identified, with most of them being sense lncRNAs. A total of four DE-lncRNAs were detected in FL478, among which, two were up-regulated and two were down-regulated under salt conditions. Moreover, a total of nine DE-lncRNAs were found in IR29, including six up-regulated and three down-regulated DE-lncRNAs under salt stress (Table 1, Fig. 2). We further found two novel DE-lncRNAs in IR29 under salt stress (Table 1, Fig. 2). Two and seven out of the total DE-lncRNAs were exclusively expressed in FL478 and IR29, respectively (Table 1, Fig. 2). The comparative analysis of FL478 and IR29 lncRNAs in response to salt stress revealed that two DE-lncRNAs overlapped in both genotypes, although their lengths were shorter in FL478 than in IR29 (Table 1, Fig. 2).
Chromatin accessibility of DE-lncRNAs
The logical necessary condition for the involvement of lncRNAs in the differential expression of target genes is the accessibility of their genomic region. All the 11 DE-lncRNAs were compared to the ATAC-seq tracks in the 6 tissues (Supplementary Fig. S4). These regions in the genome had elevated ATAC-seq levels in all tissues.
To assess the genome-wide accessibility of DE-lncRNAs, the mean ATAC-seq signals across the length of each DE-lncRNAs were calculated (Fig. 3). In addition, to establish the significance of this result, for each DE-lncRNAs 50 regions of equal length with DE-lncRNAs were selected from random genomic positions and the average ATAC-seq signals were measured for them. The average signals for 3 out of 4 DE-lncRNAs in FL478 and 8 out of 9 DE-lncRNAs in IR29 were significantly higher than the random selection (Fig. 3).
Functional prediction of salt stress‐responsive DE-lncRNAs involved in trans-regulation
The potential functions of salt-responsive DE-lncRNAs were explored by detecting the trans-regulatory networks of DE-lncRNAs in each genotype. Overall, four (in FL478) and nine (in IR29) DE-lncRNAs, together with 4670 identified expressed transcripts (ETs) through RNAseq, were subjected to co-expression network analysis. WGCNA analyses identified 49 modules in the two genotypes. We found that lncRNA.2-FL was highly correlated with ETs in the transcriptional module M39 (r2 = 0.96, p < 1e − 04) in FL478 under salinity stress (Fig. 4). Intriguingly, the M39 module including lncRNA.2-FL and 173 ETs was significantly up-regulated in FL478 under salt stress whereas these genes (Supplementary Table S5) were down-regulated in IR29 (Fig. 4). Subsequently, the significantly enriched GO terms of lncRNA.2-FL co-expressed ETs (Supplementary Table S5) in the M39 module were identified using Fisher’s exact test (P < 0.05) revealing that they were mainly enriched for salinity stress-related categories. The GO terms “cell wall organization”, “response to oxidative stress”, “response to chemical stimulus” and “carbohydrate metabolic process” were enriched among biological processes (Supplementary Fig. S5). In the category of molecular functions, “hydrolase activity”, “peroxidase activity”, and “oxidoreductase activity” were indicated as dominant terms (Supplementary Fig. S5). In the cellular component, more transcripts were categorized in the “extracellular region” and “cytoplasm” (Supplementary Fig. S5). To get more insight into the function of lncRNA.2-FL targets, online KEGG automatic annotation server (KAAS) was used for target genes in the M39 module. The results indicated that 47 out of the 173 DETs were classified into 13 KEGG pathways related to “metabolism”, “genetic information processing”, and “signaling and cellular processes” (Supplementary Fig. S6). The term “enzyme” was predominantly enriched including glutathione S-transferase, phosphatase, dehydrogenase, and peroxidase under salt stress compared to normal conditions (Supplementary Fig. S6). In total, our results suggested that the M39 module may have a significant role in the salt tolerance of FL478. Notably, the M39 module was not significant in IR29 under normal and salinity treated conditions (Fig. 4).
It is worthy to note that 33 out of 173 ETs were significantly salt stress responsive (DETs, the Q-value cut-off ≤ 0.05 and − 1 ≥ log2 fold change ≤ 1 were set as the threshold for significantly differential expression) in FL478, of which 17 DETs were specifically expressed in FL478 (Supplementary Table S5).
Functional prediction of salt stress‐responsive DE-lncRNAs involved in cis-regulation
To identify the putative functions of salinity-responsive DE-lncRNAs, the DE-lncRNAs were searched within 100 kb upstream and downstream for protein-encoding genes. Then, the DE-lncRNAs and their neighboring protein-coding genes were subjected to co-expression analysis. In total, seven lncRNA neighbors-protein-coding gene pairs were found to be involved in cis-acting regulation in FL478 (Supplementary Table S6, Fig. 2). Two protein-coding genes with 89 kb and 23 kb distance were found downstream of lncRNA.1; among them, LOC_Os02g16000 (log2 FC = 1.07, q-value = 0.01, and Cor = − 0.80) encoded a gene similar to GAMYB-binding protein (Supplementary Table S6, Fig. 2). Also, LOC_Os06g36910 (log2 FC = − 1.14, q-value = 0.05, and Cor = − 0.97), which is a pentatricopeptide repeat (PPR) protein, was located at 14 kb upstream of lncRNA.2-FL (Supplementary Table S6). Further, we found three protein-coding genes in the neighborhood of lncRNA.4, including LOC_Os04g43070 (log2 FC = − 2.42, q-value = 0.001, and Cor = − 0.94) encoding AMT1-1 located 15 kb downstream of lncRNA.4 whose function is response to abscisic acid, LOC_Os04g43300 (log2 FC = − 1.14, q-value = 0.01, and Cor = − 0.87) encoding BRCA1 located 95 kb upstream of lncRNA.4 whose function is positive regulation of transcription, and LOC_Os04g43200 (log2 FC = 5.55, q-value = 0.001, and Cor = 0.90) encoding peroxygenase (PXG) located 47 kb upstream of lncRNA.4 whose function is calcium-binding peroxygenase (Supplementary Table S6, Fig. 2).
Likewise, we identified five lncRNA–mRNA pairs involved in cis-acting regulation in IR29. LOC_Os02g16030 (log2 FC = − 2.84421, q-value = 0.0012, and Cor = 0.92673) encoding a plasma membrane component, located three bp upstream of lncRNA.1 (Supplementary Table S6). LOC_Os01g41900.1 (log2 FC = 1.38, q-value = 0.001, and Cor = 0.87) encoding the MYB transcription factor was located 8 kb upstream of lncRNA.3-IR (Supplementary Table S6). LOC_Os04g42020 (log2 FC = 1.01357, q-value = 0.003, and Cor = − 0.9186) whose function is zinc binding was spaced 93163 bp downstream of lncRNA.7-IR (Supplementary Table S6). Also, LOC_Os05g45030 (log2 FC = 1.30, q-value = 0.011, and Cor = 0.56) encoding calcium homeostasis regulator (CHoR) was located 81 kb upstream of lncRNA.9-IR (Supplementary Table S6, Fig. 2). Calcium, known as an essential plant element, is related to adaptive responses against environmental stresses42. These results showed that DE-lncRNAs might cis-regulate their neighboring protein-encoding genes’ expression in response to salt stress in both genotypes.
Impact assessment of possible mutations in DE-lncRNAs on cis-regulatory target genes
To find a causal relationship between DE-lncRNAs and their cis-regulatory target genes, we used a recently developed DNN21 to predict the change in the expression levels of the genes as a result of introducing mutations in DE-lncRNAs. We retrained this Enformer network with the ATAC-seq tracks of six tissues in rice33. Further, our trained Enformer model was used to predict the changes in accessibility of the genetic regions as a result of introduction of mutations in the DE-lncRNAs.
The model predicts that only a window of lncRNA.2-FL had a significant prediction peak in ISMS that can affect the TSS of Os06g0565000 (LOC_Os06g36910) (Fig. 5). LOC_Os06g36910 is a PPR-coding gene. PPR proteins are as RNA-binding proteins, which participate in posttranscriptional processes such as RNA editing, splicing, stability, cleavage, degradation, and translation43.
Mining the common TFBS in the cis and trans-regulatory target genes
To get insight into the regulatory mechanisms by which lncRNA.2-FL influence its target genes, point mutation analysis was employed to find TFBSs in the neighborhood of the cis and trans-regulatory target genes of lncRNA.2-FL that significantly affect the TSS of the genes. We found five TFBSs in lncRNA.2-FL that coincided with ISMS prediction peaks to affect the expression of its cis-regulatory target gene (Os06g0565000 (LOC_Os06g36910)) (Fig. 5). These binding sites belong to LBD and ERF TF families (Table 2).
These five TFBSs were also found on the ISMS prediction peaks of the 12 out of 17 specific trans-regulatory target genes of FL478 (Table 3, Supplementary Table S5). Notably, LOC Os01g07480 encoding an LBD TF was spotted to affect “OSAIR12” and “OSAPP1” trans-regulatory target genes (Fig. 6b,c; Table 3). Similar mutagenesis analysis around these genes further support the causal link of this TF in the expression of LOC Os01g07480 (Fig. 6).
Salt responsive lncRNAs as potential targets of rice miRNAs
The crosstalk of DE-lncRNAs and miRNAs was inspected through exploring the DE-lncRNAs regarded as the target pattern of known miRNAs in Oryza sativa. All the DE-lncRNAs in FL478 were identified to act as target patterns of 13 known osa-miRNAs, including osa-miR1427, osa-miR1848, osa-miR2275d, and osa-miR2926 (Supplementary Table S7, Fig. 2). Also, six out of nine DE-lncRNAs in IR29 were recognized as target patterns of 16 known miRNAs, including osa-miR2877, osa-miR1858, and osa-miR1427 (Supplementary Table S7, Fig. 2). The comparative analysis of these miRNAs in the two contrasting genotypes revealed that five and eight miRNAs were exclusively found in FL478 and IR29, respectively (Supplementary Table S7, Supplementary Fig. S7, Fig. 2) whereas eight miRNAs were commonly identified in both genotypes. Among the miRNAs that specifically identified in FL478, the target genes of two miRNAs, including osa-miR2926 and osa-miR2930, were predicted to be associated with lncRNA.2-FL (Supplementary Table S7, Fig. 2). As shown in Supplementary Table S7, multiple interactions were identified between lncRNA.2-FL with osa-miR2926 and osa-miR2930 with many mRNAs. Among them, genes controlling potassium and chloride channels were found, which may have potential roles in salt stress tolerance (Supplementary Table S7). Also, gene encoding glutathione S-transferase was previously predicted as target of osa-miR2275d related to lncRNA.3-FL (down-regulated) (Supplementary Table S7, Fig. 2)44. The interactions between common DE-LncRNAs and miRNAs predicted 5 and 3 known miRNAs target related to lncRNA.1 and lncRNA.4, respectively (Supplementary Table S7). GO analysis of miRNAs target genes associated with lncRNA.1 which was down-regulated in both genotypes suggested that the genes were significantly enriched in catalytic activity term (Supplementary Table S8). Also, many miRNAs target genes were predicted for lncRNA.4; GO analysis of these target genes suggested that heme binding term was most significant term as a probabilistic function of lncRNA.4 (p-value = 0.00096 and FDR = 0.039) (Supplementary Table S8).
Reliability assessment of RNA-seq based inferences through qRT-PCR
To verify analysis of RNA-seq data, 7 DE-lncRNAs were randomly nominated for qRT-PCR. Overall, the qRT-PCR results confirmed the outcome of RNA-seq analysis (Fig. 7). The qRT-PCR data revealed that 5 DE-lncRNAs from IR29 were up-regulated in 24 h after the onset of salt stress, among which lncRNA.3-IR, lncRNA.7-IR, lncRNA.8-IR and lncRNA.4 were highly consistent with the RNA sequencing results (Fig. 7). Similarly, in FL478, the expression levels of two DE-lncRNAs including lncRNA.2-FL and lncRNA.4 were confirmed by qRT-PCR (Fig. 7).