MicroRNAs (miRNAs) are ∼21 -nucleotide noncoding small RNAs that play an important role in plant biological processes such as development, biotic and abiotic stress response, signal transduction, and protein degradation (Bartel, 2004). In plants, miRNA is transcribed by RNA polymerase II to primary miRNA transcripts called pri-miRNAs (Zhang et al., 2008). The pri-miRNAs are then cleaved to short stem-loop structured precursor miRNAs (pre-miRNAs) by the enzyme Dicer-like 1 (DCL1) and subsequently processed into an miRNA : miRNA* duplex (miRNA* is a small RNA on the opposite arm of the miRNA in the hairpin with partial complementarity to the miRNA). Single-stranded mature miRNA is then assembled into the RNA-induced silencing complex (RISC) and negatively regulates gene expression at the post-transcriptional level by targeting mRNA cleavage or inhibiting mRNA translation (Bartel, 2004; Zhang et al., 2008; Voinnet, 2009).
Currently, two strategies, including experimental and computational approaches, are widely used in miRNA identification. With regard to the experimental approaches, miRNA discovery can be achieved by direct cloning or deep sequencing from a small RNA library. Although both conserved and novel miRNAs can be identified with this approach, it was found to be tedious and expensive (Sunkar et al., 2005; Barrera-Figueroa et al., 2011). By contrast, the computer-based miRNA identification strategy employs a homology search, based on the principle that miRNAs are evolutionarily conserved (Zhang et al., 2006). After searching the public databases against the known miRNAs of the model plants or closely related species, miRNA homologs were obtained and predicted miRNAs were then verified on the basis of experimental validation (Zhang et al., 2008). The computational approaches are currently more popular in miRNA identification because they are fast, inexpensive, and effective. Thus, based on genomic survey sequence (GSS) and expressed sequence tag (EST) sequences in public data sets, conserved miRNAs in plants were identified, including Glycine max (L.) Merr. (Zhang et al., 2008), Nicotiana tabacum L. (Frazier et al., 2010), Solanum tuberosum L. (Xie et al., 2011), Brassica oleracea L. (Wang et al., 2012), cucurbit species (Hu et al., 2014), and other species.
Sacred lotus (Nelumbo nucifera Gaertn.) is a perennial aquatic herbaceous plant of ecological, ornamental, and economic importance (Hu et al., 2012). It is a source of herbal medicine because of strong antipyretic, cooling, astringent, antioxidant, anti-HIV, and demulcent properties. In Asia, N. nucifera has been widely cultivated for more than one thousand years (Shen-Miller et al., 2002). Due to its agricultural and medicinal significance, the whole genome of N. nucifera has been fully sequenced recently (Ming et al., 2013; Wang et al., 2013). However, reports of miRNAs and their roles in N. nucifera are limited. It is necessary to uncover miRNAs and target genes in N. nucifera. In this study, we aimed to identify miRNAs of N. nucifera and their target genes using homology-based computational approaches. To achieve this goal, we first performed a comparison between ESTs of N. nucifera and all known plant miRNA sequences. Then, a number of the putative miRNAs of N. nucifera were confirmed by expression analysis. Their relative expression level difference has been measured by quantitative real-time PCR (qRT-PCR) using miRNA-specific stem-loop RT and qRT-PCR primers (Varkonyi-Gasic et al., 2007).
MATERIALS AND METHODS
Sequence database and reference miRNAs—A total of 8316 known plant miRNAs were downloaded from the miRBase database ( http://www.mirbase.org/; release 20, November 2013) (Kozomara and Griffiths-Jones, 2011) and used as a reference miRNA data set for identifying conserved miRNAs in N. nucifera. After removing the repeats, 2867 unique sequences were selected as the reference set. These miRNAs came from 24 plant species, including Aquilegia caerulea E. James, Arabidopsis lyrata (L.) O'Kane & Al-Shehbaz, A. thaliana (L.) Heynh., Brachypodium distachyon (L.) P. Beauv., Brassica napus L., B. rapa L., Camellia sinensis (L.) Kuntze, Glycine max, Gossypium hirsutum L., G. raimondii Ulbr., Hordeum vulgare L., Manihot esculenta Crantz, Medicago truncatula Gaertn., Nicotiana tabacum, Oryza sativa L., Populus trichocarpa Torr. & A. Gray, Prunus persica (L.) Batsch, Ricinus communis L., Solanum lycopersicum L., S. tuberosum, Sorghum bicolor (L.) Moench, Triticum aestivum L., Vitis vinifera L., and Zea mays L.
The 58,443 assembly genome contig sequences of the N. nucifera variety ‘China Antique lotus’ from both the National Center for Biotechnology Information (NCBI) database (accession: AQOG00000000.1) and RNA-seq data were used for miRNA mining (Ming et al., 2013). RNA-seq contigs_were generated for RNA-seq data from NCBI's Short Read Archive (SRA) raw data (accession: SRX266474, SRX266489, SRX268456, and SRX265003) using the de novo assembly method from Trinity software (v2.0.2; Grabherr et al., 2011). The parameters used for assembly were as described (–CPU 12–kmer_method jellyfish 10G; Grabherr et al., 2011).
Identification of conserved miRNA and target prediction—The genomic contig and RNA-seq contig sequences from the N. nucifera variety ‘China Antique lotus’ were used to mine conserved miRNA and their targets (Fig. 1). Computational identification of the conserved miRNA in sacred lotus was as described (Wang et al., 2012; Hu et al., 2014). Briefly, the N. nucifera genomic and RNA-seq contigs were aligned with known mature plant miRNAs using a BLASTn algorithm with an E threshold value of 10 and alignment length between 18 and 24 (Fig. 1). Results with no more than 3 nucleotide (<4 nucleotide) substitutions, including insertions, deletions, mutations, and gaps between known miRNAs and homolog sequences, were obtained from the BLASTn search (Wang et al., 2015). These alignment sequences to obtain the full-length sequences were extended from RNA-seq contig sequences. Protein-coding sequences were removed using BLASTx against NCBI's nonredundant (nr) protein database. The secondary structures of the remaining sequences were predicted using MFOLD 3.2 software (Zuker, 2003). Finally, candidate miRNAs were identified based on the following criteria: (1) substitutions between contig sequences and known miRNA sequences not less than four; (2) minimum length of pre-miRNA = 45 nucleotides; (3) pre-mi RNA can be folded into the perfect stem-loop hairpin secondary structure; (4) the miRNA : miRNA* duplex should have no loops; (5) mismatches in the miRNA : miRNA* duplex should not exceed 6 nucleotides; and (6) the negative minimal folding free energy (MFE) had the lower value and the minimal folding free energy index (MFEI) values of the predicted secondary structures should be higher than 0.85. In addition, psRNATarget was used to predict the targets of identified miRNAs with strict parameters, and the sequences were further compared against the NCBI-nr database for annotation (Dai and Zhao, 2011).
Validation of miRNA and the targets by qRT-PCR—For validation of miRNA and their targets in N. nucifera, young leaves, stems, and flowers of the N. nucifera accession of ‘Baihuajian lotus’ were selected and frozen immediately in liquid nitrogen and then stored at 80°C. Total RNA from each tissue sample was extracted using the RNAiso Reagent Kit (TaKaRa Bio Inc., Otsu, Shiga, Japan). Approximately 2 µg of each sample was reverse transcribed to stem-loop reverse transcription using specific RT primer and PrimeScript Reverse Transcriptase (TaKaRa Bio Inc.) for miRNAs (Varkonyi-Gasic et al., 2007; Quinn et al., 2015). Single-stranded cDNA for miRNA targets was synthesized using Revert Aid First Strand cDNA Synthesis Kit (Thermo Fisher Scientific, Waltham, Massachusetts, USA) according to the manufacturer's instructions. RT-PCR was then performed on an ABI StepOne Real-Time PCR System (Applied Biosystems, Carlsbad, California, USA) using SYBR Premix Ex Taq Kit (TaKaRa Bio Inc.). The melting curve was used to verify that only one specific product had been amplified. All the primers used are listed in Appendix S1 (apps.1500046_s1-s3.xlsx). The relative miRNA expression level was normalized and quantified using a ΔΔCT method. Three replicates were performed for each miRNA sample. Nelumbo nucifera NnEF1a (GI: 226897264) was used as the internal control for miRNA and their targets. Significant differences of the differential expression among different tissues in N. nucifera were evaluated by Student's t test (*P < 0.05; **P < 0.01).
Identification of conserved miRNA and prediction of their targets—A total of 106 miRNAs, belonging to 40 families ( Appendix S2 (apps.1500046_s1-s3.xlsx)), were identified with their secondary hairpin structure in this study. The length of mature miRNAs ranged from 20 to 24 nucleotides, with 21-nucleotide mature miRNAs being most abundant (83/106) (Fig. 2A). The length of pre-miRNAs ranged from 55 to 184 nucleotides, with an average length of 92 nucleotides (Fig. 2B).
Several miRNA families were revealed in N. nucifera, including miR396 (13 loci), miR393 (7 loci), miR169 (6 loci), and miR172 (6 loci) ( Appendix S2 (apps.1500046_s1-s3.xlsx)). Interestingly, 19 pairs of miRNA-5p/miRNA-3p were found to be located at the same precursors in this study (Fig. 3). Of them, the Nnu-miR396 family had the highest number of miRNA-5p/miRNA-3p pairs (Fig. 3D and Appendix S2 (apps.1500046_s1-s3.xlsx)). In particular, we found three pairs of sense and antisense miRNAs belonging to the same miRNA family (miR393) in N. nucifera (Fig. 4). The average value of MFE was −42.73 kcal/mol (range: −73.80 kcal/mol [Nnu-miR159a] to −13.60 kcal/mol [Nnu-miR5227]). The MFEI value of the predicted pre-miRNAs ranged from 0.62 (Nnu-miR2275b and Nnu-miR319a) to 1.51 (Nnu-miR396e-5p and Nnu-miR396e-3p), with an average of 1.01 ( Appendix S2 (apps.1500046_s1-s3.xlsx)). Putative targets of these miRNAs were also predicted by psRNATarget. A total of 847 potential targets were identified for 40 miRNA families by searching for assembled RNA-seq sequences in N. nucifera. After aligning with the NCBI-nr protein database using BLASTx, 456 targets were annotated ( Appendix S3 (apps.1500046_s1-s3.xlsx)).
Expression analysis of miRNA and their targets using qRT-PCR—To confirm the expression of miRNAs in N. nucifera, we selected 10 miRNAs (miRNAs 156, 159, 160, 164, 168, 171,172, 319, 394, and 396) using qRT-PCR (Table 1 and Appendix S1 (apps.1500046_s1-s3.xlsx)). Of them, five miRNAs (miRNAs 156, 159, 160, 168, 171) and their targets were further analyzed to validate the expression level. The 10 miRNAs exhibited differential expression patterns, thus revealing expression differences among N. nucifera leaf, stem, and flower tissues (Figs. 5, 6).
Moreover, the target gene expression patterns of the five selected miRNA demonstrated a negative association with their miRNAs in N. nucifera stem and flower tissue (Fig. 5). For example, the expression level of miR156 was found to be lowest in N. nucifera flower tissue, while the highest expression level of the miR156 target gene (SPL) was observed in the same tissue. Similar results were also found in the four other miRNAs and their target genes (Fig. 5).
Currently, few studies have been reported to identify miRNAs and their targets with experimental validation in N. nucifera. In this study, we identified a total of 106 miRNAs in N. nucifera, belonging to 40 families. Furthermore, three pairs of antisense miR393 were found in N. nucifera. Antisense miRNA is a class of miRNA that is transcribed from both sense and antisense transcripts derived from the same genomic loci; some of the antisense miRNAs were complementary to their sense miRNAs (Fig. 4). This antisense miRNA has been found in plants, such as Brassica oleracea, Cucumis melo L., and Cucumis sativus L. (Wang et al., 2012; Hu et al., 2014). The antisense miRNA potentially targets the genes that differ from their sense partners, although the evolution and function of the antisense miRNA is still unknown.
In this study, all of the 106 identified miRNAs have been predicted to target 456 genes. After annotation, these genes were likely to play essential roles in growth and development as well as in other biological processes. Furthermore, some conserved miRNAs and their targets were validated in N. nucifera by qRT-PCR (Figs. 5, 6). These include:
Nnu-miR156 targets SPL, which were negatively regulated in post-transcriptional expression. It has been well documented that miR156 in plants plays a critical role for vegetative to reproductive-phase transition (Poethig, 2009). In this study, we also indicated that Nnu-miR156-targeted SPLs are involved in flower development in N. nucifera.
Nnu-miR159 targets GAMYB-like transcription factor. The interaction between miR159 families and phytohormones has been demonstrated in various plant responses. In Arabidopsis, miR159 expression is regulated by gibberellin (GA) and abscisic acid (ABA) during anther development (Achard et al., 2004). In strawberry, miR159 families interact with GAMYB in relation to changes in GA content during receptacle development. In ornamental gloxinia (Sinningia speciosa (Lodd.) Hiern), the expression level of miR159 was negatively correlated with GAMYB, which was effectively involved in flowering time control during flower development (Li et al., 2013). Similarly, we observed that Nnu-miR159 may participate in flower development of N. nucifera.
Nnu-miR160 targets the auxin response factor 18-like (ARF18-like) in N. nucifera. In Arabidopsis, miR160 negatively modulates three ARF genes (ARF10/ARF16/ARF17), controlling root cap development and primary and lateral root growth (Wang et al., 2005). Similarly, miR160/ARFs governs root and nodule organogenesis in Medicago truncatula (Bustos-Sanmamed et al., 2013). We have found that Nnu-miR160 potentially targets the ARFs to regulate stem development in N. nucifera.
Nnu-miR168 targets ARGONAUTEI (AGO1) that encodes the RNA slicer enzyme of miRNAs. AGO1 interacts with miR168 to regulate the small RNA pathway in Arabidopsis. It is also reported that miR168 plays crucial roles in many development processes in tomato, including phase transition, leaf epinasty, and fruit development (Xian et al., 2014). In this study, we showed that Nnu-miR168 regulates flower development in N. nucifera.
Nnu-miR171 targets SCARECROW-LIKE 27 (SCL27). In previous studies, miR171-targeted scarecrow-like proteins (SCL6/SCL22/SCL27) have been revealed to negatively regulate chlorophyll biosynthesis (Wang et al., 2010; Curaba et al., 2013). In Arabidopsis, miR171 interacts with scarecrow-like proteins to combine GT cis-elements, modulating gibberellin-regulated chlorophyll biosynthesis under light conditions (Ma et al., 2014). However, in N. nucifera, Nnu-miR171-targeted genes may play a role during stem development.
On the other hand, five of the 10 miRNAs (Nnu-miR164, Nnu-miR172, Nnu-miR394, Nnu-miR396, and Nnu-miR396) have been measured with relative expression levels by qRT-PCR.
miR164 targets NAC, NAM (no apical meristem), ATAF (Arabidopsis transcription activation factor), and CUC (cup-shaped cotyledon) transcription factors. The NAC genes contain a complementary site with miR164, which were negatively regulated by miR164 through mRNA cleavage (Sieber et al., 2007). In Arabidopsis, miR164 is also involved in age-dependent cell death leaves (Kim et al., 2009). In recent reports, the functions of miR164 have been proven to participate in the responses of plants to biotic as well as abiotic stress (Fang et al., 2014).
miR172 targets APETALA2 transcription factor (Jung et al., 2011). miR172 is critical for the control of flowering time and floral morphology. It is reported that over-expression of miRNA172 leads to early flowering and suppresses the floral organ specification in Arabidopsis (Aukerman and Sakai, 2003).
miR394 targets gene LCR (LEAF CURLING RESPONSIVE-NESS) that encodes an F-box protein (SKP1-Cullin/CDC53-F-box). Besides its control of stem cell competence in the Arabidopsis shoot meristem (Knauer et al., 2013), miR394-regulated LCR abundance plays a role in fine-tuning their responses to ABA and abiotic stress (Song et al., 2013).
miR396 targets GROWTH-REGULATING FACTORS (GRFs), a plant-specific family of transcription factors (Hewezi et al., 2012) . Previous studies have demonstrated that miR396-targeted AtGRFs are required for mediating cell differentiation during leaf morphogenesis in Arabidopsis (Mecchia et al, 2013).
miR319 regulates transcription factors of the TCP family, controlling leaf morphogenesis and several other plant developmental processes in Arabidopsis and tomato (Schommer et al., 2012).
The 10 selected conserved miRNAs for qRT-PCR analysis in Nelumbo nucifera.
Conclusions—In this study, using in silico computer-based approaches for identifying miRNAs and their target genes based on a sequence homology search, 106 conserved miRNAs, belonging to 40 families, were identified in N. nucifera. Pre-miRNAs varied from 55 to 184 nucleotides in length, while mature miRNAs ranged from 20 to 24 nucleotides. Four hundred fifty-six potential miRNA targets were also predicted in this study. Negative correlation of the expression levels between five miRNAs (miRNAs 156, 159, 160, 168, and 171) and their target genes were observed in leaves, stems, and flowers of N. nucifera. Our study results can be used as a workbench for further study of N. nucifera miRNAs and those in other plant species.