Analysis of Some Important Genes from the Trichomes of Boerhaavia diffusa L. Fruits by RNA Isolation

Trichomes play an important role in many physiological and ecological aspects of plants. ESTs from fruit trichomes are reported here for the first time from this plant. Our previous paper reported morphological, histochemical and ontology studies of the plant's trichomes, while the present paper focuses on ESTs derived from the glandular and non-glandular trichomes on the fruits and on their analysis. A total of 700 ESTs with an average length of 435 bp were sequenced from a trichome cDNA library. Cluster analysis indicated the presence of 52 contigs and 110 singletons, with a transcript redundancy of 54%, which means that at least 44% of the total ESTs might yield some useful genes. About 93 sequences were annotated to only one GO category, and pathway associations were established for 65 sequences in the Kyoto Encyclopedia of Genes and Genomes (KEGG). A selected set of the genes analysed is known to be involved in the pathways of secondary metabolite synthesis. EST analysis revealed some important proteins, saturated and unsaturated lipids, and proteins such as flavon-6-phosphate, chalcone synthase and lipoxygenase, all important for plant protection. This is the first such report from fruit trichomes, which contain many useful secondary compounds. Being an important medicinal plant, B. diffusa has the capacity for chemical synthesis and secretion of natural products, but trichome-specific metabolic pathways and the genes involved in the various trichome developmental stages have remained unknown. Furthermore, only a very limited amount of plant trichome genomics information is available, scattered across databases, so further work should help to increase secondary metabolite content and open new avenues for the pharmaceutical sector.


Introduction
The genus Boerhaavia (Nyctaginaceae) has 40 species, which are widely distributed in Australia, China, Egypt, Pakistan, Sudan, Sri Lanka, South Africa, USA, India and several countries of the Middle East. Boerhaavia diffusa, commonly known as Punarnava (Sanskrit) in India, is an important medicinal herb indigenous to India and is found throughout the warmer parts of the country up to an altitude of 2000 m in the Himalayan region. It grows well on wastelands and in fields after the rainy season. The plant grows in open sun and endures severe abiotic stresses of UV exposure, high temperature, and water and nutrient deficiencies. The plant is also cultivated to some extent for its leaves in West Bengal [1].
Ethno-medicinally, B. diffusa has a long history of therapeutic use in the indigenous tribal communities and in the Ayurvedic system of medicine in India. The root is considered to possess anticonvulsant, antiviral [2], antifibrinolytic [3], antibacterial [4], and hepatoprotective [5] properties. The flowers and seeds are used as a contraceptive [6]. The plant as a whole is used for oedema and ascites. Chemical analyses of the aerial parts and roots have revealed the presence of the alkaloid punarnavine, hypoxanthine-9-L-arabinofuranoside, ursolic acid, boeravinones A-F, and punarnavoside [6][7][8][9].
Functional genomics approaches, which combine computational and expression-based analyses of large amounts of sequence information, are emerging as powerful tools to accelerate the comprehensive understanding of cellular metabolism in specialized tissues or whole organisms. As part of an ongoing effort to identify genes of essential oil (monoterpene) biosynthesis, randomly selected cDNA clones, or expressed sequence tags (ESTs), from a peppermint (Mentha x piperita) oil gland secretory cell cDNA library were sequenced. Bioinformatic analysis indicated that about 25% of the ESTs are involved in essential oil metabolism. An additional 7% of the 120 recognized genes code for proteins involved in transport processes, and a subset of these is likely involved in the secretion of essential oil terpenes from the site of synthesis to the storage cavity of the oil glands. Such integrated approaches represent an essential step toward the development of a metabolic map of oil glands and provide a valuable resource for defining molecular targets for the genetic engineering of essential oil formation [10,11]. Fine mapping of pepper trichome locus 1 resulted in the identification of genes controlling trichome formation in Capsicum annuum L. CM334 [12,13].
Plants that contain high concentrations of defense compounds of the phenylpropene class have been recognized as important spices for human consumption (e.g. cloves) and have high economic value. Several lines of basil (Ocimum basilicum) produce volatile oils that contain essentially only one or two specific phenylpropene compounds. Like other members of the Lamiaceae, basil leaves possess on their surface two types of glandular trichomes, termed peltate and capitate glands. An analysis of an expressed sequence tag database from leaf peltate glands revealed that known genes for the phenylpropanoid pathway are expressed at very high levels in these glands, accounting for 13% of the total expressed sequence tags. An additional 14% of cDNAs encoded enzymes for the biosynthesis of S-adenosyl-methionine, an important substrate in the synthesis of many phenylpropenes. Thus, the peltate glands of basil appear to be highly specialized structures for the synthesis and storage of phenylpropenes and serve as an excellent model system to study phenylpropene biosynthesis [14].
Similarly, Humulus lupulus is an economically important crop for the brewing industry, where it is used to impart flavor and aroma to beer, and has also drawn attention in recent years due to its potential pharmaceutical applications. Essential oils (mono- and sesquiterpenes), bitter acids (prenylated polyketides), and prenylflavonoids are the primary phytochemical components that account for these traits, and all accumulate at high concentrations in glandular trichomes of hop cones. From the hop cDNA library, Hop MONOTERPENE SYNTHASE2 was identified as forming the linear monoterpene myrcene from geranyl pyrophosphate, whereas Hop SESQUITERPENE SYNTHASE1 (HlSTS1) formed both caryophyllene and humulene from farnesyl pyrophosphate; together, these enzymes account for the production of the major terpene constituents of the hop trichomes. HlSTS2 formed the minor sesquiterpene constituent germacrene, which was converted to β-elemene on chromatography at elevated temperature [15]. In addition, Humulus lupulus glandular trichomes (lupulin glands) synthesize essential oils and terpenophenolic resins, including the bioactive prenylflavonoid xanthohumol. ESTs representing enzymes of terpenoid biosynthesis, including all of the steps of the methyl 4-erythritol phosphate pathway, were abundant in the EST data set from the cDNA libraries, as were ESTs for the known type III polyketide synthases of bitter acid and xanthohumol biosynthesis [16].
B. diffusa is richly covered with trichomes. This plant has so far not been studied for its trichomes, their types and their metabolite contents. Trichomes have recently attracted attention because of their secretory products, which range from essential oils to insect repellents, and because of the scope for biotechnological intervention to modify these products and to understand the molecular basis of trichome function [17,18]. The morphological and mechanical features of trichomes, and the metabolites they produce, influence as many as twenty-two aspects of plant physiology and ecology. It is therefore important to study the trichomes of B. diffusa and to analyse the trichome cDNA library in order to identify putative genes involved in the secondary metabolism pathways of this important medicinal plant.

Materials and Methods
Fruits of Boerhaavia diffusa were collected from plants growing in the Herbal Garden at the Department of Botany, Dayalbagh Educational Institute, Dayalbagh, Agra (27.2293°N, 78.0026° E) and preserved in the Herbarium of the Department.

Trichome distribution
The presence and density of trichomes on young fruits were determined using an ocular measuring grid.

RNA isolation, cDNA library construction and EST sequencing
A total of 70-80 grams of fruits bearing trichomes were used for mRNA isolation. The fruits were frozen in liquid nitrogen, and the trichomes were separated by vortexing the tube for 1 to 2 min while keeping it in a horizontal position [19]. The trichomes were used immediately for total RNA extraction with the RNeasy plant mini kit (Qiagen). The cDNA library was constructed using the Creator Smart cDNA library construction kit (Clontech, Palo Alto, CA, USA) following the manufacturer's protocol for small amounts of mRNA. Approximately 1 µg of total RNA was used. Plasmid preparations were made using a High Plasmid purification kit (Sigma) with standard protocols. Average insert size was evaluated by agarose gel electrophoresis. In total, 400 cDNA clones were sequenced; vector, poly-A, low-quality, and short (<100 bp) sequences were then removed from the EST database. The remaining sequences were inspected manually to further improve the quality of EST trimming, using Lasergene (trial version) and Perl scripts.
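The trimming step described above can be illustrated with a minimal Python sketch. It is not the authors' Lasergene/Perl pipeline: the function names, the 10-base poly-A threshold and the 5% limit on ambiguous bases are assumptions chosen for illustration; only the 100 bp length cut-off comes from the text.

```python
import re

MIN_LENGTH = 100  # length cut-off stated in the text


def trim_poly_a(seq: str, min_run: int = 10) -> str:
    """Strip a trailing poly-A run (or a leading poly-T run) of at least min_run bases.

    The min_run threshold is an assumption, not a value from the paper.
    """
    seq = seq.upper()
    seq = re.sub(r"A{%d,}$" % min_run, "", seq)
    seq = re.sub(r"^T{%d,}" % min_run, "", seq)
    return seq


def clean_ests(records: dict, max_n_fraction: float = 0.05) -> dict:
    """Return the ESTs that survive poly-A trimming, the N-content check and the length cut-off.

    max_n_fraction (share of ambiguous 'N' bases tolerated) is a hypothetical
    stand-in for the low-quality filter mentioned in the text.
    """
    kept = {}
    for name, seq in records.items():
        trimmed = trim_poly_a(seq)
        if len(trimmed) < MIN_LENGTH:
            continue  # too short after trimming
        if trimmed.count("N") / len(trimmed) > max_n_fraction:
            continue  # too many ambiguous bases
        kept[name] = trimmed
    return kept


example = {
    "est1": "ACGT" * 30 + "A" * 20,   # 120 bp insert plus poly-A tail
    "est2": "ACGT" * 10 + "A" * 30,   # only 40 bp left after trimming
    "est3": "N" * 50 + "ACGT" * 25,   # long enough, but one third ambiguous
}
cleaned = clean_ests(example)
```

Vector clipping is omitted here because it depends on the vector sequence of the library, which the text does not give.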

Contig assembly and sequence analysis
The high-quality sequences were annotated based on their best BLASTX hits in the TrichOME non-redundant protein database. ESTs were also compared against the TrichOME-specific database using TBLASTN (protein vs. translated DNA) to identify sequence similarity with trichomes of other plant species.
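Selecting the best hit per EST, as described above, can be sketched as follows. The sketch assumes that the BLAST results were exported in the standard 12-column tabular format (query, subject, identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score); the E-value cut-off of 1e-5 and the identifiers in the example are hypothetical, since the text does not state them.

```python
def best_hits(tabular_lines, max_evalue=1e-5):
    """Keep, for each query, the hit with the highest bit score among hits passing max_evalue.

    Assumes 12-column tab-separated BLAST output; max_evalue is an assumed threshold.
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical rows; identifiers and scores are made up for illustration.
rows = [
    "est1\tprotA\t85.0\t120\t18\t0\t1\t360\t5\t124\t2e-30\t130.0",
    "est1\tprotB\t90.0\t120\t12\t0\t1\t360\t5\t124\t1e-40\t160.0",
    "est2\tprotC\t40.0\t50\t30\t0\t1\t150\t5\t54\t0.5\t25.0",
]
hits = best_hits(rows)
```

ESTs with no hit passing the threshold (here `est2`) are simply absent from the result, which corresponds to the "no database hit" class mentioned in the Discussion.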
The software programme geecee, provided by the Institut Pasteur, was used to obtain the GC content of the ESTs. ORF Predictor was used to predict Open Reading Frames (ORFs). The ORFs are predicted from the presence of a start codon, and the tool reports the possible amino acid sequences in detail. The software TargetP was used to identify signal peptides for the subcellular localization of the predicted proteins. This software is based on the presence or absence of a secretory pathway signal peptide, chloroplast transit peptide, mitochondrial targeting peptide or other signal peptides. A sequence may have more than one GO category or a single category.
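The two per-sequence measures described above, GC content and start-codon-based ORF prediction, can be sketched in a few lines of Python. This is an illustration of the general technique only, not a reimplementation of geecee or ORF Predictor: it scans just the three forward frames, and keeping an ORF that runs off the 3' end is an assumption made because ESTs are often partial.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        return 0.0
    return 100.0 * sum(b in "GC" for b in counted) / len(counted)


def longest_orf(seq: str) -> str:
    """Return the longest ATG-initiated ORF found in the three forward frames.

    An ORF without a stop codon before the end of the read is kept up to the
    last complete codon (assumption: the EST may be truncated at the 3' end).
    """
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > len(best):
                    best = seq[start:i + 3]
                start = None
        if start is not None:
            end = start + ((len(seq) - start) // 3) * 3
            if end - start > len(best):
                best = seq[start:end]
    return best


gc = gc_content("GGCCAATT")
orf = longest_orf("CCATGAAATTTTAGGG")
```

Averaging `gc_content` over all UniESTs gives a library-level figure comparable to the average GC content reported for the trichome ESTs.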
InterProScan is a software tool provided by the European Bioinformatics Institute (EBI) and was accessed through Blast2GO. The sequences are mapped according to their domain/motif similarity, and the resulting GO terms can be merged with the remaining annotations.
The sequences of this study were deposited in GenBank. The accession numbers are detailed in Table 1.
Trichome density on the fruits was much higher than on the vegetative parts and increased with the formation and maturation of the apocarps. The number and type of trichomes varied with temperature. These observations were reported in our previous papers [20,21].

Generation of ESTs and contig assembly
Sequencing of the cDNA libraries yielded 700 clones from GCTs and GSTs (Table 2). Clipping the vector and poly-A tail and excluding sequences shorter than 100 bp using Perl scripts resulted in 243 (84%) high-quality sequences for trichomes. The average length of the high-quality trichome ESTs was 435 bp. Cluster analysis with Lasergene software (trial version) indicated the presence of 52 contigs and 110 singletons in the ESTs from trichomes. This corresponds to a transcript redundancy of 54%, which means that at least 44% of the total ESTs might yield some useful genes. Clustering analysis of the sequences yielded a total of 162 UniESTs for trichomes, and 95% of the sequences had Open Reading Frames (ORFs) when processed through ORF Predictor. An average GC content of 42% for the GCT and GST UniESTs was obtained using the geecee software. The ESTs were deposited in the NCBI database (Table 1).
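The clustering figures above can be checked with a short calculation. The text does not state how redundancy was computed; the sketch below assumes one common definition, namely the share of high-quality ESTs that were assembled into contigs (that is, not singletons). Under that assumption the reported counts give a value between 54% and 55%, in line with the stated 54%. The function name is hypothetical.

```python
def clustering_summary(high_quality_ests: int, contigs: int, singletons: int) -> dict:
    """Summarise an EST assembly.

    Redundancy is taken as the percentage of ESTs that fall into contigs;
    this definition is an assumption, not stated in the paper.
    """
    uni_ests = contigs + singletons                  # non-redundant sequences
    ests_in_contigs = high_quality_ests - singletons
    redundancy = 100.0 * ests_in_contigs / high_quality_ests
    return {
        "uni_ests": uni_ests,
        "ests_in_contigs": ests_in_contigs,
        "redundancy_percent": redundancy,
    }


# Counts reported in the text: 243 high-quality ESTs, 52 contigs, 110 singletons.
summary = clustering_summary(243, 52, 110)
print(summary)
```

The same function also reproduces the 162 UniESTs as the sum of contigs and singletons.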

Gene ontology annotation
All trichome ESTs were assigned Gene Ontology terms using Blast2GO software. The ESTs were grouped into three categories: Biological Process (P), Molecular Function (F) and Cellular Component (C). A given EST could be assigned to one or more categories. About 93 sequences could be annotated to only one GO category; of these, 23 could be assigned to Biological Process, 23 to Molecular Function and 19 to Cellular Component (Figures 1-3).
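The tally of single-category and multi-category annotations described above can be sketched as follows. The input layout (a mapping from EST identifier to a set of category codes P, F and C) and the example identifiers are hypothetical; Blast2GO exports would first have to be converted into this form.

```python
from collections import Counter


def tally_go_categories(annotations: dict):
    """Count ESTs annotated to exactly one GO category, and ESTs annotated to several.

    annotations maps an EST id to an iterable of category codes:
    'P' (Biological Process), 'F' (Molecular Function), 'C' (Cellular Component).
    """
    single = Counter()
    multiple = 0
    for est, categories in annotations.items():
        cats = set(categories) & {"P", "F", "C"}
        if len(cats) == 1:
            single[next(iter(cats))] += 1
        elif len(cats) > 1:
            multiple += 1
    return single, multiple


# Hypothetical annotations for illustration only.
example = {
    "est1": {"P"},
    "est2": {"F"},
    "est3": {"P", "C"},
    "est4": {"P"},
    "est5": set(),        # no GO annotation
}
single_counts, multi_count = tally_go_categories(example)
```

Sequences without any annotation are counted in neither group, so the two totals need not add up to the number of ESTs.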

Pathway analysis
The Biological Process annotations for the GCT sequences were spread over a wide spectrum of processes, including photosynthesis, respiration, signal transduction, stress and transcription regulation. The Molecular Functions ranged from secondary metabolite production and respiration to certain transcription factors. The sequences also appeared to be part of various Cellular Components such as chloroplast, cytoplasm, apoplast, nucleolus, plasma membrane and ribosomal subunit. Annotations also covered lipid transport, various defense responses, metabolic pathways, and jasmonic and salicylic acid signalling pathways. The Molecular Functions included lipid binding, SNAP receptor activity and metal-ion binding. The Cellular Components to which these sequences could be annotated were SNARE complex, mitochondrion, plasma membrane and cytoplasmic membrane-bounded vesicle.
To improve annotation, the sequences of GCTs and GSTs were subjected to InterProScan. This software uses amino acid sequences generated by ORF Predictor. The GCT sequences were annotated to respiration, transcription, LEA proteins, COX-CUA and some photosynthesis-related proteins. The GST sequences had a wider range of annotations, which included metallothionein, heat shock protein, sugar transporter, lipid transfer protein, t-SNARE, syntaxin and peptidase.
The predicted proteins were screened with TargetP for signal peptides and possible subcellular localization. Predictions falling in Reliability Class 1 or 2 were considered significant enough to mention. Seven trichome sequences may have a signal-peptide-like function, but their specificity could not be ascertained.
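The Reliability Class filter mentioned above can be sketched as follows. The sketch assumes the definition used in the TargetP 1.1 documentation, where the class (1 = most reliable, 5 = least) depends on the difference between the highest and second-highest output scores, with boundaries at 0.8, 0.6, 0.4 and 0.2. The dictionary layout, score labels and example values are hypothetical and are not taken from the study's output.

```python
def reliability_class(scores: dict) -> int:
    """Reliability class from the gap between the two highest scores.

    Boundaries (0.8, 0.6, 0.4, 0.2) follow the TargetP 1.1 definition as
    assumed here; they are not given in the paper.
    """
    ordered = sorted(scores.values(), reverse=True)
    diff = ordered[0] - ordered[1]
    if diff > 0.8:
        return 1
    if diff > 0.6:
        return 2
    if diff > 0.4:
        return 3
    if diff > 0.2:
        return 4
    return 5


def confident_predictions(predictions: dict, max_rc: int = 2) -> dict:
    """Keep predictions with reliability class <= max_rc, as (location, class) pairs."""
    kept = {}
    for name, scores in predictions.items():
        rc = reliability_class(scores)
        if rc <= max_rc:
            location = max(scores, key=scores.get)  # label with the highest score
            kept[name] = (location, rc)
    return kept


# Hypothetical scores: cTP = chloroplast, mTP = mitochondrial, SP = secretory signal peptide.
example = {
    "est1": {"cTP": 0.05, "mTP": 0.03, "SP": 0.95, "other": 0.02},
    "est2": {"cTP": 0.10, "mTP": 0.05, "SP": 0.50, "other": 0.45},
    "est3": {"cTP": 0.90, "mTP": 0.20, "SP": 0.05, "other": 0.10},
}
confident = confident_predictions(example)
```

With `max_rc=2`, only predictions whose top score clearly dominates the alternatives are retained, which mirrors the criterion of reporting Reliability Classes 1 and 2 only.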

Discussion
The morphological, chemical, genomic and metabolomic studies of the trichomes of Boerhaavia diffusa L. in the present work contribute to a better understanding of their functional and ecological significance. The morphological and chemical study of the GCTs and GSTs on the reproductive parts of Boerhaavia diffusa L. contributes to a better understanding of the functional and ecological significance of trichomes in the Nyctaginaceae. The distribution of trichomes in Boerhaavia, with respect to their types and numbers, is probably genetically controlled, as has been shown in Arabidopsis, where the spacing between trichomes is genetically controlled and where the trichomes present have a very specific role to play. The mRNA isolated in the present study from the GCTs and GSTs of B. diffusa was found to be about 200-435 bp long. After plating and random selection, 700 cDNA clones were sequenced and searched with BLAST against the TrichOME database. Functional annotation was assigned to 300 randomly chosen ESTs. Of these ESTs, about 25-30% had no predicted function (no database hit, or a match to a gene of unknown function). Most of the ESTs corresponded to one lipid transfer protein (LTP) gene, which was the most highly expressed gene represented in the trichome library. LTPs are highly expressed in mint, basil and alfalfa trichomes [10,14,19]. They may be involved in the formation of the epicuticular waxes that coat the trichome, and their high abundance may simply reflect the greater proportion of epidermis to total cellular mass in a trichome preparation relative to whole stem and leaf. LTPs may also be involved in protection against biotic and abiotic stress, either through signaling or through direct antimicrobial activity [19,22,23,24]. Many other genes involved in biotic and abiotic stress, such as those encoding water stress-induced protein, putative desiccation protectant protein, heat shock protein and senescence-specific cysteine protease, were found among the ESTs isolated from B. diffusa in the present study.
WRKY transcription factor, syntaxin and LEA (late embryogenesis-abundant) genes were also found in B. diffusa trichomes (present study). These proteins may reflect the exposed position of the trichome on the exterior of the plant. In winter oilseed rape (Brassica napus L. var. oleifera L.), chilling-resistance genes expressed on exposure to cold (≥ 0°C) bring about adjustments in plant growth and cellular metabolism to cope with low temperature stress, resulting in increased resistance of cells to extracellular freezing [25]. The latter effect is further increased by a short exposure of plants to sub-zero temperatures. Such treatment has been shown to affect the properties of plasma membranes and to induce specific signalling pathways [26]. Syntaxin was found in the GCTs of B. diffusa (present study) and could be contributing to plant resistance against bacteria and to the secretion of pathogenesis-related protein, as reported in Arabidopsis thaliana [27].
WRKY gene-encoded transcriptional regulators were found in the GSTs of B. diffusa (present study). WRKY proteins appear to be involved in various physiological and developmental programs, including biotic and abiotic stress responses in Nicotiana tabacum and Retama raetam [28]. WRKY genes play an important role in the signaling cascade of innate immunity in Arabidopsis, as well as in other plant species such as tobacco, Oryza sativa and Petroselinum crispum. Induction of WRKY genes has been observed in Retama raetam under drought stress and in Nicotiana tabacum after wounding. Evidence is accumulating that WRKY genes may be involved in developmental and metabolic processes, such as seed coat and trichome development in Arabidopsis, the gibberellin signaling pathway in rice aleurone cells, carbohydrate anabolism, and the regulation of sesquiterpene synthesis. WRKY genes encode transcription regulators with diverse functions that are important for plant development and defense responses [15,29,30].
Similarly, cysteine protease, found in the GCTs of B. diffusa (present study), may play an important role in proteolysis and nitrogen remobilization during senescence, as reported in plants such as Gossypium herbaceum [31], Brassica napus, Hemerocallis [32] and Arabidopsis [33]. Some ESTs from the B. diffusa cDNA clones (present study) resembled plant cathepsin B-like cysteine protease (CBCP), which plays a role in disease resistance and in protein remobilization during germination [34]. Cysteine protease is known to be associated with developmental senescence and with pathogen- and stress-induced programmed cell death (PCD).
In B. diffusa, about 1% of the ESTs corresponded to sesquiterpene synthase (present study). This enzyme plays an important role in the biosynthesis of sesquiterpenoids, which are thought to be prominent in plant-fungal interactions [35]. Sesquiterpene synthases catalyze cyclization reactions of farnesyl diphosphate that yield an estimated 200 different products [36]. Subsequent steps in sesquiterpenoid pathways modify the products of sesquiterpene synthase (found in the GCTs, present study) and generate thousands of compounds that exhibit diverse bioactivities. In plants, these modification steps frequently involve various oxygenation reactions, some of which are catalyzed by cytochrome P450-type monooxygenases, as in Artemisia annua and Nicotiana tabacum [37].
Small GTP-binding proteins found in the trichomes of B. diffusa (present study) play a critical role in cytokinin biosynthesis and/or metabolism, and act as defence signal transducers in Arabidopsis, pea, wheat, rice, maize and tobacco by sensitizing the wound-perception system [38][39][40][41].
Seed maturation proteins found in the ESTs of B. diffusa (present study) are associated with the activation of a variety of genes encoding storage proteins and various hydrophilic, late-embryogenesis-abundant (LEA) proteins that possibly function as desiccation protectants [42]. The developmental program of seed maturation is controlled by at least two factors, the hormone abscisic acid (ABA) and the product of the Viviparous-1 (Vp1) gene in Zea mays [43,44].
Lipoxygenase present in GSTs of B. diffusa (present study) and in some other plant species may be involved in Jasmonic Acid (JA) biosynthesis [45]. The jasmonates (JA and its naturally occurring, structural analogs) are known to influence a wide variety of physiological processes in plants [46,47]. JA has also been proposed to serve as a mediator of plant defence responses to wounding and pathogen attack in Solanum tuberosum [48][49][50]. Thus, lipoxygenase may be involved in a variety of physiological processes due to its role in the synthesis of JA. Other proposed physiological roles of lipoxygenase, unrelated to JA biosynthesis, include membrane degradation during hypersensitive resistance responses [51], production of fatty acid-derived, anti-microbial molecules [52], and the synthesis of ABA [53]. In addition, lipoxygenase has been proposed to serve as a storage protein in both seeds [54] and leaves. Given the presence of multiple isozymes of lipoxygenase in plants, it is possible that individual lipoxygenase isozymes within a plant may have distinct physiological roles.

Conclusions
The present study reveals that the trichomes of B. diffusa are a rich source of a large number of metabolites. These metabolites are produced within the trichomes, as suggested by the presence of an array of important enzymes involved in the biosynthesis of flavonoids and terpenes. The isolation of ESTs related to these and many more enzymes suggests transcriptional and translational activity within the trichomes [55,56].