Identification and evaluation of circulating small extracellular vesicle microRNAs as diagnostic biomarkers for patients with indeterminate pulmonary nodules | Journal of Nanobiotechnology


Patient enrollment and study design

This study was approved by the Institutional Review Board of Shanghai Pulmonary Hospital, affiliated with Tongji University (K18-199Y), and registered at the Chinese Clinical Trial Registry with registration number ChiCTR1800019877. All patients were from Shanghai Pulmonary Hospital and had signed written consent for their blood samples and clinical information to be used in this study.

Patients with IPNs detected using LDCT scanning who subsequently underwent surgical resection and were diagnosed with LUAD (malignant PNs) or various benign PNs were enrolled in this study. A total of 199 patients with IPNs were recruited in the training phase from April to May 2019, including 20 patients with benign PNs and 179 with malignant PNs diagnosed using pathological examination. A further 260 patients with IPNs were recruited in the test phase from September to October 2019, including 35 patients with benign PNs and 225 patients with malignant PNs. Plasma samples were prospectively collected in a vacutainer with anticoagulant (REF367863; Becton Dickinson, Franklin Lakes, NJ, USA) prior to surgery. Samples were excluded for the following reasons: non-LUAD pathology (n = 65), serious hemolysis (grade 5 or above, n = 107), failure to meet the benign-to-malignant PN ratio set at 1:2 (n = 155), and failure in construction of sequencing libraries (n = 23). After these exclusions, the training cohort consisted of 47 patients with IPNs (17 benign PNs and 30 malignant PNs), and the test cohort consisted of 62 patients with IPNs (24 benign PNs and 38 malignant PNs). In addition, an external cohort consisting of 20 patients with benign PNs and 79 patients with malignant PNs was used for validation. In this case–control study, a variety of benign PNs without selection served as controls. Additionally, 11 healthy people were enrolled as healthy controls. The whole study design and inclusion/exclusion criteria are depicted in Fig. 1 and Additional file 1: Figure S1.
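The enrollment numbers above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check, and it assumes that the four exclusion counts are mutually exclusive and pooled across the training and test phases, which the text does not state explicitly. The variable names are illustrative.

```python
# Counts transcribed from the enrollment paragraph above.
recruited = {
    "training": {"benign": 20, "malignant": 179},
    "test": {"benign": 35, "malignant": 225},
}
# Assumption: exclusion categories do not overlap and span both phases.
excluded = {
    "non_LUAD": 65,
    "hemolysis_grade_5_or_above": 107,
    "ratio_not_met": 155,
    "library_failure": 23,
}
final = {
    "training": {"benign": 17, "malignant": 30},
    "test": {"benign": 24, "malignant": 38},
}


def cohort_total(cohorts):
    """Sum benign and malignant counts over all cohorts."""
    return sum(sum(groups.values()) for groups in cohorts.values())


total_recruited = cohort_total(recruited)      # 199 + 260 = 459
total_excluded = sum(excluded.values())        # 65 + 107 + 155 + 23 = 350
total_final = cohort_total(final)              # 47 + 62 = 109
remaining = total_recruited - total_excluded   # 459 - 350 = 109

print(total_recruited, total_excluded, remaining, total_final)
```

Under that assumption, the recruited total minus the exclusions matches the combined size of the final training and test cohorts.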

The LUAD cases comprised three pathological subtypes, namely, AIS, MIA, and invasive adenocarcinoma. Nine patients were diagnosed with AIS, which is technically not a malignant disease. Considering the pathological progression of lung adenocarcinoma, we still classified AIS as a type of malignant nodule. The benign PNs were included without selection and comprised more than 10 pathological subtypes. The pathological information of all of the samples was obtained from surgically resected tissue sections in accordance with the 2015 WHO Histological Classification of Lung Cancer [26]. The pathological diagnosis of each patient was confirmed by two pathologists. The tumor–node–metastasis (TNM) stage was determined in accordance with the 8th edition International Association for the Study of Lung Cancer (IASLC) lung cancer staging system [41]. The pathological subtypes of our training and test cohorts, and those of the external validation cohort, are shown in Additional file 2: Figure S2.

The accuracy of a diagnostic test is usually measured by its sensitivity and specificity [42]. In this study, sensitivity represented the model’s ability to correctly identify individuals with malignant PNs, and specificity represented the model’s ability to correctly identify individuals with benign PNs.
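As a minimal illustration of these two definitions, the sketch below computes sensitivity and specificity from paired true and predicted labels, treating malignant PNs as the positive class. The function name and the toy labels are hypothetical and are not data from this study.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="malignant"):
    """Return (sensitivity, specificity) with `positive` as the disease class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # malignant correctly called malignant
            else:
                fn += 1  # malignant missed
        else:
            if pred == positive:
                fp += 1  # benign wrongly called malignant
            else:
                tn += 1  # benign correctly called benign
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy example (hypothetical labels): 4 malignant and 2 benign nodules.
truth = ["malignant"] * 4 + ["benign"] * 2
calls = ["malignant", "malignant", "malignant", "benign", "benign", "malignant"]
sens, spec = sensitivity_specificity(truth, calls)
print(sens, spec)
```

In this toy case three of four malignant nodules and one of two benign nodules are classified correctly.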

Plasma isolation and sEVs isolation

Blood samples were collected from patients in 10-mL vacutainer tubes containing K2EDTA anticoagulant (REF367863; Becton Dickinson, Franklin Lakes, NJ, USA), mixed by gently inverting several times, stored upright, and then transported on ice within 1 h after collection. To harvest the plasma, the samples were centrifuged at 1600×g for 10 min at 4 °C, after which the hemolysis level was determined and recorded. Samples with a hemolysis grade of no more than 4 were used [43]. The collected supernatant was centrifuged again at 16,000×g for 15 min at 4 °C, and then 1 mL of the supernatant was transferred into a fresh 1.5 mL tube and stored at −80 °C prior to use.

For sEV isolation from plasma, a polyethylene glycol-based 3D Medicine isolation reagent [18] (L3525; 3DMed, Shanghai, China) was used. This isolation reagent has been modified and improved based on the work of Rider [44], and has been registered with the National Medical Products Administration as a Class I medical device (#HMXB20190091), specifically for the isolation of sEVs in the clinical setting. The plasma samples were centrifuged at 12,000×g for 10 min at 4 °C after a static water bath incubation at 37 °C for 5 min. The supernatant was transferred to a 0.45 µm tube filter (CLS8163-100EA; Costar, Corning, NY, USA), followed by transfer to a 0.22 µm tube filter (CLS8161-100EA; Costar), and then centrifuged at 12,000×g for 5 min at 4 °C. The filtered supernatant was transferred to a fresh 1.5 mL tube. One-quarter volume of the isolation reagent (L3525) was added to the supernatant, and the mixture was gently inverted, incubated for 30 min at 4 °C, and then centrifuged at 4700×g for 30 min at 4 °C. Finally, the supernatant was removed and the pellets containing the total sEVs were re-suspended in 0.2 mL phosphate-buffered saline (PBS).

Western blot analysis

The isolated sEVs were lysed in 200 μL lysis buffer (P0013B, Beyotime, Shanghai, China); next, the proteins were extracted using an isolation reagent (N3525, 3DMed, Shanghai, China). The protein concentration of the sEVs was measured using a Pierce™ BCA Protein Assay Kit (Thermo Fisher Scientific, USA). A total of 20 µg of protein was resolved on a 12% SDS-PAGE gel and electrotransferred onto a PVDF membrane (Millipore, USA). The membranes were blocked in 5% non-fat milk for 60 min and incubated overnight at 4 °C with the following primary antibodies: anti-CD9 (diluted 1:500; cat. no. ab92726; Abcam, Cambridge, UK), anti-CD63 (diluted 1:2000; cat. no. ab216130; Abcam, Cambridge, UK), anti-Syntenin (diluted 1:500; cat. no. ab19903; Abcam, Cambridge, UK), anti-TSG101 polyclonal antibody (diluted 1:500; cat. no. abs115706; Absin Bioscience Inc., Shanghai, China), and anti-Calnexin (diluted 1:1000; cat. no. 2679; Cell Signaling Technology, Danvers, MA, USA). Horseradish peroxidase-conjugated goat anti-rabbit IgG and goat anti-mouse IgG antibodies (Beyotime Biotechnology, China) were used as secondary antibodies. Antibody binding was detected using an enhanced chemiluminescence system according to the manufacturer’s protocol (Tanon-5200 Multi; Tanon Science & Technology Co. Ltd., Shanghai, China).

Nanoparticle tracking analysis (NTA)

A NanoSight NS300 system (NanoSight Technology, Malvern, UK) was used to characterize the number and size of sEVs. Isolated sEVs were resuspended in PBS at a concentration of 5 μg/mL and were further diluted 100- to 1000-fold to achieve between 20 and 100 objects/frame. Samples were manually injected into the sample chamber at ambient temperature. Each sample was measured using a 488 nm laser and a high-sensitivity scientific complementary metal-oxide semiconductor camera, and the measurements were performed in triplicate at a camera setting of 13 with an acquisition time of 30 s and a detection threshold setting of 7. At least 200 completed tracks were obtained and analyzed per video. Finally, the NTA analytical software (version 2.3) was used to analyze the nanoparticle tracking data of the sEV samples in this study.

Transmission electron microscopy (TEM)

For TEM analysis, plasma sEVs were suspended in PBS, fixed in 4% paraformaldehyde, and transferred to carbon-coated electron microscopy grids. The grids were washed twice with PBS and a third time with PBS containing glycine (50 mM), each for 3 min; then, they were incubated with PBS containing BSA (0.5%) for 10 min. Finally, the grids were stained with 2% uranyl acetate. After staining, TEM (H-7650, Hitachi High-Technologies, Japan) was used to analyze the morphology of sEVs.

ExoView analysis

Plasma sEVs were detected using ExoView chips (NanoView Biosciences, Brighton, MA) printed with antibodies against CD63, CD81, and CD9, with mouse IgG1 as a negative control. Samples (35 μL) were dropped onto the chip and incubated for 16 h. After washing, chips were incubated with a fluorescent antibody cocktail of anti-CD9 (CF® 488), anti-CD81 (CF® 555), and anti-CD63 (CF® 647) for 1 h at room temperature. Chips were then imaged in the ExoView R100 Scanner (NanoView Biosciences, Brighton, MA). Data were analyzed using NanoViewer Software (NanoView Biosciences, Brighton, MA).

RNA isolation from sEVs

RNA was extracted from sEVs using the miRNeasy Serum/Plasma Kit (217184; QIAGEN, Shanghai, China) in accordance with the manufacturer’s protocol. The miRNA quality, yield, and distribution were analyzed using the Agilent 2100 Bioanalyzer with Small RNA Chips (5067-1548; Agilent, Savage, MD, USA).

Small RNA libraries preparation and sequencing

To construct the small RNA sequencing libraries, the NEBNext Multiplex Small RNA Library Prep Set for Illumina (E7300L; New England Biolabs, Ipswich, MA, USA) was used in accordance with the manufacturer’s protocol. Briefly, after 3ʹ adaptor ligation of 100 ng RNA per sample, the reverse transcription primer was hybridized, followed by 5ʹ adaptor ligation. After first-strand cDNA synthesis, a total of 18 PCR cycles were performed with Illumina-compatible barcode primers. The prepared libraries were purified using NucleoSpin Gel and PCR Clean-up (740609.50; MACHEREY–NAGEL, Germany) and recovered in 30 μL DNase- and RNase-free water. The DNA quality, yield, and distribution were analyzed using the LabChip® GX Touch™ HT Nucleic Acid Analyzer with DNA High Sensitivity Reagent Kit (CLS760672; PerkinElmer, Waltham, MA, USA) and the DNA Extended Range LabChip (CLS138948; PerkinElmer). A total of 20–25 libraries were pooled into a single sequencing lane and sequenced on an Illumina HiSeq sequencer in paired-end 150 bp (PE150) mode.

Bioinformatics analysis of small RNA sequencing data

The 3′ adaptors of reads were trimmed using a custom program. Subsequently, the reads were aligned to the human genome hg19 assembly using BWA 0.7.12 [45]. Each small RNA-seq dataset was required to have a minimum of 5,000,000 reads and a minimum mapping rate of 80% to any annotated RNA transcript in the human genome. The annotations were generated from GENCODE v25 [46] and miRBase v21 [47] for statistical analysis and to determine expression levels. The annotation includes all small RNAs, such as miRNAs, rRNAs, tRNAs, and piRNAs, as well as long transcripts from GENCODE, which include both protein-coding genes and long non-coding RNAs (lincRNAs). The percentage of reads that mapped to the annotated miRNAs was required to be greater than 25% (Additional file 1: Table S7). miRNA expression was determined by counting the number of reads mapped to the regions annotated as mature miRNAs. miRNAs that were mapped by at least two reads in each of the samples and that were shorter than 30 nt were retained for miRNA expression analysis. The miRNA expression analysis was performed using the voom function in the limma package [48], with normalization by Trimmed Mean of M-values (TMM) via the edgeR package, and the miRNA expression level was converted to log2-counts-per-million (logCPM) [49]. The Empirical Bayes algorithm implemented in ComBat was applied to the training and test cohort data sets to adjust for batch effects [50, 51].
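The quality thresholds and the logCPM conversion described above can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: the function names are hypothetical, the miRNA fraction is assumed to be computed relative to total reads (the text does not specify the denominator), and the TMM normalization factors and ComBat batch adjustment are omitted. The logCPM expression follows the form used by voom, log2((count + 0.5) / (library size + 1) × 10^6), with the raw library size standing in for the TMM-adjusted one.

```python
import math

MIN_READS = 5_000_000       # minimum reads per dataset
MIN_MAPPING_RATE = 0.80     # minimum fraction mapped to annotated transcripts
MIN_MIRNA_FRACTION = 0.25   # miRNA fraction must exceed this value


def passes_qc(total_reads, mapped_reads, mirna_reads):
    """Apply the three dataset-level thresholds (denominator is an assumption)."""
    if total_reads < MIN_READS:
        return False
    mapping_rate = mapped_reads / total_reads
    mirna_fraction = mirna_reads / total_reads
    return mapping_rate >= MIN_MAPPING_RATE and mirna_fraction > MIN_MIRNA_FRACTION


def keep_mirnas(counts, min_reads=2):
    """Keep miRNAs with at least `min_reads` reads in every sample.

    counts maps a miRNA name to a list of read counts, one per sample.
    """
    return {m: c for m, c in counts.items() if all(x >= min_reads for x in c)}


def log_cpm(counts):
    """Convert counts to log2 counts per million without TMM factors."""
    if not counts:
        return {}
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(c[j] for c in counts.values()) for j in range(n_samples)]
    return {
        m: [math.log2((c[j] + 0.5) / (lib_sizes[j] + 1.0) * 1e6)
            for j in range(n_samples)]
        for m, c in counts.items()
    }


# Hypothetical read counts for illustration only.
print(passes_qc(6_000_000, 5_000_000, 2_000_000))
print(keep_mirnas({"miR-a": [5, 2], "miR-b": [1, 10]}))
```

In a real analysis the library size would be multiplied by the edgeR TMM normalization factor before the conversion, as the text describes.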

Quantitative reverse-transcription PCR

Total RNA was extracted from sEVs as described above. miRNAs were reverse transcribed using the TaqMan™ Advanced miRNA cDNA Synthesis Kit (A28007, Applied Biosystems™, USA) according to the manufacturer’s protocol. qPCR was performed on Applied Biosystems 7500 Fast Real-Time PCR systems with specific probes (miR-451a, miR-125b-5p, miR-101-3p, miR-3168, miR-150-5p, and let-7b-3p; A25576, Applied Biosystems™, USA). The expression level of miR-451a was used as the control, as previously reported [52]. Relative expression was calculated from mean Ct values using the 2^(−ΔΔCt) method.
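For reference, the 2^(−ΔΔCt) calculation can be written out as a short sketch. Here ΔCt is the mean Ct of the target miRNA minus the mean Ct of the reference (miR-451a in this study), and ΔΔCt is the ΔCt of the group of interest minus the ΔCt of the calibrator group. Which group serves as calibrator is an assumption of this example, and the Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^(-ddCt) method using mean Ct values."""
    delta_case = mean(ct_target_case) - mean(ct_ref_case)   # dCt, group of interest
    delta_ctrl = mean(ct_target_ctrl) - mean(ct_ref_ctrl)   # dCt, calibrator group
    delta_delta = delta_case - delta_ctrl                   # ddCt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target is 2 cycles earlier (relative to the
# reference) in the group of interest than in the calibrator group.
fold = relative_expression(
    ct_target_case=[24.0, 24.0, 24.0],
    ct_ref_case=[20.0, 20.0, 20.0],
    ct_target_ctrl=[26.0, 26.0, 26.0],
    ct_ref_ctrl=[20.0, 20.0, 20.0],
)
print(fold)
```

A ΔΔCt of −2 corresponds to a fourfold higher relative expression, since each cycle represents an approximate doubling of product.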

Statistical analysis

Samples from the training and test cohorts of this study, together with samples from the external validation cohort of another study [18], were analyzed. The diagnostic model was constructed using the least absolute shrinkage and selection operator (LASSO) in the training cohort. The test cohort and the external cohort were used to test and validate the diagnostic model. We selected the differentially expressed sEV-miRNAs (DEMs) between the benign and malignant PNs according to a stringent statistical threshold (Student’s t-test p-value ≤ 0.05, fold change ≥ 1.5, and mean expression CPM ≥ 50). Based on the DEMs, the risk scores were generated using LASSO analysis, and the best parameters of the model were ultimately selected using tenfold cross-validation.
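The DEM selection thresholds, and the general form of a risk score derived from a fitted LASSO model, can be sketched as below. Several points are assumptions of this example rather than statements from the study: the fold change is computed on linear-scale group means and counted in either direction, the mean CPM is taken over all samples, and the risk score is shown as a plain linear predictor with made-up coefficients. The p-values are taken as inputs rather than computed, and all miRNA names and numbers are hypothetical.

```python
def select_dems(records, max_p=0.05, min_fold=1.5, min_mean_cpm=50.0):
    """Return names of miRNAs passing the p-value, fold-change, and CPM filters.

    Each record is a dict with keys: name, p_value, mean_cpm,
    mean_benign, mean_malignant (group means on the linear CPM scale).
    """
    selected = []
    for r in records:
        low = min(r["mean_benign"], r["mean_malignant"])
        high = max(r["mean_benign"], r["mean_malignant"])
        # Fold change in either direction (assumption); guard against zero.
        fold = high / low if low > 0 else float("inf")
        if r["p_value"] <= max_p and fold >= min_fold and r["mean_cpm"] >= min_mean_cpm:
            selected.append(r["name"])
    return selected


def risk_score(expression, coefficients, intercept=0.0):
    """Linear predictor: intercept plus sum of coefficient times expression."""
    return intercept + sum(coefficients[m] * expression[m] for m in coefficients)


# Hypothetical candidates for illustration only.
candidates = [
    {"name": "miR-A", "p_value": 0.01, "mean_cpm": 120.0, "mean_benign": 80.0, "mean_malignant": 160.0},
    {"name": "miR-B", "p_value": 0.20, "mean_cpm": 120.0, "mean_benign": 80.0, "mean_malignant": 160.0},
    {"name": "miR-C", "p_value": 0.03, "mean_cpm": 30.0, "mean_benign": 20.0, "mean_malignant": 40.0},
    {"name": "miR-D", "p_value": 0.04, "mean_cpm": 200.0, "mean_benign": 300.0, "mean_malignant": 150.0},
    {"name": "miR-E", "p_value": 0.02, "mean_cpm": 100.0, "mean_benign": 100.0, "mean_malignant": 120.0},
]
dems = select_dems(candidates)
score = risk_score({"miR-A": 2.0, "miR-D": 1.0}, {"miR-A": 0.5, "miR-D": -1.0}, intercept=0.25)
print(dems, score)
```

In practice the coefficients and the penalty strength would come from fitting the LASSO model on the training cohort with tenfold cross-validation, as described above.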

Statistical analysis was performed using the statistical programming language R (version 3.6). The dendextend package [53] in R was used to perform average linkage hierarchical clustering of genes and cases. The heatmap was constructed using the ComplexHeatmap package [54] in R/Bioconductor. Enrichment of Gene Ontology (GO) biological processes and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways among the experimentally validated targets of the miRNAs was examined using mirPath v.3, which provided the Expression Analysis Systematic Explorer (EASE) score and false-discovery rates using Fisher’s exact test and unbiased empirical distributions [55]. The Kaplan–Meier plot analysis of the TCGA data was performed using OncoLnc [56].

