Original Article  |  Open Access  |  18 Nov 2025

5-hydroxymethylcytosine signature in plasma extracellular vesicle DNA as a diagnostic molecular biomarker for precancerous lesions of gastric cancer

Extracell Vesicles Circ Nucleic Acids. 2025;6:822-42.
10.20517/evcna.2025.76 |  © The Author(s) 2025.

Abstract

Aim: Precancerous lesions of gastric cancer (PLGC) represent a critical window for prevention. Developing non-invasive tools that can reliably detect these lesions is therefore a prerequisite for lowering gastric-cancer incidence. Recent work has highlighted the diagnostic promise of plasma extracellular vesicle DNAs (evDNAs) and the 5-hydroxymethylcytosine (5hmC)-Seal epigenomic platform. Here we profiled genome-wide 5hmC patterns in circulating evDNA to discover biomarkers and build a classification model.

Methods: We performed whole-genome 5hmC-Seal on plasma evDNAs from 67 PLGC patients and 67 healthy individuals. We identified differentially hydroxymethylated regions (DhMRs) whose 5hmC signal changed in a consistent trend across pathological stages, then used machine learning algorithms to screen them for diagnostic biomarkers of PLGC and to build a corresponding diagnostic model.

Results: We ultimately constructed a diagnostic model comprising nine DhMRs. In the test set, the area under the curve (AUC) value was 0.963, with an accuracy of 0.886, sensitivity of 95.45%, and specificity of 81.82%. These results indicate that DhMRs in evDNA can serve as diagnostic biomarkers for PLGC, with good diagnostic capability and reliability. Correlation analysis showed a strong association between the DhMRs in the diagnostic model and clinical pathological indicators of PLGC.

Conclusion: We developed a non-invasive diagnostic model for PLGC by profiling 5hmC in plasma evDNA. In both accuracy and inter-batch robustness, it surpasses all previously reported assays. Our findings establish plasma-evDNA 5hmC profiling as a reliable, minimally invasive strategy for the early detection and precise diagnosis of gastric precancerous lesions, and provide a new translational and clinical framework for future work.

Keywords

Precancerous lesions of gastric cancer, 5-hydroxymethylcytosine, extracellular vesicle DNAs, molecular biomarker

INTRODUCTION

Gastric cancer ranks as the fifth most commonly diagnosed malignancy and is also the fifth leading cause of cancer-related mortality[1,2]. Globally, in 2020, there were over 1.08 million new cases reported, resulting in 768,793 deaths[3]. Moreover, gastric cancer often presents with few early symptoms, and only 35.63% of gastric cancer cases are diagnosed at early stages (I-II)[4]. Studies have shown that in high-risk regions for gastric cancer, such as Japan and South Korea, early screening programs have significantly reduced gastric cancer mortality and improved prognosis, increasing the 5-year patient survival rate to over 54%[5]. Additionally, precancerous lesions of gastric cancer (PLGC), including chronic atrophic gastritis (CAG), intestinal metaplasia (IM), and dysplasia (Dys), are the most common and critical pathological conditions of the gastric mucosa that lead to gastric cancer[6,7]. Therefore, effective screening for PLGC to advance the treatment timeline and inhibit or reverse the pathological progression of the gastric mucosa is of great significance in reducing the incidence of gastric cancer.

Currently, screening for PLGC primarily relies on gastroscopy. Although advanced techniques such as magnifying endoscopy, chromoendoscopy, narrow-band imaging, autofluorescence, and confocal endomicroscopy can enhance the detection rate of precancerous lesions, diagnosing PLGC accurately remains problematic due to variability in endoscopic imaging and lesion morphology[8,9]. Moreover, gastroscopy is expensive and may cause adverse effects such as abdominal discomfort, bleeding, perforation, and infection[10]. To date, there is no reliable non-endoscopic biomarker for screening PLGC in the general population. Commonly used serum markers, such as pepsinogen, the pepsinogen I/II ratio, and gastrin-17, are insufficient for effective PLGC screening[11]. Additionally, molecular biomarkers identified through sequencing technologies such as microRNA sequencing (microRNA-seq), lipidomics, and proteomics have not yet achieved high levels of accuracy[12-15]. Therefore, there is a pressing demand for a low-cost, high-precision, non-invasive or minimally invasive screening tool to detect PLGC and improve diagnostic accuracy.

With the rapid evolution of omics technologies, liquid biopsy - by virtue of its non-invasiveness, minimal discomfort, and logistical convenience - has emerged as a powerful tool for molecular profiling and longitudinal disease monitoring[16,17]. To date, blood-based assays for PLGC have chiefly interrogated circulating proteins, RNAs, and metabolites[18-20]. While these analytes report on gene expression, functional activity, and metabolic flux, they are intrinsically labile and difficult to preserve[21,22]. Cell-free DNA (cfDNA) offers superior stability, and the advent of high-throughput sequencing has uncovered myriad cfDNA- and ctDNA-derived biomarkers that now guide cancer diagnosis and management[23,24]. Yet cfDNA is predominantly released during cell death, providing a snapshot of terminal events; in early malignancies, ctDNA is often present at vanishingly low concentrations[25,26]. Extracellular vesicle DNAs (evDNAs) are rapidly gaining attention. The lipid bilayer of extracellular vesicles (EVs) shields its molecular cargo from nucleases and proteases, while enabling both short- and long-range intercellular communication[21,27-29]. These nanocarriers traverse endothelial barriers, enter the circulation, and deliver a stable, information-rich record of disease status that can be harnessed for diagnosis, therapeutic monitoring, and deep molecular characterization[30-32]. Functioning in paracrine and endocrine signaling networks, evDNAs themselves constitute a high-density message stream. Recent studies have sequenced evDNAs directly from plasma to generate robust classifiers for advanced pancreatic and colorectal cancers[33,34]. Collectively, these findings position evDNAs as a next-generation biomarker with transformative potential for the early detection and precision management of human disease.

Recent studies have revealed a close link between PLGC and epigenetics, with DNA methylation identified as a key factor associated with PLGC and as a potential diagnostic biomarker. 5-hydroxymethylcytosine (5hmC) is a crucial intermediate in the pathway from DNA methylation to demethylation and is vital for numerous physiological and pathological processes[35,36]. Previous research has shown that 5hmC-based detection is highly accurate and sensitive, making 5hmC a valuable epigenetic biomarker for various human diseases, including gastric cancer[37], esophageal cancer[38], hepatocellular carcinoma[39], and colorectal cancer[40]. The latest findings also indicate that the 5hmC molecular landscape is associated with the development of CAG in the pathological stages of PLGC[41]. Thus, detecting 5hmC in blood evDNAs could help elucidate the epigenetic changes in PLGC and identify epigenetic molecular markers with diagnostic potential.

This study utilized plasma evDNAs from patients with PLGC and a matched cohort of healthy individuals with similar gender and age. It also employed the whole-genome 5hmC-Seal technology for evDNAs, which our team developed and optimized for clinical plasma samples. The objective was to clarify the distribution of 5hmC in the PLGC genome, investigate the characteristics of 5hmC-enriched regions, identify genes with differential 5hmC modifications between PLGC and healthy individuals as potential biomarkers, and subsequently develop a diagnostic model to facilitate non-invasive detection of PLGC.

METHODS

Design of study and participants

This study adopted a case-control design, recruiting 67 patients with PLGC and matching them with a control group based on age and sex. All participants were from the Hebei Province Hospital of Chinese Medicine between January and December 2024, with the ethics approval number HBZY2023-YS-134-01. The diagnosis of precancerous gastric lesions was based on the Guidelines for Diagnosis and Treatment of Chronic Gastritis in China (2022, Shanghai)[42] and the updated Sydney System for disease classification and grading. Inclusion criteria included: (1) age ranging from 20 to 70 years, regardless of gender; (2) meeting the diagnostic criteria for precancerous gastric lesions, with endoscopy and pathology assessed by at least two relevant experts; (3) informed consent, voluntary participation in the study, and signature of consent form. Exclusion criteria were: (1) patients with severe hepatic and renal impairment, hematologic disorders, autoimmune diseases, endocrine disorders, or other serious primary diseases affecting life expectancy; (2) patients with malignancies, acute infections, or other major illnesses; (3) pregnant, miscarried, or breastfeeding women; (4) patients unable or unwilling to cooperate in the collection of relevant information due to illness or other reasons.

Sample size calculation

In this study, we assessed the performance of existing diagnostic models for PLGC through a literature review and found that their area under the curve (AUC) values generally ranged from 0.6 to 0.9[18-20]. Drawing on our research team’s previous experience in constructing diagnostic models, we selected an AUC of 0.90 as the benchmark for estimating the required sample size. We calculated the sample size using the formula provided by the Power Analysis and Sample Size (PASS) software (2021 edition). The ratio of the positive to negative groups was set at 1:1, and the alternative hypothesis AUC was set at 0.90. Additionally, we designated 0.75 as the null hypothesis value for the test criterion.

$$ n=\frac{\left[Z_{1-\alpha} \sqrt{p_{0}\left(1-p_{0}\right)}+Z_{1-\beta} \sqrt{p_{1}\left(1-p_{1}\right)}\right]^{2}}{\left(p_{1}-p_{0}\right)^{2}} $$

First, we set α and β to 0.05 and 0.2, respectively. For α, we used Z1-α ≈ 1.96, the critical value conventionally associated with a 95% confidence level (two-sided). For β, corresponding to a power of 80%, Z1-β ≈ 0.84.

Next, we evaluate the two terms of the numerator using the null value p0 = 0.75 and the alternative value p1 = 0.90:

$$ Z_{1-\alpha} \sqrt{p_{0}\left(1-p_{0}\right)}=1.96 \times \sqrt{0.75(1-0.75)} \approx 0.849 $$

$$ Z_{1-\beta} \sqrt{p_{1}\left(1-p_{1}\right)}=0.84 \times \sqrt{0.90(1-0.90)}=0.252 $$

We then sum these two terms, divide by the difference between p1 and p0, and square the result:

$$ n=\left(\frac{0.849+0.252}{0.90-0.75}\right)^{2} \approx 53.9, \quad \text{rounded up to } 54 $$

Based on this formula, the minimum required sample size per group was 54 individuals. We therefore aimed to recruit more than this number of participants during the study period and ultimately enrolled 67 participants per group to ensure the reliability and validity of the study results.
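The calculation above can be reproduced with a few lines of Python. This is a minimal sketch of the stated formula only, not the PASS software's implementation; the function name `sample_size_per_group` is hypothetical, and the Z values are the rounded ones quoted in the text.

```python
import math


def sample_size_per_group(p0, p1, z_alpha=1.96, z_beta=0.84):
    """Evaluate the sample-size formula given in the text.

    p0: null-hypothesis value (0.75 in the text)
    p1: alternative-hypothesis value (0.90 in the text)
    z_alpha, z_beta: rounded critical values quoted in the text
    """
    term_null = z_alpha * math.sqrt(p0 * (1 - p0))
    term_alt = z_beta * math.sqrt(p1 * (1 - p1))
    return ((term_null + term_alt) / (p1 - p0)) ** 2


n_raw = sample_size_per_group(0.75, 0.90)
n_required = math.ceil(n_raw)  # a sample size is always rounded up
print(f"raw n = {n_raw:.2f}, required n per group = {n_required}")
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the target.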

Plasma separation and evDNAs extraction

In line with our previously established experimental protocol[43], we first collected whole blood samples from patients by standard venipuncture. These samples were stored in cfDNA collection tubes (Roche) at 15-25 °C. Each sample was processed within 24 h, with plasma separated by centrifugation. EVs were extracted using an exosome extraction kit (H-Wayen Exosome Extraction Kit, EIQ3-02001, China). Briefly, 1 mL plasma was mixed with 20 µL Reagent C, vortexed, and incubated for 15 min at 37 °C, then centrifuged at 10,000 × g for 10 min; the supernatant was kept, combined with 250 µL Reagent A, incubated for 30 min on ice, and pelleted at 3,000 × g for 1 min. After discarding the supernatant, the pellet was resuspended in 1 mL PBS, 250 µL Reagent B was added, the mixture was incubated for 30 min at 4 °C, and centrifuged again at 3,000 × g for 1 min. The final EV-rich pellet was resuspended in 200 µL PBS and stored at -80 °C until use. evDNAs were extracted and purified using the Quick-DNA Miniprep Kit (ZYMO), which involved the addition of BioFluid&Cell buffer, proteinase K, genomic binding buffer, and genomic DNA (gDNA) wash buffer, with incubation and centrifugation at 12,000 × g. Prior to library construction, quality control was performed using nucleic acid electrophoresis.

Characterization of EVs

Exosome size and concentration were determined by nanoparticle tracking analysis (NTA) performed at VivaCell Biosciences on a ZetaView PMX 110 instrument (Particle Metrix, Meerbusch, Germany) running ZetaView 8.06.01 software. Isolated exosomes were diluted in 1 × phosphate-buffered saline (PBS, Biological Industries, Israel) to the optimal concentration for measurement. For each sample, videos were captured at 11 consecutive positions and subsequently analyzed. The system was calibrated with 110 nm polystyrene standards, and measurements were carried out at 23-30 °C.

To confirm the size and morphology of patient-derived EVs, samples were prepared for negative-stain transmission electron microscopy (TEM). A 5-μL droplet of EVs suspended in PBS was applied onto a 200-mesh Formvar/carbon-coated copper grid and allowed to adsorb for 1 min. Excess liquid was gently blotted with filter paper, and the grid was stained with four consecutive drops of 1.5% (w/v) uranyl acetate. After brief washing and air-drying, specimens were imaged with an FEI Tecnai G2 Spirit transmission electron microscope (TEM) operated at 60-120 kV.

Purified EVs were lysed on ice in radio-immunoprecipitation assay (RIPA) buffer (Thermo Fisher Scientific) containing 1 × protease inhibitor and 1 mM phenylmethylsulfonyl fluoride (PMSF). After centrifugation at 12,000 × g for 10 min at 4 °C, the supernatant was transferred to a fresh tube. Protein concentration was determined with the bicinchoninic acid (BCA) Protein Assay Kit (Thermo Fisher Scientific). Fifty micrograms of protein per sample were separated by 12% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and electro-transferred to polyvinylidene fluoride (PVDF) membranes. Membranes were blocked with 5% non-fat milk at room temperature, then incubated overnight at 4 °C with primary antibodies against CD9 (ab236630, Abcam), CD63 (67605-1-Ig, Proteintech), CD81 (66866-1-Ig, Proteintech), HSP70 (ab5439, Abcam), TSG101 (28283-1-AP, Proteintech) and Albumin (16475-1-AP, Proteintech). After washing, membranes were probed with horseradish peroxidase (HRP)-conjugated secondary antibodies and visualized using an ECL kit (Thermo Fisher Scientific) on a Tanon chemiluminescence imager (China).

Construction of 5hmC library and high-throughput sequencing

In this study, the 5hmC libraries for all samples were constructed using the 5hmC-Seal technique, consistent with our previous experiments[44]. According to the kit protocol, evDNAs isolated from plasma were subjected to end repair and 3’-adenylation using the Hyper Prep Kit (KAPA Biosystems), and Illumina-compatible adapters were then ligated. The adapter-ligated evDNAs were glycosylated in a prepared solution, after which DBCO-PEG4-biotin (C39H51N5O8S; Click Chemistry Tools) was added and incubated. The DNA was then purified and incubated with streptavidin-coated beads (Life Technologies) in a buffer containing Tween 20, followed by washing and polymerase chain reaction (PCR) amplification. The PCR products were further purified using AMPure XP beads (Beckman). After quantifying the concentration and performing fragment size quality control (QC), paired-end high-throughput sequencing (150 bp) was conducted on libraries meeting the quantitative standards using the NovaSeq 6000 platform. Post-sequencing library QC included assessments of Q30 score and duplication rate. The DNA fragment size distribution peaked at around 180 bp, with a size range of 0-600 bp. The average Q30 score was 91.2%, and the duplication rate was approximately 39.1%.

Exploration and alignment of modified regions

The raw sequencing data were first aligned to the human genome using Bowtie2[45], and duplicate reads were removed with SAMtools[46]. Paired-end read coverage was then normalized to the total read count and written in BedGraph format for initial analysis, and subsequently converted to bigWig format for visualization in genome browsers such as the University of California Santa Cruz (UCSC) Genome Browser[47]. Potential hydroxymethylated regions (hMRs) were identified using Model-based Analysis of ChIP-seq (MACS; version 2.2.7.1), and overlapping peaks were merged using bedtools merge. Only peaks present in more than 10 samples and smaller than 1,000 bp were retained, while genomic regions known to produce artifact signals were excluded. For each patient, hMRs were generated by intersecting individual peak call files with the combined peak file[48,49]. Additionally, intervals overlapping ENCODE blacklist regions, satellite repeats, or sex chromosomes (X/Y) were excluded to minimize copy-number or mapping artifacts.
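The recurrence and width filter described above can be sketched in plain Python. This is an illustration only, not the authors' pipeline (which relied on MACS and bedtools): the function names, the `(chrom, start, end)` tuple layout, and the toy intervals are all assumptions, and blacklist or satellite-repeat filtering is omitted.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def filter_merged_peaks(merged, sample_peaks, min_samples=10,
                        max_width=1000, excluded_chroms=("chrX", "chrY")):
    """Keep merged peaks narrower than max_width that are supported by
    more than min_samples samples and do not lie on excluded chromosomes."""
    kept = []
    for peak in merged:
        chrom, start, end = peak
        if chrom in excluded_chroms or end - start >= max_width:
            continue
        # Count the samples with at least one peak overlapping this region.
        support = sum(
            any(overlaps(peak, p) for p in peaks)
            for peaks in sample_peaks.values()
        )
        if support > min_samples:
            kept.append(peak)
    return kept


# Toy data: 12 hypothetical samples share one narrow autosomal peak.
samples = {f"s{i}": [("chr1", 100, 300), ("chrX", 100, 300)] for i in range(12)}
for i in range(3):
    samples[f"s{i}"].append(("chr2", 10, 200))  # present in only 3 samples

merged = [("chr1", 100, 300), ("chr1", 5000, 7000),
          ("chrX", 100, 300), ("chr2", 10, 200)]
kept = filter_merged_peaks(merged, samples)
print(kept)
```

In this toy input only the `chr1` peak at 100-300 passes all three filters: the others are too wide, sit on a sex chromosome, or are seen in too few samples.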

Identification of differential hydroxymethylation regions and associated bioinformatics analysis

Differential analysis of 5hmC regions between PLGC and healthy samples was performed using the limma package in R. We identified differentially hydroxymethylated regions (DhMRs) using a two-sided Wilcoxon rank-sum test (|log2 fold change| > 0.5 and P < 0.01 for high-confidence DhMR identification; a relaxed threshold of P < 0.05 was used for exploratory clustering visualization)[50]. Following Correa’s cascade, gastric adenocarcinoma is viewed as a continuous progression from chronic gastritis to cancer. Mfuzz was therefore used to select DhMRs whose 5hmC signals show a monotonic upward or downward trend across successive pathological stages[51]. These DhMRs were then mapped to their corresponding differentially hydroxymethylated genes (DhMGs). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and Gene Ontology (GO) enrichment analysis were performed using the clusterProfiler package in R[52].
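To make the thresholding rule concrete, the following pure-Python sketch classifies a single region from its per-sample 5hmC signal. It is illustrative only: the study used R, whereas here the rank-sum test uses a normal approximation without continuity correction, the fold change is taken as the ratio of group means, and the function names and toy values are hypothetical.

```python
import math


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation,
    average ranks for ties, tie-corrected variance)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def classify_region(case, control, lfc_cut=0.5, p_cut=0.01):
    """Return 'up', 'down', or 'ns' using the thresholds from the text.
    Assumes strictly positive signal values so the log ratio is defined."""
    lfc = math.log2((sum(case) / len(case)) / (sum(control) / len(control)))
    if rank_sum_p(case, control) < p_cut and abs(lfc) > lfc_cut:
        return "up" if lfc > 0 else "down"
    return "ns"


# Toy signal values for one region (hypothetical).
plgc = [8, 9, 10, 11, 12, 13, 14, 15]
healthy = [1, 2, 3, 4, 5, 6, 7, 7.5]
print(classify_region(plgc, healthy))
```

With real cohort sizes an exact or library-provided test would be preferable to this approximation; the sketch is meant only to show how the two thresholds combine.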

Selection of feature regions and construction of diagnostic model

To identify the most informative diagnostic DhMRs from the initial pool, we compared three feature-ranking algorithms - linear discriminant analysis, logistic regression and random forest - implemented in Python 3.8. A marker was advanced to the modelling stage only if it was selected by at least two of the three methods. The training set was then submitted to a stratified 5-fold cross-validation scheme repeated 10 times (50 folds in total) to build a parameter-free ensemble; model performance was finally assessed on the held-out test set. The ensemble itself combined three base learners: a neural network, a random forest and a stochastic-gradient-descent (SGD) classifier. The probability outputs of these learners were concatenated into a meta-feature matrix that fed a support-vector-machine (SVM) combiner for the ultimate class assignment [Supplementary Figure 1]. Specifically, the convolutional layer structure of the neural network was 256*128*64*32*16*8*4*1, with the rectified linear unit (ReLU) activation function and binary cross-entropy loss function. The training duration was set to 200 epochs. The random forest consisted of 300 decision trees. The SGD classifier used log-loss with elastic-net regularization, with the L1-to-L2 ratio set to 0.5; this blend of L1 and L2 penalties encourages sparsity while preserving model stability. The meta-classifier was trained using the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS) to optimize the SVM model. The SVM meta-classifier was used for the final prediction, with a classification threshold of 0.5 (i.e., 1 for probabilities greater than 0.5 and 0 for probabilities less than 0.5). We assessed the model’s discriminative performance using the receiver operating characteristic (ROC) curve and its AUC, from which the corresponding specificity and sensitivity were derived.
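The "at least two of three methods" rule amounts to a simple vote over the per-method selections. The sketch below shows one way to express it; the function name, the `top_k` cutoff, and the region identifiers are hypothetical, and the actual rankings in the study came from linear discriminant analysis, logistic regression, and random forest models that are not reproduced here.

```python
from collections import Counter


def consensus_features(rankings, top_k, min_votes=2):
    """Return features that appear in the top_k list of at least
    min_votes ranking methods, sorted by name for reproducibility."""
    votes = Counter()
    for ranked in rankings.values():
        votes.update(set(ranked[:top_k]))  # one vote per method per feature
    return sorted(f for f, v in votes.items() if v >= min_votes)


# Hypothetical ranked outputs of the three methods (best first).
rankings = {
    "lda": ["region_1", "region_2", "region_3", "region_4"],
    "logreg": ["region_2", "region_3", "region_5", "region_1"],
    "random_forest": ["region_3", "region_6", "region_7", "region_2"],
}
selected = consensus_features(rankings, top_k=2)
print(selected)
```

Requiring agreement between methods trades some sensitivity for stability: a region favored by only one algorithm is more likely to reflect that algorithm's bias than a reproducible signal.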

Statistical analysis

Continuous variables are presented as mean ± standard deviation and compared among groups using one-way analysis of variance (ANOVA). Categorical variables are summarized as counts or proportions and compared using the chi-square test. For high-dimensional 5hmC-Seal data, differential analysis was performed using the limma-trend framework, with P-values adjusted by the Benjamini-Hochberg false discovery rate (FDR). The performance of the diagnostic model was evaluated using the AUC with 95% confidence intervals estimated by 2,000 bootstrap resamplings. All statistical analyses were conducted using R software.
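As an illustration of the bootstrap confidence interval for the AUC, the sketch below uses a percentile bootstrap in plain Python. The study's analysis was run in R and its exact bootstrap variant is not stated, so the percentile method, the function names, the seed, and the toy scores are all assumptions.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative
    (ties count one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # a resample needs both classes
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy predictions (hypothetical): 1 = PLGC, 0 = healthy.
labels = [1] * 6 + [0] * 6
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.5, 0.45, 0.3, 0.2, 0.1, 0.05]
point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores, n_boot=2000)
print(f"AUC = {point:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```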

RESULTS

Demographics and clinical characteristics of the study population

This study collected plasma samples from 67 patients with PLGC and 67 healthy donors. Clinical data were obtained for all participants. Table 1 shows the basic information of patients with PLGC and healthy individuals. There were no significant differences in age or sex between the two groups. Additionally, we collected operative link on gastritis assessment (OLGA) and operative link on gastric IM assessment (OLGIM) scores for gastric mucosa. We randomly divided all samples into a training set and a test set in a 2:1 ratio for subsequent analysis, using the training set for differential analysis and model construction, and the test set for model validation [Figure 1].


Figure 1. Schematic overview of the study. PLGC: Precancerous lesions of gastric cancer; 5hmC: 5-hydroxymethylcytosine; CAG: chronic atrophic gastritis; IM: intestinal metaplasia; Dys: dysplasia; AUC: area under the curve.

Table 1

Demographics and clinical characteristics of the study participants

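| Characteristic | PLGC | Healthy |
| --- | --- | --- |
| n | 67 | 67 |
| Age (years) | 54.97 ± 12.36 | 54.58 ± 11.46 |
| Sex: Male | 32 (47.76%) | 34 (50.75%) |
| Sex: Female | 35 (52.24%) | 33 (49.25%) |
| OLGA 0 | 0 (0.00%) | 0 (0.00%) |
| OLGA 1 | 9 (13.43%) | 0 (0.00%) |
| OLGA 2 | 30 (44.78%) | 0 (0.00%) |
| OLGA 3 | 28 (41.79%) | 0 (0.00%) |
| OLGIM 0 | 5 (7.46%) | 0 (0.00%) |
| OLGIM 1 | 5 (7.46%) | 0 (0.00%) |
| OLGIM 2 | 22 (32.84%) | 0 (0.00%) |
| OLGIM 3 | 35 (52.24%) | 0 (0.00%) |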

5hmC modification profiling and genome-wide distribution in PLGC and healthy individuals

Following isolation, EVs were comprehensively characterized. NTA showed a narrow size distribution centered at ≈ 100 nm for both PLGC and Healthy samples [Supplementary Figure 2A], a morphology confirmed by negative-stain TEM [Supplementary Figure 2B]. Western blotting revealed the canonical EV markers CD9, CD81, CD63, TSG101 and HSP70, while albumin - a proxy for plasma contamination - was virtually undetectable [Supplementary Figure 2C]. Together, these data verify successful isolation of high-purity EVs suitable for downstream analyses.

We initially conducted a 5hmC profiling analysis to examine the differences in the 5hmC landscapes of evDNAs between PLGC patients and healthy individuals. We found that the average level of 5hmC within the upstream and downstream 2 kb interval was lower in the PLGC group than in the Healthy group (p < 0.05), with the Dys stage showing the lowest levels [Figure 2A]. A PCA-derived distribution plot clearly distinguished PLGC patients from healthy individuals [Figure 2B]. Consistent with previous findings, the 200 highest-ranking 5hmC sites (P < 0.05, ranked by |log2 fold change|) were predominantly enriched in distal intergenic and intronic regions [Figure 2C]. Overall, 5hmC sites were mainly enriched in the 1st intron region (13.81%), other introns (31.99%), and distal intergenic regions (28.94%) [Figure 2D].


Figure 2. 5-hydroxymethylcytosine modification profiling and genome-wide distribution in PLGC and healthy individuals. (A) Distribution of 5hmC in PLGC and healthy individuals at different pathological stages; (B) Principal component analysis results of the two groups; (C) Mean log2 fold change in different genomic regions, including Distal Intergenic, Promoter, Exon, Intron, and 3’UTR; (D) Overall distribution of 5hmC sites in different genomic regions; (E) 5hmC motif analysis results in healthy samples; (F) 5hmC motif analysis results in PLGC samples. PLGC: Precancerous lesion of gastric cancer; 5hmC: 5-hydroxymethylcytosine; TSS: transcription start site; TTS: transcription termination site; PCA: principal component analysis; PC: principal component; UTR: untranslated region; ERG: ETS-related gene; GATA3: GATA binding protein 3; TRPS1: trichorhinophalangeal syndrome type I; ETV2: ETS variant transcription factor 2; EWS: ewing sarcoma gene.

Additionally, to explore the correlation between 5hmC and potential binding proteins, we performed genome-wide motif enrichment analysis for 5hmC. In line with prior studies, the E-26 transformation-specific-related gene (ERG) motif (P = 1 × 10^-10719, 12.95%) was the most significantly enriched in both PLGC and healthy evDNA samples[38,53]. In healthy samples, the second and third most enriched motifs were GATA-binding protein 3 (GATA3, P = 1 × 10^-10072, 22.31%) and trichorhinophalangeal syndrome-1 (TRPS1, P = 1 × 10^-9159, 29.46%), respectively [Figure 2E]. In PLGC samples, the second and third most enriched motifs were Ets Variant Transcription Factor 2 (ETV2, P = 1 × 10^-3093, 8.61%) and Ewing sarcoma (EWS, P = 1 × 10^-2555, 8.01%) [Figure 2F]. These transcription factors are expressed in various cell types and are involved in processes such as cell proliferation, differentiation, and apoptosis. These results indicate that the 5hmC profiles of plasma evDNAs in PLGC and healthy individuals have distinct features and hold potential as plasma biomarkers for distinguishing PLGC from healthy individuals.

Analysis of 5hmC differences and pathway and function annotations

We next performed unsupervised clustering of the top 500 DhMRs (P < 0.05; 250 with the highest and 250 with the lowest log2 fold change), which revealed a distinct 5hmC signature separating PLGC patients from healthy controls [Figure 3A and Supplementary Table 1]. For the final DhMR selection used in model construction, we applied more stringent criteria (|log2 fold change| > 0.5 and P < 0.01), identifying a total of 2,282 unique DhMRs (572 upregulated and 1,710 downregulated) [Figure 3B]. These DhMRs were then mapped to their corresponding genes to obtain DhMGs, which were subsequently subjected to KEGG pathway and GO enrichment analysis. The results indicated significant enrichment in pathways such as the hypoxia-inducible factor 1 (HIF-1) signaling pathway, phosphoinositide 3-kinase (PI3K)-Akt signaling pathway, T cell receptor signaling pathway, chemokine signaling pathway, Wnt signaling pathway, and cellular senescence. The GO analysis showed enrichment in biological process (BP) terms related to positive regulation of epithelial cell migration, regulation of inflammatory response, regulation of Wnt signaling pathway, and B cell activation; cellular component (CC) terms including focal adhesion, histone deacetylase complex, and cell-substrate junction; and molecular function (MF) terms such as transforming growth factor beta receptor activity, cadherin binding, protein serine/threonine kinase activity, and histone H3 acetyltransferase activity [Figure 3C].


Figure 3. Analysis of 5hmC differences and pathway and function annotations. (A) Heatmap of DhMRs distribution; (B) Volcano plot of DhMRs, with the x-axis representing log2(fold change) and the y-axis representing -log10(P-value). Red indicates upregulation, blue indicates downregulation, and gray indicates no significant change; (C) Results of GO and KEGG pathway enrichment analyses. The x-axis shows -log10(P-value), and the y-axis lists significantly enriched BP, CC, MF, and KEGG pathways. BP: Biological processes; CC: cellular components; MF: molecular functions; PLGC: precancerous lesions of gastric cancer; CAG: chronic atrophic gastritis; IM: intestinal metaplasia; Dys: dysplasia.

Previous studies have shown that PLGC, as a chronic inflammatory disease, is associated with HIF-1, Chemokine, Cellular senescence, and B cell activation[54-56]. Additionally, PLGC is closely related to epithelial-mesenchymal transition (EMT) proteins such as beta-catenin and Wnt[57,58]. It forms a premalignant microenvironment closely linked to energy metabolism-related pathways such as PI3K-Akt and mitogen-activated protein kinase (MAPK)[59,60]. Our findings also showed similar results, indicating that 5hmC in plasma-derived evDNAs is closely associated with the disease, suggesting its potential as a biomarker for monitoring disease progression.

Identification of DhMRs with trending expression changes and functional annotations

According to the Correa cascade, gastric adenocarcinoma is clinically defined as a continuous pathological progression from chronic gastritis through CAG, IM, and Dys to gastric cancer. Given the limited number of isolated CAG cases and the biological continuity between CAG and IM, these two stages were analyzed jointly as a single “CAG/IM” group, consistent with recent studies[61,62]. Based on the 5hmC distribution levels, the average levels of 5hmC in the combined CAG/IM group and the Dys group were lower than those in healthy individuals, and they decreased progressively with pathological stage [Figure 4A]. To further identify DhMRs associated with disease progression, we performed Mfuzz clustering analysis on samples categorized into Healthy, CAG/IM, and Dys groups. This analysis grouped 5hmC sites into four distinct clusters. Notably, Cluster 3 and Cluster 4 exhibited sustained monotonic trends across the three groups, containing 314 and 985 DhMRs, respectively [Figure 4B]; the complete DhMR annotations for these two clusters are provided in Supplementary Table 2. Subsequent functional enrichment analysis of these continuously upregulated and downregulated DhMRs revealed that upregulated DhMRs were primarily associated with regulation of fatty acid transport, response to transforming growth factor beta, and regulation of lipid catabolic process, while downregulated DhMRs were mainly linked to positive regulation of leukocyte differentiation, myeloid leukocyte differentiation, sensory system development, and stem cell development [Figure 4C].


Figure 4. Identification of DhMRs with Trending Expression Changes and Functional Annotations. (A) Distribution of 5hmC in PLGC and healthy individuals at different pathological stages, including CAG/IM, Dys, and Healthy; (B) Clustering analysis to identify clusters with continuous expression changes across healthy, CAG/IM, and Dys states; (C) Enrichment analysis of biological processes for upregulated and downregulated DhMGs. DhMR: Differentially hydroxymethylated region; 5hmC: 5-hydroxymethylcytosine; PLGC: precancerous lesion of gastric cancer; CAG: chronic atrophic gastritis; IM: intestinal metaplasia; Dys: dysplasia; DhMG: differentially hydroxymethylated gene.

Construction and validation of the PLGC diagnostic model

Next, we conducted feature biomarker screening and diagnostic model construction using 1,664 trend-expressed DhMRs in the training set (90 cases) and validated the model using the test set (44 cases). Initially, we employed linear discriminant analysis, logistic regression, and random forest algorithms for feature selection. We selected the top 30 DhMRs that were chosen by at least two of these algorithms, and ultimately, nine DhMRs [Rho guanine nucleotide exchange factor 16 (ARHGEF16), multiple EGF-like domains 6 (MEGF6), caspase 9 (CASP9), erythrocyte membrane protein band 4.1 (EPB41), transmembrane protein 39B (TMEM39B), SH3GL interacting endocytic adaptor 1 (SGIP1), GNG12, DIRAS3 and WLS antisense RNA 1 (GNG12-AS1), WAS/WASL interacting protein family member 1 (WIPF1), nuclear receptor corepressor 2 (NCOR2)] were incorporated into the diagnostic model [Table 2]. In the training set, the model classified every sample correctly, with no misdiagnoses among either patients or healthy individuals [Figure 5A]. The AUC value was 1.000 [Figure 5B], with an accuracy of 1.000, sensitivity of 100.00%, and specificity of 100.00% [Figure 5C]. In the test set, the diagnostic model performed well, correctly diagnosing 21 out of 22 PLGC patients and 18 out of 22 healthy individuals [Figure 5D]. The AUC value was 0.963 [Figure 5E], with an accuracy of 0.886, sensitivity of 95.45%, and specificity of 81.82% [Figure 5F]. Unsupervised clustering of the DhMRs in the diagnostic model was further used to generate a heatmap, which confirmed the effectiveness of these DhMRs in distinguishing PLGC patients from healthy individuals [Figure 5G]. These results indicate that DhMRs in evDNAs can serve as diagnostic biomarkers for PLGC, with good diagnostic capability and reliability.
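The reported test-set metrics follow directly from the confusion-matrix counts given above (21 of 22 PLGC patients and 18 of 22 healthy individuals classified correctly). A short check, with a hypothetical helper name:

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Test-set counts from the text: 21/22 PLGC and 18/22 healthy correct.
test_set = diagnostic_metrics(tp=21, fn=1, tn=18, fp=4)
for name, value in test_set.items():
    print(f"{name}: {value:.4f}")
```

Note that with 22 samples per class, one additional misclassified sample shifts sensitivity or specificity by about 4.5 percentage points, which is worth bearing in mind when reading these point estimates.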

Figure 5. Construction and Validation of the PLGC Diagnostic Model. (A) Stacked bar chart of model prediction results in the training set; (B) ROC curve of the model in the training set; (C) Confusion matrix of the model in the training set; (D) Stacked bar chart of model prediction results in the test set; (E) ROC curve of the model in the test set; (F) Confusion matrix of the model in the test set; (G) Heatmap of DhMGs in the diagnostic model across train and test cohort samples. PLGC: Precancerous lesion of gastric cancer; ROC: receiver operating characteristic; AUC: area under the curve; CAG: chronic atrophic gastritis; IM: intestinal metaplasia; Dys: dysplasia; DhMGs: differentially hydroxymethylated genes.

Table 2

Characterization-related parameters in the diagnostic model

Feature 95%CI Lower 95%CI Upper
ARHGEF16 -3.81 -3.57
MEGF6 -97.49 -97.29
CASP9 -39.93 -39.57
EPB41 -40.73 -40.41
TMEM39B 117.97 118.33
SGIP1 -182.49 -182.13
GNG12-AS1 149.62 149.94
WIPF1 -32.68 -32.32

Correlation analysis of DhMGs with clinical indicators

Next, we used Pearson correlation analysis to assess the relationship between the nine DhMRs in the diagnostic model and the clinical indicators OLGA and OLGIM, in order to evaluate whether these DhMRs are associated with the pathological progression of PLGC. The results showed that all nine DhMRs were significantly correlated with both OLGA and OLGIM. Among them, NCOR2 and MEGF6 exhibited the strongest correlations with OLGA and OLGIM, with *P*-values less than 0.01 [Figure 6A]. Specifically, NCOR2 was positively correlated with OLGA and OLGIM, with correlation coefficients of 0.47 and 0.48, respectively. In contrast, MEGF6 was negatively correlated with OLGA and OLGIM, with a correlation coefficient of -0.34 for both [Figure 6B].
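
For readers less familiar with the statistic, the following is a minimal sketch of the Pearson correlation coefficient. The function `pearson_r` and all data values are hypothetical and chosen only for illustration; they are not the study's measurements, and the study's own analysis software is not specified here. The sketch shows that a marker rising with the staging score gives a positive coefficient and a marker falling with it gives a negative one.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical staging scores and normalized 5hmC counts (illustration only).
stage_score = [0, 1, 1, 2, 3, 4]
marker_rising = [1.0, 1.4, 1.2, 2.1, 2.6, 3.0]   # increases with stage
marker_falling = [3.0, 2.7, 2.9, 2.0, 1.6, 1.1]  # decreases with stage

r_rising = pearson_r(stage_score, marker_rising)
r_falling = pearson_r(stage_score, marker_falling)
print(f"rising marker:  r = {r_rising:+.2f}")
print(f"falling marker: r = {r_falling:+.2f}")
```

Because OLGA and OLGIM are ordinal stages, a rank-based coefficient such as Spearman's could also be considered; the sketch uses Pearson only because that is the test named in the text.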

Figure 6. Correlation Analysis of DhMGs with Clinical Indicators. (A) Heatmap of the correlation between different DhMGs and clinical indicators (OLGA and OLGIM); (B) Scatter plots showing the relationships between the clinical indicators (OLGA and OLGIM) and the DhMGs NCOR2 and MEGF6. The y-axis shows the normalized counts of DhMGs per sample, and the x-axis shows the OLGA or OLGIM score. *P < 0.05; **P < 0.01; ***P < 0.001. ns: Not significant (Pearson correlation coefficient test). OLGA: Operative link on gastritis assessment; OLGIM: operative link on gastric intestinal metaplasia assessment; DhMGs: differentially hydroxymethylated genes.

DISCUSSION

Gastric cancer remains one of the most common cancers globally, with high mortality and morbidity rates due to its weak symptom-disease correlation and the lack of effective curative treatments[63]. PLGC represents a crucial stage in the prevention and control of gastric cancer. Screening effectively for PLGC, treating it earlier, and inhibiting pathological progression of the gastric mucosa are important for reducing the incidence of gastric cancer[64]. Currently, the diagnosis of PLGC still relies on pathology and gastroscopy as the gold standard. However, these methods are costly and may lead to complications such as bleeding, perforation, and infection[10,65]. Moreover, traditional serum cancer biomarkers, such as carcinoembryonic antigen (CEA), carbohydrate antigen 72-4 (CA72-4), and carbohydrate antigen 19-9 (CA19-9), have proven to be ineffective in screening for PLGC[66,67]. Therefore, there is an urgent need for novel, non-invasive diagnostic methods that are easy to promote and implement.

Currently, numerous studies have attempted to detect PLGC through biomarkers in blood. For instance, a study utilizing serum proteomics identified a panel of five protein biomarkers and constructed a diagnostic model for PLGC. The model achieved AUC values of 0.763 for low-grade intraepithelial neoplasia and 0.867 for high-grade intraepithelial neoplasia in PLGC[18]. Another study combined serum proteomics with mass spectrometry to identify four serum autoantibody biomarkers and incorporated gender into the diagnostic model, resulting in an AUC value of 0.803[19]. Additionally, a diagnostic model for PLGC based on circulating microRNA-130b and red cell distribution width demonstrated an AUC value of 0.896[20]. However, none of these biomarker-based diagnostic models has achieved high diagnostic efficacy to date. Moreover, most studies have not employed the division of patients into train and test sets for model construction and validation, which may compromise the reliability and generalizability of the models. Therefore, the exploration of new non-invasive diagnostic biomarkers for the screening of PLGC holds significant importance.

Recent studies have indicated a link between PLGC and epigenetics. For instance, the hypermethylation of MIR124-3 and NKX6-1 is associated with the risk of PLGC[68]. The m6A modification regulator METTL3 promotes EMT in gastric epithelial cells through the m6A/SNHG7 axis, thereby influencing PLGC[69]. Additionally, histone deacetylase 6 (HDAC6) reduces forkhead box P3 (FOXP3) via epigenetic modifications, forming a closed loop of HDAC6/FOXP3/hepatocyte nuclear factor 4α (HNF4α) to facilitate the occurrence of IM[70]. Recent research has also shown that the 5hmC molecular landscape is closely related to the development of PLGC[41]. Therefore, 5hmC holds promise as a novel molecular biomarker for the diagnosis of PLGC.

EVs are essential for short- and long-range transport of bioactive molecules, enabling robust intercellular communication[27,28]. These EVs can cross endothelial barriers and enter the bloodstream, offering a stable and informative snapshot of disease[30,31]. Because they carry lipids, nucleic acids, proteins, glycans and metabolites, EVs have attracted attention as liquid-biopsy targets. They contain gDNA and mitochondrial DNA (mtDNA), so EV-DNA has potential as a biomarker for diverse pathologies, including cancer, tuberculosis, kidney injury and Parkinson’s disease[71-74]. The origin of EV-gDNA remains unclear; studies indicate it may derive from chromatid fragments caused by mis-repaired DNA breaks, chromosome mis-segregation, or micronuclei formed by defective nuclear envelopment. Owing to their unstable envelopes, micronuclei eventually rupture, releasing content into the cytosol for engulfment and EV packaging[75]. Additionally, both EV release and DNA loading are reported to increase in cancer[76]. The 5mC in evDNAs has been reported as a biomarker for gastric cancer[77]. Likewise, 5hmC is an important epigenetic mark. Here, using 5hmC-Seal on evDNA, we seek PLGC-specific signatures, build a diagnostic model and assess the utility of EV 5hmC for disease detection.

In this study, we enrolled a total of 67 patients with PLGC and matched them with 67 healthy individuals by age and gender. We performed whole-genome 5hmC sequencing on the evDNAs in their plasma to explore epigenetic biomarkers associated with PLGC. In the process of biomarker selection, we not only identified DhMRs based on traditional differential analysis but also employed trend clustering to pinpoint DhMRs associated with the Correa cascade progression, thereby narrowing down the number of candidate biomarkers. Moreover, we adopted a multi-method approach in the feature selection and construction of the diagnostic model. Ultimately, we developed a diagnostic model comprising nine DhMRs (ARHGEF16, MEGF6, CASP9, EPB41, TMEM39B, SGIP1, GNG12-AS1, WIPF1, NCOR2). This model achieved an AUC value of 0.963 and an accuracy of 0.886 in the test set. Although the diagnostic performance in the test set was slightly lower than that in the training set, the AUC of our diagnostic model exceeded the values reported for traditional serum markers and in previous studies[18-20]. Additionally, correlation analysis revealed that all nine biomarkers in the diagnostic model were significantly correlated with the clinical pathological scores of OLGA and OLGIM, supporting the reliability of the biomarkers we selected.
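
The multi-method feature selection described above (keeping DhMRs chosen by at least two of the three algorithms) can be illustrated with a simple voting scheme. This is a sketch under stated assumptions, not the study's implementation: the function `consensus_features`, the region names, and the rankings are all hypothetical, and we assume each algorithm yields an importance-ordered list from which its top `top_k` candidates are taken.

```python
from collections import Counter


def consensus_features(rankings, top_k, min_votes=2):
    """Keep features that appear in the top_k list of at least min_votes methods.

    rankings maps a method name to its features ordered from most to least important.
    """
    votes = Counter()
    for ranked in rankings.values():
        # Each method casts one vote for every feature in its own top_k.
        votes.update(set(ranked[:top_k]))
    selected = [feature for feature, count in votes.items() if count >= min_votes]
    # Deterministic order: most votes first, then alphabetical.
    return sorted(selected, key=lambda feature: (-votes[feature], feature))


# Hypothetical rankings from three feature-selection methods (illustration only).
rankings = {
    "lda": ["region_1", "region_2", "region_3", "region_4", "region_5"],
    "logistic_regression": ["region_2", "region_1", "region_6", "region_3", "region_7"],
    "random_forest": ["region_8", "region_2", "region_4", "region_9", "region_1"],
}

selected = consensus_features(rankings, top_k=3)
print(selected)  # ['region_2', 'region_1']
```

Requiring agreement between methods with different inductive biases is one way to reduce the chance that a candidate is retained only because of the quirks of a single algorithm.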

As a chronic inflammatory disease, PLGC is associated with oxidative stress and immune activation. Therefore, during the PLGC stage, altered activity of pathways such as HIF-1, chemokine signaling, cellular senescence, and B-cell activation can be detected[54-56]. Additionally, because PLGC is a precancerous condition, its persistent and recurrent chronic inflammation can promote EMT. Studies have found that activation of the Wnt pathway in PLGC can drive the expression of EMT-related proteins such as β-catenin, thereby facilitating inflammation-to-cancer transformation[57,58]. Moreover, as inflammation progresses, cellular reprogramming occurs, leading to changes in energy homeostasis. Pathways such as PI3K-Akt and MAPK are also differentially regulated in PLGC[59,60]. Our enrichment analysis of DhMRs identified the aforementioned pathways. This suggests that the 5hmC in plasma evDNAs has the potential to reflect disease changes and further supports the reliability of our results.

The nine DhMGs involved in the diagnostic model are primarily associated with functions such as cell adhesion, apoptosis, differentiation, regulation of gene expression, and signal transduction. Among them, CASP9 is closely related to apoptosis. Infection with Helicobacter pylori can promote the activation of CASP9 in B cells, thereby influencing the immune response[78]. In the gastric mucosa, trefoil factor 1 (TFF1) can regulate apoptosis by targeting the active form of CASP9[79]. WIPF1 is linked to cytoskeleton regulation and cell migration. It can affect the proliferation, invasion, and migration of gastric cancer cells and is associated with the regulation of the PI3K/AKT signaling pathway[80]. NCOR2 is a transcriptional corepressor. Research has shown that its methylation status is associated with the density of tumor-infiltrating lymphocytes in gastric cancer[81]. Additionally, mutations in NCOR2 are linked to the occurrence of multiple gastric cancers[82].

Although the diagnostic model constructed in this study has demonstrated good diagnostic efficacy, there are still some limitations. First, the samples in this study were derived from PLGC patients and healthy individuals in China. To achieve clinical application in the future, it will be necessary to consider differences between geographic regions and ethnic groups and to conduct large-scale validation studies. Second, this study adopted a case-control design. In subsequent work, retrospective longitudinal studies and prospective trials will be crucial for validating the clinical applicability of this approach, ultimately achieving non-invasive detection of PLGC. Finally, the modest sample size precluded subtype-specific modelling; future work will expand the cohort to validate model performance and benchmark it against clinical indicators such as miR-130b, CEA, CA72-4 and CA19-9, refining the 5hmC-based model and yielding additional biological insight.

In conclusion, we have established a non-invasive diagnostic model for PLGC based on the 5hmC landscape in plasma evDNAs. This biomarker panel, consisting of nine 5hmC markers, exhibits high sensitivity and specificity. Our research findings indicate that the 5hmC expression profile is a promising tool for the early detection and accurate diagnosis of PLGC.

DECLARATIONS

Acknowledgments

We would like to acknowledge the essential contributions of all staff and students who participated in this work.

Authors’ contributions

Conceived, designed, and supervised the project, revised the manuscript, provided final approval for the manuscript, and provided financial support: Lin J, Wang Y

Performed experiments, collected, interpreted and analyzed data, and wrote the manuscript: Chen H (Haoyu Chen), Gao T, Chen H (Hangyu Chen)

Revised the manuscript and assisted with experiments: Zhang L, Chen X, Duolikun M, Li X (Xiaxuan Li), Li X (Xuehui Li), Chen L, Gao H, Li Q, Hao X, Zhou P, Ren N

All authors reviewed the manuscript.

Availability of data and materials

The data supporting the findings of this study are available from the corresponding author upon reasonable request.

AI and AI-assisted tools statement

Not applicable.

Financial support and sponsorship

This work was supported by the National Natural Science Foundation of China (Grant Nos. 82405293 and 82575015), the Natural Science Foundation of Beijing (No.7232281), the Hebei Province Traditional Chinese Medicine Administration Scientific Research Program (No. 2024012), the Beijing University of Chinese Medicine Basic Scientific Research Business Fund Jie-Bang-Gua-Shuai Project (No. 2025-JYB-JBGS-001) and Hebei Province Graduate Student Innovative Capacity Building Grant Program (No. CXZZBS2025171).

Conflicts of interest

All authors declared that there are no conflicts of interest.

Ethical approval and consent to participate

All participants provided written informed consent prior to enrollment. The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the Hebei Province Hospital of Chinese Medicine (approval number: HBZY2023-YS-134-01).

Consent for publication

Not applicable.

Copyright

© The Author(s) 2025.

Supplementary Materials

REFERENCES

1. Ajani JA, D’Amico TA, Bentrem DJ, et al. Gastric cancer, version 2.2022, NCCN clinical practice guidelines in oncology. J Natl Compr Canc Netw. 2022;20:167-92.

2. Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74:229-63.

3. Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71:209-49.

4. Chen Y, Jia K, Xie Y, et al. The current landscape of gastric cancer and gastroesophageal junction cancer diagnosis and treatment in China: a comprehensive nationwide cohort analysis. J Hematol Oncol. 2025;18:42.

5. Allemani C, Weir HK, Carreira H, et al; CONCORD Working Group. Global surveillance of cancer survival 1995-2009: analysis of individual data for 25,676,887 patients from 279 population-based registries in 67 countries (CONCORD-2). Lancet. 2015;385:977-1010.

6. Wei H, Li W, Zeng L, et al. OLFM4 promotes the progression of intestinal metaplasia through activation of the MYH9/GSK3β/β-catenin pathway. Mol Cancer. 2024;23:124.

7. Correa P. A human model of gastric carcinogenesis. Cancer Res. 1988;48:3554-60.

8. Kohoutova D, Banks M, Bures J. Advances in the aetiology & endoscopic detection and management of early gastric cancer. Cancers. 2021;13:6242.

9. Pasechnikov V, Chukov S, Fedorov E, Kikuste I, Leja M. Gastric cancer: prevention, screening and early diagnosis. World J Gastroenterol. 2014;20:13842-62.

10. Levy I, Gralnek IM. Complications of diagnostic colonoscopy, upper endoscopy, and enteroscopy. Best Pract Res Clin Gastroenterol. 2016;30:705-18.

11. Huang RJ, Park S, Shen J, Longacre T, Ji H, Hwang JH. Pepsinogens and gastrin demonstrate low discrimination for gastric precancerous lesions in a multi-ethnic United States cohort. Clin Gastroenterol Hepatol. 2022;20:950-2.e3.

12. So JBY, Kapoor R, Zhu F, et al. Development and validation of a serum microRNA biomarker panel for detecting gastric cancer in a high-risk population. Gut. 2021;70:829-37.

13. Otsu H, Nambara S, Hu Q, et al. Identification of serum microRNAs as potential diagnostic biomarkers for detecting precancerous lesions of gastric cancer. Ann Gastroenterol Surg. 2023;7:63-70.

14. Liu ZC, Wu WH, Huang S, et al. Plasma lipids signify the progression of precancerous gastric lesions to gastric cancer: a prospective targeted lipidomics study. Theranostics. 2022;12:4671-83.

15. Gong Y, Lou Y, Han X, et al. Serum proteomic profiling of precancerous gastric lesions and early gastric cancer reveals signatures associated with systemic inflammatory response and metaplastic differentiation. Front Mol Biosci. 2024;11:1252058.

16. Nikanjam M, Kato S, Kurzrock R. Liquid biopsy: current technology and clinical applications. J Hematol Oncol. 2022;15:131.

17. Alix-Panabières C, Pantel K. Liquid biopsy: from discovery to clinical application. Cancer Discov. 2021;11:858-73.

18. Li J, Zhao W, Yang J, et al. Proteomic and serological markers for diagnosing cardia gastric cancer and precursor lesions in a Chinese population. Sci Rep. 2024;14:25309.

19. Zhu Q, He P, Zheng C, et al. Identification and evaluation of novel serum autoantibody biomarkers for early diagnosis of gastric cancer and precancerous lesion. J Cancer Res Clin Oncol. 2023;149:8369-78.

20. Chen J, Liu Z, Gao G, et al. Efficacy of circulating microRNA-130b and blood routine parameters in the early diagnosis of gastric cancer. Oncol Lett. 2021;22:725.

21. Jin Y, Chen K, Wang Z, et al. DNA in serum extracellular vesicles is stable under different storage conditions. BMC Cancer. 2016;16:753.

22. Kolesar J, Peh S, Thomas L, et al. Integration of liquid biopsy and pharmacogenomics for precision therapy of EGFR mutant and resistant lung cancers. Mol Cancer. 2022;21:61.

23. Song P, Wu LR, Yan YH, et al. Limitations and opportunities of technologies for the analysis of cell-free DNA in cancer diagnostics. Nat Biomed Eng. 2022;6:232-45.

24. Batool SM, Yekula A, Khanna P, et al. The liquid biopsy consortium: challenges and opportunities for early cancer detection and monitoring. Cell Rep Med. 2023;4:101198.

25. Casanova-Salas I, Aguilar D, Cordoba-Terreros S, et al. Circulating tumor extracellular vesicles to monitor metastatic prostate cancer genomics and transcriptomic evolution. Cancer Cell. 2024;42:1301-12.e7.

26. Ciani Y, Nardella C, Demichelis F. Casting a wider net: the clinical potential of EV transcriptomics in multi-analyte liquid biopsy. Cancer Cell. 2024;42:1160-2.

27. van Niel G, D’Angelo G, Raposo G. Shedding light on the cell biology of extracellular vesicles. Nat Rev Mol Cell Biol. 2018;19:213-28.

28. Maacha S, Bhat AA, Jimenez L, et al. Extracellular vesicles-mediated intercellular communication: roles in the tumor microenvironment and anti-cancer drug resistance. Mol Cancer. 2019;18:55.

29. Niu L, Zhu Y, Wan M, et al. Extracellular DNA as potential contributors to pathological calcification. Interdiscip Med. 2024;2:e20230061.

30. Iannotta D, A A, Kijas AW, Rowan AE, Wolfram J. Entry and exit of extracellular vesicles to and from the blood circulation. Nat Nanotechnol. 2024;19:13-20.

31. Nieuwland R, Siljander PR. A beginner’s guide to study extracellular vesicles in human blood plasma and serum. J Extracell Vesicles. 2024;13:e12400.

32. Zhu Z, Hu E, Shen H, Tan J, Zeng S. The functional and clinical roles of liquid biopsy in patient-derived models. J Hematol Oncol. 2023;16:36.

33. Allenson K, Castillo J, San Lucas FA, et al. High prevalence of mutant KRAS in circulating exosome-derived DNA from early-stage pancreatic cancer patients. Ann Oncol. 2017;28:741-7.

34. Choi J, Cho HY, Jeon J, et al. Detection of circulating KRAS mutant DNA in extracellular vesicles using droplet digital PCR in patients with colon cancer. Front Oncol. 2022;12:1067210.

35. Greenberg MVC, Bourc’his D. The diverse roles of DNA methylation in mammalian development and disease. Nat Rev Mol Cell Biol. 2019;20:590-607.

36. Oliva M, Demanelis K, Lu Y, et al. DNA methylation QTL mapping across diverse human tissues provides molecular links between genetic variation and complex traits. Nat Genet. 2023;55:112-22.

37. Fu Y, Jiang J, Wu Y, et al. Genome-wide 5-hydroxymethylcytosines in circulating cell-free DNA as noninvasive diagnostic markers for gastric cancer. Gastric Cancer. 2024;27:735-46.

38. Tian X, Sun B, Chen C, et al. Circulating tumor DNA 5-hydroxymethylcytosine as a novel diagnostic biomarker for esophageal cancer. Cell Res. 2018;28:597-600.

39. Cai J, Chen L, Zhang Z, et al. Genome-wide mapping of 5-hydroxymethylcytosines in circulating cell-free DNA as a non-invasive approach for early detection of hepatocellular carcinoma. Gut. 2019;68:2195-205.

40. West-Szymanski DC, Zhang Z, Cui XL, et al. 5-hydroxymethylated biomarkers in cell-free DNA predict occult colorectal cancer up to 36 months before diagnosis in the prostate, lung, colorectal, and ovarian cancer screening trial. JCO Precis Oncol. 2024;8:e2400277.

41. Luo Z, Li W, Zheng W, et al. Elucidating epigenetic landscape of gastric premalignant lesions through genome-wide mapping of 5-hydroxymethylcytosines: a 12-year median follow-up study. Clin Transl Med. 2024;14:e70114.

42. Chinese Society of Gastroenterology, Cancer Collaboration Group of Chinese Society of Gastroenterology, Chinese Medical Association. Guidelines for diagnosis and treatment of chronic gastritis in China (2022, Shanghai). J Dig Dis. 2023;24:150-80.

43. Chu JL, Bi SH, He Y, et al. 5-hydroxymethylcytosine profiles in plasma cell-free DNA reflect molecular characteristics of diabetic kidney disease. Front Endocrinol. 2022;13:910907.

44. Song CX, Szulwach KE, Fu Y, et al. Selective chemical labeling reveals the genome-wide distribution of 5-hydroxymethylcytosine. Nat Biotechnol. 2011;29:68-72.

45. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357-9.

46. Etherington GJ, Ramirez-Gonzalez RH, MacLean D. Bio-samtools 2: a package for analysis and visualization of sequence and alignment data with SAMtools in ruby. Bioinformatics. 2015;31:2565-7.

47. Navarro Gonzalez J, Zweig AS, Speir ML, et al. The UCSC genome browser database: 2021 update. Nucleic Acids Res. 2021;49:D1046-57.

48. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841-2.

49. Zhang Y, Liu T, Meyer CA, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137.

50. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47.

51. Kumar L, E Futschik M. Mfuzz: a software package for soft clustering of microarray data. Bioinformation. 2007;2:5-7.

52. Wu T, Hu E, Xu S, et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation. 2021;2:100141.

53. Lu D, Wu X, Wu W, et al. Plasma cell-free DNA 5-hydroxymethylcytosine and whole-genome sequencing signatures for early detection of esophageal cancer. Cell Death Dis. 2023;14:843.

54. Liu S, Ji H, Zhang T, et al. Modified Zuojin pill alleviates gastric precancerous lesions by inhibiting glycolysis through the HIF-1α pathway. Phytomedicine. 2025;136:156255.

55. Cai Q, Shi P, Yuan Y, et al. Inflammation-associated senescence promotes Helicobacter pylori-induced atrophic gastritis. Cell Mol Gastroenterol Hepatol. 2021;11:857-80.

56. Milito C, Pulvirenti F, Garzi G, et al. Decline of gastric cancer mortality in common variable immunodeficiency in the years 2018-2022. Front Immunol. 2023;14:1231242.

57. Murata-Kamiya N, Kurashima Y, Teishikata Y, et al. Helicobacter pylori CagA interacts with E-cadherin and deregulates the beta-catenin signal that promotes intestinal transdifferentiation in gastric epithelial cells. Oncogene. 2007;26:4617-26.

58. Wang YM, Luo ZW, Shu YL, et al. Effects of Helicobacter pylori and Moluodan on the Wnt/β-catenin signaling pathway in mice with precancerous gastric cancer lesions. World J Gastrointest Oncol. 2024;16:979-90.

59. Wang Y, Chu F, Lin J, et al. Erianin, the main active ingredient of Dendrobium chrysotoxum Lindl, inhibits precancerous lesions of gastric cancer (PLGC) through suppression of the HRAS-PI3K-AKT signaling pathway as revealed by network pharmacology and in vitro experimental verification. J Ethnopharmacol. 2021;279:114399.

60. Yi Z, Jia Q, Lin Y, et al. Mechanism of Elian granules in the treatment of precancerous lesions of gastric cancer in rats through the MAPK signalling pathway based on network pharmacology. Pharm Biol. 2022;60:87-95.

61. Luo Z, Li W, Zheng W, et al. Elucidating epigenetic landscape of gastric premalignant lesions through genome-wide mapping of 5-hydroxymethylcytosines: a 12-year median follow-up study. Clin Transl Med. 2024;14:e70114.

62. He S, Zhang Z, Song G, et al. Can patients with mild non-neoplastic lesions diagnosed at baseline screening be safely exempt from surveillance: evidence from multicenter community-based cohorts. Sci China Life Sci. 2025;68:263-71.

63. Smyth EC, Nilsson M, Grabsch HI, van Grieken NC, Lordick F. Gastric cancer. Lancet. 2020;396:635-48.

64. Young E, Philpott H, Singh R. Endoscopic diagnosis and treatment of gastric dysplasia and early cancer: current evidence and what the future may hold. World J Gastroenterol. 2021;27:5126-51.

65. Yang Y, Guan S, Ou Z, Li W, Yan L, Situ B. Advances in AI-based cancer cytopathology. Interdiscip Med. 2023;1:e20230013.

66. Guo X, Peng Y, Song Q, et al. A liquid biopsy signature for the early detection of gastric cancer in patients. Gastroenterology. 2023;165:402-13.e13.

67. Ge X, Zhang X, Ma Y, Chen S, Chen Z, Li M. Diagnostic value of macrophage inhibitory cytokine 1 as a novel prognostic biomarkers for early gastric cancer screening. J Clin Lab Anal. 2021;35:e23568.

68. Lopes C, Almeida TC, Macedo-Silva C, et al. MIR124-3 and NKX6-1 hypermethylation profiles accurately predict metachronous gastric lesions in a Caucasian population. Clin Epigenetics. 2024;16:113.

69. Jian J, Feng Y, Wang R, et al. METTL3-regulated lncRNA SNHG7 drives MNNG-induced epithelial-mesenchymal transition in gastric precancerous lesions. Toxics. 2024;12:573.

70. Zhang L, Wang N, Chen M, et al. HDAC6/FOXP3/HNF4α axis promotes bile acids induced gastric intestinal metaplasia. Am J Cancer Res. 2022;12:1409-22.

71. Zhou X, Kurywchak P, Wolf-Dennen K, et al. Unique somatic variants in DNA from urine exosomes of individuals with bladder cancer. Mol Ther Methods Clin Dev. 2021;22:360-76.

72. Cho SM, Shin S, Kim Y, et al. A novel approach for tuberculosis diagnosis using exosomal DNA and droplet digital PCR. Clin Microbiol Infect. 2020;26:942.e1-5.

73. Sedej I, Štalekar M, Tušek Žnidarič M, et al. Extracellular vesicle-bound DNA in urine is indicative of kidney allograft injury. J Extracell Vesicles. 2022;11:e12268.

74. Picca A, Guerra F, Calvani R, et al. Mitochondrial-derived vesicles as candidate biomarkers in Parkinson’s disease: rationale, design and methods of the EXosomes in PArkiNson disease (EXPAND) study. Int J Mol Sci. 2019;20:2373.

75. Elzanowska J, Semira C, Costa-Silva B. DNA in extracellular vesicles: biological and clinical aspects. Mol Oncol. 2021;15:1701-14.

76. Fenech M, Kirsch-Volders M, Natarajan AT, et al. Molecular mechanisms of micronucleus, nucleoplasmic bridge and nuclear bud formation in mammalian and human cells. Mutagenesis. 2011;26:125-32.

77. Lin B, Jiao Z, Dong S, et al. Whole-genome methylation profiling of extracellular vesicle DNA in gastric cancer identifies intercellular communication features. Nat Commun. 2025;16:8084.

78. Bussiere FI, Chaturvedi R, Asim M, et al. Low multiplicity of infection of Helicobacter pylori suppresses apoptosis of B lymphocytes. Cancer Res. 2006;66:6834-42.

79. Bossenmeyer-Pourié C, Kannan R, Ribieras S, et al. The trefoil factor 1 participates in gastrointestinal cell differentiation by delaying G1-S phase transition and reducing apoptosis. J Cell Biol. 2002;157:761-70.

80. Su F, Xiao R, Chen R, et al. WIPF1 promotes gastric cancer progression by regulating PI3K/Akt signaling in a myocardin-dependent manner. iScience. 2023;26:108273.

81. Wen X, Jin HY, Li M, et al. Methylation statuses of NCOR2, PARK2, and ZSCAN12 signify densities of tumor-infiltrating lymphocytes in gastric carcinoma. Sci Rep. 2022;12:862.

82. Wang A, Li Z, Wang M, et al. Molecular characteristics of synchronous multiple gastric cancer. Theranostics. 2020;10:5489-500.

About This Article

Special Topic

This article belongs to the Special Topic Extracellular Vesicles in Disease Diagnosis and Treatment
© The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Extracellular Vesicles and Circulating Nucleic Acids
ISSN 2767-6641 (Online)

Portico

All published articles are preserved here permanently:

https://www.portico.org/publishers/oae/
