GRAMEP, AN ALIGNMENT-FREE METHOD BASED ON THE PRINCIPLE OF MAXIMUM ENTROPY FOR GENOME ANALYSIS

Published on 21/11/2024 - ISBN: 978-65-272-0843-3

Paper Title
GRAMEP, AN ALIGNMENT-FREE METHOD BASED ON THE PRINCIPLE OF MAXIMUM ENTROPY FOR GENOME ANALYSIS
Authors
  • Matheus Henrique Pimenta-Zanon
  • André Yoshiaki Kashiwabara
  • André Luís Laforga Vanzela
  • Fabricio Martins Lopes
Modality
Poster
Subject area
Database and Software Development
Publishing Date
21/11/2024
Country of Publishing
Brazil
Language of Publishing
English
Paper Page
https://www.even3.com.br/anais/xmeeting-2024/836337-gramep-an-alignment-free-method-based-on-the-principle-of-maximum-entropy-for-genome-analysis
ISBN
978-65-272-0843-3
Keywords
Alignment-Free Methods. SNP Mutation Identification. Biological Sequence Classification. Maximum Entropy Principle. Genomic Data Analysis.
Summary
The recent growth of high-throughput DNA sequencing technologies has led to the generation of huge and complex genomic datasets. Analyzing these vast amounts of data efficiently and robustly is a significant challenge. Traditionally, variant calling within genomes relies on alignment techniques. However, such methods can be computationally expensive and susceptible to arbitrary placements, particularly in large-scale analyses. This study presents GRAMEP (Genome vaRiation Analysis from the Maximum Entropy), a novel alignment-free tool for identifying single nucleotide polymorphisms (SNPs) within DNA sequences. The method uses k-mer frequencies computed across all input sequences. By applying the principle of maximum entropy, it selects the most informative k-mers for the specific variant being investigated, that is, the k-mers that best discriminate the variant from the reference sequence. The Levenshtein distance is then computed for each unique k-mer to pinpoint variations relative to the reference. This makes it possible to extract the type, location, and frequency of SNP mutations. The proposed method also extends to the classification of novel sequences: the exclusive k-mer frequency profiles of known variants serve as features for a machine learning model, which then predicts the class of new sequences. Preliminary results show the method’s potential for efficient, simultaneous analysis of thousands of SARS-CoV-2 viral genome sequences, yielding SNP information across the entire analyzed region. For classification, comparisons with existing methods show promising results, with better performance metrics and lower consumption of computational resources. The tool is implemented in Python and is openly available to the scientific community.
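The SNP identification steps outlined in the Summary can be illustrated with a minimal Python sketch. It is not the GRAMEP implementation: the function names, the toy sequences, and the Kapur-style entropy threshold used to select k-mers are assumptions made for illustration, since the Summary does not state how the maximum entropy principle is applied. The sketch counts k-mers across the variant sequences, keeps those at or above the frequency cut that maximizes the summed entropy of the low- and high-frequency groups, discards k-mers that also occur in the reference, and reports a substitution wherever a remaining k-mer lies at Levenshtein distance 1 from a reference k-mer.

```python
from collections import Counter
from math import log


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def max_entropy_threshold(counts):
    """Assumed selection rule (Kapur-style thresholding): return the frequency
    cut that maximizes the summed entropy of the groups below and above it."""
    hist = Counter(counts.values())  # frequency value -> number of k-mers
    levels = sorted(hist)
    total = sum(hist.values())
    probs = [hist[level] / total for level in levels]

    def entropy(ps):
        mass = sum(ps)
        return -sum(p / mass * log(p / mass) for p in ps if p > 0)

    best_level, best_score = levels[0], float("-inf")
    for cut in range(1, len(levels)):
        score = entropy(probs[:cut]) + entropy(probs[cut:])
        if score > best_score:
            best_level, best_score = levels[cut], score
    return best_level


def informative_kmers(sequences, k):
    """Keep the k-mers whose count reaches the entropy-based threshold."""
    counts = kmer_counts(sequences, k)
    threshold = max_entropy_threshold(counts)
    return {kmer for kmer, count in counts.items() if count >= threshold}


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def find_snps(reference, variant_sequences, k):
    """Return sorted (position, reference base, variant base) substitutions."""
    ref_kmers = [(reference[i:i + k], i) for i in range(len(reference) - k + 1)]
    ref_set = {kmer for kmer, _ in ref_kmers}
    exclusive = informative_kmers(variant_sequences, k) - ref_set
    snps = set()
    for kmer in exclusive:
        for ref_kmer, start in ref_kmers:
            # Equal-length k-mers at distance 1 differ by one substitution.
            if levenshtein(kmer, ref_kmer) == 1:
                offset = next(i for i in range(k) if kmer[i] != ref_kmer[i])
                snps.add((start + offset, ref_kmer[offset], kmer[offset]))
    return sorted(snps)


# Toy data (hypothetical): four variant copies with T->A at position 8,
# plus one noisy copy carrying an extra, unshared substitution at position 2.
reference = "ACGTAGCTTGACCATG"
variant = "ACGTAGCTAGACCATG"
noisy = "ACTTAGCTAGACCATG"
print(find_snps(reference, [variant] * 4 + [noisy], k=5))
```

On this toy input, the k-mers introduced by the single noisy sequence fall below the threshold, so only the substitution shared by all variant sequences is reported. Comparing every exclusive k-mer against every reference k-mer is quadratic and is used here only to keep the example short.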
Title of the Event
20º Congresso Brasileiro de Bioinformática: X-Meeting 2024
City of the Event
Salvador
Title of the Proceedings of the event
X-Meeting presentations
Name of the Publisher
Even3
Means of Dissemination
Digital media

How to cite

PIMENTA-ZANON, Matheus Henrique et al. GRAMEP, AN ALIGNMENT-FREE METHOD BASED ON THE PRINCIPLE OF MAXIMUM ENTROPY FOR GENOME ANALYSIS. In: X-Meeting presentations. Anais... Salvador (BA): Hotel Deville Prime, 2024. Available at: https://www.even3.com.br/anais/xmeeting-2024/836337-GRAMEP-AN-ALIGNMENT-FREE-METHOD-BASED-ON-THE-PRINCIPLE-OF-MAXIMUM-ENTROPY-FOR-GENOME-ANALYSIS. Accessed on: 27/04/2025.
