Volume no: 19, Issue no: 1 (2025)

RESEARCH ON BREAST CANCER CLASSIFICATION BASED ON PCA AND K-NEAREST NEIGHBOR

Authors: Lujin Lyu, Xinyu Liu, Leilei Bao, Qirui Xiao, Li Li and Jianqiang Gao
Pages: [37] - [52]
Received Date: December 26, 2025
DOI: http://dx.doi.org/10.18642/ijamml_7100122332

Abstract

Breast cancer has become the most common malignant tumor, making early and precise diagnosis crucial for improving patient survival rates. Although high-dimensional clinical data provide rich information, they also introduce the ‘curse of dimensionality’, which increases computational cost and reduces the generalization ability of traditional models. To address this problem, this paper proposes a collaborative framework based on principal component analysis (PCA) and K-nearest neighbor (KNN) classification, denoted PCA-KNN. The approach employs PCA for dimensionality reduction, compressing the feature space while preserving the key variation information. The KNN algorithm then performs the classification task in the resulting low-dimensional subspace. The method adopts a cross-validation strategy and determines the K value of the KNN algorithm according to the contribution rate of the principal components. Experimental results show that the proposed method significantly improves computational efficiency while maintaining high classification accuracy, achieving an effective balance between information retention and computational performance. The method is lightweight, highly interpretable, and easy to operate. It not only provides a practical auxiliary tool for breast cancer recurrence risk prediction, but also establishes a general paradigm for high-dimensional medical data analysis tasks such as imaging and multi-group fusion, thereby contributing to the implementation of precision medicine in clinical practice.
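The PCA-KNN workflow outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, uses the library's bundled Wisconsin diagnostic breast cancer data as a stand-in (the paper's dataset is not named here), and treats the candidate variance thresholds, the candidate K values, and the helper name `fit_pca_knn` as hypothetical choices made only for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def fit_pca_knn(X, y, variance_levels=(0.90, 0.95, 0.99),
                k_values=(3, 5, 7, 9, 11), seed=0):
    """Hypothetical helper: cross-validated search over PCA variance and K."""
    pipe = Pipeline([
        # Standardize first so PCA is not dominated by large-scale features.
        ("scale", StandardScaler()),
        # A float n_components keeps enough components to reach that
        # cumulative explained-variance ratio (requires the full SVD solver).
        ("pca", PCA(svd_solver="full")),
        ("knn", KNeighborsClassifier()),
    ])
    grid = {
        "pca__n_components": list(variance_levels),  # assumed thresholds
        "knn__n_neighbors": list(k_values),          # assumed K candidates
    }
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(pipe, grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    return search


# Stand-in data: 569 samples with 30 features (an assumption, see above).
X, y = load_breast_cancer(return_X_y=True)
search = fit_pca_knn(X, y)

# Number of principal components actually retained by the best pipeline.
n_kept = search.best_estimator_.named_steps["pca"].n_components_
print("best parameters:", search.best_params_)
print("components kept:", n_kept, "of", X.shape[1])
print("mean CV accuracy:", round(search.best_score_, 4))
```

Placing the scaler and PCA inside the pipeline means both are refit on each training fold, so the cross-validated accuracy is not inflated by information from the held-out fold. Here the variance threshold and K are searched jointly; the paper's exact rule for deriving K from the contribution rate may differ.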

Keywords

breast cancer, PCA, KNN, dimensionality reduction, classification.