I'm a dual-award PhD student at UAEU
(supervised by Prof. Nazar Zaki) and
KU Leuven
(supervised by Miryam de Lhoneux), where I'm part of
LAGoM NLP. I did my master's in Data Science
at Universiti Malaya.
My research focuses on graph-based data selection: using graphs to find representative
subsets of training data so models can learn from less while losing as little as possible.
In simple terms, I build smarter ways to train AI models using less data.
Outside of research, I enjoy photography.
Publications
Scalable Graph Attention-Based Instance Selection via Mini-Batch Sampling and Hierarchical Hashing
Zahiriddin Rustamov, Ayham Zaitouny, Nazar Zaki
Instance selection (IS) addresses the critical challenge of reducing dataset size while keeping informative characteristics, a task that becomes increasingly important as datasets grow to millions of instances. Current IS methods often struggle to capture complex relationships in high-dimensional spaces and to scale to large datasets. This paper introduces a graph attention-based instance selection (GAIS) method that uses attention mechanisms to identify informative instances through their structural relationships in graph representations. We present two approaches for scalable graph construction: a distance-based mini-batch sampling technique that achieves dataset-size-independent complexity through strategic batch processing, and a hierarchical hashing approach that enables efficient similarity computation through random projections. The mini-batch approach preserves class distributions through stratified sampling, while the hierarchical hashing method captures relationships at multiple granularities through single-level, multi-level, and multi-view variants. Experiments across 39 datasets show that GAIS achieves reduction rates above 96% while maintaining or improving model performance relative to state-of-the-art IS methods. The findings show that the distance-based mini-batch approach offers optimal efficiency for large-scale datasets, while multi-view variants excel on complex, high-dimensional data, demonstrating that attention-based importance scoring can effectively identify instances critical for maintaining decision boundaries while avoiding computationally prohibitive pairwise comparisons. The code is publicly available at https://github.com/zahiriddin-rustamov/gais.
@article{scalable-gais-2025,
title={Scalable Graph Attention-Based Instance Selection via Mini-Batch Sampling and Hierarchical Hashing},
author={Rustamov, Zahiriddin and Zaitouny, Ayham and Zaki, Nazar},
journal={AI Open},
year={2025}
}
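As a rough illustration of the hashing idea, a single-level random-projection hash buckets instances by the sign pattern of a few random projections, so similarity only ever needs to be computed within a bucket. This is my own toy sketch, not the GAIS implementation; the function name and parameters (`n_planes`, `seed`) are invented for illustration.

```python
import numpy as np

def random_projection_buckets(X, n_planes=8, seed=0):
    """Hash instances into buckets by the sign pattern of random projections.

    Points falling on the same side of every random hyperplane share a
    bucket, so pairwise similarity only needs to be computed per bucket.
    (Toy single-level sketch; names and parameters are illustrative.)
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_planes))
    signs = (X @ planes) >= 0                     # (n, n_planes) booleans
    # Pack each sign pattern into a single integer bucket key.
    keys = signs.astype(int) @ (1 << np.arange(n_planes))
    buckets = {}
    for i, key in enumerate(keys):
        buckets.setdefault(int(key), []).append(i)
    return buckets

# Two well-separated clusters: nearby points usually land in the same bucket.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-5, 0.1, (5, 4)), rng.normal(5, 0.1, (5, 4))])
buckets = random_projection_buckets(X)
```

A multi-level or multi-view variant would repeat this at several granularities (different plane counts, or different feature subsets); in any case the bucket keys simply partition the points.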
Graph Reduction Techniques for Instance Selection: Comparative and Empirical Study
Zahiriddin Rustamov, Nazar Zaki, Jaloliddin Rustamov, Ayham Zaitouny, Rafat Damseh
The surge in data generation has prompted a shift to big data, challenging the notion that "more data equals better performance" due to processing and time constraints. In this evolving artificial intelligence and machine learning landscape, instance selection (IS) has become crucial for data reduction without compromising model quality. Traditional IS methods, though efficient, often struggle with large, complex datasets in data mining. This study evaluates graph reduction techniques, grounded in graph theory, as a novel approach to instance selection. The objective is to leverage the inherent structures of data represented as graphs to enhance the effectiveness of instance selection. We evaluated 35 graph reduction techniques across 29 classification datasets. These techniques were assessed on various metrics, including accuracy, F1 score, reduction rate, and computational time. Graph reduction methods showed significant potential in maintaining data integrity while achieving substantial reductions. Top techniques achieved up to 99% reduction while maintaining or improving accuracy. For instance, Multilevel sampling achieved an accuracy effectiveness score of 0.8555 with 99.16% reduction on large datasets, while Leiden sampling showed high effectiveness on smaller datasets (0.8034 accuracy, 97.87% reduction). Computational efficiency varied widely, with reduction times ranging from milliseconds to minutes. This research advances the theory of graph-based instance selection and offers practical application guidelines. Our findings indicate that graph reduction methods effectively preserve data quality and boost processing efficiency in large, complex datasets, with some techniques achieving up to 160-fold speedups in model training at high reduction rates.
@article{graph-reduction-2024,
title={Graph Reduction Techniques for Instance Selection: Comparative and Empirical Study},
author={Rustamov, Zahiriddin and Zaki, Nazar and Rustamov, Jaloliddin and Zaitouny, Ayham and Damseh, Rafat},
journal={Artificial Intelligence Review},
year={2024}
}
GAIS: A Novel Approach to Instance Selection with Graph Attention Networks
Zahiriddin Rustamov, Ayham Zaitouny, Rafat Damseh, Nazar Zaki
Instance selection (IS) is a crucial technique in machine learning that aims to reduce dataset size while maintaining model performance. This paper introduces a novel method called Graph Attention-based Instance Selection (GAIS), which leverages Graph Attention Networks (GATs) to identify the most informative instances in a dataset. GAIS represents the data as a graph and uses GATs to learn node representations, enabling it to capture complex relationships between instances. The method processes data in chunks, applies random masking and similarity thresholding during graph construction, and selects instances based on confidence scores from the trained GAT model. Experiments on 13 diverse datasets demonstrate that GAIS consistently outperforms traditional IS methods in terms of effectiveness, achieving high reduction rates (average 96%) while maintaining or improving model performance. Although GAIS exhibits slightly higher computational costs, its superior performance in maintaining accuracy with significantly reduced training data makes it a promising approach for graph-based data selection. Code is available at https://github.com/zahiriddin-rustamov/gais.
@inproceedings{gais-2024,
title={GAIS: A Novel Approach to Instance Selection with Graph Attention Networks},
author={Rustamov, Zahiriddin and Zaitouny, Ayham and Damseh, Rafat and Zaki, Nazar},
booktitle={IEEE ICKG 2024},
year={2024}
}
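To make the chunked graph construction concrete, here is a rough sketch of building similarity-thresholded edges within randomly assigned chunks, so the full pairwise comparison over all instances is never materialized. This is my own illustration under assumed details: `chunk_size` and `threshold` are invented parameters, and GAIS's actual masking and thresholding scheme may differ.

```python
import numpy as np

def chunk_edges(X, chunk_size=4, threshold=0.9, seed=0):
    """Build candidate edges within random chunks using cosine similarity.

    Instances are shuffled into chunks; within each chunk, an edge is kept
    when cosine similarity meets the threshold. (Illustrative sketch only.)
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))                 # random chunk assignment
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    edges = []
    for start in range(0, len(X), chunk_size):
        idx = order[start:start + chunk_size]
        sim = Xn[idx] @ Xn[idx].T                   # pairwise cosine in chunk
        for a in range(len(idx)):
            for b in range(a + 1, len(idx)):
                if sim[a, b] >= threshold:
                    edges.append((int(idx[a]), int(idx[b])))
    return edges

X = np.arange(16.0).reshape(8, 2) + 1.0             # 8 toy instances in 2-D
edges = chunk_edges(X)
```

Each chunk of size k costs O(k²) similarity computations, so the total work grows linearly in the number of chunks rather than quadratically in dataset size.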
GAT-RWOS: Graph Attention-Guided Random Walk Oversampling for Imbalanced Data Classification
Zahiriddin Rustamov, Abderrahmane Lakas, Nazar Zaki
Class imbalance poses a significant challenge in machine learning (ML), often leading to biased models favouring the majority class. In this paper, we propose GAT-RWOS, a novel graph-based oversampling method that combines the strengths of Graph Attention Networks (GATs) and random walk-based oversampling. GAT-RWOS leverages the attention mechanism of GATs to guide the random walk process, focusing on the most informative neighbourhoods for each minority node. By performing attention-guided random walks and interpolating features along the traversed paths, GAT-RWOS generates synthetic minority samples that expand class boundaries while preserving the original data distribution. Extensive experiments on a diverse set of imbalanced datasets demonstrate the effectiveness of GAT-RWOS in improving classification performance, outperforming state-of-the-art oversampling techniques. The proposed method has the potential to significantly improve the performance of ML models on imbalanced datasets and contribute to the development of more reliable classification systems. Code is available at https://github.com/zahiriddin-rustamov/gat-rwos.
@inproceedings{gat-rwos-2024,
title={GAT-RWOS: Graph Attention-Guided Random Walk Oversampling for Imbalanced Data Classification},
author={Rustamov, Zahiriddin and Lakas, Abderrahmane and Zaki, Nazar},
booktitle={IEEE ICKG 2024},
year={2024}
}
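The core loop of attention-guided random walk oversampling can be sketched in a few lines: at each step the next node is drawn with probability proportional to a precomputed attention weight, and features are interpolated along the traversed path to produce a synthetic sample. This is my own simplified illustration, not the exact GAT-RWOS procedure; the function name, `alpha`, and the attention dictionary are all assumptions.

```python
import numpy as np

def attention_walk_sample(features, neighbors, attention, start,
                          steps=2, alpha=0.5, seed=0):
    """Generate one synthetic sample via an attention-weighted random walk.

    `attention[(u, v)]` holds a nonnegative weight for edge (u, v); the walk
    favors high-attention neighbors, and the sample is a running
    interpolation of node features along the path. (Illustrative sketch.)
    """
    rng = np.random.default_rng(seed)
    node, sample = start, features[start].astype(float)
    for _ in range(steps):
        nbrs = neighbors[node]
        w = np.array([attention[(node, n)] for n in nbrs], dtype=float)
        node = int(rng.choice(nbrs, p=w / w.sum()))  # attention-guided step
        # Interpolate toward the chosen neighbor's features.
        sample = (1 - alpha) * sample + alpha * features[node]
    return sample

# Tiny triangle graph of minority nodes with uniform attention weights.
features = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
attention = {(a, b): 1.0 for a in range(3) for b in range(3) if a != b}
synthetic = attention_walk_sample(features, neighbors, attention, start=0)
```

Because each step is a convex combination with `alpha` in [0, 1], the synthetic point stays inside the bounding box of the minority nodes it visited, which is the sense in which the method expands class boundaries without leaving the original distribution's support.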
Improving Microvascular Brain Analysis with Adversarial Learning for OCT-TPM Vascular Domain Translation
Nadia Badawi, Jaloliddin Rustamov, Zahiriddin Rustamov, Frederic Lesage, Nazar Zaki, Rafat Damseh
Modeling microscopic cerebrovascular networks is essential for understanding cerebral blood flow and oxygen transport. High-resolution imaging modalities, such as Optical Coherence Tomography (OCT) and Two-Photon Microscopy (TPM), are widely used to capture microvascular structure and topology. Although TPM angiography generally provides better localization and image quality than OCT, its use is impractical in studies involving fluorescent dye leakage. Here, we exploit generative adversarial learning to produce high-quality TPM angiographies from OCT vascular stacks. We investigate the use of 2D and 3D cycle generative adversarial networks (CycleGANs) trained on unpaired image samples. We evaluate the generated TPM vascular structures based on image similarity and signal-to-noise ratio. Additionally, we evaluate the generated vascular structures after applying vessel segmentation and extracting their 3D topological models. Our results demonstrate that the 2D adversarial learning model outperforms the 3D model in terms of image quality. However, our statistical comparisons of vascular network features show the 3D model's consistent superiority in generating vascular structures. Our work provides a complementary approach to enhance vascular analysis when only OCT imaging is available.
@article{oct-tpm-adversarial-2025,
title={Improving Microvascular Brain Analysis with Adversarial Learning for OCT-TPM Vascular Domain Translation},
author={Badawi, Nadia and Rustamov, Jaloliddin and Rustamov, Zahiriddin and Lesage, Frederic and Zaki, Nazar and Damseh, Rafat},
journal={Scientific Reports},
year={2025}
}