THE IMPACT OF DEMOGRAPHIC COMPOSITION IN UTKFACE AND APPA-REAL DATASETS ON FAIRNESS IN AGE ESTIMATION MODELS

Nenad PANIĆ, Faculty of Technical Sciences, Singidunum University, Belgrade, Serbia
Keywords: Fairness, Dataset bias, Age estimation, Demographic imbalance, Distributional fairness baseline

Abstract


This paper analyzes the impact of demographic composition on fairness in facial age estimation models trained on the UTKFace and APPA-REAL datasets. Building on previously published empirical results, the study provides a theoretical and analytical interpretation of how dataset imbalance affects model bias. Through a comparative evaluation of group-wise performance metrics, including Mean Absolute Error, Standard Deviation, Disparate Impact, and Equality of Opportunity, the paper introduces the concept of a Distributional Fairness Baseline (DFB) as a diagnostic framework for separating dataset-driven bias from model-induced bias. The analysis reveals that fairness is primarily a function of the representativeness and internal structure of the training data rather than of model architecture. Contrary to common assumptions, full data equalization through oversampling does not necessarily enhance equity and may even amplify disparities through overfitting and redundancy. Instead, moderate redistribution, particularly controlled undersampling of dominant groups, often achieves an optimal balance between accuracy and fairness. These findings emphasize that equitable model performance depends on both quantitative and qualitative diversity within datasets, establishing data design as the central determinant of fairness in automated age estimation systems.
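As a rough illustration of the group-wise evaluation and the controlled undersampling described in the abstract, the following Python sketch computes per-group Mean Absolute Error and standard deviation and then thins out the largest group. It is not the paper's implementation. The function names, the record layout `(group, true_age, predicted_age)`, and the `keep_fraction` parameter are all assumptions made for this example. The abstract does not define how Disparate Impact or Equality of Opportunity are computed for age estimation, so `mae_ratio` is only a hypothetical stand-in disparity score, not either of those metrics.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def groupwise_errors(records):
    """Return {group: (MAE, std of absolute error)}.

    `records` is an iterable of (group, true_age, predicted_age) tuples
    (an assumed layout, not taken from the paper).
    """
    errors = defaultdict(list)
    for group, true_age, predicted_age in records:
        errors[group].append(abs(true_age - predicted_age))
    return {g: (mean(e), pstdev(e)) for g, e in errors.items()}


def mae_ratio(stats):
    """Hypothetical disparity score: lowest group MAE / highest group MAE.

    1.0 means every group has the same MAE; values near 0 mean a large gap.
    """
    maes = [group_mae for group_mae, _ in stats.values()]
    worst = max(maes)
    return 1.0 if worst == 0 else min(maes) / worst


def undersample_dominant(samples, keep_fraction, seed=0):
    """Keep only `keep_fraction` of the largest group; other groups are untouched.

    Each sample is a tuple whose first element is the group label (assumed).
    """
    by_group = defaultdict(list)
    for sample in samples:
        by_group[sample[0]].append(sample)
    dominant = max(by_group, key=lambda g: len(by_group[g]))
    keep = max(1, round(len(by_group[dominant]) * keep_fraction))
    by_group[dominant] = random.Random(seed).sample(by_group[dominant], keep)
    return [s for group_samples in by_group.values() for s in group_samples]
```

In such a setup, one would retrain on the output of `undersample_dominant` for a few values of `keep_fraction` and compare the `groupwise_errors` of each run, which mirrors the moderate redistribution the abstract contrasts with full equalization.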

Published: 2025/12/07
Section: Original Scientific Paper