Comparison Between the Partial Least Squares Method and Principal Components Using Genetic Algorithm with an Application

Authors

  • Hayder Osman Hussein, Department of Statistics, College of Administration and Economics, University of Baghdad, Iraq.
  • Rabab Abdul-Ridha Saleh, Department of Statistics, College of Administration and Economics, University of Baghdad, Iraq.

DOI:

https://doi.org/10.33095/s8543c81

Keywords:

Multicollinearity, Partial Least Squares (PLS), Principal Component Analysis (PCA), Genetic Algorithm (GA), Multiple Linear Regression.

Abstract

This study addresses the problem of multicollinearity among the independent variables in a
regression model using Partial Least Squares (PLS) and Principal Component Analysis (PCA).
Multicollinearity can degrade the results of regression modeling; in particular, it makes the
traditional ordinary least squares (OLS) technique less reliable. The research centers on
applying advanced algorithms for PLS (SIMPLS and O-PLS) and for PCA (NIPALS and SVD) to
real-life data, and on integrating genetic algorithms (GAs) with these algorithms to optimize
predictive performance.
The relative efficiency of these methods is evaluated using the Mean Square Error (MSE) as the
criterion for comparison. The results show that PLS (O-PLS) outperforms PCA, giving the lowest
MSE both before and after the genetic algorithm is embedded. This supports the effectiveness of
PLS in mitigating the effects of multicollinearity, thereby allowing highly predictive models to
be formulated.
The results also indicate that fine-tuning advanced GA techniques can further enhance
regression modeling for complex data analysis. This line of research opens up possibilities for
strengthening regression methodology, especially across the range of applications that demand
a high level of prediction accuracy.
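
As an illustration of the kind of comparison summarized above, the following is a minimal sketch, not the authors' implementation. It fits a principal component regression (PCA computed by SVD) and a single-response PLS regression (a NIPALS-style deflation, used here in place of the SIMPLS and O-PLS variants named in the abstract) on synthetic collinear data, and compares them by test-set MSE. The data generator, the function names, and the choice of two components are assumptions made for the example; the genetic algorithm step is omitted.

```python
import numpy as np

def make_collinear_data(n=200, seed=0):
    """Synthetic data (assumption): six predictors driven by two latent factors."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 2))
    loadings = rng.normal(size=(2, 6))
    X = z @ loadings + 0.05 * rng.normal(size=(n, 6))  # strongly collinear columns
    y = z[:, 0] - 2.0 * z[:, 1] + 0.1 * rng.normal(size=n)
    return X, y

def pcr_fit(X, y, k):
    """Principal component regression on the first k components (via SVD)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                       # loadings of the first k components
    T = Xc @ V                         # component scores, with T'T = diag(s^2)
    gamma = (T.T @ yc) / (s[:k] ** 2)  # regression of y on the scores
    beta = V @ gamma                   # coefficients in the original predictors
    return beta, y_mean - x_mean @ beta

def pls1_fit(X, y, k):
    """PLS regression for one response, k components, with X deflation."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(k):
        w = Xk.T @ yc
        w = w / np.linalg.norm(w)      # weight vector: direction of max covariance
        t = Xk @ w                     # score vector
        tt = t @ t
        p = Xk.T @ t / tt              # X loading
        q.append(yc @ t / tt)          # y loading
        Xk = Xk - np.outer(t, p)       # deflate X before the next component
        W.append(w)
        P.append(p)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, y_mean - x_mean @ beta

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

X, y = make_collinear_data()
X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]

results = {}
for name, fit in (("PCR", pcr_fit), ("PLS", pls1_fit)):
    beta, intercept = fit(X_tr, y_tr, 2)
    results[name] = mse(y_te, X_te @ beta + intercept)
print(results)
```

Which method attains the lower MSE depends on the data and on the number of components retained; the ranking reported in the abstract refers to the authors' real-life data and is not reproduced by this synthetic example.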

Published

2025-02-01

Section

Statistical Researches

How to Cite

Hussein, H.O. and Saleh, R.A.-R. (2025) “Comparison Between the Partial Least Squares Method and Principal Components Using Genetic Algorithm with an Application”, Journal of Economics and Administrative Sciences, 31(145), pp. 143–162. doi:10.33095/s8543c81.
