Comparison of logistic regression and machine learning techniques in prediction of habitat distribution of plant species
Keywords:
Artificial neural network, Logistic regression, Machine learning, Maximum entropy, True skill statistic
Abstract
This study compared the performance of logistic regression (LR) and machine learning techniques in predicting the habitat distribution of plant species in the rangelands of Qom Province, Iran. After homogeneous units were delineated, vegetation was sampled using the randomized-systematic method. Plot size, determined by the minimal area method, ranged from 2 to 25 m². For soil sampling, eight holes were dug in each habitat and samples were taken from depths of 0-30 and 30-80. Soil characteristics, including gravel percentage, texture, saturation moisture, available water, lime, gypsum, organic matter, acidity (pH), and electrical conductivity (EC), were measured by standard methods. Digital soil layers with a common spatial resolution were prepared by geostatistical (kriging) interpolation and stored in a GIS. A digital elevation model of the region was used to map slope, aspect, and elevation. After the models were run, the kappa coefficient (κ) and the true skill statistic (TSS) were calculated to evaluate the agreement between the predicted and actual maps. The highest values of κ and TSS were obtained by the artificial neural network (ANN; κ = 0.81, TSS = 0.8), followed by maximum entropy (MaxEnt; κ = 0.79, TSS = 0.57) and LR (κ = 0.63, TSS = 0.55). These results suggest a strong relationship between model performance and the kinds of species distributions being modeled. Some methods generally performed better, but no method was superior in all circumstances.
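As an illustration of the two agreement measures named above, the following minimal sketch computes Cohen's kappa and TSS from a 2×2 presence/absence confusion matrix using their standard definitions. The function name `kappa_and_tss` and the example counts are hypothetical and are not taken from the study; the sketch assumes that predicted and actual maps have already been reduced to counts of true positives, false positives, false negatives, and true negatives.

```python
def kappa_and_tss(tp, fp, fn, tn):
    """Return (kappa, TSS) for a 2x2 presence/absence confusion matrix.

    tp: presence predicted, presence observed
    fp: presence predicted, absence observed
    fn: absence predicted, presence observed
    tn: absence predicted, absence observed
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    if (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one observed presence and one observed absence")

    # Observed agreement: fraction of sites classified correctly.
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)

    # TSS = sensitivity + specificity - 1.
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    tss = sensitivity + specificity - 1.0
    return kappa, tss


# Hypothetical counts, for illustration only.
kappa, tss = kappa_and_tss(tp=40, fp=10, fn=10, tn=40)
print(f"kappa = {kappa:.2f}, TSS = {tss:.2f}")
```

Both statistics range up to 1 for perfect agreement. Unlike kappa, TSS does not depend on prevalence (the proportion of sites where the species is present), which is one reason the two measures can rank models differently.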