Abstract Visible and near-infrared spectroscopy (vis-NIRS) may be useful for estimating soil properties in arable fields, but the quality of the results often varies with the chemometric approach applied. Partial least squares regression (PLSR) may be replaced by approaches that employ supervised learning methods or variable selection procedures in order to increase the proportion of informative wavelengths used in the estimation, reduce spectral noise and find the best-fitting solution. The objectives were (1) to compare the usefulness of PLSR with that of PLSR combined with a genetic algorithm (GA-PLSR) and of support vector machine regression (SVMR) for estimating soil organic carbon (SOC), total nitrogen (N), pH, cation exchange capacity (CEC) and soil texture in surface soils (0–5 cm, n = 144) of an arable field in Bangalore (India), and (2) to test and optimize different calibration strategies for GA-PLSR to improve the estimation of soil properties. PLSR was useful for estimating SOC, N, sand and clay. In the cross-validation (n = 96), accuracies of the estimated soil properties generally decreased in the order GA-PLSR $>$ SVMR $>$ PLSR. However, for the random validation sample (n = 48), the order of estimation accuracies changed to SVMR $>$ GA-PLSR $>$ PLSR for SOC, N, pH and CEC, whereas for clay it changed to SVMR $>$ PLSR $>$ GA-PLSR. A sequential procedure that used the wavelengths most frequently selected in the GA-PLSR runs proved useful for improving the estimation of SOC and N. Overall, SVMR especially improved the estimation of SOC and clay, whereas GA-PLSR was particularly useful for SOC and N and was the only approach that successfully estimated CEC in both cross-validation and validation.
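The idea behind the sequential procedure, keeping the wavelengths that are selected most often across repeated GA-PLSR runs, can be illustrated with a minimal sketch. The abstract does not state how selection frequency was thresholded, so the function name `most_frequent_wavelengths`, the `min_fraction` cutoff and the wavelength values below are hypothetical and serve only as an illustration.

```python
from collections import Counter


def most_frequent_wavelengths(runs, min_fraction=0.5):
    """Return the wavelengths chosen in at least `min_fraction` of the runs.

    `runs` is a list of wavelength lists, one per GA run (assumed format).
    The result is sorted in ascending order.
    """
    if not runs:
        return []
    # Count each wavelength at most once per run.
    counts = Counter(w for run in runs for w in set(run))
    threshold = min_fraction * len(runs)
    return sorted(w for w, c in counts.items() if c >= threshold)


# Hypothetical wavelength selections (nm) from four GA runs; not data from the study.
ga_runs = [
    [1410, 1900, 2200, 2340],
    [1410, 1900, 2210],
    [1900, 2200, 2340],
    [1410, 1900, 2200],
]

# Keep wavelengths selected in at least 75 % of the runs.
selected = most_frequent_wavelengths(ga_runs, min_fraction=0.75)
print(selected)
```

In such a scheme the retained wavelengths would then serve as the predictor set for a subsequent calibration; the cutoff trades off between keeping too many noisy wavelengths and discarding informative ones.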