If I recall correctly, the only difference between the SAM models is a tradeoff between parameter count and accuracy. I have the parameter counts listed under 'Installation', but the relative quality of the models is task-dependent and subjective.
I would think that part of the motivation for releasing the smaller models alongside the larger ones is use in video segmentation and mobile filters. The smaller models might actually be a better fit for those applications than the biggest one. However, I'd recommend the biggest model (vit_h) for desktop or laptop image processing.
It runs from a Python script.
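For what it's worth, here is a rough sketch of how the model choice could look in such a script. The `pick_model_type` helper and the target names ("desktop", "video", "mobile") are my own invention for illustration, and mapping the lighter use cases to vit_b is an assumption rather than a tested recommendation. The `load_sam` function assumes the `segment_anything` package is installed and that you have already downloaded a checkpoint matching the chosen model type.

```python
# Hypothetical mapping from use case to SAM model type (an assumption, not a benchmark result).
MODEL_FOR_TARGET = {
    "desktop": "vit_h",  # biggest model, suggested for desktop or laptop processing
    "video": "vit_b",    # smallest model, assumed to suit latency-sensitive video work
    "mobile": "vit_b",   # smallest model, assumed to suit constrained devices
}


def pick_model_type(target: str) -> str:
    """Return a SAM model type for the given target, defaulting to vit_h."""
    return MODEL_FOR_TARGET.get(target, "vit_h")


def load_sam(model_type: str, checkpoint_path: str):
    """Build a SAM model from a checkpoint file that matches model_type."""
    # Imported lazily so the helper above works without the package installed.
    from segment_anything import sam_model_registry

    return sam_model_registry[model_type](checkpoint=checkpoint_path)
```

You would call something like `load_sam(pick_model_type("desktop"), checkpoint_path)` with the path to your own downloaded vit_h checkpoint.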