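The purely algorithmic way to do it is the Wigner-Ville distribution, but it isn't practical for complex sounds. The distribution is quadratic in the signal, so every pair of time-frequency components produces an interference (cross) term, and the number of these grows quadratically with the number of components. For a single linear 'chirp', or a small number of well-separated ones, it can give you essentially exact localization.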
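To make that concrete, here is a minimal sketch of a discrete Wigner-Ville distribution in plain Python, applied to a synthetic linear chirp. The function names (linear_chirp, wigner_ville, ridge_frequency) and the test signal are my own choices for illustration, not from any particular library, and the input is assumed to already be a complex (analytic) signal. It is a direct O(N^3) evaluation, so it is only meant for short signals.

```python
import cmath
import math


def linear_chirp(n_samples, f_start, f_end):
    """Complex linear chirp; frequencies in cycles per sample (0 to 0.5)."""
    rate = (f_end - f_start) / n_samples
    return [
        cmath.exp(2j * math.pi * (f_start * n + 0.5 * rate * n * n))
        for n in range(n_samples)
    ]


def wigner_ville(x):
    """Discrete Wigner-Ville distribution of a complex (analytic) signal.

    Returns W with W[n][k] real. Bin k corresponds to k / (2 * len(x))
    cycles per sample, because the lag product spans 2*m samples.
    """
    n_samples = len(x)
    W = []
    for n in range(n_samples):
        # Largest symmetric lag that stays inside the signal.
        max_lag = min(n, n_samples - 1 - n)
        lags = range(-max_lag, max_lag + 1)
        # Instantaneous autocorrelation r[m] = x[n+m] * conj(x[n-m]).
        r = [x[n + m] * x[n - m].conjugate() for m in lags]
        # DFT over the lag variable. The sum is real by conjugate
        # symmetry in m, so only the real part is accumulated.
        row = []
        for k in range(n_samples):
            w = -2.0 * math.pi * k / n_samples
            row.append(
                sum((rm * cmath.exp(1j * w * m)).real for m, rm in zip(lags, r))
            )
        W.append(row)
    return W


def ridge_frequency(W, n):
    """Frequency (cycles per sample) of the strongest bin at time index n."""
    row = W[n]
    k = max(range(len(row)), key=row.__getitem__)
    return k / (2 * len(row))


if __name__ == "__main__":
    x = linear_chirp(64, 0.1, 0.4)
    W = wigner_ville(x)
    for n in (16, 32, 48):
        print(n, ridge_frequency(W, n))
```

For a single linear chirp the ridge follows the instantaneous frequency. If you feed it the sum of two tones instead, a third ridge appears halfway between them in frequency and oscillates in time; that is the cross term, and with many components these pile up quickly.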
I think the software you are looking for would have to be based on machine learning rather than a purely theory-based approach if it's intended for use with natural sound signals.