My favorite MLE example: Suppose you walk into a bank and ask them to give you a quarter. You flip the quarter twice and get two heads. Given this experiment, what do you estimate the probability p of getting heads to be when you flip this coin? Using MLE, you would get p = 1. In other words, this coin will always come up heads when you flip it! (According to MLE.)
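For concreteness, here's a minimal sketch of that computation (function name is mine): the MLE for a Bernoulli coin is just the observed fraction of heads, so two heads in two flips gives p = 1.

    def mle_heads_prob(heads: int, flips: int) -> float:
        # The MLE for a Bernoulli parameter is the sample proportion:
        # argmax over p of p^heads * (1 - p)^(flips - heads) is heads / flips.
        return heads / flips

    print(mle_heads_prob(2, 2))  # 1.0 -- MLE says this coin always lands heads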
Are you just demonstrating overfitting from estimating with too little data? Or is there something deeper going on in your example? What does the bank have to do with anything?
The bank is context that gives us a prior probability. However, MLE does not consider a prior, so it can give results that are not very helpful in the real world. All MLE does is answer: what parameter value (in this case, the probability of heads) makes the observed outcome most likely? It implicitly treats all parameter values as equally plausible. In reality, we know it is highly likely that a random coin from a bank is fair, so after flipping two heads we are still almost certain it's a fair coin. If, on the other hand, we flipped 10 heads in a row, we might start to wonder whether the bank somehow gave us a trick coin. MAP is an alternative to MLE, arguably better in many situations: https://www.cs.cmu.edu/~aarti/Class/10701_Spring23/Lecs/Lect....
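To make that concrete, here's a minimal MAP sketch. The Beta(50, 50) prior is my choice to encode "bank coins are almost certainly fair"; it's not from the linked slides.

    def map_heads_prob(heads: int, flips: int,
                       a: float = 50.0, b: float = 50.0) -> float:
        # With a Beta(a, b) prior on p, the posterior after observing the
        # flips is Beta(a + heads, b + tails); the MAP estimate is its mode.
        tails = flips - heads
        return (a + heads - 1) / (a + b + flips - 2)

    print(map_heads_prob(2, 2))    # 0.51   -- two heads barely moves us off "fair"
    print(map_heads_prob(10, 10))  # ~0.546 -- ten straight heads nudges us further

Note that with a uniform Beta(1, 1) prior, this reduces to heads / flips, i.e., exactly the MLE above. That's the precise sense in which MLE "considers all parameter values equally likely."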
The example only seems ridiculous because you've deliberately excluded relevant knowledge about the world from the model. Add a prior to the model and you'll have a much more reasonable function to maximise.