1) Which words persist or become mainstream has little to do with how old they are. "Tree shaking" is evocative and probably more approachable to say than "dead code elimination", so it became the more popular term.
2) I was curious about which term actually came first.
I couldn't find any use of the term "tree shaking" or "tree shaker" in the realm of computing on Google Scholar (the results were all about citrus trees or other arboreal topics, weird). The earliest discussion I could find that uses the term is this comp.lang.lisp thread: https://groups.google.com/forum/#!topic/comp.lang.lisp/pspFr...
The first use of "dead-code elimination" I could find was this 1973 dissertation: https://research-repository.st-andrews.ac.uk/bitstream/handl...