It's slightly more complex than that. This is in the 3D Tiles format, which uses glTF, but I don't think you can simply grab a glTF file from an API endpoint. It's been sliced into cubes with hierarchical levels of detail and other things I barely understand.
Oh, that's interesting. You probably still need to jump through hoops to figure out the right URL for the grid square and level of detail you need. And there's the session parameter, so you still need to make the initial tile request to get the JSON.
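For what it's worth, here is a rough sketch of the "get the JSON, then work out which URLs you need" step. It assumes the JSON follows the usual 3D Tiles tileset layout (a `root` tile with nested `children`, each optionally carrying a `content.uri`) and that the session shows up as a `session` query parameter on those URIs. The sample data and the function names are made up for illustration, not taken from any real API response.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical tileset JSON, shaped like a 3D Tiles tileset (assumption).
SAMPLE_TILESET = {
    "asset": {"version": "1.0"},
    "root": {
        "geometricError": 1000.0,
        "children": [
            {"geometricError": 500.0,
             "content": {"uri": "/tiles/a.json?session=abc123"}},
            {"geometricError": 500.0,
             "children": [
                 {"geometricError": 100.0,
                  "content": {"uri": "/tiles/b.glb"}},
             ]},
        ],
    },
}


def walk_tiles(tile, depth=0):
    """Yield (depth, geometric_error, uri) for every tile that has content."""
    content = tile.get("content")
    if content and content.get("uri"):
        yield depth, tile.get("geometricError"), content["uri"]
    for child in tile.get("children", []):
        yield from walk_tiles(child, depth + 1)


def session_from_uri(uri):
    """Return the value of a 'session' query parameter, or None if absent."""
    values = parse_qs(urlparse(uri).query).get("session")
    return values[0] if values else None


if __name__ == "__main__":
    for depth, error, uri in walk_tiles(SAMPLE_TILESET["root"]):
        print(depth, error, uri, session_from_uri(uri))
```

Lower `geometricError` means finer detail, so in practice you would stop descending once the error is small enough for your purposes instead of walking the whole tree.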
Do a little googling and you can probably find code on GitHub to take care of the "get me terrain for these coordinates" part for you. That's how I managed it a few years ago when I was programmatically downloading tiles from Google Earth without understanding how to convert from coordinates to their URL scheme.
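To give a flavor of what that kind of conversion code looks like, here is a minimal sketch of the common Web Mercator "XYZ" tile scheme used by many 2D map servers. This is an assumption for illustration only: it is not the 3D Tiles scheme discussed above and not necessarily what Google Earth used. The URL template and function names are hypothetical.

```python
import math


def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Convert WGS84 lat/lon to XYZ tile indices at the given zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so edge cases (lon = 180, extreme latitudes) stay in range.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tile_url(template, lat_deg, lon_deg, zoom):
    """Fill a {z}/{x}/{y} style URL template (hypothetical) for a coordinate."""
    x, y = latlon_to_tile(lat_deg, lon_deg, zoom)
    return template.format(z=zoom, x=x, y=y)


if __name__ == "__main__":
    print(tile_url("https://example.com/{z}/{x}/{y}.png", 0.0, 0.0, 1))
```

Real services often layer their own quirks on top (flipped y axis, quadkeys, auth tokens), which is why borrowing someone else's tested code is usually the quicker route.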