Yes, each chain can use sharding on its own. The protocol underlying each chain or sharding method would just have to implement the chain interoperability protocol.
Think of it like TCP/IP today. Inside your own network you can use whatever you want, let's say Zigbee. Once you want to interoperate with other networks, though, you have to go through TCP/IP or something similar.
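To make the analogy a bit more concrete, here is a rough Python sketch. Everything in it is hypothetical and made up for illustration (`InteropMessage`, `Chain`, `ShardedChain`, `SingleLedgerChain`, and the toy shard-selection rule); it is not any real chain's API. The point is only that each chain stores data however it likes internally, while cross-chain traffic goes through one shared message format.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class InteropMessage:
    """Hypothetical common envelope all chains agree on (the 'TCP/IP' layer)."""
    source_chain: str
    dest_chain: str
    payload: bytes


class Chain(ABC):
    """A chain is free to choose its internals; only send/receive are shared."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: list[InteropMessage] = []

    @abstractmethod
    def commit_locally(self, payload: bytes) -> str:
        """Store the payload using the chain's own method; return where it went."""

    def send(self, dest: "Chain", payload: bytes) -> InteropMessage:
        # Record locally first, then wrap the payload in the common envelope.
        self.commit_locally(payload)
        msg = InteropMessage(self.name, dest.name, payload)
        dest.receive(msg)
        return msg

    def receive(self, msg: InteropMessage) -> None:
        if msg.dest_chain != self.name:
            raise ValueError("message addressed to a different chain")
        self.commit_locally(msg.payload)
        self.inbox.append(msg)


class ShardedChain(Chain):
    """Internally sharded; other chains never need to know that."""

    def __init__(self, name: str, num_shards: int) -> None:
        super().__init__(name)
        self.shards: list[list[bytes]] = [[] for _ in range(num_shards)]

    def commit_locally(self, payload: bytes) -> str:
        # Toy shard selection (an assumption): sum of bytes modulo shard count.
        idx = sum(payload) % len(self.shards)
        self.shards[idx].append(payload)
        return f"{self.name}/shard-{idx}"


class SingleLedgerChain(Chain):
    """No sharding at all, just one ledger."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self.ledger: list[bytes] = []

    def commit_locally(self, payload: bytes) -> str:
        self.ledger.append(payload)
        return f"{self.name}/ledger"
```

For example, `ShardedChain("a", 4).send(SingleLedgerChain("b"), b"hi")` works without either side knowing how the other stores its data, the same way a Zigbee network and an Ethernet network can talk once both go through IP.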
The value of the network does go up as more people use it, and therefore there are more transactions per user. However, the number of transactions each user produces will probably not scale linearly with network size.
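As a back-of-the-envelope illustration, here is a small Python sketch comparing total load under two per-user growth models. The logarithmic model is purely my assumption to stand in for "grows, but slower than linearly"; the real curve is unknown, and the function names and `base_rate` parameter are made up for this example.

```python
import math


def per_user_rate(n_users: int, base_rate: float = 1.0) -> float:
    """Assumed sublinear model: per-user activity grows with log(network size)."""
    return base_rate * (1.0 + math.log(n_users))


def total_transactions(n_users: int, base_rate: float = 1.0) -> float:
    """Total load = number of users * transactions per user (sublinear model)."""
    return n_users * per_user_rate(n_users, base_rate)


def total_if_linear(n_users: int, base_rate: float = 1.0) -> float:
    """Comparison case: per-user activity grows linearly with network size."""
    return n_users * (base_rate * n_users)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, round(total_transactions(n)), round(total_if_linear(n)))
```

Under the sublinear assumption the total still grows faster than the user count, but far slower than the quadratic growth you would get if every user's activity scaled linearly with the network.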