In this paper, we consider a cache-aided relay network, in which a single server containing a library of N files connects to K1 relays through a shared noiseless link, and each relay connects to K2 users through a shared noiseless link. Each relay and each user is equipped with a cache memory of M1 and M2 files, respectively. We propose a centralized and a decentralized coded caching scheme that exploit the otherwise spare transmission time by allowing concurrent transmission between the two layers. It is shown that both caching schemes are approximately optimal and greatly reduce the transmission delay compared with previously known caching schemes. Surprisingly, we show that when the relays' cache size equals a threshold that is strictly smaller than N (e.g., M1 = 0.382N under the decentralized setup and M1 = (K1-1)N/K1 under the centralized setup, when K1 = 2), our schemes achieve the same delay as if each relay had access to the full library. To the best of our knowledge, this is the first result showing that, even when the cache size is strictly smaller than the library size, further increasing the cache size is wasteful in reducing the transmission latency.
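As an illustrative check of these thresholds (an added note; the superscript labels "cen" and "dec" are introduced here for readability and are not from the original analysis), evaluating the centralized expression at K1 = 2 confirms that both thresholds lie strictly below the library size:

% Added illustration: relay-cache thresholds evaluated at K_1 = 2; labels are ours.
\[
  M_1^{\text{cen}} \;=\; \frac{(K_1-1)N}{K_1}\bigg|_{K_1=2} \;=\; \frac{N}{2} \;<\; N,
  \qquad
  M_1^{\text{dec}} \;\approx\; 0.382\,N \;<\; N .
\]

At these points the schemes already match the delay of relays with access to the full library, so any additional relay cache memory yields no further reduction in delay.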