In most Internet of Things (IoT) networks, edge nodes are commonly used as to relays to cache sensing data generated by IoT sensors as well as provide communication services for data consumers. However, a critical issue of IoT sensing is that data are usually transient, which necessitates temporal updates of caching content items while frequent cache updates could lead to considerable energy cost and challenge the lifetime of IoT sensors. To address this issue, we adopt the Age of Information (AoI) to quantify data freshness and propose an online cache update scheme to obtain an effective tradeoff between the average AoI and energy cost. Specifically, we first develop a characterization of transmission energy consumption at IoT sensors by incorporating a successful transmission condition. Then, we model cache updating as a Markov decision process to minimize average weighted cost with judicious definitions of state, action, and reward. Since user preference towards content items is usually unknown and often temporally evolving, we therefore develop a deep reinforcement learning (DRL) algorithm to enable intelligent cache updates. Through trial-and-error explorations, an effective caching policy can be learned without requiring exact knowledge of content popularity. Simulation results demonstrate the superiority of the proposed framework.