Forecasting the future behavior of all traffic agents in the vicinity is a key task to achieve safe and reliable autonomous driving systems. It is a challenging problem as agents adjust their behavior depending on their intentions, the others actions, and the road layout. In this paper, we propose Decoder Fusion RNN (DF-RNN), a recurrent, attention-based approach for motion forecasting. Our network is composed of a recurrent behavior encoder, an inter-agent multi-headed attention module, and a context-aware decoder. We design a map encoder that embeds polyline segments, combines them to create a graph structure, and merges their relevant parts with the agents embeddings. We fuse the encoded map information with further inter-agent interactions only inside the decoder and propose to use explicit training as a method to effectively utilize the information available. We demonstrate the efficacy of our method by testing it on the Argoverse motion forecasting dataset and show its state-of-the-art performance on the public benchmark.