Recently, several studies have explored the use of neural network to solve different routing problems, which is an auspicious direction. These studies usually design an encoder-decoder based framework that uses encoder embeddings of nodes and the problem-specific context to produce node sequence(path), and further optimize the produced result on top by beam search. However, existing models can only support node coordinates as input, ignore the self-referential property of the studied routing problems, and lack the consideration about the low reliability in the initial stage of node selection, thus are hard to be applied in real-world. In this paper, we take the orienteering problem as an example to tackle these limitations. We propose a novel combination of a variant beam search algorithm and a learned heuristic for solving the general orienteering problem. We acquire the heuristic with an attention network that takes the distances among nodes as input, and learn it via a reinforcement learning framework. The empirical studies show that our method can surpass a wide range of baselines and achieve results close to the optimal or highly specialized approach. Also, our proposed framework can be easily applied to other routing problems. Our code is publicly available.