The architecture of a coarse-grained reconfigurable array (CGRA) processing element (PE) has a significant effect on the performance and energy efficiency of an application running on the CGRA. This paper presents an automated approach for generating specialized PE architectures for an application or an application domain. Frequent subgraphs mined from a set of applications are merged to form a PE architecture specialized to that application domain. For the image processing and machine learning domains, we generate specialized PEs that are up to 10.5x more energy efficient and consume 9.1x less area than a baseline PE.