The globalization of the electronics supply chain requires effective methods to thwart reverse engineering and IP theft. Logic locking is a promising solution, but several concerns remain open. Even when applied at a high level of abstraction, logic locking incurs a large overhead without guaranteeing that the obfuscation metric is actually maximized. We propose a framework to optimize the use of behavioral logic locking for a given security metric. We explore how to apply behavioral logic locking techniques during the high-level synthesis (HLS) of IP cores. Because it operates on the chip behavior, our method is compatible with commercial HLS tools and complements existing industrial design flows. We offer a framework in which the designer can implement different meta-heuristics to explore the design space and select where to apply logic locking. Our method optimizes a given security metric better than complete obfuscation, allowing us to 1) obtain better protection and 2) reduce the obfuscation cost.
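As a generic illustration of what locking at the behavioral level can look like, the sketch below hides a constant behind a key: the design stores only the constant XORed with the key, and the correct value is recovered at run time only when the right key is supplied. This is a minimal sketch of one common form of constant obfuscation, not the framework described in the abstract. The names `lock_constant`, `locked_scale`, and `KEY_BITS`, as well as the 8-bit key width, are assumptions made for the example.

```python
# Hypothetical illustration of key-based constant locking (not the paper's framework).
KEY_BITS = 8                      # assumed key width for this example
MASK = (1 << KEY_BITS) - 1


def lock_constant(value: int, key: int) -> int:
    """Design time: store value XOR key instead of the plain constant."""
    return (value ^ key) & MASK


def locked_scale(x: int, stored: int, key: int) -> int:
    """Run time: recover the coefficient with the supplied key, then multiply."""
    coeff = (stored ^ key) & MASK
    return x * coeff


correct_key = 0xA5
stored = lock_constant(3, correct_key)   # the design only ever contains this value

print(locked_scale(7, stored, correct_key))  # correct key: behaves as x * 3
print(locked_scale(7, stored, 0x00))         # wrong key: a different coefficient is used
```

Locking every constant and operation in this way adds area and delay for each key bit, which is why selecting where to lock, rather than locking everything, is posed as an optimization problem.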
In this work, we present a new approach to high-level synthesis (HLS) in which high-level functions are first mapped to an architectural template before hardware synthesis is performed. As FPGA platforms are especially suitable for implementing streaming processing pipelines, we transform conventional high-level programs into multi-stage dataflow engines [1]. This target template naturally overlaps slow memory accesses with computation and therefore tolerates memory subsystem latency much better. Using a state-of-the-art HLS tool for the actual circuit generation, we observe up to 9x improvement in overall performance when the dataflow architectural template is used as an intermediate compilation target.
High-Level Synthesis (HLS) frameworks make it easy to specify a large number of variants of the same hardware design by acting only on optimization directives. Nonetheless, synthesizing implementations for all possible combinations of directive values is impractical even for simple designs. To address this shortcoming, many HLS Design Space Exploration (DSE) strategies have been proposed to devise directive settings that lead to high-quality implementations while limiting the number of synthesis runs. All these works require considerable effort to validate the proposed strategies and/or to build the knowledge base employed to tune abstract models, as both tasks mandate the synthesis of large collections of implementations. Currently, such data gathering is performed ad hoc, a) leading to a lack of standardization, which hampers comparisons between DSE alternatives, and b) placing a very high burden on researchers willing to develop novel DSE strategies. Against this backdrop, we introduce DB4HLS, a database of exhaustive HLS explorations comprising more than 100000 design points collected over 4 years of synthesis time. The open structure of DB4HLS allows the incremental integration of new DSEs, which can be easily defined with a dedicated domain-specific language. We think that our database, available at https://www.db4hls.inf.usi.ch/, will be a valuable tool for the research community investigating automated strategies for the optimization of HLS-based hardware designs.
For the system-level design of Networks-on-Chip for 3D heterogeneous System-on-Chip (SoC), the locations of components, routers and vertical links are determined from an application model and technology parameters. In conventional methods, the two inputs are accounted for separately; here, we define an integrated problem that considers both the application model and the technology parameters. We show that this problem cannot be solved exactly in reasonable time, as is common for many design problems. Therefore, we contribute a heuristic by proposing design steps that are based on separating intralayer and interlayer communication. The advantage is that this new problem can be solved with well-known methods. We use 3D Vision SoC case studies to quantify the advantages and the practical usability of the proposed optimization approach. We achieve up to 18.8% reduced white space and up to 12.4% better network performance in comparison to conventional approaches.
Hardware accelerators are key to the efficiency and performance of system-on-chip (SoC) architectures. With high-level synthesis (HLS), designers can easily obtain several performance-cost trade-off implementations for each component of a complex hardware accelerator. However, navigating this design space in search of the Pareto-optimal implementations at the system level is a hard optimization task. We present COSMOS, an automatic methodology for the design-space exploration (DSE) of complex accelerators, that coordinates both HLS and memory optimization tools in a compositional way. First, thanks to the co-design of datapath and memory, COSMOS produces a large set of Pareto-optimal implementations for each component of the accelerator. Then, COSMOS leverages compositional design techniques to quickly converge to the desired trade-off point between cost and performance at the system level. When applied to the system-level design (SLD) of an accelerator for wide-area motion imagery (WAMI), COSMOS explores the design space as completely as an exhaustive search, but it reduces the number of invocations to the HLS tool by up to 14.6x.
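The notion of Pareto-optimal implementations used above can be made concrete with a small sketch. It assumes each implementation is summarized by a hypothetical `(cost, latency)` pair in which both values are to be minimized; the function name `pareto_front` and the sample numbers are illustrative and are not taken from COSMOS.

```python
from typing import List, Tuple

# Hypothetical design point: (cost, latency), both to be minimized.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates."""
    front: List[Point] = []
    best_latency = float("inf")
    # Visit points by increasing cost (ties broken by increasing latency).
    # A point is kept only if it is strictly faster than every point
    # that costs the same or less.
    for cost, latency in sorted(set(points)):
        if latency < best_latency:
            front.append((cost, latency))
            best_latency = latency
    return front


# Illustrative numbers only.
designs = [(10, 50), (12, 40), (11, 60), (20, 40), (25, 20)]
print(pareto_front(designs))  # [(10, 50), (12, 40), (25, 20)]
```

Here `(11, 60)` is dropped because `(10, 50)` is both cheaper and faster, and `(20, 40)` is dropped because `(12, 40)` reaches the same latency at lower cost. Each point in such a set would typically require at least one HLS run to evaluate, which is why reducing the number of tool invocations matters.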
Triple Modular Redundancy (TMR) is a suitable fault-tolerant technique for SRAM-based FPGAs. However, one of the main challenges in achieving 100% robustness in designs protected by TMR running on programmable platforms is to prevent upsets in the routing from provoking undesirable connections between signals from distinct redundant logic parts, which can generate an error in the output. This paper investigates the optimal design of the TMR logic (e.g., by cleverly inserting voters) to ensure robustness. Four differe