ترغب بنشر مسار تعليمي؟ اضغط هنا

Sound Transpilation from Binary to Machine-Independent Code

74   0   0.0 ( 0 )
 نشر من قبل Roberto Metere
 تاريخ النشر 2018
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In order to handle the complexity and heterogeneity of mod- ern instruction set architectures, analysis platforms share a common design, the adoption of hardware-independent intermediate representa- tions. The usage of these platforms to verify systems down to binary-level is appealing due to the high degree of automation they provide. How- ever, it introduces the need for trusting the correctness of the translation from binary code to intermediate language. Achieving a high degree of trust is challenging since this transpilation must handle (i) all the side effects of the instructions, (ii) multiple instruction encoding (e.g. ARM Thumb), and (iii) variable instruction length (e.g. Intel). We overcome these problems by formally modeling one of such intermediate languages in the interactive theorem prover HOL4 and by implementing a proof- producing transpiler. This tool translates ARMv8 programs to the in- termediate language and generates a HOL4 proof that demonstrates the correctness of the translation in the form of a simulation theorem. We also show how the transpiler theorems can be used to transfer properties verified on the intermediate language to the binary code.



قيم البحث

اقرأ أيضاً

Since compiler optimization is the most common source contributing to binary code differences in syntax, testing the resilience against the changes caused by different compiler optimization settings has become a standard evaluation step for most bina ry diffing approaches. For example, 47 top-venue papers in the last 12 years compared different progr
Big Data query systems represent data in a columnar format for fast, selective access, and in some cases (e.g. Apache Drill), perform calculations directly on the columnar data without row materialization, avoiding runtime costs. However, many anal ysis procedures cannot be easily or efficiently expressed as SQL. In High Energy Physics, the majority of data processing requires nested loops with complex dependencies. When faced with tasks like these, the conventional approach is to convert the columnar data back into an object form, usually with a performance price. This paper describes a new technique to transform procedural code so that it operates on hierarchically nested, columnar data natively, without row materialization. It can be viewed as a compiler pass on the typed abstract syntax tree, rewriting references to objects as columnar array lookups. We will also present performance comparisons between transformed code and conventional object-oriented code in a High Energy Physics context.
The aim of the paper is to provide solid foundations for a programming paradigm natively supporting the creation and manipulation of cyclic data structures. To this end, we describe coFJ, a Java-like calculus where objects can be infinite and methods are equipped with a codefinition (an alternative body). We provide an abstract semantics of the calculus based on the framework of inference systems with corules. In coFJ with this semantics, FJ recursive methods on finite objects can be extended to infinite objects as well, and behave as desired by the programmer, by specifying a codefinition. We also describe an operational semantics which can be directly implemented in a programming language, and prove the soundness of such semantics with respect to the abstract one.
Programmers currently enjoy access to a very high number of code repositories and libraries of ever increasing size. The ensuing potential for reuse is however hampered by the fact that searching within all this code becomes an increasingly difficult task. Most code search engines are based on syntactic techniques such as signature matching or keyword extraction. However, these techniques are inaccurate (because they basically rely on documentation) and at the same time do not offer very expressive code query languages. We propose a novel approach that focuses on querying for semantic characteristics of code obtained automatically from the code itself. Program units are pre-processed using static analysis techniques, based on abstract interpretation, obtaining safe semantic approximations. A novel, assertion-based code query language is used to express desired semantic characteristics of the code as partial specifications. Relevant code is found by comparing such partial specifications with the inferred semantics for program elements. Our approach is fully automatic and does not rely on user annotations or documentation. It is more powerful and flexible than signature matching because it is parametric on the abstract domain and properties, and does not require type definitions. Also, it reasons with relations between properties, such as implication and abstraction, rather than just equality. It is also more resilient to syntactic code differences. We describe the approach and report on a prototype implementation within the Ciao system. Under consideration for acceptance in TPLP.
394 - Jooyong Yi 2013
Backtracking (i.e., reverse execution) helps the user of a debugger to naturally think backwards along the execution path of a program, and thinking backwards makes it easy to locate the origin of a bug. So far backtracking has been implemented mostl y by state saving or by checkpointing. These implementations, however, inherently do not scale. Meanwhile, a more recent backtracking method based on reverse-code generation seems promising because executing reverse code can restore the previous states of a program without state saving. In the literature, there can be found two methods that generate reverse code: (a) static reverse-code generation that pre-generates reverse code through static analysis before starting a debugging session, and (b) dynamic reverse-code generation that generates reverse code by applying dynamic analysis on the fly during a debugging session. In particular, we espoused the latter one in our previous work to accommodate non-determinism of a program caused by e.g., multi-threading. To demonstrate the usefulness of our dynamic reverse-code generation, this article presents a case study of various backtracking methods including ours. We compare the memory usage of various backtracking methods in a simple but nontrivial example, a bounded-buffer program. In the case of non-deterministic programs such as this bounded-buffer program, our dynamic reverse-code generation outperforms the existing backtracking methods in terms of memory efficiency.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا