ترغب بنشر مسار تعليمي؟ اضغط هنا

Sparse Matrix Code Dependence Analysis Simplification at Compile Time

75   0   0.0 ( 0 )
 نشر من قبل Mahdi Soltan Mohammadi
 تاريخ النشر 2018
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Analyzing array-based computations to determine data dependences is useful for many applications including automatic parallelization, race detection, computation and communication overlap, verification, and shape analysis. For sparse matrix codes, array data dependence analysis is made more difficult by the use of index arrays that make it possible to store only the nonzero entries of the matrix (e.g., in A[B[i]], B is an index array). Here, dependence analysis is often stymied by such indirect array accesses due to the values of the index array not being available at compile time. Consequently, many dependences cannot be proven unsatisfiable or determined until runtime. Nonetheless, index arrays in sparse matrix codes often have properties such as monotonicity of index array elements that can be exploited to reduce the amount of runtime analysis needed. In this paper, we contribute a formulation of array data dependence analysis that includes encoding index array properties as universally quantified constraints. This makes it possible to leverage existing SMT solvers to determine whether such dependences are unsatisfiable and significantly reduces the number of dependences that require runtime analysis in a set of eight sparse matrix kernels. Another contribution is an algorithm for simplifying the remaining satisfiable data dependences by discovering equalities and/or subset relationships. These simplifications are essential to make a runtime-inspection-based approach feasible.



قيم البحث

اقرأ أيضاً

110 - Yingfu Zeng 2017
Reachability analysis for hybrid systems is an active area of development and has resulted in many promising prototype tools. Most of these tools allow users to express hybrid system as automata with a set of ordinary differential equations (ODEs) as sociated with each state, as well as rules for transitions between states. Significant effort goes into developing and verifying and correctly implementing those tools. As such, it is desirable to expand the scope of applicability tools of such as far as possible. With this goal, we show how compile-time transformations can be used to extend the basic hybrid ODE formalism traditionally supported in hybrid reachability tools such as SpaceEx or Flow*. The extension supports certain types of partial derivatives and equational constraints. These extensions allow users to express, among other things, the Euler-Lagrangian equation, and to capture practically relevant constraints that arise naturally in mechanical systems. Achieving this level of expressiveness requires using a binding time-analysis (BTA), program differentiation, symbolic Gaussian elimination, and abstract interpretation using interval analysis. Except for BTA, the other components are either readily available or can be easily added to most reachability tools. The paper therefore focuses on presenting both the declarative and algorithmic specifications for the BTA phase, and establishes the soundness of the algorithmic specifications with respect to the declarative one.
This paper presents a scalable path- and context-sensitive data-dependence analysis. The key is to address the aliasing-path-explosion problem via a sparse, demand-driven, and fused approach that piggybacks the computation of pointer information with the resolution of data dependence. Specifically, our approach decomposes the computational efforts of disjunctive reasoning into 1) a context- and semi-path-sensitive analysis that concisely summarizes data dependence as the symbolic and storeless value-flow graphs, and 2) a demand-driven phase that resolves transitive data dependence over the graphs. We have applied the approach to two clients, namely thin slicing and value flow analysis. Using a suite of 16 programs ranging from 13 KLoC to 8 MLoC, we compare our techniques against a diverse group of state-of-the-art analyses, illustrating significant precision and scalability advantages of our approach.
Programmers currently enjoy access to a very high number of code repositories and libraries of ever increasing size. The ensuing potential for reuse is however hampered by the fact that searching within all this code becomes an increasingly difficult task. Most code search engines are based on syntactic techniques such as signature matching or keyword extraction. However, these techniques are inaccurate (because they basically rely on documentation) and at the same time do not offer very expressive code query languages. We propose a novel approach that focuses on querying for semantic characteristics of code obtained automatically from the code itself. Program units are pre-processed using static analysis techniques, based on abstract interpretation, obtaining safe semantic approximations. A novel, assertion-based code query language is used to express desired semantic characteristics of the code as partial specifications. Relevant code is found by comparing such partial specifications with the inferred semantics for program elements. Our approach is fully automatic and does not rely on user annotations or documentation. It is more powerful and flexible than signature matching because it is parametric on the abstract domain and properties, and does not require type definitions. Also, it reasons with relations between properties, such as implication and abstraction, rather than just equality. It is also more resilient to syntactic code differences. We describe the approach and report on a prototype implementation within the Ciao system. Under consideration for acceptance in TPLP.
STGF is a community code employed for outer-region R-matrix calculations, describing electron-impact collisional processes. It is widely recognised that the original version of STGF was written by M. J. Seaton in 1983, but through constant refinement over the next decades by worldwide contributors has evolved into its current form that more reflects modern coding practice and current computer architectures. Despite its current wide acceptance, it was never formally published. Therefore, we present an updated high-performance parallel version of PSTGF, that balances the requirements of small university clusters, yet can exploit the computational power of cutting edge supercomputers. There are many improvements over the original STGF, but most noticeably, the full introduction of MQDT options that provide subsequent integration with ICFT (Intermediate Coupling Frame Transformation) codes, and for either Breit-Pauli / DARC (Dirac Atomic R-matrix Codes), better load balancing, high levels of vectorisation and simplified output. Semantically, the program is full fortran 90 in conjunction with MPI (Message Passing Interface) though has CUDA fortran options for the most numerically intensive code sections.
When creating a new domain-specific language (DSL) it is common to embed it as a part of a flexible host language, rather than creating it entirely from scratch. The semantics of an embedded DSL (EDSL) is either given directly as a set of functions ( shallow embedding), or an AST is constructed that is later processed (deep embedding). Typically, the deep embedding is used when the EDSL specifies domain-specific optimizations (DSO) in a form of AST transformations. In this paper we show that deep embedding is not necessary to specify most optimizations. We define language semantics as action functions that are executed during parsing. These actions build incrementally a new, arbitrary complex program function. The EDSL designer is able to specify many aspects of the semantics as a runnable code, such as variable scoping rules, custom type checking, arbitrary control flow structures, or DSO. A sufficiently powerful staging mechanism helps assembling the code from different actions, as well as evaluate the semantics in arbitrarily many stages. In the end, we obtain code that is as efficient as one written by hand. We never create any object representation of the code. No external traversing algorithm is used to process the code. All program fragments are functions with their entire semantics embedded within the function bodies. This approach allows reusing the code between EDSL and the host language, as well as combining actions of many different EDSLs.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا