Static Analyses of Program Flows: Types and Certificates for Complexity

July 6, 2023

StATyCC stands for “Static Analyses of Program Flows: Types and Certificates for Complexity”. This is an international collaborative project funded by the Transatlantic Research Partnership, involving the School of Computer Science of Augusta University and the Northern Paris Computer Science Lab of the University of Paris 13.

The project aims to provide new static analysis tools based on theoretical results from Implicit Computational Complexity, building on previous work by Moyen, Rubiano and Seiller. It revolves around certified compositional analysis of source code and loop optimization through an original dependency model. Applications to parallel optimization and intermediate representations are also investigated.

The most up-to-date document describing our hypotheses and results is Neea Rusch’s research proposal, “*Certifying Implicit Computational Complexity*”. More context can be found in her PhD research proficiency examination, “*On Implicit Computational Complexity with Applications to Real-World Programs*”.

This project is carried out by

- American Partner
  - Clément Aubert
  - Neea Rusch
  - Assya Sellak (until 03/21)
- French Partner
  - Thomas Rubiano
  - Thomas Seiller

It has produced the following publications:
- Clément Aubert, Thomas Rubiano, Neea Rusch, Thomas Seiller. mwp-Analysis Improvement and Implementation: Realizing Implicit Computational Complexity. FSCD 2022, 10.4230/LIPIcs.FSCD.2022.26.
- Clément Aubert, Thomas Rubiano, Neea Rusch, Thomas Seiller. Distributing and Parallelizing Non-canonical Loops. VMCAI 2023, 10.1007/978-3-031-24950-1_1 and ⟨hal-03669387⟩. This paper comes with an artifact, available on Zenodo, to reproduce the results of our benchmarks.
- Clément Aubert, Thomas Rubiano, Neea Rusch, Thomas Seiller. pymwp: A Static Analyzer Determining Polynomial Growth Bounds. 2023, ⟨hal-03269121⟩, accepted to ATVA 2023.

- Clément Aubert, Thomas Rubiano, Neea Rusch, Thomas Seiller. Realizing Implicit Computational Complexity. 2022. ⟨hal-03603510v1⟩, accepted to Types 2022.
- Clément Aubert, Thomas Rubiano, Neea Rusch, Thomas Seiller. Certifying Complexity Analysis. 2023. ⟨hal-04083105v1⟩, accepted to CoqPL 2023.

- Neea Rusch. Formally Verified Resource Bounds through Implicit Computational Complexity. 2022. SPLASH Companion 2022, 10.1145/3563768.3565545.

- A refined implementation of the original optimization is available on github.
- An experimental implementation of the mwp-analysis for C code is available on github.
- The “Loop Fission Benchmarks”, hosted on github, experimentally sustain the claims made in *Distributing and Parallelizing Non-canonical Loops*.

Neea Rusch gave a talk at the SCOT seminar.

Neea Rusch’s slides are available on-line.

mwp-Analysis Improvement and Implementation: Realizing Implicit Computational Complexity

Implicit Computational Complexity (ICC) drives better understanding of complexity classes, but it also guides the development of resource-aware languages and static source code analyzers. Among the methods developed, the mwp-flow analysis certifies polynomial bounds on the size of the values manipulated by an imperative program. This result is obtained by bounding the transitions between states instead of focusing on states in isolation, as most static analyzers do, and is not concerned with termination or tight bounds on values. Those differences, along with its built-in compositionality, make the mwp-flow analysis a good target for determining how ICC-inspired techniques diverge compared with more traditional static analysis methods. This paper’s contributions are three-fold: we fine-tune the internal machinery of the original analysis to make it tractable in practice; we extend the analysis to function calls and leverage its machinery to compute the result of the analysis efficiently; and we implement the resulting analysis as a lightweight tool to automatically perform data-size analysis of C programs. This documented effort prepares and enables the development of certified complexity analysis, by transforming a costly analysis into a tractable program that, furthermore, decorrelates the problem of deciding if a bound exists from the problem of computing it.

Paper related to this talk: https://drops.dagstuhl.de/opus/volltexte/2022/16307/

(joint work with Clément Aubert, Thomas Rubiano and Thomas Seiller)

Neea Rusch gave two talks at POPL 2023:

- *Distributing and Parallelizing Non-canonical Loops* during VMCAI 2023; the slides are available on-line, and the presentation was recorded.
- *Certifying Complexity Analysis* during CoqPL 2023; the slides are available on-line.

Neea Rusch presented *Formally Verified Resource Bounds through Implicit Computational Complexity* at SPLASH 2022’s Doctoral Symposium, the slides are available on-line.

Thomas Seiller presented our work, *mwp-Analysis Improvement and Implementation: Realizing Implicit Computational Complexity* at FSCD 2022. The slides are available on-line.

We had an abstract accepted to the 28th International Conference on Types for Proofs and Programs.

Neea Rusch’s slides are available on-line, and the presentation was recorded.

Realizing Implicit Computational Complexity

This abstract presents an ongoing effort to apply a novel typing mechanism stemming from Implicit Computational Complexity (ICC), which tracks dependencies between variables, in three different ways and at different stages of maturation. The first and third projects bend the original typing discipline to gain a finer-grained view of statement independence, to optimize loops by hoisting invariants and by splitting loops “horizontally” to parallelize them more efficiently. The second project refines and implements the original analysis to obtain a fast, modular static analyzer. All three projects aim at pushing the original type system, inspired by ICC, to its limits, to assess how ICC can in practice lead to original, sometimes orthogonal, approaches.

Neea Rusch had her abstract accepted to Augusta University’s Graduate Research Day.

Semantic-preserving optimization algorithm for automatic program parallelization

Advanced and resource-intensive computation relies on a continuous rise in processing power. Since the 1970s, Moore’s law accurately predicted that this growth would be achieved through hardware improvements, but this observation is becoming progressively obsolete. Alternative approaches are needed to maintain the increase in efficiency. Parallelization is a technique in which a larger computational problem is divided into smaller tasks, which are then executed simultaneously, reducing the overall time to completion. Specialized software and algorithms are required to enable parallelization.

This research presents a novel algorithm for automatic program parallelization based on loop splitting. In programming, loop statements are used to carry out repeated computation, but when used extensively or carelessly, they produce performance inefficiencies. Using a graph-based variable dependency analysis, the algorithm detects opportunities for splitting loops into smaller, parallelizable loops, then automatically applies this optimization. Additionally, the algorithm guarantees the preservation of program semantics post-transformation. We hypothesize that this algorithm, when combined with OpenMP (an existing state-of-the-art multiprocessing tool), will provide noticeable performance gains for resource-intensive computational tasks. An open-source tool, pyalp, implementing this algorithm on C programs, is currently being developed to demonstrate and measure its efficiency in practice.

Neea Rusch’s poster is available on-line.

We organized a seminar at Dagstuhl, November 7–12, 2021, called “Static Analyses of Program Flows: Types and Certificates for Complexity”.

Both Assya Sellak and Neea Rusch had their abstracts accepted to Augusta University’s Graduate Research Day.

Neea Rusch’s presentation is available on-line.

Certifying the complexity and correctness of critical software

Software powers our everyday lives: from phones to daily interactions to our homes. At the same time, software is fraught with bugs causing systems to behave in undesirable ways. When discussing critical software responsible for sustaining human life (such as airplanes, ventilators, and nuclear reactors), being able to guarantee correct behavior is necessary. Compilers play a vital role in the software development process by transforming the programmer’s source code into executable programs. They perform analyses, transformations, and optimizations to improve the performance and reliability of the resulting program. But compilers, since they are themselves pieces of software, may contain bugs. To build reliable software, we must establish the correct behavior of these intermediate tools. There is a colossal push to prove the correctness of such tools using mathematical abstractions such as dependency analysis, formal methods, and proof assistants. Proving correctness allows eradicating bugs in programs and drives programmers to formally specify the intended behavior of programs, while building trust and confidence in the end result. Using a dependency analysis inspired by Implicit Computational Complexity, we apply those techniques to program transformations. Among these techniques is ensuring that a program’s variables grow within reasonable bounds, thus providing a certification in terms of memory footprint and, possibly, run time, in addition to certifying its behavior. Implementing this analysis is one of the goals of our research.

Assya Sellak gave a presentation at the 2020 Georgia Undergraduate Research Conference.

A similar presentation was accepted to the 2021 National Conference on Undergraduate Research.

The slides as well as the abstract are available to download. The program of the conference is available as well.

Optimization Method on Programs Using Dependency Analysis and Loop Peeling Transformations

Computer programs are written in high-level languages and translated into machine code by compilers. Compilers perform a series of program transformations and optimizations to improve memory usage and reduce the run time of the program. Programs consist of commands and statements such as conditionals and loops. Loops are an extremely powerful tool that programmers use to repeatedly run a sequence of commands until a specified condition is met. However, when used carelessly, loops can lead to never-halting or extremely slow programs: for this reason, many compilers and program optimizations focus on these structures. Loops can contain commands that perform unneeded residual operations instead of being executed only when necessary. This excessive computation results in an avoidable increase in run time, which may arise intentionally or unintentionally, either because of the programmer or because of other automatic transformations. Detecting which operations could have been performed fewer times than the loop requires is complex, but some optimizations try to detect this, extract portions of code that only needed to run once, and successively move commands that need to run more than once, but not as many times as the loop runs. These optimizations fall short on some structures, mainly because they limit the scope of the analysis to individual operations, rather than considering sequences of operations as a whole. Thanks to quasi-interpretations coming from Implicit Computational Complexity, new ways of detecting invariant sequences of commands inside loops have been developed. We extend this work along two axes: we allow more structures, including `for`, `do...while`, and loops with `break`, to be peeled; and, by analyzing the dependencies within the loop, we hope to allow for some parallel optimization. This allows us to:

- consider more programs;
- possibly significantly speed up programs that are distributed, i.e., executed in parallel on multiple computers.

- Contact: caubert@augusta.edu
- Created with debian, pandoc and latex.
- All my documents are under Creative Commons Attribution 4.0 International License. Sources are available upon motivated request.
- You will need a `pdf` reader to consult some of the documents: I recommend choosing an open-source `pdf` reader.