LFortran Compiles 60% of SciPy
Following the successful compilation of dftatom in October 2023, we are starting the new year with another milestone: we are delighted to announce that LFortran can now compile 9 of the 15 Fortran packages in the SciPy library without any modifications.
This is the fifth third-party production-grade code that LFortran can compile. The progress bar towards beta has thus reached 5/10.
LFortran can now compile and pass the tests written for scipy.special.specfun, scipy.special.cdflib, scipy.special.amos, scipy.special.mach, scipy.optimize.minpack, scipy.optimize.minpack2, scipy.interpolate.fitpack, scipy.integrate.quadpack, and scipy.integrate.mach.
The remaining packages are scipy.optimize.cobyla, scipy.integrate.dop, scipy.integrate.odepack, scipy.integrate.quadpack, and scipy.odrpack.
LFortran is still alpha software, meaning that users should still expect it to fail when compiling or running their codes. Please report all bugs that you find.
SciPy Overview
SciPy provides algorithms for optimization, integration, interpolation, eigenvalue problems, algebraic equations, differential equations, statistics and many other classes of problems. It wraps highly-optimized implementations written in compiled languages like Fortran, C, and C++.
The Fortran code in SciPy is written in Fortran 77 and uses many legacy features such as data and entry statements, external, common blocks, block data, implied do loops, and string formatting. While the dftatom and fastGPT codes exercised modern Fortran features, the successful compilation of this code demonstrates LFortran’s robust support for legacy Fortran features, marking a significant advancement in its alpha-stage development and enhancing its usability for a diverse range of users.
Development Phase Overview
Compiling 60% of SciPy was a formidable challenge that demanded extensive development effort over the span of a year. A concise breakdown is provided below:
NumFOCUS Grant and QuantStack - 2022
During this phase, we planned to make only quick fixes to our fixed-form parser and to focus on semantics and LLVM. It turned out, however, that the existing fixed-form parser would still be too fragile and that some changes to the SciPy code would be required, so we decided to write a dedicated fixed-form parser that can parse any F77 code. This took a large part of our effort, but we delivered it: LFortran can now parse all of SciPy unmodified, and we have a solid foundation that is easy to extend to any other F77 code. This work was supported by a NumFOCUS grant and by QuantStack, who sponsored Konrad Handrick.
Furthermore, we implemented enough semantics and lowering that we can fully compile and correctly run SciPy’s Minpack.
Google Summer of Code - 2023
In the summer of 2023, our project focused on compiling SciPy using LFortran. Substantial progress was made as we addressed missing functionalities, resulting in an impressive 77% of files successfully compiled to LLVM without requiring any modifications. Towards the conclusion of the Google Summer of Code (GSoC) program, we identified the crucial need to integrate LFortran into the SciPy build system. Our subsequent steps involved thorough testing of all compiled packages to ensure alignment with GFortran standards.
NumFOCUS Grant - 2023
Continuing the work, we systematically addressed divergences package by package: we resolved the issues we found, implemented the necessary changes, refactored various functionalities, introduced multiple compiler options to accommodate legacy features, and made sure that all tests pass for each package. We also modified the SciPy build system to work with LFortran. As a result of these efforts, 60% of all Fortran packages in SciPy now compile unmodified with LFortran and all SciPy tests pass. We also test this in LFortran’s CI for every LFortran commit to ensure that no regressions happen.
What’s Next?
As of this writing, LFortran compiles five third-party codes:
- Legacy Minpack (part of SciPy) and several more SciPy packages
- Modern Minpack
- fastGPT
- dftatom
- SciPy (60%)
Our goal is to compile 10 third-party codes in order to bring LFortran from alpha to beta. This is our main focus, and we have been working on compiling several more third-party codes. Our progress roadmap is essentially a “feature importance” method: LFortran should support every language feature that shows up in the ten third-party candidate codes. Since these codes are taken as representative of existing Fortran codebases, compiling them (and supporting the language features they use) serves as our metric for progress towards beta. We will continue to announce each code as soon as we are able to fully compile and run it unchanged from its source. Among those codes are the Fortran Package Manager (fpm), the Fortran stdlib and the remaining parts of SciPy. Compiling 10 third-party codes is necessary to reach beta, but might not be sufficient. Once we deliver this milestone, we will evaluate with the community what else needs to be done to get to beta. Our definition of a beta-quality compiler is that when you run it on your code, it is expected to work and not fail, although there might still be bugs.
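The “feature importance” idea can be illustrated with a small Python sketch. Everything in it is hypothetical: the code names, the feature sets and the list of supported features are made up for illustration and are not measurements of the real codebases or of LFortran. The sketch treats a code as compilable once every feature it uses is supported, and ranks the still-missing features by how many codes need them.

```python
from collections import Counter

# Hypothetical feature sets; names and contents are illustrative only,
# not taken from the real third-party codes.
features_used = {
    "code_a": {"common block", "data", "implied do loop"},
    "code_b": {"entry", "common block", "block data"},
    "code_c": {"data", "external"},
}

# Hypothetical set of features the compiler already supports.
supported = {"common block", "data", "external", "implied do loop"}


def compilable(features_used, supported):
    """Return the codes whose every feature is already supported."""
    return sorted(
        name for name, feats in features_used.items() if feats <= supported
    )


def missing_by_importance(features_used, supported):
    """Rank unsupported features by how many codes need them."""
    counts = Counter(
        feat
        for feats in features_used.values()
        for feat in feats - supported
    )
    # Most-needed first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


done = compilable(features_used, supported)
todo = missing_by_importance(features_used, supported)
progress = f"{len(done)}/{len(features_used)}"
```

In this toy data set, `progress` plays the role of the progress bar towards beta, and `todo` suggests which features to implement next.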
We are always looking for more contributors; if you are interested, please get in touch. This is an exciting time for LFortran: it is becoming easier and easier to compile new codes, and it is a lot of fun to work on a compiler and learn how it works. We will teach you all the skills needed.
Acknowledgements
We want to thank:
- NumFOCUS
- QuantStack
- Google Summer of Code
- Sovereign Tech Fund (STF)
- GSI Technology
- LANL
- Our GitHub, OpenCollective and NumFOCUS sponsors
- All our contributors (67 so far!)