tl;dr; Implementing SwiftLint using SwiftSyntax instead of SourceKitten would make it run over 20x slower 😭
Update: Since writing this post, I learnt that SwiftSyntax’s upcoming byte tree deserialization mode will speed this up considerably. I hope to post a follow-up article on this shortly.
I have for some time been looking forward to reimplementing some of SwiftLint’s simpler syntax-only rules with SwiftSyntax. If you’re not familiar with it, the recent NSHipster article gives a great overview. My motivation for integrating it into SwiftLint was that it would be nice to use an officially maintained library directly to obtain the syntax tree rather than the open source but community-maintained SourceKitten library. I was also under the false impression that SwiftSyntax would be significantly faster than SourceKit/SourceKitten.
SourceKitten gets its syntax tree by dynamically loading SourceKit and making cross-process XPC calls to a SourceKit daemon. In a typical uncached lint run, SwiftLint spends a significant amount of time waiting on this syntax tree for each file being linted. Because SwiftSyntax is code-generated from the same syntax definition files as the Swift compiler, I had (incorrectly) assumed that calculating a Swift file’s syntax tree using SwiftSyntax was done entirely in-process by the library. That would have led to significant performance gains by avoiding the cross-process XPC call made by SourceKitten for equivalent functionality.
In reality, SwiftSyntax delegates all parsing and lexing to the swiftc binary, launching the process, reading its output from stdout and deserializing the JSON response into its SourceFileSyntax Swift type. This is repeated for each file being parsed 😱.
Launching a new instance of the Swift compiler for each file parsed is orders of magnitude slower than SourceKitten’s XPC call to a long-lived SourceKit daemon.
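To get a feel for why one process launch per file adds up, here is a minimal sketch that times spawning a trivial process 100 times against doing a trivial amount of work in-process. It is only an illustration under stated assumptions: `/usr/bin/env true` stands in for the compiler (it is not what SwiftSyntax actually runs), the helper names `launchTrivialProcess` and `measure` are made up for this example, and it does not model the XPC round trip to SourceKit at all.

```swift
import Foundation

/// Launches `/usr/bin/env true` once and waits for it to exit.
/// Assumption: this is a stand-in for "spawn a compiler process per file";
/// it is not the command SwiftSyntax executes.
func launchTrivialProcess() throws {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/usr/bin/env")
    process.arguments = ["true"]
    try process.run()
    process.waitUntilExit()
}

/// Returns the wall-clock seconds taken to run `body` `count` times.
func measure(count: Int, _ body: () throws -> Void) rethrows -> TimeInterval {
    let start = Date()
    for _ in 0..<count {
        try body()
    }
    return Date().timeIntervalSince(start)
}

let fileCount = 100 // roughly the number of files linted in this post

// One process launch per "file".
let spawnTime = try measure(count: fileCount, launchTrivialProcess)

// The same number of iterations of trivial in-process work.
let inProcessTime = measure(count: fileCount) { _ = "fallthrough".utf8.count }

print("spawned \(fileCount) processes in \(spawnTime)s")
print("ran \(fileCount) in-process calls in \(inProcessTime)s")
```

Even with a process that does nothing, the launch cost is paid once per file, which is the part a long-lived daemon avoids.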
I discovered this after reimplementing a very simple SwiftLint rule with a SwiftSyntax-based implementation: Fallthrough. This opt-in rule is a perfect proof-of-concept for integrating SwiftSyntax into SwiftLint because it literally just finds all occurrences of the fallthrough keyword and reports a violation at that location. I measured the time it took to lint a folder of ~100 Swift files from Lyft’s iOS codebase with only the fallthrough rule whitelisted.
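To make the rule's behavior concrete, here is a rough sketch of "find every fallthrough and report where it is". This is not SwiftLint's or SwiftSyntax's implementation: it swaps the syntax tree for a plain regular-expression scan, so unlike a real syntax-based rule it would also match the word inside comments and string literals. The `Violation` type and `fallthroughViolations(in:)` function are hypothetical names used only for this example.

```swift
import Foundation

/// A hypothetical violation record: 1-based line and column of a match.
struct Violation: Equatable {
    let line: Int
    let column: Int
}

/// Naively finds whole-word occurrences of `fallthrough` in `source`.
/// Unlike a syntax-tree based rule, this also matches inside comments
/// and string literals.
func fallthroughViolations(in source: String) -> [Violation] {
    // The pattern is a constant, so `try!` cannot fail here.
    let regex = try! NSRegularExpression(pattern: "\\bfallthrough\\b")
    var violations: [Violation] = []
    for (index, line) in source.components(separatedBy: "\n").enumerated() {
        let range = NSRange(line.startIndex..., in: line)
        for match in regex.matches(in: line, range: range) {
            // `location` is a 0-based UTF-16 offset within the line.
            violations.append(Violation(line: index + 1,
                                        column: match.range.location + 1))
        }
    }
    return violations
}

let example = """
switch value {
case 1:
    fallthrough
default:
    break
}
"""

let found = fallthroughViolations(in: example)
for violation in found {
    print("fallthrough at line \(violation.line), column \(violation.column)")
}
```

A syntax-tree visitor would do the same reporting, but only for real fallthrough statements, which is why the rule needs a parsed tree in the first place.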
I compiled SwiftLint both from master and from this fallthrough-swift-syntax branch with swift build -c release, naming the binary built from the branch swiftlint-swift-syntax. I then benchmarked both binaries using the excellent hyperfine utility.
The SwiftSyntax version was 22x slower than the existing SourceKitten version.
Note that I ran SwiftLint with its caching mechanism and logging disabled to accurately measure the time it took just to perform the lint, rather than the overhead from logging or skipping the lint entirely by just returning cached results. That said, logging only added 3ms to 10ms in my tests.
Ultimately, this means SwiftLint will be keeping its SourceKitten-based implementation for the foreseeable future, unless SwiftSyntax removes its reliance on costly compiler invocations and drastically improves its performance. I really hope the Swift team can somehow find a way to move parsing and lexing into SwiftSyntax itself, making the library much more appealing to use.