Swift 5.0.x String Performance
Turns out, in Swift 5.0.x,
String(bytes:encoding:) are not “optimised” compared to
“Which String initializer are you using, specifically?” - David Smith (@Catfish_Man), May 16, 2019
String(utf8String:) is a Foundation extension and looking at the commit history, it wasn’t optimized to avoid bridging until 5.1 (commit a088e1322495).
String(validatingUTF8:) should be a ton faster in 5.0.x.
In other words, in Swift 5.0.x, any String.append(_:) call on a string bridged from NSString can potentially take a performance hit.
“It’s a bridged NSString, so in the current model either reserveCapacity or the actual append will have to convert to a native Swift String, which will (in 5.0.x) incur O(n) CFStringGetCharacterAtIndex calls for some specific types of NSString.” - David Smith (@Catfish_Man), May 16, 2019
NSTextView.string.append(_:) operations will also suffer.
This is what that performance hit looks like in my case, where a
DispatchSourceRead is feeding a
NSTextView every time there are bytes available to read.
NSTextView.textStorage.append(_:) and convert your
String to an
This is what the performance looks like after the code change.
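As a rough sketch of that kind of change: the helper name `appendOutput` below is hypothetical, and it assumes the conversion target is NSAttributedString, which is the type that NSTextStorage's `append(_:)` accepts. It is an illustration under those assumptions, not the exact code from this project.

```swift
import AppKit

// Hypothetical helper (name and signature are illustrative only).
// Call it on the main thread, like any other AppKit view update.
func appendOutput(_ text: String, to textView: NSTextView) {
    // NSTextStorage is an NSMutableAttributedString subclass, so its
    // append(_:) takes an NSAttributedString rather than a String.
    let attributed = NSAttributedString(string: text)
    // textStorage is optional on NSTextView.
    textView.textStorage?.append(attributed)
}
```

Appending through the text storage avoids reading `NSTextView.string` back as a Swift String and mutating it. If the bytes arrive on a background queue, the call would need to hop to the main queue first.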
Couldn’t be more grateful for David Smith responding to my request as well as Marcin Krzyzanowski and Paul Goracke for nudging me to take a look at
This had the potential to be a huge time sink.
Swift 5.1 will be released sometime after March 18, 2019.
I started noticing some crashes in String(validatingUTF8:), with Fatal error: UnsafeMutablePointer.initialize overlapping range, when using it with an UnsafePointer<CChar>. I am now using the implementation described in the Discussion section of the String documentation instead. Will report back.
Fatal error: UnsafeMutablePointer.initialize overlapping range. I hate having bad code around, even if it’s bad sample code. It’s been a learning experience for sure.
String(validatingUTF8:) requires that cString is “A pointer to a null-terminated UTF-8 code sequence”. This code, on the other hand, makes no guarantee that this will be the case.
Sure, it initialises the character array with 0 (i.e. null[1]), but read may fill that array completely, so it does not end with a 0.
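One way to sidestep the terminator requirement altogether is to decode only the bytes that read reported. The sketch below is an assumption-laden illustration rather than the fix used in this post: `decodeChunk` and its parameters are hypothetical names, and the buffer is simulated instead of coming from a real file descriptor.

```swift
// Hypothetical helper (not from the original code): decode only the bytes
// that a read(2)-style call reported, instead of relying on a terminator.
func decodeChunk(_ buffer: [UInt8], bytesRead: Int) -> String {
    // Clamp to the buffer bounds; read can also return -1 on error.
    let count = max(0, min(bytesRead, buffer.count))
    // String(decoding:as:) takes an explicit collection of code units,
    // so no strlen-style scan for a trailing 0 is needed. Invalid UTF-8
    // is repaired with U+FFFD rather than rejected.
    return String(decoding: buffer[..<count], as: UTF8.self)
}

// Simulate a 4-byte buffer that a read filled completely: no trailing 0.
let full: [UInt8] = [0x53, 0x77, 0x69, 0x66]  // "Swif"
let text = decodeChunk(full, bytesRead: 4)
```

Note that a chunk boundary can still split a multi-byte UTF-8 sequence, in which case the partial character is replaced with U+FFFD. Handling that would mean carrying the trailing bytes over to the next read.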
Back to the String(validatingUTF8:) initialiser. If you look at the source, it uses UTF8._nullCodeUnitOffset(in:), which “Is an equivalent of strlen for C-strings”: it gets the length of the string based on the presence of a null terminating character (of course, we are going deep into C now). I was guessing that strlen takes a trip down memory lane© looking for that null terminating character and ends up way beyond an “acceptable” length. What is acceptable, you say?
For that we have to take a look at the source code of
So I decided to take the red pill, go down the rabbit hole and see for myself.
Welcome to the Matrix[2].
“Remember…all I’m offering is the truth. Nothing more.”