29 December 2020
Rust normalizes line breaks
I came across an interesting detail while working with Rust. The Rust compiler normalizes line breaks in string literals, and even in raw strings and byte strings. The Windows-style \r\n is turned into \n on all platforms.
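To make this concrete, here is a minimal sketch of my own; the function names are made up for illustration. Each literal contains a physical line break in the source file. As far as I can tell, the assertions hold whether the file is saved with \n or \r\n line endings, because the compiler normalizes the source before the literal is evaluated.

```rust
/// Hypothetical helper: an ordinary string literal spanning two source lines.
fn two_lines() -> &'static str {
    "first
second"
}

/// Hypothetical helper: the same text as a raw string.
fn two_lines_raw() -> &'static str {
    r"first
second"
}

/// Hypothetical helper: the same text as a byte string.
fn two_lines_bytes() -> &'static [u8] {
    b"first
second"
}

fn main() {
    // The physical line break always shows up as a single \n byte.
    assert_eq!(two_lines(), "first\nsecond");
    assert_eq!(two_lines_raw(), "first\nsecond");
    assert_eq!(two_lines_bytes(), &b"first\nsecond"[..]);

    // 5 bytes + 1 line break + 6 bytes, never 13.
    assert_eq!(two_lines().len(), 12);
    println!("{:?}", two_lines());
}
```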
In addition to that, the ancient Mac-style bare \r line break is not even allowed, causing compilation to fail with the error "bare CR not allowed in string, use \r instead" for string literals and byte strings, and the error "bare CR not allowed in raw string" for raw strings.
Interestingly, C++ raw strings also normalize line breaks. Both \r\n and \r are converted to \n without warning.
On one hand, I understand the design decision to normalize line breaks, because people clone the code on various platforms, and git on Windows converts line breaks automatically by default. Without normalization, programs would behave differently on different operating systems. For example, a two-line string literal would be one byte longer on Windows, which can have many consequences in the execution of the program.
On the other hand, this behavior is extremely unintuitive. It means that I cannot represent multiline strings with Windows-style line breaks inside raw strings. It also means that the characters inside a raw string, as seen in the user's text editor, aren't exactly the same as the contents of the resulting byte buffer, violating all principles of simple and intuitive design. As someone who almost exclusively writes cross-platform code, this feature cost me more time than it saved. This normalization is an attempt to address the breakage caused by tools that automatically convert line breaks, but two wrongs don't make a right.
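For completeness, here is a sketch of the two workarounds I am aware of when a string really needs Windows-style line breaks: write the escapes explicitly in an ordinary literal, or convert the line breaks at runtime. The helper name `to_crlf` and the sample text are my own, purely for illustration.

```rust
/// Hypothetical helper: convert every line break in `text` to \r\n.
fn to_crlf(text: &str) -> String {
    // Collapse existing \r\n pairs first so they are not doubled into \r\r\n.
    text.replace("\r\n", "\n").replace('\n', "\r\n")
}

fn main() {
    // Option 1: explicit escapes in an ordinary (non-raw) literal.
    let escaped = "line one\r\nline two";
    assert_eq!(escaped.len(), 18); // 8 + 2 + 8 bytes

    // Option 2: keep the readable raw string and convert at runtime.
    let raw = r"GET / HTTP/1.1
Host: example.com
";
    let wire = to_crlf(raw);
    assert_eq!(wire, "GET / HTTP/1.1\r\nHost: example.com\r\n");
}
```

Neither option restores the property that the bytes match what the editor shows, and the second one costs an allocation at runtime.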