Why not use string views instead of passing const std::wstring&?
Use string views instead of passing std::wstring by const&

Original link: https://giodicanio.com/2024/05/14/why-dont-you-use-string-views-like-std-wstring_view-instead-of-passing-std-wstring-by-const-reference/

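The author explains why blindly switching from `const std::wstring&` to `std::wstring_view` is not always a "modern C++" improvement, especially when interacting with Win32 APIs. Although string views are generally recommended to avoid copies, they **do not guarantee null termination**. Many C-style Win32 APIs require null-terminated strings (PCWSTR), and `std::wstring` *guarantees* this through its `data()` method. `std::wstring_view::data()` offers no such guarantee, which can lead to bugs. The author advocates continuing to use `const std::wstring&` in these cases. The author also recommends calling `c_str()` rather than `data()` when a null-terminated string is needed: because `c_str()` is not available on `wstring_view`, a mistaken conversion attempt produces a compile-time error, which is better than a run-time bug. This approach acts as a safeguard against misguided "modernization" efforts.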

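A Hacker News discussion focuses on the inefficiency of C-style strings (null-terminated character arrays) and advocates string views, especially when working with `std::wstring`. The core argument is that C-style strings are a fundamental source of bugs, affecting performance, correctness, and security, because the string must be traversed to find the null terminator. Commenters wish Pascal strings (length-prefixed strings) had become the standard, and suggest that the length prefix could be platform-specific to fit different system architectures (16, 32, or 64 bits). This would eliminate the O(n) traversal needed to determine a string's length. In essence, the discussion highlights the advantages of modern string-handling techniques over the traditional approach inherited from C.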
Original text

Thank you for the suggestion. But *in that context* that would cause nasty bugs in my code, and in code that relies on it.

…Because (in the given context) that would be wrong 🙂

This “suggestion” comes up with some frequency…

The context is this: I have some Win32 C++ code that takes input string parameters as const std::wstring&, and someone suggests that I replace those wstring const reference parameters with string views like std::wstring_view. This is usually because they have learned from someone in some course/video course/YouTube video/whatever that in “modern” C++ code you should use string views instead of passing string objects via const&. [Sarcastic mode on] Are you passing a string via const&? Your code is not modern C++! You are such an ignorant C++98 old-style C++ programmer! [Sarcastic mode off] 😉

(There are also other “gurus” who say that in modern C++ you should always use exceptions to communicate error conditions. Yeah… Well, that’s a story for another time…)

So, thank you for the suggestion, but using std::wstring_view instead of const std::wstring& in that context would introduce nasty bugs in my C++ code (and in other people’s code that relies on my own code)! So, I won’t do that!

In fact, my C++ code in question (like WinReg) talks to some Win32 C-style APIs. These expect PCWSTR as input parameters representing Unicode UTF-16 strings. A PCWSTR is basically a typedef for a _Null_terminated_ const wchar_t*. The key here is the null termination part.

If you have:

// Input string passed via const&.
//
// Someone suggests replacing 'const wstring&'
// with wstring_view:
//
//     void DoSomething(std::wstring_view s, ...)
//
void DoSomething(const std::wstring& s, ...)
{
    // This API expects input string as PCWSTR,
    // i.e. _null-terminated_ const wchar_t*.
    SomeWin32Api(s.data(), ...); // <-- See the P.S. later
}

std::wstring guarantees that the pointer returned by the wstring::data() method points to a null-terminated string.

On the other hand, invoking std::wstring_view::data() does not guarantee that the returned pointer points to a null-terminated string. It may, or may not. But there is no guarantee!
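To make the missing guarantee concrete, here is a minimal sketch in portable C++17. The names LengthSeenByCApi and FirstWord are hypothetical and are not part of the original article; LengthSeenByCApi only stands in for any C-style API that scans for the terminating null, as a Win32 function taking a PCWSTR would.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C-style API that expects a null-terminated
// string: it scans forward until it finds L'\0', just like wcslen.
std::size_t LengthSeenByCApi(const wchar_t* s)
{
    assert(s != nullptr);
    return std::wcslen(s);
}

// Returns a view of the first five characters ("Hello") of a longer string.
// The view stores a pointer and a length; there is no L'\0' after the 'o'.
std::wstring_view FirstWord()
{
    static const std::wstring text = L"Hello, World";
    return std::wstring_view(text).substr(0, 5);
}
```

In this sketch FirstWord().size() is 5, yet LengthSeenByCApi(FirstWord().data()) returns 12, because the scan runs past the end of the view until it reaches the terminator of the underlying wstring. Here the underlying buffer happens to be null-terminated, so the result is merely wrong; with a view over a buffer that has no terminator at all, the same call would read out of bounds.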

So, since [w]string_views are not guaranteed to be null-terminated, using them with Win32 APIs that expect null-terminated strings is totally wrong and a source of nasty bugs.

So, if your targets are Win32 API calls that expect null-terminated C-style strings, just keep passing good old std::wstring by const&.

Bonus Reading: The Case of string_view and the Magic String


P.S. Invoking data() vs. c_str() – To make things clearer (and more bug-resistant), when you need a null-terminated C-style string pointer as input parameter, it’s better to invoke the c_str() method on [w]string (instead of the data() method), as there is no corresponding c_str() method available with [w]string_view.

In this way, if someone wants to “modernize” the existing C++ code and tries to change the input string parameter from [w]string const& to [w]string_view, they get a compiler error when the c_str() method is invoked in the modified code (as there is no c_str() method available for string views). It’s much better to get a compile-time error than a subtle run-time bug!

On the other hand, the data() method is available for both strings and string views, but its guarantees about null-termination are different for strings vs. string views.
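The difference between the two methods can be checked at compile time. The following is a small sketch, assuming C++17 (for std::void_t); the HasCStr and HasData traits are hypothetical helpers written for this illustration and are not part of the standard library or of the original article.

```cpp
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Detects, at compile time, whether a const T has a c_str() member function.
template <typename T, typename = void>
struct HasCStr : std::false_type {};

template <typename T>
struct HasCStr<T, std::void_t<decltype(std::declval<const T&>().c_str())>>
    : std::true_type {};

// Same detection for the data() member function.
template <typename T, typename = void>
struct HasData : std::false_type {};

template <typename T>
struct HasData<T, std::void_t<decltype(std::declval<const T&>().data())>>
    : std::true_type {};

// c_str() exists only on the owning string type...
static_assert(HasCStr<std::wstring>::value, "wstring has c_str()");
static_assert(!HasCStr<std::wstring_view>::value, "wstring_view has no c_str()");

// ...while data() exists on both, which is why it cannot act as a guard.
static_assert(HasData<std::wstring>::value, "wstring has data()");
static_assert(HasData<std::wstring_view>::value, "wstring_view has data()");
```

Because s.c_str() is ill-formed when s is a wstring_view, a function body that calls c_str() stops compiling as soon as its parameter type is changed to a string view, whereas a body that calls data() keeps compiling silently.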

So, invoking the string’s c_str() method (instead of the data() method) is what I suggest when passing STL strings to Win32 API calls that expect C-style null-terminated string pointers as input (read-only) parameters. I consider this a best practice.

(Of course, if the C-interface API function needs to write to the provided string buffer, the data() method must be invoked, as it’s overloaded for both the const and non-const cases.)
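As a rough illustration of the writable-buffer case, the sketch below uses a made-up function, FakeGetText, in place of a real C-interface API; its name, signature, and behavior are assumptions for this example only. The point is the call to the non-const data() overload (available since C++17), which yields a writable wchar_t*.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical stand-in for a C-interface API that writes a null-terminated
// string into a caller-provided buffer of 'capacity' wchar_ts and returns the
// number of characters written (excluding the terminator).
std::size_t FakeGetText(wchar_t* buffer, std::size_t capacity)
{
    assert(buffer != nullptr && capacity > 0);
    static const wchar_t kText[] = L"Sample";
    const std::size_t length = std::wcslen(kText);
    // Truncate if needed, always leaving room for the terminating null.
    const std::size_t toCopy = (length < capacity - 1) ? length : capacity - 1;
    std::wmemcpy(buffer, kText, toCopy);
    buffer[toCopy] = L'\0';
    return toCopy;
}

// Wraps the C-style call and returns the result as a std::wstring.
std::wstring GetText()
{
    std::wstring result(32, L'\0');  // pre-size the destination buffer
    // The non-const data() overload returns a writable wchar_t*.
    const std::size_t written = FakeGetText(result.data(), result.size());
    result.resize(written);          // trim to the length actually written
    return result;
}
```

In this sketch GetText() returns L"Sample". Calling c_str() here would not compile, since it always returns a pointer to const.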
