Last active
November 29, 2025 18:13
Convert BSTR to std::string
#include "stdafx.h"
#include <wtypes.h>
#include <comutil.h>
#pragma comment(lib,"comsuppw.lib")
#include <string>
#include <string.h>
#include <stdio.h>
using namespace std;
string bstr_to_str(BSTR source){
    //source = L"lol2inside";
    _bstr_t wrapped_bstr = _bstr_t(source);
    int length = wrapped_bstr.length();
    char* char_array = new char[length];
    strcpy_s(char_array, length+1, wrapped_bstr);
    return char_array;
}
int _tmain(int argc, _TCHAR* argv[]){
    BSTR bstr_var = SysAllocString(L"I am bstr");
    string str = bstr_to_str(bstr_var);
    printf("result: %s\n", str.c_str());//result: I am bstr
    getchar();
    return 0;
}
Hi,
this handling is problematic AFAICS.
While `BSTR` is wide-typed (on Windows: UTF-16, thus Unicode-compliant), the `std::string` content produced by this conversion most likely is no longer Unicode-compliant: the `_bstr_t` narrow conversion used at the `strcpy_s()` line transcodes to the legacy ACP (active code page).

However, any byte-typed string representation pretty much MUST use UTF-8 encoding (`std::string` means UTF-8), otherwise Unicode compliance is not preserved properly and consistently *), and DATA CORRUPTION bugs ensue very easily (a common weakness of byte-typed encodings).

(Regionally restricted legacy code page encodings MUST NOT be used, unless actually required to correctly serve an established legacy protocol.)
https://utf8everywhere.org/
*) firmly consistently end-to-end(!!)
Question would be how transcoding (here: UTF-16 -> UTF-8) would ideally be done then.
Perhaps consume ATL dependency (atlconv.h).
HOWEVER, WARNING:
the Microsoft `atlconv.h` transcoding is not protocol-consistent (c.f. its source comment `// Codepage doesn't matter`). IOW several macro variants that are affected by the Win32 `T` protocol, `CA2CT` etc., DO NOT do the required transcoding. ACP to UTF-8 is a valid (representable!) **) transcoding transition and thus MUST be carried out, but it isn't ===> DATA CORRUPTION bug.

**) well, for most ACP code pages (unless there are code point mapping support issues), I'd think...
So it probably is a much better idea to instead use transcoding functionality that is properly cross-platform, simple (supporting only the plain Unicode-compliant encodings) and more `std::string`-based. E.g. `codecvt` or `boost::locale::conv` or so.

Using `_bstr_t::length()` might be problematic (!CONSISTENCY): it possibly (the MSDN docs are unclear!!) returns the number of wide-typed elements, which often is NOT the length that the `_bstr_t::operator char*()` side (the ACP-transcoded other side?) actually has.
===> get the `char`-typed side ***), then determine its [actual] length (`strlen()`).

***) ...but of course that one is still not Unicode-compliant (since ACP-broken, except probably for the ACP UTF-8 config setting, in some(!) newer Windows 10/11 environments)
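To make the two points above concrete (UTF-16 -> UTF-8 transcoding, and element count vs. byte count), here is a minimal hand-rolled sketch. It is NOT `codecvt` or `boost::locale::conv` (the standard `codecvt_utf8_utf16` facet has been deprecated since C++17), just a portable illustration; the function name `utf16_to_utf8` is made up for this example, and `std::u16string` stands in for the `BSTR` payload, assuming 16-bit code units as on Windows.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: transcodes UTF-16 code units to UTF-8 bytes.
// Throws std::invalid_argument on an unpaired surrogate instead of
// silently producing corrupt data.
inline std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    out.reserve(in.size() * 3);  // upper bound: 3 bytes per UTF-16 code unit
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate needs a low one
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;  // the low surrogate has been consumed as well
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            throw std::invalid_argument("unpaired low surrogate");
        }
        if (cp < 0x80) {  // 1 byte: ASCII
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {  // 2 bytes
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {  // 3 bytes: rest of the BMP
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {  // 4 bytes: supplementary planes
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

For `u"\u00E9lan"` the wide side has 4 elements while the UTF-8 side has 5 bytes, which is exactly why the wide-side length must not be used to size the narrow-side buffer. On Windows the input could presumably be built from the `BSTR` and its `SysStringLen()` count; that part is left out here to keep the sketch portable.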
This `strcpy_s()` handling probably has an off-by-one bug: the buffer allocation is 1 less than the size passed to `strcpy_s()`.

`new char[length]` is a MEMORY RESOURCE LEAK bug
===> probably should assign to an actually named `std::string` variable, then free the raw memory, then return that `std::string`.

The annoying extra external raw allocation, which is ALWAYS ****) broken since unsafe (often non-RAII, so why not `std::vector`??), most likely can be avoided anyway by using `std::string` directly, see Create a C++ string using printf-style formatting

****) ample colorful personal experience...
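A rough sketch of what "no raw allocation" could look like, under the assumption that a null-terminated narrow string is already at hand (as the `_bstr_t` narrow conversion would provide). All three function names are hypothetical and only serve this illustration; the last one follows the printf-style idea mentioned above.

```cpp
#include <cstdarg>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies a null-terminated narrow string into an
// owning std::string. No new[]/delete[], so nothing can leak.
inline std::string to_owned_string(const char* src) {
    return src ? std::string(src) : std::string();
}

// Hypothetical helper for the case where a C API insists on a writable
// buffer: std::vector owns it (RAII) and it is sized as length + 1 so the
// terminating '\0' has room.
inline std::string copy_via_buffer(const char* src) {
    if (!src) return std::string();
    const std::size_t length = std::strlen(src);  // measured on the narrow side
    std::vector<char> buffer(length + 1);         // + 1 for the terminator
    std::memcpy(buffer.data(), src, length + 1);
    return std::string(buffer.data(), length);
}

// Hypothetical helper: printf-style formatting straight into a std::string.
inline std::string format_string(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    va_list args_copy;
    va_copy(args_copy, args);
    // First pass only measures: returns the length without the terminator.
    const int needed = std::vsnprintf(nullptr, 0, fmt, args);
    va_end(args);
    std::string result;
    if (needed > 0) {
        result.resize(static_cast<std::size_t>(needed));
        // Second pass writes needed characters plus '\0'; the terminator
        // lands on the string's own terminator slot.
        std::vsnprintf(&result[0], result.size() + 1, fmt, args_copy);
    }
    va_end(args_copy);
    return result;
}
```

With something like this, `bstr_to_str()` would shrink to constructing the `_bstr_t` wrapper and returning `to_owned_string()` of its narrow side, which removes both the off-by-one and the leak (though not the ACP problem).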
This sample possibly should explicitly include `<tchar.h>` for its `_tmain` (etc.?) use: IWYU.

`using namespace` generally is not recommended (scope pollution, even in such relatively restricted/controlled scopes): risk of symbol conflicts... (potentially silent! Thus NOT Fail-Fast / Shift-Left)

Filesystem item names (here: "BSTR to std-string.cpp") better should not contain special characters such as spaces: the POSIX shell `IFS` separator default includes the space, which has "nice" effects with "simple" (not specially customized) shell command execution such as `find|xargs grep Foo`

BTW there is a detailed article explaining various direct (i.e., non-transcoding!!) assignments between string types, at
How to convert between different types of counted-string string types.
Peripheral side note: the filesystem API (`boost::filesystem`, `std::filesystem`) interface behaviour/usability is extremely problematic, due to Windows-specific ACP-hampered behaviour (c.f. the required `u8path`/`u8string` workaround helpers; alternative workaround: feeding Unicode-compliant wide-typed data). See e.g.: std::filesystem::path.
(and several other Internet activities)
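For illustration, the `u8path`/`u8string` workaround could be wrapped roughly like this. It is a sketch only: the wrapper names `path_from_utf8` and `path_to_utf8` are made up, C++17 or newer is assumed, and the preprocessor branch exists because `u8path()` became deprecated in C++20 in favour of `char8_t`-based input.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical wrapper: builds a path from UTF-8 bytes without letting the
// narrow-string constructor interpret them in the native narrow encoding.
inline fs::path path_from_utf8(const std::string& utf8) {
#if defined(__cpp_lib_char8_t)
    // C++20: char8_t-based input is always treated as UTF-8.
    return fs::path(std::u8string(utf8.begin(), utf8.end()));
#else
    // C++17: u8path() is the UTF-8 entry point.
    return fs::u8path(utf8);
#endif
}

// Hypothetical wrapper: returns the path as UTF-8 bytes in a std::string.
inline std::string path_to_utf8(const fs::path& p) {
    const auto u8 = p.u8string();  // std::string in C++17, std::u8string in C++20
    return std::string(u8.begin(), u8.end());
}
```

The point of funnelling every conversion through two such helpers is that the plain `std::string` constructor and `path::string()` never get a chance to apply the ACP behind one's back.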
HTH and HAND!