@kuba-orlik
Last active November 29, 2025 18:13
Convert BSTR to std::string
#include "stdafx.h"
#include <wtypes.h>
#include <comutil.h>
#pragma comment(lib,"comsuppw.lib")
#include <string>
#include <string.h>
#include <stdio.h>
using namespace std;

string bstr_to_str(BSTR source){
    //source = L"lol2inside";
    _bstr_t wrapped_bstr = _bstr_t(source);
    int length = wrapped_bstr.length();
    char* char_array = new char[length];
    strcpy_s(char_array, length+1, wrapped_bstr);
    return char_array;
}

int _tmain(int argc, _TCHAR* argv[]){
    BSTR bstr_var = SysAllocString(L"I am bstr");
    string str = bstr_to_str(bstr_var);
    printf("result: %s\n", str.c_str());//result: I am bstr
    getchar();
    return 0;
}
@PrithiviRathinam

Thank you

@andim2

andim2 commented Sep 20, 2025

Hi,

this handling is problematic AFAICS.

While BSTR is wide-typed (UTF-16 code units on Windows, thus Unicode-capable), the std::string produced by this conversion most likely is no longer Unicode-compliant: the _bstr_t conversion to char* (evaluated at the strcpy_s() line) transcodes to the legacy active code page (ACP).
However, any byte-typed string representation pretty much MUST use UTF-8 (std::string means UTF-8), otherwise Unicode compliance is not preserved consistently end-to-end(!!) (a common weakness of byte-typed encodings), and data corruption bugs ensue very easily.
(Region-specific legacy codepage encodings MUST NOT be used, unless actually required to correctly serve an existing, established legacy protocol.)
https://utf8everywhere.org/

The question then is how the transcoding (here: UTF-16 -> UTF-8) would ideally be done.
Perhaps by taking an ATL dependency (atlconv.h).
HOWEVER, WARNING:
the atlconv.h transcoding handling is not protocol-consistent (c.f. its source comment "// Codepage doesn't matter"):
several macro variants that depend on the Win32 T (TCHAR) setting, CA2CT etc., do NOT perform the required transcoding. ACP to UTF-8 is a valid (representable!) **) transition and thus MUST be carried out, but it isn't, which again is a data corruption bug.
**) well, for most ACP codepages (unless there are codepoint mapping support issues), I'd think...
So it probably is a much better idea to instead use transcoding functionality that is properly cross-platform, simple (supporting only the plain Unicode encodings) and more std::string-based, e.g.
codecvt or boost::locale::conv or so.
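
To make the UTF-16 -> UTF-8 step concrete, here is a minimal hand-rolled sketch in portable C++. It is not from the gist and not a substitute for a library: utf16_to_utf8 is a hypothetical helper name, it takes a std::u16string rather than a BSTR (on Windows a BSTR holds 16-bit units, so the same loop should carry over, but that adaptation is an assumption and untested here), and it replaces unpaired surrogates with U+FFFD.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the gist): transcode UTF-16 to UTF-8.
// Unpaired surrogates are replaced with U+FFFD (the replacement character).
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    out.reserve(in.size() * 3);  // upper bound: at most 3 UTF-8 bytes per UTF-16 unit
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate: needs a low one next
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;  // consume the low surrogate as well
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {  // stray low surrogate
            cp = 0xFFFD;
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Calling utf16_to_utf8(u"I am bstr") yields the same nine ASCII bytes, while non-ASCII input grows to two, three or four bytes per code point.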

Using _bstr_t::length() here is problematic (!CONSISTENCY):
as far as the MSDN docs let on, it returns the number of wide-typed elements, which often is NOT the length of what the _bstr_t::operator char*() side (the ACP-transcoded other side) actually holds.
===> obtain the char-typed side ***) first, then determine its [actual] length (strlen()).
***) ...but of course that one still is not Unicode-compliant (since ACP-transcoded; the exception probably being setups where the ACP is configured as UTF-8, which is possible in some(!) newer Windows 10/11 environments)
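
The mismatch between wide element count and narrow byte count can be illustrated portably. The sketch below uses UTF-8 as the narrow encoding, since the ACP conversion itself is Windows-only. The same kind of mismatch arises with any multi-byte code page. utf8_byte_count is a hypothetical helper, and it assumes well-formed UTF-16 input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-8 bytes needed to encode a UTF-16 string.
// Assumes well-formed input (every high surrogate is followed by a low one).
std::size_t utf8_byte_count(const std::u16string& in) {
    std::size_t bytes = 0;
    for (char16_t unit : in) {
        if (unit < 0x80) {
            bytes += 1;                               // ASCII
        } else if (unit < 0x800) {
            bytes += 2;                               // two-byte sequence
        } else if (unit >= 0xD800 && unit <= 0xDFFF) {
            bytes += 2;                               // one half of a four-byte sequence
        } else {
            bytes += 3;                               // rest of the BMP
        }
    }
    return bytes;
}
```

For u"Gr\u00FC\u00DFe" the string has 5 UTF-16 elements but needs 7 UTF-8 bytes, so a buffer sized from the wide length would be too small.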

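This strcpy_s() call has an off-by-1 bug: the buffer is allocated with length elements, but strcpy_s() is told that it holds length+1, so the terminating NUL lands one element past the end of the allocation.

new char[length] is a MEMORY RESOURCE LEAK: the array is never delete[]d after the returned std::string has copied from it.
===> probably should assign to an actually named std::string variable, then free the raw memory resource, then return that std::string.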
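
A minimal portable sketch of both fixes, assuming the narrow string is already available as a const char* (in the gist that would be the _bstr_t's char* side). copy_via_buffer is a hypothetical name, and the intermediate buffer is kept only to mirror the original structure. Returning std::string(source) directly would be simpler.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy a NUL-terminated narrow string through a temporary buffer.
std::string copy_via_buffer(const char* source) {
    const std::size_t length = std::strlen(source);   // length of the narrow side itself
    std::vector<char> buffer(length + 1);              // +1 leaves room for the terminating NUL
    std::memcpy(buffer.data(), source, length + 1);    // copies the characters and the NUL
    return std::string(buffer.data(), length);         // buffer is released automatically (RAII)
}
```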

The annoying extra raw allocation (always ****) broken since unsafe and often non-RAII; so why not at least std::vector??) most likely can be avoided anyway by working on a std::string directly,
see Create a C++ string using printf-style formatting
****) ample colorful personal experience...
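
One way to let a std::string own the buffer that a C-style API writes into is sketched below, using the standard two-pass vsnprintf idiom. format_string is a hypothetical helper name, and this is not necessarily the approach taken by the article referenced above.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: printf-style formatting straight into a std::string.
std::string format_string(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    va_list args_copy;
    va_copy(args_copy, args);
    // First pass: ask how many characters are needed (excluding the NUL).
    const int needed = std::vsnprintf(nullptr, 0, fmt, args_copy);
    va_end(args_copy);
    std::string result;
    if (needed > 0) {
        result.resize(static_cast<std::size_t>(needed) + 1);   // room for the NUL
        // Second pass: write into the string's own storage.
        std::vsnprintf(&result[0], result.size(), fmt, args);
        result.resize(static_cast<std::size_t>(needed));       // drop the NUL again
    }
    va_end(args);
    return result;
}
```

No new/delete pair appears anywhere, so there is nothing to leak on an early return or exception.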

This sample possibly should include <tchar.h> explicitly for its use of _tmain and _TCHAR (IWYU: include what you use).

using namespace generally is not recommended (scope pollution, even in such relatively restricted/controlled scopes): it risks symbol conflicts, potentially silent ones, so it is neither Fail-Fast nor Shift-Left.

Filesystem item names (here: "BSTR to std-string.cpp") should better not contain special characters such as spaces:
the POSIX shell's default IFS includes the space character, and xargs splits its input on whitespace too, which has "nice" effects with
"simple" (not specially hardened) shell command execution such as
find|xargs grep Foo

BTW there is a detailed article explaining various direct (i.e., non-transcoding!!) assignments of string types, at
How to convert between different types of counted-string string types.

Peripheral side note: the filesystem APIs (boost::filesystem, std::filesystem) are extremely problematic to use on Windows, due to
ACP-hampered behaviour for char-typed path strings there (c.f. the required u8path / u8string workaround helpers; alternative workaround: feeding Unicode-compliant wide-typed data). See e.g.:

HTH and HAND!
