From: Martin B. on
Hi all.

( Note that I've posted this to microsoft.public.vc.language a few days
ago, but didn't really get an answer there, so I hope someone may shed
some light on this in a more general context. )

The Visual Studio compiler will never inline a function that returns an
unwindable object (e.g. std::string, CString, etc.).
This is also true for a simple getter function that only contains one
return statement.

Does anyone know why this is and what other compilers do in such a case?
(I tried to find out for GCC, but didn't find any docs.)

While the details below are for VC, I think the question is really
interesting for all C++ developers.

See details below.

cheers,
Martin

-------- Original Message --------
Subject: inlining of functions returning an unwindable object -- rationale?
Date: Thu, 19 Nov 2009 10:01:11 +0100
From: Martin B. <0xCDCDCDCD(a)gmx.at>
Newsgroups: microsoft.public.vc.language

Greetings.

The Visual Studio compiler will never inline a function that returns an
unwindable object (e.g. std::string, CString, etc.).

(See documentation of the inline and __forceinline keyword and the doc
for C4714:
http://msdn.microsoft.com/en-us/library/a98sb923.aspx ,
http://msdn.microsoft.com/en-us/library/z8y1yy88.aspx
)

Can anyone provide a rationale for this? It seems quite weird to me.

Consider this example:
class Simple {
public:
    std::string s_;

public:
    Simple()
    : s_("test")
    { }

    std::string get() {
        return s_;
    }
....

void testsimple()
{
    Simple oSimple;
    std::string s1( oSimple.get() );
....

void testsimple2()
{
    Simple oSimple;
    std::string s1( oSimple.s_ );
....

This will always, no matter what, generate a call to get(). (If you
specify __forceinline and enable C4714, you'll get that warning.)
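
For anyone who wants to reproduce this, here is a minimal sketch of the getter marked for forced inlining. The FORCE_INLINE macro is a hypothetical helper that is not part of the original code; it only keeps the snippet compilable on compilers other than VC. The pragma is assumed to have the same effect as passing /w34714 on the command line, i.e. it reports C4714 as a level-3 warning.

```cpp
#include <string>

// FORCE_INLINE is a hypothetical portability macro (an assumption, not from
// the original post). On VC it maps to __forceinline; elsewhere to inline.
#if defined(_MSC_VER)
  // Report C4714 ("function marked as __forceinline not inlined") at level 3.
  #pragma warning(3 : 4714)
  #define FORCE_INLINE __forceinline
#else
  #define FORCE_INLINE inline
#endif

class Simple {
public:
    std::string s_;

    Simple() : s_("test") { }

    // Returns a std::string by value; this is the kind of function the
    // VC documentation says will not be inlined when EH is enabled.
    FORCE_INLINE std::string get() { return s_; }
};

// Returns the copied string so that the call cannot be discarded as unused.
std::string testsimple()
{
    Simple oSimple;
    std::string s1( oSimple.get() );
    return s1;
}
```

Calling testsimple() from main() and compiling with /EHsc /Ob2 should be enough to see whether the warning is reported on a given VC version.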

Find the assembly below, of which I do not claim to understand much, but
it certainly doesn't seem to me as if there's any reason for this.
Especially consider the case where the member is accessed directly. The
calls to the string related functions are exactly the same!
That is, both versions will call string functions in this order:
1) string::string (ctor of Simple)
2) string::string (ctor of s1)
3) string::~string
4) string::~string
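
The call order listed above can also be checked without reading assembly. The following is only a sketch: Traced is a hypothetical stand-in for std::string that logs its constructor and destructor calls, and the helper names are made up for illustration. It assumes the compiler elides the copy from the returned temporary into s1 (guaranteed since C++17, and done by the common compilers before that).

```cpp
#include <string>
#include <vector>

// Global log of special member function calls (hypothetical helper).
static std::vector<std::string> g_log;

// Hypothetical stand-in for std::string that records what happens to it.
struct Traced {
    Traced()              { g_log.push_back("ctor"); }
    Traced(const Traced&) { g_log.push_back("copy ctor"); }
    ~Traced()             { g_log.push_back("dtor"); }
};

struct Simple {
    Traced s_;
    Traced get() { return s_; }   // returns a copy of the member by value
};

// Initialise s1 through the getter and return the recorded calls.
std::vector<std::string> trace_via_getter()
{
    g_log.clear();
    {
        Simple oSimple;
        Traced s1( oSimple.get() );
    }                              // s1 and oSimple are destroyed here
    return g_log;
}

// Initialise s1 directly from the member and return the recorded calls.
std::vector<std::string> trace_via_member()
{
    g_log.clear();
    {
        Simple oSimple;
        Traced s1( oSimple.s_ );
    }
    return g_log;
}
```

Both functions should return the same sequence of two constructions followed by two destructions, which is the point of the comparison: the getter adds no extra object operations, only the call itself.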

So what's the deal with not inlining such a simple getter function ??

Find the assembly (VS 2005 / VC8) below.

cheers,
Martin


***************

==>
void testsimple()
{
00401150 push ebp
00401151 mov ebp,esp
00401153 push 0FFFFFFFFh
00401155 push offset __ehhandler$?testsimple@@YAXXZ (402211h)
0040115A mov eax,dword ptr fs:[00000000h]
00401160 push eax
00401161 sub esp,3Ch
00401164 mov eax,dword ptr [___security_cookie (405004h)]
00401169 xor eax,ebp
0040116B mov dword ptr [ebp-10h],eax
0040116E push eax
0040116F lea eax,[ebp-0Ch]
00401172 mov dword ptr fs:[00000000h],eax
Simple oSimple;
00401178 push offset string "test" (403194h)
0040117D lea ecx,[ebp-2Ch]
00401180 call dword ptr
[__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char>
>::basic_string<char,std::char_traits<char>,std::allocator<char> >
(403094h)]
00401186 mov dword ptr [ebp-4],0
std::string s1( oSimple.get() );
0040118D lea eax,[ebp-48h]
00401190 push eax
00401191 lea ecx,[ebp-2Ch]
00401194 call Foo::get (401380h)
}
00401199 lea ecx,[ebp-48h]
0040119C call dword ptr
[__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char>
>::~basic_string<char,std::char_traits<char>,std::allocator<char> >
(40308Ch)]
004011A2 mov dword ptr [ebp-4],0FFFFFFFFh
004011A9 lea ecx,[ebp-2Ch]
004011AC call dword ptr
[__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char>
>::~basic_string<char,std::char_traits<char>,std::allocator<char> >
(40308Ch)]
004011B2 mov ecx,dword ptr [ebp-0Ch]
004011B5 mov dword ptr fs:[0],ecx
004011BC pop ecx
004011BD mov ecx,dword ptr [ebp-10h]
004011C0 xor ecx,ebp
004011C2 call __security_check_cookie (4019D6h)
004011C7 mov esp,ebp
004011C9 pop ebp
004011CA ret

==>
std::string get() {
00401380 push ebp
00401381 mov ebp,esp
00401383 sub esp,8
00401386 mov dword ptr [ebp-8],ecx
00401389 mov dword ptr [ebp-4],0
return s_;
00401390 mov eax,dword ptr [this]
00401393 push eax
00401394 mov ecx,dword ptr [ebp+8]
00401397 call dword ptr
[__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char>
>::basic_string<char,std::char_traits<char>,std::allocator<char> >
(403090h)]
0040139D mov ecx,dword ptr [ebp-4]
004013A0 or ecx,1
004013A3 mov dword ptr [ebp-4],ecx
004013A6 mov eax,dword ptr [ebp+8]
}
004013A9 mov esp,ebp
004013AB pop ebp
004013AC ret 4


**************************

==>
void testsimple2()
{
004011D0 push ebp
004011D1 mov ebp,esp
004011D3 push 0FFFFFFFFh
004011D5 push offset __ehhandler$?testsimple2@@YAXXZ (4022CEh)
004011DA mov eax,dword ptr fs:[00000000h]
004011E0 push eax
004011E1 sub esp,3Ch
004011E4 mov eax,dword ptr [___security_cookie (405004h)]
004011E9 xor eax,ebp
004011EB mov dword ptr [ebp-10h],eax
004011EE push eax
004011EF lea eax,[ebp-0Ch]
004011F2 mov dword ptr fs:[00000000h],eax
Simple oSimple;
004011F8 push offset string "test" (403194h)
004011FD lea ecx,[ebp-2Ch]
00401200 call dword ptr
[__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char>
>::basic_string<char,std::char_traits<char>,std::allocator<char> >
(403094h)]
00401206 mov dword ptr [ebp-4],0
std::string s1( oSimple.s_ );
0040120D lea eax,[ebp-2Ch]
00401210 push eax
00401211 lea ecx,[ebp-48h]
00401214 call dword ptr
[__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char>
>::basic_string<char,std::char_traits<char>,std::allocator<char> >
(403090h)]
}
0040121A lea ecx,[ebp-48h]
0040121D call dword ptr
[__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char>
>::~basic_string<char,std::char_traits<char>,std::allocator<char> >
(40308Ch)]
00401223 mov dword ptr [ebp-4],0FFFFFFFFh
0040122A lea ecx,[ebp-2Ch]
0040122D call dword ptr
[__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char>
>::~basic_string<char,std::char_traits<char>,std::allocator<char> >
(40308Ch)]
00401233 mov ecx,dword ptr [ebp-0Ch]
00401236 mov dword ptr fs:[0],ecx
0040123D pop ecx
0040123E mov ecx,dword ptr [ebp-10h]
00401241 xor ecx,ebp
00401243 call __security_check_cookie (401A66h)
00401248 mov esp,ebp
0040124A pop ebp
0040124B ret

************************

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: George Neuner on
On Tue, 24 Nov 2009 10:13:09 CST, "Martin B." <0xCDCDCDCD(a)gmx.at>
wrote:

>Hi all.
>
>( Note that I've posted this to microsoft.public.vc.language a few days
>ago, but didn't really get an answer there, so I hope someone may shed
>some light on this in a more general context. )
>
>The Visual Studio compiler will never inline a funtion that returns an
>unwindable object (e.g. std::string, CString, etc.)

Documented.

>This is also true for a simple getter function that only contains one
>return statement.

No it isn't. See below.


>Does anyone know why this is ...

The first problem is, your code is not C++ but "managed C++", which
has its own rules - note the references to ___security_cookie() and
__security_check_cookie() in your disassembly, those won't appear in
regular C++ code. VC++ won't inline any function that has a CLR
security descriptor because the security check could throw an
exception. If you want to have the best approximation of C++, make
sure your project is "ATL", "MFC" or "Win32" - "CLR" projects produce
managed C++.


As for non-throwing C++, if you change the type of s_ and add some use
of the results so all the functions aren't optimized completely away
as in the following:

#include <iostream>
using std::cout;

class MYPOINT
{
public:
    int x;
    int y;
public:
    MYPOINT( int i, int j ) { x = i; y = j; }
    MYPOINT operator=( MYPOINT src ) { return MYPOINT(src.x,src.y); }
};

class Simple
{
public:
    MYPOINT s_;

public:
    Simple()
    : s_(3,4)
    { }

    MYPOINT get() { return s_; }
};


void testsimple()
{
    Simple oSimple;
    MYPOINT s1( oSimple.get() );
    cout << s1.x;
}


int main()
{
    testsimple();
}


.... you'll see from the disassembly below that, in testsimple(), the
calls to the constructor and to get() have been completely elided and
the constant value of the constructor argument has been passed through
directly to the stream write call. This was compiled with VC++ 2008 in
release mode with /O2 /Ob1 /Oi /Ot. Note that even though "inline
declarations only" (/Ob1) was specified, the compiler ended up
inlining the constructor and get() anyway ... likely as the result of
constant propagation and dead code removal.


***************
PUBLIC ?testsimple@@YAXXZ ; testsimple
; Function compile flags: /Ogtpy
; COMDAT ?testsimple@@YAXXZ
_TEXT SEGMENT
?testsimple@@YAXXZ PROC ; testsimple, COMDAT

; 47 : {

00000 51 push ecx

; 48 : Simple oSimple;
; 49 : MYPOINT s1( oSimple.get() );
; 50 : cout << s1.x;

00001 6a 03 push 3
00003 e8 00 00 00 00 call
??6?$basic_ostream@DU?$char_traits@D@std@@@std@@QAEAAV01@H@Z ;
std::basic_ostream<char,std::char_traits<char> >::operator<<
00008 59 pop ecx

; 51 : }

00009 c3 ret 0
?testsimple@@YAXXZ ENDP ; testsimple

***************


>... and what other compilers do in such a case?
>(I tried to find out for GCC, but didn't find any docs.)

You'd have to try them.

George


From: George Neuner on
On Tue, 24 Nov 2009 14:40:52 CST, George Neuner <gneuner2(a)comcast.net>
wrote:

> MYPOINT operator=( MYPOINT src ) { return MYPOINT(src.x,src.y);

Ignore this. I had started to do something more complicated and then
realized I didn't need to ... I just quickly dashed off something that
compiled and forgot about it.

George


From: Martin B. on
George Neuner wrote:
> On Tue, 24 Nov 2009 10:13:09 CST, "Martin B." <0xCDCDCDCD(a)gmx.at>
> wrote:
>> (...)
>> The Visual Studio compiler will never inline a funtion that returns an
>> unwindable object (e.g. std::string, CString, etc.)
>
> Documented.
>
>> This is also true for a simple getter function that only contains one
>> return statement.
>
> No it isn't. See below.
>

Yes it is. See below.

>
>> Does anyone know why this is ...
>
> The first problem is, your code is not C++ but "managed C++", which
> has its own rules - note the references to ___security_cookie() and
> __security_check_cookie() in your disassembly, those won't appear in
> regular C++ code. (...) If you want to have the best approximation of C++, make
> sure your project is "ATL", "MFC" or "Win32" - "CLR" projects produce
> managed C++.
>

Incorrect. The security_cookie is generated by the /GS option (see:
http://msdn.microsoft.com/en-us/library/8dbf701c%28VS.80%29.aspx). Adding
/GS- to the command line will remove these references from the disassembly.

FWIW, here are my settings (unchanged from the OP except for /GS-):
General -
Config. Type: Application (.exe)
Character Set: Unicode
CLR support: *No* CLR support
C++ Optimization - (I have only enabled inlining so I do not have to
care about the compiler optimizing away unused code)
Opt: Custom
Inlining: /Ob2
Whole Program Opt.: No
C++ Language -
Disable Language Extensions: *Yes* (/Za)
Enable RTTI: Yes

The command line looks like this:
/Ob2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /FD
/EHsc /MD /Za /Yu"stdafx.h" /Fp"Release\inline_opt.pch" /Fo"Release\\"
/Fd"Release\vc80.pdb" /W3 /nologo /c /Zi /TP /errorReport:prompt
plus: /w34714 /GS-

>
> As for non-throwing C++, if you change the type of s_ and add some use
> of the results so all the functions aren't optimized completely away
> as in the following:
>
> class MYPOINT
> {
> public:
> int x;
> int y;
> public:
> MYPOINT( int i, int j ) { x = i; y = j; };
> MYPOINT operator=( MYPOINT src ) { return MYPOINT(src.x,src.y);
> (...)

I have checked, and your example does *not* contain an *unwindable*
object. MYPOINT has an empty dtor, because it contains only int members.
Adding a non-empty dtor to MYPOINT, for example:
MYPOINT::~MYPOINT() {
    cout << "This is a non-empty dtor\n";
}
will effectively prevent MYPOINT get() from being inlined.
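
To make the distinction concrete, here is a small sketch. It assumes that "unwindable" corresponds roughly to "has a non-trivial destructor", which is my reading of the documentation rather than something it states in these terms. PlainPoint, LoggedPoint and needs_unwind_cleanup are hypothetical names used only for illustration, and the type trait requires C++11 or later.

```cpp
#include <iostream>
#include <type_traits>

// Like MYPOINT above: user-provided ctor, but an implicit (trivial) dtor.
struct PlainPoint {
    int x;
    int y;
    PlainPoint( int i, int j ) : x(i), y(j) { }
};

// The same type with a non-empty dtor added.
struct LoggedPoint {
    int x;
    int y;
    LoggedPoint( int i, int j ) : x(i), y(j) { }
    ~LoggedPoint() { std::cout << "This is a non-empty dtor\n"; }
};

// Hypothetical helper: true if destroying a T requires running code,
// which is what exception unwinding has to account for.
template <typename T>
constexpr bool needs_unwind_cleanup()
{
    return !std::is_trivially_destructible<T>::value;
}

static_assert( !needs_unwind_cleanup<PlainPoint>(),
               "PlainPoint needs no cleanup during unwinding" );
static_assert( needs_unwind_cleanup<LoggedPoint>(),
               "LoggedPoint's dtor must run during unwinding" );
```

By the same measure std::string falls into the second group, which would explain why the original example and the MYPOINT example behave differently.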


And so I repeat my question to the general audience: Why?? :-)

cheers,
Martin


From: Goran on
On Nov 24, 9:40 pm, George Neuner <gneun...(a)comcast.net> wrote:
> On Tue, 24 Nov 2009 10:13:09 CST, "Martin B." <0xCDCDC...(a)gmx.at>
> wrote:
>
> >Hi all.
>
> >( Note that I've posted this to microsoft.public.vc.language a few days
> >ago, but didn't really get an answer there, so I hope someone may shed
> >some light on this in a more general context. )
>
> >The Visual Studio compiler will never inline a funtion that returns an
> >unwindable object (e.g. std::string, CString, etc.)
>
> Documented.
>
> >This is also true for a simple getter function that only contains one
> >return statement.
>
> No it isn't. See below.
>
> >Does anyone know why this is ...
>
> The first problem is, your code is not C++ but "managed C++", which
> has its own rules - note the references to ___security_cookie() and
> __security_check_cookie() in your disassembly, those won't appear in
> regular C++ code. VC++ won't inline any function that has a CLR
> security descriptor because the security check could throw an
> exception. If you want to have the best approximation of C++, make
> sure your project is "ATL", "MFC" or "Win32" - "CLR" projects produce
> managed C++.

AFAIK, "security cookie" code is the result of /GS option and that is
available for native compilation. Note also that Martin's example only
uses std::string, so no CLR is needed there, and I'd be very surprised
if he had that in his options.

That said, could this have something to do with compiler not knowing
the implementation of ~string, hence forcing a function call (not that
I see a connection)? As you've shown, when destructor is visible to
the compiler, all is indeed inline (but as usual with such examples,
one has to make sure that some unseen optimization doesn't play with
"expected" results).

Goran.

