How many of you use C for game programming?

Codeloader · 2011-03-16T22:42:34

I'm very curious. I use it myself because of its simplicity and less haggling with accessor functions, member access, etc. Though many may not agree with me. I'm just curious about what people's take on this is.


Started by Codeloader_Dev January 24, 2011 04:33 AM

107 comments, last by Washu 13 years, 6 months ago

agottem

March 16, 2011 04:25 AM

But it just did! It took code that was requesting "i++" to be performed (increment and return previous), but the compiler decided that a simple increment (with no return) would work just as well.

Yes, the compiler performed an i++, not a ++i. This was precisely my point; you seem to have missed it.

In other words "an optimising compiler can fix this mistake in simple cases".[/quote]

I don't think you've been following the conversation. I already said this much.

What you seemed to have missed is that there are cases where even an optimizing compiler cannot turn i++ into the equivalent of ++i. For example, if the definition of i++ isn't known (e.g., not defined in a header file), it won't be able to optimize out the unused return value.

Alternatively, if the definition of i++ *is* complex, the compiler may be unable to inline appropriately.

Alternatively again, if the compiler is set up to optimize for size, and not for speed, it may turn those inline methods into function calls. In which case, the compiler won't inline and optimize the call to i++, and again, you'll be paying for the overhead of using a post-increment instead of a pre-increment.

Of course in reality you'd just use a raw-pointer instead of some abstract iterator data type, in which case p++ or ++p will optimise to the same assembly[/quote]
Yes, in C, p++ and ++p turn into the same assembly because you are working with a primitive type. You are guaranteed the compiler will always have 'the function definition' for ++p or p++ when working with a primitive type. Iterators and overloaded increment operators are hardly primitive types, and you certainly can't guarantee the compiler will have access to the definition in order to optimize it.
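To make the cost being argued about concrete, here is a minimal sketch. `CountingIter`, `sum_pre` and `sum_post` are hypothetical names invented for this illustration, not anything from the STL or from the posts. The copy constructor has a visible side effect (it bumps a counter), so the compiler has to keep the copy that post-increment makes. With a trivially copyable iterator whose definition is visible, the optimiser is free to drop that copy, which is the "simple case" both sides agree on.

```cpp
#include <cassert>

// Hypothetical iterator over an int array that counts how often it is copied.
struct CountingIter {
    const int* p;
    static int copies;  // total copy-constructions so far

    explicit CountingIter(const int* ptr) : p(ptr) {}
    CountingIter(const CountingIter& other) : p(other.p) { ++copies; }

    // Pre-increment: advance in place and return a reference, no copy.
    CountingIter& operator++() { ++p; return *this; }

    // Post-increment: has to hand back the old value, so it copies first.
    CountingIter operator++(int) {
        CountingIter old(*this);
        ++p;
        return old;
    }

    bool operator!=(const CountingIter& other) const { return p != other.p; }
    int operator*() const { return *p; }
};

int CountingIter::copies = 0;

// Sums [first, last), advancing with ++it.
int sum_pre(const int* first, const int* last) {
    int total = 0;
    const CountingIter end(last);
    for (CountingIter it(first); it != end; ++it) total += *it;
    return total;
}

// Sums [first, last), advancing with it++ and discarding the returned copy.
int sum_post(const int* first, const int* last) {
    int total = 0;
    const CountingIter end(last);
    for (CountingIter it(first); it != end; it++) total += *it;
    return total;
}
```

Calling `sum_pre` leaves `CountingIter::copies` untouched, while `sum_post` makes at least one copy per element. The exact count can vary with whether the compiler elides the copy on return.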

Again, a C++ developer is forced to pay attention to small syntax nonsense that a C developer can ignore.

Daaark

3,556

March 16, 2011 04:59 AM

To provide a totally balanced counter-point, you should also make a video on youtube that starts with the quote "Trust Me, I can manage your memory!", followed by the Java loading icon animating for 10 minutes.

Fair enough. The Java VM is a giant POS.

Memory management ends up being imperfect most of the time. But it's much worse when left in the hands of the random programmer who thinks he's a lot more clever than he actually is. Everyone wants to roll their own solution, and usually to disastrous results.
We've had it pretty good in the last few years, now that .Net, STL, and improved programming practices have come into their own and are becoming commonplace. Average software stability has come a long way. But I think people forget what it was like before that.

I grew up in the days when every program was plagued with memory errors. When games full of bad allocations were programmed on top of 32-bit extenders, and memory managers with similar errors. (All these programmers seem to work at Bioware, Activision and especially Adobe now...)

Basic memory handling is easy. But it gets much harder as program complexity increases, and it turns into a tight-rope walk. One mis-step and you're a goner.

(forum keeps eating my new lines)


Hodgman

52,718

March 16, 2011 05:30 AM

[quote name='Hodgman' timestamp='1300243460' post='4786326']In other words "an optimising compiler can fix this mistake in simple cases".

I don't think you've been following the conversation. I already said this much.[/quote]Thanks to the magic of the scroll wheel, we can both go back up the conversation and see that I said that much and you disagreed:

[quote name='Hodgman' timestamp='1300168980' post='4785935']6. Again, this is something that separates a junior C++ programmer from an experienced one. Not an STL issue. Plus an optimising compiler can fix this mistake in simple cases.

A compiler can absolutely NOT fix that mistake.[/quote]So... having now come full circle and argued for my original line that spawned this thread of debate, you seem to be arguing for the sake of arguing.
I get it. C++ is too complicated. Operator overloading makes things too hard. Ok.

What you seemed to have missed is that there are cases where even an optimizing compiler cannot turn i++ into the equivalent of ++i. [/quote]Um, no, I only mentioned simple cases. The fact that I specifically mentioned "simple cases" should imply that more complex cases will defeat the optimiser...

I also mentioned that using the correct operator is something that separates a junior C++ programmer from an experienced one - once you've learnt what the operators do, then using the right one is the same as choosing "+" over "^" or "*" in cases where you want to do an addition. Yes, these particular operators (i++/++i) are an area where C programmers (or junior C++ programmers) often use the wrong one, though it's a very easy lesson to correct that behaviour.
For example, if the definition of i++ isn't known (e.g., not defined in a header file), it won't be able to optimize out the unused return value.[/quote]Again, it sounds like you need to upgrade your compiler/linker, because mine can do that (LTCG) and has been able to for 9 years.


agottem

March 16, 2011 02:58 PM

Thanks to the magic of the scroll wheel, we can both go back up the conversation and see that I said that much and you disagreed:[quote name='Hodgman' timestamp='1300168980' post='4785935']6. Again, this is something that separates a junior C++ programmer from an experienced one. Not an STL issue. Plus an optimising compiler can fix this mistake in simple cases.

[/quote]I guess we have conflicting definitions of 'fix'. The mistake, in my mind, was using 'i++' when '++i' should have been used instead. In order for the compiler to 'fix' that mistake, it'd need to invoke the pre-increment instead of the post-increment operator. I disagree that the compiler can fix the mistake; however, I do agree that in some cases the compiler can hide the mistake.

Also, a verbatim quote from me, prior to your 'Plus an optimizing compiler can fix this mistake in simple cases' statement:

[quote name='agottem']In the case of a vector, the methods for either implementation are simple enough that the compiler can optimize them both to the point of identical assembly. As the iterators become more complicated for the compiler to analyze, or, if the definition is not available to the compiler...you may see less optimal code due to using post/pre increment inappropriately.

You keep repeating something I already stated! It doesn't change the fact that the compiler hasn't fixed your mistake, but has merely hidden it.

Now you're changing the subject. The point isn't that an experienced programmer knows when to use which, the point is that it's stupid to have to.

Again, it sounds like you need to upgrade your compiler/linker, because mine can do that (LTCG) and has been for 9 years.

That's nice. See how the compiler does when the definition is in an external DLL, and all you have is an import library to link against. Also, still doesn't change the fact that when optimizing for size, you now have to pay a performance penalty for using the wrong operator. Clearly the compiler is *not* fixing your mistake.

rip-off

11,000

March 16, 2011 07:38 PM

I guess we're both having conflicting definitions of 'fix'. The mistake, in my mind, was using 'i++' when '++i' should have been used instead. In order for the compiler to 'fix' that mistake, it'd need to invoke the pre-increment instead of the post-increment operator. I disagree that the compiler can fix the mistake, however, I do agree that in some cases the compiler can hide the mistake.[/quote]
I don't see the point in this argument. The point here is the result. If the resulting executable is equally fast, then there is no "mistake". If I know the compiler will manage this for me, why should I waste precious brain cycles worrying about it, when I can put them to better use finding and eliminating bottlenecks. Your whole line of argument appears to be counter-productive and just pedantic really.

Optimising for size doesn't change much. Here is the output assembly:
; 13   : 	// Print post
; 14   : 	for(std::vector<int>::iterator it = v.begin(); it != v.end(); it++) {

	mov	esi, ebx
	cmp	ebx, edi
	je	SHORT $LN4@main
$LL82@main:

; 15   : 		printf("%d\n", *it);

	push	DWORD PTR [esi]
	push	OFFSET $SG-31
	call	_printf
	add	esi, 4
	pop	ecx
	pop	ecx
	cmp	esi, edi
	jne	SHORT $LL82@main
$LN4@main:

; 16   : 	}
; 17   : 
; 18   : 	// Print pre
; 19   : 	for(std::vector<int>::iterator it= v.begin(); it != v.end(); ++it) {

	mov	esi, ebx
	cmp	ebx, edi
	je	SHORT $LN1@main
$LL112@main:

; 20   : 		printf("%d\n", *it);

	push	DWORD PTR [esi]
	push	OFFSET $SG-32
	call	_printf
	add	esi, 4
	pop	ecx
	pop	ecx
	cmp	esi, edi
	jne	SHORT $LL112@main
$LN1@main:
Code speaks louder than words. Next time you make an assertion that can be trivially proven using code, kindly do so. It spares me the time debunking your statements.

If you're putting your iterator implementation in a DLL and optimising for size then maybe you aren't too worried about performance after all.
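For anyone who would rather measure than read assembly, a rough timing sketch follows. The helper names (`make_map`, `sum_keys_post`, `sum_keys_pre`, `time_us`) are made up for this example, and it assumes a C++11 compiler for `std::chrono`. Treat any numbers it produces with care: they depend on the compiler, the flags and the machine, and a single run of a small loop is noisy.

```cpp
#include <cassert>
#include <chrono>
#include <map>

// Builds a map with keys 0..n-1 (hypothetical test data).
std::map<int, int> make_map(int n) {
    std::map<int, int> m;
    for (int i = 0; i < n; ++i) m[i] = i;
    return m;
}

// Walks the map with it++ and returns the sum of the keys.
long long sum_keys_post(const std::map<int, int>& m) {
    long long total = 0;
    for (std::map<int, int>::const_iterator it = m.begin(); it != m.end(); it++)
        total += it->first;
    return total;
}

// Walks the map with ++it and returns the sum of the keys.
long long sum_keys_pre(const std::map<int, int>& m) {
    long long total = 0;
    for (std::map<int, int>::const_iterator it = m.begin(); it != m.end(); ++it)
        total += it->first;
    return total;
}

// Runs f(m) once, stores its result, and returns the elapsed microseconds.
template <typename F>
long long time_us(F f, const std::map<int, int>& m, long long& result) {
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    result = f(m);
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

A call such as `time_us(sum_keys_post, m, result)` next to `time_us(sum_keys_pre, m, result)` on the same map gives two figures to compare. Using the returned sum keeps the optimiser from discarding the loop.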


rip-off

11,000

March 16, 2011 07:47 PM

Now this is interesting, and not something I expected. I changed the container type to std::map:



#include <map>
#include <vector>
#include <iostream>
#include <cstdio>
int main() 
{
	std::map<int, int> v;
	int i;
	while(std::cin >> i) {
		// v.push_back(i);
		v.insert(std::make_pair(i,i));
	}


	// Print post
	for(std::map<int, int>::iterator it = v.begin(); it != v.end(); it++) {
		printf("%d\n", *it); // note: *it is a std::pair here, so the whole pair is passed to printf
	}

	// Print pre
	for(std::map<int, int>::iterator it= v.begin(); it != v.end(); ++it) {
		printf("%d\n", *it); // same as above: passes the pair, not just the key
	}


}

When compiling with optimise for code size I got the following:



; 15   : 	// Print post

; 16   : 	for(std::map<int, int>::iterator it = v.begin(); it != v.end(); it++) {



	mov	ecx, DWORD PTR _v$[esp+68]

	mov	eax, DWORD PTR [ecx]

	mov	DWORD PTR _it$31323[esp+64], eax

	jmp	SHORT $LN158@main

$LL88@main:



; 17   : 		printf("%d\n", *it);



	push	DWORD PTR [eax+16]

	push	DWORD PTR [eax+12]

	push	OFFSET $SG-31

	call	_printf

	add	esp, 12					; 0000000cH

	lea	eax, DWORD PTR _it$31323[esp+64]

	call	??E?$_Tree_const_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$$CBHH@std@@@2@$0A@@std@@@std@@@std@@QAEAAV01@XZ ; std::_Tree_const_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator++

	mov	eax, DWORD PTR _it$31323[esp+64]

	mov	ecx, DWORD PTR _v$[esp+68]

$LN158@main:



; 12   : 	}

; 13   : 

; 14   : 

; 15   : 	// Print post

; 16   : 	for(std::map<int, int>::iterator it = v.begin(); it != v.end(); it++) {



	cmp	eax, ecx

	jne	SHORT $LL88@main



; 18   : 	}

; 19   : 	

; 20   : 	// Print pre

; 21   : 	for(std::map<int, int>::iterator it= v.begin(); it != v.end(); ++it) {



	mov	eax, DWORD PTR [ecx]

	mov	DWORD PTR _it$31360[esp+64], eax

	cmp	eax, ecx

	je	SHORT $LN1@main

$LL124@main:



; 22   : 		printf("%d\n", *it);



	push	DWORD PTR [eax+16]

	push	DWORD PTR [eax+12]

	push	OFFSET $SG-32

	call	_printf

	add	esp, 12					; 0000000cH

	lea	eax, DWORD PTR _it$31360[esp+64]

	call	??E?$_Tree_const_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$$CBHH@std@@@2@$0A@@std@@@std@@@std@@QAEAAV01@XZ ; std::_Tree_const_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator++

	mov	eax, DWORD PTR _it$31360[esp+64]

	cmp	eax, DWORD PTR _v$[esp+68]

	jne	SHORT $LL124@main

$LN1@main:

It looks like the compiler is calling the same function implementation for the increment! Clever girl...
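A side note on the test program in this post: dereferencing a `std::map` iterator gives a key/value pair rather than an `int`, which is consistent with the two pushes before each `call _printf` in the listing. Passing a pair to a `%d` format is not valid, so the sketch below prints `it->first` instead. The typedef `IntMap` and the function `print_keys` are names invented for this example, and the `static_assert` assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstdio>
#include <iterator>
#include <map>
#include <type_traits>
#include <utility>
#include <vector>

typedef std::map<int, int> IntMap;

// Dereferencing a map iterator yields a key/value pair, not an int.
static_assert(std::is_same<std::iterator_traits<IntMap::iterator>::value_type,
                           std::pair<const int, int> >::value,
              "map iterators refer to std::pair<const Key, T>");

// Prints each key with printf and also returns the keys so the
// result can be checked without capturing stdout.
std::vector<int> print_keys(const IntMap& m) {
    std::vector<int> keys;
    for (IntMap::const_iterator it = m.begin(); it != m.end(); ++it) {
        std::printf("%d\n", it->first);  // pass the int key, not the pair
        keys.push_back(it->first);
    }
    return keys;
}
```

The keys come back in ascending order because `std::map` keeps its elements sorted by key.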


agottem

March 16, 2011 08:22 PM

I don't see the point in this argument. The point here is the result. If the resulting executable is equally fast, then there is no "mistake". If I know the compiler will manage this for me, why should I waste precious brain cycles worrying about it, when I can put them to better use finding and eliminating bottlenecks. Your whole line of argument appears to be counter-productive and just pedantic really.
....

Code speaks louder than words. Next time you make an assertion that can be trivially proven using code, kindly do so. It spares me the time debunking your statements.

If you're putting your iterator implementation in a DLL and optimising for size then maybe you aren't too worried about performance after all.

You have to "waste precious brain cycles" because they aren't the same thing. How many times do I need to explain it to you? The compiler CANNOT change i++ to ++i; as such, it stands to reason there will be differences in certain scenarios.

Why don't you look at something a little more complicated than the vector iterator? How about the following code:





#include <cstdio>
#include <map>

void foo (std::map<int, int>& m)
{
        for(std::map<int, int>::iterator i = m.begin(); i != m.end(); i++ /*or ++i */)
        {
                printf("foo\n");
        }
}

When compiling with "cl /FA /Os /c foo.cpp", here's the assembly you get for the post-increment case:





	push    ebp

	mov	ebp, esp

	sub	esp, 12					; 0000000cH





	lea	eax, DWORD PTR _i$23161[ebp]

	push	eax

	mov	ecx, DWORD PTR _m$[ebp]

	call	?begin@?$_Tree@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@QAE?AV?$_Tree_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@2@XZ ; std::_Tree<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> >::begin

	jmp	SHORT $LN3@foo



$LN2@foo:

	push	0

	lea	eax, DWORD PTR $T23885[ebp]

	push	eax

	lea	ecx, DWORD PTR _i$23161[ebp]

	call	??E?$_Tree_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@std@@QAE?AV01@H@Z ; std::_Tree_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator++





$LN3@foo:

	lea	eax, DWORD PTR $T23886[ebp]

	push	eax

	mov	ecx, DWORD PTR _m$[ebp]

	call	?end@?$_Tree@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@QAE?AV?$_Tree_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@2@XZ ; std::_Tree<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> >::end

	push	eax

	lea	ecx, DWORD PTR _i$23161[ebp]

	call	??9?$_Tree_const_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@std@@QBE_NABV01@@Z ; std::_Tree_const_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator!=

	movzx	eax, al

	test	eax, eax

	je	SHORT $LN4@foo





	push	OFFSET $SG23168

	call	_printf

	pop	ecx



	jmp	SHORT $LN2@foo

$LN4@foo:

	leave

	ret	0

Here you can see it call the post-increment operator (in section $LN2@foo). And, for completeness, here's the assembly for the post-increment operator:





	push	ebp

	mov	ebp, esp

	push	ecx

	push	ecx

	mov	DWORD PTR _this$[ebp], ecx



	mov	eax, DWORD PTR _this$[ebp]

	mov	eax, DWORD PTR [eax]

	mov	DWORD PTR __Tmp$[ebp], eax



	mov	ecx, DWORD PTR _this$[ebp]

	call	??E?$_Tree_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@std@@QAEAAV01@XZ ; std::_Tree_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator++



	mov	eax, DWORD PTR ___$ReturnUdt$[ebp]

	mov	ecx, DWORD PTR __Tmp$[ebp]

	mov	DWORD PTR [eax], ecx

	mov	eax, DWORD PTR ___$ReturnUdt$[ebp]



	leave

	ret	8

Next, pre-increment:





	push	ebp

	mov	ebp, esp

	push	ecx

	push	ecx



	lea	eax, DWORD PTR _i$23161[ebp]

	push	eax

	mov	ecx, DWORD PTR _m$[ebp]

	call	?begin@?$_Tree@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@QAE?AV?$_Tree_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@2@XZ ; std::_Tree<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> >::begin

	jmp	SHORT $LN3@foo



$LN2@foo:

	lea	ecx, DWORD PTR _i$23161[ebp]

	call	??E?$_Tree_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@std@@QAEAAV01@XZ ; std::_Tree_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator++



$LN3@foo:

	lea	eax, DWORD PTR $T23879[ebp]

	push	eax

	mov	ecx, DWORD PTR _m$[ebp]

	call	?end@?$_Tree@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@QAE?AV?$_Tree_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@2@XZ ; std::_Tree<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> >::end

	push	eax

	lea	ecx, DWORD PTR _i$23161[ebp]

	call	??9?$_Tree_const_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@std@@QBE_NABV01@@Z ; std::_Tree_const_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator!=

	movzx	eax, al

	test	eax, eax

	je	SHORT $LN4@foo



	push	OFFSET $SG23167

	call	_printf

	pop	ecx



	jmp	SHORT $LN2@foo



$LN4@foo:

	leave

	ret	0

And the assembly for the pre-increment operator:





	push	ebp

	mov	ebp, esp

	push	ecx

	mov	DWORD PTR _this$[ebp], ecx



	mov	ecx, DWORD PTR _this$[ebp]

	call	??E?$_Tree_const_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$CBHH@std@@@2@$0A@@std@@@std@@@std@@QAEAAV01@XZ ; std::_Tree_const_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator++



	mov	eax, DWORD PTR _this$[ebp]



	leave

	ret	0

Clearly, the difference is there. Additionally, obviously, the compiler (Visual Studio 2010) could not 'fix' the mistake. You lose, have fun updating all your post-increments you thought the compiler would 'fix' into pre-increments.
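The post-increment listing above saves the iterator into a temporary (`__Tmp`), calls the pre-increment operator, and then returns the temporary. In source form that is roughly the usual idiom sketched below. This is not the actual MSVC `_Tree_iterator` code: `Node`, `TreeIter`, `tree_begin` and `keys_in_order` are hypothetical stand-ins, with a plain binary search tree and a null pointer as the end marker.

```cpp
#include <cassert>
#include <vector>

// Hypothetical binary-search-tree node with a parent link.
struct Node {
    int key;
    Node* left;
    Node* right;
    Node* parent;
};

// Hypothetical in-order iterator; a null node marks the end.
struct TreeIter {
    Node* node;

    // Pre-increment: step to the in-order successor.
    TreeIter& operator++() {
        if (node->right) {
            // Successor is the leftmost node of the right subtree.
            node = node->right;
            while (node->left) node = node->left;
        } else {
            // Otherwise climb until we arrive from a left child.
            Node* up = node->parent;
            while (up && node == up->right) {
                node = up;
                up = up->parent;
            }
            node = up;
        }
        return *this;
    }

    // Post-increment: save a copy, reuse pre-increment, return the copy.
    TreeIter operator++(int) {
        TreeIter tmp = *this;
        ++*this;
        return tmp;
    }

    bool operator!=(const TreeIter& other) const { return node != other.node; }
};

// Returns an iterator at the smallest key under root.
TreeIter tree_begin(Node* root) {
    while (root && root->left) root = root->left;
    TreeIter it = { root };
    return it;
}

// Collects every key in ascending order using pre-increment.
std::vector<int> keys_in_order(Node* root) {
    std::vector<int> keys;
    const TreeIter end = { nullptr };
    for (TreeIter it = tree_begin(root); it != end; ++it)
        keys.push_back(it.node->key);
    return keys;
}
```

When `operator++(int)` is emitted as a real function call, as in the /Os build above, the temporary and the extra call stay. When both operators are inlined and the result is unused, the temporary can disappear.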

rip-off

11,000

March 16, 2011 09:49 PM

I actually use pre-increments everywhere, so I'm actually good for updating my code thanks.

I actually tried it earlier for std::map. I am not using the exact same compile options as you, as I am throwing this into a project I used for random internet help. The configuration is near enough the defaults, just some settings such as "disable language extensions" and "iterator debugging" removed, along with increasing the warning level. I generally try to reset any configuration changes I make, but maybe I've changed something important and forgotten about it.

The command line includes:



/I"C:\Program Files (x86)\boost\boost_1_40" /Zi /nologo /W4 /WX- /Ox /Oi /Ot /Oy- /GL /D "_HAS_ITERATOR_DEBUGGING=0" /D "_SECURE_SCL=0" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /Gm- /EHsc /MT /GS- /Gy /fp:precise /Zc:wchar_t /Zc:forScope /Fp"Release\Help.pch" /FAs /Fa"Release\" /Fo"Release\" /Fd"Release\vc100.pdb" /Gd /analyze- /errorReport:queue

I couldn't really be bothered to go through each setting in detail to see what is causing the differences between what we see.

Here is the code I used.



#include <map>

#include <vector>

#include <iostream>



int main() 

{

	std::map<int, int> v;

	int i;

	while(std::cin >> i) {

		// v.push_back(i);

		v.insert(std::make_pair(i,i));

	}





	// Print post

	for(std::map<int, int>::iterator it = v.begin(); it != v.end(); it++) {

		printf("%d\n", it->first);

	}



	// Print pre

	for(std::map<int, int>::iterator it= v.begin(); it != v.end(); ++it) {

		printf("%d\n", it->first);

	}

}

With favour size, both times the same function was called. It is actually changing a call from i++ to ++i. The instruction sequence was slightly different, though I'm not sure this is detrimental.



	mov	eax, DWORD PTR _v$[esp+68]



; Loop 1

	mov	ecx, DWORD PTR [eax]

	mov	DWORD PTR _it$31323[esp+64], ecx

	jmp	SHORT $LN162@main

$LL88@main:

	push	DWORD PTR [ecx+12]

	push	OFFSET $SG-31

	call	_printf

	pop	ecx

	pop	ecx

	lea	eax, DWORD PTR _it$31323[esp+64]

	call	??E?$_Tree_const_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$$CBHH@std@@@2@$0A@@std@@@std@@@std@@QAEAAV01@XZ ; std::_Tree_const_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator++

	mov	ecx, DWORD PTR _it$31323[esp+64]

	mov	eax, DWORD PTR _v$[esp+68]

$LN162@main:

	cmp	ecx, eax

	jne	SHORT $LL88@main

; Loop 2

	mov	ecx, DWORD PTR [eax]

	mov	DWORD PTR _it$31360[esp+64], ecx

	cmp	ecx, eax

	je	SHORT $LN1@main

$LL126@main:

	push	DWORD PTR [ecx+12]

	push	OFFSET $SG-32

	call	_printf

	pop	ecx

	pop	ecx

	lea	eax, DWORD PTR _it$31360[esp+64]

	call	??E?$_Tree_const_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$$CBHH@std@@@2@$0A@@std@@@std@@@std@@QAEAAV01@XZ ; std::_Tree_const_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> > >::operator++

	mov	ecx, DWORD PTR _it$31360[esp+64]

	cmp	ecx, DWORD PTR _v$[esp+68]

	jne	SHORT $LL126@main

$LN1@main:

In the first case, the compiler jumps to the loop end first, and then does the loop. In the second, the compiler inserts a test before entering the loop. From what I can see, these differences are minor, and whatever speed difference there might be between them is paid once, on the very first iteration.

However, I do not see an unnecessary copy of the iterator in the body, and in fact in this case the compiler calls the same function in both loop bodies. The loop body is essentially the same. If anything, the second loop actually contains fewer instructions in the body, which I find bizarre.
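The two loop shapes described here can be written out by hand. The sketch below is only an illustration of the control flow, using made-up function names and a plain pointer range instead of a map. The first function jumps straight to the test at the bottom, the second guards the loop with one extra test up front. Both run the same body the same number of times.

```cpp
#include <cassert>

// Shape of the first loop: jump to the test at the bottom, then fall
// back into the body while the test holds.
int sum_jump_to_test(const int* first, const int* last) {
    int total = 0;
    const int* p = first;
    goto test;
body:
    total += *p;
    ++p;
test:
    if (p != last) goto body;
    return total;
}

// Shape of the second loop: one extra test before entering, then a
// do/while with the test at the bottom.
int sum_guarded_do_while(const int* first, const int* last) {
    int total = 0;
    const int* p = first;
    if (p != last) {
        do {
            total += *p;
            ++p;
        } while (p != last);
    }
    return total;
}
```

The only difference is how the loop is entered, which is why any cost is paid once per loop rather than once per iteration.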

With favour speed, it's a bit harder, as the compiler inlines the loops entirely. From examining the assembly back to back, I can only see the label values changing, and the "npad" values. They otherwise appear identical, but I could be wrong as there is quite a bit of code there:



; 15   : 	// Print post

; 16   : 	for(std::map<int, int>::iterator it = v.begin(); it != v.end(); it++) {



	mov	eax, DWORD PTR _v$[esp+84]

	mov	esi, DWORD PTR [eax]

	cmp	esi, eax

	je	SHORT $LN4@main

$LL109@main:



; 17   : 		printf("%d\n", it->first);



	mov	eax, DWORD PTR [esi+12]

	push	eax

	push	OFFSET $SG-31

	call	_printf

	add	esp, 8

	cmp	BYTE PTR [esi+21], 0

	jne	SHORT $LN281@main



; 12   : 	}

; 13   : 

; 14   : 

; 15   : 	// Print post

; 16   : 	for(std::map<int, int>::iterator it = v.begin(); it != v.end(); it++) {



	mov	eax, DWORD PTR [esi+8]

	cmp	BYTE PTR [eax+21], 0

	jne	SHORT $LN274@main

	mov	esi, eax

	mov	eax, DWORD PTR [esi]

	cmp	BYTE PTR [eax+21], 0

	jne	SHORT $LN281@main

	npad	4

$LL124@main:

	mov	esi, eax

	mov	eax, DWORD PTR [esi]

	cmp	BYTE PTR [eax+21], 0

	je	SHORT $LL124@main

	jmp	SHORT $LN281@main

$LN274@main:

	mov	eax, DWORD PTR [esi+4]

	cmp	BYTE PTR [eax+21], 0

	jne	SHORT $LN107@main

$LL108@main:

	cmp	esi, DWORD PTR [eax+8]

	jne	SHORT $LN107@main

	mov	esi, eax

	mov	eax, DWORD PTR [eax+4]

	cmp	BYTE PTR [eax+21], 0

	je	SHORT $LL108@main

$LN107@main:

	mov	esi, eax

$LN281@main:

	mov	eax, DWORD PTR _v$[esp+84]

	cmp	esi, eax

	jne	SHORT $LL109@main

$LN4@main:



; 18   : 	}

; 19   : 	

; 20   : 	// Print pre

; 21   : 	for(std::map<int, int>::iterator it= v.begin(); it != v.end(); ++it) {



	mov	esi, DWORD PTR [eax]

	cmp	esi, eax

	je	SHORT $LN1@main

$LL181@main:



; 22   : 		printf("%d\n", it->first);



	mov	ecx, DWORD PTR [esi+12]

	push	ecx

	push	OFFSET $SG-32

	call	_printf

	add	esp, 8

	cmp	BYTE PTR [esi+21], 0

	jne	SHORT $LN284@main



; 18   : 	}

; 19   : 	

; 20   : 	// Print pre

; 21   : 	for(std::map<int, int>::iterator it= v.begin(); it != v.end(); ++it) {



	mov	eax, DWORD PTR [esi+8]

	cmp	BYTE PTR [eax+21], 0

	jne	SHORT $LN277@main

	mov	esi, eax

	mov	eax, DWORD PTR [esi]

	cmp	BYTE PTR [eax+21], 0

	jne	SHORT $LN284@main

	npad	1

$LL196@main:

	mov	esi, eax

	mov	eax, DWORD PTR [esi]

	cmp	BYTE PTR [eax+21], 0

	je	SHORT $LL196@main

	jmp	SHORT $LN284@main

$LN277@main:

	mov	eax, DWORD PTR [esi+4]

	cmp	BYTE PTR [eax+21], 0

	jne	SHORT $LN179@main

$LL180@main:

	cmp	esi, DWORD PTR [eax+8]

	jne	SHORT $LN179@main

	mov	esi, eax

	mov	eax, DWORD PTR [eax+4]

	cmp	BYTE PTR [eax+21], 0

	je	SHORT $LL180@main

$LN179@main:

	mov	esi, eax

$LN284@main:

	mov	eax, DWORD PTR _v$[esp+84]

	cmp	esi, eax

	jne	SHORT $LL181@main

$LN1@main:



; 23   : 	}


agottem

March 16, 2011 10:00 PM

/I"C:\Program Files (x86)\boost\boost_1_40" /Zi /nologo /W4 /WX- /Ox /Oi /Ot /Oy- /GL /D "_HAS_ITERATOR_DEBUGGING=0" /D "_SECURE_SCL=0" /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /Gm- /EHsc /MT /GS- /Gy /fp:precise /Zc:wchar_t /Zc:forScope /Fp"Release\Help.pch" /FAs /Fa"Release\" /Fo"Release\" /Fd"Release\vc100.pdb" /Gd /analyze- /errorReport:queue

Your command line is not set for 'Optimize for space'. From the "cl.exe /?" output:





/O1 minimize space
/O2 maximize speed
/Ob<n> inline expansion (default n=0)
/Od disable optimizations (default)
/Og enable global optimization
/Oi[-] enable intrinsic functions
/Os favor code space
/Ot favor code speed
/Ox maximum optimizations
/Oy[-] enable frame pointer omission

Set your compiler options correctly, and I'll bother to take a look. There's no reason we shouldn't see the same assembly output, as we're using the same compiler.

rip-off

11,000

March 16, 2011 10:35 PM

Well, I had already written both tests when I posted my command line; it might have had /Ot rather than /Os at the time. I think it's worth looking at what the compiler is doing.

I didn't realise there were two settings for size/speed. I was just changing the obvious looking value in the IDE configuration. I wonder what the expected result is with "/O1 /Ot" or "/O2 /Os".

You are using the "favour" option, rather than "minimise space" (O1), which generates some decent assembly for me (this time a command line build, cl /FAs /O1 /c help.cpp):



; 15   : 	// Print post

; 16   : 	for(std::map<int, int>::iterator it = v.begin(); it != v.end(); it++) {



	mov	ecx, DWORD PTR _v$[ebp+4]

	mov	eax, DWORD PTR [ecx]

	mov	DWORD PTR _it$31314[ebp], eax

	mov	esi, OFFSET ??_C@_03PMGGPEJJ@?$CFd?6?$AA@

	jmp	SHORT $LN139@main

$LL65@main:



; 17   : 		printf("%d\n", it->first);



	push	DWORD PTR [eax+12]

	push	esi

	call	_printf

	pop	ecx

	pop	ecx

	lea	ecx, DWORD PTR _it$31314[ebp]

	call	??E?$_Tree_unchecked_const_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$$CBHH@std@@@2@$0A@@std@@@std@@U_Iterator_base0@2@@std@@QAEAAV01@XZ ; std::_Tree_unchecked_const_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> >,std::_Iterator_base0>::operator++

	mov	eax, DWORD PTR _it$31314[ebp]

	mov	ecx, DWORD PTR _v$[ebp+4]

$LN139@main:



; 12   : 	}

; 13   : 

; 14   : 

; 15   : 	// Print post

; 16   : 	for(std::map<int, int>::iterator it = v.begin(); it != v.end(); it++) {



	cmp	eax, ecx

	jne	SHORT $LL65@main



; 18   : 	}

; 19   : 	

; 20   : 	// Print pre

; 21   : 	for(std::map<int, int>::iterator it= v.begin(); it != v.end(); ++it) {



	mov	eax, DWORD PTR [ecx]

	mov	DWORD PTR _it$31351[ebp], eax

	cmp	eax, ecx

	je	SHORT $LN1@main

$LL105@main:



; 22   : 		printf("%d\n", it->first);



	push	DWORD PTR [eax+12]

	push	esi

	call	_printf

	pop	ecx

	pop	ecx

	lea	ecx, DWORD PTR _it$31351[ebp]

	call	??E?$_Tree_unchecked_const_iterator@V?$_Tree_val@V?$_Tmap_traits@HHU?$less@H@std@@V?$allocator@U?$pair@$$CBHH@std@@@2@$0A@@std@@@std@@U_Iterator_base0@2@@std@@QAEAAV01@XZ ; std::_Tree_unchecked_const_iterator<std::_Tree_val<std::_Tmap_traits<int,int,std::less<int>,std::allocator<std::pair<int const ,int> >,0> >,std::_Iterator_base0>::operator++

	mov	eax, DWORD PTR _it$31351[ebp]

	cmp	eax, DWORD PTR _v$[ebp+4]

	jne	SHORT $LL105@main

$LN1@main:



; 23   : 	}

Again the same pattern as before: the compiler chooses to jump to the "end" of the loop for the first loop, and chooses to do an extra test for the second, but the loop bodies are mostly identical, with the same caveats as above.

In any case I don't particularly care what the compiler does when minimising size - we are talking about efficiency, which clearly means maximising speed.

And as for "Set your compiler options correctly", the assembly you posted clearly isn't optimised at all. The code is calling end() and a non-inlined operator!= every iteration.
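If a build really does call `end()` on every iteration, one common workaround is to evaluate it once before the loop. The sketch below is a hedged illustration with made-up function names. It is only equivalent because the loop body does not modify the map, and an optimising build with inlining may well produce the same code for both.

```cpp
#include <cassert>
#include <map>

// Counts the entries, calling m.end() in the loop condition each time.
int count_calling_end(const std::map<int, int>& m) {
    int n = 0;
    for (std::map<int, int>::const_iterator it = m.begin(); it != m.end(); ++it)
        ++n;
    return n;
}

// Same loop, but end() is evaluated once and reused.
// Safe here because nothing inside the loop changes the map.
int count_cached_end(const std::map<int, int>& m) {
    int n = 0;
    const std::map<int, int>::const_iterator last = m.end();
    for (std::map<int, int>::const_iterator it = m.begin(); it != last; ++it)
        ++n;
    return n;
}
```

Whether the cached version is measurably faster is something to check in the generated assembly or with a profiler rather than assume.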



This topic is closed to new replies.
