Jump to content
43oh

Compiler optimization traps for the unaware


Recommended Posts

Here's a very interesting presentation about how modern compiler optimization may lead to unexpected results. This goes way beyond the failure of naive delay loops.
http://www.eng.utah.edu/~cs5785/slides-f10/Dangerous+Optimizations.pdf
 
If you ever relied on buffer indices wrapping around (integer overflow), this is a must read. There are many other scenarios discussed.
 
For example I'm pretty sure I fell for this trap myself:

volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
  for (size_t i = 0; i < BUF_SIZE; i++)
    buffer[i] = 0;
  buffer_ready = 1;
}

It probably works today. But it's a bug waiting to happen when I recompile with different optimization settings or a different compiler. (hint: buffer_ready=1 may be moved before the for loop because the loop does not affect any volatile location).
 

Link to post
Share on other sites

Here's a very interesting presentation about how modern compiler optimization may lead to unexpected results. This goes way beyond the failure of naive delay loops.

http://www.eng.utah.edu/~cs5785/slides-f10/Dangerous+Optimizations.pdf

 

If you ever relied on buffer indices wrapping around (integer overflow), this is a must read. There are many other scenarios discussed.

 

For example I'm pretty sure I fell for this trap myself:

volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
  for (size_t i = 0; i < BUF_SIZE; i++)
    buffer[i] = 0;
  buffer_ready = 1;
}

It probably works today. But it's a bug waiting to happen when I recompile with different optimization settings or a different compiler. (hint: buffer_ready=1 may be moved before the for loop because the loop does not affect any volatile location).

 

Lol reading this presentation is almost nauseating!

As a side note, one thing about the "Go" programming language I found amusing is how they intentionally break a few undefined behaviors, such as:

myMap := make(map[string]int)
myMap["zero"] = 0
myMap["one"] = 1
myMap["two"] = 2
myMap["three"] = 3
myMap["four"] = 4
myMap["five"] = 5

for k, v := range myMap {
    fmt.Printf("Value: %d\n", v);
}

This code will not produce a neat order of 0, 1, 2, 3 .... It will be shuffled/randomized.  The Go designers did this intentionally to break programmers' code when they try to rely on assumptions based on consistent undefined behaviors.  As a result you are forced to convert those sorts of things into a list and sort them or whatever ... something better-defined anyhow.

Link to post
Share on other sites

This code will not produce a neat order of 0, 1, 2, 3 .... It will be shuffled/randomized.  The Go designers did this intentionally to break programmers' code when they try to rely on assumptions based on consistent undefined behaviors.  As a result you are forced to convert those sorts of things into a list and sort them or whatever ... something better-defined anyhow.

Remember Pascal? Originally, no short circuit conditionals, among other things intended to force `good practice'. Most of these things were (incompatibly) rectified by various compiler designers. Even the  UCSD line eventually gave switches for some of these things by the early 1980's. Java still does the same thing, in some ways (no overloading operators, for example, makes writing code to do non-triveal math a bit of a suck) Hmmm... I wonder if I still have a set of UCSD p-code floppys around?

Link to post
Share on other sites

myMap := make(map[string]int)
myMap["zero"] = 0
myMap["one"] = 1
myMap["two"] = 2
myMap["three"] = 3
myMap["four"] = 4
myMap["five"] = 5

for k, v := range myMap {
    fmt.Printf("Value: %d\n", v);
}

This code will not produce a neat order of 0, 1, 2, 3 .... It will be shuffled/randomized.  The Go designers did this intentionally to break programmers' code when they try to rely on assumptions based on consistent undefined behaviors.  As a result you are forced to convert those sorts of things into a list and sort them or whatever ... something better-defined anyhow.

 

Moronic . . . it's like python all over again with our language "parents" enforcing good behavior . . . But good, at least python has company on my

never go to shelf" as a language ;)

Link to post
Share on other sites

Oh and I did not read that fully:

break programmers' code when they try to rely on assumptions based on consistent undefined behaviors

 

 

Morons  to the power of two. Iterating through values from 0 to n ( in order ) is considered "undefined behavior" in go ? Yeah wow Nova ( No va) for me ;)

Link to post
Share on other sites

Oh and I did not read that fully:

break programmers' code when they try to rely on assumptions based on consistent undefined behaviors

Morons  to the power of two. Iterating through values from 0 to n is considered "undefined behavior" in go ? Yeah wow Nova ( No va) for me ;)

I personally find it a good thing, in that this is creating a programming environment where the likelihood of mental blindspot-like errors is low.  Combining Go with an IDE that actively syntax-checks, like Visual Studio Code (the javascript side project of the M$ VS team), I've found that I can write Go code that usually works pretty much perfectly as expected ... sometimes on the first compile.  I'm quite pleased with what Google and the Go team are doing here.  Alas, to each his own ;)

Link to post
Share on other sites

I personally find it a good thing, in that this is creating a programming environment where the likelihood of mental blindspot-like errors is low.  Combining Go with an IDE that actively syntax-checks, like Visual Studio Code (the javascript side project of the M$ VS team), I've found that I can write Go code that usually works pretty much perfectly as expected ... sometimes on the first compile.  I'm quite pleased with what Google and the Go team are doing here.  Alas, to each his own ;)

Yeah, you and I do not always see eye to eye on topics such as this. But what bothers me so much about situations like this is that people start second guessing other people, and then they tend to lose focus on what really matters. Like language features. In this particular case, I expect a loop to iterate in the order that makes sense ( 0, 1, 2 . . . unless step x), and I also expect a number key, to somehow programmatically return it's dictionary value. So if the implementers of GO broke a part of their language on purpose that keeps that from happening. I'm going to start second guessing that they as language implementers SUCK for having too much time on their hands . . . ;)

 

That last little bit was tongue in cheek about the implementers of GO "being SUCK". But I still do think if I have it correct above - that what they did was extremely silly.

 

As far as on topic goes . . . I usually do not optimize all that much, and I always view the more extreme optimization options as  . . . "off limits".

Link to post
Share on other sites

Yeah, you and I do not always see eye to eye on topics such as this. But what bothers me so much about situations like this is that people start second guessing other people, and then they tend to lose focus on what really matters. Like language features. In this particular case, I expect a loop to iterate in the order that makes sense ( 0, 1, 2 . . . unless step x), and I also expect a number key, to somehow programmatically return it's dictionary value. So if the implementers of GO broke a part of their language on purpose that keeps that from happening. I'm going to start second guessing that they are language implementers SUCK for having too much time on their hands . . . ;)

 

That last little bit was tongue in cheek about the implementers of GO "being SUCK". But I still do think if I have it correct above - that what they did was extremely silly.

 

As far as on topic goes . . . I usually do not optimize all that much, and I always view the more extreme optimization options as  . . . "off limits".

 

Well yeah one fun thing about the example I gave is that I specifically assigned the entries in a logical order ... but the point here is, that iteration doesn't necessarily make sense in the case of a map because things can be deleted and re-added arbitrarily and the order can be a bit "undefined" and difficult to ascertain.

 

E.g.:

myMap := make(map[string]int)
myMap["ten"] = 10
myMap["nine"] = 9
myMap["zero"] = 0
myMap["one"] = 1

for k, v := range myMap {
    // One might "assume" to see 10, 9, 0, then 1 here
    fmt.Printf("Value: %d\n", v)
}

myMap["three"] = 3

for k, v := range myMap {
    // But what about this?
    fmt.Printf("Value: %d\n", v)
}

myOtherUnrelatedMap := make(map[string]string)
myOtherUnrelatedMap["foo"] = "bar"
myOtherUnrelatedMap["bleh"] = "this sucks"
myOtherUnrelatedMap["america"] = "fu** YEAH"

for k, v := range myOtherUnrelatedMap {
    // Just what kind of order do you expect here?
    fmt.Printf("Value: %s\n", v)
}

It looked logical in my first example that the for ... range ... fmt.Printf should print stuff in some kind of numerical order, but a map isn't meant to be an ordered list.  It's an unordered hash map.  So Go destroys anyone's temptation to assume it an ordered list by making CERTAIN it's unordered - using randomization.  The idea here is to avoid programmers from "assuming" the map will get iterated in a consistent logical order (and structuring some small but potentially critical part of their code logic around this assumption) and then go off and do something stupid with it that makes their assumed "order" incorrect but they don't realize it.  By ensuring iteration through a map is random, they can't fall into that trap.

 

Alas, I get your position on this one.

Link to post
Share on other sites

Well yeah one fun thing about the example I gave is that I specifically assigned the entries in a logical order ... but the point here is, that iteration doesn't necessarily make sense in the case of a map because things can be deleted and re-added arbitrarily and the order can be a bit "undefined" and difficult to ascertain.

 

E.g.:

myMap := make(map[string]int)
myMap["ten"] = 10
myMap["nine"] = 9
myMap["zero"] = 0
myMap["one"] = 1

for k, v := range myMap {
    // One might "assume" to see 10, 9, 0, then 1 here
    fmt.Printf("Value: %d\n", v)
}

myMap["three"] = 3

for k, v := range myMap {
    // But what about this?
    fmt.Printf("Value: %d\n", v)
}

myOtherUnrelatedMap := make(map[string]string)
myOtherUnrelatedMap["foo"] = "bar"
myOtherUnrelatedMap["bleh"] = "this sucks"
myOtherUnrelatedMap["america"] = "fu** YEAH"

for k, v := range myOtherUnrelatedMap {
    // Just what kind of order do you expect here?
    fmt.Printf("Value: %s\n", v)
}

It looked logical in my first example that the for ... range ... fmt.Printf should print stuff in some kind of numerical order, but a map isn't meant to be an ordered list.  It's an unordered hash map.  So Go destroys anyone's temptation to assume it an ordered list by making CERTAIN it's unordered - using randomization.  The idea here is to avoid programmers from "assuming" the map will get iterated in a consistent logical order (and structuring some small but potentially critical part of their code logic around this assumption) and then go off and do something stupid with it that makes their assumed "order" incorrect but they don't realize it.  By ensuring iteration through a map is random, they can't fall into that trap.

 

Alas, I get your position on this one.

 

Well another consideration, and perhaps you've already thought of this too is this: What kind of performance implications are incurred because of this ? But . . . GO being interpreted . . . I'm not going to expect it to be super fast anyway,

 

But anyhow. This is part of what makes up a good programmer right ? Knowing the limits of the languages we use, *OR* as in my case, I code defensively in many cases. So like in @@chicken's example above. I try my best in many case, unless I do not care about the code ( test code etc ), not to let values auto-overflow. You can see proof of that in the WDT code ive been sharing lately. The last several code posts keep at least one variable ( volatile unsigned int ) from overflowing.

Link to post
Share on other sites

Here's a very interesting presentation about how modern compiler optimization may lead to unexpected results. This goes way beyond the failure of naive delay loops.

http://www.eng.utah.edu/~cs5785/slides-f10/Dangerous+Optimizations.pdf

 

If you ever relied on buffer indices wrapping around (integer overflow), this is a must read. There are many other scenarios discussed.

 

For example I'm pretty sure I fell for this trap myself:

volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
  for (size_t i = 0; i < BUF_SIZE; i++)
    buffer[i] = 0;
  buffer_ready = 1;
}

It probably works today. But it's a bug waiting to happen when I recompile with different optimization settings or a different compiler. (hint: buffer_ready=1 may be moved before the for loop because the loop does not affect any volatile location).

 

Ok so I'll bite. For the scope of the loop you're showing above, with exactly the code you have. I do not think it matters if buffer_ready = 1; is inside, or outside ( above ) the for loop. For the sake of scope . . .right ?

Link to post
Share on other sites

@@yyrkoon if there's an interrupt using that flag to determine whether it's save to use the buffer you may be in trouble.. occasionally.

 

While not advisable on real mutlithreaded OS'es, using a volatile flag for "interprocess" communication is pretty common practice on MCUs.

I'd say it is required if that variable is ever touched, or looked at by an interrupt. But I've had to use volatile in Linux a few times. No interrupts involved. I'd have ot find the code to remember exactly what I was doing at the time . . .

Link to post
Share on other sites

Well another consideration, and perhaps you've already thought of this too is this: What kind of performance implications are incurred because of this ? But . . . GO being interpreted . . . I'm not going to expect it to be super fast anyway,

 

But anyhow. This is part of what makes up a good programmer right ? Knowing the limits of the languages we use, *OR* as in my case, I code defensively in many cases. So like in @@chicken's example above. I try my best in many case, unless I do not care about the code ( test code etc ), not to let values auto-overflow. You can see proof of that in the WDT code ive been sharing lately. The last several code posts keep at least one variable ( volatile unsigned int ) from overflowing.

Fwiw Go is actually compiled.  With a bit of a binary runtime that "tags" along and manages the threading subsystem + garbage collection... but the Go compiler actually has its own assembler & linker and circumvents the use of any "C" code whatsoever.  The compiler used to be built as a C program for quite a while but around 1-2 major releases back, they ditched the C portion and it's now written entirely in Go from what I read

 

Mind you, Go is written by Rob Pike (think Plan9), Ken Thompson (think UNIX) formerly of Bell Labs... and their team of folks at Google along with other input/help/contributed libraries from the open source community at large.  The book that introduced me to the language, "The Go Programming Language", was coauthored by Brian Kernighan (also of Bell Labs and the 'k' in the AWK utility under Unix).  These guys pretty much know what they're doing.  Anyway enough threadjacking :)

Link to post
Share on other sites
myMap := make(map[string]int)
myMap["zero"] = 0
myMap["one"] = 1
myMap["two"] = 2
myMap["three"] = 3
myMap["four"] = 4
myMap["five"] = 5

for k, v := range myMap {
    fmt.Printf("Value: %d\n", v);
}
This code will not produce a neat order of 0, 1, 2, 3 .... It will be shuffled/randomized.  The Go designers did this intentionally to break programmers' code when they try to rely on assumptions based on consistent undefined behaviors.  As a result you are forced to convert those sorts of things into a list and sort them or whatever ... something better-defined anyhow.

 

 

If anything I'd have expected: 5, 4, 1, 3, 2, 0 (sorted by key rather than value or insertion order). That's what you'd get out of a std::map in C++. The C++ standard library designers originally avoided the issue of unspecified map iteration order by only permitting ordered maps, and hash-based maps were not officially included until C++11.

 

In terms of performance, it looks like the randomisation in go is a random start offset rather than a complete shuffling of the iteration order. That's just enough to prevent anyone from relying on the order from one iteration to the next.

 

They might also have some random input into the hash function which would jumble the ordering between program runs. Apparently that can also be implemented to defeat hash collision DoS attacks when user-provided values are used as map keys.

Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...