Default values instead of error handling

Take a look at the following declaration.

Although this isn’t particularly interesting, I have come across this little thing. This declaration is from Unity’s PlayerPrefs class, where GetFloat takes a key and a default value, and it is an interesting way of solving a problem. The problem is “What should this thing do if the key doesn’t exist in the player prefs?” While attempting to write something very similar, I thought I could return an error code, which means the output becomes a parameter. Bleh. I could throw an exception, which is valid too, but also kinda bleh. Here, even though failing to retrieve a float is an exceptional case, it’s handled elegantly because you don’t need to check an error code or write exception-safe boilerplate code. You specify what the value should be in case it doesn’t exist, which is a case that might well be expected. In other words, you handle the exceptional case by configuring it: you tell the function what you want it to do if it fails.

Since this method can only fail in one way as far as the user is concerned (doesn’t exist, can’t get it, etc.), one parameter suffices. If it could fail in multiple ways, this could turn into something terrible, as every extra parameter would tell the function what to do on a different failure. Perhaps something better, in that case, might be to pass in a “configure” struct. But that’s kinda bleh too.

This actually has an issue that can cause some pain though. When I was using Unity a long time ago, I remember having a bug where I thought the value was coming from the player prefs, but really it was just the default value being returned. The GetFloat and SetFloat calls had a slight typo, so the keys were not identical. You can imagine the frustration. Then again, this is roughly equivalent to not checking the returned error code if the declaration had been specified that way.
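Here is a minimal C++ sketch of the same idea, along with the typo pitfall described above. The `Prefs` class and its method names are hypothetical stand-ins, not Unity's implementation; only the "key plus default value" shape is taken from the declaration being discussed.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical in-memory preference store (an assumption for illustration).
class Prefs
{
public:
    void setFloat(const std::string& key, const float value)
    {
        values_[key] = value;
    }

    // Returns the stored value, or defaultValue when the key is missing.
    float getFloat(const std::string& key, const float defaultValue) const
    {
        const auto it = values_.find(key);
        return (it != values_.end()) ? it->second : defaultValue;
    }

    // Lets the caller tell "stored" apart from "fell back to the default".
    bool hasKey(const std::string& key) const
    {
        return values_.count(key) != 0;
    }

private:
    std::unordered_map<std::string, float> values_;
};
```

With this sketch, `prefs.getFloat("volmue", 0.5f)` quietly returns 0.5f even though "volume" was stored, which is exactly the kind of bug that is hard to spot. A `hasKey`-style check is one way to catch it while debugging.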

Anyways, the point is that sometimes an elegant simple solution is also a clever one.

Smart pointer general guidelines

Modern pointer guidelines for C++ are confusing, so I wanted to gather a list of general guidelines for pointer use. I’m sure I’m missing some that could be added.

Most of these were taken from GotW #91.

  1. Don’t pass a smart pointer as a function parameter unless you want to use or manipulate the smart pointer itself, such as to share or transfer ownership.
  2. Prefer passing objects by value, *, or &, not by smart pointer.
  3. Pass by * or & to accept a widget independently of how the caller is managing its lifetime.
  4. Use a * if you need to express null (no widget), otherwise prefer to use a &; and if the object is input-only, write const widget* or const widget&.
  5. For class members declared in a header, prefer a pointer almost all of the time, since a pointer only needs a forward declaration. Prefer using the pimpl idiom.
  6. You can put a reference in a class header, but it must be set at construction time, passed into the constructor. (dependency injection)
  7. Prefer unique_ptr to shared_ptr
  8. Express a “sink” function using a by-value unique_ptr parameter.
  9. Use a non-const unique_ptr& parameter only to modify the unique_ptr.
  10. Don’t use a const unique_ptr& as a parameter; use widget* instead.
  11. If it can be null, pass widget*, if it can’t be null, use &.
  12. Express that a function will store and share ownership of a heap object using a by-value shared_ptr parameter.
    1. Don’t normally use this because it restricts the lifetime of the caller.
    2. This is used to modify the shared_ptr itself.
    3. void f( shared_ptr<widget>& );
  13. Use a non-const shared_ptr& parameter only to modify the shared_ptr.
  14. Use a const shared_ptr& as a parameter only if you’re not sure whether or not you’ll take a copy and share ownership; otherwise use widget* instead (or if not nullable, a widget&).
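The signatures below sketch how several of these guidelines might look in practice. Every name here (`widget`, `owner`, `cache`, `replace`, and so on) is a hypothetical example, not code from GotW #91.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct widget
{
    int value = 0;
};

// Input-only and never null: const reference.
int read(const widget& w)
{
    return w.value;
}

// Optional input: raw pointer, where nullptr means "no widget".
int readOrZero(const widget* w)
{
    return w ? w->value : 0;
}

// Sink: takes ownership, so the unique_ptr is passed by value.
class owner
{
public:
    void adopt(std::unique_ptr<widget> w)
    {
        owned_.push_back(std::move(w));
    }

    std::size_t count() const { return owned_.size(); }

private:
    std::vector<std::unique_ptr<widget>> owned_;
};

// Stores and shares ownership: shared_ptr passed by value.
class cache
{
public:
    void remember(std::shared_ptr<widget> w)
    {
        last_ = std::move(w);
    }

private:
    std::shared_ptr<widget> last_;
};

// Modifies the smart pointer itself: non-const unique_ptr&.
void replace(std::unique_ptr<widget>& w, const int value)
{
    w = std::make_unique<widget>();
    w->value = value;
}
```

A caller that owns a `std::unique_ptr<widget>` would call `read(*w)` or `readOrZero(w.get())` for plain access, and `adopt(std::move(w))` only when it really means to give the widget away.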

References:
https://herbsutter.com/2013/06/05/gotw-91-solution-smart-pointer-parameters

Singletons with controlled destruction

I’m not going to debate about why one should or should not use singletons. I’m only going to provide an improvement for the singleton. One of the problems with singletons is that they use a static object. The problem with static objects is that their destruction order on application exit is hard to control, especially when they live in different translation units. If you have more than one singleton, and have dependencies between them, you want to control destruction of the singleton on application exit.

The differences between this and the Super Simple Singleton I previously posted about a couple of years ago are:

  1. The instance is a static member of the class, not a static local inside a function.
  2. There is a destroy() function

The destroy function allows for controlled destruction of the singleton, which is a million times better. When your application exits, and you have more than one singleton and they depend on each other, who knows which one will get destroyed first. This gives you a lot more control over it.
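A minimal sketch of such a singleton follows. The class name `AudioSystem` and the `exists()` helper are assumptions added for illustration, and the sketch is not thread-safe.

```cpp
class AudioSystem
{
public:
    // Creates the instance on first use.
    static AudioSystem& instance()
    {
        if (!instance_)
        {
            instance_ = new AudioSystem();
        }
        return *instance_;
    }

    // Explicit teardown, called in whatever order the application needs.
    static void destroy()
    {
        delete instance_;
        instance_ = nullptr;
    }

    static bool exists() { return instance_ != nullptr; }

private:
    AudioSystem() = default;
    ~AudioSystem() = default;
    AudioSystem(const AudioSystem&) = delete;
    AudioSystem& operator=(const AudioSystem&) = delete;

    static AudioSystem* instance_;  // class member, not a function-local static
};

// In a real project this definition lives in exactly one .cpp file.
AudioSystem* AudioSystem::instance_ = nullptr;
```

At shutdown the application would call `destroy()` on each singleton in dependency order, dependents first.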

C++ Best Practices

I love best practices. It’s like a jump start or a synopsis of all the things you should do without reading through all the heavy books.

In and out params

For functions, it’s handy to be able to identify what’s an “in” parameter, versus what’s an “out” parameter. I just got done reading an article about the Doom3 engine source code and there was something that I liked.

  1. From this signature you can identify what is an “in” and what is an “out”.
  2. The return type is obviously an “out”.
  3. The “const int val” identifies this as an “in”, even though it’s just an int.
  4. const Data& d1 is an “in”.
  5. For d2, it’s an out, and this is obvious because it takes a double pointer.
  6. For d3, this is also obviously an out. You want the function to fill something out.
  7. All out parameters use pointers, all in parameters use const or const refs.
  8. Never use a parameter as both an “in” and an “out”.

For #6, when you call this function look at what it looks like.

You can tell by looking at how it’s called what is an input and what is an output. If it has an &, you know it’s an out. Except for the return value, obviously. You also have a guarantee that counter and data won’t be modified by the call, and the const means they can’t be modified inside the function either. The function could copy the value and change the copy as a separate variable on the stack, but that requires extra work. This is all about simple guarantees. I really like this approach, and I think I will adopt it for my standards.
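The convention described above can be sketched as follows. The function `Process` and the `Data` struct are hypothetical; only the parameter shapes (const value, const reference, pointer, double pointer) follow the list.

```cpp
struct Data
{
    int value = 0;
};

// Hypothetical function; only the parameter conventions matter here.
// in:  val, d1         (const value, const reference)
// out: d2, d3, return  (pointers the function writes through)
bool Process(const int val, const Data& d1, Data** d2, Data* d3)
{
    if (d2 == nullptr || d3 == nullptr)
    {
        return false;
    }

    static Data shared;          // storage that *d2 is pointed at
    shared.value = val;
    *d2 = &shared;               // out: hands back a pointer
    d3->value = d1.value + val;  // out: fills in the caller's object
    return true;
}
```

A call then reads `Process(counter, data, &found, &result)`, where the two `&` arguments stand out as the outputs.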

Comments – 01

I think APIs should be mostly self-documenting, so I prefer to not add any comments unless they somehow help. Obviously, this doesn’t help if the user wants doxygen documentation. If you do add comments, they should be in the .cpp file only. The reason for this is that it keeps the .h file nice, neat, and compact to read. Otherwise, your header files are massively huge and difficult to digest. But if you do need comments, make sure they’re doxygen style.

So in the cpp file, prefer to have something like this:

There’s no point in adding comments to this function because it should be obvious. But I prefer to have the divider for readability. If you need comments, add them doxygen style, like this:
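As a rough sketch of both cases in one .cpp file: a bare divider above an obvious function, and a doxygen-style comment above one that needs explaining. The functions `Add` and `Clamp` are made-up examples.

```cpp
//------------------------------------------------------------------------------
int Add(const int a, const int b)
{
    return a + b;
}

//------------------------------------------------------------------------------
/// @brief  Clamps a value to an inclusive range.
/// @param  value  The value to clamp.
/// @param  low    Lower bound of the range.
/// @param  high   Upper bound of the range; assumed to be >= low.
/// @return value limited to [low, high].
int Clamp(const int value, const int low, const int high)
{
    if (value < low)  { return low; }
    if (value > high) { return high; }
    return value;
}
```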

This of course causes problems. If all of your function comments are in the cpp, you need the cpp to read the documentation. This is okay for me, but not okay if someone else uses the library precompiled. But then again, you’re probably going to have to compile the sources yourself anyways since there’s no guarantee the ABI won’t change between versions of the tool chain.

This is a catch 22. You must build doxygen comments in order to see the documentation. Or ship the source code. I really don’t care because I don’t intend on publishing any of my code… Right now anyways. But I’m not sure if it would make my life easier just putting them all in the .h file.

Coding Standards

One thing that you can do to help you code better is to see how other people code. In particular, companies. Just reading their coding standards gives you an idea about why they code that way. It may be a little boring, but sometimes it helps and it keeps the code clean. Other times, it’s just style consistency.

For example, I like to write my function methods like this:

The .h file:

The .cpp file:

The reason I like to have my parameters, one on each line is because, if you need to remove or add a parameter, it makes it a little bit easier to identify where in the list it needs to go, or be removed.
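A sketch of that layout, with the header and source parts shown in a single listing. The `Logger` class and its `Format` method are hypothetical.

```cpp
#include <string>

// --- The .h file ---
class Logger
{
public:
    std::string Format(
        const std::string& category,
        const std::string& message,
        const int          severity ) const;
};

// --- The .cpp file ---
std::string Logger::Format(
    const std::string& category,
    const std::string& message,
    const int          severity ) const
{
    return "[" + category + ":" + std::to_string(severity) + "] " + message;
}
```

Adding a fourth parameter means adding one line in each place, and a diff shows exactly that one line.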

Anyways, here are some coding standards you may want to browse:

Insomniac Games coding standards
Google style guide
CoreLinux++ Coding Standards
General Linux C++ Coding Standards

Super Simple Singleton

Here’s a singleton that is pretty clean. No references lying around in the class. There’s nothing special about it: it’s simple and clutter free, though it’s not templated or anything like that.
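One common way to get this shape is a function-local static. The sketch below assumes that is the approach meant here; the `Settings` class and its `volume` member are made up.

```cpp
class Settings
{
public:
    static Settings& instance()
    {
        static Settings theInstance;  // constructed on the first call
        return theInstance;
    }

    int volume = 5;

private:
    Settings() = default;
    Settings(const Settings&) = delete;
    Settings& operator=(const Settings&) = delete;
};
```

Usage is just `Settings::instance().volume = 9;` from anywhere in the program.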

Only check parameters on public interface methods

This was not obvious but should make some sense. Of course, it can break if you have “friends” or some other weirdness going on in your code. Let’s say that you have a class like this:

If I create an instance of this class, I have a little nice and neat package. The only way to access the data or modify the data in the class is through the public interface, like such:

As long as you verify the data coming into these public function methods that are exposed, the state of the object should be valid. You don’t need to do parameter checking on the internal function methods.

Of course, you may want to do checking on the internal methods simply because you don’t trust your calculations, but in general, the state of the object should always be sane. But you don’t know what’s going to enter the public interface, so you should always check that. Alternatively, you can use “if” statements and “throw” while validating the contract on the public interfaces, but maybe just use “assert”s on the private methods to make sure your code is internally stable while developing. When unit testing, of course, you probably only test the interface.
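A small sketch of that split: the public method validates and throws, while the private helper only asserts. The `Inventory` class is a hypothetical example.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

class Inventory
{
public:
    // Public interface: callers are untrusted, so validate and throw.
    void add(const int itemId, const int quantity)
    {
        if (itemId < 0)    { throw std::invalid_argument("itemId must be >= 0"); }
        if (quantity <= 0) { throw std::invalid_argument("quantity must be > 0"); }
        store(itemId, quantity);
    }

    int total() const { return total_; }

private:
    // Internal helper: inputs were validated by the public method,
    // so an assert is enough to document the assumption during development.
    void store(const int itemId, const int quantity)
    {
        assert(itemId >= 0 && quantity > 0);
        items_.push_back(itemId);
        total_ += quantity;
    }

    std::vector<int> items_;
    int total_ = 0;
};
```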

Header file neatness

One thing that I like to do in a class, is to break up the function methods and the data. Like this:

I separate the private data section from the private function method section. This groups all of the data together. I think it looks much cleaner this way personally.
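Something along these lines, where the `Player` class is only an illustration of the layout:

```cpp
class Player
{
public:
    void TakeDamage(const int amount) { health_ = ClampToZero(health_ - amount); }
    int  Health() const { return health_; }

private:    // function methods
    static int ClampToZero(const int value) { return value < 0 ? 0 : value; }

private:    // data
    int health_ = 100;
};
```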

Using namespace

When learning C++, frequently people will tell you to use “using namespace std” or some other namespace like “boost”. What they don’t tell you is that you’re never supposed to use it in a header file. You’re only supposed to use it in cpp files. If you use it in a header file, everything included after that directive will be subject to it. It could lead to name clashes and other issues. Also, if someone else put it in a header, you may unintentionally be using that directive without even knowing it, which can cause serious headaches.
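To make the split concrete, here is a sketch with both files shown in one listing. The file names and the `SplitWords` function are hypothetical.

```cpp
#include <string>
#include <vector>

// --- names.h: no using-directive, every name is fully qualified. ---
std::vector<std::string> SplitWords(const std::string& text);

// --- names.cpp: the directive stays local to this one file. ---
using namespace std;

vector<string> SplitWords(const string& text)
{
    vector<string> words;
    string current;
    for (const char c : text)
    {
        if (c == ' ')
        {
            if (!current.empty())
            {
                words.push_back(current);
                current.clear();
            }
        }
        else
        {
            current += c;
        }
    }
    if (!current.empty())
    {
        words.push_back(current);
    }
    return words;
}
```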

Where to put doxygen comments

I’m awesome
This is an “ah ha!” moment for me. For some reason, I felt that it was better to place all of my function method comments inside the cpp files instead of the header file. I felt it was cleaner, and it looked nicer because the header file was nice and compact and formatted neatly. You could glance at the class and immediately determine what public function methods you were supposed to use.

I am no longer awesome like I thought
Well, until I realized that, after you compile the library and distribute it with the header file, there are zero comments for the end user on how to use the API, unless you’ve built the doxygen comments already and also distributed it. This makes it a hassle for the developer who now has to go onto a website or open a .chm file for documentation. It’s rather clumsy.

There is a nasty trade off for this though. If you put all the comments in the header file, the user may now have to scroll around just to find the public API. It may not be nice and neat anymore and it feels cluttered to me. Maybe you can use visual studio’s “#region” stuff to help though.

Summary
Put doxy comments inside the header file for the end user, not the cpp file, where no one will be looking and which they may not even have access to. Alternatively, put the comments in the cpp file, but make sure you’ve built the doxygen documentation for the API. I say the header is better because the developer won’t have to jump out to a website to get documentation. Another reason is that you can mouse over a function in Visual Studio and it gives you a synopsis of it in the tool tip. Yay.

Keeping track of constants

This is a simple little trick that I learned while I was at Game Circus. Basically, the goal is to never have any string constants or number literals littered throughout your code. Instead, you want to make one constant and place it at the top of the file. This is usually an obvious idiom to most programmers, but you may not realize how important it is.

The reason you want to do this is because if you used, for example, the value 0.004f in a few places within your cpp file, but you want to change that value, you’ll have to change all the places where you used it. It’s easy to forget one or two though, which is the problem.

So the idea is to group them all together at the top of the cpp file and make them static and const. It’s much easier to do this if you have primitive data types versus classes. It’s also better to have the value stored in the cpp file. If you store it in the header file, every time you need to change it, it has to recompile all the cpp files that include that header. So the compile time is actually faster if you keep them defined in the .cpp file like this.

The reason it’s good to make the value static is because it will only make one copy of it for that class. You obviously want it to be const because that will guarantee that it won’t change.

Header file.

Cpp file.
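A sketch of the two files, shown together in one listing. The `Enemy` class and the constant names are made up; the value 0.004f is the one from the example above.

```cpp
#include <string>

// --- Enemy.h: the constants are only declared here. ---
class Enemy
{
public:
    static const float       kTurnRate;
    static const std::string kDefaultName;

    float TurnAmount(const float seconds) const;
};

// --- Enemy.cpp: the values live here, so changing one rebuilds one file. ---
const float       Enemy::kTurnRate    = 0.004f;
const std::string Enemy::kDefaultName = "grunt";

float Enemy::TurnAmount(const float seconds) const
{
    return kTurnRate * seconds;
}
```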

Of course, it’s up to you to change things around. For example, you may want to make them private instead of public. You can move them outside of the class too, or just within the namespace, but this will pollute the namespace, unless that’s your intent.

As a side note with regards to localization: Ideally, you would put all of your strings in something like a spreadsheet. One column for “English”, one column for “Spanish”, etc. Then you simply implement a “fetch” function that will automatically grab the correct string based on a language setting.

Easy initializer list editing

I’m usually into keeping my code nice and neat, but sometimes neat code is also difficult to edit. This tip is not like that.

Setting up your initializer lists this way not only makes them nice and tidy, they’re easy to edit too. When you add or remove a member variable,  you add or remove the entire line. No shuffling commas about or re-straightening up. The other reason I like this is because if you have the following:

It’s difficult to tell where the parameters end and the initializer starts. In my opinion, this isn’t any better:

It still looks like a function with 6 parameters. It’s just more difficult to read.

But I think this is substantially easier to read:

You can immediately identify what is the initializer list by the ‘:’ and the ‘,’ at the beginning of the lines.
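The preferred layout, sketched with a hypothetical `Monster` class:

```cpp
#include <string>

class Monster
{
public:
    Monster(
        const std::string& name,
        const int          health,
        const float        speed )
        : name_( name )
        , health_( health )
        , speed_( speed )
        , alive_( true )
    {
    }

    const std::string& Name() const { return name_; }
    int   Health() const { return health_; }
    float Speed() const { return speed_; }
    bool  Alive() const { return alive_; }

private:
    std::string name_;
    int         health_;
    float       speed_;
    bool        alive_;
};
```

Removing `speed_` later means deleting one whole line from the list, with no commas to fix up on the neighbouring lines.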

assert( yourCodeCorrectly );

Everyone thinks asserts will somehow help you or somehow make your life easier or better. Sometimes they do, but sometimes n00bs don’t know how to use them properly. Here is an example of just such a case:

We stopped it from crashing on the bad access, but it still asserts and exits the app, which is equivalent to a crash because the app stopped working. Oh yeah, the assert only exists in debug mode, so if we’re running release, the statement gets stripped out and the bad access crashes anyway. So whether we are in release mode or debug mode, the app will terminate in some way, and we have done nothing to solve the problem, which is that the app exits unexpectedly and the user will not run it anymore, even if you fix it in the future. You have just proven to the end user that you are indeed incompetent, so anything you produce in the future will probably go unnoticed.

This isn’t the real problem though. The real problem is when you use asserts all over the place without thinking, assuming they’ll protect your product somehow. Then, when you go to release: crash here, crash there, you get the idea. Asserts shouldn’t be used as a check right before a possible crash; they’re for checking that your logic is indeed correct.

For example, you may assert that a return value is correct (the function works correctly), or a loop invariant does not change, but do not use asserts to check indices into an array or for checking null pointers right before dereferencing them. Those cause crashes in debug and release mode and we know that asserts will not save you in release mode.

We have a few options here:

  1. Use if statements to protect indexing into the array and also to spit out an error, probably returning in the process. This clutters code up with a bunch of error checking though.
  2. We can throw an exception if possible. If you’re using regular C arrays, in C++, then this isn’t going to happen, it will crash when you index. So you need to wrap the array with your own custom array interface probably and throw accordingly.
  3. Use goto and labels if exception handling is not possible. If you don’t suck at coding, goto statements have their use and can act as a crappy but cheap exception handlers when you don’t have the compiler flag to enable exception handling turned on. But, of course, be careful with it n00b.
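As a sketch of option 2, together with the kind of assert that does make sense: the hypothetical `IntArray` wrapper throws on a bad index in both debug and release, while the assert in `sumOfSquares` only checks the function's own logic.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical fixed-size wrapper around a C array.
class IntArray
{
public:
    static const std::size_t kSize = 4;

    // A bad index is a runtime condition: checked in debug and release alike.
    int& at(const std::size_t index)
    {
        if (index >= kSize)
        {
            throw std::out_of_range("IntArray index out of range");
        }
        return data_[index];
    }

private:
    int data_[kSize] = {};
};

// The assert here checks the programmer's logic, not the caller's input.
int sumOfSquares(IntArray& values)
{
    int total = 0;
    for (std::size_t i = 0; i < IntArray::kSize; ++i)
    {
        const int v = values.at(i);
        total += v * v;
    }
    assert(total >= 0);  // a sum of squares is never negative (barring overflow)
    return total;
}
```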

Ok, so we’re talking about being robust here. We have traded speed for robustness and this is perfectly fine for some enterprise level applications where quality is more important (such as handling your credit card info). But if speed is more important, then asserts may very well be what you’re looking for.

Early-outs for n00bs

Sometimes I see people code something up like the following:

This situation has numerous problems.

  1. It’s difficult to read.
  2. The cohesion is terrible. If you trace a debug log to one of the debug statements, it is nowhere near the failure. So you have to sloppily figure out which “if” failed.
  3. It can get complicated.

At first it may seem that you can just pull all of the if statements together like this:

It’s pretty clear what the problem is. We simplified the statement, but we lost all our debugging information.
If one of the ANDed statements fails, we still have to check which one in the “else” clause.

Enter “Early-out”. The concept of an early-out is simple. Just check for negative conditions instead of positive ones.

  1. It keeps the errors together with the failure check.
  2. It makes the code easier to read
  3. It’s easy to tell if you forgot something
  4. Use this method to check input arguments passed to the function

It’s simply a matter of reorganizing the code.

So all we did was reorganize the code, get rid of the “elses,” and change the boolean logic to != instead of == on all if statements. This is a good way to make sure you’re testing all of your parameters for valid input before you actually execute any code. They’re called early-outs because they attempt to exit out of the function as early as possible. This avoids any unnecessary processing (that’s the theory anyway).
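A sketch of the early-out style. The `LoadLevel` function and its checks are hypothetical stand-ins for whatever a real function would validate.

```cpp
#include <iostream>
#include <string>

bool LoadLevel(const std::string* name, const int width, const int height)
{
    if (name == nullptr)
    {
        std::cerr << "LoadLevel: name is null\n";
        return false;
    }

    if (name->empty())
    {
        std::cerr << "LoadLevel: name is empty\n";
        return false;
    }

    if (width <= 0 || height <= 0)
    {
        std::cerr << "LoadLevel: invalid size " << width << "x" << height << "\n";
        return false;
    }

    // Every argument is known to be valid from here on.
    std::cout << "Loading " << *name << " (" << width << "x" << height << ")\n";
    return true;
}
```

Each error message sits directly under the check that triggers it, and the real work is not nested inside any of the ifs.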

Testing at Microsoft

I was a contract employee at Microsoft, an SDET I. While I was there I learned quite a lot about testing, and I really believe it made me a much better programmer. Although my time there was short lived, I managed to retain some of my testing knowledge, and I hope to share that with you now. So I’ll quickly and briefly cover the basics of testing. I’m not talking about unit testing either. I’ll break down the types of tests you have into different categories. There may be more.

  • BVT (Base verification tests)
    – Positive test cases
  • FVT (Functional verification tests)
    – Positive test cases (Tester tests for expected failures)
    – Negative test cases (We catch unexpected failures during testing (hopefully))

BVT (Base Verification Tests)
These are the basic tests performed to show that the operation is basically working. Sometimes you just want to run the BVT tests because they’re faster and shorter. Example:

It is short and to the point. TEST is a simple macro that checks if the return value of the function is equal to the second parameter. It checks to see if 2*2/1 == 4. Nothing more. It is fast, but it doesn’t check everything. This test is also called a positive test case. A positive test case consists of passing in valid data, and expecting to get a valid result.

FVT (Functional Verification Tests)
The FVTs are a much more in-depth testing stage. They consist of much more thorough testing of the functions, in both quantity and quality. The FVTs include the BVT tests as well, so some may be duplicated, but this isn’t necessary. In some cases, the FVTs consist of modifications to the BVTs.
From the example above:

As you can see, we now have a lot more test cases and they’re broken up into positive and negative test cases. We may want them to be in separate functions too, but I didn’t do that here. Our positive test cases have a lot more cases and are simple modifications to the BVT.

The FVT also consists of our negative cases, where, if the function fails, (gives an invalid result the user can’t use) it needs to fail in an expected and deterministic way. Under no circumstances should it crash. When it crashes, this is an “unexpected failure,” and is usually caught very soon, since your tests will also crash. This is the most common unexpected failure. I can’t really think of any other right now.

Also, all of the negative test cases will fail in this scenario because nothing in the function is being thrown. And our tests are expecting it to throw the correct exceptions. When testing, it’s important to test common boundary values. In the example above, good things to test are 0, -1, 1, INF, -INF, NaNs, and other such things. When writing code, it’s important to check for the same boundary conditions immediately at the top of the function.
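A rough sketch of a BVT and an FVT for a `Calc` function. Everything here is an assumption: the function is taken to compute a * b / c, plain bool-returning functions stand in for the TEST macro, and, unlike the version discussed above, this `Calc` does throw on bad input so that the negative cases pass.

```cpp
#include <cmath>
#include <limits>
#include <stdexcept>

// Hypothetical function under test: computes a * b / c.
double Calc(const double a, const double b, const double c)
{
    // Boundary checks sit at the top of the function.
    if (!std::isfinite(a) || !std::isfinite(b) || !std::isfinite(c))
    {
        throw std::invalid_argument("Calc: inputs must be finite");
    }
    if (c == 0.0)
    {
        throw std::domain_error("Calc: division by zero");
    }
    return a * b / c;
}

// Returns true only when Calc(a, b, c) throws ExceptionType.
template <typename ExceptionType>
bool Throws(const double a, const double b, const double c)
{
    try { Calc(a, b, c); }
    catch (const ExceptionType&) { return true; }
    catch (...) { return false; }
    return false;
}

// BVT: one quick positive case.
bool RunBvt()
{
    return Calc(2.0, 2.0, 1.0) == 4.0;
}

// FVT: more positive cases, plus negative cases at boundary values.
bool RunFvt()
{
    const double kInf = std::numeric_limits<double>::infinity();
    const double kNan = std::numeric_limits<double>::quiet_NaN();

    bool ok = RunBvt();

    // Positive cases: valid input, valid result.
    ok = ok && Calc(0.0, 2.0, 1.0) == 0.0;
    ok = ok && Calc(-1.0, 2.0, 1.0) == -2.0;
    ok = ok && Calc(2.0, 2.0, -1.0) == -4.0;
    ok = ok && Calc(1.0, 1.0, 4.0) == 0.25;

    // Negative cases: invalid input must fail in the expected way.
    ok = ok && Throws<std::domain_error>(2.0, 2.0, 0.0);
    ok = ok && Throws<std::invalid_argument>(kNan, 2.0, 1.0);
    ok = ok && Throws<std::invalid_argument>(kInf, 2.0, 1.0);
    ok = ok && Throws<std::invalid_argument>(2.0, -kInf, 1.0);
    return ok;
}
```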

One final note before I leave. These are NOT unit tests. Unit tests sometimes overlap the BVTs, but they are written by the developer who wrote the code (in this case the Calc function). So they are not written by the tester (or SDET).

Const qualifier reminder

Just a reminder how const-ness works. If I remember correctly.
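The usual rules, as a short sketch: read the declaration right to left, and const applies to whatever is immediately on its left (or on its right if nothing is on the left). The `Counter` struct and `Demo` function are made up for illustration.

```cpp
#include <type_traits>

// "const int*" and "int const*" are the same type.
static_assert(std::is_same<const int*, int const*>::value, "same type");

struct Counter
{
    int count = 0;

    int  Get() const { return count; }  // const method: cannot modify members
    void Increment() { ++count; }       // non-const method: cannot be called on a const object
};

int Demo()
{
    int a = 1;
    int b = 2;

    const int* p1 = &a;        // pointer to const int: "*p1 = 5" would not compile
    p1 = &b;                   // ...but the pointer itself can be reseated

    int* const p2 = &a;        // const pointer to int: "p2 = &b" would not compile
    *p2 = 10;                  // ...but the pointee can be modified

    const int* const p3 = &b;  // const pointer to const int: neither can change

    const Counter c{};         // only const methods may be called on c
    return *p1 + *p2 + *p3 + c.Get();
}
```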

Math trickery for 1D & 2D arrays

Once upon a time, there was a language that didn’t have 2 dimensional arrays. This greatly upset the programmer. Fortunately, the programmer was swift in the art of arithmetic and overcame this obstacle easily.

Let’s say, for example, we only have access to a 1 dimensional array data structure and it has 50 elements in it.

Let’s break this array up into 10 rows and 5 columns to make a grid. So how do we get access to a position, say data[4][3]?
There are two things I’m going to teach you now:
1) Retrieve the row & column numbers from a linear index.
2) Retrieve the linear index from row & column numbers.

Before I continue, I want you to notice that the row and column numbers are zero based. So they start at zero, not one.

Let’s do #1 first:
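Here is a sketch of both directions for the 10 by 5 grid, assuming the rows are stored one after another (row-major order). The function names are made up.

```cpp
#include <cstddef>

const std::size_t kRows = 10;
const std::size_t kCols = 5;

// #1: linear index -> row and column.
std::size_t RowOf(const std::size_t index) { return index / kCols; }
std::size_t ColOf(const std::size_t index) { return index % kCols; }

// #2: row and column -> linear index.
std::size_t IndexOf(const std::size_t row, const std::size_t col)
{
    return row * kCols + col;
}
```

For the position data[4][3], the index works out to 4 * 5 + 3 = 23, and going back gives 23 / 5 = 4 for the row and 23 % 5 = 3 for the column.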


What is the difference between aggregation and composition?

This question has bothered me for some time and I always forget. So I wrote it down here and added an answer.

Composition : An object contains another object. When the container object dies, so do the contained objects.

Aggregation : An object pseudo-contains another object (contains a pointer to it). When the container object dies, the containees do not.

Sometimes aggregation is called composition when the difference doesn’t matter.
Note: In UML, aggregation is an unfilled diamond, whereas composition is a filled diamond.
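In C++ terms the difference might be sketched like this. The `Engine`, `Car`, and `Garage` classes are hypothetical, and the `alive` counter exists only to make the object lifetimes visible.

```cpp
struct Engine
{
    static int alive;  // number of Engine objects currently in existence

    Engine()  { ++alive; }
    ~Engine() { --alive; }
    Engine(const Engine&) = delete;
    Engine& operator=(const Engine&) = delete;
};

int Engine::alive = 0;

// Composition: the car contains its engine, so the engine dies with the car.
class Car
{
private:
    Engine engine_;
};

// Aggregation: the garage only points at an engine that lives elsewhere.
class Garage
{
public:
    explicit Garage(Engine* engine) : engine_(engine) {}

    bool HasEngine() const { return engine_ != nullptr; }

private:
    Engine* engine_;  // not owned, never deleted here
};
```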

References:
http://en.wikipedia.org/wiki/Object_composition#Aggregation