Patrick's Blog

Moves and copies in C++

You can find the complete gist for this post here.

This article is an attempt to compile the various bits of knowledge and Stack Overflow posts I found while learning about this topic.

Many of the ideas and examples presented here are not my own; the articles and posts I drew from are linked at the bottom.

Rather, this is an attempt to piece together a logical thread of investigation one might follow when learning these language features, instead of having to jump around various articles.

So treat these as notes rather than a lecture. At the very least, I hope they can provide a base to start from. They are largely for my own benefit, but I hope someone else may find them useful too.

As a prerequisite, it is helpful to know about value categories in C++.

Introduction

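When defining a class or struct in C++, we have the option of defining several common constructors and operator overloads, namely:

  1. The copy constructor
  2. The copy assignment operator
  3. The move constructor
  4. The move assignment operator

You might have heard of the C++ rule of 3 or the rule of 5. Together with the destructor, these are the functions those "rules" refer to.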
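To make the "rules" concrete, here is a minimal sketch of what the five special member functions look like when declared. Widget is a hypothetical type that exists only for this illustration, and everything is defaulted since it owns no resources. The rule of 3 covers the destructor and the two copy operations; the rule of 5 adds the two move operations.

```cpp
#include <type_traits>

// Hypothetical type, used only to show the declarations.
struct Widget {
    Widget() = default;                              // default constructor (not one of the five)

    ~Widget() = default;                             // destructor
    Widget(const Widget&) = default;                 // copy constructor
    Widget& operator=(const Widget&) = default;      // copy assignment operator
    Widget(Widget&&) noexcept = default;             // move constructor
    Widget& operator=(Widget&&) noexcept = default;  // move assignment operator
};

// The type traits confirm which operations the type supports.
static_assert(std::is_copy_constructible<Widget>::value, "copy constructor available");
static_assert(std::is_copy_assignable<Widget>::value, "copy assignment available");
static_assert(std::is_nothrow_move_constructible<Widget>::value, "move constructor available");
static_assert(std::is_nothrow_move_assignable<Widget>::value, "move assignment available");
```

The rest of the post is about what to put in these functions when the class does own something, like a heap allocation.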

The distinction between copies and moves usually comes up in performance-sensitive code, and it's a useful tool to have in our pocket when writing applications.

Copies

Copies are straightforward, and I think everyone who understands the basics of memory implicitly knows what a copy is.

Copy Constructor

Specifically, our copy constructor produces a deep copy. Here's how we might write that:

#include <algorithm> // std::copy
#include <cstddef>   // std::size_t
#include <iostream>

class FunArray {
private:
    std::size_t m_size;
    int* m_data;

public:
    FunArray() : m_size(0), m_data(nullptr) { }
    FunArray(std::size_t size) : m_size(size), m_data(new int[m_size]) {
        // Fill with 0, 1, 2, ... so copies are easy to recognize when printed
        for (std::size_t i = 0; i < m_size; i++) {
            m_data[i] = static_cast<int>(i);
        }
    }

    ~FunArray() {
        // delete[] on a null pointer is a no-op, so no check is needed
        delete[] m_data;
    }

    // Copy constructor
    FunArray(const FunArray& other) : m_size(other.m_size), m_data(new int[m_size]) {
        std::cout << "Copy constructor\n";
        std::copy(other.m_data, other.m_data + m_size, this->m_data);
    }

    void print() const {
        for (std::size_t i = 0; i < m_size; i++) {
            std::cout << m_data[i] << " ";
        }
        std::cout << "\n";
    }
};

In this class, our copy constructor is invoked via:

FunArray C_c(15);
FunArray D_c(C_c);

If we print both FunArrays we get the expected:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 
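Identical output alone doesn't prove the copy is deep, since a shallow copy would print the same thing. A small sketch of how one might check it: IntBuffer is a hypothetical, trimmed-down stand-in for FunArray (the accessors are my additions), and the check verifies that writing to the copy leaves the original alone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for FunArray with just enough to show a deep copy.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t size) : m_size(size), m_data(new int[size]()) {}
    ~IntBuffer() { delete[] m_data; }

    // Deep copy: allocate new storage, then copy the elements across.
    IntBuffer(const IntBuffer& other)
        : m_size(other.m_size), m_data(new int[other.m_size]) {
        std::copy(other.m_data, other.m_data + other.m_size, m_data);
    }
    IntBuffer& operator=(const IntBuffer&) = delete; // not needed for this sketch

    int& operator[](std::size_t i) { return m_data[i]; }
    const int* data() const { return m_data; }

private:
    std::size_t m_size;
    int* m_data;
};

// Changing the copy must leave the original untouched.
void check_deep_copy() {
    IntBuffer original(4);     // elements start at 0
    IntBuffer copy(original);  // invokes the copy constructor
    copy[0] = 42;
    assert(original[0] == 0);
    assert(copy[0] == 42);
    assert(original.data() != copy.data()); // two separate allocations
}
```

Had the copy constructor only copied the pointer, both objects would share one allocation, the write would show up in both, and the two destructors would free the same memory twice.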

Copy Assignment

The copy assignment operator is a little more involved, and would look something like this:

FunArray& operator=(const FunArray& other) {
    std::cout << "Copy assignment\n";
    if (this != &other) {
        // Allocate and copy first, so *this is left untouched if new throws
        int* newData = new int[other.m_size];
        std::copy(other.m_data, other.m_data + other.m_size, newData);
        delete[] this->m_data;
        this->m_data = newData;
        this->m_size = other.m_size;
    }
    return *this;
}

The copy assignment operator is invoked when:

FunArray A_c(12);
FunArray B_c(10);
B_c = A_c;
FunArray C_c = A_c; // Careful, this will _not_ invoke the assignment operator, it actually invokes the copy constructor

For the most part it does the same work as the copy constructor, but a small, important detail is the check at the beginning:

if (this != &other) {
    // Do the stuff...
}

This covers the case where we assign an object to itself. Without the check, we'd perform an unnecessary allocation, copy, and deallocation. While this situation is unlikely, the check is nice to have.

Depending on how the operator overload is written, however, it could even be dangerous. For example, another way of writing the operator would be:

// Perfectly valid implementation if: this != &other
FunArray& operator=(const FunArray& other) {
    // Remove our current m_data, we're overwriting it anyways right?
    delete[] m_data;

    m_data = new int[other.m_size];
    std::copy(other.m_data, other.m_data + other.m_size, m_data);
    
    m_size = other.m_size;

    return *this;
}

In this case, if &other is the same as this, we end up deleting the very same data we intend to copy from, before we've even copied from it! Reading freed memory is undefined behavior, and in practice the resulting output will usually be garbage.
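In real code, self-assignment rarely looks like a = a. It usually sneaks in through a reference or pointer that happens to alias the target. Here is a small sketch of that situation, again using a hypothetical IntBuffer stand-in rather than the full FunArray, with an assignment operator that allocates before it deletes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for FunArray with only the members this sketch needs.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t size) : m_size(size), m_data(new int[size]) {
        for (std::size_t i = 0; i < m_size; ++i) {
            m_data[i] = static_cast<int>(i);
        }
    }
    IntBuffer(const IntBuffer& other)
        : m_size(other.m_size), m_data(new int[other.m_size]) {
        std::copy(other.m_data, other.m_data + other.m_size, m_data);
    }
    ~IntBuffer() { delete[] m_data; }

    IntBuffer& operator=(const IntBuffer& other) {
        if (this != &other) {
            // Allocate and copy first; if new throws, *this is unchanged.
            int* newData = new int[other.m_size];
            std::copy(other.m_data, other.m_data + other.m_size, newData);
            delete[] m_data;
            m_data = newData;
            m_size = other.m_size;
        }
        return *this;
    }

    std::size_t size() const { return m_size; }
    int operator[](std::size_t i) const { return m_data[i]; }

private:
    std::size_t m_size;
    int* m_data;
};

void check_self_assignment() {
    IntBuffer a(3);
    IntBuffer& alias = a;
    a = alias;  // self-assignment through an alias
    assert(a.size() == 3);
    assert(a[0] == 0 && a[1] == 1 && a[2] == 2);
}
```

Because the new buffer is filled before the old one is deleted, this ordering would survive self-assignment even without the check; the check just saves the wasted work.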

Those more advanced in C++ may wonder where the "copy and swap" idiom is. I'll leave the explanation for what it is to this Stack Overflow post, but for this class example it'd look something like this:

// Copy assignment using the "copy and swap" idiom

// Notice how we are now passing in 'other' by value
// instead of by const reference. This has the effect
// of invoking a constructor, specifically the
// copy constructor. When the function exits, 'other'
// takes the old data with it
FunArray& operator=(FunArray other) {
    std::cout << "Swap assignment\n";
    swap(*this, other);
    return *this;
}

friend void swap(FunArray& first, FunArray& second) {
    std::swap(first.m_data, second.m_data);
    std::swap(first.m_size, second.m_size);
}

For those unaware: in short, the "copy and swap" idiom is an alternative way of writing this operator overload that results in arguably cleaner and safer code. Writing this way, however, has a small impact on how we treat moves, which we'll see in the next section.

Moves

Moves, or more generally "move semantics", can be a pretty confusing topic in C++. Specifically, you'll often hear that std::move "doesn't actually move anything!", which only exacerbates the confusion for beginners.

The key idea is that std::move should be taken as an indication that its argument may be moved from. More precisely, std::move is essentially a static_cast to an rvalue reference type. Value categories in C++ are a handful, but to put it simply:

T&  // Reference, or "L-value reference"
T&& // R-value reference
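To see what the two reference types mean in practice, here is a small sketch with a pair of overloads. The helper name which() is hypothetical; all it does is report which overload the compiler picked for a given argument.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: reports which reference overload was selected.
const char* which(const std::string&) { return "lvalue reference"; }
const char* which(std::string&&)      { return "rvalue reference"; }

// std::move applied to a std::string lvalue yields std::string&&.
// It is only a cast and performs no work at run time.
static_assert(
    std::is_same<decltype(std::move(std::declval<std::string&>())), std::string&&>::value,
    "std::move yields an rvalue reference");

void check_binding() {
    std::string s = "hello";
    assert(std::string(which(s)) == "lvalue reference");                   // named object
    assert(std::string(which(std::move(s))) == "rvalue reference");        // cast by std::move
    assert(std::string(which(std::string("tmp"))) == "rvalue reference");  // temporary
    // Neither overload took anything from its argument, so s is unchanged.
    assert(s == "hello");
}
```

The last assertion is the "doesn't actually move anything" part: std::move only changed which overload was chosen. Whether anything gets stolen is up to the function that receives the rvalue reference.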

Move Constructor

Our move constructor is similar to the copy constructor, but the parameter now has two ampersands and no const. The move constructor can be triggered as follows:

FunArray C(15);
FunArray D(std::move(C));

And written like:

FunArray(FunArray&& other) noexcept : m_size(other.m_size), m_data(other.m_data)  {
    std::cout << "Move constructor\n";
    other.m_size = 0;
    other.m_data = nullptr;
}

All this does is "steal" other's data by copying a pointer and a size value. We leave the original object in a "valid but unspecified state" (here, an empty array). You can imagine that for large objects this is much more efficient than copying, and it can be a real performance win when applicable.
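One way to convince yourself that nothing is copied is to compare addresses before and after the move. A minimal sketch, assuming a hypothetical move-only IntBuffer in place of FunArray (the data() and size() accessors are additions for the sake of the check):

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical move-only stand-in for FunArray.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t size) : m_size(size), m_data(new int[size]()) {}
    ~IntBuffer() { delete[] m_data; }

    // Move constructor: take over the allocation, then reset the source.
    IntBuffer(IntBuffer&& other) noexcept
        : m_size(other.m_size), m_data(other.m_data) {
        other.m_size = 0;
        other.m_data = nullptr;
    }
    IntBuffer(const IntBuffer&) = delete;
    IntBuffer& operator=(const IntBuffer&) = delete;

    std::size_t size() const { return m_size; }
    const int* data() const { return m_data; }

private:
    std::size_t m_size;
    int* m_data;
};

void check_move_steals_storage() {
    IntBuffer source(1000);
    const int* before = source.data();

    IntBuffer target(std::move(source));
    assert(target.data() == before);   // same allocation, no element was copied
    assert(target.size() == 1000);
    assert(source.data() == nullptr);  // this class resets the source to empty
    assert(source.size() == 0);
}
```

The cost of the move is two assignments regardless of whether the buffer holds ten elements or ten million. Resetting the source also matters for correctness: without it, both destructors would delete the same pointer.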

There's actually another way of writing our move constructor by re-using the swap() function:

FunArray(FunArray&& other) noexcept : FunArray() {
    std::cout << "Move constructor\n";
    swap(*this, other);
}

You'll notice a few things:

  1. Calling of the default constructor in the initializer list
  2. Reuse of the previously defined swap() function
  3. noexcept

Going through these things one-by-one:

Constructor Delegation

The goal of a move is to potentially take the contents of an object and place them in another with the added requirement that the remaining object left behind (moved from) is still a valid object of that type, even if it contains unknown values. To that end, the move constructor needs to provide a valid object of its type in exchange for what it is moving.

An easy way to guarantee this is to use the default constructor to initialize *this, and then give that default state to other in exchange for other's contents. Calling another constructor of the same class from the initializer list is a C++11 feature called "constructor delegation", and it is a convenient alternative to manually initializing every member to its default.

Reusing swap()

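The context from above should be enough to indicate why we reuse swap() here: once *this is default-initialized, a single swap hands other's data to *this and leaves other holding the empty default state.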
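Putting delegation and swap() together, a self-contained sketch might look like the following. IntBuffer is a hypothetical stand-in for FunArray, and the check confirms that this version ends in the same state as a hand-written move constructor would.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for FunArray.
class IntBuffer {
public:
    friend void swap(IntBuffer& first, IntBuffer& second) noexcept {
        std::swap(first.m_size, second.m_size);
        std::swap(first.m_data, second.m_data);
    }

    IntBuffer() : m_size(0), m_data(nullptr) {}
    explicit IntBuffer(std::size_t size) : m_size(size), m_data(new int[size]()) {}
    ~IntBuffer() { delete[] m_data; }

    // Delegate to the default constructor so *this starts out empty,
    // then trade that empty state for other's contents.
    IntBuffer(IntBuffer&& other) noexcept : IntBuffer() {
        swap(*this, other);
    }
    IntBuffer(const IntBuffer&) = delete;
    IntBuffer& operator=(const IntBuffer&) = delete;

    std::size_t size() const { return m_size; }
    const int* data() const { return m_data; }

private:
    std::size_t m_size;
    int* m_data;
};

void check_delegating_move() {
    IntBuffer source(8);
    const int* storage = source.data();

    IntBuffer target(std::move(source));
    assert(target.size() == 8 && target.data() == storage);  // target owns the data now
    assert(source.size() == 0 && source.data() == nullptr);  // source holds the default state
}
```

The appeal of this form is that the move constructor never names a member directly, so adding a member later only requires updating swap() and the default constructor.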

noexcept

This one's a little bit more involved. I honestly haven't written a ton of code that makes heavy use of exceptions. Aside from "no exceptions" being a common coding standard/guideline in game development, the main reasons for appending this specifically to the move constructor seem to be tied to performance.

For example, STL containers such as std::vector have a resize() function that changes the size of the container, which may mean transferring the existing elements to new storage. Since C++11, these library containers can use moves instead of copies to perform that transfer. However, they will only do so if moving cannot break the container's "strong exception guarantee". Specifically, since a throwing move may fail partway through, the original container would be left in an unknown, potentially unrecoverable state. Copies, on the other hand, keep the original data intact by their nature, but at a potential cost to performance.

Therefore, implementing your moves so that they never throw, and letting the compiler know by marking them noexcept, allows moves to be used in a function like resize().
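The standard library exposes this "move only if it can't throw" decision as std::move_if_noexcept, which is the kind of mechanism containers typically rely on internally. Here is a sketch with two hypothetical types that differ only in the noexcept on their move constructor; the copied flag is just there to record which constructor ran.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type whose move constructor is allowed to throw.
struct ThrowingMove {
    bool copied = false;
    ThrowingMove() = default;
    ThrowingMove(const ThrowingMove&) : copied(true) {}
    ThrowingMove(ThrowingMove&&) {}  // not noexcept
};

// Same type, but the move constructor promises not to throw.
struct NoexceptMove {
    bool copied = false;
    NoexceptMove() = default;
    NoexceptMove(const NoexceptMove&) : copied(true) {}
    NoexceptMove(NoexceptMove&&) noexcept {}
};

static_assert(!std::is_nothrow_move_constructible<ThrowingMove>::value,
              "move may throw");
static_assert(std::is_nothrow_move_constructible<NoexceptMove>::value,
              "move cannot throw");

void check_move_if_noexcept() {
    ThrowingMove a;
    ThrowingMove a2(std::move_if_noexcept(a));  // falls back to the copy constructor
    assert(a2.copied);

    NoexceptMove b;
    NoexceptMove b2(std::move_if_noexcept(b));  // uses the move constructor
    assert(!b2.copied);
}
```

For a copyable type without a noexcept move, std::move_if_noexcept returns a const lvalue reference, so the copy constructor wins overload resolution. That one keyword is the difference between a container copying every element and moving every element.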

Move Assignment

Finally, we arrive at move assignment. Following the pattern with the constructors, the move assignment is invoked as follows:

FunArray A(12);
FunArray B(10);
B = std::move(A);

With a typical implementation looking like this:

FunArray& operator=(FunArray&& other) noexcept {
    std::cout << "Move assignment\n";
    if (this != &other) {
        delete[] this->m_data;
        this->m_data = other.m_data;
        this->m_size = other.m_size;
        other.m_data = nullptr;
        other.m_size = 0;
    }   
    return *this;
}

Right off the bat, if you kept the "copy and swap" idiom version of the copy assignment operator, you'll be greeted with an error message like the following once you add this overload and compile an assignment from an rvalue, such as B = std::move(A):

more than one operator "=" matches these operands:C/C++(350)
main.cpp(131, 7): function "FunArray::operator=(FunArray other)" (declared at line 63)
main.cpp(131, 7): function "FunArray::operator=(FunArray &&other) noexcept" (declared at line 86)

This is because both our copy assignment (pass by value) and this new move assignment (pass by rvalue reference) overloads can take rvalues, and neither is a better match than the other. This results in an ambiguous overload. If we want to define this move assignment explicitly, we'd have to change the argument of the copy assignment back to const FunArray& other.

Alternatively, if we keep the copy-and-swap assignment operator and skip the separate move assignment, we're already done! Because that operator takes its argument by value, the parameter other has to be constructed at the call, and you'll notice that the move constructor gets invoked this time around when the argument is an rvalue.

FunArray B(10);
B = std::move(FunArray());
B.print(); // Prints an empty line: the moved-in FunArray had no elements
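To watch the single by-value assignment operator handle both cases, here is a sketch that counts constructor calls. IntBuffer is a hypothetical stand-in for FunArray, and the static counters are an addition for observation only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for FunArray with counters to observe which constructor runs.
class IntBuffer {
public:
    static int copies;
    static int moves;

    friend void swap(IntBuffer& first, IntBuffer& second) noexcept {
        std::swap(first.m_size, second.m_size);
        std::swap(first.m_data, second.m_data);
    }

    IntBuffer() : m_size(0), m_data(nullptr) {}
    explicit IntBuffer(std::size_t size) : m_size(size), m_data(new int[size]()) {}
    ~IntBuffer() { delete[] m_data; }

    IntBuffer(const IntBuffer& other)
        : m_size(other.m_size), m_data(new int[other.m_size]) {
        std::copy(other.m_data, other.m_data + other.m_size, m_data);
        ++copies;
    }
    IntBuffer(IntBuffer&& other) noexcept : IntBuffer() {
        swap(*this, other);
        ++moves;
    }

    // One operator serves both cases: the by-value parameter is built by the
    // copy constructor for lvalues and by the move constructor for rvalues.
    IntBuffer& operator=(IntBuffer other) noexcept {
        swap(*this, other);
        return *this;
    }

    std::size_t size() const { return m_size; }

private:
    std::size_t m_size;
    int* m_data;
};

int IntBuffer::copies = 0;
int IntBuffer::moves = 0;

void check_unified_assignment() {
    IntBuffer::copies = 0;
    IntBuffer::moves = 0;
    IntBuffer a(12);
    IntBuffer b(10);

    b = a;             // lvalue argument: the parameter is copy-constructed
    assert(IntBuffer::copies == 1 && IntBuffer::moves == 0);
    assert(b.size() == 12 && a.size() == 12);

    b = std::move(a);  // rvalue argument: the parameter is move-constructed
    assert(IntBuffer::copies == 1 && IntBuffer::moves == 1);
    assert(b.size() == 12 && a.size() == 0);
}
```

The trade-off compared to two dedicated overloads is one extra swap per assignment, in exchange for a single function to maintain.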