Simple types and objects

This chapter introduces you to the very basics of the type system and object model, as well as useful expressions. Much of this chapter will feel like revision, but you should still pay attention since there are going to be subtle differences from what you’re used to.

Be sure to open a new session of Compiler Explorer.

Chapter table of contents

  1. Simple types and objects
    1. Objectives and outcomes
    2. Acknowledgements
    3. Vocabulary calibration
      1. Types
      2. Objects
      3. Values
      4. Expressions
    4. Manipulating numbers, logic, and text
      1. Introducing Catch2
      2. Constants
        1. Integers
        2. Fractions
        3. Logic
        4. Text
          1. Single characters
          2. Strings
      3. Object representation in memory
      4. Variables and modifying expressions
      5. const-correctness
    5. Feedback
    6. Summary

Objectives and outcomes

Objectives (you will develop), each followed by its outcomes (you will be able to):
an understanding of the C++ type and object systems
  • define the terms ‘type’, ‘object’, and ‘value’.
  • describe the relationship between objects and memory.
  • distinguish between C++ objects and other languages’ definitions.
  • identify the go-to types for arithmetic, logic, and text.
  • describe how the go-to types are represented as objects in an abstract machine.
an understanding of const-correctness
  • distinguish between constants and variables.
  • describe the benefits of const-correctness.
  • evaluate when variables are more appropriate than constants.
skills in defining objects
  • synthesise programs that define constants and variables.
skills in using the Catch2 test framework
  • synthesise test cases using Catch2.
skills in arithmetic, logic, and string operations
  • synthesise arithmetic expressions for integers and floating-point numbers, including
    • addition, subtraction, multiplication, division, and modular arithmetic
    • compound assignment involving the above operations
    • increment and decrement operations
  • synthesise logic expressions
    • comparing objects for equality and inequality
    • conjunctions, disjunctions, and negation
  • synthesise string expressions, including
    • indexing
    • appending other characters and strings
    • clearing a string’s contents
    • checking a string’s length

Acknowledgements

Thank you to Vagrant Gautam, killerbee13, and Janet Cobb for providing feedback.

Thank you to Maren Pan for introducing me to Figma (the tool used for making graphics), and Luke D’Alessandro for providing alternative wording for the definition of objects.

Vocabulary calibration

Before we dive into code, we’re gonna quickly revise our vocabulary, since these terms often differ between languages, and they all have precise meanings in C++.

Types

We use types to specify sets of valid operations and values. Examples include int, double, bool, char, and std::string. Types are used to describe objects.

Objects

An 8-by-4 grid of cells with only the first cell filled in. The first cell has the type `char` and the value `'A'`.

An object is an instance of a type that occupies a region of memory. This includes built-in types, which you might find surprising if you’re coming from an object-oriented language like Java, where only class instances are considered objects.

Values

Values are the interpretation of an object according to a type, or a bit more plainly, what its meaning is. Values give human meaning to types and objects. Since objects are in memory, values are encoded in binary, and types transform that bit sequence into something reasonable to humans. For example, the bit sequence 01000001 will be interpreted as 65 when we’re working with integers, but as the letter A when doing text processing; it’s up to the type to tell us what it means.
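To make this concrete, here’s a minimal sketch of one bit sequence being read through two different types. The names bits, as_integer, and as_letter are made up for this illustration. It uses the standard assert macro so that it stands alone, and it assumes an ASCII-compatible encoding.

```cpp
#include <cassert>

// The bit sequence 01000001, written as a binary literal (C++14 onwards).
constexpr unsigned char bits = 0b0100'0001;

// Interpreting the bits as an integer.
constexpr int as_integer() { return static_cast<int>(bits); }

// Interpreting the same bits as text. Assumes an ASCII-compatible encoding.
constexpr char as_letter() { return static_cast<char>(bits); }

// Same bits, two meanings: the type decides which one we get.
void check_interpretations()
{
  assert(as_integer() == 65);
  assert(as_letter() == 'A');
}
```

The object in memory is identical in both cases; only the type we read it through changes.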

Expressions

Finally, an expression describes a specific computation. Some expressions, like addition, result in values; others cause side effects, like writing to the standard output.

1 + 2; // results in a value
fmt::print("Hello, world!\n"); // causes a side effect: writes to the standard output

Manipulating numbers, logic, and text

Since the most fundamental things that a program can do are to manipulate numbers, logic, and text, we’ll keep this chapter simple, and dedicate it to them. Other interesting stuff, like functions and collections, appears in the following two chapters. Let’s jump into Compiler Explorer now, and start looking at some code.

#include <catch2/catch_test_macros.hpp>

TEST_CASE("Integers")
{
}

Introducing Catch2

You should notice that instead of including <fmt/format.h> at the top, we now have <catch2/catch_test_macros.hpp>. Catch2 is a popular unit-test framework for C++, and we’ll often use it instead of writing trivial programs that print to standard output. This will help you get to a functional level faster, since you’ll be able to test your code from the get-go. The framework handles main for us in its implementation details, so we don’t need to worry about writing it out: instead, we’ll be writing test cases. All test cases are spelt with TEST_CASE, followed by a unique string inside the parentheses.

Constants

Integers

Let’s start by defining a constant integer.

TEST_CASE("Integers")
{
  int const width = 2;
}

We use the type int to represent integers. Here, we’ve just defined an integer constant named width and initialised it with the value 2. A constant is an object that has an identifier, and its value can’t be changed (i.e. its value remains constant).

To define a constant, we specify the type (e.g. int), then add the const-qualifier, followed by the constant’s name (e.g. width), and then we initialise it with a value (e.g. = 2). If we forget to initialise our constant, then the compiler will tell us that.

TEST_CASE("Integers")
{
  int const width;
}
<source>:5:13: error: default initialization of an object of const type 'const int'
  int const width;
            ^
                  = 0

To check if two ints are equal, we use the equality operator (==). To check if they’re not equal, then we use the not-equal-to operator (!=). We also have all the inequality operators (<, <=, >=, >) at our disposal.

TEST_CASE("Integers")
{
  int const width = 2;
  CHECK(width == 2);
  CHECK(width != 3);
  CHECK(width <  4);
  CHECK(width >  1);
  CHECK(width <= 3);
  CHECK(width >= 2);
}

When you run this program, you should see the following output.

===============================================================================
All tests passed (6 assertions in 1 test case)

The CHECK operation is a part of the test framework, and it reports if the expression inside the parentheses is false. Let’s check that out by changing our first test from CHECK(width == 2) to CHECK(width == 3).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
output.s is a Catch v3.0.0-preview.3 host application.
Run with -? for options

-------------------------------------------------------------------------------
Integers
-------------------------------------------------------------------------------
example.cpp:3
...............................................................................

example.cpp:6: FAILED:
  CHECK( width == 3 )
with expansion:
  2 == 3

===============================================================================
test cases: 1 | 1 failed
assertions: 6 | 5 passed | 1 failed

Beginners sometimes get confused between using assertions and CHECK. The neat thing about using CHECK over an assertion is that it doesn’t disrupt a program’s flow: when a CHECK fails, the failure is reported, and the program moves on to the next CHECK instead of abruptly halting on the spot. We can verify this by changing CHECK(width > 1) to CHECK(width > 2).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
output.s is a Catch v3.0.0-preview.3 host application.
Run with -? for options

-------------------------------------------------------------------------------
Integers
-------------------------------------------------------------------------------
example.cpp:3
...............................................................................

example.cpp:6: FAILED:
  CHECK( width == 3 )
with expansion:
  2 == 3

example.cpp:9: FAILED:
  CHECK( width > 2 )
with expansion:
  2 > 2

===============================================================================
test cases: 1 | 1 failed
assertions: 6 | 4 passed | 2 failed

int supports the arithmetic operations you learnt about in school: we can add, negate, subtract, multiply, and divide.

CHECK(width + 1 ==  3);  // addition
CHECK(-width == -2);     // negation
CHECK(width - 2 ==  0);  // subtraction
CHECK(width * -4 == -8); // multiplication
CHECK(width / 2 == 1);   // division

Precedence for these operators follows the normal rules in mathematics, and you can always wrap a subexpression in parentheses to make sure it’s evaluated before its parent expression.

CHECK((width + 1) / 3 == 1);
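As a small sketch of why the parentheses matter, compare the same operands with and without them. The function names are hypothetical, and the standard assert macro stands in for CHECK so that the snippet is self-contained.

```cpp
#include <cassert>

// Multiplication binds tighter than addition, so this is width + (1 * 3).
constexpr int without_parentheses(int width) { return width + 1 * 3; }

// The parentheses force the addition to happen first.
constexpr int with_parentheses(int width) { return (width + 1) * 3; }

void check_precedence()
{
  assert(without_parentheses(2) == 5); // 2 + 3
  assert(with_parentheses(2) == 9);    // 3 * 3
}
```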

When we divide an int, we get an int back, which might be surprising if you’re coming from JavaScript, which would give you back a fractional component too.

CHECK(5 / 2 == 2);

Integer division in C++ is broken into two parts: we get the whole number component, which is what we have here, and then we also have a remainder component, which we get by using the modulus operation (%).

CHECK(5 % 2 == 1);

“Two goes into five twice, with one left over.”

This is how I was taught to compute in primary school. In my uni discrete maths class, it would be 5 ≡ 1 (mod 2).
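The two parts of integer division fit back together, which the following sketch checks. The helper names whole_part, left_over, and recombine are invented for this example, and assert is used in place of CHECK so the snippet stands on its own.

```cpp
#include <cassert>

constexpr int whole_part(int dividend, int divisor) { return dividend / divisor; }
constexpr int left_over(int dividend, int divisor) { return dividend % divisor; }

// Putting the two parts back together gives the original number.
constexpr int recombine(int dividend, int divisor)
{
  return whole_part(dividend, divisor) * divisor + left_over(dividend, divisor);
}

void check_division()
{
  assert(whole_part(5, 2) == 2);
  assert(left_over(5, 2) == 1);
  assert(recombine(5, 2) == 5);

  // Division truncates towards zero, so the remainder takes the dividend's sign.
  assert(whole_part(-5, 2) == -2);
  assert(left_over(-5, 2) == -1);
  assert(recombine(-5, 2) == -5);
}
```

Note that for negative numbers, % can give a negative result, which differs from the modular arithmetic you might have seen in a maths class.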

Fractions

If you’re still confused, that’s possibly because you’re used to there being only one number type. Perhaps the double type will help to clarify things: in C++, we use double to represent numbers with fractional components.

TEST_CASE("Integers")
{
  // Code from previous section...
}

TEST_CASE("Fractions")
{
  double const width = 2.0;
}

For the initiated, doubles are floating-point numbers; for the uninitiated, this means they support a much larger set of numbers, but the result may not always be exact. If you only need to support whole numbers, use int. We dissect the differences between int and double much later on.
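Here’s a quick sketch of what “not always exact” can look like. It assumes a platform where double follows IEEE 754 (which covers typical desktops and Compiler Explorer’s defaults), and nearly_equal is a hypothetical helper written for this example, not something from a library.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: true when two doubles differ by no more than tolerance.
bool nearly_equal(double a, double b, double tolerance)
{
  return std::abs(a - b) <= tolerance;
}

void check_inexact_arithmetic()
{
  double const sum = 0.1 + 0.2;
  assert(sum != 0.3);                   // not exactly 0.3 with IEEE 754 doubles
  assert(nearly_equal(sum, 0.3, 1e-9)); // but extremely close to it
}
```

The values used elsewhere in this chapter (such as 2.0, 2.5, and 8.0) can be represented exactly, which is why comparing them with == is fine.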


To distinguish a double literal from an int literal, we use the decimal point. See here that we’re using 2 for the int literal, and 2.0 for the double literal.

Doubles support all of the arithmetic and comparison operations that integers support, with the exception of the modulus operator. Division also gives us a number that has a fractional component.

TEST_CASE("Fractions")
{
  double const width = 2.0;
  CHECK(width == 2.0);
  CHECK(width != 3.0);
  CHECK(width < 4.0);
  CHECK(width <= 3.0);
  CHECK(width >= 2.0);
  CHECK(width > 1.0);

  CHECK(width + 1.0 == 3.0);   // addition
  CHECK(-width == -2.0);       // negation
  CHECK(width - 2.0 == 0.0);   // subtraction
  CHECK(width * -4.0 == -8.0); // multiplication
  CHECK(width / 2.0 == 1.0);   // division

  // precedence
  CHECK((width + 1.0) / 3.0 == 1.0);

  // fractional component
  CHECK(5.0 / 2.0 == 2.5);
}

If we try to add double const remainder = width % 2.0; to our test case, then we get the following error.

error: invalid operands to binary expression ('const double' and 'double')
  double const remainder = width % 2.0;
                           ~~~~~ ^ ~~~

Some programming languages have a power operator, but C++ isn’t one of them. In order to compute interesting expressions like powers and trigonometric functions, we first need to include the <cmath> header. Then, we can use functions like std::pow and std::sin to compute stuff.

#include <cmath>
#include <catch2/catch_test_macros.hpp>

TEST_CASE("Integers")
{
  // ...
}

TEST_CASE("Fractions")
{
  // ...
}

TEST_CASE("Interesting maths")
{
  CHECK(std::pow(2, 3) == 8);
  CHECK(std::pow(2.0, 3.0) == 8.0);
  CHECK(std::sin(0.0) == 0.0);
}
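The <cmath> header also fills the gap left by % not working on doubles: std::fmod computes the floating-point remainder of a division. A minimal sketch follows, where remainder_of is a made-up wrapper name and assert replaces CHECK to keep the snippet self-contained.

```cpp
#include <cassert>
#include <cmath>

// Floating-point counterpart of `%`: what's left after dividing x by y.
double remainder_of(double x, double y) { return std::fmod(x, y); }

void check_fmod()
{
  assert(remainder_of(5.0, 2.0) == 1.0);
  assert(remainder_of(5.5, 2.0) == 1.5);
}
```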

Logic

Our Boolean type is bool. There are only two values, true and false, and three logical operations: conjunction (and), disjunction (or), and logical negation (not).

TEST_CASE("Booleans and logic")
{
  bool const learning_cxx = true;
  CHECK(learning_cxx != false);

  bool const is_chapter2 = true;
  CHECK((learning_cxx and is_chapter2));

  bool const bored = false; // hopefully!
  CHECK((bored or learning_cxx));

  CHECK(not bored);
}

For complex technical reasons, CHECK can’t handle an expression that uses and or or at its top level: both ‘learning_cxx and is_chapter2’ and ‘bored or learning_cxx’ need to be wrapped in their own parentheses first. You don’t need the extra parentheses in other contexts.
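If you’re coming from a language that spells these operations with symbols, the sketch below shows that C++ accepts both spellings and that they mean the same thing. The function names are hypothetical; each one computes ‘exactly one of a and b is true’.

```cpp
#include <cassert>

// Spelt with the word operators used in this chapter.
constexpr bool with_words(bool a, bool b) { return (a and not b) or (not a and b); }

// Spelt with the symbolic operators you may know from other languages.
constexpr bool with_symbols(bool a, bool b) { return (a && !b) || (!a && b); }

void check_spellings()
{
  assert(with_words(true, false) and with_symbols(true, false));
  assert(with_words(false, true) and with_symbols(false, true));
  assert(not with_words(true, true) and not with_symbols(true, true));
  assert(not with_words(false, false) and not with_symbols(false, false));
}
```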

Text

Single characters

Representing single characters is done using the char type, and char literals are delimited by single quotes ('). Below are some of the operations that char supports.

TEST_CASE("Single characters")
{
  char const first_latin_letter = 'A';
  CHECK(first_latin_letter == 'A');
  CHECK(first_latin_letter != 'a');
  CHECK(first_latin_letter <  'B');
  CHECK(first_latin_letter >  '0');

  char const third_latin_letter = first_latin_letter + 2;
  CHECK(third_latin_letter == 'C');
  CHECK(third_latin_letter - 1 == 'B');
}
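Arithmetic on a char works because each character is stored as a small number. The sketch below leans on that, and assumes an encoding such as ASCII in which the letters 'A' to 'Z' are consecutive. The helpers letter_after and distance_from_a are invented for this example.

```cpp
#include <cassert>

// Hypothetical helper: the letter `steps` places after `letter`.
// Assumes an encoding such as ASCII, where 'A' to 'Z' are consecutive.
constexpr char letter_after(char letter, int steps)
{
  return static_cast<char>(letter + steps);
}

// Hypothetical helper: how far `letter` is from 'A'.
constexpr int distance_from_a(char letter) { return letter - 'A'; }

void check_letters()
{
  assert(letter_after('A', 2) == 'C');
  assert(distance_from_a('C') == 2);
  assert(distance_from_a('Z') == 25);
}
```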

Strings

C++ doesn’t have a built-in string type: it’s instead a library type that gets imported when we write #include <string>. We use the std::string type to define a string.

#include <string>
// ...

TEST_CASE("Strings")
{
  std::string const greeting = "Hello, world!";
}

At this point, you might be wondering why the string type isn’t built into the language if the string literal is. This is a great question, but its answer is fairly involved, so let’s defer this to a later chapter, when you’ll have a bit more C++ maturity.

Strings support all of the comparison operations, and are case-sensitive. Equality and non-equality are fairly straightforward.

CHECK(greeting == "Hello, world!");
CHECK(greeting != "hello, world!"); // 'H' != 'h'

Inequalities perform lexicographical comparisons, so "Good" with a capital 'G' is considered to be less than "good" with a lower-case 'g', because 'G' comes before 'g' in our system’s encoding.

std::string const good = "good";
CHECK("Good" < good);

"goods" is considered greater than "good", because it contains all of "good" and is longer. If we change "goods" to "foods", we see the answer change, because the encoding places 'f' before 'g'.

CHECK("goods" > good);
CHECK("foods" < good);

If we want to know how long a string is, we use the size member function; to learn whether the string has no characters, we use the empty member function.

CHECK(good.size() == 4);
CHECK(not good.empty());

To concatenate strings, we use the plus (+) operator on characters, string literals, and other std::string objects.

CHECK(good + 's' == "goods");
CHECK(good + " day" == "good day");

std::string const news = " news";
CHECK(good + news == "good news");

Object representation in memory

C++ is a language that has a close relationship with hardware, so it’s important to understand how our built-in types are represented as objects in memory. To do this, we’ll create an “abstract machine”, which only exists on paper, and represents how a physical computer behaves[1]. For the most part, we’re going to be focussing on computer memory. We can imagine the memory we’re going to use as a table of cells, where each cell is one byte. For this chapter, we’ll use an abstract machine that has 32 bytes of memory, arranged into four rows, each with eight columns[2]. Right now, all the cells are empty.

An 8-by-4 grid of empty cells.

// Don't copy into Compiler Explorer.
int main()
{
  char const first = 'A';
}

When we define a char object, it occupies one cell.

An 8-by-4 grid of cells with only the first cell filled in. The first cell has the type `char` and the value `'A'`.

// Don't copy into Compiler Explorer.
int main()
{
  char const first = 'A';
  char const second = 'B';
}

If we define a second char, we see a second cell gets occupied.

An 8-by-4 grid of cells with the first two cells filled in. The first cell has the type `char` and the value `'A'`. The second cell has the type `char` and the value `'B'`.

// Don't copy into Compiler Explorer.
int main()
{
  char const first = 'A';
  char const second = 'B';
  bool const third = false;
  bool const fourth = true;
}

The same thing happens when we define two bool objects: they each take up one cell.

An 8-by-4 grid of cells with the first four cells filled in. The first two cells are as before. The next two cells both have type `bool`, and the values `false` and `true`, respectively.

// Don't copy into Compiler Explorer.
int main()
{
  char const first = 'A';
  char const second = 'B';
  bool const third = false;
  bool const fourth = true;
  int const fifth = 123;
}

When we define an int, it doesn’t occupy a single cell. In this case, the region of memory that our int object occupies totals four bytes.

An 8-by-4 grid of cells with the entire first row filled. The first four cells are as before. The next four cells have been merged into one big cell, with the type `int`, and the value 123.

// Don't copy into Compiler Explorer.
int main()
{
  char const first = 'A';
  char const second = 'B';
  bool const third = false;
  bool const fourth = true;
  int const fifth = 123;
  double const pi = 3.14159265;
}

Notice that an object of type double is held by eight cells. This will be relevant in just a bit.

An 8-by-4 grid of cells with the first two rows filled. The first row is as before. The second row has merged all eight of its cells into one big cell. Its type is `double`, and its value is π to eight decimal places.
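We can ask the compiler how many cells a type occupies by using sizeof. The sketch below matches our abstract machine, but be aware that only the size of char is guaranteed by the language: the other sizes are what you’ll typically see on desktop platforms, and are assumptions here. The name cells_used_so_far is made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Two chars, two bools, one int, and one double, as in the diagrams above.
constexpr std::size_t cells_used_so_far =
  2 * sizeof(char) + 2 * sizeof(bool) + sizeof(int) + sizeof(double);

void check_sizes()
{
  assert(sizeof(char) == 1);       // guaranteed by the language
  assert(sizeof(bool) == 1);       // typical, but not guaranteed
  assert(sizeof(int) == 4);        // typical, but not guaranteed
  assert(sizeof(double) == 8);     // typical, but not guaranteed
  assert(cells_used_so_far == 16); // the first two rows of our grid
}
```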

Finally, strings.

// Don't copy into Compiler Explorer.
#include <string>

int main()
{
  char const first = 'A';
  char const second = 'B';
  bool const third = false;
  bool const fourth = true;
  int const fifth = 123;
  double const pi = 3.14159265;
  std::string message = "This is 16B long";
}

We’ll go more into the details of how strings are represented in memory later on (around the time we explore why the type isn’t built in, but the literal is), but for now, we can treat its representation as occupying one cell for every character in the string.

An 8-by-4 grid of cells entirely filled. The first two rows are as before. The next two rows consist of `char`s spelling the sentence 'This is 16B long'.

At its core, std::string is a collection of chars organised linearly in memory, so we’re representing that in our diagram as sixteen char objects in place of showing you a string as a big blob. We’ll be showing strings as a big blob in the future for improved readability, but it is important to have an understanding of what it abstracts away.
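Because the characters sit one after another, we can reach each of them by position, counting from zero, using square brackets. This is a minimal sketch: character_at is a hypothetical helper, and assert is used in place of CHECK so the snippet stands alone.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the character stored at `position`, counting from zero.
char character_at(std::string const text, std::string::size_type position)
{
  return text[position];
}

void check_message()
{
  std::string const message = "This is 16B long";
  assert(message.size() == 16);
  assert(character_at(message, 0) == 'T');  // first cell
  assert(character_at(message, 8) == '1');  // ninth cell
  assert(character_at(message, 15) == 'g'); // last cell
}
```

Indexing with a position that’s past the end of the string isn’t checked for us, so stick to positions smaller than size().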

Variables and modifying expressions

Like a constant, a variable is an object that we’ve named, but we’re also allowed to modify it. Defining variables is extremely similar to defining constants.

TEST_CASE("variables")
{
  int meaning_of_life = 42;
}

Type, name, initialisation. That’s it: our variable’s defined. At the object level, it matters not whether we have a constant or variable. This info is only relevant to programmers.
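As a rough illustration of that last point, the sketch below defines a constant and a variable of the same type and checks that they occupy the same number of cells and hold the same value. The names are invented for this example.

```cpp
#include <cassert>

void check_same_representation()
{
  int const constant = 42;
  int variable = 42;

  assert(sizeof(constant) == sizeof(variable)); // same number of cells
  assert(constant == variable);                 // same value

  variable = 0; // only the variable may be changed afterwards
  assert(variable == 0);
  assert(constant == 42);
}
```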

TEST_CASE("variables")
{
  int meaning_of_life = 42;

  meaning_of_life = 0;
  CHECK(meaning_of_life == 0);
}

meaning_of_life = 0 is called assignment, which writes the value 0 into the object meaning_of_life. The syntax looks very similar to our definition, so we’re going to tread carefully here. In int meaning_of_life = 42;, the = represents initialisation, which is different to assignment. It might seem like we’re splitting hairs here, but as we progress to later chapters, this distinction will be of critical importance: so it’s best to get into the habit of using the correct terminology early on.


It’s possible to assign to variables of any type we’ve looked at so far: that’s left as an exercise for you to play with. Instead, let’s replace the contents of our test case with the following.

TEST_CASE("variables")
{
  int width = 2;

  width += 1; // same as `width = width + 1`
  CHECK(width == 3);

  width -= 2; // same as `width = width - 2`
  CHECK(width == 1);

  width *= 8; // same as `width = width * 8`
  CHECK(width == 8);

  width /= 2; // same as `width = width / 2`
  CHECK(width == 4);

  width %= 3; // same as `width = width % 3`
  CHECK(width == 1);
}

Each of these expressions is called compound assignment. Compound assignment can be done for all of the arithmetic operations and works for integers, characters, and doubles (except for %=, since doubles don’t support the modulus operator). If you’re unfamiliar with compound assignment, it’s a shorthand for assigning the result of a binary operation to the same object as its left-hand side. Essentially, width += 1 is the same as width = width + 1.
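Here’s a short sketch of compound assignment on the other types. The helpers halve and shift_letter are hypothetical, and the char example assumes an encoding like ASCII where letters are consecutive.

```cpp
#include <cassert>

// Compound division on a double keeps the fractional component.
double halve(double value)
{
  value /= 2.0; // same as `value = value / 2.0`
  return value;
}

// Compound addition on a char moves along the encoding.
char shift_letter(char letter, int steps)
{
  letter += steps; // same as `letter = letter + steps`
  return letter;
}

void check_compound_assignment()
{
  assert(halve(5.0) == 2.5);
  assert(shift_letter('A', 2) == 'C');
}
```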

++width; // same as `width += 1`
CHECK(width == 2);

--width; // same as `width -= 1`
CHECK(width == 1);

++width is the increment operator, which increases the value of the object by one. --width is the decrement operator, which decreases the object’s value by one. Just as with compound assignment, you can use these on ints, doubles, and chars. If you’re used to writing these operators after the variable name in other programming languages, note that in C++ our default is to put ++ and -- before it.
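The following sketch tries the increment and decrement operators on a double and a char. The function names are made up for this example, and the char part assumes an encoding like ASCII.

```cpp
#include <cassert>

// Hypothetical helper: the value one greater than `value`.
int incremented(int value)
{
  ++value; // same as `value += 1`
  return value;
}

void check_increments()
{
  double height = 1.5;
  ++height;
  assert(height == 2.5);

  char letter = 'A';
  ++letter;
  assert(letter == 'B');
  --letter;
  assert(letter == 'A');

  assert(incremented(41) == 42);
}
```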

We can also append to a string using +=.

std::string greeting = "Hello";
CHECK(greeting != "Hello, world!");

greeting += ", world!";
CHECK(greeting == "Hello, world!");

If we want to clear a string so that it’s empty, we use the clear member function, like so:

CHECK(not greeting.empty());
greeting.clear();
CHECK(greeting.empty());

const-correctness

Since it’s easier to define a variable than a constant, you might be wondering why we learnt about variables after constants. The answer comes down to expressing intention: we read code far more often than we write it, so we should be giving the reader as much useful information as possible.

In C++, const is a contract between the programmer and the compiler, where the programmer promises that they are never going to modify the object, and the compiler agrees to reject all attempts to modify said object. Someone reading the code at a later point will have confidence that the object’s value won’t change at any point because of this contract.

If we now change meaning_of_life into a constant, we’ll get the following output.

TEST_CASE("variables")
{
  int const meaning_of_life = 42;

  meaning_of_life = 0;
  CHECK(meaning_of_life == 0);
}
<source>:7:19: error: cannot assign to variable 'meaning_of_life' with const-qualified type 'const int'
  meaning_of_life = 0;
  ~~~~~~~~~~~~~~~ ^
<source>:5:13: note: variable 'meaning_of_life' declared const here
  int const meaning_of_life = 42;
  ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~

This is telling us that we can’t write a new value into a constant. Because of these two reasons (const tells the reader what we intend, and the compiler enforces it), I adhere to a simple rule when writing code: if you’re not absolutely certain the object is going to be modified, then make it a constant.
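Applied to a small computation, the rule might look like the sketch below, where only the object that genuinely changes is left as a variable. The function and object names are hypothetical.

```cpp
#include <cassert>

int grown_square_area()
{
  int const growth = 1; // never changes, so it's a constant
  int width = 2;        // modified on the next line, so it must be a variable
  width += growth;

  int const area = width * width; // computed once, then only read
  return area;
}

void check_area()
{
  assert(grown_square_area() == 9);
}
```

A reader skimming this function only needs to keep track of width, because everything else is promised not to change.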

As you learn more about C++, we’ll revisit this notion of const-correctness.

Feedback

If you’d like to provide feedback regarding this series, please file an issue on GitHub.

If you’re interested in reading future chapters, subscribe to my RSS feed to receive a notification at the time of publication. If you’d previously subscribed to my feed on my old website (www.cjdb.com.au), please be sure to note the new domain!

Summary

This chapter’s main focus has been to acquaint you with the C++ type system and object model. We’ve done that by making simple programs and test cases that:

  • use five types fundamental to all useful programs
  • explore fundamental operations for each type

This chapter was a fairly lengthy one, but it covers the most fundamental things you’ll need so you can press forward.


  1. This is a heavily simplified view of what actually happens. We will be adding layers of complexity to the abstract machine as we learn more.

  2. This amount will grow and contract as chapters need more or less memory.