Idioms for Typesafe Enums in C++

We "discover" two (perhaps) novel C++ programming idioms by starting with Joshua Bloch's typesafe enum pattern for Java and then exploring ways to translate it into C++.

C++ does have builtin enums, but they are not sufficiently typesafe for many people's taste, including mine.

    enum Suit { CLUBS, ... };
    enum Suit x = CLUBS;  // OK
    enum Suit y = 1;      // COMPILE ERROR - this is good, but
    int a[CLUBS + 1];     // OK - this is bad
    enum Suit z = (enum Suit) 99; // OK - very bad - invalid object state

The key problem is the implicit conversion from an enum type to int. Such conversions are part of the untyped heritage of C and were seen as a great convenience in the early days of C++, but automatic conversions are increasingly seen as a burden rather than a help.
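
To make the problem concrete, here is a minimal sketch of the implicit conversion at work. The function `scale` is a hypothetical stand-in for any API that expects a plain number, and the `std::is_convertible` checks assume a C++11 or later compiler.

```cpp
#include <cassert>
#include <type_traits>

enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES };

// Hypothetical function that expects a plain number, not a Suit.
inline int scale (int n) { return n * 10; }

// A builtin enum converts to int silently...
static_assert (std::is_convertible<Suit, int>::value,
               "Suit converts to int implicitly");
// ...but an int does not convert back without a cast.
static_assert (!std::is_convertible<int, Suit>::value,
               "int does not convert to Suit implicitly");

// Compiles without complaint, although passing a Suit where a number
// is expected, or adding one to an int, is almost certainly a mistake.
inline int misuse () { return scale (SPADES) + HEARTS; }
```

Nothing in `misuse` needs a cast, which is exactly the complaint: the compiler has no way to tell us that a Suit was used as a number.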

Can we improve on this?

Drawing inspiration from Joshua Bloch's typesafe enum for Java

We could simply try to translate one of Joshua Bloch's typesafe enum programs for Java directly into C++:

    // ----------------------------------------------------------------
    // In Suit.hpp
    // ----------------------------------------------------------------
    #include <ostream>
    #include <string>

    class Suit
    {
    private:
      static inline int nextOrdinal ()
      {
	static int firstOrdinal = 0;
	return firstOrdinal++;
      }

      struct SuitBody
      {
	SuitBody (std::string name, int ordinal)
	  : name_ (name), ordinal_ (ordinal) {}
	const std::string name_;
	const int ordinal_;
      };

      const SuitBody *p;

      Suit (std::string name) : p (new SuitBody (name, nextOrdinal())) {}

    public:

      std::string toString () const { return p->name_; }

      friend bool operator< (Suit x, Suit y)
      {
	return x.p->ordinal_ < y.p->ordinal_;
      }

      friend bool operator== (Suit x, Suit y) { return x.p == y.p; }
      friend bool operator!= (Suit x, Suit y) { return ! (x == y); }

      static const Suit clubs;
      static const Suit diamonds;
      static const Suit hearts;
      static const Suit spades;
    };

    inline std::ostream& operator<< (std::ostream& o, Suit s)
    {
      return o << s.toString ();
    }

    // ----------------------------------------------------------------
    // In Suit.cc
    // ----------------------------------------------------------------
    const Suit Suit::clubs    ("Clubs");
    const Suit Suit::diamonds ("Diamonds");
    const Suit Suit::hearts   ("Hearts");
    const Suit Suit::spades   ("Spades");
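
For a feel of what this buys the client, here is a sketch that condenses the header and source file above into a single translation unit and adds a small client. The function `sortedNames` is a hypothetical client, not part of the original code. Note that the ordinals, and therefore the ordering, follow the order in which the four constants are defined.

```cpp
#include <cassert>
#include <ostream>
#include <set>
#include <sstream>
#include <string>

class Suit
{
private:
  static int nextOrdinal ()
  {
    static int next = 0;   // function-local, so safe during static init
    return next++;
  }

  struct SuitBody
  {
    SuitBody (const std::string& name, int ordinal)
      : name_ (name), ordinal_ (ordinal) {}
    const std::string name_;
    const int ordinal_;
  };

  const SuitBody *p;       // one shared body per enumerator, never freed

  Suit (const std::string& name) : p (new SuitBody (name, nextOrdinal ())) {}

public:
  std::string toString () const { return p->name_; }

  friend bool operator<  (Suit x, Suit y) { return x.p->ordinal_ < y.p->ordinal_; }
  friend bool operator== (Suit x, Suit y) { return x.p == y.p; }
  friend bool operator!= (Suit x, Suit y) { return ! (x == y); }

  static const Suit clubs, diamonds, hearts, spades;
};

inline std::ostream& operator<< (std::ostream& o, Suit s)
{
  return o << s.toString ();
}

const Suit Suit::clubs    ("Clubs");
const Suit Suit::diamonds ("Diamonds");
const Suit Suit::hearts   ("Hearts");
const Suit Suit::spades   ("Spades");

// Hypothetical client: collect suits in a set, then print them in
// ordinal order.  The duplicate insert is ignored by the set.
inline std::string sortedNames ()
{
  std::set<Suit> hand;
  hand.insert (Suit::spades);
  hand.insert (Suit::clubs);
  hand.insert (Suit::hearts);
  hand.insert (Suit::clubs);

  std::ostringstream out;
  for (std::set<Suit>::const_iterator i = hand.begin (); i != hand.end (); ++i)
    out << *i << ' ';
  return out.str ();
}
```

Clients cannot create new Suit values (the constructor is private) and cannot mix a Suit with an int, so the type safety is real.
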
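
That was not too hard. But C++ folks will tend to reject this solution, because a Suit cannot be used in a switch statement, and because each enumerator costs a heap allocation and each use a pointer indirection that a builtin enum does not.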

Implementing switch without integral conversions

It's possible to provide a switch-like construct that is typesafe and never involves conversion from Suit to an integral type. We use the visitor pattern (maybe this variant should be called the typesafe switch pattern) like this:

    class Suit
    {
    public:
      struct Switch
      {
	virtual void Clubs    () = 0;
	virtual void Diamonds () = 0;
	virtual void Hearts   () = 0;
	virtual void Spades   () = 0;
	void doit (Suit s) { s.visit (*this); }
	virtual ~Switch () {}
      };

    private:
      struct SuitBase
      { virtual void accept (Switch& s) const = 0; };

      struct Clubs: public SuitBase
      { virtual void accept (Switch& s) const { s.Clubs (); } };

      struct Diamonds: public SuitBase
      { virtual void accept (Switch& s) const { s.Diamonds (); } };

      struct Hearts: public SuitBase
      { virtual void accept (Switch& s) const { s.Hearts (); } };

      struct Spades: public SuitBase
      { virtual void accept (Switch& s) const { s.Spades (); } };

    public:
      static const Suit clubs;
      static const Suit diamonds;
      static const Suit hearts;
      static const Suit spades;

      friend bool operator== (Suit x, Suit y) { return x.p == y.p; }
      friend bool operator!= (Suit x, Suit y) { return ! (x == y); }

      inline Suit (const SuitBase *q) : p (q) {}
      inline void visit (Switch& x) const { p->accept(x); }

    private:
      const SuitBase *p;
    };

Client code can then do, e.g.:

    inline std::string toString (const Suit& s)
    {
      std::string str;

      // Local "switch" object: one override per enumerator.
      struct ToString : public Suit::Switch
      {
        std::string& str;
        ToString (std::string& alias) : str (alias) {}
        void Clubs ()    { str = "Clubs"; }
        void Diamonds () { str = "Diamonds"; }
        void Hearts ()   { str = "Hearts"; }
        void Spades ()   { str = "Spades"; }
      } sw (str);

      sw.doit (s);
      return str;
    }

This works like a switch statement and has the very significant advantage that a "missing case" will be flagged by the compiler.

But it's both less convenient and less efficient than builtin switch - the C++ programmer will not likely embrace this idiom.
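
For readers who want to try the typesafe switch, here is a self-contained sketch. The definitions of the four static constants and the virtual destructor on `SuitBase` are not shown in the text above and are our assumptions here; `isRed` is a hypothetical second client.

```cpp
#include <cassert>

class Suit
{
public:
  struct Switch
  {
    virtual void Clubs    () = 0;
    virtual void Diamonds () = 0;
    virtual void Hearts   () = 0;
    virtual void Spades   () = 0;
    void doit (Suit s) { s.visit (*this); }
    virtual ~Switch () {}
  };

private:
  struct SuitBase
  {
    virtual void accept (Switch& s) const = 0;
    virtual ~SuitBase () {}
  };
  struct Clubs    : SuitBase { void accept (Switch& s) const { s.Clubs (); } };
  struct Diamonds : SuitBase { void accept (Switch& s) const { s.Diamonds (); } };
  struct Hearts   : SuitBase { void accept (Switch& s) const { s.Hearts (); } };
  struct Spades   : SuitBase { void accept (Switch& s) const { s.Spades (); } };

public:
  static const Suit clubs, diamonds, hearts, spades;

  friend bool operator== (Suit x, Suit y) { return x.p == y.p; }
  friend bool operator!= (Suit x, Suit y) { return ! (x == y); }

  Suit (const SuitBase *q) : p (q) {}
  void visit (Switch& x) const { p->accept (x); }

private:
  const SuitBase *p;
};

// Assumed definitions: one body object per enumerator, never freed.
const Suit Suit::clubs    (new Clubs);
const Suit Suit::diamonds (new Diamonds);
const Suit Suit::hearts   (new Hearts);
const Suit Suit::spades   (new Spades);

// Hypothetical client: a "switch" that computes a bool.
inline bool isRed (Suit s)
{
  struct IsRed : public Suit::Switch
  {
    bool red;
    IsRed () : red (false) {}
    void Clubs ()    { red = false; }
    void Diamonds () { red = true; }
    void Hearts ()   { red = true; }
    void Spades ()   { red = false; }
    // Deleting any one of these four overrides leaves IsRed abstract,
    // so the declaration of `sw' below stops compiling.
  } sw;

  sw.doit (s);
  return sw.red;
}
```

The price is visible as well: each "switch" is a class definition plus two virtual calls, where the builtin switch is a jump.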

An efficient solution

Let's see if we can design a compromise that gives the client type safety without giving up the builtin switch.

We would like to provide a client with an API like this:
   Suit s = Suit::clubs;    // OK
   int y = Suit::clubs;     // COMPILE ERROR
   Suit::clubs + 1;         // COMPILE ERROR

   switch (s)               // OK
    {
      case Suit::clubs: ... // OK
    }
but this is clearly not achievable, since case labels must be integral constants. So we need two kinds of "constants":
   Suit::clubs   // of type Suit or convertible to Suit
   Suit::CLUBS   // of enum or int type.
and Suit is only convertible to enum type through explicit conversion.

A more achievable design for client code looks like this:

    Suit s = Suit::clubs;     // OK
    int y = Suit::clubs;      // COMPILE ERROR
    Suit::clubs + 1;          // COMPILE ERROR
    Suit::CLUBS + 1;          // OK
    Suit b (Suit::CLUBS + 1); // COMPILE ERROR
    Suit a = Suit::CLUBS;     // COMPILE ERROR

    switch (s.toEnum())       // OK
     {
       case Suit::CLUBS: ...  // OK
     }

If we represent each enumerator by a distinct singleton type, the obvious implementation steps lead us to:
    class Suit
    {
    public:
      enum SUIT_ { CLUBS, DIAMONDS, HEARTS, SPADES };

    private:
      SUIT_ S_;

      struct Clubs    {};
      struct Diamonds {};
      struct Hearts   {};
      struct Spades   {};

    public:
      Suit (const Clubs&)    : S_ (CLUBS)    {}
      Suit (const Diamonds&) : S_ (DIAMONDS) {}
      Suit (const Hearts&)   : S_ (HEARTS)   {}
      Suit (const Spades&)   : S_ (SPADES)   {}

      SUIT_ toEnum () const { return S_; }

      inline friend bool operator== (Suit s, Suit t) { return s.S_ == t.S_; }
      inline friend bool operator!= (Suit s, Suit t) { return ! (s == t); }

    public:
      static Clubs    clubs;
      static Diamonds diamonds;
      static Hearts   hearts;
      static Spades   spades;
    };
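
This implementation is maximally efficient: a Suit holds nothing but the enum value, so it is as small and as cheap to copy and compare as a builtin enum, and the literal types are empty.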
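
Here is a sketch of this version in use. The out-of-class definitions of the four static objects are assumed (the class above only declares them), `color` is a hypothetical client, and the `std::is_convertible` checks assume a C++11 or later compiler.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class Suit
{
public:
  enum SUIT_ { CLUBS, DIAMONDS, HEARTS, SPADES };

private:
  SUIT_ S_;

  struct Clubs    {};
  struct Diamonds {};
  struct Hearts   {};
  struct Spades   {};

public:
  Suit (const Clubs&)    : S_ (CLUBS)    {}
  Suit (const Diamonds&) : S_ (DIAMONDS) {}
  Suit (const Hearts&)   : S_ (HEARTS)   {}
  Suit (const Spades&)   : S_ (SPADES)   {}

  SUIT_ toEnum () const { return S_; }

  friend bool operator== (Suit s, Suit t) { return s.S_ == t.S_; }
  friend bool operator!= (Suit s, Suit t) { return ! (s == t); }

  static Clubs    clubs;
  static Diamonds diamonds;
  static Hearts   hearts;
  static Spades   spades;
};

// Assumed definitions of the stateless literal objects.
Suit::Clubs    Suit::clubs;
Suit::Diamonds Suit::diamonds;
Suit::Hearts   Suit::hearts;
Suit::Spades   Suit::spades;

// The "COMPILE ERROR" lines of the design, checked at compile time.
static_assert (!std::is_convertible<Suit, int>::value,
               "no implicit Suit -> int");
static_assert (!std::is_convertible<int, Suit>::value,
               "no implicit int -> Suit");
static_assert (!std::is_convertible<Suit::SUIT_, Suit>::value,
               "no implicit enum -> Suit");

// Hypothetical client: the conversion to the enum is explicit and
// happens only at the switch.
inline std::string color (Suit s)
{
  switch (s.toEnum ())
    {
    case Suit::CLUBS:
    case Suit::SPADES:   return "black";
    case Suit::DIAMONDS:
    case Suit::HEARTS:   return "red";
    }
  return "unknown";      // not reached for the four values above
}
```

The only way into a Suit is through one of the four literals, and the only way out is `toEnum()`.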

The Enum Literal idiom

There is also another subtle issue with some of the prior Suit implementations that should be pointed out. Because C++ is an unsafe language, it is possible for either Murphy or Machiavelli to modify the memory of one of the "Suit literals" like Suit::clubs. This is reminiscent of the bugs in ancient FORTRAN compilers where the value of a literal such as 7 could be changed by user code. But a literal like Suit::clubs doesn't actually need any state since its type completely determines its behavior. In fact, an optimizing C++ compiler can remove any references to the literal objects like Suit::clubs above, since the objects themselves are actually never used.

Naturally there are flaws...

    class Converter { public: Converter (Suit) {} };
    void f (Converter);

    Suit s = Suit::clubs;
    f (s);           // OK
    f (Suit::clubs); // COMPILE ERROR - bad

Unfortunately, the object Suit::clubs is not of type Suit (although it is convertible to Suit), and C++ will only apply one user-defined conversion at a time.
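
The rule can be checked in isolation. The sketch below strips Suit down to a single literal (with the literal type made public purely for brevity) and uses `std::is_convertible`, which assumes a C++11 or later compiler; `callWithLiteral` is a hypothetical helper showing the clumsy workaround left to the client.

```cpp
#include <cassert>
#include <type_traits>

class Suit
{
public:
  struct Clubs {};                  // type of the literal
  Suit (const Clubs&) {}            // user-defined conversion: Clubs -> Suit
  static Clubs clubs;
};
Suit::Clubs Suit::clubs;

// Second user-defined conversion: Suit -> Converter.
class Converter { public: Converter (Suit) {} };

inline int f (Converter) { return 1; }

static_assert ( std::is_convertible<Suit::Clubs, Suit>::value,
               "one user-defined conversion is applied implicitly");
static_assert ( std::is_convertible<Suit, Converter>::value,
               "one user-defined conversion is applied implicitly");
static_assert (!std::is_convertible<Suit::Clubs, Converter>::value,
               "two chained user-defined conversions are not");

// Workaround: the client spells out the first conversion by hand.
inline int callWithLiteral () { return f (Suit (Suit::clubs)); }
```

Having to write `Suit (Suit::clubs)` defeats the purpose of a literal, so the workaround is not good enough.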

There is a solution to this as well. In C++, the inheritance relationship is special (not an ordinary user defined conversion), so if Suit::clubs is of a type derived from Suit, the above example will work. C++ will make us jump through some hoops to have a nested class inherit from its enclosing class, but nothing insurmountable.

    class Suit
    {
    public:
      enum SUIT_ { CLUBS, DIAMONDS, HEARTS, SPADES };

    private:
      SUIT_ S_;

      class Clubs;
      class Diamonds;
      class Hearts;
      class Spades;

    public:
      Suit (const Clubs&)    : S_ (CLUBS)    {}
      Suit (const Diamonds&) : S_ (DIAMONDS) {}
      Suit (const Hearts&)   : S_ (HEARTS)   {}
      Suit (const Spades&)   : S_ (SPADES)   {}

      SUIT_ toEnum () const { return S_; }

      inline friend bool operator== (Suit s, Suit t) { return s.S_ == t.S_; }
      inline friend bool operator!= (Suit s, Suit t) { return ! (s == t); }

    public:
      static Clubs    clubs;
      static Diamonds diamonds;
      static Hearts   hearts;
      static Spades   spades;
    };

    class Suit::Clubs : private Suit
    {
      friend class Suit;
      Clubs () : Suit (*this) {};
      void operator& () const;
    };

    class Suit::Diamonds : private Suit
    {
      friend class Suit;
      Diamonds () : Suit (*this) {};
      void operator& () const;
    };

    ...

This fixes the above bug. Is this a new use for private inheritance? Strangely enough, the "Suit Literals" like Suit::clubs contain data (inherited from Suit), but they are never referenced, and as before they can be optimized away. Because everything there is to know about Suit::clubs is implicit in its static type, the compiler can optimize uses of Suit::clubs more than it can optimize uses of Suit. (Admittedly, this is a very small effect).

Furthermore, we prevent clients from taking the address of a literal or binding a Suit reference to it, i.e. the following are both compile errors:
   void *p = (void *) &Suit::clubs; // ERROR: operator& is private
   const Suit& s = Suit::clubs;     // ERROR: `Suit' is an inaccessible base

Thus these enum literals effectively have no location in memory and effectively no data. They are "pure values", like builtin data literals such as 42.

We dub this the "Enum Literal" idiom.

The Private Inherited Enum idiom

Another "buglet" is that the enumeration type Suit::SUIT_ is public. Clients should never create a variable of the enumeration type Suit::SUIT_. They should always use Suit instead.

We want clients to refer to the enum constants like Suit::CLUBS (and only "at the last moment"), but not to their type.

One might think this would be impossible, since the enumeration type and corresponding constants share a single declaration. But it can be done, thus:

    class SuitEnumInheritanceTrick_
    {
    private:
      friend class Suit;
      enum SUIT_ { CLUBS, DIAMONDS, HEARTS, SPADES };
    };

    class Suit : private SuitEnumInheritanceTrick_
    {
      // Expose the names of the enumeration constants, while keeping the
      // name of the enumeration type private!
    public:
      using SuitEnumInheritanceTrick_::CLUBS;
      using SuitEnumInheritanceTrick_::DIAMONDS;
      using SuitEnumInheritanceTrick_::HEARTS;
      using SuitEnumInheritanceTrick_::SPADES;
    private:
      using SuitEnumInheritanceTrick_::SUIT_;
    ...
    };

This may be another novel use of private inheritance in C++. We dub this the Private Inherited Enum idiom.

It may seem bizarre that clients can manipulate objects whose type is private. Privacy in C++ protects names, not types. Privacy is a low-level mechanism.

Type privacy does not prevent object piracy.
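
A small sketch of that slogan, independent of Suit. All names here (`Vault`, `Secret`, `peek`, `keep`, `pirate`) are hypothetical. The client never writes the private name, yet template argument deduction hands it the type anyway.

```cpp
#include <cassert>

class Vault
{
private:
  struct Secret { int value; };     // the *name* Vault::Secret is private

public:
  static Secret make () { Secret s = { 42 }; return s; }
};

// Deduction gives T = Vault::Secret without the client naming it.
template <class T>
inline int peek (const T& t) { return t.value; }

// The client can even declare and copy variables of the private type.
template <class T>
inline T keep (const T& t) { T copy = t; return copy; }

inline int pirate () { return peek (keep (Vault::make ())); }
```

Writing `Vault::Secret s;` in client code would be rejected, but as the sketch shows, that protects only the spelling.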

There's an alternative programming language concept called Confined Types that works at the object level as intuition would suggest. We are not aware of any implementations.

In the real world, unfortunately, the software engineering benefits are not worth the costs of the obscurity of the technique. In C++, since users can always cheat, most obviously by

    int x = Suit::CLUBS;
protection is never absolute. Although the Private Inherited Enum idiom is a cool hack, we don't use it.

A more practical way to discourage saving Suit values in the less safe enum form is:
    class Suit
    {
    public:
      enum private_SUIT_ { CLUBS, DIAMONDS, HEARTS, SPADES };
    private:
      typedef private_SUIT_ SUIT_;
      ...
    };

Eliminating redundant code

Our previous implementation fulfills all the design criteria for the library's client API.

For the library implementer, however, it would be nice if we could lighten the implementation burden by reducing the number of times each enumerated name like clubs must be repeated in the code. The obvious technique for removing code duplication in C++ is templates.

Here's the "final" version:

    class Suit
    {
    public:
      enum private_SUIT_ { CLUBS, DIAMONDS, HEARTS, SPADES };

    private:
      typedef private_SUIT_ SUIT_;
      SUIT_ S_;

      template <SUIT_ S> class Literal;

    public:
      template <SUIT_ S>
      Suit (const Literal<S>&) : S_ (S) {}

      SUIT_ toEnum () const { return S_; }

      inline friend bool operator== (Suit s, Suit t) { return s.S_ == t.S_; }
      inline friend bool operator!= (Suit s, Suit t) { return ! (s == t); }

      static const Literal<CLUBS>    clubs;
      static const Literal<DIAMONDS> diamonds;
      static const Literal<HEARTS>   hearts;
      static const Literal<SPADES>   spades;
    };

    template <Suit::private_SUIT_ S>
    class Suit::Literal : private Suit
    {
    public:
      Suit::private_SUIT_ toEnum () const { return S; }
    private:
      friend class Suit;
      Literal () : Suit (*this) {}
      // Prevent treating literals as objects
      void* operator new (size_t);      // outlawed
      void  operator delete (void*);    // outlawed
      void  operator= (const Literal&); // outlawed
      void* operator& () const;         // outlawed
    };

Although the previous implementations can be seen as pure experiments, I recommend this implementation for serious use.
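
To see the pieces working together, here is a sketch that repeats the final version in one translation unit and adds a client. The definitions of the four static literals are assumed (the text only declares them), `size_t` is spelled `std::size_t`, and `Converter` and `name` are hypothetical client code modeled on the earlier `f (Converter)` example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class Suit
{
public:
  enum private_SUIT_ { CLUBS, DIAMONDS, HEARTS, SPADES };

private:
  typedef private_SUIT_ SUIT_;
  SUIT_ S_;

  template <SUIT_ S> class Literal;

public:
  template <SUIT_ S>
  Suit (const Literal<S>&) : S_ (S) {}

  SUIT_ toEnum () const { return S_; }

  friend bool operator== (Suit s, Suit t) { return s.S_ == t.S_; }
  friend bool operator!= (Suit s, Suit t) { return ! (s == t); }

  static const Literal<CLUBS>    clubs;
  static const Literal<DIAMONDS> diamonds;
  static const Literal<HEARTS>   hearts;
  static const Literal<SPADES>   spades;
};

template <Suit::private_SUIT_ S>
class Suit::Literal : private Suit
{
public:
  Suit::private_SUIT_ toEnum () const { return S; }
private:
  friend class Suit;
  Literal () : Suit (*this) {}
  // Prevent treating literals as objects
  void* operator new (std::size_t);  // outlawed
  void  operator delete (void*);     // outlawed
  void  operator= (const Literal&);  // outlawed
  void* operator& () const;          // outlawed
};

// Assumed definitions of the four literals.
const Suit::Literal<Suit::CLUBS>    Suit::clubs;
const Suit::Literal<Suit::DIAMONDS> Suit::diamonds;
const Suit::Literal<Suit::HEARTS>   Suit::hearts;
const Suit::Literal<Suit::SPADES>   Suit::spades;

// Hypothetical client type that is itself constructed from a Suit.
class Converter
{
public:
  Converter (Suit s) : suit (s) {}
  Suit suit;
};

// Accepts a Suit or a literal: the literal-to-Suit step is a
// derived-to-base conversion, so only one user-defined conversion
// (Suit -> Converter) is needed.
inline std::string name (Converter c)
{
  switch (c.suit.toEnum ())
    {
    case Suit::CLUBS:    return "Clubs";
    case Suit::DIAMONDS: return "Diamonds";
    case Suit::HEARTS:   return "Hearts";
    case Suit::SPADES:   return "Spades";
    }
  return "?";            // not reached for the four values above
}
```

Adding a fifth suit now means touching three places: the enum, the declaration of the literal, and its definition.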

Complete Source Code

Complete sample programs for all the above variants, together with a test suite, are available here.


Last modified: Sat Jan 25 22:50:21 PST 2003
Copyright © 2003 Martin Buchholz