Monday, February 13, 2012

Introducing polymorphic_collections: enumerator

Polymorphic_collections is a C++ library which provides a set of wrapper objects which expose an interface for performing operations on a collection without specifying the underlying type of the collection. For example, enumerator<> is an adapter which allows a collection to be enumerated through:

int sum(enumerator<int> e)
{
     int result = 0;
     while (auto i = e.next())
     {
      // Note: i is a boost::optional<int&>, so we need to dereference it to get
      // at the value.
      result += *i;
     }
}

We can then call this function using enumerator<> to wrap a variety of collection types:

std::vector<int> v;
sum(v);

std::list<int> l;
sum(l);

int a[10];
sum(a);

Using enumerator<> allows some of the same flexibility in design that templates provide, but in locations where it would be impossible or unfeasible to use templates. For example, virtual functions:

struct Foo
{
     virtual void Baz(enumerator<int> e) = 0;
};

Enumerator is an extremely simple interface. It exposes only a single method, next(). The return type of next() is boost::optional<T&>, where T is enumerator's template parameter. When the enumerator has exhausted the collection, it will return an empty optional object. There is no mechanism for resetting an enumerator, retrieving an object without advancing the enumeration, or traversing the collection in any other way. Enumerators cannot be copied, but they can be moved.

Enumerators can wrap collections of a different type, so long as a reference of the enumerator's type can be taken of the collection's type. For example:

std::vector<int> v;
enumerator<const int> e1 = v; // Okay, const int& = int is allowed
enumerator<float> e2 = v; // Error, float& = int is not allowed

struct Base { };
struct Derived : public Base { };
std::vector<Derived> v;
enumerator<Base> e = v; // Okay, Base& = Derived is allowed

Enumerators normally allow modification of the objects within the collection. You can prevent this by specifying a const type as the enumerator's template parameter.

std::vector<int> v;
enumerator<int> e = v;
*e.next() = 1;        // Equivalent to *v.begin() = 1

enumerator<const int> e = v;
*e.next() = 1;        // Error, can't assign to const int&

Normally, an enumerator contains only a reference to the underlying collection. If assigned to an rvalue, the enumerator can embed the collection in itself. This is handy for returning enumerators from functions:

enumerator<int> foo()
{
     std::vector<int> v;
     v.push_back(0);
     v.push_back(1);
     v.push_back(2);
     return std::move(v);      // Note: without the std::move, it will return an enumerator
            // referencing a temporary, which will crash.
}

Any STL-compatible collection exposing an iterator-based interface can be embedded in an enumerator.

Enumerators can encapsulate pointers with the size-specified at runtime, via the make_enumerator function:

int ar[5];
int* ptr = ar;
enumerator<int> e = make_enumerator(ptr, sizeof(ar) / sizeof(ar[0]));

Enumerators can wrap functions and function-like objects:

int a = 0, b = 1;
enumerator<int> fibo = [&] () -> boost::optional<int>
{
     int r = b + a;
     a = b;
     b = r;
     return r;
}

In order to be wrapped by enumerator, the callable object must return either a boost::optional<T> or a boost::optional<T&>. If it returns a value, then next() will return a reference to a temporary which will be destroyed on the following call to next().

Enumerators can be made thread-safe by specifying a locking policy via a template parameter:

void worker(enumerator<Task, atomic>& tasks)
{
     while (auto task = tasks.next())
     {
      // Perform task...
     }
}

void manager(enumerator<Task> tasks)
{
     // Create a locking enumerator from a non-locking one.
     // As enumerators cannot be copied, this requires consuming the
     // original enumerator.
     enumerator<Task, atomic> atomicTasks = std::move(tasks);
     
     for (size_t i = 0; i < THREAD_LIMIT; i++)
     {
      spawn([] () { worker(atomicTasks); });
     }
}

The atomic policy inserts a mutex lock into every call to next(). Because of its simple interface design, it is extremely easy to write thread-safe code using enumerator.

No comments:

Post a Comment