Boost C++ Libraries

...one of the most highly regarded and expertly designed C++ library projects in the world. Herb Sutter and Andrei Alexandrescu, C++ Coding Standards

Boost.Flyweight Tutorial: Configuring Boost.Flyweight



Contents

Configurable aspects of Boost.Flyweight

Most of the time, flyweight default configuration is just good enough and the user need not care about further tuning of her flyweight instantiations; however, when the necessity for more control over Boost.Flyweight behavior arises, comprehensive mechanisms are provided to select, configure and even extend the following implementation aspects:

Free-order template parameter interface

The flyweight class template features a "smart" specification interface by which the configuration aspects can be provided as optional template arguments in whatever order the user pleases. For instance, a tagged flyweight of std::strings with a set-based factory and no tracking can be specified like this:

flyweight<std::string, tag<label_t>,  set_factory<>, no_tracking   >

or like this:

flyweight<std::string, no_tracking,   tag<label_t>,  set_factory<> >

or in any other order; only std::string is required to occupy the first place in the specification.

Header inclusion

The example code shown at the introductory section uses the "boost/flyweight.hpp" convenience header, which simply includes the headers for the class template flyweight and its default configuration components:

#include <boost/flyweight/flyweight.hpp>      // class template flyweight
#include <boost/flyweight/hashed_factory.hpp> // hashed flyweight factory
#include <boost/flyweight/static_holder.hpp>  // regular factory instantiation
#include <boost/flyweight/simple_locking.hpp> // simple locking policy
#include <boost/flyweight/refcounted.hpp>     // refcounting tracking policy

When using components other than these, their specific headers must be explicitly included.

Tagging

Consider the following two types:

typedef flyweight<std::string> name_t;
typedef flyweight<std::string> ip_address_t;

Although technically both types are identical, this is so by virtue of coincidence, as there is no sensible relation between names and IP addresses. Internally, the fact that name_t and ip_address_t are the same flyweight type causes values of both classes to be stored together in the same flyweight factory, although their respective ranges are not expected to overlap. Tagging can be used to turn these into really different types:

struct name_tag{};
typedef flyweight<std::string,tag<name_tag> > name_t;

struct ip_address_tag{};
typedef flyweight<std::string,tag<ip_address_tag> > ip_address_t;

Now, name_t and ip_address_t are different flyweight classes having separate factories each. Tags are a purely syntactic device: any type can be used for tagging inside the tag construct, though good style recommends using tag classes with descriptive names which are local to the context where the flyweight type is being defined.

Factory specification

flyweight uses a type of internal component called factory whose purpose is to store and retrieve the different values flyweight objects refer to at a given time. By default, a factory based on a hashed container is used, so that flyweight<T> is actually equivalent to

flyweight<T,hashed_factory<> >

where hashed_factory is a so-called factory specifier. Boost.Flyweight provides several predefined factory specifiers, which not only let the user select the specific type of factory used, but also accept their own template arguments to customize each factory.

Types involved in the configuration of factories

A given flyweight instantiation has associated flyweight::key_type and flyweight::value_type types (which are equal in the case of regular flyweights or different if key-value flyweights are used). Also, there is an internal Entry type which corresponds to the type of the objects actually stored in the factory: Entry contains the shared value_type objects of flyweight as well a some internal bookkeeping information; also, Entry is implicitly convertible to const key_type&, so that factories can rely on key_type to look up Entries. Since Entry is internal to the implementation of flyweight, it cannot be directly referred to by the user in the configuration of factories. Instead, the proxy placeholder type boost::mpl::_1 can be used.

hashed_factory

Header: "boost/flyweight/hashed_factory.hpp"
Syntax: hashed_factory<[Hash[,Pred[,Allocator]]]>

This specifier, which Boost.Flyweight takes by default, controls the usage of a factory internally based on a hash container. Values are determined to be equivalent by means of the BinaryPredicate Pred, and indexed into the factory container using Hash, which is assumed to be a Hash function object for arguments of type Key. The Allocator parameter is used by the factory container for its memory allocation needs. The default types for these parameters are such that the expression

flyweight<T,hashed_factory<> >

is equivalent to

flyweight<
  T,
  hashed_factory<
    boost::hash<key_value>,
    std::equal_to<key_value>,
    std::allocator<boost::mpl::_1>
  >
>

where key_type is the key type of the flyweight and boost::mpl::_1, as explained above, stands for the internal Entry type of the elements stored in the factory. Suppose we would like to configure hashed_factory for a std::string flyweight with a special hash predicate special_hash and a custom allocator custom_allocator; this would be specified as follows:

flyweight<
  std::string,
  hashed_factory<
    special_hash<std::string>,
    std::equal_to<key_value>,
    custom_allocator<boost::mpl::_1>
  >
>

concurrent_factory

Header: "boost/flyweight/concurrent_factory.hpp"
Syntax: concurrent_factory<[Hash[,Pred[,Allocator]]]>

This specifier provides a factory based on a concurrent container from Boost.Unordered. The factory is particularly suitable for flyweight creation in multithreaded scenarios as it does not require external synchronization or tracking; so, it should be generally used in conjunction with no_locking and no_tracking:

typedef flyweight<
  std::string,
  concurrent_factory<>,
  no_locking,
  no_tracking
> concurrent_string_flyweight;

Unused values (those no longer referred to by any flyweight), are periodically erased from the factory by a built-in garbage collector.

The default types for concurrent_factory parameters are such that the expression

flyweight<T,concurrent_factory<> >

is equivalent to

flyweight<
  T,
  concurrent_factory<
    boost::hash<key_value>,
    std::equal_to<key_value>,
    std::allocator<boost::mpl::_1>
  >
>

with the same meaning as in hashed_factory.

set_factory

Header: "boost/flyweight/set_factory.hpp"
Syntax: set_factory<[Compare[,Allocator]]>

set_factory resorts to an std::set-like ordered container for the implementation of the flyweight factory. Compare must be a BinaryPredicate inducing a strict weak ordering on the value type flyweight is acting upon; as is customary with STL ordered containers, two values are considered equivalent if none is less than the other according to Pred. Allocator is an allocator type passed along to the factory internal container for its memory-related tasks. When default parameters are used, the expression

flyweight<T,set_factory<> >

is equivalent to

flyweight<
  T,
  set_factory<std::less<key_type>,std::allocator<boost::mpl::_1> >
>

Usual tradeoffs arising in the comparison of ordered and hashed containers also apply when choosing between set_factory and hashed_factory: so, set-based lookup and insertion of values are generally slower than those based on hashing, but the latter can be affected by pathological worst-case scenarios with very poor performance.

assoc_container_factory

Header: "boost/flyweight/assoc_container_factory.hpp"
Syntax: assoc_container_factory<ContainerSpecifier>

This specifier can be seen as a generalization of hashed_factory and set_factory where the user supplies the exact type of container on which the factory is based. The way in which the container is specified might seem at first a little daunting to those unfamiliar with the Boost MPL Library: ContainerSpecifier must be an MPL Lambda Expression such that, when invoked with the types Entry and key_type explained above, it produces the type of a container of Entry elements satisfying the following requirements:

  1. The container type must be a model of AssociativeContainer or UnorderedAssociativeContainer with unique keys, where equivalence of Entrys is determined by the key_type values the entries are convertible to.
  2. The container must be stable, i.e. its iterators must remain valid after insert and erase operations. Note that this condition is not met by many existing implementations of hashed containers that invalidate iterators upon a rehashing operation.

Let us see what a container specifier looks like with an example. Suppose we have our own ordered container like the following:

template<
  typename Elem,
  typename Compare=std::less<Elem>,
  typename Allocator=std::allocator<Elem>
>
class ultrafast_set
{
  ...
};

Then ultrafast_set can be plugged into assoc_container_factory like this:

typedef flyweight<
  std::string,
  assoc_container_factory<
    // MPL lambda expression follows
    ultrafast_set<mpl::_1,std::less<std::string> >
  >
> flyweight_string;

As has been explained, mpl::_1 is a so-called MPL placeholder standing as a "slot" to be replaced with Entry by the internal machinery of Boost.Flyweight. Note that we have not relied on the default argument of ultrafast_set for Compare and instead we have provided a fixed instantiation for std::string: this is so because requirements state that the type with which ContainerSpecifier will be filled in internally is convertible to const key_type& (here const std::string&), and it is based on key_type that lookup and equivalence of entries should be determined. On the other hand, the default argument for the Allocator parameter works just fine, as is more apparent if we write it down explicitly:

typedef flyweight<
  std::string,
  assoc_container_factory<
    ultrafast_set<
      mpl::_1,
      std::less<std::string>,
      std::allocator<mpl::_1>
    >
  >
> flyweight_string;

Holder specification

Each flyweight type, that is, each distinct instantiation of the class template flyweight, is associated with exactly one factory object. In most cases, how this factory object is created is of little importance to the user of Boost.Flyweight, but there are special circumstances where control of this aspect is necessary. An internal component called holder is in charge of instantiating the factory class and some other internal information; this component is stipulated by means of a holder specifier, static_holder being the default one.

static_holder

Header: "boost/flyweight/static_holder.hpp"
Syntax: static_holder

This the default holder specifier of Boost.Flyweight, and produces holders where the unique factory lives as a local static variable of the program.

intermodule_holder

Header: "boost/flyweight/intermodule_holder.hpp"
Syntax: intermodule_holder

In most C++ environments, static variables do not mix well with dynamically loaded modules in the sense that instances of the same static variable can be duplicated across different modules, even though by definition the variable should be unique. In many cases, this duplication goes unnoticed if the modules do not communicate between each other using the affected types, but consider this case where such communication does happen:

// module 1

typedef flyweight<std::string> flyweight_string;

// produce_string is exported so that it can be dynamically
// linked

flyweight_string produce_string()
{
  return flyweight_string("boost");
}
// main program

typedef flyweight<std::string> flyweight_string;

int main()
{
  ... // import module 1

  flyweight_string str1=produce_string();
  flyweight_string str2("boost");
  assert(str1==str2);
}

In many environments, this program results in an assertion failure because the flyweight factory object used by flyweight_string as seen within module 1 is not the same factory object as seen within the main program: hence the value representations internally pointed to by str1 and str2 will differ and will be mistakenly considered as not equal. Many other problems might arise due to factory duplication, including undefined behavior.

intermodule_holder specifies a factory holder which is capable of avoiding the duplication problem and ensuring that all modules of a program are using the same factory instance. To fix the example above, it suffices to redefine flyweight_string in both modules as:

typedef flyweight<std::string,intermodule_holder> flyweight_string;

intermodule_holder is considerably more onerous than static_holder in terms of compilation times and introduces a non-negligible overhead at program start-up, so its use should be reserved to the situations where it is really necessary.

Locking policies

The internal factory associated to each flyweight type is a shared resource and as such access to it must be properly synchronized in multithreaded environments. A locking policy specifies the synchronization mechanisms to be used for this purpose.

simple_locking

Header: "boost/flyweight/simple_locking.hpp"
Syntax: simple_locking

This is the default locking policy. It specifies the simplest native synchronization primitives provided by the operating system, whenever available.

no_locking

Header: "boost/flyweight/no_locking.hpp"
Syntax: no_locking

No synchronization is enforced so that irrestricted internal access to the implementation shared resources is allowed. Selecting no_locking results in somewhat faster execution than the default simple_locking, but it renders the type thread-unsafe, which can have catastrophic consequences. This policy should not be used except in single-threaded environments or when there is an absolute guarantee that the particular flyweight type will not be used in a concurrent scenario.

Tracking policies

A tracking policy controls the lifetimes of the flyweight objects and can act based on this information. For instance, a suitable tracking mechanism can determine when a given value stored in the factory can be safely erased because it is no longer referenced by any flyweight; this is precisely what the default tracking policy, refcounted, does.

refcounted

Header: "boost/flyweight/refcounted.hpp"
Syntax: refcounted

This tracking policy determines that values stored in the factory be equipped with reference counting mechanisms so that a factory entry is erased when the last flyweight object associated to it is destroyed.

no_tracking

Header: "boost/flyweight/no_tracking.hpp"
Syntax: no_tracking

No flyweight tracking is done when this policy is selected, which implies that the values stored in the factory remain in it until program termination. As compared with refcounted, no_tracking presents advantages and drawbacks. The benefits are:

whereas potential drawbacks of using no_tracking include:




Revised September 27th 2024

© Copyright 2006-2024 Joaquín M López Muñoz. Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)