C++ Multiple Thread Concurrency with data in Classes 2

Following on from my previous article, here is the second approach which uses a mutex to protect the data within the class from multiple threads. One of the main things you want to do when dealing with threading and concurrency with data is to encapsulate and localize the locking in one place so the details of locking are completely hidden from the user of the class.

The rules to follow for this pattern are very simple…

The data variables must be private within the class
The mutex that protects the data must also be private to the class
You should never return const references to your data in member functions, which means you must return copies of the data you need to access. This is ok for basic types but for strings and more complex types will mean copying data, or using shared pointers. I will cover both examples.

So, let’s start with making a basic class like in the previous article, the goal of the class is to contain a list of config params (key/value pairs) and to provide a 100% thread-safe interface to read/set values in this data set.

class config_data_t
{
   std::map<std::string, std::string> _config_vals;
   std::mutex _config_vals_mtx;

public:
   std::string get_val(const std::string& name) const
   { 
      // Lock our data, prevent any other thread while we are here
      std::lock_gaurd<std::mutex> locker(_config_vals_mtx);

      auto v = _config_vals.find(name);
      if(v !=  _config_vals.end())
         return v->second;
      return ""; 
   }

   void set_val(const std::string& name, const std::string& val);
   {
      // Lock our data, prevent any other thread while we are here
      std::lock_gaurd<std::mutex> locker(_config_vals_mtx);

      _config_vals[name] = val; 
   }
}

This should be self-explanatory, the getter and setter function acquire an exclusive lock before touching the _config_vals private member. The implementation is 100% thread-safe because only one thread can operate on the _config_vals data at any one time.

You will note that the get_val() function returns a std::string by value, the string copy happens on line 14.

Now let’s take a look at what happens if the above class was working with larger data value sizes, for this example I have replaced the std::string with a blob_t which could, for example, be any data size from 60k to 10Mb of data. The following modified class uses this new type

class config_data_t
{
   using blob_t = std::vector<char>;
   std::map<std::string, blob_t> _config_vals;
   std::mutex _config_vals_mtx;

public:
   blob_t get_val(const std::string& name) const
   { 
      // Lock our data, prevent any other thread while we are here
      std::lock_gaurd<std::mutex> locker(_config_vals_mtx);

      auto v = _config_vals.find(name);
      if(v !=  _config_vals.end())
         return v->second;
      return ""; 
   }

   void set_val(const std::string& name, const blob_t& val);
   {
      // Lock our data, prevent any other thread while we are here
      std::lock_gaurd<std::mutex> locker(_config_vals_mtx);

      _config_vals[name] = val; 
   }
}

Both the setter and getter are going to make copies of this data, and that would be a horrible implementation if the data sizes are large. In this case, I would want to implement this differently, and this is where the magic of std::shared_ptr comes into play, so let us re-work the above and discuss it below.

class config_data_t
{
   using blob_t = std::vector<char>;
   using blob_ptr_t = std::shared_ptr<const blob_t>;
   std::map<std::string, blob_ptr_t> _config_vals;
   std::mutex _config_vals_mtx;

public:
   const blob_ptr_t get_val(const std::string& name) const
   { 
      // Lock our data, prevent any other thread while we are here
      std::lock_gaurd<std::mutex> locker(_config_vals_mtx);

      auto v = _config_vals.find(name);
      if(v !=  _config_vals.end())
         return v->second;
      return nullptr; // Or throw an exception if you prefer
   }

   void set_val(const std::string& name, blob_ptr_t&& val);
   {
      // Lock our data, prevent any other thread while we are here
      std::lock_gaurd<std::mutex> locker(_config_vals_mtx);

      _config_vals.emplace(name, std::move(val)); 
   }
}

Now the change is quite subtle, but the overall performance of this will be much better, and still 100% thread-safe, essentially through the interface we are aiming to eliminate data copies. This is also using features that were introduced in C++11 so please be aware of this.

Ok, let’s start with the data storage, we now have a map of names to shared pointers, where the shared pointer is pointing to (and owning) a std::vector containing some arbitrarily large size of data. So first, let’s look at the get_val() which is now returning a shared pointer to one of the data items in the map. We are still doing a copy (line 14), but all we are copying is the pointer, not the data it points to. So once you call this get_val() function, the same block of data is now pointed to by both the shared pointer that was returned to you AND the pointer held in the _config_vals map. The data that is returned to you should be treated as a const value, in other words you must not modify or change the data in the vector that your pointer points to – this is critical. If you need to change a value you must call set_val() instead

When you call set_value() you are passing in a pointer to a data block that you created, and what we do here is pass the shared_ptr in by R-Value, which allows us to take ownership of the data block you have already created, removing the need for a data copy

Now let us suppose, the value we are setting, already has another value set, and – that other value was previously obtained by you, so you are holding a shared_ptr to it. Now, when you set a new value, the pointer in the map to the old data is destroyed, but because there is another pointer to that same data, the data its self remains until the last instance of a shared_ptr is destroyed.

So the principle here, like the class in the first article, you essentially never change data you can have a reference to, you simply replace it each time you do a change. This is what the shared_ptr is for and this pattern makes for a very thread-safe implementation.

This is very basic stuff, but you will be surprised how many times I have seen this done badly, even by me… it’s simple if you keep the rules simple and it is unlikely to go wrong if your user can use the class without having to think about threading/concurrency issues at all.

This content is published under the Attribution-Noncommercial-Share Alike 3.0 Unported license.

Leave a Reply Cancel reply