In this post I share a useful programming technique I first saw used twenty-some years ago while reading through some C code. The technique combined aspects of an array and a linked-list into a single data construct. I've here abstracted the technique into a reusable container I call a partition. (If someone knows of some library in some language that offers a similar container I'd appreciate a reference.)
A partition is a high performance container that organizes a set of N sequential integer IDs (which together are called the domain) into an arbitrary number of non-overlapping groups. (I.e., each ID of the domain can be in at most one group.) The functionality of a partition includes these constant time operations:
- Given an ID, determine the group to which the ID belongs (if any).
- Given a group, find an ID in that group (if any).
- Given an ID, move that ID to a particular group (simultaneously removing it from the group with which it was previously a member if any).
None of the above operations use the heap and are each considerably faster than even a single push to standard implementations of a doubly linked list.
A partition has not one, but two fundamental classes: Domain and Group. The set of sequential IDs in a partition are defined by a Domain object; and a Group is a list of IDs from a Domain. Domain and Group each have two templatization parameters M and V. IDs can be bound to a user-defined member object of type M and Groups can be bound to a user defined value object of type V. Together these associations enable mappings from objects of type M (with known IDs) to objects of type V, and conversely from objects of type V (with known Groups) to a list of objects of type M.
The partition is potentially useful any time objects need to be organized into groups; and that happens all the time! In this post, I show how you can use a partition to track the states of a set of like objects. This is just one possible usage of a partition and is intended only as a tutorial example of how you can use this versatile storage class.
Contents
Motivation
Suppose you have a set of N objects of some common type each having an independent state variable. Perhaps you have hardware monitoring the bit-error-rates on N communications ports and it is your job to categorize them into signal quality states of CLEAR, DEGRADED or FAILED. Or perhaps you are tracking the BUSY-IDLE states of N processors in a massively parallel computer processing environment. Or you might be tracking the activity of N threads in a thread pool. Obviously, such problems abound.
A common use-case for such systems is to find one or more objects in a particular state. For example, you may need to find one port in the CLEAR state for allocation to some new network communications service; or you may want to find and list all processors in the BUSY state to perform an audit. If the number of objects is small, you could simply iterate over them all until you find the objects with the desired state; but if you have many objects, this approach can be highly inefficient. To solve such problems for large sets, you will naturally use containers to hold objects that share a common state; but what type of container should you use?
For C++ you might group your objects into standard STL maps. Similarly, for Java you could use HashMaps from the java.util package. These classes provide convenient solutions, but such heavy-weight containers may be prohibitively slow for some applications.
A lighter-weight alternative is to use linked-lists, but then a state change of, for example, a circuit from CLEAR to DEGRADED now requires you first find the circuit in the CLEAR list so that it can be removed from that list before adding it to the DEGRADED list; this search can again be slow for large sets. In the C++ domain you can remedy this by giving each circuit object an iterator that tracks its own location in the linked-list of which it is currently a member. This is actually an effective (if slightly cumbersome) solution. In the Java domain, such an approach is not possible if you restrict yourself to standard library containers which suffer from the limitation that iterators point between two members and are in any case necessarily invalidated whenever the container they reference is updated. (Oh, the horror! I need a Prozac!)
The partition container is specialized to handle exactly this type of problem and is both easier to use and faster than any of the above alternatives.
An Example
The problem
A partition is often useful for solving resource management problems. For our working example, we'll track the state of a box of eight crayons being shared by my two girls Becky and Cate to draw pictures. (This is not the most compelling resource management problem, but it is at least easy to understand and certainly doesn't look like any proprietary software I've ever written.) I want to show how you might model a resource that has multiple states, so we'll give each crayon both a User
state (which can be one of BECKY
, CATE
, or BOX
), and a Quality
state (which can be one of SHARP
or DULL
). The programming problem is merely to efficiently keep track of this state information for all eight crayons and support a variety of use cases, like "find a sharp crayon in the box", "get the state of the red crayon", or "return all of Cate's crayons to the box".
A solution
#include "partition.hpp" #include <string> #include <iostream> #include <iomanip> using namespace std; using namespace partition; enum Color { NONE_AVAILABLE = -1, RED, ORANGE, YELLOW, GREEN, BLUE, VIOLET, BROWN, BLACK, NUM_CRAYONS }; string crayonName[] = { "red", "orange", "yellow", "green", "blue", "violet", "brown", "black" }; enum User { BOX, CATE, BECKY, NUM_USER }; string userName[] = { "Box", "Cate", "Becky" }; enum Quality { SHARP, DULL, NUM_QUALITY }; string qualityName[] = { "Sharp", "Dull" }; class CrayonState { private: User user; Quality quality; public: CrayonState() {} CrayonState(User user, Quality quality) : user(user), quality(quality) { } User getUser() const { return user; } Quality getQuality() const { return quality; } friend ostream& operator << (ostream& os, const CrayonState& cs); }; inline ostream& operator << (ostream& os, const CrayonState& cs) { return os << "(" << setw(5) << userName[cs.user] << ", " << setw(5) << qualityName[cs.quality] << ")"; } typedef Group<string,CrayonState> CrayonGroup; inline ostream& operator << (ostream& os, const CrayonGroup& g) { os << g.getValue() << ":"; for(CrayonGroup::ConstIterator i = g.front(); !i.isAfterBack(); ++i) os << " " << (*i).member; return os; } class CrayonManager { private: Domain<string,CrayonState> crayons; CrayonGroup state[NUM_USER][NUM_QUALITY]; public: CrayonManager() { for(int i = 0; i < NUM_CRAYONS; ++i) crayons.addEntry(i, crayonName[i]); for(int u = 0; u < NUM_USER; ++u) { for(int q = 0; q < NUM_QUALITY; ++q) { state[u][q].setDomain(crayons); state[u][q].setValue(CrayonState(User(u),Quality(q))); } } state[BOX][SHARP].addAll(); } const CrayonState& getState(Color c) const { return crayons.getValue(c); } User getUser(Color c) const { return crayons.getValue(c).getUser(); } Quality getQuality(Color c) const { return crayons.getValue(c).getQuality(); } void setState(Color c, User u, Quality q) { if(c != NONE_AVAILABLE) { cout << CrayonState(u,q) << " <- " << getState(c) << ": " << crayonName[c] << endl; state[u][q].addBack(c); } } void setUser(Color c, User u) { if(c != NONE_AVAILABLE) setState(c, u, getQuality(c)); } void setQuality(Color c, Quality q) { if(c != NONE_AVAILABLE) setState(c, getUser(c), q); } Color find(Quality q, User u = BOX) const { if(state[u][q].size() > 0) return (Color) state[u][q].peekFront().id; else return NONE_AVAILABLE; } Color findPreferred(Quality q = SHARP, User u = BOX) const { if(state[u][q].size() > 0) return (Color) state[u][q].peekFront().id; else return find(q == SHARP ? DULL : SHARP, u); } Color findPreferred(Color c, Quality q=SHARP, User u=BOX) const { if(getUser(c) == u) return c; else return findPreferred(q, u); } void moveAll(User from, User to, Quality q) { cout << CrayonState(to, q) << " <- " << state[from][q] << endl; state[to][q].addBack(state[from][q]); } void moveAll(User from, User to) { for(int q = 0; q < NUM_QUALITY; ++q) moveAll(from, to, Quality(q)); } friend ostream& operator << (ostream& os, const CrayonManager& cm); }; inline ostream& operator << (ostream& os, const CrayonManager& cm) { for(int u = 0; u < NUM_USER; ++u) for(int q = 0; q < NUM_QUALITY; ++q) os << cm.state[u][q] << endl; return os; } int main(int argc, char** argv) { CrayonManager manager; Color c; // Becky grabs three crayons, preferrably orange, blue, and a sharp. // She dulls the first two. // c = manager.findPreferred(ORANGE); manager.setState(c, BECKY, DULL); c = manager.findPreferred(BLUE); manager.setState(c, BECKY, DULL); c = manager.findPreferred(); manager.setUser(c, BECKY); // Cate grabs two crayons, preferrably blue and green. // She dulls the first one. // c = manager.findPreferred(BLUE); manager.setState(c, CATE, DULL); c = manager.findPreferred(GREEN); manager.setUser(c, CATE); // Becky returns all her crayons to the box... // manager.moveAll(BECKY, BOX); // ...and then grabs a sharp one... // c = manager.find(SHARP); manager.setUser(c, BECKY); // ...and makes it dull... // manager.setQuality(c, DULL); // Cate returns her dullest crayon to the box. // c = manager.findPreferred(DULL, CATE); manager.setUser(c, BOX); // Cate notices the sharp ones are disappearing fast, // so she grabs all the remaining sharp ones... // manager.moveAll(BOX, CATE, SHARP); // Becky gets mad and shows dad. // cout << "\\nFinal States:\\n" << manager << endl; return 0; }
Note: bypass animation by double-clicking the step buttons.
Define an enum for the 8 crayon colors. The enum values of these crayons (from 0 to 7) will form the ids of the Domain. Also define a string array that maps each crayon to its string name.
Likewise, define enums and string arrays for the User and Quality state variables.
Define a simple class to bind the two state variables together into a single compound state. One of these will be bound to each Group. They won't change value, so we won't need any setters.
Now create a class CrayonManager to track the crayon states. Start by giving CrayonManager a Domain to hold the crayons. We'll use the partition to map from a crayon color to a CrayonState; accordingly, the Domain has templatization parameters string and CrayonState. Alternatively, we could have defined two Domains (one to map crayons to user state, and another to map crayons to quality state), but it turns out to be more useful to have a single Domain that maps to a multi-valued state.
Next we'll need some Groups. Each Group will hold all the crayons in a particular compound state, so we define a double-array of Groups indexed by user state and quality state.
Initialize the domain by loading up all the crayon ids (zero through seven) and mapping each to its color string.
Initialize each Group by binding it to the Domain object and setting its value to the CrayonState it represents.
The last initialization step is to give all the crayons an initial state. We'll start with them all sharp and in the box.
Add a method to get the current compound state of a crayon. Remember that the state of a crayon is defined by its Group membership. The Domain's getValue() method finds the Group of the given crayon, and returns the user-assigned value of that Group (which is just the CrayonState object we set above). I've also added a couple of convenience methods to return just the crayon's User state and Quality state.
The setState method sets a crayon's full compound state by moving that crayon to the Group representing the given state. Note that we only have to add the crayon to its new Group — the crayon is removed from its previous Group automatically. I've also provided convenience methods for setting each component of a crayon's state independently. For simplicity, I'll largely be ignoring exceptional conditions, but here I at least go to the effort to ignore attempts to set the state of the NONE_AVAILABLE crayon.
Add a method to find a crayon with a particular quality and user state; if no crayon with that compound state exists, return NONE_AVAILABLE.
findPreferred will try to find a crayon with the requested quality and user state. If none exists, it will look for a crayon with the opposite quality but that same user state. If the user has no crayons at all, it returns NONE_AVAILABLE.
Add an overloaded version of findPreferred that first looks for a particular color in the given user state, but if that fails, it works just like the first version of findPreferred above.
Add a method to transfer all crayons of a particular quality from one user to another, and another method to move all crayons (independent of quality) from one user to another.
So we can see what's going on, update the state setters to print the state changes that are taking effect, and define a streaming operator for the crayon manager itself that prints the current state of all crayons.
Finally, add a main routine to exercise things. To do it right, I'd have to define what to do when any of the find methods return NONE_AVAILABLE, but for this tutorial I just pass these failures onto the state setters which ignore them.
Compile the program and run it.
> g++ -o crayons -I ../src crayons.cpp
> ./crayons
(Becky, Dull) <- ( Box, Sharp): orange
(Becky, Dull) <- ( Box, Sharp): blue
(Becky, Sharp) <- ( Box, Sharp): red
( Cate, Dull) <- ( Box, Sharp): yellow
( Cate, Sharp) <- ( Box, Sharp): green
( Box, Sharp) <- (Becky, Sharp): red
( Box, Dull) <- (Becky, Dull): orange blue
(Becky, Sharp) <- ( Box, Sharp): violet
(Becky, Dull) <- (Becky, Sharp): violet
( Box, Dull) <- ( Cate, Dull): yellow
( Cate, Sharp) <- ( Box, Sharp): brown black red
Final States:
( Box, Sharp):
( Box, Dull): orange blue yellow
( Cate, Sharp): green brown black red
( Cate, Dull):
(Becky, Sharp):
(Becky, Dull): violet
>
Conclusions
A software partition is an exceptionally fast container useful for organizing a set of like objects into groups. Applicable programming problems are common: I personally have used a partition-based approach to achieve elegant solutions to circuit state modeling problems, problems in graph theory, and others. If you too are a software developer, I suspect you'll find uses for Partition as well if you only look for them.
Software Downloads and Documentation
This software is protected by the GNU general public license version 3. This is free software (as defined by the License), but I'd very much appreciate it if you leave a comment to let me know if and/or how you've found the software useful.
I have not gone to the effort to put together a commercial license, but if someone is interested, I'll make one available.
Although I've used partition programming techniques at multiple companies during my software career, this design and implementation is new and should be considered experimental. I'm not sure I'm satisfied with the public interface (particularly due to arguably cumbersome method names that resulted from the decision to not include a reverse iterator). Future versions may not be backwards compatible. This software has been extensively unit-tested, but only with gcc version 4.4.3 on Ubuntu 10.04.4. Please, please contact me if you discover platform compatibility problems; bugs; design deficiencies; and/or documentation errors (including misspellings or grammatical errors). Thanks!
Downloads | Documentation | |
C++ | partition_c++_0.1.1.tgz | 0.1.1 |
partition_c++_0.1.1.zip | ||
Java | partition_java_0.1.tgz | 0.1 |
partition_java_0.1.zip |