Boost.MultiIndex - Eternity

很久沒發程式文了。
最近常到處在網路上亂逛，
發現 boost 在台灣真的是沒什麼人在用。
應該說 C++ 普及的程度實在太低了，
看來還是放一點東西比較好...
不過用 1280x1024 解析度 + IE7 來讀這篇大概 code 右半邊會無法顯示，
懶得玩複製貼上的話就乖乖切到 1920x1200 或用 firefox 吧 (字會小一點但是很難看)；
不然就等我有空去改改 css。

下面這張圖可以算是 Boost.MultiIndex 的教學文件裡最經典的圖了：

簡單來說就是 programmer 有時需要用不同的方式來 traverse 一個 container。
以上圖左上的框來說，
indexed by shape 很明顯是以圖形的邊數來排序，
從 begin 走到 end 的語意就是「由邊數少的走到邊數多的」。
上圖中上的框則是以圖形內的編號來決定順序，
從 begin 走到 end 的語意就是「由編號小的走到編號大的」。
除了這種 indexed by xxx 的用法外，
還有像右上角這種很單純就是一個 sequence 的擺法。
有些不熟 STL 的人可能對 sequence 這名詞感到陌生，
總之可以把它想像成一個 list 就對了。
裡面並沒有預設的排序方式，
從 begin 走到 end 的語意單純就是根據原本插入元素的順序而定。

一般 programmer 在應付這種需求時，
大都會製作三個 containers，
分別以不同的方式 sort 一遍，
然後在需要不同的排序需求時存取不同的 container。
而在變動其中一個 container 的內容時 (如插入、刪除元素)，
又需要設法同步更新另外兩個 container。
Boost.MultiIndex 所提供的是用來應付這類問題的 container。
讓同一個 container 可以吐出不同種類的 iterator，
若是利用其中一種 iterator 來新增或移除該 container 的元素，
也不用去煩惱前面所提到的同步更新問題。

要造出類似上圖的程式碼範例，
首先當然是需要一個代表 Shape 的 data structure：

#include <tr1/cstdint> 

using namespace std;

struct Shape {
	// 定義式自己想
	Shape(int32_t number, int32_t edges);

	int32_t number_;
	int32_t edges_;
};

#include <tr1/cstdint>

using namespace std;

struct Shape {

// 定義式自己想

Shape(int32_t number, int32_t edges);

int32_t number_;

int32_t edges_;

};

如果你很閒的話可以多擺上一堆其它有的沒的 data member 進去。

再來就是利用 multi_index_container<> 來定義一個 container type：

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/sequenced_index.hpp>

using namespace boost;
using namespace boost::multi_index;

typedef multi_index_container<
	Shape,
	indexed_by<
		ordered_unique<member<Shape, int32_t, &Shape::number_> >,
		ordered_non_unique<member<Shape, int32_t, &Shape::edges_> >,
		sequenced<>
	>
> ShapeContainer;

#include <boost/multi_index_container.hpp>

#include <boost/multi_index/ordered_index.hpp>

#include <boost/multi_index/member.hpp>

#include <boost/multi_index/sequenced_index.hpp>

using namespace boost;

using namespace boost::multi_index;

typedef multi_index_container<

Shape,

indexed_by<

ordered_unique<member<Shape, int32_t, &Shape::number_> >,

ordered_non_unique<member<Shape, int32_t, &Shape::edges_> >,

sequenced<>

> ShapeContainer;

只要你是人，
你都能從 code 和本文最開頭的那張圖裡推測出這個 typedef 是在做什麼。
總之在 member 的三個引數裡面，
第一個引數基本上一定會跟 multi_index_container<> 的第一個引數一致，
第三個引數丟的就是要拿來當排序依據的 data member address，
而第二個引數就是該 data member 的 type，
就這麼簡單。
至於那個 ordered 和 unique 的意思應該不需要浪費時間講解了吧？
其實我做那個 Shape 的 struct 還有偷懶，
圖上是 indexed by shape (我直接用 edges_ 代替)。
所以照理說應該放個 identity<> 搭上一個 operator< 用來比較 Shape 本身，
但是這個官網有其它範例教 identity<> 所以就不搞了。

接下來就是做出跟圖裡一樣的效果，
也就是把元素插進去，
這個小學生就會了：

	ShapeContainer container;

	container.get<2>().push_back(Shape(2, 3));
	container.get<2>().push_back(Shape(3, 6));
	container.get<2>().push_back(Shape(1, 5));
	container.get<2>().push_back(Shape(4, 4));

ShapeContainer container;

container.get<2>().push_back(Shape(2, 3));

container.get<2>().push_back(Shape(3, 6));

container.get<2>().push_back(Shape(1, 5));

container.get<2>().push_back(Shape(4, 4));

get<> 裡面的參數是 indexing 的方式。
0 1 2 對應於前面 typedef 中 indexed_by<> 內的順序。
ShapeContainer 只有 sequence 這種 indexing 方式提供 push_back() 介面，
其它兩個則要使用 insert()，
用哪種 indexing 方式插入元素其實都沒差。
另外不指定用哪種 indexing 方式的話則是使用 indexed_by<> 裡的第一項，
譬如下面兩行的語意是相同的：

	container.isnert(Shape(2, 3));
	container.get<0>().insert(Shape(2, 3));

1 2	container.isnert(Shape(2, 3)); container.get<0>().insert(Shape(2, 3));

再來就是 traverse 的方法了。
其實需要學的地方就是如何把 iterator 生出來。
如果是希望以邊數做排序來逐一印出元素的話就是這樣：

	ShapeContainer::nth_index<1>::type::iterator begin = container.get<1>().begin();
	ShapeContainer::nth_index<1>::type::iterator end = container.get<1>().end();

	// 自己定義一下 operator<<(ostream &, const Shape &) 吧
	copy(begin, end, ostream_iterator<Shape>(cout));

ShapeContainer::nth_index<1>::type::iterator begin = container.get<1>().begin();

ShapeContainer::nth_index<1>::type::iterator end = container.get<1>().end();

// 自己定義一下 operator<<(ostream &, const Shape &) 吧

copy(begin, end, ostream_iterator<Shape>(cout));

其實你高興的話可以直接把 iterator 的 type 另外 typedef 一下，
這樣可以少打很多字。
不過 typedef 出來的名稱可能要設法讓它容易被理解，
不然還是乾脆直接這樣還比較好懂一點。
如果是要 search，
基本上這跟用 STL 的方式差不多：

	ShapeContainer::nth_index<0>::type::iterator iter;

	// 找編號是 2 的元素
	iter = container.get<0>().find(2);

ShapeContainer::nth_index<0>::type::iterator iter;

// 找編號是 2 的元素

iter = container.get<0>().find(2);

基本上 ordered 和 unordered 這些都像 map 和 unordered_map 一樣是 key based，
而 unique 和 non_unique 的差別就像 map 和 multi_map 一樣。

既然說跟 map 一樣，
那我想應該很多人就知道 map 的 key 一般是不可更改的；
而 map 跟 multi_index_container 有個最大差異，
那就是 map 把 key 跟 value 分開來塞在 pair<> 裡；
至於 multi_index_container 則是整塊 class/struct 塞進去，
所以它吐出來的 iterator 其實是 immutable 的。
那怎麼辦呢？
Boost.MultiIndex 有提供 replace()、modify() 和 modify_key() 三種改法。
其實看名字就能猜出它們是在幹嘛，
所以就不多做介紹了；
但是通常都免不了需要寫一個 functor 來改。
這時熟悉 Boost.Lambda 的人會有一些小技巧省去寫 functor 的時間，
這些官方文件都有範例就不介紹了。
這樣講好像太機車，
還是沿用上面的 search 例子來個修改 key 的範例好了，
下面的 code 是把第一個找到的五邊形改成八邊形：

#include <boost/lambda/lambda.hpp>
...
	ShapeContainer::nth_index<1>::type::iterator iter;

	iter = container.get<1>().find(5);
	container.get<1>().modify_key(iter, boost::lambda::_1 = 8);

#include <boost/lambda/lambda.hpp>

...

ShapeContainer::nth_index<1>::type::iterator iter;

iter = container.get<1>().find(5);

container.get<1>().modify_key(iter, boost::lambda::_1 = 8);

_1 前面要冠 boost::lambda 是避免 ambiguous 的緣故，
一般單獨使用 Boost.Lambda 的話其實可以不需要這樣寫。

如果是對 code 品質有所要求的 programmer，
看過這裡提供的範例一定會很不爽一件事。
那就是所謂的 magic number，
擺一堆 0 1 2 在那邊過了幾個月誰會記得那是什麼東西？
這個部分 Boost.MultiIndex 的作者也有考慮到，
所以提供了 tagging 的功能：

struct IndexedByNumber {};
struct IndexedByEdges {};
struct IndexedBySequence {};

typedef multi_index_container<
	Shape,
	indexed_by<
		ordered_unique<
			tag<IndexedByNumber>,
			member<Shape, int32_t, &Shape::number_>
		>,
		ordered_non_unique<
			tag<IndexedByEdges>,
			member<Shape, int32_t, &Shape::edges_>
		>,
		sequenced<tag<IndexedBySequence> >
	>
> ShapeContainer;

struct IndexedByNumber {};

struct IndexedByEdges {};

struct IndexedBySequence {};

typedef multi_index_container<

Shape,

indexed_by<

ordered_unique<

tag<IndexedByNumber>,

member<Shape, int32_t, &Shape::number_>

ordered_non_unique<

tag<IndexedByEdges>,

member<Shape, int32_t, &Shape::edges_>

sequenced<tag<IndexedBySequence> >

> ShapeContainer;

這樣一來使用上就會變成：

	container.get<IndexedBySequence>().push_back(Shape(2, 3));
	...
	// 注意不再是使用 nth_index<>
	ShapeContainer::index<IndexedByNumber>::type::iterator iter;
	...

container.get<IndexedBySequence>().push_back(Shape(2, 3));

...

// 注意不再是使用 nth_index<>

ShapeContainer::index<IndexedByNumber>::type::iterator iter;

...

然後一定又會有那種不做到完善封裝就會窒息身亡的 programmer 問那封裝勒？
少了封裝都用 struct 來搞那還能叫做程式嗎？
這個問題其實也很好解決：

#include <boost/multi_index/mem_fun.hpp>

class Shape {
public:
	Shape(int32_t number, int32_t edges);

	// 還是那句話，定義式自己寫。
	int32_t number() const;
	int32_t edges() const;
	void print(ostream &os) const;

private:
	int32_t number_;
	int32_t edges_;
};

typedef multi_index_container<
	Shape,
	indexed_by<
		ordered_unique<const_mem_fun<Shape, int32_t, &Shape::number> >,
		ordered_non_unique<const_mem_fun<Shape, int32_t, &Shape::edges> >,
		sequenced<>
	>
> ShapeContainer;

#include <boost/multi_index/mem_fun.hpp>

class Shape {

public:

Shape(int32_t number, int32_t edges);

// 還是那句話，定義式自己寫。

int32_t number() const;

int32_t edges() const;

void print(ostream &os) const;

private:

int32_t number_;

int32_t edges_;

};

typedef multi_index_container<

Shape,

indexed_by<

ordered_unique<const_mem_fun<Shape, int32_t, &Shape::number> >,

ordered_non_unique<const_mem_fun<Shape, int32_t, &Shape::edges> >,

sequenced<>

> ShapeContainer;

這樣就沒話說了吧，
當然要是討厭 magic number 的還是可以把 tag 加上去。

這篇講的並不是 Boost.MultiIndex 的全部，
比方說除了 sequenced<> 外還有 random_access<> 可以用，
而且它的排列順序還能 call rearrange() 指定由哪種 indexing 方式來排，
不過每次插入新元素後都要手動 call 一次就是了。
總之想徹底活用的話還是乖乖看官方文件吧。

一	二	三	四	五	六	日
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30