[Groonga-commit] droonga/droonga.org at 3e0e519 [gh-pages] Update translation of benchmark tutorial

SHIMODA Piro Hiroshi null+****@clear*****
Sat Oct 4 02:59:00 JST 2014


SHIMODA "Piro" Hiroshi	2014-10-04 02:59:00 +0900 (Sat, 04 Oct 2014)

  New Revision: 3e0e519a38d9913074577bd15f00d159c897c268
  https://github.com/droonga/droonga.org/commit/3e0e519a38d9913074577bd15f00d159c897c268

  Message:
    Update translation of benchmark tutorial

  Added files:
    _po/ja/tutorial/benchmark/index.po
  Modified files:
    _po/ja/tutorial/1.0.7/benchmark/index.po
    ja/tutorial/1.0.5/benchmark/index.md
    ja/tutorial/1.0.6/benchmark/index.md
    ja/tutorial/1.0.7/benchmark/index.md

  Modified: _po/ja/tutorial/1.0.7/benchmark/index.po (+788 -104)
===================================================================
--- _po/ja/tutorial/1.0.7/benchmark/index.po    2014-10-04 02:45:29 +0900 (f23c900)
+++ _po/ja/tutorial/1.0.7/benchmark/index.po    2014-10-04 02:59:00 +0900 (b97293d)
@@ -14,19 +14,30 @@ msgid ""
 "layout: en\n"
 "---"
 msgstr ""
+"---\n"
+"title: \"DroongaとGroongaのベンチマークの取り方\"\n"
+"layout: ja\n"
+"---"
 
 msgid ""
 "* TOC\n"
 "{:toc}"
 msgstr ""
 
+msgid ""
+"<!--\n"
+"this is based on https://github.com/droonga/presentation-droonga-meetup-1-intr"
+"oduction/blob/master/benchmark/README.md\n"
+"-->"
+msgstr ""
+
 msgid "## The goal of this tutorial"
 msgstr "## チュートリアルのゴール"
 
 msgid ""
 "Learning steps to benchmark a [Droonga][] cluster and compare it to a [Groonga"
-"][groonga]."
-msgstr ""
+"][groonga] server."
+msgstr "[Droonga][]クラスタのベンチマークの測定し、[Groonga][groonga]での結果と比較するまでの、一連の手順を学ぶこと。"
 
 msgid "## Precondition"
 msgstr "## 前提条件"
@@ -36,207 +47,878 @@ msgid ""
 "tu][] or [CentOS][] Server.\n"
 "* You must have basic knowledge and experiences to use the [Groonga][groonga] "
 "via HTTP.\n"
-"* You must have basic knowledge to construct a [Droonga][] cluster by your han"
-"d.\n"
+"* You must have basic knowledge to construct a [Droonga][] cluster.\n"
 "  Please complete the [\"getting started\" tutorial](../groonga/) before this."
 msgstr ""
+"* [Ubuntu][]または[CentOS][]のサーバの操作に関する基本的な知識と経験があること。\n"
+"* [Groonga][groonga]をHTTP経由で操作する際の基本的な知識と経験があること。\n"
+"* [Droonga][]クラスタの構築手順について基本的な知識があること。\n"
+"  このチュートリアルの前に、[「始めてみる」のチュートリアル](../groonga/)を完了しておいて下さい。"
 
-msgid "## Why benchmarking?"
+msgid ""
+"And, assume that there are four [Ubuntu][] 14.04LTS servers for the new Droong"
+"a cluster:"
+msgstr "また、新しいDroongaクラスタのために以下の4つの[Ubuntu][] 14.04LTSのサーバがあると仮定します:"
+
+msgid ""
+" * `192.168.100.50`\n"
+" * `192.168.100.51`\n"
+" * `192.168.100.52`\n"
+" * `192.168.100.53`"
 msgstr ""
 
+msgid "One is client, others are Droonga nodes."
+msgstr "1つはクライアント用で、残りの3つはDroongaノード用です。"
+
+msgid "## Why benchmarking?"
+msgstr "## ベンチマークの必要性について"
+
 msgid ""
 "Because Droonga has compatibility to Groonga, you'll plan to migrate your appl"
 "ication based on Groonga to Droonga.\n"
 "Before that, you should benchmark Droonga and confirm that it is better altern"
 "ative for your application."
 msgstr ""
+"DroongaはGroongaと互換性があるため、GroongaベースのアプリケーションをDroongaに移行することを検討することもあるでしょう。\n"
+"そんな時は、実際に移行する前に、Droongaの性能を測定して、より良い移行先であるかどうかを確認しておくべきです。"
 
-msgid "For example, assume that your application has following spec:"
+msgid ""
+"Of course you may simply hope to know the difference in performance between Gr"
+"oonga and Droonga.\n"
+"Benchmarking will make it clear."
 msgstr ""
+"もちろん、単にGroongaとDroongaの性能差を知りたいと思うこともあるでしょう。\n"
+"ベンチマークによって、差を可視化することができます。"
+
+msgid "### How the benchmark tool measures the performance?"
+msgstr "### ベンチマークツールはどのように性能を測定するのか"
 
 msgid ""
-" * The database contains all pages of [Japanese Wikipedia](http://ja.wikipedia"
-".org/).\n"
-" * 50% accesses are a fixed query for the front page. Others have different se"
-"arch queries.\n"
-" * There are three [Ubuntu][] 14.04LTS servers for the new Droonga cluster: `1"
-"92.168.0.10`, `192.168.0.11`, and `192.168.0.12`."
+"You can run benchmark with the command `drnbench-request-response`, introduced"
+" by the Gem package [drnbench]().\n"
+"It measures the throughput performance of the target service - how many reques"
+"t can be processed in a time.\n"
+"The performance index is described as \"*queries per second* (*QPS*)\"."
 msgstr ""
+"ベンチマークは、[drnbench]()というGemパッケージによって導入される`drnbench-request-response`コマンドで行うことがで"
+"きます。\n"
+"このツールは、対象サービスのスループット性能、つまり、一度にどれだけの数のリクエストを捌けるかを計測します。\n"
+"性能の指標は「*クエリ毎秒*(Queries Per Second, *QPS*)」という単位で表されます。"
 
-msgid "## Prepare the data source"
+msgid ""
+"For example, if a Groonga server processed 10 requests in one second, that is "
+"described as \"10 QPS\".\n"
+"Possibly there are 10 users (clients), or, there are 2 users and each user ope"
+"ns 5 tabs in his web browser.\n"
+"Anyway, \"10 QPS\" means that the Groonga actually accepted and responded for 10"
+" requests while one second is passing."
 msgstr ""
 
 msgid ""
-"First, download the archive of Wikipedia pages and convert it to a dump file f"
-"or Groonga, on the node `192.168.0.10`.\n"
-"Because the archive is very large, downloading and data conversion may take so"
-"me a few hours."
+"`drnbench-request-response` benchmarks the target service, by steps like follo"
+"wing:"
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10)\n"
-"    % cd ~/\n"
-"    % git clone https://github.com/droonga/wikipedia-search.git\n"
-"    % cd wikipedia-search\n"
-"    % bundle install\n"
-"    % time rake data:convert:groonga:ja data/groonga/ja-all-pages.grn"
+" 1. The master process generates one virtual client.\n"
+"    The client starts to send many requests to the target sequentially and fre"
+"quently.\n"
+" 2. After a while, the master process kills the client.\n"
+"    Then he counts up the number of requests actually processed by the target,"
+" and reports it as QPS of the single client case.\n"
+" 3. The master process generates two virtual clients.\n"
+"    They starts to send requests.\n"
+" 4. After a while, the master process kills all clients.\n"
+"    Then total number of processed requests sent by all clients is reported as"
+" QPS of the two clients case.\n"
+" 5. Repeated with three clients, four clients ... and more progressively.\n"
+" 6. Finally, the master process reports QPS and other extra information for ea"
+"ch case, as a CSV file like:"
+msgstr ""
+
+msgid ""
+"    ~~~\n"
+"    n_clients,total_n_requests,queries_per_second,min_elapsed_time,max_elapsed"
+"_time,average_elapsed_time,0,200\n"
+"    1,164,5.466666666666667,0.002184631,1.951960432,0.1727086823963415,0,100.0"
+"\n"
+"    2,1618,53.93333333333333,0.001466091,1.587372312,0.026789948272558754,0.12"
+"360939431396785,99.87639060568603\n"
+"    4,4690,156.33333333333334,0.001065161,0.26070575,0.015224578191897657,0.04"
+"2643923240938165,99.95735607675907\n"
+"    6,6287,209.56666666666666,0.000923332,0.25709169,0.018191428254970568,0.09"
+"543502465404805,99.90456497534595\n"
+"    8,6628,220.93333333333334,0.000979707,0.288406006,0.02557014875603507,0.03"
+"0175015087507546,99.96982498491249\n"
+"    10,7117,237.23333333333332,0.001235846,0.303093461,0.03160425060474918,0.1"
+"405086412814388,99.85949135871857\n"
+"    12,7403,246.76666666666668,0.001111115,0.33163911,0.03792291040199917,0.09"
+"455626097528029,99.90544373902472\n"
+"    14,7454,248.46666666666667,0.00151987,0.335161281,0.04522922885028168,0.17"
+"4403005097934,99.82559699490207\n"
+"    16,7357,245.23333333333332,0.000763487,0.356862003,0.05435767224085904,0.0"
+"8155498165012913,99.91844501834987\n"
+"    18,7494,249.8,0.001017168,0.378661333,0.061178927504003194,0.2001601281024"
+"8196,99.79983987189752\n"
+"    20,7506,250.2,0.001759464,0.404634447,0.06887332192845741,0.21316280309086"
+"064,99.78683719690913\n"
+"    ~~~"
+msgstr ""
+
+msgid "    You can analyze it, draw a graph from it, and so on."
 msgstr ""
 
 msgid ""
-"After that, a dump file `~/wikipedia-search/data/groonga/ja-all-pages.grn` bec"
-"omes available."
+"    (Note: Performance results fluctuate from various factors.\n"
+"    This is just an example on a specific version, specific environment.)"
+msgstr ""
+
+msgid "### How read and analyze the result? {#how-to-analyze}"
+msgstr ""
+
+msgid "![A graph of throughput](/images/tutorial/benchmark/throughput-groonga.png)"
 msgstr ""
 
-msgid "## Set up a Groonga server"
+msgid ""
+"Look at the result above, and this graph.\n"
+"You'll see that the QPS stagnated around 250, for 12 or more clients.\n"
+"This means that the target service can process 250 requests in one second, at "
+"a maximum."
 msgstr ""
 
-msgid "As a criterion, let's setup the Groonga on the node `192.168.0.10`."
+msgid ""
+"In other words, we can describe the result as: 250 QPS is the maximum throughp"
+"ut performance of this system - generic performance of hardware, software, net"
+"work, size of the database, queries, and more.\n"
+"If the number of requests for your service is growing up and it is going to re"
+"ach the limit, you have to do something about it - optimize queries, replace t"
+"he computer with more powerful one, and so on."
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10)\n"
+"And, sending same request patterns to Groonga and Droonga, you can compare max"
+"imum QPS for each system.\n"
+"If Droonga's QPS is larger than Groonga's one (=Droonga has better performance"
+" about throughput), it will become good reason to migrate your service from Gr"
+"oonga to Droonga.\n"
+"Moreover, comparing multiple results from different number of Droonga nodes, y"
+"ou can analyze the cost-benefit performance for newly introduced nodes."
+msgstr ""
+
+msgid "### Ensure an existing reference database (and the data source)"
+msgstr ""
+
+msgid ""
+"If you have any existing service based on Groonga, it becomes the reference.\n"
+"Then you just have to dump all data in your Groonga database and load them to "
+"a new Droonga cluster."
+msgstr ""
+
+msgid ""
+"Otherwise - if you have no existing service, prepare a new reference database "
+"with much data for effective benchmark.\n"
+"The repository [wikipedia-search][] includes some helper scripts to construct "
+"your Groonga server (and Droonga cluster), with [Japanese Wikipedia](http://ja"
+".wikipedia.org/) pages."
+msgstr ""
+
+msgid ""
+"So let's prepare a new Groonga database including Wikipedia pages, on a node `"
+"192.168.100.50`."
+msgstr ""
+
+msgid ""
+" 1. Determine the size of the database.\n"
+"    You have to use good enough size database for benchmarking."
+msgstr ""
+
+msgid ""
+"    * If it is too small, you'll see \"too bad\" benchmark result for Droonga, b"
+"ecause the percentage of the Droonga's overhead becomes relatively too large.\n"
+"    * If it is too large, you'll see \"too unstable\" result because swapping of"
+" RAM will slow the performance down randomly.\n"
+"    * If RAM size of all nodes are different, you should determine the size of"
+" the database for the minimum size RAM."
+msgstr ""
+
+msgid ""
+"    For example, if there are three nodes `192.168.100.50` (8GB RAM), `192.168"
+".100.51` (8GB RAM), and `192.168.100.52` (6GB RAM), then the database should b"
+"e smaller than 6GB.\n"
+" 2. Set up the Groonga server, as instructed on [the installation guide](http:"
+"//groonga.org/docs/install.html)."
+msgstr ""
+
+msgid ""
+"    ~~~\n"
+"    (on 192.168.100.50)\n"
 "    % sudo apt-get -y install software-properties-common\n"
 "    % sudo add-apt-repository -y universe\n"
 "    % sudo add-apt-repository -y ppa:groonga/ppa\n"
 "    % sudo apt-get update\n"
-"    % sudo apt-get -y install groonga"
+"    % sudo apt-get -y install groonga\n"
+"    ~~~"
+msgstr ""
+
+msgid ""
+"    Then the Groonga becomes available.\n"
+" 3. Download the archive of Wikipedia pages and convert it to a dump file for "
+"Groonga, with the rake task `data:convert:groonga:ja`.\n"
+"    You can specify the number of records (pages) to be converted via the envi"
+"ronment variable `MAX_N_RECORDS` (default=5000)."
+msgstr ""
+
+msgid ""
+"    ~~~\n"
+"    (on 192.168.100.50)\n"
+"    % cd ~/\n"
+"    % git clone https://github.com/droonga/wikipedia-search.git\n"
+"    % cd wikipedia-search\n"
+"    % bundle install\n"
+"    % time (MAX_N_RECORDS=100000 bundle exec rake data:convert:groonga:ja \\\n"
+"                                   data/groonga/ja-pages.grn)\n"
+"    ~~~"
 msgstr ""
 
 msgid ""
-"Now the Groonga is available.\n"
-"Prepare the database based dump files.\n"
-"This may take much time (10 or more hours)."
+"    Because the archive is very large, downloading and data conversion may tak"
+"e time."
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10)\n"
+"    After that, a dump file `~/wikipedia-search/data/groonga/ja-pages.grn` is "
+"there.\n"
+"    Create a new database and load the dump file to it.\n"
+"    This also may take more time:"
+msgstr ""
+
+msgid ""
+"    ~~~\n"
+"    (on 192.168.100.50)\n"
 "    % mkdir -p $HOME/groonga/db/\n"
 "    % groonga -n $HOME/groonga/db/db quit\n"
 "    % time (cat ~/wikipedia-search/config/groonga/schema.grn | groonga $HOME/g"
 "roonga/db/db)\n"
 "    % time (cat ~/wikipedia-search/config/groonga/indexes.grn | groonga $HOME/"
 "groonga/db/db)\n"
-"    % time (cat ~/wikipedia-search/data/groonga/ja-all-pages.grn | groonga $HO"
-"ME/groonga/db/db)"
+"    % time (cat ~/wikipedia-search/data/groonga/ja-pages.grn | groonga $HOME/g"
+"roonga/db/db)\n"
+"    ~~~"
+msgstr ""
+
+msgid ""
+"    Note: number of records affects to the database size.\n"
+"    Just for information, my results are here:"
+msgstr ""
+
+msgid ""
+"     * 1.1GB database was constructed from 300000 records.\n"
+"       Data conversion took 17 min, data loading took 6 min.\n"
+"     * 4.3GB database was constructed from 1500000 records.\n"
+"       Data conversion took 53 min, data loading took 64 min."
 msgstr ""
 
-msgid "Then start the Groonga as an HTTP server."
+msgid " 4. Start the Groonga as an HTTP server."
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10)\n"
-"    % groonga -p 10041 -d --protocol http $HOME/groonga/db/db"
+"    ~~~\n"
+"    (on 192.168.100.50)\n"
+"    % groonga -p 10041 -d --protocol http $HOME/groonga/db/db\n"
+"    ~~~"
+msgstr ""
+
+msgid "OK, now we can use this node as the reference for benchmarking."
 msgstr ""
 
 msgid "## Set up a Droonga cluster"
 msgstr "## Droongaクラスタをセットアップする"
 
-msgid "Install Droonga to nodes."
+msgid ""
+"Install Droonga to all nodes.\n"
+"Because we are benchmarking it via HTTP, you have to install both services `dr"
+"oonga-engine` and `droonga-http-server` for each node."
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10, 192.168.0.11, 192.168.0.12)\n"
-"    % sudo apt-get update\n"
-"    % sudo apt-get -y upgrade\n"
-"    % sudo apt-get install -y ruby ruby-dev build-essential nodejs nodejs-lega"
-"cy npm\n"
-"    % sudo gem install droonga-engine grn2drn drnbench\n"
-"    % sudo npm install -g droonga-http-server\n"
-"    % mkdir ~/droonga\n"
-"    % droonga-engine-catalog-generate \\\n"
-"        --hosts=192.168.0.10,192.168.0.11,192.168.0.12 \\\n"
-"        --n-workers=$(cat /proc/cpuinfo | grep processor | wc -l) \\\n"
-"        --output=~/droonga/catalog.json"
+"~~~\n"
+"(on 192.168.100.50)\n"
+"% host=192.168.100.50\n"
+"% curl https://raw.githubusercontent.com/droonga/droonga-engine/master/install"
+".sh | \\\n"
+"    sudo HOST=$host bash\n"
+"% curl https://raw.githubusercontent.com/droonga/droonga-http-server/master/in"
+"stall.sh | \\\n"
+"    sudo ENGINE_HOST=$host HOST=$host PORT=10042 bash\n"
+"% sudo droonga-engine-catalog-generate \\\n"
+"    --hosts=192.168.100.50,192.168.100.51,192.168.100.52\n"
+"% sudo service droonga-engine start\n"
+"% sudo service droonga-http-server start\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.51)\n"
+"% host=192.168.100.51\n"
+"...\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.52)\n"
+"% host=192.168.100.52\n"
+"...\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"Note: to start `droonga-http-server` with a port number different from Groonga"
+", we should specify another port `10042` via the `PORT` environment variable, "
+"like above."
+msgstr ""
+
+msgid "## Synchronize data from Groonga to Droonga"
+msgstr ""
+
+msgid ""
+"Next, prepare the Droonga database.\n"
+"Send Droonga messages from dump files, like:"
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.50)\n"
+"% sudo gem install grn2drn\n"
+"% time (cat ~/wikipedia-search/config/groonga/schema.grn | \\\n"
+"          grn2drn | \\\n"
+"          droonga-send --server=192.168.100.50 \\\n"
+"                       --report-throughput)\n"
+"% time (cat ~/wikipedia-search/config/groonga/indexes.grn | \\\n"
+"          grn2drn | \\\n"
+"          droonga-send --server=192.168.100.50 \\\n"
+"                       --report-throughput)\n"
+"% time (cat ~/wikipedia-search/data/groonga/ja-pages.grn | \\\n"
+"          grn2drn | \\\n"
+"          droonga-send --server=192.168.100.50 \\\n"
+"                       --server=192.168.100.51 \\\n"
+"                       --server=192.168.100.52 \\\n"
+"                       --report-throughput)\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"Note that you must send requests for schema and indexes to just one endpoint.\n"
+"Parallel sending of schema definition requests for multiple nodes will break t"
+"he database."
 msgstr ""
 
 msgid ""
-"After installation, start servers.\n"
-"To run Groonga and Droonga parallelly, specify a new port number for the `droo"
-"nga-http-server` different to Groonga's one.\n"
-"Now we use `10042` for Droonga, `10041` for Groonga."
+"This may take much time.\n"
+"After all, now you have two HTTP servers: Groonga HTTP server with the port `1"
+"0041`, and Droonga HTTP Servers with the port `10042`."
+msgstr ""
+
+msgid "## Set up the client"
+msgstr ""
+
+msgid "You must install the benchmark client to the computer."
+msgstr ""
+
+msgid "Assume that you use a computer `192.168.100.53` as the client:"
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10)\n"
-"    % export host=192.168.0.10\n"
-"    % export DROONGA_BASE_DIR=$HOME/droonga\n"
-"    % droonga-engine --host=$host \\\n"
-"        --log-file=$DROONGA_BASE_DIR/droonga-engine.log \\\n"
-"        --daemon \\\n"
-"        --pid-file=$DROONGA_BASE_DIR/droonga-engine.pid\n"
-"    % droonga-http-server --port=10042 \\\n"
-"        --receive-host-name=$host \\\n"
-"        --droonga-engine-host-name=$host \\\n"
-"        --environment=production \\\n"
-"        --daemon \\\n"
-"        --pid-file=$DROONGA_BASE_DIR/droonga-http-server.pid"
+"~~~\n"
+"(on 192.168.100.53)\n"
+"% sudo apt-get update\n"
+"% sudo apt-get -y upgrade\n"
+"% sudo apt-get install -y ruby curl jq\n"
+"% sudo gem install drnbench\n"
+"~~~"
+msgstr ""
+
+msgid "## Prepare request patterns"
+msgstr ""
+
+msgid "Let's prepare request pattern files for benchmarking."
+msgstr ""
+
+msgid "### Determine the expected cache hit rate"
+msgstr ""
+
+msgid "First, you have to determine the cache hit rate."
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.11)\n"
-"    % export host=192.168.0.11\n"
-"    ..."
+"If you have any existing service based on Groonga, you can get the actual cach"
+"e hit rate of the Groonga database via `status` command, like:"
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.12)\n"
-"    % export host=192.168.0.12\n"
-"    ..."
+"~~~\n"
+"% curl \"http://192.168.100.50:10041/d/status\" | jq .\n"
+"[\n"
+"  [\n"
+"    0,\n"
+"    1412326645.19701,\n"
+"    3.76701354980469e-05\n"
+"  ],\n"
+"  {\n"
+"    \"max_command_version\": 2,\n"
+"    \"alloc_count\": 158,\n"
+"    \"starttime\": 1412326485,\n"
+"    \"uptime\": 160,\n"
+"    \"version\": \"4.0.6\",\n"
+"    \"n_queries\": 1000,\n"
+"    \"cache_hit_rate\": 0.5,\n"
+"    \"command_version\": 1,\n"
+"    \"default_command_version\": 1\n"
+"  }\n"
+"]\n"
+"~~~"
 msgstr ""
 
 msgid ""
-"Next, prepare the database from dump files.\n"
-"Note that you must send requests for schema and indexes to just one endpoint, "
-"because parallel sending of schema definition requests for multiple nodes will"
-" break the database."
+"The cache hit rate appears as `\"cache_hit_rate\"`.\n"
+"`0.5` means 50%, then a half of responses are returned from cached results."
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10)\n"
-"    % time (cat ~/wikipedia-search/config/groonga/schema.grn | grn2drn | \\\n"
-"              droonga-send --server=192.168.0.10)\n"
-"    % time (cat ~/wikipedia-search/config/groonga/indexes.grn | grn2drn | \\\n"
-"              droonga-send --server=192.168.0.10)"
+"If you have no existing service, you should assume that the cache hit rate bec"
+"omes 50%."
 msgstr ""
 
-msgid "Instead you can use a direct dump from the Groonga server, like:"
+msgid ""
+"To measure and compare performance of Groonga and Droonga properly, you should"
+" prepare request patterns for benchmarking which make the cache hit rate near "
+"the actual rate.\n"
+"So, how do it?"
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10)\n"
-"    % time (grndump --no-dump-tables $HOME/groonga/db/db | grn2drn | \\\n"
-"              droonga-send --server=192.168.0.10 \\\n"
-"                           --report-throughput)"
+"You can control the cache hit rate by the number of unique request patterns, c"
+"alculated with the expression:\n"
+"`N = 100 / (cache hit rate)`, because Groonga and Droonga (`droonga-http-serve"
+"r`) cache 100 results at a maximum by default.\n"
+"When the expected cache hit rate is 50%, the number of unique requests is calc"
+"ulated as: `N = 100 / 0.5 = 200`"
+msgstr ""
+
+msgid "### Prepare list of search terms"
 msgstr ""
 
-msgid "After that, import data from the dump file."
+msgid ""
+"The package `drnbench` includes a utility command `drnbench-generate-select-pa"
+"tterns` to generate request patterns for benchmarking, from a list of unique t"
+"erms, like:"
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10)\n"
-"    % time (cat ~/wikipedia-search/data/groonga/ja-pages.grn | grn2drn | \\\n"
-"              droonga-send --server=192.168.0.10 \\\n"
-"                           --server=192.168.0.11 \\\n"
-"                           --server=192.168.0.12)"
+"~~~\n"
+"AAA\n"
+"BBB\n"
+"CCC\n"
+"~~~"
 msgstr ""
 
 msgid ""
-"    (on 192.168.0.10)\n"
-"    % time (grndump --no-dump-schema --no-dump-indexes $HOME/groonga/db/db | \\"
+"To generate 200 unique request patterns, you have to prepare 200 terms.\n"
+"Moreover, all of terms must be effective search term for the Groonga database."
 "\n"
-"              grn2drn | \\\n"
-"              droonga-send --server=192.168.0.10 \\\n"
-"                           --server=192.168.0.11 \\\n"
-"                           --server=192.168.0.12 \\\n"
-"                           --report-throughput)"
+"If you use randomly generated terms (like `P2qyNJ9L`, `Hy4pLKc5`, `D5eftuTp`, "
+"...), you won't get effective benchmark result, because \"not found\" results wi"
+"ll be returned for most requests."
+msgstr ""
+
+msgid ""
+"So there is another utility command `drnbench-extract-searchterms`.\n"
+"It generates list of terms from Groonga's select result, like:"
+msgstr ""
+
+msgid ""
+"~~~\n"
+"% curl \"http://192.168.100.50:10041/d/select?table=Pages&limit=10&output_colum"
+"ns=title\" | \\\n"
+"    drnbench-extract-searchterms\n"
+"title1\n"
+"title2\n"
+"title3\n"
+"...\n"
+"title10\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"`drnbench-extract-searchterms` extracts terms from the first column of records"
+".\n"
+"To collect 200 effective search terms, you just have to give a select result w"
+"ith an option `limit=200`."
+msgstr ""
+
+msgid "### Generate request pattern file from given terms"
+msgstr ""
+
+msgid ""
+"OK, let's generate request patterns by `drnbench-generate-select-patterns` and"
+" `drnbench-extract-searchterms`, from a select result."
+msgstr ""
+
+msgid ""
+"~~~\n"
+"% n_unique_requests=200\n"
+"% curl \"http://192.168.100.50:10041/d/select?table=Pages&limit=$n_unique_reque"
+"sts&output_columns=title\" | \\\n"
+"    drnbench-extract-searchterms | \\\n"
+"    drnbench-generate-select-patterns \\\n"
+"    > ./patterns.json\n"
+"~~~"
+msgstr ""
+
+msgid "The generated file `patterns.json` becomes like following:"
+msgstr ""
+
+msgid ""
+"~~~\n"
+"{\n"
+"  \"with-query\": {\n"
+"    \"frequency\": 1.0,\n"
+"    \"method\": \"get\",\n"
+"    \"patterns\": [\n"
+"      {\n"
+"        \"path\": \"/d/select?limit=10&offset=0&query=AAA\"\n"
+"      },\n"
+"      {\n"
+"        \"path\": \"/d/select?limit=10&offset=0&query=BBB\"\n"
+"      },\n"
+"      ...\n"
+"    ]\n"
+"  }\n"
+"}\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"Like above, request patterns for the `select` command are generated with the p"
+"arameter `query`, based on given terms."
+msgstr ""
+
+msgid ""
+"However, these requests are too simple.\n"
+"No table is specified, there is no output, no drilldown.\n"
+"To construct more effective select requests, you can give extra parameters to "
+"the `drnbench-generate-select-patterns` via its `--base-params` option, like:"
+msgstr ""
+
+msgid ""
+"~~~\n"
+"% n_unique_requests=200\n"
+"% curl \"http://192.168.100.50:10041/d/select?table=Pages&limit=$n_unique_reque"
+"sts&output_columns=title\" | \\\n"
+"    drnbench-extract-searchterms | \\\n"
+"    drnbench-generate-select-patterns \\\n"
+"      --base-params=\"table=Pages&limit=10&match_columns=title,text&output_colu"
+"mns=snippet_html(title),snippet_html(text),categories,_key\" \\\n"
+"    > ./patterns.json\n"
+"~~~"
+msgstr ""
+
+msgid "Then the generated file becomes:"
+msgstr ""
+
+msgid ""
+"~~~\n"
+"{\n"
+"  \"with-query\": {\n"
+"    \"frequency\": 1.0,\n"
+"    \"method\": \"get\",\n"
+"    \"patterns\": [\n"
+"      {\n"
+"        \"path\": \"/d/select?table=Pages&limit=10&match_columns=title,text&outpu"
+"t_columns=snippet_html(title),snippet_html(text),categories,_key&query=AAA\"\n"
+"      },\n"
+"      {\n"
+"        \"path\": \"/d/select?table=Pages&limit=10&match_columns=title,text&outpu"
+"t_columns=snippet_html(title),snippet_html(text),categories,_key&query=BBB\"\n"
+"      },\n"
+"      ...\n"
+"    ]\n"
+"  }\n"
+"}\n"
+"~~~"
+msgstr ""
+
+msgid "## Run the benchmark"
+msgstr ""
+
+msgid ""
+"OK, it's ready to run.\n"
+"Let's benchmark Groonga and Droonga."
+msgstr ""
+
+msgid "### Benchmark Groonga"
+msgstr ""
+
+msgid ""
+"First, run benchmark for Groonga as the reference.\n"
+"Start Groonga's HTTP server before running."
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.50)\n"
+"% groonga -p 10041 -d --protocol http $HOME/groonga/db/db\n"
+"~~~"
+msgstr ""
+
+msgid "You can run benchmark with the command `drnbench-request-response`, like:"
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.53)\n"
+"% drnbench-request-response \\\n"
+"    --step=2 \\\n"
+"    --start-n-clients=0 \\\n"
+"    --end-n-clients=20 \\\n"
+"    --duration=30 \\\n"
+"    --interval=10 \\\n"
+"    --request-patterns-file=$PWD/patterns.json \\\n"
+"    --default-hosts=192.168.100.50 \\\n"
+"    --default-port=10041 \\\n"
+"    --output-path=$PWD/groonga-result.csv\n"
+"~~~"
+msgstr ""
+
+msgid "Important parameters are:"
+msgstr ""
+
+msgid ""
+" * `--step` is the number of virtual clients increased on each progress.\n"
+" * `--start-n-clients` is the initial number of virtual clients.\n"
+"   Even if you specify `0`, initially one client is always generated.\n"
+" * `--end-n-clients` is the maximum number of virtual clients.\n"
+"   Benchmark is performed progressively until the number of clients is reached"
+" to this limit.\n"
+" * `--duration` is the duration of each benchmark.\n"
+"   This should be long enough to average out the result.\n"
+"   `30` (seconds) seems good for my case.\n"
+" * `--interval` is the interval between each benchmark.\n"
+"   This should be long enough to finish previous benchmark.\n"
+"   `10` (seconds) seems good for my case.\n"
+" * `--request-patterns-file` is the path to the pattern file.\n"
+" * `--default-hosts` is the list of host names of target endpoints.\n"
+"   By specifying multiple hosts as a comma-separated list, you can simulate lo"
+"ad balancing.\n"
+" * `--default-port` is the port number of the target endpoint.\n"
+" * `--output-path` is the path to the result file.\n"
+"   Statistics of all benchmarks is saved as a file at the location."
+msgstr ""
+
+msgid ""
+"Then you'll get the reference result of the Groonga.\n"
+"After that you should stop Groonga to release CPU and RAM resources."
+msgstr ""
+
+msgid "### Benchmark Droonga"
+msgstr ""
+
+msgid ""
+"To clear effects from previous benchmark, you should restart services before e"
+"ach test."
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.50, 192.168.100.51, 192.168.100.52)\n"
+"% sudo service droonga-engine restart\n"
+"% sudo service droonga-http-server restart\n"
+"~~~"
+msgstr ""
+
+msgid "#### Benchmark Droonga with single node"
+msgstr ""
+
+msgid "Before benchmarking, make your cluster with only one node."
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.50)\n"
+"% sudo droonga-engine-catalog-generate \\\n"
+"    --hosts=192.168.100.50\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"After that the endpoint `192.168.100.50` works as a Droonga cluster with singl"
+"e node.\n"
+"Run the benchmark."
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.53)\n"
+"% drnbench-request-response \\\n"
+"    --step=2 \\\n"
+"    --start-n-clients=0 \\\n"
+"    --end-n-clients=20 \\\n"
+"    --duration=30 \\\n"
+"    --interval=10 \\\n"
+"    --request-patterns-file=$PWD/patterns.json \\\n"
+"    --default-hosts=192.168.100.50 \\\n"
+"    --default-port=10042 \\\n"
+"    --output-path=$PWD/droonga-result-1node.csv\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"Note that the default port is changed from `10041` (Groonga's HTTP server) to "
+"`10042` (Droonga).\n"
+"Moreover, the path to the result file also changed."
+msgstr ""
+
+msgid "#### Benchmark Droonga with two nodes"
+msgstr ""
+
+msgid "Before benchmarking, join the second node to the cluster."
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.50, 192.168.100.51)\n"
+"% sudo droonga-engine-catalog-generate \\\n"
+"    --hosts=192.168.100.50,192.168.100.51\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"After that both endpoints `192.168.100.50` and `192.168.100.51` work as a Droo"
+"nga cluster with two nodes.\n"
+"Run the benchmark."
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.53)\n"
+"% drnbench-request-response \\\n"
+"    --step=2 \\\n"
+"    --start-n-clients=0 \\\n"
+"    --end-n-clients=20 \\\n"
+"    --duration=30 \\\n"
+"    --interval=10 \\\n"
+"    --request-patterns-file=$PWD/patterns.json \\\n"
+"    --default-hosts=192.168.100.50,192.168.100.51 \\\n"
+"    --default-port=10042 \\\n"
+"    --output-path=$PWD/droonga-result-2nodes.csv\n"
+"~~~"
+msgstr ""
+
+msgid "Note that two hosts are specified via the `--default-hosts` option."
+msgstr ""
+
+msgid ""
+"If you send all requests to single endpoint, `droonga-http-server` will become"
+" a bottleneck, because it works as a single process for now.\n"
+"Moreover, `droonga-http-server` and `droonga-engine` will scramble for CPU res"
+"ources.\n"
+"To measure the performance of your Droonga cluster effectively, you should ave"
+"rage out CPU load per capita."
+msgstr ""
+
+msgid ""
+"Of course, on the production environment, it should be done by a load balancer"
+", but It's a hassle to set up a load balancer for just benchmarking.\n"
+"Instead, you can specify multiple endpoint host names as a comma-separated lis"
+"t for the `--default-hosts` option."
+msgstr ""
+
+msgid "And, the path to the result file also changed."
+msgstr ""
+
+msgid "#### Benchmark Droonga with three nodes"
+msgstr ""
+
+msgid "Before benchmarking, join the last node to the cluster."
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.50, 192.168.100.51)\n"
+"% sudo droonga-engine-catalog-generate \\\n"
+"    --hosts=192.168.100.50,192.168.100.51,192.168.100.52\n"
+"~~~"
+msgstr ""
+
+msgid ""
+"After that all endpoints `192.168.100.50`, `192.168.100.51`, and `192.168.100."
+"52` work as a Droonga cluster with three nodes.\n"
+"Run the benchmark."
+msgstr ""
+
+msgid ""
+"~~~\n"
+"(on 192.168.100.53)\n"
+"% drnbench-request-response \\\n"
+"    --step=2 \\\n"
+"    --start-n-clients=0 \\\n"
+"    --end-n-clients=20 \\\n"
+"    --duration=30 \\\n"
+"    --interval=10 \\\n"
+"    --request-patterns-file=$PWD/patterns.json \\\n"
+"    --default-hosts=192.168.100.50,192.168.100.51,192.168.100.52 \\\n"
+"    --default-port=10042 \\\n"
+"    --output-path=$PWD/droonga-result-3nodes.csv\n"
+"~~~"
+msgstr ""
+
+msgid "Note that both `--default-hosts` and `--output-path` are changed again."
+msgstr ""
+
+msgid "## Analyze the result"
+msgstr ""
+
+msgid "OK, now you have four results:"
+msgstr ""
+
+msgid ""
+" * `groonga-result.csv`\n"
+" * `droonga-result-1node.csv`\n"
+" * `droonga-result-2nodes.csv`\n"
+" * `droonga-result-3nodes.csv`"
+msgstr ""
+
+msgid "[As described](#how-to-analyze), you can analyze them."
+msgstr ""
+
+msgid "For example, you can plot a graph from these results like:"
+msgstr ""
+
+msgid ""
+"![A layered graph of throughput](/images/tutorial/benchmark/throughput-mixed.p"
+"ng)"
+msgstr ""
+
+msgid ""
+"You can interpret this graph as: \"Under these conditions Droonga performs bet"
+"ter when there are multiple nodes\", \"A single Droonga node's performance is l"
+"ower than Groonga's in this setting\", and so on."
+msgstr ""
+
+msgid ""
+"(Note: Performance results fluctuate due to various factors.\n"
+"This graph is just an example for a specific version and environment.)"
 msgstr ""
 
-msgid "This may take much time (10 or more hours)."
+msgid "## Conclusion"
 msgstr ""
 
 msgid ""
-"(TBD, based on https://github.com/droonga/presentation-droonga-meetup-1-introd"
-"uction/blob/master/benchmark/README.md )"
+"In this tutorial, you prepared a reference [Groonga][] server and a [Droonga]"
+"[] cluster.\n"
+"You also learned how to prepare request patterns, how to measure your systems"
+", and how to analyze the results."
 msgstr ""
 
 msgid ""
@@ -244,5 +926,7 @@ msgid ""
 "  [CentOS]: https://www.centos.org/\n"
 "  [Droonga]: https://droonga.org/\n"
 "  [Groonga]: http://groonga.org/\n"
+"  [drnbench]: https://github.com/droonga/drnbench/\n"
+"  [wikipedia-search]: https://github.com/droonga/wikipedia-search/\n"
 "  [command reference]: ../../reference/commands/"
 msgstr ""
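
Note on the "Analyze the result" strings above: the tutorial leaves the analysis itself to the reader. Below is a minimal Python sketch of one way to find the throughput plateau. It assumes only the CSV column names shown in the tutorial's sample output (`n_clients`, `queries_per_second`). The `peak_qps` helper and the trimmed inline sample are illustrative and are not part of drnbench.

```python
import csv
import io

# A few rows copied from the tutorial's sample output, trimmed to three columns.
SAMPLE = """n_clients,total_n_requests,queries_per_second
1,164,5.466666666666667
2,1618,53.93333333333333
4,4690,156.33333333333334
12,7403,246.76666666666668
20,7506,250.2
"""


def peak_qps(csv_text):
    """Return (n_clients, qps) for the row with the highest throughput."""
    rows = csv.DictReader(io.StringIO(csv_text))
    best = max(rows, key=lambda row: float(row["queries_per_second"]))
    return int(best["n_clients"]), float(best["queries_per_second"])


print(peak_qps(SAMPLE))
```

To compare configurations, the text of each of the four result files can be read and passed to `peak_qps` in turn.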

  Added: _po/ja/tutorial/benchmark/index.po (+16 -0) 100644
===================================================================
--- /dev/null
+++ _po/ja/tutorial/benchmark/index.po    2014-10-04 02:59:00 +0900 (af21f9d)
@@ -0,0 +1,16 @@
+msgid ""
+msgstr ""
+"Project-Id-Version: PACKAGE VERSION\n"
+"PO-Revision-Date: 2014-09-25 04:08+0900\n"
+"Language: ja\n"
+"MIME-Version: 1.0\n"
+"Content-Type: text/plain; charset=UTF-8\n"
+"Content-Transfer-Encoding: 8bit\n"
+"Plural-Forms: nplurals=1; plural=0;\n"
+
+msgid ""
+"---\n"
+"layout: redirect-to-current-version\n"
+"unversioned_path: /tutorial/\n"
+"---"
+msgstr ""

  Modified: ja/tutorial/1.0.5/benchmark/index.md (+5 -5)
===================================================================
--- ja/tutorial/1.0.5/benchmark/index.md    2014-10-04 02:45:29 +0900 (1486189)
+++ ja/tutorial/1.0.5/benchmark/index.md    2014-10-04 02:59:00 +0900 (dc11ef2)
@@ -1,6 +1,6 @@
 ---
-title: "How to benchmark Droonga with Groonga?"
-layout: en
+title: "DroongaとGroongaのベンチマークの取り方"
+layout: ja
 ---
 
 {% comment %}
@@ -26,10 +26,10 @@ Learning steps to benchmark a [Droonga][] cluster and compare it to a [Groonga][
 * You must have basic knowledge to construct a [Droonga][] cluster by your hand.
   Please complete the ["getting started" tutorial](../groonga/) before this.
 
-## Why benchmarking?
+## ベンチマークの必要性について
 
-Because Droonga has compatibility to Groonga, you'll plan to migrate your application based on Groonga to Droonga.
-Before that, you should benchmark Droonga and confirm that it is better alternative for your application.
+DroongaはGroongaと互換性があるため、GroongaベースのアプリケーションをDroongaに移行することを検討することもあるでしょう。
+そんな時は、実際に移行する前に、Droongaの性能を測定して、より良い移行先であるかどうかを確認しておくべきです。
 
 For example, assume that your application has following spec:
 

  Modified: ja/tutorial/1.0.6/benchmark/index.md (+5 -5)
===================================================================
--- ja/tutorial/1.0.6/benchmark/index.md    2014-10-04 02:45:29 +0900 (1f21e06)
+++ ja/tutorial/1.0.6/benchmark/index.md    2014-10-04 02:59:00 +0900 (ca0ad17)
@@ -1,6 +1,6 @@
 ---
-title: "How to benchmark Droonga with Groonga?"
-layout: en
+title: "DroongaとGroongaのベンチマークの取り方"
+layout: ja
 ---
 
 {% comment %}
@@ -26,10 +26,10 @@ Learning steps to benchmark a [Droonga][] cluster and compare it to a [Groonga][
 * You must have basic knowledge to construct a [Droonga][] cluster by your hand.
   Please complete the ["getting started" tutorial](../groonga/) before this.
 
-## Why benchmarking?
+## ベンチマークの必要性について
 
-Because Droonga has compatibility to Groonga, you'll plan to migrate your application based on Groonga to Droonga.
-Before that, you should benchmark Droonga and confirm that it is better alternative for your application.
+DroongaはGroongaと互換性があるため、GroongaベースのアプリケーションをDroongaに移行することを検討することもあるでしょう。
+そんな時は、実際に移行する前に、Droongaの性能を測定して、より良い移行先であるかどうかを確認しておくべきです。
 
 For example, assume that your application has following spec:
 

  Modified: ja/tutorial/1.0.7/benchmark/index.md (+543 -110)
===================================================================
--- ja/tutorial/1.0.7/benchmark/index.md    2014-10-04 02:45:29 +0900 (d2d25c5)
+++ ja/tutorial/1.0.7/benchmark/index.md    2014-10-04 02:59:00 +0900 (721b9a5)
@@ -1,6 +1,6 @@
 ---
-title: "How to benchmark Droonga with Groonga?"
-layout: en
+title: "DroongaとGroongaのベンチマークの取り方"
+layout: ja
 ---
 
 {% comment %}
@@ -15,156 +15,589 @@ layout: en
 * TOC
 {:toc}
 
+<!--
+this is based on https://github.com/droonga/presentation-droonga-meetup-1-introduction/blob/master/benchmark/README.md
+-->
+
 ## チュートリアルのゴール
 
-Learning steps to benchmark a [Droonga][] cluster and compare it to a [Groonga][groonga].
+[Droonga][]クラスタのベンチマークの測定し、[Groonga][groonga]での結果と比較するまでの、一連の手順を学ぶこと。
 
 ## 前提条件
 
-* You must have basic knowledge and experiences to set up and operate an [Ubuntu][] or [CentOS][] Server.
-* You must have basic knowledge and experiences to use the [Groonga][groonga] via HTTP.
-* You must have basic knowledge to construct a [Droonga][] cluster by your hand.
-  Please complete the ["getting started" tutorial](../groonga/) before this.
-
-## Why benchmarking?
-
-Because Droonga has compatibility to Groonga, you'll plan to migrate your application based on Groonga to Droonga.
-Before that, you should benchmark Droonga and confirm that it is better alternative for your application.
-
-For example, assume that your application has following spec:
-
- * The database contains all pages of [Japanese Wikipedia](http://ja.wikipedia.org/).
- * 50% accesses are a fixed query for the front page. Others have different search queries.
- * There are three [Ubuntu][] 14.04LTS servers for the new Droogna cluster: `192.168.0.10`, `192.168.0.11`, and `192.168.0.12`.
-
-## Prepare the data source
-
-First, download the archive of Wikipedia pages and convert it to a dump file for Groonga, on the node `192.168.0.10`.
-Because the archive is very large, downloading and data conversion may take some a few hours.
-
-    (on 192.168.0.10)
-    % cd ~/
-    % git clone https://github.com/droonga/wikipedia-search.git
-    % cd wikipedia-search
-    % bundle install
-    % time rake data:convert:groonga:ja data/groonga/ja-all-pages.grn
-
-After that, a dump file `~/wikipedia-search/data/groonga/ja-all-pages.grn` becomes available.
-
-## Set up a Groonga server
-
-As a criterion, let's setup the Groonga on the node `192.168.0.10`.
-
-    (on 192.168.0.10)
+* [Ubuntu][]または[CentOS][]のサーバの操作に関する基本的な知識と経験があること。
+* [Groonga][groonga]をHTTP経由で操作する際の基本的な知識と経験があること。
+* [Droonga][]クラスタの構築手順について基本的な知識があること。
+  このチュートリアルの前に、[「始めてみる」のチュートリアル](../groonga/)を完了しておいて下さい。
+
+また、新しいDroongaクラスタのために以下の4つの[Ubuntu][] 14.04LTSのサーバがあると仮定します:
+
+ * `192.168.100.50`
+ * `192.168.100.51`
+ * `192.168.100.52`
+ * `192.168.100.53`
+
+1つはクライアント用で、残りの3つはDroongaノード用です。
+
+## ベンチマークの必要性について
+
+DroongaはGroongaと互換性があるため、GroongaベースのアプリケーションをDroongaに移行することを検討することもあるでしょう。
+そんな時は、実際に移行する前に、Droongaの性能を測定して、より良い移行先であるかどうかを確認しておくべきです。
+
+もちろん、単にGroongaとDroongaの性能差を知りたいと思うこともあるでしょう。
+ベンチマークによって、差を可視化することができます。
+
+
+### ベンチマークツールはどのように性能を測定するのか
+
+ベンチマークは、[drnbench][]というGemパッケージによって導入される`drnbench-request-response`コマンドで行うことができます。
+このツールは、対象サービスのスループット性能、つまり、一度にどれだけの数のリクエストを捌けるかを計測します。
+性能の指標は「*クエリ毎秒*(Queries Per Second, *QPS*)」という単位で表されます。
+
+For example, if a Groonga server processes 10 requests in one second, that is described as "10 QPS".
+There might be 10 users (clients), or 2 users who each have 5 tabs open in their web browsers.
+Either way, "10 QPS" means that Groonga actually accepted and responded to 10 requests within one second.
+
+`drnbench-request-response` benchmarks the target service in the following steps:
+
+ 1. The master process generates one virtual client.
+    The client starts to send many requests to the target sequentially and frequently.
+ 2. After a while, the master process kills the client.
+    Then it counts the number of requests actually processed by the target, and reports it as the QPS of the single-client case.
+ 3. The master process generates two virtual clients.
+    They start to send requests.
+ 4. After a while, the master process kills all clients.
+    Then the total number of processed requests sent by all clients is reported as the QPS of the two-client case.
+ 5. This is repeated progressively with three clients, four clients, and so on.
+ 6. Finally, the master process reports the QPS and other extra information for each case, as a CSV file like:
+    
+    ~~~
+    n_clients,total_n_requests,queries_per_second,min_elapsed_time,max_elapsed_time,average_elapsed_time,0,200
+    1,164,5.466666666666667,0.002184631,1.951960432,0.1727086823963415,0,100.0
+    2,1618,53.93333333333333,0.001466091,1.587372312,0.026789948272558754,0.12360939431396785,99.87639060568603
+    4,4690,156.33333333333334,0.001065161,0.26070575,0.015224578191897657,0.042643923240938165,99.95735607675907
+    6,6287,209.56666666666666,0.000923332,0.25709169,0.018191428254970568,0.09543502465404805,99.90456497534595
+    8,6628,220.93333333333334,0.000979707,0.288406006,0.02557014875603507,0.030175015087507546,99.96982498491249
+    10,7117,237.23333333333332,0.001235846,0.303093461,0.03160425060474918,0.1405086412814388,99.85949135871857
+    12,7403,246.76666666666668,0.001111115,0.33163911,0.03792291040199917,0.09455626097528029,99.90544373902472
+    14,7454,248.46666666666667,0.00151987,0.335161281,0.04522922885028168,0.174403005097934,99.82559699490207
+    16,7357,245.23333333333332,0.000763487,0.356862003,0.05435767224085904,0.08155498165012913,99.91844501834987
+    18,7494,249.8,0.001017168,0.378661333,0.061178927504003194,0.20016012810248196,99.79983987189752
+    20,7506,250.2,0.001759464,0.404634447,0.06887332192845741,0.21316280309086064,99.78683719690913
+    ~~~
+    
+    You can analyze it, draw a graph from it, and so on.
+    
+    (Note: Performance results fluctuate due to various factors.
+    This is just an example for a specific version and environment.)
+
+### How to read and analyze the result {#how-to-analyze}
+
+![A graph of throughput](/images/tutorial/benchmark/throughput-groonga.png)
+
+Look at the result above, and this graph.
+You'll see that the QPS levels off at around 250 for 12 or more clients.
+This means that the target service can process at most 250 requests per second.
+
+In other words, 250 QPS is the maximum throughput of this system as a whole: the combined result of hardware, software, network, database size, queries, and more.
+If the number of requests for your service is growing and approaching this limit, you have to do something about it: optimize queries, replace the computer with a more powerful one, and so on.
+
+By sending the same request patterns to Groonga and Droonga, you can compare the maximum QPS of each system.
+If Droonga's QPS is larger than Groonga's (that is, Droonga has better throughput), that is a good reason to migrate your service from Groonga to Droonga.
+Moreover, by comparing results from different numbers of Droonga nodes, you can analyze the cost-effectiveness of newly introduced nodes.
+
+
+### Ensure an existing reference database (and the data source)
+
+If you have any existing service based on Groonga, it becomes the reference.
+Then you just have to dump all data in your Groonga database and load them to a new Droonga cluster.
+
+Otherwise, if you have no existing service, prepare a new reference database with enough data for an effective benchmark.
+The repository [wikipedia-search][] includes some helper scripts to construct your Groonga server (and Droonga cluster), with [Japanese Wikipedia](http://ja.wikipedia.org/) pages.
+
+So let's prepare a new Groonga database including Wikipedia pages, on a node `192.168.100.50`.
+
+ 1. Determine the size of the database.
+    You have to use a database that is large enough for benchmarking.
+    
+    * If it is too small, you'll see an unfairly poor benchmark result for Droonga, because Droonga's overhead becomes relatively too large.
+    * If it is too large, you'll see unstable results, because swapping will slow things down at random.
+    * If the nodes have different amounts of RAM, you should size the database for the node with the least RAM.
+
+    For example, if there are three nodes `192.168.100.50` (8GB RAM), `192.168.100.51` (8GB RAM), and `192.168.100.52` (6GB RAM), then the database should be smaller than 6GB.
+ 2. Set up the Groonga server, as instructed on [the installation guide](http://groonga.org/docs/install.html).
+    
+    ~~~
+    (on 192.168.100.50)
     % sudo apt-get -y install software-properties-common
     % sudo add-apt-repository -y universe
     % sudo add-apt-repository -y ppa:groonga/ppa
     % sudo apt-get update
     % sudo apt-get -y install groonga
-
-Now the Groonga is available.
-Prepare the database based dump files.
-This may take much time (10 or more hours).
-
-    (on 192.168.0.10)
+    ~~~
+    
+    Then the Groonga becomes available.
+ 3. Download the archive of Wikipedia pages and convert it to a dump file for Groonga, with the rake task `data:convert:groonga:ja`.
+    You can specify the number of records (pages) to be converted via the environment variable `MAX_N_RECORDS` (default=5000).
+    
+    ~~~
+    (on 192.168.100.50)
+    % cd ~/
+    % git clone https://github.com/droonga/wikipedia-search.git
+    % cd wikipedia-search
+    % bundle install
+    % time (MAX_N_RECORDS=100000 bundle exec rake data:convert:groonga:ja \
+                                   data/groonga/ja-pages.grn)
+    ~~~
+    
+    Because the archive is very large, downloading and data conversion may take time.
+    
+    After that, a dump file `~/wikipedia-search/data/groonga/ja-pages.grn` is there.
+    Create a new database and load the dump file to it.
+    This also may take more time:
+    
+    ~~~
+    (on 192.168.100.50)
     % mkdir -p $HOME/groonga/db/
     % groonga -n $HOME/groonga/db/db quit
     % time (cat ~/wikipedia-search/config/groonga/schema.grn | groonga $HOME/groonga/db/db)
     % time (cat ~/wikipedia-search/config/groonga/indexes.grn | groonga $HOME/groonga/db/db)
-    % time (cat ~/wikipedia-search/data/groonga/ja-all-pages.grn | groonga $HOME/groonga/db/db)
+    % time (cat ~/wikipedia-search/data/groonga/ja-pages.grn | groonga $HOME/groonga/db/db)
+    ~~~
+    
+    Note: the number of records affects the database size.
+    Just for reference, here are my results:
+    
+     * 1.1GB database was constructed from 300000 records.
+       Data conversion took 17 min, data loading took 6 min.
+     * 4.3GB database was constructed from 1500000 records.
+       Data conversion took 53 min, data loading took 64 min.
+    
+ 4. Start the Groonga as an HTTP server.
+    
+    ~~~
+    (on 192.168.100.50)
+    % groonga -p 10041 -d --protocol http $HOME/groonga/db/db
+    ~~~
 
-Then start the Groonga as an HTTP server.
+OK, now we can use this node as the reference for benchmarking.
 
-    (on 192.168.0.10)
-    % groonga -p 10041 -d --protocol http $HOME/groonga/db/db
 
 ## Droongaクラスタをセットアップする
 
-Install Droonga to nodes.
+Install Droonga to all nodes.
+Because we are benchmarking it via HTTP, you have to install both services `droonga-engine` and `droonga-http-server` for each node.
+
+~~~
+(on 192.168.100.50)
+% host=192.168.100.50
+% curl https://raw.githubusercontent.com/droonga/droonga-engine/master/install.sh | \
+    sudo HOST=$host bash
+% curl https://raw.githubusercontent.com/droonga/droonga-http-server/master/install.sh | \
+    sudo ENGINE_HOST=$host HOST=$host PORT=10042 bash
+% sudo droonga-engine-catalog-generate \
+    --hosts=192.168.100.50,192.168.100.51,192.168.100.52
+% sudo service droonga-engine start
+% sudo service droonga-http-server start
+~~~
+
+~~~
+(on 192.168.100.51)
+% host=192.168.100.51
+...
+~~~
+
+~~~
+(on 192.168.100.52)
+% host=192.168.100.52
+...
+~~~
+
+Note: to start `droonga-http-server` with a port number different from Groonga, we should specify another port `10042` via the `PORT` environment variable, like above.
+
+
+## Synchronize data from Groonga to Droonga
+
+Next, prepare the Droonga database.
+Send Droonga messages from dump files, like:
+
+~~~
+(on 192.168.100.50)
+% sudo gem install grn2drn
+% time (cat ~/wikipedia-search/config/groonga/schema.grn | \
+          grn2drn | \
+          droonga-send --server=192.168.100.50 \
+                       --report-throughput)
+% time (cat ~/wikipedia-search/config/groonga/indexes.grn | \
+          grn2drn | \
+          droonga-send --server=192.168.100.50 \
+                       --report-throughput)
+% time (cat ~/wikipedia-search/data/groonga/ja-pages.grn | \
+          grn2drn | \
+          droonga-send --server=192.168.100.50 \
+                       --server=192.168.100.51 \
+                       --server=192.168.100.52 \
+                       --report-throughput)
+~~~
+
+Note that you must send requests for schema and indexes to just one endpoint.
+Parallel sending of schema definition requests for multiple nodes will break the database.
+
+This may take much time.
+After all, now you have two HTTP servers: Groonga HTTP server with the port `10041`, and Droonga HTTP Servers with the port `10042`.
+
+
+## Set up the client
+
+You must install the benchmark client to the computer.
+
+Assume that you use a computer `192.168.100.53` as the client:
+
+~~~
+(on 192.168.100.53)
+% sudo apt-get update
+% sudo apt-get -y upgrade
+% sudo apt-get install -y ruby curl jq
+% sudo gem install drnbench
+~~~
+
+
+## Prepare request patterns
+
+Let's prepare request pattern files for benchmarking.
+
+### Determine the expected cache hit rate
+
+First, you have to determine the cache hit rate.
+
+If you have any existing service based on Groonga, you can get the actual cache hit rate of the Groonga database via `status` command, like:
+
+~~~
+% curl "http://192.168.100.50:10041/d/status" | jq .
+[
+  [
+    0,
+    1412326645.19701,
+    3.76701354980469e-05
+  ],
+  {
+    "max_command_version": 2,
+    "alloc_count": 158,
+    "starttime": 1412326485,
+    "uptime": 160,
+    "version": "4.0.6",
+    "n_queries": 1000,
+    "cache_hit_rate": 0.5,
+    "command_version": 1,
+    "default_command_version": 1
+  }
+]
+~~~
+
+The cache hit rate appears as `"cache_hit_rate"`.
+`0.5` means 50%, so half of the responses are returned from cached results.
+
+If you have no existing service, you should assume a cache hit rate of 50%.
+
+To measure and compare the performance of Groonga and Droonga properly, you should prepare request patterns that make the cache hit rate close to the actual rate.
+So, how do you do that?
+
+You can control the cache hit rate by the number of unique request patterns, calculated with the expression:
+`N = 100 / (cache hit rate)`, because Groonga and Droonga (`droonga-http-server`) cache 100 results at a maximum by default.
+When the expected cache hit rate is 50%, the number of unique requests is calculated as: `N = 100 / 0.5 = 200`
+
+### Prepare list of search terms
+
+The package `drnbench` includes a utility command `drnbench-generate-select-patterns` to generate request patterns for benchmarking, from a list of unique terms, like:
+
+~~~
+AAA
+BBB
+CCC
+~~~
+
+To generate 200 unique request patterns, you have to prepare 200 terms.
+Moreover, all of the terms must be effective search terms for the Groonga database.
+If you use randomly generated terms (like `P2qyNJ9L`, `Hy4pLKc5`, `D5eftuTp`, ...), you won't get a meaningful benchmark result, because "not found" results will be returned for most requests.
+
+So there is another utility command `drnbench-extract-searchterms`.
+It generates a list of terms from Groonga's select result, like:
+
+~~~
+% curl "http://192.168.100.50:10041/d/select?table=Pages&limit=10&output_columns=title" | \
+    drnbench-extract-searchterms
+title1
+title2
+title3
+...
+title10
+~~~
+
+`drnbench-extract-searchterms` extracts terms from the first column of records.
+To collect 200 effective search terms, you just have to give a select result with an option `limit=200`.
+
+
+### Generate request pattern file from given terms
+
+OK, let's generate request patterns by `drnbench-generate-select-patterns` and `drnbench-extract-searchterms`, from a select result.
+
+~~~
+% n_unique_requests=200
+% curl "http://192.168.100.50:10041/d/select?table=Pages&limit=$n_unique_requests&output_columns=title" | \
+    drnbench-extract-searchterms | \
+    drnbench-generate-select-patterns \
+    > ./patterns.json
+~~~
+
+The generated file `patterns.json` becomes like following:
+
+~~~
+{
+  "with-query": {
+    "frequency": 1.0,
+    "method": "get",
+    "patterns": [
+      {
+        "path": "/d/select?limit=10&offset=0&query=AAA"
+      },
+      {
+        "path": "/d/select?limit=10&offset=0&query=BBB"
+      },
+      ...
+    ]
+  }
+}
+~~~
+
+Like above, request patterns for the `select` command are generated with the parameter `query`, based on given terms.
+
+However, these requests are too simple.
+No table is specified, there is no output, no drilldown.
+To construct more effective select requests, you can give extra parameters to the `drnbench-generate-select-patterns` via its `--base-params` option, like:
+
+~~~
+% n_unique_requests=200
+% curl "http://192.168.100.50:10041/d/select?table=Pages&limit=$n_unique_requests&output_columns=title" | \
+    drnbench-extract-searchterms | \
+    drnbench-generate-select-patterns \
+      --base-params="table=Pages&limit=10&match_columns=title,text&output_columns=snippet_html(title),snippet_html(text),categories,_key" \
+    > ./patterns.json
+~~~
+
+Then the generated file becomes:
+
+~~~
+{
+  "with-query": {
+    "frequency": 1.0,
+    "method": "get",
+    "patterns": [
+      {
+        "path": "/d/select?table=Pages&limit=10&match_columns=title,text&output_columns=snippet_html(title),snippet_html(text),categories,_key&query=AAA"
+      },
+      {
+        "path": "/d/select?table=Pages&limit=10&match_columns=title,text&output_columns=snippet_html(title),snippet_html(text),categories,_key&query=BBB"
+      },
+      ...
+    ]
+  }
+}
+~~~
+
+
+## Run the benchmark
+
+OK, it's ready to run.
+Let's benchmark Groonga and Droonga.
+
+### Benchmark Groonga
+
+First, run benchmark for Groonga as the reference.
+Start Groonga's HTTP server before running.
+
+~~~
+(on 192.168.100.50)
+% groonga -p 10041 -d --protocol http $HOME/groonga/db/db
+~~~
+
+You can run benchmark with the command `drnbench-request-response`, like:
+
+~~~
+(on 192.168.100.53)
+% drnbench-request-response \
+    --step=2 \
+    --start-n-clients=0 \
+    --end-n-clients=20 \
+    --duration=30 \
+    --interval=10 \
+    --request-patterns-file=$PWD/patterns.json \
+    --default-hosts=192.168.100.50 \
+    --default-port=10041 \
+    --output-path=$PWD/groonga-result.csv
+~~~
+
+Important parameters are:
+
+ * `--step` is the number of virtual clients added at each step.
+ * `--start-n-clients` is the initial number of virtual clients.
+   Even if you specify `0`, one client is always generated initially.
+ * `--end-n-clients` is the maximum number of virtual clients.
+   The benchmark is repeated progressively until the number of clients reaches this limit.
+ * `--duration` is the duration of each benchmark run.
+   This should be long enough to average out the result.
+   `30` (seconds) seems good for my case.
+ * `--interval` is the interval between benchmark runs.
+   This should be long enough for the previous run to finish.
+   `10` (seconds) seems good for my case.
+ * `--request-patterns-file` is the path to the pattern file.
+ * `--default-hosts` is the list of host names of target endpoints.
+   By specifying multiple hosts as a comma-separated list, you can simulate load balancing.
+ * `--default-port` is the port number of the target endpoint.
+ * `--output-path` is the path to the result file.
+   Statistics for all benchmark runs are saved to a file at that location.
+
+Then you'll get the reference result of the Groonga.
+After that you should stop Groonga to release CPU and RAM resources.
+
+
+### Benchmark Droonga
+
+To clear the effects of the previous benchmark, you should restart services before each test.
+
+~~~
+(on 192.168.100.50, 192.168.100.51, 192.168.100.52)
+% sudo service droonga-engine restart
+% sudo service droonga-http-server restart
+~~~
+
+#### Benchmark Droonga with single node
+
+Before benchmarking, reconfigure your cluster to have only one node.
+
+~~~
+(on 192.168.100.50)
+% sudo droonga-engine-catalog-generate \
+    --hosts=192.168.100.50
+~~~
+
+After that, the endpoint `192.168.100.50` works as a Droonga cluster with a single node.
+Run the benchmark.
+
+~~~
+(on 192.168.100.53)
+% drnbench-request-response \
+    --step=2 \
+    --start-n-clients=0 \
+    --end-n-clients=20 \
+    --duration=30 \
+    --interval=10 \
+    --request-patterns-file=$PWD/patterns.json \
+    --default-hosts=192.168.100.50 \
+    --default-port=10042 \
+    --output-path=$PWD/droonga-result-1node.csv
+~~~
 
-    (on 192.168.0.10, 192.168.0.11, 192.168.0.12)
-    % sudo apt-get update
-    % sudo apt-get -y upgrade
-    % sudo apt-get install -y ruby ruby-dev build-essential nodejs nodejs-legacy npm
-    % sudo gem install droonga-engine grn2drn drnbench
-    % sudo npm install -g droonga-http-server
-    % mkdir ~/droonga
-    % droonga-engine-catalog-generate \
-        --hosts=192.168.0.10,192.168.0.11,192.168.0.12 \
-        --n-workers=$(cat /proc/cpuinfo | grep processor | wc -l) \
-        --output=~/droonga/catalog.json
+Note that the default port has changed from `10041` (Groonga's HTTP server) to `10042` (Droonga).
+The path to the result file has also changed.
+
+
+#### Benchmark Droonga with two nodes
+
+Before benchmarking, join the second node to the cluster.
+
+~~~
+(on 192.168.100.50, 192.168.100.51)
+% sudo droonga-engine-catalog-generate \
+    --hosts=192.168.100.50,192.168.100.51
+~~~
+
+After that both endpoints `192.168.100.50` and `192.168.100.51` work as a Droonga cluster with two nodes.
+Run the benchmark.
+
+~~~
+(on 192.168.100.53)
+% drnbench-request-response \
+    --step=2 \
+    --start-n-clients=0 \
+    --end-n-clients=20 \
+    --duration=30 \
+    --interval=10 \
+    --request-patterns-file=$PWD/patterns.json \
+    --default-hosts=192.168.100.50,192.168.100.51 \
+    --default-port=10042 \
+    --output-path=$PWD/droonga-result-2nodes.csv
+~~~
+
+Note that two hosts are specified via the `--default-hosts` option.
+
+If you send all requests to a single endpoint, `droonga-http-server` will become a bottleneck, because it currently runs as a single process.
+Moreover, `droonga-http-server` and `droonga-engine` will compete for CPU resources.
+To measure the performance of your Droonga cluster effectively, you should spread the CPU load evenly across the nodes.
 
-After installation, start servers.
-To run Groonga and Droonga parallelly, specify a new port number for the `droonga-http-server` different to Groonga's one.
-Now we use `10042` for Droonga, `10041` for Groonga.
+Of course, in a production environment this should be done by a load balancer, but it's a hassle to set up a load balancer just for benchmarking.
+Instead, you can specify multiple endpoint host names as a comma-separated list for the `--default-hosts` option.
 
-    (on 192.168.0.10)
-    % export host=192.168.0.10
-    % export DROONGA_BASE_DIR=$HOME/droonga
-    % droonga-engine --host=$host \
-        --log-file=$DROONGA_BASE_DIR/droonga-engine.log \
-        --daemon \
-        --pid-file=$DROONGA_BASE_DIR/droonga-engine.pid
-    % droonga-http-server --port=10042 \
-        --receive-host-name=$host \
-        --droonga-engine-host-name=$host \
-        --environment=production \
-        --daemon \
-        --pid-file=$DROONGA_BASE_DIR/droonga-http-server.pid
+The path to the result file has also changed.
 
-    (on 192.168.0.11)
-    % export host=192.168.0.11
-    ...
 
-    (on 192.168.0.12)
-    % export host=192.168.0.12
-    ...
+#### Benchmark Droonga with three nodes
 
-Next, prepare the database from dump files.
-Note that you must send requests for schema and indexes to just one endpoint, because parallel sending of schema definition requests for multiple nodes will break the database.
+Before benchmarking, join the last node to the cluster.
 
-    (on 192.168.0.10)
-    % time (cat ~/wikipedia-search/config/groonga/schema.grn | grn2drn | \
-              droonga-send --server=192.168.0.10)
-    % time (cat ~/wikipedia-search/config/groonga/indexes.grn | grn2drn | \
-              droonga-send --server=192.168.0.10)
+~~~
+(on 192.168.100.50, 192.168.100.51)
+% sudo droonga-engine-catalog-generate \
+    --hosts=192.168.100.50,192.168.100.51,192.168.100.52
+~~~
 
-Instead you can use a direct dump from the Groonga server, like:
+After that all endpoints `192.168.100.50`, `192.168.100.51`, and `192.168.100.52` work as a Droonga cluster with three nodes.
+Run the benchmark.
 
-    (on 192.168.0.10)
-    % time (grndump --no-dump-tables $HOME/groonga/db/db | grn2drn | \
-              droonga-send --server=192.168.0.10 \
-                           --report-throughput)
+~~~
+(on 192.168.100.53)
+% drnbench-request-response \
+    --step=2 \
+    --start-n-clients=0 \
+    --end-n-clients=20 \
+    --duration=30 \
+    --interval=10 \
+    --request-patterns-file=$PWD/patterns.json \
+    --default-hosts=192.168.100.50,192.168.100.51,192.168.100.52 \
+    --default-port=10042 \
+    --output-path=$PWD/droonga-result-3nodes.csv
+~~~
 
-After that, import data from the dump file.
+Note that both `--default-hosts` and `--output-path` are changed again.
 
-    (on 192.168.0.10)
-    % time (cat ~/wikipedia-search/data/groonga/ja-pages.grn | grn2drn | \
-              droonga-send --server=192.168.0.10 \
-                           --server=192.168.0.11 \
-                           --server=192.168.0.12)
+## Analyze the result
 
-Instead you can use a direct dump from the Groonga server, like:
+OK, now you have four results:
 
-    (on 192.168.0.10)
-    % time (grndump --no-dump-schema --no-dump-indexes $HOME/groonga/db/db | \
-              grn2drn | \
-              droonga-send --server=192.168.0.10 \
-                           --server=192.168.0.11 \
-                           --server=192.168.0.12 \
-                           --report-throughput)
+ * `groonga-result.csv`
+ * `droonga-result-1node.csv`
+ * `droonga-result-2nodes.csv`
+ * `droonga-result-3nodes.csv`
 
-This may take much time (10 or more hours).
+[As described](#how-to-analyze), you can analyze them.
 
+For example, you can plot a graph from these results like:
 
+![A layered graph of throughput](/images/tutorial/benchmark/throughput-mixed.png)
 
-(TBD, based on https://github.com/droonga/presentation-droonga-meetup-1-introduction/blob/master/benchmark/README.md )
+You can interpret this graph as: "Under these conditions Droonga performs better when there are multiple nodes", "A single Droonga node's performance is lower than Groonga's in this setting", and so on.
 
+(Note: Performance results fluctuate due to various factors.
+This graph is just an example for a specific version and environment.)
 
+## まとめ
 
+In this tutorial, you prepared a reference [Groonga][] server and a [Droonga][] cluster.
+You also learned how to prepare request patterns, how to measure your systems, and how to analyze the results.
 
   [Ubuntu]: http://www.ubuntu.com/
   [CentOS]: https://www.centos.org/
   [Droonga]: https://droonga.org/
   [Groonga]: http://groonga.org/
+  [drnbench]: https://github.com/droonga/drnbench/
+  [wikipedia-search]: https://github.com/droonga/wikipedia-search/
   [command reference]: ../../reference/commands/
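
Note on the cache-hit-rate rule in the tutorial text above: `N = 100 / (cache hit rate)` can be checked with a few lines of Python. This is a minimal sketch, assuming the default cache size of 100 results that the tutorial states for Groonga and `droonga-http-server`. The function name and the `cache_size` parameter are hypothetical helpers for illustration only.

```python
def n_unique_requests(cache_hit_rate, cache_size=100):
    """Number of unique request patterns needed for a target cache hit rate.

    Implements N = cache_size / cache_hit_rate from the tutorial, where
    cache_hit_rate is a fraction in (0, 1] (0.5 means 50%).
    """
    if not 0 < cache_hit_rate <= 1:
        raise ValueError("cache_hit_rate must be in (0, 1]")
    return round(cache_size / cache_hit_rate)


# The tutorial's worked example: a 50% hit rate needs 200 unique patterns.
print(n_unique_requests(0.5))
```

The result can be used as the `limit` value when collecting search terms, as the tutorial does with `n_unique_requests=200`.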