Masamichi Fukuda - elf-systems
masamichi_fukud****@elf-s*****
Tue Mar 17 10:31:09 JST 2015
Yamauchi-san
cc: Matsushima-san

Good morning, this is Fukuda.

Thank you for the crm example.
I adapted it to our environment right away.

$ cat test.crm
### Cluster Option ###
property \
    no-quorum-policy="ignore" \
    stonith-enabled="true" \
    startup-fencing="false" \
    stonith-timeout="710s" \
    crmd-transition-delay="2s"

### Resource Default ###
rsc_defaults \
    resource-stickiness="INFINITY" \
    migration-threshold="1"

### Group Configuration ###
group HAvarnish \
    vip_208 \
    varnishd

group grpStonith1 \
    Stonith1-1 \
    Stonith1-2

group grpStonith2 \
    Stonith2-1 \
    Stonith2-2

### Clone Configuration ###
clone clone_ping \
    ping

### Fencing Topology ###
fencing_topology \
    lbv1.beta.com: Stonith1-1 Stonith1-2 \
    lbv2.beta.com: Stonith2-1 Stonith2-2

### Primitive Configuration ###
primitive vip_208 ocf:heartbeat:IPaddr2 \
    params \
        ip="192.168.17.208" \
        nic="eth0" \
        cidr_netmask="24" \
    op start interval="0s" timeout="90s" on-fail="restart" \
    op monitor interval="5s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="100s" on-fail="fence"

primitive varnishd lsb:varnish \
    op start interval="0s" timeout="90s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="100s" on-fail="fence"

primitive ping ocf:pacemaker:ping \
    params \
        name="default_ping_set" \
        host_list="192.168.17.254" \
        multiplier="100" \
        dampen="1" \
    op start interval="0s" timeout="90s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="100s" on-fail="fence"

primitive Stonith1-1 stonith:external/stonith-helper \
    params \
        pcmk_reboot_retries="1" \
        pcmk_reboot_timeout="40s" \
        hostlist="lbv1.beta.com" \
        dead_check_target="192.168.17.132 10.0.17.132" \
        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
        run_online_check="yes" \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="ignore"

primitive Stonith1-2 stonith:external/xen0 \
    params \
        pcmk_reboot_timeout="60s" \
        hostlist="lbv1.beta.com:/etc/xen/lbv1.cfg" \
        dom0="xen0.beta.com" \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op monitor interval="3600s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="ignore"

primitive Stonith2-1 stonith:external/stonith-helper \
    params \
        pcmk_reboot_retries="1" \
        pcmk_reboot_timeout="40s" \
        hostlist="lbv2.beta.com" \
        dead_check_target="192.168.17.133 10.0.17.133" \
        standby_check_command="/usr/local/sbin/crm_resource -r varnishd -W | grep -q `hostname`" \
        run_online_check="yes" \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="ignore"

primitive Stonith2-2 stonith:external/xen0 \
    params \
        pcmk_reboot_timeout="60s" \
        hostlist="lbv2.beta.com:/etc/xen/lbv2.cfg" \
        dom0="xen0.beta.com" \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op monitor interval="3600s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="ignore"

### Resource Location ###
location HA_location-1 HAvarnish \
    rule 200: #uname eq lbv1.beta.com \
    rule 100: #uname eq lbv2.beta.com
location HA_location-2 HAvarnish \
    rule -INFINITY: not_defined default_ping_set or default_ping_set lt 100
location HA_location-3 grpStonith1 \
    rule -INFINITY: #uname eq lbv1.beta.com
location HA_location-4 grpStonith2 \
    rule -INFINITY: #uname eq lbv2.beta.com


After loading this configuration, the messages differ from yesterday's.
The ping messages are gone.

# crm_mon -rfA
Last updated: Tue Mar 17 10:21:28 2015
Last change: Tue Mar 17 10:21:09 2015
Stack: heartbeat
Current DC: lbv2.beta.com (82ffc36f-1ad8-8686-7db0-35686465c624) - partition with quorum
Version: 1.1.12-561c4cf
2 Nodes configured
8 Resources configured

Online: [ lbv1.beta.com lbv2.beta.com ]

Full list of resources:

 Resource Group: HAvarnish
     vip_208    (ocf::heartbeat:IPaddr2):       Started lbv1.beta.com
     varnishd   (lsb:varnish):  Started lbv1.beta.com
 Resource Group: grpStonith1
     Stonith1-1 (stonith:external/stonith-helper):      Stopped
     Stonith1-2 (stonith:external/xen0):        Stopped
 Resource Group: grpStonith2
     Stonith2-1 (stonith:external/stonith-helper):      Stopped
     Stonith2-2 (stonith:external/xen0):        Stopped
 Clone Set: clone_ping [ping]
     Started: [ lbv1.beta.com lbv2.beta.com ]

Node Attributes:
* Node lbv1.beta.com:
    + default_ping_set                  : 100
* Node lbv2.beta.com:
    + default_ping_set                  : 100

Migration summary:
* Node lbv2.beta.com:
   Stonith1-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'
* Node lbv1.beta.com:
   Stonith2-1: migration-threshold=1 fail-count=1000000 last-failure='Tue Mar 17 10:21:17 2015'

Failed actions:
    Stonith1-1_start_0 on lbv2.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:15 2015', queued=0ms, exec=1082ms
    Stonith2-1_start_0 on lbv1.beta.com 'unknown error' (1): call=31, status=Error, last-rc-change='Tue Mar 17 10:21:16 2015', queued=0ms, exec=1079ms


Here is the log from /var/log/ha-debug.

IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Adding inet address 192.168.17.208/24 with broadcast address 192.168.17.255 to device eth0
IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: Bringing device eth0 up
IPaddr2(vip_208)[7851]: 2015/03/17_10:21:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.17.208 eth0 192.168.17.208 auto not_used not_used

Nothing was written to standard output or standard error.

Could something be wrong with stonith-helper?

Since stonith-helper is a shell script, I had not paid much attention to how it was installed.

stonith-helper is located here:
/usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper

Thank you in advance for your help.

That is all.


2015-03-17 9:45 GMT+09:00 <renay****@ybb*****>:

> Fukuda-san
>
> Good morning. This is Yamauchi.
>
> Just in case, I am sending an excerpt of an example I have on hand that uses multiple stonith devices.
> (In practice, please be careful with the line breaks.)
>
> The example below is a configuration for the PM1.1 series:
> for nodea, stonith is executed in the order prmStonith1-1, then prmStonith1-2;
> for nodeb, stonith is executed in the order prmStonith2-1, then prmStonith2-2.
>
> The stonith plugins themselves are helper and ssh.
>
>
> (snip)
> ### Group Configuration ###
> group grpStonith1 \
>     prmStonith1-1 \
>     prmStonith1-2
>
> group grpStonith2 \
>     prmStonith2-1 \
>     prmStonith2-2
>
> ### Fencing Topology ###
> fencing_topology \
>     nodea: prmStonith1-1 prmStonith1-2 \
>     nodeb: prmStonith2-1 prmStonith2-2
> (snip)
> primitive prmStonith1-1 stonith:external/stonith-helper \
>     params \
>         pcmk_reboot_retries="1" \
>         pcmk_reboot_timeout="40s" \
>         hostlist="nodea" \
>         dead_check_target="192.168.28.60 192.168.28.70" \
>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>         run_online_check="yes" \
>     op start interval="0s" timeout="60s" on-fail="restart" \
>     op stop interval="0s" timeout="60s" on-fail="ignore"
>
> primitive prmStonith1-2 stonith:external/ssh \
>     params \
>         pcmk_reboot_timeout="60s" \
>         hostlist="nodea" \
>     op start interval="0s" timeout="60s" on-fail="restart" \
>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>     op stop interval="0s" timeout="60s" on-fail="ignore"
>
> primitive prmStonith2-1 stonith:external/stonith-helper \
>     params \
>         pcmk_reboot_retries="1" \
>         pcmk_reboot_timeout="40s" \
>         hostlist="nodeb" \
>         dead_check_target="192.168.28.61 192.168.28.71" \
>         standby_check_command="/usr/sbin/crm_resource -r prmRES -W | grep -qi `hostname`" \
>         run_online_check="yes" \
>     op start interval="0s" timeout="60s" on-fail="restart" \
>     op stop interval="0s" timeout="60s" on-fail="ignore"
>
> primitive prmStonith2-2 stonith:external/ssh \
>     params \
>         pcmk_reboot_timeout="60s" \
>         hostlist="nodeb" \
>     op start interval="0s" timeout="60s" on-fail="restart" \
>     op monitor interval="3600s" timeout="60s" on-fail="restart" \
>     op stop interval="0s" timeout="60s" on-fail="ignore"
> (snip)
> location rsc_location-grpStonith1-2 grpStonith1 \
>     rule -INFINITY: #uname eq nodea
> location rsc_location-grpStonith2-3 grpStonith2 \
>     rule -INFINITY: #uname eq nodeb
>
>
> That is all.
>


-- 
ELF Systems
Masamichi Fukuda
mail to: masamichi_fukud****@elf-s***** <elfsy****@gmail*****>
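
On the question above of whether stonith-helper is installed correctly, one quick sanity check is to confirm that the plugin file exists and has its execute bit set. The following is only a minimal sketch: the function name check_plugin is hypothetical, and the path is the one quoted in the message. Whether that directory is the one this particular cluster-glue/Pacemaker build actually searches for external plugins is an assumption the script cannot verify.

```shell
#!/bin/sh
# check_plugin PATH
# Hypothetical helper: reports whether a stonith plugin script is present
# and executable. It does not run the plugin and does not prove that the
# cluster stack searches this directory.
check_plugin() {
    if [ ! -e "$1" ]; then
        echo "missing: $1"
    elif [ ! -x "$1" ]; then
        echo "not executable: $1"
    else
        echo "ok: $1"
    fi
}

# Path taken from the message above.
check_plugin /usr/local/heartbeat/lib/stonith/plugins/external/stonith-helper
```

If this reports "ok" on both nodes and the start still fails, the plugin directory the stack was built to search is the next thing to compare against this path.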