How do I migrate from the SBD fence solution to the Fence aliyun solution?
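Problem description
You want to migrate an SAP high-availability environment deployed on Alibaba Cloud from the SBD fence solution to the Fence aliyun solution.
Applies to
- SAP high-availability environments (SAP HANA, SAP ASCS/SCS) deployed on Alibaba Cloud ECS instances
- SAP ASCS/SCS high-availability environments in which the ERS instance is installed on the local host and the service addresses are managed with the high-availability virtual IP product
Limits and notes
- Before you use this migration solution, make sure that your current SAP high-availability environment (SAP HANA, SAP ASCS/SCS) is running normally.
- This solution does not apply to SAP ASCS/SCS high-availability environments that have no ERS instance installed.
- The operating system must be SLES for SAP 12 SP4 or later.
- This migration requires business downtime. Plan a downtime window in advance.
- We strongly recommend that you create snapshots of the ECS system disks and data disks before making any change. For details, see the documentation on creating a snapshot of a single cloud disk or snapshots of multiple cloud disks.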
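The operating system requirement above can be checked before the downtime window starts. The following is a minimal sketch, assuming that /etc/os-release reports the service pack as VERSION_ID in major.minor form (for example 12.4 for 12 SP4). The function name sles_version_ok is hypothetical, and the check does not verify that the "for SAP" edition is installed.

```shell
# Hypothetical helper: reads os-release content on stdin and succeeds only
# when VERSION_ID is 12.4 (12 SP4) or later.
# Assumption: VERSION_ID has the form "major.minor" or "major".
sles_version_ok() {
    awk -F= '
        $1 == "VERSION_ID" { gsub(/"/, "", $2); ver = $2 }
        END {
            split(ver, v, "[.]")
            major = v[1] + 0
            minor = v[2] + 0
            exit((major > 12 || (major == 12 && minor >= 4)) ? 0 : 1)
        }'
}
```

On a cluster node this could be called as `sles_version_ok < /etc/os-release && echo "OS version OK"`.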
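Solution
Scenario 1: SAP HANA high-availability environment
The procedure for an SAP HANA high-availability environment is as follows:
- Log on to the primary node of the cluster and run the following command to check the status of all resources.
Note: Unless otherwise stated, each step needs to be performed on only one node of the cluster.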
crm_mon -r
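The output is similar to the following. In this example there are two ECS instances, hana001 and hana002, and both the cluster and the managed resources are in a normal state.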
Stack: corosync
2 nodes configured
6 resources configured
Online: [ hana001 hana002 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started hana001
rsc_vip (ocf::heartbeat:IPaddr2): Started hana001
Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable)
Masters: [ hana001 ]
Slaves: [ hana002 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
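Started: [ hana001 hana002 ]
- Run the following command to find the block device currently used by SBD.
grep SBD_DEVICE /etc/sysconfig/sbd
The command returns output similar to the following. In this example, the SBD block device is /dev/vdf.
# SBD_DEVICE specifies the devices to use for exchanging sbd messages
SBD_DEVICE="/dev/vdf"
Log on to the console and confirm that the device name of the disk attached to the ECS instance matches the device name found above.
Make sure that the device name shown in the console, with the character x removed, matches the result found above.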
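The comparison between the configured SBD device and the name shown in the console can be sketched as two small helpers. This is only an illustration: the function names are hypothetical, and the console-style name /dev/xvdf is an assumed example of a name that carries the extra x mentioned above.

```shell
# Hypothetical helper: print the value of SBD_DEVICE from sysconfig content
# on stdin, skipping comment lines such as "# SBD_DEVICE specifies ...".
sbd_device_from_config() {
    sed -n 's/^SBD_DEVICE="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p'
}

# Hypothetical helper: drop the "x" from a console-style device name
# (assumed form: /dev/xvdf) so it can be compared with the OS name.
console_to_os_device() {
    printf '%s\n' "$1" | sed 's|^/dev/xvd|/dev/vd|'
}
```

For example, `[ "$(sbd_device_from_config < /etc/sysconfig/sbd)" = "$(console_to_os_device /dev/xvdf)" ]` succeeds when the two names refer to the same device.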
- Run the following command to query the high-availability virtual IP settings used in this example.
crm configure show | grep -E "primitive rsc_vip|params ip"
The command returns output similar to the following. Replace the parameter names in the command to match your environment.
primitive rsc_vip IPaddr2 \
params ip=192.168.10.101
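Because the virtual IP resource is deleted later in the procedure, it can help to record the address first. The sketch below extracts it from text in the format shown above. The function name vip_from_crm_config is hypothetical, and the pattern assumes an IPv4 address written as `params ip=<address>`.

```shell
# Hypothetical helper: print every IPv4 address that appears as
# "params ip=<address>" in `crm configure show` output read from stdin.
vip_from_crm_config() {
    sed -n 's/.*params ip=\([0-9.]*\).*/\1/p'
}
```

A possible use is `crm configure show | vip_from_crm_config > /root/vip_before_migration.txt`, where the output path is only an example.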
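- Complete all of the configuration described in section "5.3.2 Option 2: fence_aliyun" of "SAP HANA High Availability Deployment in the Same Zone".
- Run the following command to put the cluster into maintenance mode.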
crm configure property maintenance-mode=true
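If maintenance attributes are already set in the cluster, prompts similar to the following appear. Enter y.
'maintenance' attribute already exists in rsc_sbd. Remove it (y/n)? y
'is-managed' conflicts with 'maintenance' in cln_SAPHanaTopology_HDB. Remove it (y/n)? y
- After the setting takes effect, run the following command to confirm that all resources are in the unmanaged state.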
crm_mon -r
The command returns output similar to the following.
2 nodes configured
6 resources configured
*** Resource management is DISABLED ***
The cluster will not attempt to start, stop or recover services
Online: [ hana001 hana002 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started hana001 (unmanaged)
rsc_vip (ocf::heartbeat:IPaddr2): Started hana001 (unmanaged)
Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable) (unmanaged)
rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave hana002 (unmanaged)
rsc_SAPHana_HDB (ocf::suse:SAPHana): Master hana001 (unmanaged)
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB] (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started hana002 (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started hana001 (unmanaged)
Note: If any resource is still not unmanaged, set it to unmanaged manually.
Syntax:
crm resource maintenance [resource name] true
For example, suppose the SAP HANA resource was not set to unmanaged as expected:
2 nodes configured
6 resources configured
*** Resource management is DISABLED ***
The cluster will not attempt to start, stop or recover services
Online: [ hana001 hana002 ]
Full list of resources:
rsc_sbd (stonith:external/sbd): Started hana001 (unmanaged)
rsc_vip (ocf::heartbeat:IPaddr2): Started hana001 (unmanaged)
Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable) (unmanaged)
rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave hana002
rsc_SAPHana_HDB (ocf::suse:SAPHana): Master hana001
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB] (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started hana002 (unmanaged)
rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started hana001 (unmanaged)
Run the following command to complete the setting:
crm resource maintenance rsc_SAPHana_HDB true
Then confirm again that all resources are in the unmanaged state.
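Spotting a resource that is still managed is easy to get wrong by eye in longer listings. The following sketch scans crm_mon -r output for primitive status lines that lack the (unmanaged) marker. The function name is hypothetical, and the pattern assumes the line format shown in this article, such as `(ocf::suse:SAPHana):` or `(stonith:external/sbd):`. Clone members that crm_mon summarizes on a single "Started: [ ... ]" line are not detected.

```shell
# Hypothetical helper: read `crm_mon -r` output on stdin and print the IDs of
# primitives whose status line does not carry the "(unmanaged)" marker.
list_managed_resources() {
    awk '/\((ocf|stonith):/ && !/\(unmanaged\)/ { print $1 }' | sort -u
}
```

Each ID that it prints could then be passed to `crm resource maintenance <ID> true`, as described in the note above.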
- Stop all resources.
Syntax:
crm resource stop ID1 ID2 ...
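Command used in this example (replace the IDs with the resource IDs of your environment):
crm resource stop rsc_sbd rsc_vip rsc_SAPHana_HDB rsc_SAPHanaTopology_HDB
- Delete all resources.
Syntax:
crm configure delete ID1 ID2 ...
Command used in this example:
crm configure delete rsc_sbd rsc_vip rsc_SAPHana_HDB rsc_SAPHanaTopology_HDB
- Restart the pacemaker service on each of the two nodes.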
systemctl restart pacemaker
- Exit cluster maintenance mode.
crm configure property maintenance-mode=false
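The sequence above, from entering maintenance mode to leaving it, can be written down for review before the downtime window. The sketch below only prints the commands for a given list of resource IDs and executes nothing. The function name print_teardown_commands is hypothetical.

```shell
# Hypothetical dry-run helper: print, but do not execute, the teardown
# commands from this procedure for the resource IDs given as arguments.
print_teardown_commands() {
    if [ "$#" -eq 0 ]; then
        echo "usage: print_teardown_commands ID1 ID2 ..." >&2
        return 1
    fi
    echo "crm configure property maintenance-mode=true"
    echo "crm resource stop $*"
    echo "crm configure delete $*"
    echo "systemctl restart pacemaker   # run on both nodes"
    echo "crm configure property maintenance-mode=false"
}
```

For the example cluster, `print_teardown_commands rsc_sbd rsc_vip rsc_SAPHana_HDB rsc_SAPHanaTopology_HDB` prints the same commands that the steps above run by hand.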
- After the resources have been removed, confirm that the cluster contains only the two nodes and that the resource count is 0.
crm_mon -r
Stack: corosync
Current DC: hana001 (version 2.0.1+20190417.13d370ca9-3.24.1-2.0.1+20190417.13d370ca9) - partition with quorum
Last updated: Thu Feb 24 11:57:13 2022
Last change: Thu Feb 24 11:57:09 2022 by root via cibadmin on hana001
2 nodes configured
0 resources configured
Online: [ hana001 hana002 ]
No resources
- Complete the fence agent script configuration as described in section 11.2 of "SAP HANA High Availability Deployment in the Same Zone".
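Before the fence agent is configured, the empty-cluster confirmation from the previous step can also be scripted. This is a minimal sketch that assumes the summary lines have the form shown in this article ("2 nodes configured", "0 resources configured" or "0 resource instances configured"). The function name cluster_is_empty is hypothetical.

```shell
# Hypothetical helper: succeed only when `crm_mon -r` output on stdin
# reports exactly 2 configured nodes and 0 configured resources.
cluster_is_empty() {
    awk '
        $2 == "nodes" && $3 == "configured" { nodes = $1 }
        ($2 == "resources" || ($2 == "resource" && $3 == "instances")) &&
            $NF == "configured" { res = $1; seen = 1 }
        END { exit((nodes == 2 && seen && res == 0) ? 0 : 1) }'
}
```

A possible use is `crm_mon -r -1 | cluster_is_empty && echo "cluster is empty"`, where the `-1` option makes crm_mon print once and exit.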
- Run the following command to verify the cluster configuration.
crm_mon -r
Stack: corosync
Current DC: hana001 (version 2.0.1+20190417.13d370ca9-3.24.1-2.0.1+20190417.13d370ca9) - partition with quorum
Last updated: Thu Feb 24 17:51:44 2022
Last change: Thu Feb 24 17:51:41 2022 by root via crm_attribute on hana001
2 nodes configured
7 resources configured
Online: [ hana001 hana002 ]
Full list of resources:
res_ALIYUN_STONITH_1 (stonith:fence_aliyun): Started hana002
res_ALIYUN_STONITH_2 (stonith:fence_aliyun): Started hana001
rsc_vip (ocf::heartbeat:IPaddr2): Started hana001
Clone Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (promotable)
Masters: [ hana001 ]
Slaves: [ hana002 ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ hana001 hana002 ]
Important: Confirm that the primary and secondary node roles in the cluster are as expected.
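Alongside the role check, the status output can be tested for the outcome of the migration itself: the fence_aliyun STONITH resources are started and no external/sbd resource remains. The sketch below assumes the line format shown above and one fence_aliyun resource per node. The function name fence_migration_ok is hypothetical.

```shell
# Hypothetical helper: succeed when `crm_mon -r` output on stdin shows at
# least two started fence_aliyun STONITH resources and no external/sbd one.
fence_migration_ok() {
    awk '
        /\(stonith:fence_aliyun\):[ ]+Started/ { aliyun++ }
        /\(stonith:external\/sbd\)/            { sbd++ }
        END { exit((aliyun >= 2 && sbd == 0) ? 0 : 1) }'
}
```

A non-zero exit status means that the listing should be reviewed manually. It is not proof of a misconfiguration.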
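- Test and verify high-availability failover. For details, see the official SUSE documentation or the SAP system high-availability environment maintenance guide.
- Run the following command to disable the SBD service.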
systemctl disable sbd
- Log on to the console and release the shared storage product.
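Scenario 2: SAP ASCS/SCS high-availability environment
The procedure for an SAP S/4HANA ASCS high-availability environment is as follows:
- Log on to the primary node of the cluster and run the following command to check the status of all resources.
Note: Unless otherwise stated, each step needs to be performed on only one node of the cluster.
crm_mon -r
The output is similar to the following. In this example, the ASCS high-availability environment is installed on two ECS instances, SAPAPP01 and SAPAPP02, and both the cluster and the managed resources are in a normal state.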
Stack: corosync
2 nodes configured
5 resource instances configured
Online: [ SAPAPP01 SAPAPP02 ]
Full list of resources:
stonith-sbd (stonith:external/sbd): Started SAPAPP01
Resource Group: grp_S4A_ASCS00
rsc_ip_S4A_ASCS00 (ocf::heartbeat:IPaddr2): Started SAPAPP01
rsc_sap_S4A_ASCS00 (ocf::heartbeat:SAPInstance): Started SAPAPP01
Resource Group: grp_S4A_ERS10
rsc_ip_S4A_ERS10 (ocf::heartbeat:IPaddr2): Started SAPAPP02
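rsc_sap_S4A_ERS10 (ocf::heartbeat:SAPInstance): Started SAPAPP02
- Run the following command to find the block device currently used by SBD.
grep SBD_DEVICE /etc/sysconfig/sbd
The command returns output similar to the following. In this example, the SBD block device is /dev/vdc.
# SBD_DEVICE specifies the devices to use for exchanging sbd messages
SBD_DEVICE="/dev/vdc"
Log on to the console and confirm that the device name of the disk attached to the ECS instance matches the device name found above.
Make sure that the device name shown in the console, with the character x removed, matches the result found above.
- Complete all of the configuration described in section "4.4 Option 2: Implementing the fence function with Fence_aliyun" of "SAP S/4HANA High Availability Deployment in the Same Zone".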
- Run the following command to put the cluster into maintenance mode.
crm configure property maintenance-mode=true
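If maintenance attributes are already set in the cluster, a prompt similar to the following appears. Enter y.
'maintenance' attribute already exists in rsc_sap_S4A_ERS10. Remove it (y/n)?
- After the setting takes effect, run the following command to confirm that all resources are in the unmanaged state.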
crm_mon -r
The command returns output similar to the following.
2 nodes configured
5 resource instances configured
*** Resource management is DISABLED ***
The cluster will not attempt to start, stop or recover services
Online: [ SAPAPP01 SAPAPP02 ]
Full list of resources:
stonith-sbd (stonith:external/sbd): Started SAPAPP01 (unmanaged)
Resource Group: grp_S4A_ASCS00
rsc_ip_S4A_ASCS00 (ocf::heartbeat:IPaddr2): Started SAPAPP01 (unmanaged)
rsc_sap_S4A_ASCS00 (ocf::heartbeat:SAPInstance): Started SAPAPP01 (unmanaged)
Resource Group: grp_S4A_ERS10
rsc_ip_S4A_ERS10 (ocf::heartbeat:IPaddr2): Started SAPAPP02 (unmanaged)
rsc_sap_S4A_ERS10 (ocf::heartbeat:SAPInstance): Started SAPAPP02 (unmanaged)
Note: If any resource is still not unmanaged, set it to unmanaged manually.
Syntax:
crm resource maintenance [resource name] true
Taking the resource rsc_ip_S4A_ASCS00 as an example, run the following command to complete the setting:
crm resource maintenance rsc_ip_S4A_ASCS00 true
Then confirm again that all resources are in the unmanaged state.
- Stop all resources.
Syntax (replace the IDs with the resource IDs of your environment):
crm resource stop ID1 ID2 ...
Command used in this example:
crm resource stop stonith-sbd rsc_ip_S4A_ERS10 rsc_sap_S4A_ERS10 rsc_ip_S4A_ASCS00 rsc_sap_S4A_ASCS00
- Delete all resources.
Syntax:
crm configure delete ID1 ID2 ...
Command used in this example:
crm configure delete stonith-sbd rsc_ip_S4A_ERS10 rsc_sap_S4A_ERS10 rsc_ip_S4A_ASCS00 rsc_sap_S4A_ASCS00
- Restart the pacemaker service on each of the two nodes.
systemctl restart pacemaker
- Exit cluster maintenance mode.
crm configure property maintenance-mode=false
- After the resources have been removed, confirm that the cluster contains only the two nodes and that the resource count is 0.
crm_mon -r
2 nodes configured
0 resource instances configured
Online: [ SAPAPP01 SAPAPP02 ]
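No resources
- Complete the fence agent script configuration as described in section "7.5.4 Option 2: Implementing the fence function with Fence_aliyun" of "SAP S/4HANA High Availability Deployment in the Same Zone".
- Run the following command to verify the cluster configuration.
crm_mon -r
Stack: corosync
2 nodes configured
6 resources configured
Online: [ SAPAPP01 SAPAPP02 ]
Full list of resources:
res_ALIYUN_STONITH_1 (stonith:fence_aliyun): Started SAPAPP02
res_ALIYUN_STONITH_2 (stonith:fence_aliyun): Started SAPAPP01
Resource Group: grp_S4A_ASCS00
rsc_ip_S4A_ASCS00 (ocf::heartbeat:IPaddr2): Started SAPAPP01
rsc_sap_S4A_ASCS00 (ocf::heartbeat:SAPInstance): Started SAPAPP01
Resource Group: grp_S4A_ERS10
rsc_ip_S4A_ERS10 (ocf::heartbeat:IPaddr2): Started SAPAPP02
rsc_sap_S4A_ERS10 (ocf::heartbeat:SAPInstance): Started SAPAPP02
Important: Confirm that the primary and secondary node roles in the cluster are as expected.
- Test and verify high-availability failover. For details, see the official SUSE documentation or the SAP system high-availability environment maintenance guide.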
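As a complement to the manual role check, the placement of the ASCS and ERS instances can be tested from the status output. The sketch below assumes that the expected layout is the one in the example output, with ASCS and ERS started on different nodes, and that the SAPInstance resource IDs end in _ASCSnn (or _SCSnn) and _ERSnn as they do in this article. The function name is hypothetical.

```shell
# Hypothetical helper: succeed when `crm_mon -r` output on stdin shows the
# ASCS/SCS and ERS SAPInstance resources started on two different nodes.
# Assumption: fields are "<id> (<agent>): Started <node>" as in this article.
ascs_ers_on_different_nodes() {
    awk '
        /\(ocf::heartbeat:SAPInstance\):[ ]+Started/ {
            if ($1 ~ /_A?SCS[0-9]+$/) ascs = $4
            if ($1 ~ /_ERS[0-9]+$/)   ers  = $4
        }
        END { exit((ascs != "" && ers != "" && ascs != ers) ? 0 : 1) }'
}
```

The check can be repeated after each failover test to confirm that the cluster has returned to the expected layout.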
- Run the following command to disable the SBD service.
systemctl disable sbd
- Log on to the console and release the shared storage product.