Example installation of a SUN two-node cluster with SunCluster 3.1
Author: Frank
Date: 01.03.2004, last update 02.03.2004
Summary:
The example cluster physically consists of two workstations, the so-called nodes, and two disk arrays. A database is installed on these machines. The cluster software ties this hardware together into one logical host in such a way that as many failure situations as possible can be survived.
Contents:
Step A.
[Installation of the admin station and terminal concentrator]
Hostname: dba01
IP address: 192.168.3.6 /24
node1:\
        :dv=/dev/term/a:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:
node2:\
        :dv=/dev/term/b:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:
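With these entries in /etc/remote on the admin station dba01, the node consoles can be reached with tip; a small usage sketch (standard Solaris tip behaviour assumed):

# tip node1
(you are now on the console of db1 via /dev/term/a at 9600 baud; the session is closed with ~. )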
Step B.
[Setting up the two cluster nodes]
Hostname Node 1: db1
IP address Node 1: 192.168.3.1 /24
Hostname Node 2: db2
IP address Node 2: 192.168.3.2 /24
When using an A5100 with only 7 disks, the distribution across the slots needs careful attention. It would look like this: front: 0, 5 and 10; rear: 0, 3, 6 and 10.
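Whether the node actually sees the A5100 and its disks can be checked with luxadm, for example (a small sketch; use the enclosure name reported by the first command in the second one):

# luxadm probe                  (lists all A5x00 enclosures connected to the node)
# luxadm display <enclosure>    (shows the front/rear slot population of that enclosure)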
PCI slot 1: Quad Ethernet
PCI slot 2: Graphics
PCI slot 3: Fibre Channel
PCI slot 4: Fibre Channel
===============================================
Slice    Mount Point        Size (MB)
  0      /                      28052
  1      swap                    6144
  2      overlap                34730
  3                                 0
  4                                 0
  5                                 0
  6      /globaldevices           512
  7      /bla                      21
===============================================
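Whether the slices really ended up as planned can be verified afterwards with prtvtoc, for example:

# prtvtoc /dev/rdsk/c0t0d0s2    (prints the slice table of the boot disk)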
# for i in `cat README.txt`
> do
>   echo "installing $i" ; unzip -q $i ; patchadd $i
> done
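Whether the patches were actually applied can be checked afterwards, for example with:

# showrev -p | tail             (lists the most recently installed patches)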
Step C.
[Installation of the actual cluster software]
Is it okay to continue (yes/no) [yes]?  yes

** Installing SunCluster 3.1 framework **
        SUNWscr.....done
        SUNWscu.....done
        SUNWscsck...done
        SUNWscnm....done
        SUNWscdev...done
        SUNWscgds...done
        SUNWscman...done
        SUNWscsal...done
        SUNWscsam...done
        SUNWscvm....done
        SUNWmdm.....done
        SUNWscva....done
        SUNWscvr....done
        SUNWscvw....done
        SUNWscrsm...done
        SUNWfsc.....done
        SUNWfscvw...done
        SUNWjsc.....done
        SUNWjscman..done
        SUNWjscvw...done
        SUNWksc.....done
        SUNWkscvw...done
        SUNWcsc.....done
        SUNWcscvw...done
        SUNWhsc.....done
        SUNWhscvw...done
        SUNWexplo...done

Press Enter to continue:
Do you want to run sccheck (yes/no) [yes]?

sccheck: Requesting explorer data and node report from db1.
sccheck: db1: Explorer finished.
sccheck: db1: Starting single-node checks.
sccheck: db1: Single-node checks finished.

Press Enter to continue:
Node name:  db2
Node name (Control-D to finish):  ^D

This is the complete list of nodes:
        db1
        db2

Is it correct (yes/no) [yes]?
>>> Cluster Transport Adapters and Cables <<<

    You must configure at least two cluster transport adapters for each
    node in the cluster. These are the adapters which attach to the
    private cluster interconnect.

Select the first cluster transport adapter:

    1) hme0
    2) qfe0
    3) qfe2
    4) qfe3
    5) Other

Option:  1

Adapter "hme0" is an Ethernet adapter.

Searching for any unexpected network traffic on "hme0" ... done
Verification completed. No traffic was detected over a 10 second
sample period.

The "dlpi" transport type will be set for this cluster.

Select the second cluster transport adapter:

    1) hme0
    2) qfe0
    3) qfe2
    4) qfe3
    5) Other

Option:  2
>>> Global Devices File System <<<

    Each node in the cluster must have a local file system mounted on
    /global/.devices/node@<nodeID> before it can successfully participate
    as a cluster member. Since the "nodeID" is not assigned until
    scinstall is run, scinstall will set this up for you.

    You must supply the name of either an already-mounted file system or
    raw disk partition which scinstall can use to create the global
    devices file system. This file system or partition should be at
    least 512 MB in size.

    If an already-mounted file system is used, the file system must be
    empty. If a raw disk partition is used, a new file system will be
    created for you.

    The default is to use /globaldevices.

Is it okay to use this default (yes/no) [yes]?
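After scinstall has finished on both nodes and they have rebooted into the cluster, the membership can be checked, for example with:

# scstat -n                     (shows whether db1 and db2 are Online as cluster nodes)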
Step D.
[Resetting the installmode]
# grep did /etc/name_to_major
did 300
# scdidadm -L
1        db1:/dev/rdsk/c0t0d0          /dev/did/rdsk/d1
2        db1:/dev/rdsk/c0t6d0          /dev/did/rdsk/d2
3        db1:/dev/rdsk/c1t19d0         /dev/did/rdsk/d3
3        db2:/dev/rdsk/c1t19d0         /dev/did/rdsk/d3
4        db1:/dev/rdsk/c1t22d0         /dev/did/rdsk/d4
4        db2:/dev/rdsk/c1t22d0         /dev/did/rdsk/d4
5        db1:/dev/rdsk/c1t26d0         /dev/did/rdsk/d5
5        db2:/dev/rdsk/c1t26d0         /dev/did/rdsk/d5
6        db1:/dev/rdsk/c1t0d0          /dev/did/rdsk/d6
6        db2:/dev/rdsk/c1t0d0          /dev/did/rdsk/d6
7        db1:/dev/rdsk/c1t16d0         /dev/did/rdsk/d7
7        db2:/dev/rdsk/c1t16d0         /dev/did/rdsk/d7
8        db1:/dev/rdsk/c1t5d0          /dev/did/rdsk/d8
8        db2:/dev/rdsk/c1t5d0          /dev/did/rdsk/d8
9        db1:/dev/rdsk/c1t10d0         /dev/did/rdsk/d9
9        db2:/dev/rdsk/c1t10d0         /dev/did/rdsk/d9
10       db1:/dev/rdsk/c2t51d0         /dev/did/rdsk/d10
10       db2:/dev/rdsk/c2t51d0         /dev/did/rdsk/d10
11       db1:/dev/rdsk/c2t58d0         /dev/did/rdsk/d11
11       db2:/dev/rdsk/c2t58d0         /dev/did/rdsk/d11
12       db1:/dev/rdsk/c2t48d0         /dev/did/rdsk/d12
12       db2:/dev/rdsk/c2t48d0         /dev/did/rdsk/d12
13       db1:/dev/rdsk/c2t32d0         /dev/did/rdsk/d13
13       db2:/dev/rdsk/c2t32d0         /dev/did/rdsk/d13
14       db1:/dev/rdsk/c2t42d0         /dev/did/rdsk/d14
14       db2:/dev/rdsk/c2t42d0         /dev/did/rdsk/d14
15       db1:/dev/rdsk/c2t54d0         /dev/did/rdsk/d15
15       db2:/dev/rdsk/c2t54d0         /dev/did/rdsk/d15
16       db1:/dev/rdsk/c2t37d0         /dev/did/rdsk/d16
16       db2:/dev/rdsk/c2t37d0         /dev/did/rdsk/d16
17       db2:/dev/rdsk/c0t0d0          /dev/did/rdsk/d17
18       db2:/dev/rdsk/c0t6d0          /dev/did/rdsk/d18
# grep 172 /etc/cluster/ccr/infrastructure
cluster.properties.private_net_number 172.16.0.0
cluster.nodes.1.adapters.1.properties.ip_address 172.16.0.129
cluster.nodes.1.adapters.2.properties.ip_address 172.16.1.1
cluster.nodes.2.adapters.1.properties.ip_address 172.16.0.130
cluster.nodes.2.adapters.2.properties.ip_address 172.16.1.2
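The private interconnect that uses these 172.16.x.x addresses can be checked from either node, for example:

# scstat -W                     (status of the cluster transport paths)
# ping 172.16.0.130             (simple reachability test of db2 over the first private interface)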
>>> Initial Cluster Setup <<<

    This program has detected that the cluster "installmode" attribute is
    still enabled. As such, certain initial cluster setup steps will be
    performed at this time. This includes adding any necessary quorum
    devices, then resetting both the quorum vote counts and the
    "installmode" property.

    Please do not proceed if any additional nodes have yet to join the
    cluster.

Is it okay to continue (yes/no) [yes]?

Do you want to add any quorum disks (yes/no) [yes]?

    Dual-ported SCSI-2 disks may be used as quorum devices in two-node
    clusters. However, clusters with more than two nodes require that
    SCSI-3 PGR disks be used for all disks with more than two
    node-to-disk paths. You can use a disk containing user data or one
    that is a member of a device group as a quorum device.

    Each quorum disk must be connected to at least two nodes. Please
    refer to the Sun Cluster documentation for more information on
    supported quorum device topologies.

Which global device do you want to use (d<N>)?  d3

Is it okay to proceed with the update (yes/no) [yes]?

scconf -a -q globaldev=d3

    Command completed successfully.

Do you want to add another quorum disk (yes/no)?  no

    Once the "installmode" property has been reset, this program will
    skip "Initial Cluster Setup" each time it is run again in the
    future. However, quorum devices can always be added to the cluster
    using the regular menu options. Resetting this property fully
    activates quorum settings and is necessary for the normal and safe
    operation of the cluster.

Is it okay to reset "installmode" (yes/no) [yes]?

scconf -c -q reset
scconf -a -T node=.

    Cluster initialization is complete.
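Whether d3 was accepted as quorum device and the installmode is really gone can be verified, for example with:

# scstat -q                     (quorum votes of the nodes and of the quorum device d3)
# scconf -p | grep -i quorum    (quorum section of the cluster configuration)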
Step E.
[Installing Veritas]
Do you want Volume Manager to encapsulate root [no]? yes
Where is the Volume Manager cdrom [/cdrom/cdrom0]?
Disabling DMP.
Installing packages from /cdrom/cdrom0/pkgs.
Installing VRTSvlic.
Installing VRTSvxvm.
Installing VRTSvmman.
Obtaining the clusterwide vxio number...
Using 315 as the vxio major number.
Volume Manager installation is complete.

One or more Volume Manager features are already licensed.
Use the "vxlicrep" command to get a report of Veritas license keys installed.
If you do not want to supply an additional license key, just press ENTER.
Otherwise, you may provide one additional key.
Please enter a Volume Manager license key [none]:

Verifying encapsulation requirements.
The Volume Manager root disk encapsulation step will begin in 20 seconds.
Type Ctrl-C to abort ....................
Arranging for Volume Manager encapsulation of the root disk.
The vxconfigd daemon has been started and is in disabled mode...
Reinitialized the volboot file...
Created the rootdg...
Added the rootdisk to the rootdg...
The setup to encapsulate rootdisk is complete...
Updating /global/.devices entry in /etc/vfstab.

If everything was done correctly, the machine now reboots twice as if by magic.
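After the second reboot the root disk should be part of rootdg; a quick sanity check could look like this:

# vxprint -g rootdg -ht         (rootdg now contains the encapsulated root disk)
# df -k /                       (/ is now mounted from a /dev/vx/dsk/... device)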
Step F.
[Configuring the shared disk groups and volumes]
# vxdisk -o alldgs list
DEVICE       TYPE      DISK         GROUP        STATUS
c0t0d0s2     sliced    rootdisk_1   rootdg       online
c1t0d0s2     sliced    -            -            error
c1t5d0s2     sliced    -            -            error
c1t10d0s2    sliced    -            -            error
c1t16d0s2    sliced    -            -            error
c1t19d0s2    sliced    -            -            error
c1t22d0s2    sliced    -            -            error
c1t26d0s2    sliced    -            -            error
c2t32d0s2    sliced    -            -            error
c2t37d0s2    sliced    -            -            error
c2t42d0s2    sliced    -            -            error
c2t48d0s2    sliced    -            -            error
c2t51d0s2    sliced    -            -            error
c2t54d0s2    sliced    -            -            error
c2t58d0s2    sliced    -            -            error
# vxdisk -o alldgs list
DEVICE       TYPE      DISK         GROUP        STATUS
c0t0d0s2     sliced    rootdisk_1   rootdg       online
c1t0d0s2     sliced    -            -            online
c1t5d0s2     sliced    -            -            online
c1t10d0s2    sliced    -            -            online
c1t16d0s2    sliced    -            -            online
c1t19d0s2    sliced    -            -            online
c1t22d0s2    sliced    -            -            online
c1t26d0s2    sliced    -            -            online
c2t32d0s2    sliced    -            -            online
c2t37d0s2    sliced    -            -            online
c2t42d0s2    sliced    -            -            online
c2t48d0s2    sliced    -            -            online
c2t51d0s2    sliced    -            -            online
c2t54d0s2    sliced    -            -            online
c2t58d0s2    sliced    -            -            online
# vxdg init protdg lv0=c1t0d0s2
# vxdg -g protdg adddisk rv0=c2t32d0s2
#
# vxdg init oradg lv5=c1t5d0s2
# vxdg -g oradg adddisk lv10=c1t10d0s2
# vxdg -g oradg adddisk lh0=c1t16d0s2
# vxdg -g oradg adddisk lh3=c1t19d0s2
# vxdg -g oradg adddisk lh6=c1t22d0s2
# vxdg -g oradg adddisk lh10=c1t26d0s2
# vxdg -g oradg adddisk rv5=c2t37d0s2
# vxdg -g oradg adddisk rv10=c2t42d0s2
# vxdg -g oradg adddisk rh0=c2t48d0s2
# vxdg -g oradg adddisk rh3=c2t51d0s2
# vxdg -g oradg adddisk rh6=c2t54d0s2
# vxdg -g oradg adddisk rh10=c2t58d0s2
# vxdisk -o alldgs list
DEVICE       TYPE      DISK         GROUP        STATUS
c0t0d0s2     sliced    rootdisk_1   rootdg       online
c1t0d0s2     sliced    lv0          protdg       online
c1t5d0s2     sliced    lv5          oradg        online
c1t10d0s2    sliced    lv10         oradg        online
c1t16d0s2    sliced    lh0          oradg        online
c1t19d0s2    sliced    lh3          oradg        online
c1t22d0s2    sliced    lh6          oradg        online
c1t26d0s2    sliced    lh10         oradg        online
c2t32d0s2    sliced    rv0          protdg       online
c2t37d0s2    sliced    rv5          oradg        online
c2t42d0s2    sliced    rv10         oradg        online
c2t48d0s2    sliced    rh0          oradg        online
c2t51d0s2    sliced    rh3          oradg        online
c2t54d0s2    sliced    rh6          oradg        online
c2t58d0s2    sliced    rh10         oradg        online
# vxassist -g oradg make redolog01 512m layout=mirror-concat lv5 rv5 &
# vxassist -g oradg make redolog02 512m layout=mirror-concat lv10 rv10 &
# vxassist -g oradg make redolog03 512m layout=mirror-concat lh0 rh0 &
# vxassist -g oradg make rbs 3072m layout=mirror-stripe lh3 lh6 rh3 rh6 &
# vxassist -g oradg make tools 1536m layout=mirror-concat lv5 rv5 &
# vxassist -g oradg make temp01 1536m layout=mirror-concat lv10 rv10 &
# vxassist -g oradg make user 1536m layout=mirror-concat lh0 rh0 &
# vxassist -g oradg make index01 1024m layout=mirror-stripe lh3 lh6 rh3 rh6 &
# vxassist -g oradg maxsize layout=mirror-stripe lv5 lv10 lh0 lh3 lh6 rv5 rv10 rh0 rh3 rh6
Maximum volume size: 155777024 (76063Mb)

With this information (76063 MB) we build the last volume:
# vxassist -g oradg make system01 76063m layout=mirror-stripe nstripe=5 mirror=ctlr ctlr:c1 ctlr:c2 \
        lv5 lv10 lh0 lh3 lh6 rv5 rv10 rh0 rh3 rh6
vxassist -g protdg make prot01 10240m layout=mirror-concat lv0 rv0 &
# vxedit -g oradg set spare=on lh10
# vxedit -g oradg set spare=on rh10
# vxdisk list | grep spare
c1t26d0s2    sliced    lh10         oradg        online spare
c2t58d0s2    sliced    rh10         oradg        online spare
# vxtask list
TASKID  PTID  TYPE/STATE     PCT     PROGRESS
   170        RDWRBACK/R  30.81%  0/155777024/48001392 VOLSTART system01
   172        RDWRBACK/R  58.61%  0/20971520/12291624 VOLSTART prot01
# scstat -D

-- Device Group Servers --

                         Device Group        Primary             Secondary
                         ------------        -------             ---------

-- Device Group Status --

                              Device Group        Status
                              ------------        ------
# scconf -a -D type=vxvm,name=protdg,nodelist=db1:db2
# scconf -a -D type=vxvm,name=oradg,nodelist=db1:db2

And does the cluster know about them now?
# scstat -D

-- Device Group Servers --

                         Device Group        Primary             Secondary
                         ------------        -------             ---------
  Device group servers:  protdg              db1                 db2
  Device group servers:  oradg               db1                 db2

-- Device Group Status --

                              Device Group        Status
                              ------------        ------
  Device group status:        protdg              Online
  Device group status:        oradg               Online
newfs /dev/vx/rdsk/protdg/prot01
newfs /dev/vx/rdsk/oradg/rbs
newfs /dev/vx/rdsk/oradg/tools
newfs /dev/vx/rdsk/oradg/temp01
newfs /dev/vx/rdsk/oradg/user
newfs /dev/vx/rdsk/oradg/index01
newfs /dev/vx/rdsk/oradg/system01
newfs /dev/vx/rdsk/oradg/redolog01
newfs /dev/vx/rdsk/oradg/redolog02
newfs /dev/vx/rdsk/oradg/redolog03
Step G.
[Setting up IPMP]
# ifconfig qfe1
qfe1: flags=1000843 mtu 1500 index 2
        inet 192.168.3.1 netmask ffffff00 broadcast 192.168.3.255
# more /etc/hostname.qfe1
db1
# grep db1 /etc/hosts
192.168.3.1     db1       db1.d-frust.de
192.168.3.101   db1-qfe1
192.168.3.201   db1-qfe3
172.16.0.129    db1-priv1
172.16.1.1      db1-priv2
# more /etc/hostname.qfe1
db1 group ocsp-net up addif db1-qfe1 -failover deprecated up
# more /etc/hostname.qfe3
db1-qfe3 group ocsp-net -failover deprecated up

Once the three files have been changed accordingly, we can reboot this node:
init 6
# ifconfig -a
qfe1: flags=1000843 mtu 1500 index 2
        inet 192.168.3.1 netmask ffffff00 broadcast 192.168.3.255
        groupname ocsp-net
qfe1:1: flags=9040843 mtu 1500 index 2
        inet 192.168.3.101 netmask ffffff00 broadcast 192.168.3.255
qfe3: flags=9040843 mtu 1500 index 3
        inet 192.168.3.201 netmask ffffff00 broadcast 192.168.3.255
        groupname ocsp-net
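A failover within the IPMP group can also be provoked by hand to check that the test addresses do their job (a small sketch using if_mpadm as shipped with Solaris):

# if_mpadm -d qfe1              (detach qfe1 - its data address moves over to qfe3)
# if_mpadm -r qfe1              (reattach qfe1 - the address moves back)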
Step H.
[ORACLE]
# grep 3.5 /etc/hosts
192.168.3.5     db      db.d-frust.de
groupadd -g 2005 dba
useradd -m -c "oracle user" -g dba -u 2005 -d /opt/oracle oracle
passwd oracle
# vxprint -l -v redolog01 | more
Disk group: oradg

Volume:   redolog01
info:     len=1048576
type:     usetype=fsgen
state:    state=ACTIVE kernel=ENABLED cdsrecovery=0/0 (clean)
assoc:    plexes=redolog01-01,redolog01-02
policies: read=SELECT (round-robin) exceptions=GEN_DET_SPARSE
flags:    closed writeback
logging:  type=REGION loglen=0 serial=0/0 (disabled)
apprecov: seqno=0/0
recovery: mode=default recov_id=0
device:   minor=24000 bdev=315/24000 cdev=315/24000 path=/dev/vx/dsk/oradg/redolog01
perms:    user=root group=root mode=0600

# vxedit -g oradg set user=oracle group=dba redolog01
# vxedit -g oradg set user=oracle group=dba redolog02
# vxedit -g oradg set user=oracle group=dba redolog03
--- etc. ---

# vxprint -l -v redolog01 | grep perms
perms:    user=oracle group=dba mode=0600

# scconf -c -D name=oradg,sync
# mkdir -p /global/oracle/system
# mkdir -p /global/oracle/index
# mkdir -p /global/oracle/rbs
# mkdir -p /global/oracle/temp
# mkdir -p /global/oracle/tools
# mkdir -p /global/oracle/user
# mkdir -p /global/oracle/redolog01
# mkdir -p /global/oracle/redolog02
# mkdir -p /global/oracle/redolog03
/dev/vx/dsk/oradg/system01   /dev/vx/rdsk/oradg/system01   /global/oracle/system     ufs  2  yes  logging,global
/dev/vx/dsk/oradg/index01    /dev/vx/rdsk/oradg/index01    /global/oracle/index      ufs  2  yes  logging,global
/dev/vx/dsk/oradg/rbs        /dev/vx/rdsk/oradg/rbs        /global/oracle/rbs        ufs  2  yes  logging,global
/dev/vx/dsk/oradg/temp01     /dev/vx/rdsk/oradg/temp01     /global/oracle/temp       ufs  2  yes  logging,global
/dev/vx/dsk/oradg/tools      /dev/vx/rdsk/oradg/tools      /global/oracle/tools      ufs  2  yes  logging,global
/dev/vx/dsk/oradg/user       /dev/vx/rdsk/oradg/user       /global/oracle/user       ufs  2  yes  logging,global
/dev/vx/dsk/oradg/redolog01  /dev/vx/rdsk/oradg/redolog01  /global/oracle/redolog01  ufs  2  yes  logging,global
/dev/vx/dsk/oradg/redolog02  /dev/vx/rdsk/oradg/redolog02  /global/oracle/redolog02  ufs  2  yes  logging,global
/dev/vx/dsk/oradg/redolog03  /dev/vx/rdsk/oradg/redolog03  /global/oracle/redolog03  ufs  2  yes  logging,global

In the cluster the file systems are managed by the so-called HAStoragePlus resource, which takes its information from /etc/vfstab.
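With these vfstab entries in place, the file systems can already be mounted by hand on one node as a quick test (thanks to the global mount option they then show up on both nodes):

# mount /global/oracle/system
# mount | grep /global/oracle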
* Start of changes for Oracle
set shmsys:shminfo_shmmax=4294967295
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=200
set shmsys:shminfo_shmseg=40
set semsys:seminfo_semmni=200
set semsys:seminfo_semmsl=1000
set semsys:seminfo_semmns=3000
set semsys:seminfo_semopm=200
set semsys:seminfo_semvmx=32767
* End of changes for Oracle

and reboot:

init 6
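After the reboot the new IPC limits can be double-checked, for example with sysdef:

# sysdef | grep -i shmmax       (should show the 4294967295 set in /etc/system)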
$ sqlplus /nolog
SQL> connect / as sysdba
SQL> create user sc_fm identified by sc_fm;
SQL> grant create session, create table to sc_fm;
SQL> grant select on v_$sysstat to sc_fm;
SQL> alter user sc_fm default tablespace users quota 1m on users;
SQL> quit

The password "identified by sc_fm" is of course only an example and should be set to something more secure. The v_$sysstat view only becomes available some time after the instance has been created; if it is not there yet, simply take a coffee break.
$ sqlplus sc_fm/sc_fm
SQL> select * from sys.v_$sysstat;
SQL> quit
$ sqlplus /nolog
SQL> connect / as sysdba
SQL> create table mytable (mykey VARCHAR2(10), myval NUMBER(10));
SQL> insert into mytable values ('off', 0);
SQL> insert into mytable values ('on', 1);
SQL> commit;
SQL> select * from mytable;

MYKEY           MYVAL
---------- ----------
off                 0
on                  1

SQL> quit
Step I.
[Integrating Oracle into the cluster framework]
scrgadm -a -t SUNW.HAStoragePlus
cd /cdrom/scdataservices_3_1_vb/components/SunCluster_HA_Oracle_3.1/Sol_9/Packages
pkgadd -d . SUNWscor
scrgadm -a -t SUNW.oracle_server
scrgadm -a -t SUNW.oracle_listener
scrgadm -a -g oracle-rg -h db1,db2
scrgadm -a -L -g oracle-rg -l db
scrgadm -a -j oracle-hasp-res -g oracle-rg -t HAStoragePlus \
        -x FilesystemMountpoints=/global/oracle/system,\
/global/oracle/index,/global/oracle/rbs,\
/global/oracle/temp,/global/oracle/tools,\
/global/oracle/user,/global/oracle/redolog01,\
/global/oracle/redolog02,/global/oracle/redolog03 \
        -x AffinityOn=true
scrgadm -a -j oracle-db-rs -g oracle-rg \
        -t oracle_server \
        -y resource_dependencies=oracle-hasp-res \
        -x Oracle_sid=oid1 -x Oracle_home=/opt/oracle/product/8.1.7 \
        -x Alert_log_file=/opt/oracle/product/8.1.7/admin/oid1/bdump/alert_oid1.log \
        -x Parameter_file=/global/oracle/user/oid1/pfile/initoid1.ora \
        -x Connect_string=sc_fm/sc_fm

In all cases the -a switch stands for "add". If you mistype something, you can straighten it out afterwards with the -c (change) switch. For example, the following command changes the reference to the Oracle parameter file:
scrgadm -c -j oracle-db-rs -x Parameter_file="/global/oracle/user/oid1/pfile/initoid1.ora"
scrgadm -a -j oracle-list-rs -g oracle-rg -t oracle_listener -x Oracle_home=/opt/oracle/product/8.1.7 -x Listener_name=LISTENER
scswitch -Z -g oracle-rg
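Whether the resource group and all its resources really came online can be checked with:

# scstat -g                     (status of the resource groups and their resources)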
Step J.
[Example of adding the Oracle Internet Directory]
scrgadm -a -t DFRU.oid
scrgadm -a -j oracle-oid-rs -g oracle-rg -t DFRU.oid \
        -x Confdir_list=/global/oracle/system \
        -y Resource_dependencies=oracle-db-rs

If this does not work, the cause is usually the validation script located at /opt/DFRU.oid/bin/oid_validate.ksh. You then have to read it very carefully to understand what the cluster actually expects from you.
scswitch -e -j oracle-oid-rs
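As with the other resources, whether oracle-oid-rs actually reaches the Online state can be seen, for example, with:

# scstat -g | grep oid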