Link back: This guide is a part of the Virtual Oracle RAC project, the index to the whole project is here.
This part of the project provides instructions on configuring iSCSI initiators in Linux.
Configuring iSCSI services.
Start the “iscsid” service:
# service iscsid start
Run the following commands to make sure both services start automatically after a reboot:
# chkconfig iscsid on
# chkconfig iscsi on
Now, we check if our iSCSI service can communicate with Openfiler. Run this command (shows available iSCSI targets):
# iscsiadm -m discovery -t sendtargets -p openfiler-priv
Is your output different from what is shown above — did you get fewer targets, or none at all, even though Openfiler is up and running and you followed the instructions to the letter? Let’s go back and recall how we configured our iSCSI targets. There was a “Network ACL” setting, which acts like a firewall in Openfiler. Remember now? It is set to “Deny” by default for each target. Go back and set it to “Allow” for all targets. The change takes effect immediately, so you can run the discovery again.
As it turns out, when the iSCSI initiator discovers the targets, it configures the services to start automatically on reboot and log into the targets. We can test this now by rebooting our Linux machine. This is what you should see during restart if the iSCSI setup was done properly:
If login to the targets did not happen on reboot, you will need to execute the commands below:
# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.crs -p 10.10.1.20 -l
# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.crs -p 10.10.1.20 --op update -n node.startup -v automatic
The “-l” option means: log into the target (“a node”; do not confuse this “node” with RAC nodes).
The “--op update” option means: update the configuration property named by the “-n” option, “node.startup” in this case.
Run these two commands for each of the targets; I am showing only the first of them.
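Typing both commands for every target means ten commands in all. Here is a small sketch of my own (not part of the original walkthrough) that prints all ten for review; it assumes the five targets named crs and asm1 through asm4 from the volume table below. Once you are happy with the output, pipe it to “sh” as root to execute it.

```shell
#!/bin/sh
# Portal address and IQN prefix as used elsewhere in this guide.
PORTAL=10.10.1.20
PREFIX=iqn.2006-01.com.openfiler:racdb

# Print the login command and the node.startup update command
# for each of the five targets.
gen_iscsi_cmds() {
    for T in crs asm1 asm2 asm3 asm4; do
        echo "iscsiadm -m node -T ${PREFIX}.${T} -p ${PORTAL} -l"
        echo "iscsiadm -m node -T ${PREFIX}.${T} -p ${PORTAL} --op update -n node.startup -v automatic"
    done
}

gen_iscsi_cmds
```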
The command below (no operation specified) can be used to query configuration of a target:
# iscsiadm -m node -T iqn.2006-01.com.openfiler:racdb.crs -p 10.10.1.20
Making device names persistent.
Linux “talks” to iSCSI targets using local device names. The mapping of our iSCSI targets to local SCSI device names is random and may change after reboot. It is a problem that needs fixing. The mapping of targets to the local devices is illustrated here:
Since we want to have a permanent and consistent mapping across all RAC nodes, we are going to create persistent local SCSI device names. This is done using “udev”, which is a Dynamic Device Management tool.
# cd /etc/udev/rules.d/
Create a file called “55-openiscsi.rules” with the following content:
KERNEL=="sd*", BUS=="scsi", PROGRAM="/etc/udev/scripts/iscsidev.sh %b", SYMLINK+="iscsi/%c/part%n"
Navigate to another directory:
# cd /etc/udev/scripts
We create here a new shell script called “iscsidev.sh”. It takes a SCSI host number from udev, looks up the iSCSI target name behind it in sysfs, and prints the short target name for udev to use in the symlink:

#!/bin/sh
# FILE: /etc/udev/scripts/iscsidev.sh

BUS=${1}
HOST=${BUS%%:*}

[ -e /sys/class/iscsi_host ] || exit 1

file="/sys/class/iscsi_host/host${HOST}/device/session*/iscsi_session*/targetname"
target_name=$(cat ${file})

# This is not an open-scsi drive
if [ -z "${target_name}" ]; then
    exit 1
fi

echo "${target_name##*.}"
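The key trick in this script is the shell parameter expansion “${target_name##*.}”, which produces the short name udev substitutes as %c in the rule above: “##*.” deletes the longest prefix matching “*.”, i.e. everything up to and including the last dot. A quick standalone demonstration:

```shell
#!/bin/sh
# "##*." strips everything through the last dot in the target name,
# leaving only the short name used for the /dev/iscsi symlink.
target_name=iqn.2006-01.com.openfiler:racdb.asm1
short=${target_name##*.}
echo "${short}"    # prints: asm1
```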
Make the new script executable:
# chmod 755 /etc/udev/scripts/iscsidev.sh
Let’s restart the iSCSI initiator service:
# service iscsi stop
# service iscsi start
Here is the outcome of all that (the image below may be too wide, but I did not want the lines to wrap):
Well, how do we know that what we’ve done actually worked? We look at the names in /dev/iscsi (this directory was just created by these commands; it did not exist before) and compare them to the mapping in /dev/disk/by-path:
Take, for instance, “/dev/iscsi/asm1/part”: it corresponds to the “asm1” target (through /dev/sda).
Now we have persistent local names for our targets. We can reboot the odbn1 machine and see that the iSCSI devices are still there and properly mapped.
Mapping of iSCSI Target Name to Local Device Name

| iSCSI Target Name                    | Local Device Name    |
|--------------------------------------|----------------------|
| iqn.2006-01.com.openfiler:racdb.crs  | /dev/iscsi/crs/part  |
| iqn.2006-01.com.openfiler:racdb.asm1 | /dev/iscsi/asm1/part |
| iqn.2006-01.com.openfiler:racdb.asm2 | /dev/iscsi/asm2/part |
| iqn.2006-01.com.openfiler:racdb.asm3 | /dev/iscsi/asm3/part |
| iqn.2006-01.com.openfiler:racdb.asm4 | /dev/iscsi/asm4/part |
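To double-check the mapping from the command line, here is a small helper of my own (not from the article) that lists each entry in a directory together with the real device it resolves to; on a RAC node you would point it at a directory such as /dev/iscsi/asm1.

```shell
#!/bin/sh
# Print every entry in the given directory together with its fully
# resolved path, e.g. /dev/iscsi/asm1/part1 -> /dev/sda1.
list_links() {
    for entry in "$1"/*; do
        printf '%s -> %s\n' "$entry" "$(readlink -f "$entry")"
    done
}

# On a RAC node: list_links /dev/iscsi/asm1
```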
Next, we will have to create partitions on our iSCSI volumes.
Creating partitions on iSCSI Volumes.
Before we start creating partitions, it makes sense to shut down our virtual machines and take a snapshot of them. This gives us the option of reverting to a known machine state if something goes wrong.
From here on, I assume the snapshot is taken and our machines are back online.
Notice: some of the material for this article was taken from an article by an Oracle author; I recommend reading it if you need more detailed information.
The following table lists the five iSCSI volumes and what file systems they will support:
Oracle Shared Drive Configuration

| File System Type | iSCSI Target (short) Name | Size | Mount Point | ASM Diskgroup Name    | File Types                                                         |
|------------------|---------------------------|------|-------------|-----------------------|--------------------------------------------------------------------|
| OCFS2            | crs                       | 2GB  | /u02        |                       | Oracle Cluster Registry (OCR) File (~250 MB), Voting Disk (~20 MB) |
| ASM              | asm1                      | 8GB  | ORCL:VOL1   | +RACDB_DATA1          | Oracle Database Files                                              |
| ASM              | asm2                      | 8GB  | ORCL:VOL2   | +RACDB_DATA1          | Oracle Database Files                                              |
| ASM              | asm3                      | 8GB  | ORCL:VOL3   | +FLASH_RECOVERY_AREA  | Oracle Flash Recovery Area                                         |
| ASM              | asm4                      | 8GB  | ORCL:VOL4   | +FLASH_RECOVERY_AREA  | Oracle Flash Recovery Area                                         |
The picture below shows the fdisk dialog that creates a primary partition of the maximum available size. Red arrows mark your input, where you either type values or accept defaults.
# fdisk /dev/iscsi/asm1/part
Repeat the command sequence for volumes asm2 through asm4 and then for crs, which is shown below. Remember to always create a single primary partition, number 1, of maximum size.
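If you prefer not to walk through the interactive dialog five times, the answers can be fed to fdisk from a script. The sketch below is my own shortcut, not the article’s method; it assumes the /dev/iscsi symlinks from the udev setup exist, and it must be run as root on one node only, since the storage is shared.

```shell
#!/bin/sh
# The answers fdisk expects for our dialog: n (new partition),
# p (primary), 1 (partition number), two empty lines (accept the
# default first and last cylinder), w (write the table and exit).
fdisk_answers() {
    printf 'n\np\n1\n\n\nw\n'
}

for VOL in crs asm1 asm2 asm3 asm4; do
    dev=/dev/iscsi/${VOL}/part
    # Skip quietly on machines that do not have the iSCSI symlinks.
    [ -b "$dev" ] || continue
    fdisk_answers | fdisk "$dev"
done
```

Review the fdisk output carefully after each volume; the screenshot above shows what a successful run of the dialog looks like.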
Verify new partitions
Keep in mind that the mapping of iSCSI target names and local SCSI device names will be different on each of our RAC nodes (it may even change on each particular node after a reboot). This does not present a problem as we are using local device names presented to us by “udev”.
So, if you have not restarted your node after partitioning, run the following command as root to make the kernel re-read the partition tables (the “partprobe” utility from the parted package does exactly this):
# partprobe
Now we will query the partitions with the fdisk command:
# fdisk -l
Here are the results:
This is all for the volumes and partitions at this stage of our project. When we clone our Linux guest (a node) later, the clone will have all these settings and configurations already in place. Notice that partitioning is done only once, since the storage is shared between all nodes.