10.11.17

OracleVM 3.3.4 and Data Corruption When Cloning VM from Template

We ran into a serious problem when creating a virtual server from a template that resided on an NFS mount while the target repository was on iSCSI storage.

For some reason the OracleVM 3.3.4 kernel (3.8.13) started corrupting the image while cloning the virtual server from the template. After the clone operation everything looked fine from the OracleVM Manager point of view, but starting the server failed with an error stating that there is no bootable operating system.

During the cloning operations a huge number of the following errors appeared in /var/log/messages. The errors stayed the same even after I switched to a different utility server, so this is not a hardware issue:

Nov  9 18:35:17 vs9 kernel: sd 5:0:0:0: [sdd] CDB:
Nov  9 18:35:17 vs9 kernel: Write(10): 2a 00 26 bf f0 00 00 0a 00 00
Nov  9 18:35:17 vs9 kernel: sd 5:0:0:0: [sdd] Invalid command failure
Nov  9 18:35:17 vs9 kernel: sd 5:0:0:0: [sdd]
Nov  9 18:35:17 vs9 kernel: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Nov  9 18:35:17 vs9 kernel: sd 5:0:0:0: [sdd]
Nov  9 18:35:17 vs9 kernel: Sense Key : Illegal Request [current]
Nov  9 18:35:17 vs9 kernel: sd 5:0:0:0: [sdd]
Nov  9 18:35:17 vs9 kernel: Add. Sense: Invalid field in cdb
Nov  9 18:35:17 vs9 kernel: sd 5:0:0:0: [sdd] CDB:
Nov  9 18:35:17 vs9 kernel: Write(10): 2a 00 26 bf fa 00 00 0a 00 00
Nov  9 18:35:17 vs9 kernel: JBD2: Detected IO errors while flushing file data on dm-3-617

When searching for an explanation, it looks like this is an issue with iSCSI and jumbo frames on certain 3.9 kernel versions. It could be that the OVM 3.8 kernel is affected as well.

What makes this particularly nasty is that we’ve made several copies of virtual servers for backup purposes, and there is no guarantee that those copies are valid and functional any more.

Possible Solution

To test a fix I decided to upgrade the whole OracleVM environment to the latest OVM 3.4.4, which uses kernel 4.1.

After upgrading all the pools and OVM Manager to 3.4.4, it looks like we got rid of this nasty behaviour.

I repeated the cloning in exactly the same way, using the same servers: no errors, and the cloned virtual server works just fine.

Recommendation

I strongly recommend upgrading to OracleVM 3.4.x as soon as possible if you are using 3.3.x, iSCSI and jumbo frames AND you are seeing these errors.

Check your OracleVM servers: if you see any of these errors in /var/log/messages, you might have data corruption in the images.
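
A quick way to run that check is to count the matching lines in the log. The sketch below is only an illustration: the three patterns are copied from the log excerpt in this post, while the function name count_clone_errors and the LOG variable are my own, hypothetical names. A count of zero does not prove that the images are intact, it only means these messages were not logged in that file.

```shell
#!/bin/sh
# Patterns taken from the log excerpt in this post.
PATTERN='Invalid command failure|Invalid field in cdb|JBD2: Detected IO errors'

# Print the number of matching lines in the given log file.
# grep exits with status 1 when nothing matches; that is not an error here.
count_clone_errors() {
    grep -E -c "$PATTERN" "$1" || true
}

# Assumed default location; adjust if your syslog writes elsewhere.
LOG=${LOG:-/var/log/messages}
if [ -r "$LOG" ]; then
    echo "$LOG: $(count_clone_errors "$LOG") suspicious line(s)"
fi
```

Rotated logs (for example /var/log/messages-*) can be checked the same way by passing their names to count_clone_errors.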

6.7.17

RedHat 6.9 Update Breaks Oracle Reports

For those running Oracle Reports, be aware: installing the latest RedHat 6 or 7 updates will break Oracle Reports execution.

For example, upgrading to the latest RedHat 6.9 caused all reports to fail with signal 11 or signal 6.

This is a known issue (see Oracle technical note 2280616.1).

To fix it, edit the reports.sh file:

Reports 11.1.2.x:

File: INSTANCE_HOME/config/reports/bin/reports.sh

Add as the last line:
REPORTS_JVM_OPTIONS="-Xss2M"; export REPORTS_JVM_OPTIONS

Reports 12.2.1.x:

File: DOMAIN_HOME/reports/bin/reports.sh

Add as the last line:
REPORTS_JVM_OPTIONS="-Xss2M"; export REPORTS_JVM_OPTIONS

After modifying, restart the Reports server.
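
If you have several instances to patch, the edit can be scripted. The following is a minimal sketch, not an Oracle-provided tool: the function name add_reports_jvm_options is hypothetical, and it assumes you pass in the full path to reports.sh yourself. It keeps a .bak copy and does nothing if the line is already there.

```shell
#!/bin/sh
# The exact line from the instructions above.
LINE='REPORTS_JVM_OPTIONS="-Xss2M"; export REPORTS_JVM_OPTIONS'

# Append LINE to the given reports.sh unless it is already present.
add_reports_jvm_options() {
    file=$1
    if grep -qF "$LINE" "$file"; then
        echo "already set: $file"
    else
        # Keep a copy of the original next to the file before changing it.
        cp -p "$file" "$file.bak" &&
        printf '%s\n' "$LINE" >> "$file" &&
        echo "updated: $file"
    fi
}
```

For Reports 12.2.1.x the call would be add_reports_jvm_options "$DOMAIN_HOME/reports/bin/reports.sh", assuming DOMAIN_HOME is set in your shell.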

3.7.17

OVM Guest Linux LVM2 Disk Mounting

Sometimes you need to mount an OVM guest server's disk image directly on the OVM Server to modify settings offline, e.g. to fix something that prevents the server from booting or to change passwords.

Typically this is quite straightforward: set up a loop device and mount the wanted partition. If the target partition is an LVM2 partition, this becomes a bit more complex in an OracleVM environment.

The problem is that by default /etc/lvm/lvm.conf on the OracleVM server has filtering enabled, which prevents LVM2 from discovering volumes on loop devices.

This article describes the steps needed to get an LVM2 volume mounted and its data changed off-line.

If you are unsure what you are doing, please make a backup of the virtual disk and the lvm.conf you are going to change before you proceed with the following actions.

The instructions below assume that an experienced Linux/OVM administrator knows what she/he is doing. I’m not going into detail about what root does after issuing the chroot command. ANY CHANGES ARE AT YOUR OWN RISK. TAKE GOOD BACKUPS ANYWAY.
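
For the configuration file, a timestamped copy next to the original is usually enough. A minimal sketch, where backup_file is a hypothetical helper name of mine and not an existing tool:

```shell
#!/bin/sh
# Copy a file next to itself with a timestamp suffix and print the copy's name.
backup_file() {
    src=$1
    dst="$src.$(date +%Y%m%d-%H%M%S).bak"
    cp -p "$src" "$dst" && echo "$dst"
}
```

Calling backup_file /etc/lvm/lvm.conf before editing gives you something to roll back to. The same helper works for a virtual disk image, but only with the guest shut down and enough free space in the repository for a full copy.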

Preparation: modify /etc/lvm/lvm.conf

In order for the OVM server to be able to scan the loop devices, /etc/lvm/lvm.conf needs to be modified:

# 30.6.2017 Harri Kaukovuo, modify the preferred_names
#preferred_names = [ "^/dev/mpath/", "^/dev/mapper/mpath", "^/dev/[hs]d" ]
preferred_names = [ ]

# 30.6.2017 Harri Kaukovuo, uncomment
filter = [ "a|.*/|" ]

# 30.6.2017 Harri Kaukovuo, comment out this line
#global_filter = [ "r|.*/|" ]

Mount the Virtual LVM2 Disk

Find the next free loop device:

losetup -f

By default OVM Server has a maximum of 10 loop devices. If you run out of loop devices, you can work around it either by shutting down all the guest VM servers or by increasing the number of loop devices: add the following line to /etc/rc.local and reboot the OVM server:

# 28.6.2017 Harri Kaukovuo
/sbin/MAKEDEV -m 32 /dev/loop

Please note that the above step is only needed if you ran out of loop devices.

Once you have a free loop device (in my example /dev/loop9), point it at the virtual disk. In the example below I have retrieved the disk image file name from the OVM Manager console:

losetup  /dev/loop9 /OVS/Repositories/0004fb000003000088c2307002d1b442/VirtualDisks/0004fb0000120000ba5aa22cb02675b7.img

Read partition tables from the loop device and create device maps with kpartx:

kpartx -av /dev/loop9

The output is something like:

[root@myovm01 etc]# kpartx -av /dev/loop9
add map loop9p1 (249:1): 0 401562 linear /dev/loop9 63
add map loop9p2 (249:2): 0 142898175 linear /dev/loop9 401625
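
The numbers in each "add map" line are a device-mapper table: start, length, target type, backing device and offset, all counted in 512-byte sectors. To see at a glance which mapping is the small boot partition and which is the big LVM2 one, the lengths can be converted to MiB. This is just an illustrative sketch and kpartx_sizes is a hypothetical helper name:

```shell
#!/bin/sh
# Read "kpartx -av" output on stdin and print each mapping with its size in MiB.
# Lengths are in 512-byte sectors, so 2048 sectors make 1 MiB (rounded down).
kpartx_sizes() {
    awk '$1 == "add" && $2 == "map" { printf "%s %d MiB\n", $3, int($6 / 2048) }'
}

# Fed with the sample output from above:
kpartx_sizes <<'EOF'
add map loop9p1 (249:1): 0 401562 linear /dev/loop9 63
add map loop9p2 (249:2): 0 142898175 linear /dev/loop9 401625
EOF
```

For the sample this prints "loop9p1 196 MiB" and "loop9p2 69774 MiB", so loop9p2 (about 68 GiB) is the partition that holds the volume group.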


Perform a volume group scan by issuing the command:

vgscan

Output is like:

[root@myovm01 etc]# vgscan
   Reading all physical volumes.  This may take a while...
   Found volume group "VolGroup00" using metadata type lvm2

Activate the volume groups by issuing the command:

vgchange -ay

Output is like:

[root@myovm01 etc]# vgchange -ay
   4 logical volume(s) in volume group "VolGroup00" now active


Show logical volumes:

[root@myovm01 etc]# lvscan
   ACTIVE            '/dev/VolGroup00/LogVol03' [42.03 GiB] inherit
   ACTIVE            '/dev/VolGroup00/LogVol02' [6.06 GiB] inherit
   ACTIVE            '/dev/VolGroup00/LogVol00' [4.00 GiB] inherit
   ACTIVE            '/dev/VolGroup00/LogVol01' [16.00 GiB] inherit


Mount the wanted logical volume. In this example I already know that LogVol03 is the root partition and that I want to change something there.


[root@myovm01 etc]# mount /dev/VolGroup00/LogVol03 /mnt/virtualdisk
[root@myovm01 etc]# df -H
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              53G  2.8G   48G   6% /
tmpfs                 3.2G     0  3.2G   0% /dev/shm
/dev/sda1             500M  145M  325M  31% /boot
none                  3.2G  213k  3.2G   1% /var/lib/xenstored
/dev/mapper/361866da0905943002027220113a0a9c8
                       8.4T  3.0T  5.5T  36% /OVS/Repositories/0004fb000003000088c2307002d1b442
/dev/mapper/VolGroup00-LogVol03
                        45G   33G  9.4G  78% /mnt/virtualdisk


Use chroot to change the root directory, if you need to change something there:

chroot /mnt/virtualdisk

(do your stuff here)

Get out of chroot by issuing “exit”

Unmount the disk after use:

umount /mnt/virtualdisk

Deactivate the volume group:

[root@myovm01 ~]# vgchange --activate n VolGroup00
   0 logical volume(s) in volume group "VolGroup00" now active

Delete the partition device mappings:

kpartx -dv /dev/loop9

Output is something like:

[root@myovm01 etc]# kpartx -dv /dev/loop9
del devmap : loop9p2
del devmap : loop9p1


Delete the loop device mapping:

losetup -d /dev/loop9

Now no LVM2 mappings should be found any more, and the loop device should be free:

[root@myovm01 ~]# pvscan
   No matching physical volumes found
[root@myovm01 ~]# vgscan
   Reading all physical volumes.  This may take a while...
[root@myovm01 ~]# losetup -f
/dev/loop9

After unmounting the disk, start up the guest Linux and enjoy the changes.
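
The attach and cleanup steps of this article can be collected into two small shell functions so that nothing is forgotten, especially on the way out. This is only a sketch under my own assumptions: run, attach_guest_disk and detach_guest_disk are hypothetical names, the commands and their order are the ones used above, and the DRY_RUN switch is my addition so you can print the commands before letting them touch anything.

```shell
#!/bin/sh
# With DRY_RUN=1 each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# attach_guest_disk LOOPDEV IMAGE LOGICAL_VOLUME MOUNTPOINT
# Note: vgchange -ay activates every volume group it finds, as in the article.
attach_guest_disk() {
    run losetup "$1" "$2" &&
    run kpartx -av "$1" &&
    run vgscan &&
    run vgchange -ay &&
    run mount "$3" "$4"
}

# detach_guest_disk MOUNTPOINT VOLUME_GROUP LOOPDEV
# Stops at the first failing step so a busy mount is not torn down underneath.
detach_guest_disk() {
    run umount "$1" &&
    run vgchange --activate n "$2" &&
    run kpartx -dv "$3" &&
    run losetup -d "$3"
}
```

With DRY_RUN=1 set, detach_guest_disk /mnt/virtualdisk VolGroup00 /dev/loop9 prints the four cleanup commands in order without running them. Without DRY_RUN the functions need root, just like the manual steps.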

13.6.17

OracleVM Server Update Disabled

One of the tasks when setting up the OracleVM platform is to configure the YUM repository for updating the OracleVM servers from the OracleVM Manager console.

If your OVM server sits behind a firewall and cannot connect to the Oracle public YUM server without a proxy, your OracleVM server context menu might look like this:

image

The Update menu option is disabled even after you’ve set up the YUM server and enabled it.

The problem might be that your OVM Server needs to connect to the public YUM server through the company proxy server.

To fix this:

1. Log in to the OracleVM server as root

2. Edit /etc/yum.conf and configure your company HTTP proxy by adding the following lines:

enableProxy=1
httpProxy=http://proxy.acme.com:8080
proxy=http://proxy.acme.com:8080

3. Disable and then re-enable the YUM repository setup in OracleVM Manager.

image


After this you should have the “Update” menu item enabled:

image
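
If the menu item stays disabled, it is worth confirming that the proxy line from step 2 really ended up in the file. A small sketch, where yum_proxy is a hypothetical helper name; it only reads the standard proxy= option and ignores the other two lines:

```shell
#!/bin/sh
# Print the proxy= value from a yum configuration file.
# Returns a non-zero status when no proxy line is present.
yum_proxy() {
    sed -n 's/^proxy[[:space:]]*=[[:space:]]*//p' "$1" | grep .
}
```

On the OVM server, yum_proxy /etc/yum.conf should print the proxy URL you configured, for example http://proxy.acme.com:8080.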