Friday, December 14, 2012

Migrating from Solaris/Sparc to Solaris/x86 or RHEL? Use ZFS!

Problem
At a large company, the DBA teams migrating Oracle databases from Solaris/Sparc to RHEL5/x86 had a major pain point:
- Dumping/exporting the database was easy, converting its endianness was simple, and re-importing it on the Linux side was easy too.
- HOWEVER, transferring the dump from one system to another was 1) slow and 2) unpredictable (large VLANs make for increased latency and slower transfers).

If your DB was pretty small, you would just wait a few hours, but with a database of a few TB, it just wasn't practical.

Solution: Use a shared and compressed ZFS pool on a SAN box and enjoy transfer-less database dumps.

Solaris/Sparc and Solaris/x64 speak ZFS natively but Linux doesn't, unless you use zfs-fuse (http://zfs-fuse.net). zfs-fuse installs without a reboot and makes your zpool available right away.



Here's the howto:
1) Install zfs-fuse from http://vscojot.free.fr/dist/zfs-fuse on your Linux box (here, an HP DL580G7):

vcojot@rhel5x64$ sudo yum install -y xz-libs
vcojot@rhel5x64$ sudo rpm -ivh  liblzo2_2-2.03-6.el5.x86_64.rpm  zfs-fuse-0.7.0p1-11.el5.x86_64.rpm


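2) Create your zpool on the Sparc (version 26 is the latest supported by zfs-fuse). The example command uses the name vsc_pool; the listings that follow show a pool named adbtmpdbdump.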

vcojot@solsparc$ sudo zpool create -o version=26 vsc_pool \
c4t60000970000292602571533030333032d0 c4t60000970000292602571533030333030d0 \
[....]

vcojot@solsparc$ uname -a
SunOS solsparc 5.10 Generic_147440-07 sun4u sparc SUNW,SPARC-Enterprise
vcojot@solsparc$ sudo zpool list adbtmpdbdump
NAME           SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
adbtmpdbdump   540G   190K   540G     0%  ONLINE  -


vcojot@solsparc$ sudo zpool status adbtmpdbdump
  pool: adbtmpdbdump
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scan: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        adbtmpdbdump                             ONLINE       0     0     0
          c4t60060160A9312C000A487C3E6132E211d0  ONLINE       0     0     0
          c4t60060160A9312C008A157877D6EAE111d0  ONLINE       0     0     0
          c4t60060160A9312C00969C5951D6EAE111d0  ONLINE       0     0     0
          c4t60060160A9312C009C9C5951D6EAE111d0  ONLINE       0     0     0
          c4t60060160A9312C00A09C5951D6EAE111d0  ONLINE       0     0     0
          c4t60060160A9312C000C487C3E6132E211d0  ONLINE       0     0     0
          c4t60060160A9312C009A9C5951D6EAE111d0  ONLINE       0     0     0
          c4t60060160A9312C004EE187C5711AE111d0  ONLINE       0     0     0
          c4t60060160A9312C0008487C3E6132E211d0  ONLINE       0     0     0
          c4t60060160A9312C0012487C3E6132E211d0  ONLINE       0     0     0
          c4t60060160A9312C000E487C3E6132E211d0  ONLINE       0     0     0
          c4t60060160A9312C00989C5951D6EAE111d0  ONLINE       0     0     0
          c4t60060160A9312C00A29C5951D6EAE111d0  ONLINE       0     0     0
          c4t60060160A9312C009E9C5951D6EAE111d0  ONLINE       0     0     0
          c4t60060160A9312C00A49C5951D6EAE111d0  ONLINE       0     0     0
          c4t60060160A9312C0010487C3E6132E211d0  ONLINE       0     0     0

errors: No known data errors
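
Since everything depends on the pool staying at a version zfs-fuse can read, it may be worth checking that before going further. The helper below is only a sketch: pool_version_ok is a made-up name, and it assumes the usual four-column output of 'zpool get version' (NAME, PROPERTY, VALUE, SOURCE).

```shell
#!/bin/sh
# Sketch: read `zpool get version <pool>` output on stdin and succeed only
# if the pool version is a number no higher than the given maximum.
# pool_version_ok is a hypothetical helper name, not part of ZFS.
pool_version_ok() {
    awk -v max="$1" '
        $2 == "version" && $3 ~ /^[0-9]+$/ { found = 1; ok = ($3 + 0 <= max + 0) }
        END { exit !(found && ok) }
    '
}
```

On the Sparc box it could be used like this: sudo zpool get version adbtmpdbdump | pool_version_ok 26 && echo "readable by zfs-fuse"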


3) Re-discover your SAN configuration on your other hosts and check that the pool can be imported.
Here's the pool as seen from the solx64 machine:

vcojot@solx64$ sudo zpool import
  pool: adbtmpdbdump
    id: 16626771482833154241
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        adbtmpdbdump                             ONLINE
          c4t60060160A9312C000A487C3E6132E211d0  ONLINE
          c4t60060160A9312C008A157877D6EAE111d0  ONLINE
          c4t60060160A9312C00969C5951D6EAE111d0  ONLINE
          c4t60060160A9312C009C9C5951D6EAE111d0  ONLINE
          c4t60060160A9312C00A09C5951D6EAE111d0  ONLINE
          c4t60060160A9312C000C487C3E6132E211d0  ONLINE
          c4t60060160A9312C009A9C5951D6EAE111d0  ONLINE
          c4t60060160A9312C004EE187C5711AE111d0  ONLINE
          c4t60060160A9312C0008487C3E6132E211d0  ONLINE
          c4t60060160A9312C0012487C3E6132E211d0  ONLINE
          c4t60060160A9312C000E487C3E6132E211d0  ONLINE
          c4t60060160A9312C00989C5951D6EAE111d0  ONLINE
          c4t60060160A9312C00A29C5951D6EAE111d0  ONLINE
          c4t60060160A9312C009E9C5951D6EAE111d0  ONLINE
          c4t60060160A9312C00A49C5951D6EAE111d0  ONLINE
          c4t60060160A9312C0010487C3E6132E211d0  ONLINE


Here's the pool as seen from the RHEL5 x64 machine:

vcojot@rhel5x64$ sudo zpool import
  pool: adbtmpdbdump
    id: 16626771482833154241
 state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        adbtmpdbdump                                                                    ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016846e0311e:0x000f000000000000-part1  ONLINE
          disk/by-path/pci-0000:0e:00.0-fc-0x5006016946e0311e:0x001b000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016846e0311e:0x0013000000000000-part1  ONLINE
          disk/by-path/pci-0000:0e:00.0-fc-0x5006016146e0311e:0x0016000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016846e0311e:0x0018000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016846e0311e:0x0010000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016046e0311e:0x0015000000000000-part1  ONLINE
          disk/by-path/pci-0000:0e:00.0-fc-0x5006016146e0311e:0x000d000000000000-part1  ONLINE
          disk/by-path/pci-0000:0e:00.0-fc-0x5006016146e0311e:0x000e000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016846e0311e:0x001c000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016046e0311e:0x0011000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016046e0311e:0x0014000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016846e0311e:0x0019000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016046e0311e:0x0017000000000000-part1  ONLINE
          disk/by-path/pci-0000:0e:00.0-fc-0x5006016146e0311e:0x001a000000000000-part1  ONLINE
          disk/by-path/pci-0000:81:00.0-fc-0x5006016846e0311e:0x0012000000000000-part1  ONLINE


4) You're almost done! Dump your database to your zpool.

5) Export the zpool on solsparc and re-import it on solx64 or rhel5x64 (only takes a few minutes).

6) Re-import your database
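
Steps 4 and 5 boil down to three ZFS commands, sketched below. The pool name is the one from the listings above; the function names and the DRY_RUN switch are made up for this example, and with DRY_RUN=1 (the default here) the commands are only printed. Keep in mind that ZFS is not a cluster filesystem: the pool must be exported on one host before it is imported on another, never imported on two hosts at once.

```shell
#!/bin/sh
# Sketch of steps 4 and 5. POOL comes from the listings above; DRY_RUN is a
# hypothetical switch: 1 (the default) only prints each command.
POOL=${POOL:-adbtmpdbdump}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Source host, before dumping: compress everything written from now on.
enable_compression() { run sudo zfs set compression=on "$POOL"; }

# Source host, once the dump is complete: release the pool cleanly.
release_pool() { run sudo zpool export "$POOL"; }

# Destination host, after the export has finished: take the pool over.
take_pool() { run sudo zpool import "$POOL"; }
```

On solsparc that would be enable_compression, then the database dump, then release_pool; on solx64 or rhel5x64, take_pool. After a clean export the import should not need the '-f' flag mentioned in the listings above.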

Also, database files usually compress pretty well, so here's a 3.5TB database on another shared zpool:

vcojot@solx64$ sudo zpool list ADBZFSDUMP
NAME         SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
ADBZFSDUMP  1.99T  1.55T   449G    77%  ONLINE  -


vcojot@solx64$ sudo zfs get compressratio ADBZFSDUMP
NAME        PROPERTY       VALUE  SOURCE
ADBZFSDUMP  compressratio  2.07x  -
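
As a rough cross-check of those two listings, multiplying the allocated space by the compression ratio estimates how much uncompressed data the pool is holding. This is only an approximation: it assumes the ratio applies to nearly everything allocated in the pool.

```shell
#!/bin/sh
# Rough estimate: allocated space x compression ratio = uncompressed data.
# Both figures are copied from the listings above.
alloc_tb=1.55
ratio=2.07
logical_tb=$(LC_ALL=C awk -v a="$alloc_tb" -v r="$ratio" 'BEGIN { printf "%.2f", a * r }')
echo "about ${logical_tb}T of uncompressed data"
```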


Even though zfs-fuse is userspace-only, we're seeing decent performance using it. Here, on a Xeon system, it's reading compressed data into memory at roughly 100 to 140 MB/s:


vcojot@rhel5x64$ sudo zpool iostat 1
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
adbtmpzfs    15.2G   223G     11     67   635K   994K
adbtmpzfs    15.2G   223G      0      0      0      0
adbtmpzfs    15.2G   223G      0      0      0      0
adbtmpzfs    15.2G   223G      0      0      0      0
adbtmpzfs    15.2G   223G      0      0      0      0
adbtmpzfs    15.2G   223G    849      0   106M      0
adbtmpzfs    15.2G   223G    878      0   109M      0
adbtmpzfs    15.2G   223G    990      0   123M      0
adbtmpzfs    15.2G   223G    911      0   114M      0
adbtmpzfs    15.2G   223G    898      0   112M      0
adbtmpzfs    15.2G   223G    967      0   120M      0
adbtmpzfs    15.2G   223G    929      0   116M      0
adbtmpzfs    15.2G   223G  1.03K      0   132M      0
adbtmpzfs    15.2G   223G  1.10K      0   141M      0
adbtmpzfs    15.2G   223G   1014      0   126M      0
adbtmpzfs    15.2G   223G    849      0   106M      0
adbtmpzfs    15.2G   223G    821      0   102M      0
adbtmpzfs    15.2G   223G    914      0   114M      0
adbtmpzfs    15.2G   223G    967      0   120M      0
adbtmpzfs    15.2G   223G  1.03K      0   131M      0
adbtmpzfs    15.2G   223G    862      0   107M      0
[...]



The only requirement is that all of your hosts must share a common SAN fabric (which usually means they sit in the same metro area).

