SHIFT

--- Sjoerd Hooft's InFormation Technology ---

WIKI Disclaimer: As with most other things on the Internet, the content on this wiki is not supported. It was contributed by me and is published “as is”. It has worked for me, and might work for you.
Also note that any view or statement expressed anywhere on this site are strictly mine and not the opinions or views of my employer.



NetApp SnapMirror: Using Data

In this article I'll describe a couple of scenarios for using snapmirrored data.
First Scenario: ESX ISO Store
In this scenario we have an ISO store at our easily accessible acceptance location. This ISO store is used by ESX to mount ISOs when installing VMs. However, we need these ISOs in our production environment as well. Copying the ISOs manually is quite a hassle, so we want this to happen almost automatically.
Second Scenario: Disaster Recovery
In this scenario we'll pretend that our production site is unavailable and we'll have to fail over using the snapmirrored data. When the disaster is resolved we'll also do a switch-back.
In both scenarios the SnapMirror relationship has already been set up as described here.

ESX ISO Store

Setting up the ESX ISO store scenario basically comes down to these steps:

  1. Break the mirror
  2. Mount the datastore
  3. Perform a resync if new ISOs are added

> Note: Mounting the datastores without breaking the mirror is unfortunately not possible. ESX requires that the LUNs are writable, which is not possible if the mirror is still operational.
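The three steps above can be sketched as a small command plan. This is a hedged sketch: the `on_filer` helper is mine (it only echoes what would be run on the filer console), and the vSphere rescan step remains manual:

```shell
#!/bin/sh
# Sketch of the ISO store refresh cycle. The on_filer helper is hypothetical:
# it just prints the command, standing in for the filer console.
on_filer() { echo "dst-prd-filer1> $*"; }

on_filer snapmirror break ESX_ISOSTORE      # 1. make the destination writable
# 2. rescan the HBAs on every ESX host and mount the datastore (manual step)
on_filer snapmirror resync ESX_ISOSTORE     # 3. re-establish the mirror
on_filer snapmirror status ESX_ISOSTORE     # verify the state afterwards
```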

Break the Mirror

Breaking the mirror will automatically make the target writable. This is done from the destination filer.

dst-prd-filer1> snapmirror break ESX_ISOSTORE
snapmirror break: Destination ESX_ISOSTORE is now writable.
Volume size is being retained for potential snapmirror resync.  If you would like to grow the volume and do not expect to resync, set vol option fs_size_fixed to off.
dst-prd-filer1> snapmirror status ESX_ISOSTORE
Snapmirror is on.
Source                          Destination                  State          Lag        Status
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Broken-off     00:50:12   Idle

Mount the Datastore

First Host

The first time you do this, mounting the datastore differs between the first host and the other hosts: the first host can be handled through vCenter, but additional hosts cannot.

Log on to vCenter, select the first host and go to Configuration → Storage. Click on “Rescan All” and select both options to scan all the HBAs. After refreshing the device is present but the datastore is not. To make the datastore visible perform these steps:

  • Add Storage
  • Select “Disk/LUN”
  • The device on which the VMFS datastore is should be visible, select it
  • Select “Keep the existing signature”
  • The VMFS partition is shown, click next and finish.

The datastore should now be accessible by the host.

Additional Hosts

If you perform the same steps as above for additional hosts, you'll get this error:

Call "HostStorageSystem.ResolveMultipleUnresolvedVmfsVolumes" for object "storageSystem-446" on vCenter Server "vCenter.company.local" failed.

The solution, according to the VMware KB, is to point the vSphere Client directly at the host instead of at vCenter and then perform the same steps as above.

Resync From Original Source

dst-prd-filer1> snapmirror status ESX_ISOSTORE
Snapmirror is on.
Source                          Destination                  State          Lag        Status
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Broken-off     00:50:12   Idle
dst-prd-filer1> snapmirror resync ESX_ISOSTORE
The resync base snapshot will be: dst-prd-filer1(0151762648)_ESX_ISOSTORE.2
These older snapshots have already been deleted from the source and will be deleted from the destination:
    dst-prd-filer1(0151762648)_ESX_ISOSTORE.1
Are you sure you want to resync the volume? y
Mon May 23 16:45:39 CEST [dst-prd-filer1: snapmirror.dst.resync.info:notice]: SnapMirror resync of ESX_ISOSTORE to src-acc-filer1:SRC_ACC_ESX_ISO is using dst-prd-filer1(0151762648)_ESX_ISOSTORE.2 as the base snapshot.
Volume ESX_ISOSTORE will be briefly unavailable before coming back online.
Mon May 23 16:45:40 CEST [dst-prd-filer1: wafl.snaprestore.revert:notice]: Reverting volume ESX_ISOSTORE to a previous snapshot.
exportfs [Line 2]: NFS not licensed; local volume /vol/ESX_ISOSTORE not exported
Revert to resync base snapshot was successful.
Mon May 23 16:45:40 CEST [dst-prd-filer1: replication.dst.resync.success:notice]: SnapMirror resync of ESX_ISOSTORE to src-acc-filer1:SRC_ACC_ESX_ISO was successful.
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.
dst-prd-filer1> snapmirror status ESX_ISOSTORE
Snapmirror is on.
Source                          Destination                  State          Lag        Status
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Snapmirrored   00:50:47   Transferring  (7592 KB done)
dst-prd-filer1>
Note: The volume is set to “snapmirrored,read-only” automatically
Note2: The datastore is gone from the VMware hosts

Break again

Because the datastore is already known to the hosts, all you need to do to regain access to it is a rescan on every host. The datastore will then become available automatically.
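If you script this refresh cycle, you'll want to check the mirror state before and after. A minimal parsing sketch; the sample output is taken from this article, and the awk field index assumes the 7-Mode status layout shown here:

```shell
#!/bin/sh
# Sample 'snapmirror status' output from this article; in a real script you
# would capture this from the filer instead.
status_output='Source                          Destination                  State          Lag        Status
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Snapmirrored   00:50:47   Idle'

# Extract the State column (third field) for the ESX_ISOSTORE relationship.
state=$(printf '%s\n' "$status_output" | awk '/ESX_ISOSTORE/ { print $3 }')
echo "ESX_ISOSTORE state: $state"
```

A script could refuse to continue unless the state is Snapmirrored (or Broken-off, depending on the step).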

Disaster Recovery

Using the snapmirrored data for disaster recovery is probably the most basic reason you bought SnapMirror. However, because real disasters rarely happen, we'll do a disaster test. The test will be done in two different ways. The first assumes you have FlexClone licensed. This is the preferred scenario since it won't break your mirror, which means that if a real disaster strikes during testing you'll still have the latest version of your data untouched and available. The second way is without FlexClone. This means breaking the mirror, snapshotting the volume and cloning the LUNs. Afterwards we'll have to restore the SnapMirror relationship.

Disaster Recovery Test Using FlexClone

Create Snapshot On Source

Because the target volume is read-only, we'll have to create the snapshot we need for cloning on the source filer. Once that is done, we replicate the snapshot to the target filer.

Creating Snapshot

On the source filer:

storage01> snap list snapmirrorsource
Volume snapmirrorsource
working....

  %/used       %/total  date          name
----------  ----------  ------------  --------
  2% ( 2%)    0% ( 0%)  May 26 10:15  storage02(0099904947)_snapmirrortarget.2 (snapmirror)

storage01> snap create snapmirrorsource volclonetest

storage01> snap list snapmirrorsource
Volume snapmirrorsource
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
  2% ( 2%)    0% ( 0%)  May 26 10:16  volclonetest
  3% ( 2%)    0% ( 0%)  May 26 10:15  storage02(0099904947)_snapmirrortarget.2 (snapmirror)
Note that it's (probably) possible to use the SnapMirror snapshot. I just prefer simply named snapshots.

Update Snapmirror

On the target filer:

storage02> snapmirror update snapmirrortarget
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.
storage02> snapmirror status
Snapmirror is on.
Source                      Destination                 State          Lag        Status
storage01:snapmirrorsource  storage02:snapmirrortarget  Snapmirrored   00:00:10   Idle
storage02> snap list snapmirrortarget
Volume snapmirrortarget
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  May 26 10:19  storage02(0099904947)_snapmirrortarget.3
  2% ( 2%)    0% ( 0%)  May 26 10:16  volclonetest
  4% ( 2%)    0% ( 0%)  May 26 10:15  storage02(0099904947)_snapmirrortarget.2

The created snapshot volclonetest is available on the target filer now as well.

Clone Volume

On the target filer we'll have to license flex_clone before we can clone the volume:

storage02> license add XXXXXXX
A flex_clone site license has been installed.
        FlexClone enabled.

storage02> vol clone create snapmirrortargetclone -s volume -b snapmirrortarget volclonetest
Creation of clone volume 'snapmirrortargetclone' has completed.
  • snapmirrortargetclone: is the name of the new volume
  • -s volume: is the space reservation
  • -b snapmirrortarget: is the parent volume
  • volclonetest: the snapshot the clone is using

Check the Volume

You can see the available volumes on the CLI of the filer, note that you can't tell it's a flexclone:

storage02> vol status
         Volume       State           Status            Options
snapmirrortarget      online          raid_dp, flex     nosnap=on, snapmirrored=on,
                                      snapmirrored      create_ucode=on,
                                      read-only         convert_ucode=on,
                                                        fs_size_fixed=on
snapmirrortargetclone online          raid_dp, flex     nosnap=on, create_ucode=on,
                                                        convert_ucode=on

You can see it's a FlexClone in FilerView: netapp-flexclone.jpg

Check the Snapshot

storage02> snap list
Volume snapmirrortarget
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  May 26 10:19  storage02(0099904947)_snapmirrortarget.3
  2% ( 2%)    0% ( 0%)  May 26 10:16  volclonetest   (busy,snapmirror,vclone)
  4% ( 2%)    0% ( 0%)  May 26 10:15  storage02(0099904947)_snapmirrortarget.2

Volume snapmirrortargetclone
working....

  %/used       %/total  date          name
----------  ----------  ------------  --------
  4% ( 4%)    0% ( 0%)  May 26 10:16  volclonetest

Using the Data

Now you can use the new volume in any way you like. Write requests go to the new volume; read requests are first served from the new volume and, if necessary, forwarded to the parent volume. Since the new volume is based on a snapshot, the original source and target are not altered in any way.

What Else is Possible

In this part we'll go beyond the original case of just using the data and run a what-if scenario. We'll try to remove the source snapshot, see what happens and… how to fix it.

Remove the Snapshot

On the source filer:

storage01> snap list snapmirrorsource
Volume snapmirrorsource
working....

  %/used       %/total  date          name
----------  ----------  ------------  --------
  4% ( 4%)    0% ( 0%)  May 26 10:19  storage02(0099904947)_snapmirrortarget.3 (snapmirror)
  6% ( 2%)    0% ( 0%)  May 26 10:16  volclonetest

storage01> snap delete snapmirrorsource volclonetest
storage01> snap list snapmirrorsource
working....

  %/used       %/total  date          name
----------  ----------  ------------  --------
  4% ( 4%)    0% ( 0%)  May 26 10:19  storage02(0099904947)_snapmirrortarget.3 (snapmirror)

So it's possible to remove the snapshot that the cloned volume on the target filer is based on… What will happen when we try to update the SnapMirror?

Update SnapMirror

storage02> snapmirror update snapmirrortarget
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.

storage02> snapmirror status
Snapmirror is on.
Source                      Destination                 State          Lag        Status
storage01:snapmirrorsource  storage02:snapmirrortarget  Snapmirrored   01:14:25   Pending

As you can see, the status is Pending.
FilerView reports: replication transfer failed to complete
The syslog in /etc/messages shows:

Thu May 26 11:35:05 CEST [snapmirror.dst.snapDelErr:error]: Snapshot volclonetest in destination volume snapmirrortarget is in use, cannot delete.
Thu May 26 11:35:05 CEST [replication.dst.err:error]: SnapMirror: destination transfer from storage01:snapmirrorsource to snapmirrortarget : replication transfer failed to complete.

So the system is trying to remove the snapshot but can't, which means the mirror is no longer working.

Split Cloned Volume

The solution to this problem, assuming we want to keep the cloned volume, is to split the cloned volume off from its parent, which also detaches the snapshot from the parent. This effectively means that all the data currently in the original target volume will be copied to the cloned volume.

Note that changes you made to the data in the clone are persistent: files you deleted from the clone remain deleted.
storage02> vol clone split start snapmirrortargetclone
Clone volume 'snapmirrortargetclone' will be split from its parent.
Monitor system log or use 'vol clone split status' for progress.

storage02> snapmirror status
Snapmirror is on.
Source                      Destination                 State          Lag        Status
storage01:snapmirrorsource  storage02:snapmirrortarget  Snapmirrored   00:00:10   Idle

As you can see, after splitting the volume the SnapMirror is OK again. Also, the volume snapmirrortargetclone is now a volume on its own, and no longer shows as a FlexClone in FilerView.

Disaster Recovery Test Without FlexClone

Testing the data without FlexClone means breaking the mirror and simply using the data. But because we'll want to restore the mirror quickly afterwards, we'll use LUN cloning to make sure the mirror can be restored. We'll do so with these steps:

  1. Break the mirror
  2. Create a Snapshot
  3. Clone the LUN with the created snapshot
  4. Resync the data and restore the SnapMirror relationship

Break the SnapMirror Relationship

src-acc-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                  State          Lag        Status
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01        Snapmirrored   00:00:53   Idle
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02        Snapmirrored   00:00:53   Idle
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Source         88:27:13   Idle

src-acc-filer1> snapmirror break src-acc-filer1:AIX_01
snapmirror break: Destination AIX_01 is now writable.
Volume size is being retained for potential snapmirror resync.  If you would like to grow the volume and do not expect to resync, set vol option fs_size_fixed to off.

src-acc-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                  State          Lag        Status
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01        Broken-off     00:01:04   Idle
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02        Snapmirrored   00:01:04   Idle
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Source         88:27:24   Idle

Create a Snapshot

src-acc-filer1> snap create AIX_01 clonesnapshot
src-acc-filer1> snap list AIX_01
Volume AIX_01
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  May 27 09:13  clonesnapshot
  0% ( 0%)    0% ( 0%)  May 27 09:12  src-acc-filer1(0151762815)_AIX_01.9687
  0% ( 0%)    0% ( 0%)  May 27 09:06  src-acc-filer1(0151762815)_AIX_01.9686
  0% ( 0%)    0% ( 0%)  May 27 06:00  hourly.0
 10% (10%)    3% ( 2%)  May 27 00:00  nightly.0
 10% ( 1%)    3% ( 0%)  May 26 18:00  hourly.1
 19% (10%)    5% ( 3%)  May 26 00:00  nightly.1
 26% (11%)    8% ( 3%)  May 24 14:08  sjoerd_snapshot
 34% (15%)   12% ( 4%)  May 23 00:00  weekly.0

Check Free Space

The first time I ever tried this I made a mistake: I didn't check my free space requirements AND created the LUN clones with a space reservation. That resulted in losing all snapshots, because the volume was still at a fixed size (so it couldn't grow) and the only way left to clear space was deleting snapshots…
So after breaking a mirror you can turn the fixed size off like this:

vol options AIX_01 fs_size_fixed off

Create Lun clone

Before cloning check the current LUNs for pathnames:

        /vol/AIX_01/boot              30g (32212254720)   (r/w, online)
        /vol/AIX_01/optamb            10g (10737418240)   (r/w, online)
        /vol/AIX_01/optoracle         10g (10737418240)   (r/w, online)
        /vol/AIX_01/varbackup      120.0g (128861601792)  (r/w, online)
        /vol/AIX_01/vardata          100g (107374182400)  (r/w, online)
        /vol/AIX_01/vardump           10g (10737418240)   (r/w, online)
        /vol/AIX_01/varlog          40.0g (42953867264)   (r/w, online)
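Since space-reserved clones claim as much space as their source LUNs, a quick sanity check is to total the sizes of the LUNs you intend to clone before creating them. A small sketch, using the vardata and varbackup lines from the listing above:

```shell
#!/bin/sh
# LUN lines for the LUNs we plan to clone, copied from the listing above.
lun_list='        /vol/AIX_01/varbackup      120.0g (128861601792)  (r/w, online)
        /vol/AIX_01/vardata          100g (107374182400)  (r/w, online)'

# Sum the byte counts between the first pair of parentheses on each line.
needed=$(printf '%s\n' "$lun_list" | awk -F'[()]' '{ sum += $2 } END { printf "%.0f\n", sum }')
echo "Bytes needed for space-reserved clones: $needed"
```

Compare the total against the free space in the volume (with `df` on the filer) before creating space-reserved clones.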

Then create the LUN clones with space reservation:

src-acc-filer1> lun clone create /vol/AIX_01/vardataclone -b /vol/AIX_01/vardata clonesnapshot
src-acc-filer1> lun clone create /vol/AIX_01/varbackupclone -b /vol/AIX_01/varbackup clonesnapshot

Or without space reservation:

src-acc-filer1> lun clone create /vol/AIX_01/vardataclone -o noreserve -b /vol/AIX_01/vardata clonesnapshot
src-acc-filer1> lun clone create /vol/AIX_01/varbackupclone -o noreserve -b /vol/AIX_01/varbackup clonesnapshot

You now have two extra LUNs in the volume:

        /vol/AIX_01/boot              30g (32212254720)   (r/w, online)
        /vol/AIX_01/optamb            10g (10737418240)   (r/w, online)
        /vol/AIX_01/optoracle         10g (10737418240)   (r/w, online)
        /vol/AIX_01/varbackup      120.0g (128861601792)  (r/w, online)
        /vol/AIX_01/varbackupclone  120.0g (128861601792)  (r/w, online)
        /vol/AIX_01/vardata          100g (107374182400)  (r/w, online)
        /vol/AIX_01/vardataclone     100g (107374182400)  (r/w, online)
        /vol/AIX_01/vardump           10g (10737418240)   (r/w, online)
        /vol/AIX_01/varlog          40.0g (42953867264)   (r/w, online)

And you can check the parent snapshot information by requesting verbose information about the LUNs:

src-acc-filer1> lun show -v /vol/AIX_01/vardataclone
        /vol/AIX_01/vardataclone     100g (107374182400)  (r/w, online)
                Serial#: W-/QGJd3msKE
                Backed by: /vol/AIX_01/.snapshot/clonesnapshot/vardata
                Share: none
                Space Reservation: enabled
                Multiprotocol Type: aix
src-acc-filer1> lun show -v /vol/AIX_01/varbackupclone
        /vol/AIX_01/varbackupclone  120.0g (128861601792)  (r/w, online)
                Serial#: W-/QGJd3nOO5
                Backed by: /vol/AIX_01/.snapshot/clonesnapshot/varbackup
                Share: none
                Space Reservation: enabled
                Multiprotocol Type: aix

Use the LUNs

Use them as you always would: just map them to the correct initiator group on a free LUN ID (if no LUN ID is given, the lowest available will be assigned):

lun map /vol/AIX_01/varbackupclone SRC-AIX-01 20
lun map /vol/AIX_01/vardataclone SRC-AIX-01 21

After mapping you can discover them from the commandline as root (this is on AIX with the host utilities installed):

root@src-aix-01:/home/root>sanlun lun show
  controller:          lun-pathname        device filename  adapter  protocol          lun size         lun state
src-acc-filer1:  /vol/SRC_AIX_01/boot       hdisk1           fcs0     FCP           30g (32212254720)    GOOD
src-acc-filer1:  /vol/SRC_AIX_01/optamb     hdisk2           fcs0     FCP           10g (10737418240)    GOOD
src-acc-filer1:  /vol/SRC_AIX_01/optoracle  hdisk3           fcs0     FCP           10g (10737418240)    GOOD
src-acc-filer1:  /vol/SRC_AIX_01/varbackup  hdisk4           fcs0     FCP        120.0g (128861601792)   GOOD
src-acc-filer1:  /vol/SRC_AIX_01/vardata    hdisk5           fcs0     FCP          100g (107374182400)   GOOD
src-acc-filer1:  /vol/SRC_AIX_01/vardump    hdisk6           fcs0     FCP           10g (10737418240)    GOOD
src-acc-filer1:  /vol/SRC_AIX_01/varlog     hdisk7           fcs0     FCP         40.0g (42953867264)    GOOD
root@src-aix-01:/home/root>cfgmgr
root@src-aix-01:/home/root>sanlun lun show
  controller:          lun-pathname         device filename  adapter  protocol          lun size         lun state
src-acc-filer1:  /vol/SRC_AIX_01/boot        hdisk1           fcs0     FCP           30g (32212254720)    GOOD
src-acc-filer1:  /vol/SRC_AIX_01/optamb      hdisk2           fcs0     FCP           10g (10737418240)    GOOD
src-acc-filer1:  /vol/SRC_AIX_01/optoracle   hdisk3           fcs0     FCP           10g (10737418240)    GOOD
src-acc-filer1:  /vol/SRC_AIX_01/varbackup   hdisk4           fcs0     FCP        120.0g (128861601792)   GOOD
src-acc-filer1:  /vol/SRC_AIX_01/vardata     hdisk5           fcs0     FCP          100g (107374182400)   GOOD
src-acc-filer1:  /vol/SRC_AIX_01/vardump     hdisk6           fcs0     FCP           10g (10737418240)    GOOD
src-acc-filer1:  /vol/SRC_AIX_01/varlog      hdisk7           fcs0     FCP         40.0g (42953867264)    GOOD
src-acc-filer1:  /vol/AIX_01/varbackupclone  hdisk8           fcs0     FCP        120.0g (128861601792)   GOOD
src-acc-filer1:  /vol/AIX_01/vardataclone    hdisk9           fcs0     FCP          100g (107374182400)   GOOD

Then, on AIX, import the volume groups:

root@src-aix-01:/home/root>importvg -y backupclone hdisk8
0516-530 synclvodm: Logical volume name loglv02 changed to loglv06.
0516-712 synclvodm: The chlv succeeded, however chfs must now be
        run on every filesystem which references the old log name loglv02.
0516-530 synclvodm: Logical volume name fslv02 changed to fslv06.
imfs: mount point "/var/backup" already exists in /etc/filesystems
backupclone
root@src-aix-01:/home/root>importvg -y dataclone hdisk9
0516-530 synclvodm: Logical volume name loglv03 changed to loglv07.
0516-712 synclvodm: The chlv succeeded, however chfs must now be
        run on every filesystem which references the old log name loglv03.
0516-530 synclvodm: Logical volume name fslv03 changed to fslv07.
imfs: mount point "/var/data" already exists in /etc/filesystems
dataclone
root@src-aix-01:/home/root>lsvg
rootvg
optambvg
optoraclevg
varbackupvg
vardatavg
vardumpvg
varlogvg
backupclone
dataclone

And add this to /etc/filesystems:

/var/backupclone:
        dev             = /dev/fslv06
        vfs             = jfs2
        log             = /dev/loglv06
        mount           = true
        options         = rw
        account         = false

/var/dataclone:
        dev             = /dev/fslv07
        vfs             = jfs2
        log             = /dev/loglv07
        mount           = true
        options         = rw
        account         = false
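The two stanzas above follow the same pattern, so they can be generated instead of typed. A hedged sketch; the `mkstanza` helper and its arguments are mine, not a standard AIX tool:

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/filesystems stanza for a cloned
# filesystem. Usage: mkstanza <mountpoint> <fslv> <loglv>
mkstanza() {
    cat <<EOF
$1:
        dev             = /dev/$2
        vfs             = jfs2
        log             = /dev/$3
        mount           = true
        options         = rw
        account         = false
EOF
}

mkstanza /var/backupclone fslv06 loglv06
mkstanza /var/dataclone fslv07 loglv07
```

Verify the logical volume names against the importvg output before appending the result to /etc/filesystems.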

Then enable the volume groups and mount the filesystems:

root@src-aix-01:/home/root>varyonvg backupclone
root@src-aix-01:/home/root>varyonvg dataclone
root@src-aix-01:/home/root>mkdir /var/backupclone
root@src-aix-01:/home/root>mkdir /var/dataclone
root@src-aix-01:/home/root>mount /var/dataclone
Replaying log for /dev/fslv07.
root@src-aix-01:/home/root>mount /var/backupclone
Replaying log for /dev/fslv06.

To check the mounted data:

root@src-aix-01:/home/root>df -Pm
Filesystem    MB blocks      Used Available Capacity Mounted on
/dev/hd4        2048.00     61.21   1986.79       3% /
/dev/hd2        4096.00   1365.46   2730.54      34% /usr
/dev/hd9var     1024.00     25.74    998.26       3% /var
/dev/hd3        8192.00    898.02   7293.98      11% /tmp
/dev/fwdump      832.00      0.45    831.55       1% /var/adm/ras/platform
/dev/hd1         512.00     67.16    444.84      14% /home
/proc                 -         -         -       -  /proc
/dev/hd10opt    4096.00    159.90   3936.10       4% /opt
/dev/fslv00    10208.00   1671.47   8536.53      17% /opt/amb
/dev/fslv01    10208.00   4333.34   5874.66      43% /opt/oracle
/dev/fslv02   122752.00  24100.24  98651.76      20% /var/backup
/dev/fslv03   102144.00  53500.98  48643.02      53% /var/data
/dev/fslv04    10208.00      1.89  10206.11       1% /var/dump
/dev/fslv05    40896.00    444.71  40451.29       2% /var/log
/dev/fslv07   102144.00  42812.18  59331.82      42% /var/dataclone
/dev/fslv06   122752.00  28333.43  94418.57      24% /var/backupclone

Restore the SnapMirror Relationship

First remove the LUNs from AIX:

root@src-aix-01:/home/root>umount /var/dataclone/
root@src-aix-01:/home/root>umount /var/backupclone/
root@src-aix-01:/home/root>rmdir /var/dataclone/
root@src-aix-01:/home/root>rmdir /var/backupclone/
root@src-aix-01:/home/root>varyoffvg backupclone
root@src-aix-01:/home/root>varyoffvg dataclone
root@src-aix-01:/home/root>exportvg backupclone
root@src-aix-01:/home/root>exportvg dataclone
root@src-aix-01:/home/root>rmdev -dl hdisk8
root@src-aix-01:/home/root>rmdev -dl hdisk9
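The cleanup order matters: unmount before varying off, export before removing the hdisks. A dry-run sketch that just emits the plan (names match this article's example; pipe it to sh only after checking it):

```shell
#!/bin/sh
# Dry-run sketch of the AIX-side cleanup above. It only prints the commands;
# execute them by hand (or pipe to sh) once you've verified the plan.
cleanup_plan() {
    for fs in /var/dataclone /var/backupclone; do
        echo "umount $fs"
    done
    for fs in /var/dataclone /var/backupclone; do
        echo "rmdir $fs"
    done
    for vg in backupclone dataclone; do
        echo "varyoffvg $vg"
    done
    for vg in backupclone dataclone; do
        echo "exportvg $vg"
    done
    for disk in hdisk8 hdisk9; do
        echo "rmdev -dl $disk"
    done
}
cleanup_plan
```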

And don't forget to clear /etc/filesystems.

Then unmap the LUNs on the filer and simply perform a resync:

src-acc-filer1> lun unmap /vol/AIX_01/varbackupclone SRC-AIX-01
src-acc-filer1> lun unmap /vol/AIX_01/vardataclone SRC-AIX-01
src-acc-filer1> snapmirror resync AIX_01
The resync base snapshot will be: src-acc-filer1(0151762815)_AIX_01.308
These newer snapshots will be deleted from the destination:
    hourly.0
    clonesnapshot
These older snapshots have already been deleted from the source
and will be deleted from the destination:
    src-acc-filer1(0151762815)_AIX_01.307
Are you sure you want to resync the volume? yes
Mon May 30 14:55:27 CEST [src-acc-filer1: snapmirror.dst.resync.info:notice]: SnapMirror resync of AIX_01 to dst-prd-filer1:AIX_01 is using src-acc-filer1(0151762815)_AIX_01.308 as the base snapshot.
Volume AIX_01 will be briefly unavailable before coming back online.
Mon May 30 14:55:28 CEST [src-acc-filer1: wafl.snaprestore.revert:notice]: Reverting volume AIX_01 to a previous snapshot.
Revert to resync base snapshot was successful.
Mon May 30 14:55:29 CEST [src-acc-filer1: replication.dst.resync.success:notice]: SnapMirror resync of AIX_01 to dst-prd-filer1:AIX_01 was successful.
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.
src-acc-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                  State          Lag        Status
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01        Snapmirrored   04:37:36   Transferring  (9024 KB done)
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02        Snapmirrored   00:01:35   Idle
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Source         166:09:56  Idle

Troubleshooting

If you run into the following problem it is not possible to resync and you'll have to do an initialize:

Fri May 27 13:57:31 CEST [src-acc-filer1: replication.dst.resync.failed:error]: SnapMirror resync of AIX_01 to dst-prd-filer1:AIX_01 : no common snapshot to use as the base for resynchronization.
Snapmirror resynchronization of AIX_01 to dst-prd-filer1:AIX_01 : no common snapshot to use as the base for resynchronization
Aborting resync.

Set the target volume in restricted mode and initialize the snapmirror relationship.
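A sketch of those two commands, echoed rather than executed; the `on_filer` helper is a stand-in for the filer console, and the volume and filer names are taken from the example above:

```shell
#!/bin/sh
# Recovery when no common snapshot exists: restrict the destination volume,
# then re-initialize the relationship from scratch. Echo-only sketch.
on_filer() { echo "src-acc-filer1> $*"; }

on_filer vol restrict AIX_01
on_filer snapmirror initialize -S dst-prd-filer1:AIX_01 src-acc-filer1:AIX_01
```

Note that a fresh initialize is a full baseline transfer, so expect it to take considerably longer than a resync.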

Disaster Recovery

In this test we'll actually break the mirror and use the mirrored data. When the disaster is resolved, we'll use SnapMirror to replicate the changed data back to the original source, and then restore the original mirror. That takes these steps:

  1. Break the mirror
  2. Use the data
  3. Sync the data back from the destination to the source
  4. Restore the original SnapMirror relationship, making the source the source again.

As this scenario starts out the same as the ESX ISO Store scenario, we'll focus on syncing the data back to the source and then restoring the original SnapMirror relationship. The ISO store is actually a good example, since we also had some ISOs at the production site that we didn't have at the acceptance site. So I'm adding these ISOs to the store, and then I'll sync the data back.

Sync Data Back

If the original source is available and the data, including the snapshots, is still present, run this on the original source:

src-acc-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                  State          Lag        Status
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01        Snapmirrored   00:06:33   Transferring  (26 MB done)
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02        Snapmirrored   00:00:33   Idle
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Source         166:38:53  Idle
src-acc-filer1> snapmirror resync -S dst-prd-filer1:ESX_ISOSTORE src-acc-filer1:SRC_ACC_ESX_ISO
The resync base snapshot will be: dst-prd-filer1(0151762648)_ESX_ISOSTORE.3
These newer snapshots will be deleted from the destination:
    hourly.0
    hourly.1
    weekly.0
    nightly.0
    nightly.1
Are you sure you want to resync the volume? yes
Mon May 30 15:27:09 CEST [src-acc-filer1: snapmirror.dst.resync.info:notice]: SnapMirror resync of SRC_ACC_ESX_ISO to dst-prd-filer1:ESX_ISOSTORE is using dst-prd-filer1(0151762648)_ESX_ISOSTORE.3 as the base snapshot.
Volume SRC_ACC_ESX_ISO will be briefly unavailable before coming back online.
Mon May 30 15:27:10 CEST [src-acc-filer1: wafl.snaprestore.revert:notice]: Reverting volume SRC_ACC_ESX_ISO to a previous snapshot.
Revert to resync base snapshot was successful.
Mon May 30 15:27:11 CEST [src-acc-filer1: replication.dst.resync.success:notice]: SnapMirror resync of SRC_ACC_ESX_ISO to dst-prd-filer1:ESX_ISOSTORE was successful.
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.
src-acc-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                     State          Lag        Status
dst-prd-filer1:ESX_ISOSTORE     src-acc-filer1:SRC_ACC_ESX_ISO  Snapmirrored   166:41:45  Transferring  (28 MB done)
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01           Snapmirrored   00:03:26   Idle
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02           Snapmirrored   00:03:25   Idle
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE     Source         166:41:45  Idle

If the original data is not available you can recreate the volume and start a completely new initialization. On the original source:

snapmirror initialize -S dst-prd-filer1:ESX_ISOSTORE src-acc-filer1:SRC_ACC_ESX_ISO

Last Data Sync

Then stop all systems, make sure no data is written anymore, and run this on the original source:

src-acc-filer1> snapmirror update -S dst-prd-filer1:ESX_ISOSTORE src-acc-filer1:SRC_ACC_ESX_ISO
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.

Restore Original SnapMirror Relationship

On original source:

src-acc-filer1> snapmirror break SRC_ACC_ESX_ISO
snapmirror break: Destination SRC_ACC_ESX_ISO is now writable.
Volume size is being retained for potential snapmirror resync.  If you would like to grow the volume and do not expect to resync, set vol option fs_size_fixed to off.

On original destination:

dst-prd-filer1> snapmirror resync ESX_ISOSTORE
The resync base snapshot will be: src-acc-filer1(0151762815)_SRC_ACC_ESX_ISO.2
Are you sure you want to resync the volume? yes
Tue May 31 08:33:30 CEST [dst-prd-filer1: snapmirror.dst.resync.info:notice]: SnapMirror resync of ESX_ISOSTORE to src-acc-filer1:SRC_ACC_ESX_ISO is using src-acc-filer1(0151762815)_SRC_ACC_ESX_ISO.2 as the base snapshot.
Volume ESX_ISOSTORE will be briefly unavailable before coming back online.
Tue May 31 08:33:32 CEST [dst-prd-filer1: wafl.snaprestore.revert:notice]: Reverting volume ESX_ISOSTORE to a previous snapshot.
exportfs [Line 2]: NFS not licensed; local volume /vol/ESX_ISOSTORE not exported
Revert to resync base snapshot was successful.
Tue May 31 08:33:32 CEST [dst-prd-filer1: replication.dst.resync.success:notice]: SnapMirror resync of ESX_ISOSTORE to src-acc-filer1:SRC_ACC_ESX_ISO was successful.
Transfer started.
Monitor progress with 'snapmirror status' or the snapmirror log.

Cleanup the Temporary SnapMirror Relationship

Status on the original source:

src-acc-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                     State          Lag        Status
dst-prd-filer1:ESX_ISOSTORE     src-acc-filer1:SRC_ACC_ESX_ISO  Broken-off     00:25:08   Idle
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01           Snapmirrored   00:01:16   Idle
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02           Snapmirrored   00:01:16   Idle
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE     Source         00:00:18   Idle

Status on the original destination:

dst-prd-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                     State          Lag        Status
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE     Snapmirrored   00:01:00   Idle
dst-prd-filer1:ESX_ISOSTORE     src-acc-filer1:SRC_ACC_ESX_ISO  Source         00:25:50   Idle
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01           Source         00:01:58   Idle
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02           Source         00:01:58   Idle

We now have a broken-off snapmirror relationship that we used to sync the data from the destination back to the source. Usually a release would remove it, but not in this case:

src-acc-filer1> snapmirror release ESX_ISOSTORE src-acc-filer1:SRC_ACC_ESX_ISO
snapmirror release: ESX_ISOSTORE, src-acc-filer1:SRC_ACC_ESX_ISO: source is offline, is restricted, or does not exist

The release fails because snapmirror release has to be run on the filer that holds the relationship's source volume, and ESX_ISOSTORE is not a local volume on src-acc-filer1.

When trying to delete the relationship in FilerView, I got this error:

Invalid Delete Operation: Schedule for 'src-acc-filer1:SRC_ACC_ESX_ISO' already deleted.

Looking into /etc/snapmirror.conf, I see that the relationship is not listed:

src-acc-filer1> rdfile /etc/snapmirror.conf
#Regenerated by registry Mon Apr 11 13:28:39 GMT 2011
dst-prd-filer1:AIX_01 src-acc-filer1:AIX_01 - 0-59/6 * * *
dst-prd-filer1:AIX_02 src-acc-filer1:AIX_02 - 0-59/6 * * *
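
For reference, each non-comment line in /etc/snapmirror.conf is source, destination, arguments (here "-", meaning none), followed by a cron-like schedule of minute, hour, day-of-month and day-of-week, so 0-59/6 * * * means every six minutes. As a small sketch of my own (assuming a local copy of the file), you can list which destination volumes still have a scheduled relationship:

```shell
#!/bin/sh
# Print the destination volumes that have a schedule in a local copy of
# snapmirror.conf. Fields: source dest args minute hour day-of-month day-of-week.
list_scheduled_destinations() {
  awk '!/^#/ && NF >= 7 { print $2 }' "$1"
}
```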

But listing the snapshots shows that there are still two snapshots left over from the temporary relationship:

src-acc-filer1> snap list SRC_ACC_ESX_ISO
Volume SRC_ACC_ESX_ISO
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  May 31 08:55  dst-prd-filer1(0151762648)_ESX_ISOSTORE.2 (snapmirror)
  0% ( 0%)    0% ( 0%)  May 31 08:30  src-acc-filer1(0151762815)_SRC_ACC_ESX_ISO.2 (snapmirror)
  0% ( 0%)    0% ( 0%)  May 31 06:00  hourly.0
  0% ( 0%)    0% ( 0%)  May 31 00:00  nightly.0
  0% ( 0%)    0% ( 0%)  May 30 18:00  hourly.1
  0% ( 0%)    0% ( 0%)  May 30 15:27  src-acc-filer1(0151762815)_SRC_ACC_ESX_ISO.1
  0% ( 0%)    0% ( 0%)  May 30 00:00  weekly.0
  0% ( 0%)    0% ( 0%)  May 29 00:00  nightly.1
  0% ( 0%)    0% ( 0%)  May 23 16:45  dst-prd-filer1(0151762648)_ESX_ISOSTORE.3 (snapmirror)

As SnapMirror is based on snapshots named after the destination, we have to keep the snapshots starting with dst-prd-filer1 (they belong to the original relationship) and remove the src-acc-filer1 snapshots, since they were only used for the temporary relationship:

src-acc-filer1> snap delete SRC_ACC_ESX_ISO src-acc-filer1(0151762815)_SRC_ACC_ESX_ISO.2
Tue May 31 09:34:12 CEST [src-acc-filer1: wafl.snap.delete:info]: Snapshot copy src-acc-filer1(0151762815)_SRC_ACC_ESX_ISO.2 on volume SRC_ACC_ESX_ISO IBM was deleted by the Data ONTAP function snapcmd_delete. The unique ID for this Snapshot copy is (37, 1967).
src-acc-filer1> snap delete SRC_ACC_ESX_ISO src-acc-filer1(0151762815)_SRC_ACC_ESX_ISO.1
Tue May 31 09:34:26 CEST [src-acc-filer1: wafl.snap.delete:info]: Snapshot copy src-acc-filer1(0151762815)_SRC_ACC_ESX_ISO.1 on volume SRC_ACC_ESX_ISO IBM was deleted by the Data ONTAP function snapcmd_delete. The unique ID for this Snapshot copy is (33, 1852).
src-acc-filer1> snap list SRC_ACC_ESX_ISO
Volume SRC_ACC_ESX_ISO
working...

  %/used       %/total  date          name
----------  ----------  ------------  --------
  0% ( 0%)    0% ( 0%)  May 31 09:24  dst-prd-filer1(0151762648)_ESX_ISOSTORE.4 (snapmirror)
  0% ( 0%)    0% ( 0%)  May 31 06:00  hourly.0
  0% ( 0%)    0% ( 0%)  May 31 00:00  nightly.0
  0% ( 0%)    0% ( 0%)  May 30 18:00  hourly.1
  0% ( 0%)    0% ( 0%)  May 30 00:00  weekly.0
  0% ( 0%)    0% ( 0%)  May 29 00:00  nightly.1
  0% ( 0%)    0% ( 0%)  May 23 16:45  dst-prd-filer1(0151762648)_ESX_ISOSTORE.3 (snapmirror)
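
This cleanup can be scripted. The sketch below is my own (not part of the original procedure): it reads captured snap list output and prints a snap delete command for every snapshot named after the temporary destination filer, whether or not it still carries the (snapmirror) tag:

```shell
#!/bin/sh
# Emit `snap delete` commands for snapshots belonging to a given filer.
# $1 = volume name, $2 = filer name prefix, stdin = `snap list` output.
stale_snap_deletes() {
  vol="$1"
  prefix="$2"
  # Snapshot names look like filer(sysid)_volume.N; match on "filer(".
  awk -v vol="$vol" -v p="$prefix" '{
    for (i = 1; i <= NF; i++)
      if (index($i, p "(") == 1)
        printf "snap delete %s %s\n", vol, $i
  }'
}
```

Review the printed commands before actually running them on the filer.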

As you can see, the broken-off snapmirror relationship is gone now. On the original source:

src-acc-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                  State          Lag        Status
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01        Snapmirrored   00:05:27   Idle
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02        Snapmirrored   00:05:19   Idle
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Source         00:10:22   Idle

But on the original destination the relationship is still listed. Note that this filer is the source of the relationship we're trying to remove:

dst-prd-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                     State          Lag        Status
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE     Snapmirrored   00:11:41   Idle
dst-prd-filer1:ESX_ISOSTORE     src-acc-filer1:SRC_ACC_ESX_ISO  Source         01:06:30   Idle
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01           Source         00:06:46   Transferring  (32 MB done)
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02           Source         00:00:38   Idle

In this case a release works:

dst-prd-filer1> snapmirror release ESX_ISOSTORE src-acc-filer1:SRC_ACC_ESX_ISO
dst-prd-filer1> snapmirror status
Snapmirror is on.
Source                          Destination                  State          Lag        Status
src-acc-filer1:SRC_ACC_ESX_ISO  dst-prd-filer1:ESX_ISOSTORE  Snapmirrored   00:13:15   Idle
dst-prd-filer1:AIX_01           src-acc-filer1:AIX_01        Source         00:02:12   Idle
dst-prd-filer1:AIX_02           src-acc-filer1:AIX_02        Source         00:02:12   Idle
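
As a final check, you can verify that no leftover relationships remain by filtering the status output for Broken-off entries. Again a sketch of my own, working on captured snapmirror status text:

```shell
#!/bin/sh
# Print any Broken-off relationships found in `snapmirror status` output
# read from stdin; prints nothing when the cleanup is complete.
broken_off_relationships() {
  awk '$3 == "Broken-off" { print $1, "->", $2 }'
}
```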
snapmirrordata.txt · Last modified: 2013/02/25 21:20 by sjoerd