Converting the ONTAP Simulator to work in VirtualBox

This is a topic that comes up after just about every ONTAP release. The ONTAP simulator is only supported on VMware hypervisors, but with a little effort it can run in VirtualBox. Simulate ONTAP, also known as the ONTAP simulator or the vsim, is distributed as a virtual appliance in OVA format, but VirtualBox fails to import the OVA. Instead, it throws some variation of this error:

The underlying problem is that VirtualBox doesn’t connect IDE devices to the correct IDE ports unless they are presented in just the right order within the OVF XML file. This can be overcome by modifying the XML to avoid the bug in VirtualBox, but that is tedious. There is another way: VirtualBox has a nice ‘vboxmanage’ command line interface that can be used to rebuild the VM and then export it to a VirtualBox-friendly OVA file. It is this more scriptable approach that will be used here.

Obtaining the Simulator

The simulator is available to NetApp customers, most partners, and employees. It is not available to guest accounts. Guest accounts can access the ONTAP Select evaluation as an alternative.

For ONTAP 9.7, the download is available from the beta support site at:

Previous versions can be downloaded from the tool chest at:

Scripting the OVA conversion

This example script is in bash, and uses the ONTAP 9.7 version of the simulator. It runs as-is on OS X, but some adjustments may be needed for other operating systems or other versions of the simulator.

Grab the complete script from this GitHub repo, then follow along with the rest of the blog.

Let’s start by defining some variables. These will need to be adjusted if you are working with a different version of the simulator. name is the base name of the OVA file, without the extension. memory is the amount of RAM to assign to the VM. The VMware version only needs about 5GB, but on VirtualBox a bit more RAM is needed. The IDExx variables hold the names of the corresponding VMDK files within the OVA archive.
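
A sketch of that variable block, assuming the 9.7 file naming conventions (the names here are illustrative; take the real VMDK names from the .ovf file inside your OVA):

```shell
# Illustrative values; adjust to match your simulator version.
name="vsim-netapp-DOT9.7-cm_nodar"    # base name of the OVA, without the extension
memory=8192                           # RAM in MB; VirtualBox wants a bit more than VMware's ~5GB
IDE00="${name}-disk1.vmdk"            # IDE port 0, device 0
IDE01="${name}-disk2.vmdk"            # IDE port 0, device 1
IDE10="${name}-disk3.vmdk"            # IDE port 1, device 0
IDE11="${name}-disk4.vmdk"            # IDE port 1, device 1
```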


Next, do a little sanity check to make sure the vboxmanage command is available:

if [ -z "$(which vboxmanage)" ]; then echo "vboxmanage not found"; exit 1; fi

Now we can extract the contents of the OVA, which is just a tar file with some specific formatting constraints.

tar -xvf "$name".ova

Some older versions of the simulator also gzip the VMDK files, so if you are working with an older version be sure to decompress the VMDK files as well.
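
For those older releases, a small loop handles it (a sketch; it assumes the gzipped disks landed in the current directory with a .vmdk.gz extension):

```shell
# Decompress any gzipped VMDKs extracted from older simulator OVAs
for f in *.vmdk.gz; do
    [ -e "$f" ] || continue   # skip if the glob matched nothing
    gunzip "$f"
done
```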

Now use vboxmanage to create a new VM:

vboxmanage createvm --name "$name" --ostype "FreeBSD_64" --register
vboxmanage modifyvm "$name" --ioapic on 
vboxmanage modifyvm "$name" --vram 16
vboxmanage modifyvm "$name" --cpus 2
vboxmanage modifyvm "$name" --memory "$memory"

Next add the NICs. Here the internal network, intnet, is used because it makes the conversion predictably scriptable. When the final OVA is re-imported, adjust the networking to suit your needs.

vboxmanage modifyvm "$name" --nic1 intnet --nictype1 82545EM --cableconnected1 on
vboxmanage modifyvm "$name" --nic2 intnet --nictype2 82545EM --cableconnected2 on
vboxmanage modifyvm "$name" --nic3 intnet --nictype3 82545EM --cableconnected3 on
vboxmanage modifyvm "$name" --nic4 intnet --nictype4 82545EM --cableconnected4 on

Next add two serial ports. These are not needed on VMware, but ONTAP may hang on VirtualBox if they are not present.

vboxmanage modifyvm "$name" --uart1 0x3F8 4
vboxmanage modifyvm "$name" --uart2 0x2F8 3

The VirtualBox BIOS enumerates disks a little differently, so to maintain the original device order presented to the VM, add an empty virtual floppy drive.

vboxmanage storagectl "$name" --name floppy --add floppy --controller I82078 --portcount 1 
vboxmanage storageattach "$name" --storagectl floppy --device 0 --medium emptydrive

And now we can add the IDE controller and attach those VMDK files.

vboxmanage storagectl "$name" --name IDE    --add ide    --controller PIIX4  --portcount 2
vboxmanage storageattach "$name" --storagectl IDE --port 0 --device 0 --type hdd --medium "$IDE00"
vboxmanage storageattach "$name" --storagectl IDE --port 0 --device 1 --type hdd --medium "$IDE01"
vboxmanage storageattach "$name" --storagectl IDE --port 1 --device 0 --type hdd --medium "$IDE10"
vboxmanage storageattach "$name" --storagectl IDE --port 1 --device 1 --type hdd --medium "$IDE11"

Now that the VM is built, vboxmanage can export it to an OVA file that VirtualBox can understand.

vboxmanage export "$name" -o "$name"-vbox.ova

And finally, remove the temporary VM from VirtualBox.

vboxmanage unregistervm "$name" --delete

The end result should be a -vbox.ova that can be cleanly imported into VirtualBox. But there are a couple of considerations when running the simulator on this platform.

First, this isn’t VMware, so the VMware tools service will throw errors at startup. That can be avoided by setting a variable at the loader prompt that disables that service.

setenv bootarg.vm.run_vmtools false

Second, the NICs are not enumerated in the order one would expect. Here is the mapping of VirtualBox network adapters to ONTAP ports:

Adapter 1: e0d
Adapter 2: e0a
Adapter 3: e0b
Adapter 4: e0c

And finally, beware the NAT network. ONTAP has multiple Ethernet ports that expect to be able to communicate with each other over the network, but the VirtualBox NAT network isolates each port in its own private broadcast domain. As a result, the NAT network will not work.

Aside from that you should now be able to run a completely unsupported instance of the ONTAP Simulator in VirtualBox.

VMware NIC order may change after SuperMicro BIOS updates

I encountered this issue over the holidays while doing some firmware and BIOS updates in the lab. A couple of my hosts are based on SuperMicro Xeon-D boards from the X10SDV line. These systems have 2 x 1GbE and 2 x 10GbE ports, and the original NIC order enumerated the 1GbE ports first.

After updating to the latest BIOS (2.1), one of my ESX hosts did not come back online. I could see from the IPMI console that the system had booted, but it was not responding to pings. When I checked the management NICs, I discovered the order had changed.

I had to reconfigure my vSwitch uplinks to accommodate the new NIC order, starting by re-assigning the management uplink in the DCUI so I could get back into the host and fix everything else. I don’t know why one system re-ordered the NICs and the other did not, but I am now left with two otherwise identical hosts with different network uplink topologies. That is a mystery for another day, but if you are running SuperMicro-based VMware hosts, proceed with caution.

Deploying ONTAP Select on KVM (on a NUC)

In my last post I went through the process of getting KVM installed and installing the ONTAP Deploy VM. Deploying ONTAP Select is mostly a matter of stepping through a nice wizard, but I will have to make one adjustment in the Swagger interface to deploy it on the NUC. Everything here could be done with RESTful API calls, but unfamiliar things are easier to learn in a GUI.

After logging into a fresh install of the deploy utility you land at this workflow. If you bought a license this is where you would upload the license file. I don’t have any licenses to add, so I’ll run it as an eval cluster. Click Next.

The next step is to add the hypervisor hosts to the inventory, which in this case is just my KVM box. Fill in the form, click add, and wait for it to show up in the list on the right. Next.

This page defines the ONTAP Select cluster. In this example, it’s a single-node cluster running 9.6 on KVM. Fill in the form and click Done.

Done doesn’t really mean done; it just advances to node setup. Under Licenses, pick Evaluation Mode, then fill out the hypervisor particulars.

Undersized hosts like the NUC may not appear in the Hosts drop list. The node can still be assigned to a host from the CLI by connecting to the deploy instance over SSH:

(ONTAPdeploy) node modify -cluster-name otskvm -name otskvm-01 -host-name

Under storage, pick the storage pool from the drop list and assign some of its capacity to ONTAP. Don’t try to assign the entire capacity of the storage pool: ONTAP Select needs about 266GB for its system disks, which is not included in the information presented on this panel. Also, to use a license type other than evaluation, the storage pool capacity needs to be set to at least 1TB. Factoring in the system disks, the storage pool needs about 1.3TB available to accept a licensed instance of ONTAP Select. Here I am deploying in eval mode and only assigning 500GB.
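
To make that arithmetic explicit (numbers from this paragraph, in a quick bit of shell):

```shell
system_disks_gb=266     # consumed by the ONTAP Select system disks
licensed_min_gb=1024    # minimum 1TB of pool capacity for a non-eval license
echo "$(( system_disks_gb + licensed_min_gb ))GB needed"   # prints "1290GB needed", roughly the 1.3TB above
```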
To move on, click Done.

The Next button will become enabled, and the final fields before ‘Create Cluster’ are the cluster admin password. If you are on a host with 6 cores, click Create Cluster and you’re done. If you are following along on a quad-core host like a NUC, we need to use the Swagger interface to change a setting that is not exposed in the GUI.

When deploy creates the VM, it will reserve a full 4 cores’ worth of CPU. That makes for a VM with optimal performance, but on a host that only has 4 cores we need to dial it back a bit. Note that this should not normally be done in production. If you need to do this in production, check in with your account team first to make sure your scenario can be supported.

To access the swagger interface, select “API Documentation” from the help menu. This is where you can access all of the API documentation and try out API calls along the way.

In the swagger interface, scroll down toward the bottom and expand the clusters section.

Find the GET /clusters section, and click “Try it out!”

Record the cluster’s id. It becomes an input on the next API call. Scroll down to GET /clusters/{cluster_id}/nodes. Fill in the cluster ID from the first API call, and click “Try it out!”. The output returned will have the id of the node.

Now that we have both the cluster id, and the node id, we can adjust the reservations setting on the node. Scroll on down to:
PATCH /clusters/{cluster_id}/nodes/{node_id}

Fill in the cluster id, the node id, and the changes shown here. Valid values for cpu_reservation_percentage are 25, 50, 75, and 100, with 100 being the default.
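
The change itself is a small JSON request body. A sketch of what it might look like (the field name is taken from the valid-values note above; 50 is a reasonable choice for a quad-core host):

```json
{
  "cpu_reservation_percentage": 50
}
```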

Once again click “Try it out!”, but this time look for {} in the response body and a response code of 200.

Now switch back to the deploy GUI, pick a cluster admin password, and click create cluster. It will take a while to deploy, but eventually should end in a successful deployment:

It will take several minutes for the cluster’s ONTAP System Manager web interface to become available on the cluster management IP you specified. Be patient and remember to connect over https. There is even a link to it on the clusters tab of the ONTAP deploy UI.

Once you have access to the ONTAP system manager, provisioning storage services is the same as it is in any other ONTAP system. For a walkthrough of setting up CIFS services, see this post.

Installing ONTAP Select Deploy on KVM (on a NUC)

In a previous series of posts I built an ESX host on a NUC and used it to run ONTAP Select. This time around I’ll do it on KVM. This is one of those ‘prove it actually works’ posts, because I keep hearing it doesn’t work. That may have been true at one time, but with a quarterly release cadence this is a product that evolves fairly quickly. This post will cover installing KVM and the ONTAP Deploy utility; the next post will cover the actual ONTAP Select deployment.

ONTAP Select is supported on KVM, so this is mostly just a matter of following the instructions, but the NUC platform brings a few challenges. It only has 4 cores and 1 NIC, which, just like on VMware, is a little below the documented system requirements. Unlike VMware, there is no “standalone eval” image. This time I’ll build it the proper way, using the ONTAP Deploy utility VM. But first, I need to get KVM up and running.

The hardware specifications are the same as the VMware build:

NUC8i5BEH, (4 cores, 8 threads)
512GB NVME drive
1TB SSD drive
Note: To deploy a licensed instance of ONTAP Select a 2TB SSD would be needed.

For this build I chose CentOS 7.6 and these install options, based entirely on personal preference:
Server with GUI
+ Virtualization Client
+ Virtualization Hypervisor
+ Virtualization Tools
+ System Administration Tools

I installed CentOS to the NVME drive, saving the SATA SSD for later.

During setup I created a local account called ‘admin’ and set the password for root.

Later in the process I will be creating a bridge for openvswitch and adding the sole 1GbE NIC, which will drop wired network connectivity to the KVM host. To be able to do that work over SSH, I will use the WiFi adapter for host management and assign the wired interface a link-local-only address.

Time to build KVM. Start by opening an SSH session into the host as admin, and switch to root:


Next use yum to install all the dependencies:

yum install -y qemu-kvm libvirt openvswitch virt-install lshw lsscsi lsof

If openvswitch is missing from the repo, you can either build it from source or grab it from the community build service. This post is long enough without a build-from-source detour, so I’ll grab it from the CBS.

yum install openvswitch-2.7.3-1.1fc27.el7.x86_64.rpm

Next create a storage pool using that data SSD, which on this platform is /dev/sda:

virsh pool-define-as select_pool logical --source-dev /dev/sda --target=/dev/select_pool
virsh pool-build select_pool
virsh pool-start select_pool
virsh pool-autostart select_pool

Now set up openvswitch:

systemctl start openvswitch
systemctl enable openvswitch
ovs-vsctl add-br br0
ifdown eno1
ovs-vsctl add-port br0 eno1
ifup eno1

Set the queue length rules required by ONTAP Select:

echo 'SUBSYSTEM=="net", ACTION=="add", KERNEL=="ontapn*", ATTR{tx_queue_len}="5000"' > /etc/udev/rules.d/99-ontaptxqueuelen.rules
cat /etc/udev/rules.d/99-ontaptxqueuelen.rules

That’s it for KVM. Now for the ONTAP Deploy VM. ONTAP Deploy is part deployment utility, part HA mediator, and part license server. It is the standard supported way to deploy ONTAP Select, regardless of the hypervisor. Deploy does not have to run on the same host as Select; one Deploy instance can manage about 100 instances of Select in an enterprise environment.

A raw image is available for running the Deploy VM on KVM, which you can get from the evaluation section of the NetApp support site, or you can get a 90-day eval here: . Start by downloading the ONTAPdeploy raw.tgz file to your local machine and copying it over with scp:

scp ~/Downloads/ONTAPdeploy2.12.1.raw.tgz admin@

And now back over on the SSH session to the KVM host, extract the tgz:

cd /home/admin
tar -xzvf ONTAPdeploy2.12.1.raw.tgz

Give it a home:

mkdir /home/ontap
mv ONTAPdeploy.raw /home/ontap

And use virt-install to build a VM around it:

virt-install --name=ontapdeploy --vcpus=2 --ram=4096 --os-type=linux --controller=scsi,model=virtio-scsi --disk path=/home/ontap/ONTAPdeploy.raw,device=disk,bus=scsi,format=raw --network "type=bridge,source=br0,model=virtio,virtualport_type=openvswitch" --console=pty --import --wait 0

Set it to autostart:

virsh autostart ontapdeploy

Next use the virsh console to complete the VM’s setup script:

virsh console ontapdeploy

The setup script will look something like this:

Connected to domain ontapdeploy
Escape character is ^]
That does not appear to be a valid hostname
Host name            : ontapdeploy
Use DHCP to set networking information? [n]: n
Net mask             :
Gateway              :
Primary DNS address  :
Secondary DNS address: 
Please enter in all search domains separated by spaces (can be left blank):
Selected IP           :
Selected net mask     :
Selected gateway      :
Selected primary DNS  :
Selected secondary DNS: 
Search domains        : 
Calculated network    :
Calculated broadcast  :
Are these values correct? [y]: y
Applying network configuration. Please wait...
Continuing system startup. Please wait...
Debian GNU/Linux 9 ontapdeploy ttyS0
ontapdeploy login:

The GUI should be available now over https on the specified address. The default credentials are:
username: admin
password: admin123
Log in once now to change the default password, and the system will be ready to deploy ONTAP Select.

Creating a new Active Directory Forest with Ansible

Building new AD forests isn’t something most of us do often enough to need to automate it, but recently I was talking to a good friend and a fellow homelabber who needed to provision some new domains in his lab. I do this a lot because every lab environment I build gets its own AD forest. When I told him I’d been automating it with Ansible he suggested I write it up for the blog.

This playbook creates a new domain in a new forest from a freshly provisioned VM, like the one built in my previous post on building windows VMs with Ansible.

The beginning of the playbook defines all the variables needed to provision the new AD Forest. In practice I keep them in a vars file, but to simplify the example playbook I put them in-line.

- name: Create new Active-Directory Domain & Forest
  hosts: localhost
  gather_facts: no
  vars:
    dc_netmask_cidr: 24
    dc_hostname: 'dc01'
    domain_name: "demo.lab"
    local_admin: '.\administrator'
    temp_password: 'Changeme!'
    dc_password: 'P@ssw0rd'
    recovery_password: 'P@ssw0rd'
    reverse_dns_zone: ""
    ntp_servers: ","
  tasks:

Part of the process of preparing this VM to become a domain controller involves setting a static IP, changing its hostname, and changing its password, so I use Ansible’s dynamic inventory rather than a static inventory file.

First I add it to inventory using the VM’s original IP and password:

  - name: Add host to Ansible inventory
    add_host:
      name: '{{ temp_address }}'
      ansible_user: '{{ local_admin }}'
      ansible_password: '{{ temp_password }}'
      ansible_connection: winrm
      ansible_winrm_transport: ntlm
      ansible_winrm_server_cert_validation: ignore
      ansible_winrm_port: 5986
  - name: Wait for system to become reachable over WinRM
    wait_for_connection:
      timeout: 900
    delegate_to: '{{ temp_address }}'

Next set the static IP. There is no dedicated Windows Ansible module for this task, so it uses win_shell, which in turn runs the command under PowerShell.

  - name: Set static IP address
    win_shell: "(new-netipaddress -InterfaceAlias Ethernet0 -IPAddress {{ dc_address }} -prefixlength {{dc_netmask_cidr}} -defaultgateway {{ dc_gateway }})"
    delegate_to: '{{ temp_address }}'  
    ignore_errors: True 

This task will always return failed, because once the IP changes Ansible can’t check the result of the command. Just set ignore_errors: true and let it time out. Next, add the host back into inventory under its new IP address:

  - name: Add host to Ansible inventory with new IP
    add_host:
      name: '{{ dc_address }}'
      ansible_user: '{{ local_admin }}'
      ansible_password: '{{ temp_password }}'
      ansible_connection: winrm
      ansible_winrm_transport: ntlm
      ansible_winrm_server_cert_validation: ignore
      ansible_winrm_port: 5986
  - name: Wait for system to become reachable over WinRM
    wait_for_connection:
      timeout: 900
    delegate_to: '{{ dc_address }}'

Next set the local administrator password. This password will become the domain admin password later when the system is promoted to a domain controller.

  - name: Set Password
    win_user:
      name: administrator
      password: "{{ dc_password }}"
      state: present
    delegate_to: '{{ dc_address }}'
    ignore_errors: True

Once again re-add it to inventory using its new IP address:

  - name: Add host to Ansible inventory with new Password
    add_host:
      name: '{{ dc_address }}'
      ansible_user: '{{ local_admin }}'
      ansible_password: '{{ dc_password }}'
      ansible_connection: winrm
      ansible_winrm_transport: ntlm
      ansible_winrm_server_cert_validation: ignore
      ansible_winrm_port: 5986
  - name: Wait for system to become reachable over WinRM
    wait_for_connection:
      timeout: 900
    delegate_to: '{{ dc_address }}'

Next set the upstream DNS servers. These will become the DNS forwarders once the AD integrated DNS server is installed.

  - name: Set upstream DNS servers
    win_dns_client:
      adapter_names: '*'
      dns_servers:
      - '{{ upstream_dns_1 }}'
      - '{{ upstream_dns_2 }}'
    delegate_to: '{{ dc_address }}'

Next set the upstream NTP servers. Domain controllers should reference an authoritative time source.

  - name: Stop the time service
    win_service:
      name: w32time
      state: stopped
    delegate_to: '{{ dc_address }}'
  - name: Set NTP Servers
    win_shell: 'w32tm /config /syncfromflags:manual /manualpeerlist:"{{ ntp_servers }}"'
    delegate_to: '{{ dc_address }}'
  - name: Start the time service
    win_service:
      name: w32time
      state: started
    delegate_to: '{{ dc_address }}'

Now, before proceeding, disable the Windows firewall. Otherwise the domain firewall policy will prevent later tasks from succeeding after the system reboots. You can re-enable it and set rules to your liking once the playbook is complete.

  - name: Disable firewall for Domain, Public and Private profiles
    win_firewall:
      state: disabled
      profiles:
      - Domain
      - Private
      - Public
    tags: disable_firewall
    delegate_to: '{{ dc_address }}'

Before promoting a system to a DC, you should set its hostname. It’s much simpler to rename it before it becomes a domain controller. These tasks update the hostname, and reboot if required.

  - name: Change the hostname
    win_hostname:
      name: '{{ dc_hostname }}'
    register: res
    delegate_to: '{{ dc_address }}'
  - name: Reboot
    win_reboot:
    when: res.reboot_required
    delegate_to: '{{ dc_address }}'

Now you are ready to install Active Directory and create the domain.

  - name: Install Active Directory
    win_feature: >
       name=AD-Domain-Services
       include_management_tools=yes
       include_sub_features=yes
       state=present
    register: result
    delegate_to: '{{ dc_address }}'
  - name: Create Domain
    win_domain: >
       dns_domain_name='{{ domain_name }}'
       safe_mode_password='{{ recovery_password }}'
    register: ad
    delegate_to: "{{ dc_address }}"
  - name: reboot server
    win_reboot:
      msg: "Installing AD. Rebooting..."
      pre_reboot_delay: 15
    when: ad.changed
    delegate_to: "{{ dc_address }}"

Once the system reboots there are a few little cleanup tasks. First, domain controllers should use themselves as the DNS server. This should get set during the DC promotion, but I like to make sure it gets set:

  - name: Set internal DNS server
    win_dns_client:
      adapter_names: '*'
      dns_servers:
      - ''
    delegate_to: '{{ dc_address }}'

Next create the reverse lookup zone for the local subnet. The forward lookup zones get created automatically, but the reverse zones do not. Note the retries on this step: at this point in the process the system has just rebooted after becoming a domain controller, and it takes a while for it to really be ready to continue.

  - name: Create reverse DNS zone
    win_shell: "Add-DnsServerPrimaryZone -NetworkID {{reverse_dns_zone}} -ReplicationScope Forest"
    delegate_to: "{{ dc_address }}"    
    retries: 30
    delay: 60
    register: result           
    until: result is succeeded

And the final step in my process is to make sure RDP is enabled so I can remote in and do any one-off customizations:

  - name: Check for xRemoteDesktopAdmin Powershell module
    win_psmodule:
      name: xRemoteDesktopAdmin
      state: present
    delegate_to: "{{ dc_address }}"
  - name: Enable Remote Desktop
    win_dsc:
      resource_name: xRemoteDesktopAdmin
      Ensure: present
      UserAuthentication: NonSecure
    delegate_to: "{{ dc_address }}"

That’s the process, end to end, from newly installed Windows Server to newly provisioned Active Directory forest. The complete playbook is in the examples repo on my GitHub.

Running an ONTAP Select eval cluster on a NUC

In this post, I’ll give a little overview of ONTAP Select, how to get a free eval copy, and how to deploy it on a NUC or other small lab host. I’ll also go through some getting started steps to take the ONTAP Select instance through deployment and on to serving data. This builds on a recent post that covered the install of ESXi on the NUC and turns it into a storage appliance running ONTAP.

About ONTAP Select

ONTAP Select is the ONTAP operating system, running in a Virtual Machine on an ESXi or KVM host. This is the same ONTAP operating system that runs on NetApp FAS and AFF engineered systems, and in the major cloud providers as Cloud Volumes ONTAP. ONTAP can run just about anywhere, but the accessibility of ONTAP Select makes it a great platform for running ONTAP in the homelab. You can use it to learn ONTAP, try out new releases, or just to add some feature rich storage to your lab.

System Requirements

What I need:
According to the documentation, to run a single node ONTAP Select cluster my VMware host needs:
– 2 x 1GbE NICs
– Six physical cores or greater, with four reserved for ONTAP Select
– 24GB or greater with 16GB reserved for ONTAP Select.
– A hardware raid controller or enough internal SSDs to enable software raid

What I have:
– 1 x 1GbE NIC
– 4 physical cores
– 64GB RAM
– 1xNVME drive + 1xSATA SSD drive

So the NUC doesn’t quite meet the documented system requirements, but I’ll make it work anyway.

Obtaining ONTAP Select

ONTAP Select has a free 90-day trial available at the following link:
If you have an existing account on the NetApp support site, you can download it from the evaluation section of the support site. If you need to create a guest account, follow the instructions on the 90-day trial link to get access. Once logged in you’ll find it in the product evaluation section, or just follow this link:
The version I’ll be installing is the Standalone Eval OVA:

Just to be clear, this is not the way a licensed version would be installed. A licensed version would be installed using the ONTAP Deploy utility OVA, which is part deployment tool, part HA mediator, and part license manager. It is possible to install a properly licensed ONTAP Select instance on a NUC, but that is a topic for another day. Today is about having some fun with the Standalone Eval version.

Deploying ONTAP Select

Now that we have the OVA file downloaded, we can deploy it like any other OVA. From a resource standpoint, the VM requires 4 vCPUs, 16GB of RAM, and 300GB+ of disk space, which is just within reach of a small platform like the Intel NUC.

The datastore really should be a local SSD or NVME based datastore for performance reasons, so I will be deploying it to the internal NVME drive on my NUC. The VM will need about 302GB for a thick provisioned deployment.

Connect it to your VM Network, and choose the default deployment type, “ONTAP Select Evaluation – Small”.

The additional settings page contains the IPs and hostname that will be used to create the cluster.

Clustername: ONTAP Select clusters can contain 1, 2, 4, or 8 nodes. This field specifies the name of the cluster, not the underlying node(s).
Data Disk Size for ONTAP Select: This is the size of the virtual disk that will be used to store user data. The default for the eval is 100GB, but you can increase it if more space is available.
Cluster Management IP address: This is the primary management IP for the cluster, regardless of the number of nodes it contains.
Node Management IP address: This IP is used to manage the individual node.
Netmask and Gateway: Set these to match your VM Network subnet.
Node Name: Each node in an ONTAP cluster is assigned a unique name. This cluster will only contain one node, so give it a name.
Administrative Password: This sets the initial password for the cluster’s ‘admin’ account. Use at least a mix of letters and numbers. The deployment may fail if the password is too simple.

Continue on with deployment, then open up the VM console and wait for the login prompt:

You won’t typically use the VM console window after the initial deployment. Instead you would either SSH to the cluster management IP, or log into the ONTAP System Manager GUI in a browser.

Accessing the ONTAP System Manager GUI

After a few minutes the System Manager login should be available on the Cluster Management IP address. By default, the interface is only available via HTTPS. Login as user admin, with the password specified in the OVA deployment workflow.

Once logged in to the ONTAP System Manager GUI, you’ll be at the cluster’s dashboard page:

Preparing to serve data

ONTAP clusters are a platform for running Storage Virtual Machines (SVMs, also known as vservers). The SVMs provide the actual user-facing data services like CIFS, NFS, iSCSI, etc. Before we can create an SVM, we need a data aggregate. To use a virtualization frame of reference: if SVMs are analogous to VMs, data aggregates are analogous to datastores. On larger systems, data aggregates are an ‘aggregate’ of one or more RAID groups. In the case of this little ONTAP Select instance, the data aggregate will be a single virtual disk in RAID 0.

Navigate to Storage->Aggregates & Disks->Aggregates, then click Create:

At this point there will only be one disk available, so just give the aggregate a name. In this example I called it aggr1. Then click Submit.

Adding more storage

My NUC has a SATA SSD in addition to the NVME drive where I deployed the ONTAP Select VM. I could make a datastore on that SSD, put a large VMDK on that datastore, and attach it to the ONTAP Select VM. But I actually prefer to just RDM the whole disk. That takes a little CLI work on the ESX host.

After enabling SSH on my ESX host and logging in, I can find the SATA drive’s device identifier:

And use vmkfstools to create a passthru RDM for that device:
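
Those two steps look roughly like this on the ESXi shell (the device identifier and datastore path here are hypothetical; substitute the ones you find on your own host):

```shell
# List the disks the host can see; the SATA SSD's identifier will be in here
ls /vmfs/devices/disks/

# Create a passthru (physical compatibility) RDM pointer for that device
vmkfstools -z /vmfs/devices/disks/t10.ATA_____Samsung_SSD_860_EVO_1TB \
    /vmfs/volumes/nvme-ds/rdm/sata-ssd-rdm.vmdk
```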

I can then attach that VMDK to my ONTAP Select VM and use it to create another data aggregate. Edit the VM, add a hard disk, and pick “Existing Hard Disk”. Browse to the RDM disk and add it to the VM.

That disk will show up as an unassigned disk in ONTAP, which I can later assign to my node and use to create another data aggregate.

Navigate to Storage->Aggregates & Disks->Disks, select the unassigned disk from the list and click assign.

In the Assign Disks dialogue, click Assign. Then the disk will be available in the aggregate create workflow.

Setting Reservations

The ONTAP Select VM needs some reservations to protect it from other VMs that may be running on the host. In a production deployment, 100% of CPU and RAM would be reserved, but on this tiny platform that is not feasible. We can and should reserve 100% of the RAM, and at least ~25% of the CPU. This host has ~8GHz available over 4 cores, so I’ll set my CPU reservation at 2000MHz and my memory reservation at 16GB.

Treating it like an appliance

Since I’ll be treating this host like a home lab storage appliance, I will set this VM to start and stop with the host. Enable autostart, make this VM start first, and set the shutdown behaviour to ‘shut down’ to allow ONTAP to shut down gracefully.

Passing VLANs to ONTAP

For ONTAP Select to support VLANs like the hardware appliances do, it needs to be attached to a VMware port group assigned to VLAN 4095. This configures the port group as a VLAN trunk and allows VMs to handle VLAN tags on their own. VMware calls this configuration “Virtual Guest Tagging”, or VGT. If you want that, configure a port group as shown below and connect the Select VM’s NICs to that port group.
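
The same port group can be sketched from the ESXi shell with esxcli (the port group and vSwitch names here are hypothetical):

```shell
# Create a port group on the standard vSwitch, then tag it with VLAN 4095 (trunk/VGT)
esxcli network vswitch standard portgroup add --portgroup-name=VM_Trunk --vswitch-name=vSwitch0
esxcli network vswitch standard portgroup set --portgroup-name=VM_Trunk --vlan-id=4095
```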

Creating a Storage Virtual Machine

If you make SVMs all the time you might want to skip ahead to the conclusion; otherwise, read on for a walkthrough of provisioning an SVM.
Start by navigating to Storage->SVMs, and click create:

On the first page of the SVM Setup wizard, specify the SVM name, which protocols to enable, and the DNS domain and name servers required to join Active Directory. It is possible to run CIFS in workgroup mode, but that feature is only available at the command line. For this example I’ll just enable CIFS, join AD, and create my first share.

On the next page of the wizard provide the CIFS configuration details, starting with the IP address to use for CIFS access. In the Assign IP address drop list, select “Without a subnet”, then fill out the IP information in the Add Details box. Then click OK.

This creates a logical interface, which needs to be assigned to a port. Ports in ONTAP are named e0a, e0b, e0c, and so on. Click Browse next to the Port: box and pick e0a.

Next fill in the CIFS server details. The CIFS Server Name is the name of the computer account it will create in Active Directory, and the remaining fields are your Active Directory details. You also have the option of creating an initial CIFS share as part of the SVM setup wizard.

On the Admin details page of the wizard, enter a password for the vsadmin account. Each SVM has its own administrative account that can be used for delegation, or integration with other applications.

Click submit and continue, then OK on the final confirmation page, and your new SVM will be created, along with that initial share.

This was a deliberately simple example. To learn more about ONTAP and ONTAP Select, see the resources available on


The end result of this little lab adventure is an Intel NUC or similar small ESX host, configured to act as an appliance running a single node ONTAP Select cluster, with a couple of RAID0 data aggregates, support for CIFS, NFS and iSCSI, and all the data management features you would get in an enterprise class storage system. It may only last for 90 days, but that’s long enough to learn how to use SnapMirror. With storage efficiencies applied, the results can be pretty impressive. Here is a screenshot from one filled with nested lab VMs.

I have covered the steps to build this lab box interactively, but I’ll revisit this in a future post and replace all these tasks with Ansible so I can spin up a new lab box just by running a playbook. After all, this build has several single points of failure so I’ll need something to SnapMirror to for data protection.

Anatomy of a Virtual Lab Environment

Virtual Labs are everywhere. VMware has HOL (Hands on labs), Microsoft has Hands-on Labs, Cisco has dCloud, NetApp has labondemand, and on and on. They’re great for making complete lab environments available for demos, training, and study. But how do they work, and how can they be scaled down to run in a homelab?

Virtual Labs generally have a few things in common. They have isolated network(s) internal to the lab, they contain a collection of pre-configured VMs, and they are accessed via some sort of jump host. Virtual labs are typically cloned into multiple instances, with every lab instance containing an identical set of VMs and networks.

In this simple example, each lab instance has an identical set of VMs, with identical IP addresses. Each lab’s gateway connects it to the transit network, and lab users connect to their lab instance through a remote display protocol.

For this scheme to work, each lab needs an isolated internal network. In fact the VMs within these labs should be completely identical, down to the mac addresses of their NICs. There are lots of ways this could be accomplished, with VXLAN and NSX at the top of the list, but those are heavyweight solutions at homelab scale. Instead I’ll take a simpler approach, and just use portgroups on an isolated vSwitch to achieve network isolation between lab instances.

Here is a diagram of what those 3 lab instances might look like from a vSwitch perspective:

Each lab instance’s internal lab network is backed by an individual port group, with a unique vlan assignment. A virtual router acts as the lab gateway, with the router’s LAN port connected to the instance’s network, and the router’s WAN port connected to the VM Network. The router provides NAT to the lab instance, and a simple RDP port forward to the jump host facilitates remote access to the lab environment.
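As a sketch of how those per-instance port groups might be created from the ESXi shell — the vSwitch name “vSwitchLabs” and the VLAN IDs 101–103 are assumptions for illustration:

```shell
# Create an isolated vSwitch (attach no uplinks) and one port group per lab instance
esxcli network vswitch standard add --vswitch-name=vSwitchLabs
for i in 1 2 3; do
  esxcli network vswitch standard portgroup add \
    --portgroup-name="VirtualLab-Instance$i" --vswitch-name=vSwitchLabs
  esxcli network vswitch standard portgroup set \
    --portgroup-name="VirtualLab-Instance$i" --vlan-id=$((100 + i))
done
```

With no uplinks on the vSwitch, the unique VLAN IDs keep each instance’s traffic isolated even though every instance uses identical IP and MAC addresses.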

One caveat of this strategy is that lab instances cannot span hosts. If an individual instance needed to span hosts, the networks would need to be VLAN backed, or provisioned with an overlay technology. In practice there are other reasons to keep the VMs of a given instance on the same host, so this isn’t really a limitation, but it does mean there needs to be a way to group these VMs so they always run on the same host. I use two VMware features to accomplish this: vApps and affinity rules.

Here I’ve grouped my lab instance VMs into vApp containers.

vApps can also be cloned using the vCenter UI, providing an easy way to provision more instances. Many of my lab topologies are too complex to survive vApp cloning, but it’s a good way to get started. If you pre-provision a network for the new lab clone, you can map the VMs to that network as part of the New vApp wizard.

Next I can create an affinity rule to keep all the child VMs of that vApp on the same host. But since I have cloned my vApps, all of the child VM names are the same, and the vCenter UI for creating affinity rules cannot distinguish one from another. In this case, it’s much easier to just create the rule with a little snippet of PowerCLI:

New-DrsRule -Cluster "Lab Cluster" -Name "VirtualLab-Instance1" -KeepTogether $true -VM (get-vapp "VirtualLab-Instance1" | get-vm)

So far I’ve covered the general anatomy of a virtual lab, with an emphasis on the networking aspects, and an approach to implementing them on a small scale suitable for a home lab. There is a lot more to cover on this topic. The configuration of the virtual router serving as the gateway, strategies for configuring VMs to survive this kind of cloning, ways to optimize active directory for cloning and long term storage, and strategies for automated provisioning are all important topics. I also have a project on my github with my virtual lab automation and provisioning portal, if you want to see how I really do things in my home lab. It’s a perpetual work in progress, but for now I’ll leave off with a screenshot of the dashboard.

Running ESXi 6.7 on a Bean Canyon Intel NUC NUC8i5BEH.

With 4 cores, 8 logical CPUs, and up to 64 GB of RAM, the 8th generation i5 and i7 Intel NUCs make nice little home lab virtualization hosts. This week I rebuilt one of mine and documented the build process.


In terms of components, there is not much to it. The NUC includes everything except RAM and storage on the motherboard. The components I chose for this build are listed below.

  • NUC8i5BEH
  • 64GB (2x32GB) SoDIMM (M471A4G43MB1)
  • A 32GB USB stick for the ESXi boot disk
  • Local Storage: (optional)
    • Samsung 970 PRO NVMe M.2 drive, 512GB
    • Samsung 960 EVO SSD drive, 1TB

Everything is easily accessible for installation. Loosen 4 screws to remove the bottom cover and everything can be assembled in minutes.

Bios settings:

Next, boot into the BIOS and update it if needed. The BIOS hotkey is F2. If the NUC doesn’t detect a monitor at boot the video out may not work, so plug in and turn on the monitor before powering up the NUC. I have already updated the BIOS on this one, but it is easy to do. Just put the BIOS file on a USB stick and install it from the Visual BIOS.

There are a few BIOS settings that should be adjusted to make things go smoothly. First, to reliably boot ESXi from the USB stick, both UEFI boot and Legacy Boot should be enabled.

Next, on the boot configuration tab, enable “Boot USB devices first”:

Next head over to the Security tab and uncheck “Intel Platform Trust Technology”. The NUC doesn’t have a TPM chip, so if you don’t disable this you’ll get a persistent warning in vCenter: “TPM 2.0 device detected but a connection cannot be established.”

On the Power tab you’ll find the setting that controls what happens after a power failure. By default the NUC will stay powered off. For lab hosts I set it to ‘last state’; for appliance hosts, like my pfSense firewall, I set it to always power on.

ESXi Installation:

ESXi 6.7U1 works out of the box, with no special vibs or image customization required. There is really nothing unique to see here so I’ll skip on to configuration.

ESXi Configuration:

Once ESXi is up and running you can see the 64 GB RAM kit is working, despite the 32 GB limit in Intel’s documentation.

Because I have internal storage, I created a datastore, ‘datastore1’, on the NVMe drive. I’m saving the SATA SSD for a later project, so I am leaving it alone for now.

Next there are a few settings in ESXi that are worth pointing out. First, set the swap location to a datastore. This avoids some situations where a patch may fail to install due to lack of swap.

Similarly, the logs should be moved to a persistent location; here I’ll put them on datastore1. These settings are found in “Advanced settings” on the System tab shown above. Note that I had to pre-create the directory structure on datastore1 before applying this setting.
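For the scripting-inclined, both of these settings can also be made from the ESXi shell. This is a sketch using esxcli, assuming the datastore1 datastore from this build:

```shell
# Point host swap at a datastore so patches don't fail for lack of swap space
esxcli sched swap system set --datastore-enabled true --datastore-name datastore1
# Pre-create the log directory, then move syslog to persistent storage
mkdir -p /vmfs/volumes/datastore1/logs
esxcli system syslog config set --logdir=/vmfs/volumes/datastore1/logs
esxcli system syslog reload   # apply the new log location
```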

The next few settings are less conventional, and not recommended for production, but make life easier in the lab. I’ll explain my reasoning for each and you can decide for yourself how you like your systems configured.

First up is salting. Salting is used to deliberately break transparent page sharing (TPS) in an effort to improve the default security posture of the host. But this isn’t production, it’s a home lab. I fully expect to over-commit memory on this little host, so if I can gain any efficiency by re-enabling transparent page sharing and letting it de-dupe the ram across VMs, I’ll take it.

Next is the BlueScreenTimeout. By default, if an ESXi host panics (a PSOD), it sits on the panic screen forever, so you can diagnose the error and so the host doesn’t go back into service until you’ve had a chance to address the problem. But I run these little NUCs headless, and they don’t have IPMI or even vPro. I would have to plug in a monitor and reboot anyway to get at the console, so I would rather the host just reboot so I can access it over the network. For this setting, 0 means never reboot, and any value greater than 0 is the number of seconds to wait before rebooting. I’m going with 30 seconds:
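Both of these advanced settings can be applied from the ESXi shell as well — a sketch of the equivalent esxcli commands:

```shell
# Disable salting so transparent page sharing can de-dupe RAM across VMs again
esxcli system settings advanced set -o /Mem/ShareForceSalting -i 0
# Reboot automatically 30 seconds after a PSOD instead of waiting forever
esxcli system settings advanced set -o /Misc/BlueScreenTimeout -i 30
```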

And finally I will enable SSH and disable the resulting shell warning. I frequently connect to my lab hosts over SSH, so I prefer to leave SSH enabled. Again, this isn’t something you would do in production. It is purely a lab convenience.

For both of the TSM services, I set the policy to “Start and Stop with Host”, then start the service.

The UI will continue to warn that these services are running. This setting disables that warning:
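From the host console, the same result can be sketched with vim-cmd and an advanced setting:

```shell
# Enable (policy: start with host) and start SSH and the ESXi shell
vim-cmd hostsvc/enable_ssh
vim-cmd hostsvc/start_ssh
vim-cmd hostsvc/enable_esx_shell
vim-cmd hostsvc/start_esx_shell
# Suppress the resulting "shell is enabled" UI warning
esxcli system settings advanced set -o /UserVars/SuppressShellWarning -i 1
```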

Networking Options:

The single onboard gigabit NIC may not be enough for every lab scenario, but network connectivity is not limited to it. The Apple Thunderbolt adapter and Apple Thunderbolt NIC are fully functional:

Or you can install the USB Network driver fling and add some USB 3 gigabit NICs, with some caveats around jumbo frame support and a few other things you can read about while you download the driver.

Here’s a screenshot from a NUC I was testing with 4x 1 Gb connectivity, provided by 2 USB 3 adapters, the Apple dongles, and the onboard NIC.

10 Gb is also an option, with a working Thunderbolt 3 adapter and driver. William Lam has been testing some of the 10 Gb options.


The current generation of NUCs are surprisingly capable and configurable little ESXi lab hosts. This is how I build mine, but if you’ve got other ideas share them in the comments.

Deploying the vCenter Server Appliance OVA with Ansible

The goal of this playbook is to deploy the VCSA virtual appliance without any additional user interaction, but the strategy used here will work with any OVA that leverages OVF properties as part of its deployment.

The Ansible module that does the bulk of the work is vmware_deploy_ovf. The outline is pretty straightforward:

  - vmware_deploy_ovf:
      hostname: '{{ esxi_address }}'
      username: '{{ esxi_username }}'
      password: '{{ esxi_password }}'
      name: '{{ vcenter_hostname }}'
      ovf: '{{ vcsa_ova_file }}'
      wait_for_ip_address: true
      validate_certs: no
      inject_ovf_env: true
      properties:
        property: value
        property: value
    delegate_to: localhost

Setting ‘inject_ovf_env’ to ‘true’ will pass the properties to the VM at power on. We just have to know what properties the virtual appliance is expecting. To get those properties, we have to pull apart the OVA and examine its ovf xml file.

In the case of the VCSA, first we need to grab the OVA file. It’s located in the vcsa folder within the VCSA ISO.

Next we need to extract the OVA to get at the ovf file. OVA files are tar archives with a specific set of constraints, so anything that can extract a tar can extract an OVA.
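To illustrate the point, here is a minimal roundtrip with a stand-in file (the names are hypothetical); the same tar commands work on a real OVA:

```shell
# An OVA is a tar archive: package a (dummy) .ovf, then extract it again
mkdir -p /tmp/ova_demo && cd /tmp/ova_demo
printf '<Envelope/>\n' > appliance.ovf
tar -cf appliance.ova appliance.ovf
mkdir -p extracted && tar -xf appliance.ova -C extracted
ls extracted    # the .ovf comes back out of the archive
```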

The OVF file defines the virtual machine(s), deployment options, and properties that make up the virtual appliance. It is really just an xml file, and it can be viewed in any text editor.

The first thing to look for in a OVF file is the DeploymentOptionSection. This section is not always present, but if an OVF supports multiple deployment options they are defined here. In the case of the VCSA, there are a lot of deployment options to choose from. Most of the time for my homelab or nested lab scenarios the one I want is ‘tiny’.

This becomes the first value in the property list in the playbook:

        DeploymentOption.value: 'tiny'

Next move on to the property section of the xml:

The property name is the value of ovf:key, and the default value is defined in ovf:value. Expected values are usually explained by the <description> text, which is what the OVF deployment UI in vCenter would present to a user performing this task interactively. Notice that the value of ovf:userConfigurable is not always true; some values are not intended to be user configurable. Also, some values are not applicable to all deployment options. In the case of the VCSA, several properties are specific to upgrades, so I’ll ignore those.
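A quick way to enumerate the property keys without reading the whole XML is to grep for ovf:key. Shown here against a hypothetical miniature property section rather than the real VCSA file:

```shell
# Stand-in for a real .ovf ProductSection (the real file has many more properties)
cat > /tmp/demo.ovf <<'EOF'
<ProductSection>
  <Property ovf:key="guestinfo.cis.appliance.net.mode" ovf:value="static" ovf:userConfigurable="true"/>
  <Property ovf:key="guestinfo.cis.vmdir.password" ovf:value="" ovf:userConfigurable="true"/>
</ProductSection>
EOF
# List the property keys the appliance expects
grep -o 'ovf:key="[^"]*"' /tmp/demo.ovf | cut -d'"' -f2
```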

The VCSA has one property that has userConfigurable:false that we need to configure anyway to make the deployment fully automatic:

Currently, Ansible allows properties to be configured in the playbook regardless of the userConfigurable setting, at least when the deployment target is an ESXi host. If you ever need to flip userConfigurable to true for some property, the OVF file itself can be edited, but doing so invalidates the hashes in the .mf (manifest) file within the OVA. In this case just delete the .mf file and deploy the edited ovf file instead of the original ova file.

Here are the relevant properties extracted from the .ovf, formatted for the vmware_deploy_ovf module:

      inject_ovf_env: true
      properties:
        DeploymentOption.value: '{{ vcsa_size }}'
        guestinfo.cis.appliance.net.addr.family: 'ipv4'
        guestinfo.cis.appliance.net.mode: 'static'
        guestinfo.cis.appliance.net.addr: '{{ vcenter_address }}'
        guestinfo.cis.appliance.net.pnid: '{{ vcenter_fqdn }}'
        guestinfo.cis.appliance.net.prefix: '{{ net_prefix }}'
        guestinfo.cis.appliance.net.gateway: '{{ net_gateway }}'
        guestinfo.cis.appliance.net.dns.servers: '{{ dns_servers }}'
        guestinfo.cis.appliance.root.passwd: '{{ vcenter_password }}'
        guestinfo.cis.ceip_enabled: "False"
        guestinfo.cis.deployment.autoconfig: 'True'
        guestinfo.cis.vmdir.password: '{{ vcenter_password }}'
        domain: '{{ domain }}'
        searchpath: '{{ searchpath }}'

I’ve substituted Ansible variable names so I can define these in the vars file associated with this playbook.

Once the OVF deploys, the module can wait for an IP address before continuing. Sometimes this is good enough, but the VCSA may take another 20 minutes or so before it is actually usable. To make sure it is at least ready to take API calls before continuing, I will run vmware_about_facts once a minute until it succeeds.

  - name: Wait for vCenter
    vmware_about_facts:
      hostname: '{{ vcenter_address }}'
      username: 'administrator@vsphere.local'
      password: '{{ vcenter_password }}'
      validate_certs: no
    delegate_to: localhost
    retries: 30
    delay: 60
    register: result           
    until: result is succeeded 

In my lab this playbook takes about 20 minutes to run, and the web client takes a few additional minutes to be ready for interactive use.

The complete playbook and vars file for deploying vCenter can be found in this github repo. This was developed on vCenter 6.7U1, but new versions may bring new properties, or existing properties may be renamed. By following the process outlined in this post, you can adapt the playbook as the VCSA evolves over time, or apply this approach to automating the deployment of other virtual appliances in your environment.

How to build a Windows VM from scratch with Ansible

This should be a simple task. I have already created the answer file for an unattended installation, copied it to a virtual floppy image, and obtained a Windows installation ISO. VMware’s Ansible modules look promising, so you would think you could use vsphere_copy to transfer the .iso and .flp to a datastore, use vmware_guest to create the VM, and sit back and wait for WinRM to start responding.

Unfortunately, VMware’s modules have some coverage gaps that prevent this from working. The vsphere_copy module does not work with standalone hosts. In my greenfield deployment scenario, I am building the domain controller before I deploy vCenter, so vCenter is not available yet. But even if it was, the vmware_guest module can’t create a virtual floppy drive. That feature was in the deprecated vsphere_guest module it replaced, but was somehow lost along the way.

To overcome these limitations I had to take a different approach. I realized that I could use the vmware_host_service_manager module to enable SSH in ESXi, then treat it like a Linux host and use some of the available shell commands to copy bits and edit files. It is not the most elegant solution, but until the VMware modules mature it will have to do.

You can find the complete playbook on the madlabber github, but here are some of the highlights:

Putting the ESX login variables in an anchor:

    esxi_login: &esxi_login
      hostname: '{{ esxi_address }}'
      username: '{{ esxi_username }}'
      password: '{{ esxi_password }}'   
      validate_certs: no 

Enabling SSH (starting the TSM-SSH and TSM services):

  - name: Enable ESX SSH (TSM-SSH)
    vmware_host_service_manager:
      <<: *esxi_login
      esxi_hostname: '{{ esxi_address }}'
      service_name: TSM-SSH
      state: present
    delegate_to: localhost
  - name: Enable ESX Shell (TSM)
    vmware_host_service_manager:
      <<: *esxi_login
      esxi_hostname: '{{ esxi_address }}'
      service_name: TSM
      state: present
    delegate_to: localhost

And telling ESXi to go download the bits. I could have used the Ansible copy module to copy over the floppy image, but the ISO is too large to transfer this way. The copy module copies files to the target system’s temp volume first. Filling the temp space on an ESXi host doesn’t end well, and the ISO never makes it to the destination.

  - name: Download the Windows Server ISO
    shell: 'wget -P /vmfs/volumes/{{ esxi_datastore }} {{ windows_iso_url }}'
    args:
      creates: '/vmfs/volumes/{{ esxi_datastore }}/{{ windows_iso }}'
    delegate_to: '{{ esxi_address }}'
  - name: Download the autounattend floppy .flp
    shell: 'wget -P /vmfs/volumes/{{ esxi_datastore }} {{ windows_flp_url }}'
    args:
      creates: '/vmfs/volumes/{{ esxi_datastore }}/{{ windows_flp }}'
    delegate_to: '{{ esxi_address }}'

Now I can create the VM:

  - name: Create a new Server 2016 VM
    vmware_guest:
      <<: *esxi_login
      folder: /
      name: '{{ vm_name }}'
      state: present
      guest_id: windows9Server64Guest
      cdrom:
        type: iso
        iso_path: '[{{ esxi_datastore }}] {{ windows_iso }}'
      disk:
      - size_gb: '{{ vm_disk_gb }}'
        type: thin
        datastore: '{{ esxi_datastore }}'
      hardware:
        memory_mb: '{{ vm_memory_mb }}'
        num_cpus: '{{ vm_num_cpus }}'
        scsi: lsilogicsas
      networks:
      - name: '{{ vm_network }}'
        device_type: e1000
      wait_for_ip_address: no
    delegate_to: localhost
    register: deploy_vm

Notice how there is no floppy drive. Since I can’t create it with the vmware_guest module, I’ll have to edit the vmx file. It’s a little gruesome, but it works. I should be able to clean this up with customvalues in the vmware_guest module, but that doesn’t currently work on a standalone host.

  - name: Adding VMX Entry - floppy0.fileType
    lineinfile:
      path: '/vmfs/volumes/{{ esxi_datastore }}/{{ vm_name }}/{{ vm_name }}.vmx'
      line: 'floppy0.fileType = "file"'
    delegate_to: '{{ esxi_address }}'
  - name: Adding VMX Entry - floppy0.fileName
    lineinfile:
      path: '/vmfs/volumes/{{ esxi_datastore }}/{{ vm_name }}/{{ vm_name }}.vmx'
      line: 'floppy0.fileName = "/vmfs/volumes/{{ esxi_datastore }}/{{ windows_flp }}"'
    delegate_to: '{{ esxi_address }}'
  - name: Removing VMX Entry - floppy0.present = "FALSE"
    lineinfile:
      path: '/vmfs/volumes/{{ esxi_datastore }}/{{ vm_name }}/{{ vm_name }}.vmx'
      line: 'floppy0.present = "FALSE"'
      state: absent
    delegate_to: '{{ esxi_address }}'

One last thing before I can power it on. The default boot sequence won’t work. This one will:

  - name: Change virtual machine's boot order and related parameters
    vmware_guest_boot_manager:
      <<: *esxi_login
      name: '{{ vm_name }}'
      boot_delay: 1000
      enter_bios_setup: False
      boot_retry_enabled: True
      boot_retry_delay: 20000
      boot_firmware: bios
      secure_boot_enabled: False
      boot_order:
        - cdrom
        - disk
        - ethernet
        - floppy
    delegate_to: localhost
    register: vm_boot_order

Now I can power it on, and wait. Once the VMware tools are responding, I can use the vmware_vm_shell module to run commands inside the guest OS to assign a new hostname, set the IP address, etc. In the playbook this is part of a second play called “Customize Guest”.

  - name: Set password via vmware_vm_shell
    vmware_vm_shell:
      <<: *esxi_login
      vm_username: Administrator
      vm_password: '{{ vm_password_old }}'
      vm_id: '{{ vm_name }}'
      vm_shell: 'c:\windows\system32\windowspowershell\v1.0\powershell.exe'
      vm_shell_args: '-command "(net user Administrator {{ vm_password_new }})"'
      wait_for_process: true
    ignore_errors: yes
  - name: Configure IP address via vmware_vm_shell
    vmware_vm_shell:
      <<: *esxi_login
      vm_username: Administrator
      vm_password: '{{ vm_password_new }}'
      vm_id: '{{ vm_name }}'
      vm_shell: 'c:\windows\system32\windowspowershell\v1.0\powershell.exe'
      vm_shell_args: '-command "(new-netipaddress -InterfaceAlias Ethernet0 -IPAddress {{ vm_address }} -prefixlength {{ vm_netmask_cidr }} -defaultgateway {{ vm_gateway }})"'
      wait_for_process: true
  - name: Configure DNS via vmware_vm_shell
    vmware_vm_shell:
      <<: *esxi_login
      vm_username: Administrator
      vm_password: '{{ vm_password_new }}'
      vm_id: '{{ vm_name }}'
      vm_shell: 'c:\windows\system32\windowspowershell\v1.0\powershell.exe'
      vm_shell_args: '-command "(Set-DnsClientServerAddress -InterfaceAlias Ethernet0 -ServerAddresses {{ vm_dns_server }})"'
      wait_for_process: true
  - name: Rename Computer via vmware_vm_shell
    vmware_vm_shell:
      <<: *esxi_login
      vm_username: Administrator
      vm_password: '{{ vm_password_new }}'
      vm_id: '{{ vm_name }}'
      vm_shell: 'c:\windows\system32\windowspowershell\v1.0\powershell.exe'
      vm_shell_args: '-command "(Rename-Computer -NewName {{ vm_name }})"'
      wait_for_process: true

One more reboot and the VM is ready for advanced configuration by another playbook. In the complete playbook and the corresponding example vars file, there are a few extra steps I take to capture and later restore the state of the TSM & TSM-SSH services since most people don’t leave those enabled.