packer - Terraform - Deployment of VM from Template but Need to wait for first boot - Stack Overflow

admin · 2025-04-29

Environment:

  • terraform version = v1.10
  • target platform = vCenter (vSphere) v7.x
  • plugins = vsphere(v2.10)

VM template:

  • OS = Ubuntu 24.04.1 LTS
  • Build process: Packer -> cloud-config -> autoinstall

Packer build process:

During the Packer build, the VM is installed and the final steps are:

  • 1st reboot for cloud-init to complete
  • Packer uses this to validate the VM is accessible via SSH

At this stage, the VM is shut down and either converted to a template or imported into the Content Library. So when a VM is cloned from this template, its first power-on is actually the 2nd time (from the OS's perspective) it has booted.
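Tracking which boot the clone is on requires a counter that survives reboots. A minimal sketch (bump_boot_count is a hypothetical helper name; the key assumption is that the file lives under a persistent path such as /var/lib, not tmpfs-backed /var/run):

```shell
#!/bin/bash
# bump_boot_count: hypothetical helper that increments a persistent
# boot counter and prints the current boot number.
# Assumption: the file is on persistent storage (e.g. /var/lib);
# /var/run is tmpfs and is wiped on every reboot.
bump_boot_count() {
    local f="$1" n=0
    [ -f "$f" ] && n=$(cat "$f")
    n=$((n + 1))
    echo "$n" > "$f"
    echo "$n"
}
```

Called once per boot (e.g. from a oneshot unit), this yields 1 on the Packer validation boot and 2 on the clone's first power-on.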

second-boot-process:

The user-data has a 'runcmd' section which creates a systemd service, 'second-boot.service'. The point of this service is to:

  1. check the boot count
  2. if this is the 2nd boot, disable the service, run the cloud-init clean command, and reboot

This ensures that new VMs cloned from this template have:

  • a newly generated /etc/machine-id
  • newly generated SSH host keys
  • cleaned logs

This logic works great when you deploy manually without any OS customizations.

ISSUE:

When I use this template in a Terraform configuration to deploy a VM using vsphere_virtual_machine.clone.customize.linux_options:

  1. clone operation happens
  2. vm boots for 1st time
  3. terraform tries to apply customizations

ERROR:

vsphere_virtual_machine.hcv_nodes[0]: Still creating... [1m50s elapsed]
vsphere_virtual_machine.hcv_nodes[0]: Still creating... [2m0s elapsed]
╷
│ Error: 
│ Virtual machine customization failed on "/CDADC001/vm/cdavault001-01.domain":
│ 
│ An error occurred while customizing VM cdavault001-01.domain. For details reference the log file /var/log/vmware-imc/toolsDeployPkg.log in the guest OS.

This is the point where second-boot.service reboots the VM after the cloud-init clean command, while Terraform/vSphere is still attempting to apply the linux_options customization.

So I need to either:

  1. have the image perform the cloud-init clean at a later time, while still guaranteeing it runs, or
  2. tell Terraform to wait/sleep before attempting customizations
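For option 1, one sketch is to have the second-boot script wait until guest customization reaches a terminal state before running the clean. The marker text is an assumption inferred from the toolsDeployPkg.log excerpt in this question ("ENTER STATE 'ERRORED'" on failure, presumably 'DONE' on success); verify the exact text on your own image:

```shell
#!/bin/bash
# wait_for_customization: hypothetical helper that polls a log file for a
# terminal guest-customization state line, giving up after N tries.
# Assumption: toolsDeployPkg.log contains "ENTER STATE 'DONE'" (or
# 'ERRORED') once customization finishes -- confirm on your image.
wait_for_customization() {
    local log="$1" tries="${2:-60}" i
    for ((i = 0; i < tries; i++)); do
        if grep -qE "ENTER STATE '(DONE|ERRORED)'" "$log" 2>/dev/null; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# In second-boot-script.sh, before the clean:
#   wait_for_customization /var/log/vmware-imc/toolsDeployPkg.log 120 \
#       && cloud-init clean --logs --machine-id --reboot
```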

user-data (snippet):

user-data:
        # Post-Processing
        # - Execute these on the 2nd boot, only once (poor man's rc.local)
        # logic: since Packer/VMware-ISO has already booted the system once to validate SSH connectivity,
        # this cleanup process needs to run the 1st time the final image boots.
        # This will reset SSH keys, machineID, and cloud-init logs.
        write_files:
            - path: /usr/local/bin/second-boot-script.sh
              permissions: '0755'
              owner: root:root
              content: |
                #!/bin/bash
                # Path to the counter file (must be on persistent storage;
                # /var/run is tmpfs and is wiped on every reboot)
                COUNTER_FILE="/var/lib/boot-counter"

                # Systemd service name
                SERVICE_NAME="second-boot.service"

                # Initialize boot count if the file doesn't exist
                if [ ! -f "$COUNTER_FILE" ]; then
                    echo 1 > "$COUNTER_FILE"
                fi

                # Read the current boot count
                BOOT_COUNT=$(cat "$COUNTER_FILE")

                if [ "$BOOT_COUNT" -eq 1 ]; then
                    echo "Executing second boot commands..."

                    # Place all pre-reboot commands here
                    echo "Running pre-reboot commands..." >> /var/log/second-boot.log
                    # Example: Updating some configurations or running additional setup
                    echo "Pre-reboot tasks completed at $(date)" >> /var/log/second-boot.log

                    # Disable the systemd service to prevent future executions
                    echo "Disabling $SERVICE_NAME to prevent future runs."
                    systemctl disable "$SERVICE_NAME"

                    # Final step: Run cloud-init clean and reboot
                    echo "Running cloud-init clean and rebooting system..." >> /var/log/second-boot.log
                    cloud-init clean --logs --machine-id --configs ssh_config --reboot
                fi

                # Increment the boot count
                BOOT_COUNT=$((BOOT_COUNT + 1))
                echo "$BOOT_COUNT" > "$COUNTER_FILE"

        runcmd:
        - echo "Creating systemd service for second-boot-script.sh"
        - |
            cat <<EOF > /etc/systemd/system/second-boot.service
            [Unit]
            Description=Run a script only on the second boot
            Before=multi-user.target
            Wants=network.target
            After=network.target

            [Service]
            Type=oneshot
            ExecStart=/usr/local/bin/second-boot-script.sh
            RemainAfterExit=yes

            [Install]
            WantedBy=multi-user.target
            EOF
        - echo "Reloading systemd daemon and enabling second-boot service"
        - systemctl daemon-reload
        - systemctl enable second-boot.service
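An alternative to pinning the cleanup at multi-user.target is to drive it from a systemd timer so it fires a fixed interval after boot, giving guest customization time to finish first. A sketch (the write_timer helper and the 5-minute delay are illustrative assumptions, not values from the question):

```shell
#!/bin/bash
# write_timer: hypothetical helper that writes a systemd timer unit which
# starts second-boot.service a fixed interval after boot instead of
# racing it against guest customization during early boot.
write_timer() {
    local dir="$1"
    cat > "$dir/second-boot.timer" <<'EOF'
[Unit]
Description=Delay second-boot cleanup until after guest customization

[Timer]
OnBootSec=5min
Unit=second-boot.service

[Install]
WantedBy=timers.target
EOF
}

# In runcmd, write the timer next to the service, then enable the timer
# rather than the service: systemctl enable second-boot.timer
```

A fixed delay is a blunt instrument: too short and the race remains, too long and first-boot provisioning is held up, so pair it with a completion check where possible.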

main.tf (snippet):

resource "vsphere_virtual_machine" "hcv_nodes" {
  count            = 1
  name             = "cdavault001-0${count.index + 1}.domain"
  resource_pool_id = data.vsphere_compute_cluster.target_compute_cluster.resource_pool_id
  datastore_id     = data.vsphere_datastore.target_datastore.id

  num_cpus = 2
  memory   = 4096

  network_interface {
    network_id = data.vsphere_network.target_vsphere_network.id
    adapter_type = "vmxnet3"
  }

  disk {
    label            = "disk0"
    size             = 50
    eagerly_scrub    = false
    thin_provisioned = true
  }

  clone {
    template_uuid = data.vsphere_content_library_item.target_template.id

    customize {
      linux_options {
        host_name = "cdavault001-0${count.index + 1}"
        domain    = "domain"
      }

      network_interface {
        ipv4_address = "10.10.10.${31 + count.index}"
        ipv4_netmask = 24
      }

      ipv4_gateway = "10.10.10.254"

      dns_server_list = ["10.10.10.1", "8.8.8.8"]
      dns_suffix_list = ["domain", "domain.local"]
    }
  }
}
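On the Terraform side, the vsphere provider's customize block accepts a timeout argument (minutes, default 10) controlling how long it waits for customization to complete; per the provider docs, a zero or negative value skips the customization waiter entirely. Note this only changes how long Terraform waits, it does not stop the in-guest reboot from interrupting customization. A sketch against the resource above:

```hcl
  clone {
    template_uuid = data.vsphere_content_library_item.target_template.id

    customize {
      # Minutes Terraform waits for guest customization (default 10).
      # 0 or negative skips the waiter; widening it does not prevent
      # the in-guest reboot from interrupting customization.
      timeout = 20

      linux_options {
        host_name = "cdavault001-0${count.index + 1}"
        domain    = "domain"
      }
    }
  }
```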

/var/log/vmware-imc/toolsDeployPkg.log

2025-01-07T12:37:36 DEBUG: Command: 'rm -rf /var/lib/dhcp/*' 
2025-01-07T12:37:36 DEBUG: Exit Code: 0 
2025-01-07T12:37:36 DEBUG: Result:  
2025-01-07T12:37:36 DEBUG: Check if command [hostnamectl] is available 
2025-01-07T12:37:36 INFO: Check if hostnamectl is available 
2025-01-07T12:37:36 DEBUG: Command: 'hostnamectl status 2>/tmp/guest.customization.stderr' 
2025-01-07T12:37:36 DEBUG: Exit Code: 1 
2025-01-07T12:37:36 DEBUG: Result:  
2025-01-07T12:37:36 DEBUG: Stderr: Failed to query system properties: Transaction for systemd-hostnamed.service/start is destructive (system-systemd\x2dfsck.slice has 'stop' job queued, but 'start' is included in transaction).
 
2025-01-07T12:37:37 INFO: Check if hostnamectl is available 
2025-01-07T12:37:37 DEBUG: Command: 'hostnamectl status 2>/tmp/guest.customization.stderr' 
2025-01-07T12:37:37 DEBUG: Exit Code: 1 
2025-01-07T12:37:37 DEBUG: Result:  
2025-01-07T12:37:37 DEBUG: Stderr: Failed to query system properties: Transaction for systemd-hostnamed.service/start is destructive (time-set.target has 'stop' job queued, but 'start' is included in transaction).
 
2025-01-07T12:37:38 INFO: Check if hostnamectl is available 
2025-01-07T12:37:38 DEBUG: Command: 'hostnamectl status 2>/tmp/guest.customization.stderr' 
2025-01-07T12:37:38 DEBUG: Exit Code: 1 
2025-01-07T12:37:38 DEBUG: Result:  
2025-01-07T12:37:38 DEBUG: Stderr: Failed to query system properties: Transaction for systemd-hostnamed.service/start is destructive (systemd-reboot.service has 'start' job queued, but 'stop' is included in transaction).
 
2025-01-07T12:37:39 INFO: Check if hostnamectl is available 
2025-01-07T12:37:39 DEBUG: Command: 'hostnamectl status 2>/tmp/guest.customization.stderr' 

=================== Perl script log end =================
[2025-01-07T12:37:39.616Z] [   error] Customization command failed with exitcode: 127, stderr: ''.
[2025-01-07T12:37:39.616Z] [   error] Customization process returned with error.
[2025-01-07T12:37:39.616Z] [   debug] Deployment result = 127.
[2025-01-07T12:37:39.616Z] [    info] Setting 'unknown' error status in vmx.
[2025-01-07T12:37:39.617Z] [    info] Transitioning from state 'INPROGRESS' to state 'ERRORED'.
[2025-01-07T12:37:39.617Z] [    info] ENTER STATE 'ERRORED'.
[2025-01-07T12:37:39.617Z] [    info] EXIT STATE 'INPROGRESS'.
[2025-01-07T12:37:39.617Z] [   debug] Setting deploy error: 'Deployment failed.The forked off process returned error code.'.
[2025-01-07T12:37:39.617Z] [   error] Deployment failed.The forked off process returned error code.
[2025-01-07T12:37:39.617Z] [    info] Launching cleanup.
[2025-01-07T12:37:39.617Z] [   debug] Command to exec : '/bin/rm'.
[2025-01-07T12:37:39.617Z] [    info] sizeof ProcessInternal is 56
[2025-01-07T12:37:39.618Z] [    info] Returning, pending output from stdout
[2025-01-07T12:37:39.618Z] [    info] Returning, pending output from stderr
[2025-01-07T12:37:39.718Z] [    info] Process exited normally after 0 seconds, returned 0
[2025-01-07T12:37:39.718Z] [    info] No more output from stdout
[2025-01-07T12:37:39.719Z] [    info] No more output from stderr
[2025-01-07T12:37:39.719Z] [    info] Customization command output:
''.
[2025-01-07T12:37:39.719Z] [    info] sSkipReboot: 'false', forceSkipReboot 'false'.
[2025-01-07T12:37:39.719Z] [   error] Deploy error: 'Deployment failed.The forked off process returned error code.'.
[2025-01-07T12:37:39.719Z] [   error] Package deploy failed in DeployPkg_DeployPackageFromFile
[2025-01-07T12:37:39.719Z] [   debug] ## Closing log
