On Linux, the primary method for obtaining debugging information about a serious error or fault is the kdump mechanism. Kdump captures a wealth of kernel and machine state and writes it to a file for post-mortem debugging. But if kdump writes to a file on a remote server and networking is down, kdump cannot work.

In this context, networking includes the guest’s network driver and stack, the host’s network driver(s), and the network hardware both on the host and in the surrounding data center.

What is Linux Pstore?

Linux provides a persistent storage file system that can store error records when the kernel dies (or reboots or powers off). These records, in turn, can be referenced to debug kernel problems (currently, the kernel stuffs the tail of the dmesg, which also contains a stack backtrace, into pstore).

The pstore is backed by local non-volatile memory and presented to the running system via traditional filesystem interfaces. Since it uses local non-volatile memory, pstore works even when kdump cannot.


Pstore was introduced into Linux to record information (e.g., the dmesg tail) upon panics and shutdowns. Pstore is independent of kdump and can run before it. In specific scenarios (i.e., hosts/guests with root filesystems on NFS/iSCSI where networking software and/or hardware has failed), pstore may contain information for post-mortem debugging that is not otherwise captured.

pstore is a persistent storage driver. It saves data in a reserved part of memory, which can then be read from a working kernel. Its primary purpose is to save kernel crash logs to memory.


Ramoops is an oops/panic logger that writes its logs to RAM before the system crashes. It works by logging oopses and panics in a circular buffer. Ramoops needs a system with persistent RAM so that the content of that area can survive after a restart.
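The circular-buffer behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not how ramoops is implemented in the kernel; the class name `OopsRing` and the three-slot size are made up for the illustration.

```python
from collections import deque


class OopsRing:
    """Toy model of a fixed-size circular log: when full, the oldest record is overwritten."""

    def __init__(self, slots):
        # deque(maxlen=...) silently discards the oldest item once the limit is reached
        self._records = deque(maxlen=slots)

    def log(self, record):
        self._records.append(record)

    def dump(self):
        # Oldest surviving record first
        return list(self._records)


ring = OopsRing(slots=3)
for n in range(1, 6):
    ring.log(f"oops {n}")

print(ring.dump())  # ['oops 3', 'oops 4', 'oops 5']
```

With three slots and five logged records, only the three most recent ones survive, which is why the reserved memory region needs to be sized for the amount of history you want to keep.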

Pstore & Ramoops Setup

Kernel Configuration
To enable pstore and use the ramoops backend, make sure the following kernel options are set:


They can be found under File systems > Miscellaneous filesystems > Persistent store support.

On older kernels without CONFIG_PSTORE_CONSOLE and CONFIG_PSTORE_RAM, you will need to enable


Configuring Ramoops
Make sure that the memory address you reserve and its size are the same in your downstream and mainline configuration, or else you won’t be able to read the information.

Now you can mount the pstore filesystem:

$ mount -t pstore -o kmsg_bytes=16000 - /sys/fs/pstore
$ ls -l /sys/fs/pstore/
total 0
-r--r--r-- 1 root root 7896 Nov 30 15:38 dmesg-erst-1

Different users of this interface will result in different filename prefixes. Currently, two are defined:

  • “dmesg” – saved console log
  • “mce” – architecture-dependent data from fatal h/w error

Once the information in a file has been read, removing the file will signal to the underlying persistent storage device that it can reclaim the space for later re-use:

  $ rm /sys/fs/pstore/dmesg-erst-1

The expectation is that all files in /sys/fs/pstore/ will be saved elsewhere and erased from the persistent store soon after boot to free up space ready for the next catastrophe.
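A minimal sketch of that "save elsewhere, then erase" routine is shown here. The function name `archive_pstore` and the archive directory in the usage note are assumptions for illustration; on most systems systemd-pstore already does this job.

```python
import shutil
from pathlib import Path


def archive_pstore(src, dest):
    """Copy every regular file from src into dest, then delete the original.

    Returns the names of the files that were archived.
    """
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for entry in sorted(src.iterdir()):
        if entry.is_file():
            shutil.copyfile(entry, dest / entry.name)
            # On a real pstore mount, removing the file lets the backend reclaim the space
            entry.unlink()
            saved.append(entry.name)
    return saved
```

It could be called as `archive_pstore(Path("/sys/fs/pstore"), Path("/var/tmp/pstore-archive"))`, where the destination path is a hypothetical choice, and it needs enough privileges to read and remove the pstore files.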

The ‘kmsg_bytes’ mount option changes the target amount of data saved on each oops/panic. Pstore saves (possibly multiple) files based on the record size of the underlying persistent storage until at least this amount is reached. Default is 10 Kbytes.
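As a rough illustration of the "at least this amount" rule, the number of files written per event can be estimated by rounding up. The 4096-byte record size below is an assumed value for the example; the real record size depends on the backend.

```python
def records_needed(kmsg_bytes, record_size):
    """Smallest number of records whose combined size reaches kmsg_bytes."""
    return -(-kmsg_bytes // record_size)  # integer ceiling division


print(records_needed(16000, 4096))  # 4
print(records_needed(10240, 4096))  # 3
```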

Pstore only supports one backend at a time. If multiple backends are available, the preferred backend may be set by passing the pstore.backend= argument to the kernel at boot time.


The behavior of systemd-pstore is configured through the configuration file /etc/systemd/pstore.conf and corresponding snippets in /etc/systemd/pstore.conf.d/*.conf; see pstore.conf(5).

Disabling pstore processing

To disable pstore processing by systemd-pstore, set


For example, if the Linux kernel dies, the dmesg tail is written to pstore.

[Image: Linux Kernel]

If the pstore backend were UEFI, it may look more like the following:

Understanding the pstore backend

The dmesg tail is fragmented (based on the underlying storage exchange buffer size) into several error records, which are presented as files and can be re-assembled. Of course, the most important thing is that the dmesg tail, and thus the kernel panic call trace, has been captured to determine where things went badly wrong.

The size of the dmesg tail is tuneable via CONFIG_PSTORE_DEFAULT_KMSG_BYTES and is 10KiB by default.

As the local non-volatile storage tends to be small, typically tens of kilobytes, Oracle provided the systemd-pstore service to help manage the pstore space. In short, upon boot (or when systemd-pstore is re/started), it archives the contents of the pstore to other storage (e.g., the regular filesystem), thus preserving the existing information and clearing pstore for future error events. Oh, and systemd-pstore will re-assemble the dmesg too!

Systemd-pstore first appeared in systemd v243 and is present and enabled by default in OL7.9, OL8.2, and newer. It can be re-run by issuing:

systemctl restart systemd-pstore

You can find the archive of past pstore contents under /var/lib/systemd/pstore, for example:

[Image: Past Pstore Contents]

Here, dmesg.txt is re-assembled from the dmesg tail fragments in the dmesg-efi-155741337* files.

Pstore Enablement

To enable Linux pstore, which UEK does by default, ensure the following kernel configuration options are set.

[Image: Enable Linux Pstore]

With the above, pstore is enabled in the kernel and the ACPI ERST and UEFI storage backends, if present on the machine, are available.
The selection of the pstore backend is done at kernel boot time. By default, ACPI ERST is selected as the storage backend, and is preferred as it was designed for this function.

The UEFI backend is disabled by default, and to use the UEFI backend it must be explicitly selected at kernel boot time.
For UEK5 era kernels, the following kernel parameter is used to select the UEFI pstore backend:

[Image: UEK5 Era Kernels]

However, for UEK6 (5.4) era kernels, an additional kernel parameter is needed:


Without this kernel parameter, the EFI backend is never attempted.
To see which backend is active, you can inquire with:

# cat /sys/module/pstore/parameters/backend
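The same check can be scripted. The sketch below reads the sysfs file named above and returns `None` when the file is missing or unreadable; treating the literal text `(null)` as "no backend" is an assumption about how the kernel renders an unset string parameter.

```python
from pathlib import Path

BACKEND_PARAM = Path("/sys/module/pstore/parameters/backend")


def active_backend(param=BACKEND_PARAM):
    """Return the active pstore backend name, or None if it cannot be determined."""
    try:
        value = param.read_text().strip()
    except OSError:
        # pstore not built or loaded, or insufficient permissions
        return None
    if not value or value == "(null)":  # assumption: an unset parameter reads as "(null)"
        return None
    return value


print(active_backend())
```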

Pstore Kernel Parameters

Two kernel parameters impact the writing of data into pstore.
The printk.always_kmsg_dump parameter makes the kernel write to pstore at shutdown or reboot.
The crash_kexec_post_notifiers parameter enables writing to pstore before kdump is attempted. Do be aware of the kernel documentation warning for this parameter.

These parameters can be passed at kernel boot time or set via the sysfs interface.

echo Y > /sys/module/printk/parameters/always_kmsg_dump

echo Y > /sys/module/kernel/parameters/crash_kexec_post_notifiers

To persist a change to these settings, un-comment the appropriate line(s) in /usr/lib/tmpfiles.d/systemd-pstore.conf:

#w /sys/module/printk/parameters/always_kmsg_dump - - - - Y
#w /sys/module/kernel/parameters/crash_kexec_post_notifiers - - - - Y

At next reboot, systemd will process this file and apply the changes.
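If you prefer to script the un-commenting step, a small text transformation is enough. The sketch below works on the file contents as a string and only touches lines for the parameters you name; the function name `enable_pstore_params` is hypothetical, and writing the result back to the file is left to the caller.

```python
def enable_pstore_params(text, params):
    """Un-comment '#w <sysfs path> ...' lines whose path ends in one of the given parameter names."""
    out = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "#w":
            name = fields[1].rsplit("/", 1)[-1]
            if name in params:
                line = line[1:]  # drop the leading '#'
        out.append(line)
    return "\n".join(out) + "\n"


sample = (
    "#w /sys/module/printk/parameters/always_kmsg_dump - - - - Y\n"
    "#w /sys/module/kernel/parameters/crash_kexec_post_notifiers - - - - Y\n"
)
print(enable_pstore_params(sample, {"always_kmsg_dump"}), end="")
```

In this example only the always_kmsg_dump line loses its leading `#`; the crash_kexec_post_notifiers line stays commented out.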


By enabling pstore to run prior to kdump, you are assured of capturing the kernel call backtrace, even under challenging scenarios. With that dmesg tail, and kernel call trace in hand, you are one step closer to finding the cause of the panic!

Symantec’s Veritas Volume Manager (VxVM) is a storage management subsystem that allows you to manage physical disks as logical devices called volumes. This blog will outline and dive deeper into the components of a Veritas Volume Manager.

Why VxVM?

Linux already has a volume manager called “LVM,” which comes with the OS installation by default.
Logical volumes are an alternate method of partitioning hard drive space. The capability has been built into the Linux kernel since 1999.

Linux and Partitions

Windows and OS X assume that the hard drive is a single monolithic partition. Linux assumes that a hard drive will be partitioned as part of the basic operating system installation, with specific partitions called “/var,” “/usr,” “/tmp,” “/home” and “/boot.” The “/boot” partition is where your operating system lives, while the other partitions hold applications, spool files and error logs, temporary data, and user data. While Linux can run on a single partition, additional partitions improve system performance.
In the following sections, I will point out the features of the LVM.

Dynamic Volume Resizing

Logical volumes allow you to leave unpartitioned space on the hard disk and add it to a specific partition as needed without having to back up your data and reformat the hard drive. This allocation of unpartitioned space can be done dynamically through both the command line and graphical user interfaces without rebooting the computer.

Spanning Volumes on Multiple Disks

When you use logical volumes, you can assign multiple physical disks to the same logical volume, which means that all those disks are seen as one partition from the user’s perspective.

Shrinking Volume Sizes

While logical volumes are great for adding unpartitioned disk space to a specific volume, the reverse is not true. Shrinking a logical volume to reallocate its disk space somewhere else is risky and can result in data loss. You should back up your data and migrate to a new, larger disk rather than attempt to shrink a volume.

Disaster Recovery

While spanning a logical volume across several disks is one of the “killer features” on LVM, the loss of a single disk in a logical volume can render the entire volume unusable. Therefore, if you’re going to use LVM, make extensive and regular backups of the entire volume.


Drawbacks of LVM

The main disadvantage of LVM is that it adds another layer to the storage system. While the overhead of LVM is usually small, any decrease in performance can be critical on busy systems. Many users have reported significant performance issues when creating snapshots, limiting their production system use.

Veritas Volume Terminology

LVM and VxVM

So, to avoid the above drawback, we go for VxVM. VxVM allows a system administrator to configure various volume layouts, allowing redundancy and higher performance than LVM. Several Volume Manager objects must be understood before you can use the Volume Manager to perform disk management tasks:

VxVM uses two types of objects to handle storage management: physical objects and virtual objects.

  • Physical objects – Physical disks, LUNs from storage, or other hardware with block and raw operating system device interfaces that are used to store data.
  • Virtual objects – When one or more physical disks are brought under the control of VxVM, it creates virtual objects called volumes on those physical disks.
  • VM disks – A VM disk is a contiguous area of disk space from which the Volume Manager allocates storage. It is simply the public region of the disk.
  • Disk groups – A disk group is a collection of VM disks that share a common configuration.
  • Subdisks – A subdisk is a set of contiguous disk blocks. A VM disk can be divided into one or more subdisks.
  • Plexes – The Volume Manager uses subdisks to build virtual entities called plexes. A plex consists of one or more subdisks located on one or more disks.
  • Volumes – A volume is a virtual disk device that appears to applications, databases, and file systems like a physical disk partition, but it does not have the physical limitations of a physical disk partition. Each volume records and retrieves data from one or more physical disks. Volumes are accessed by file systems, databases, or other applications in the same way physical disks are accessed. Volumes are also composed of other virtual objects (plexes and subdisks) used to change the volume configuration. Volumes and their virtual components are called virtual objects or VxVM objects.

There are lots of other features you can add to VxVM easily: database snapshotting, dirty region logging for fast resync, remote site replication for DR, clustering, etc.
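To make the relationship between these objects concrete, here is a small data-model sketch. It is not a VxVM API: the class names mirror the terms above, the object names and sizes are invented, and it assumes a concatenated layout inside each plex with each plex acting as a full copy (mirror) of the volume.

```python
from dataclasses import dataclass, field


@dataclass
class Subdisk:
    name: str
    vm_disk: str    # the VM disk these contiguous blocks come from
    length_mb: int


@dataclass
class Plex:
    name: str
    subdisks: list = field(default_factory=list)

    def size_mb(self):
        # Assumption: concatenated layout, so subdisk lengths simply add up
        return sum(sd.length_mb for sd in self.subdisks)


@dataclass
class Volume:
    name: str
    plexes: list = field(default_factory=list)

    def size_mb(self):
        # Assumption: each plex is a full copy, so the smallest plex bounds the usable size
        return min(p.size_mb() for p in self.plexes)


vol = Volume("vol1", [
    Plex("vol1-01", [Subdisk("disk01-01", "disk01", 60), Subdisk("disk02-01", "disk02", 40)]),
    Plex("vol1-02", [Subdisk("disk03-01", "disk03", 100)]),
])
print(vol.size_mb())  # 100
```

The first plex spans two VM disks and the second sits on a single disk, yet the application only ever sees one 100 MB volume.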

Installation Process on a Linux Server

1. Download the .sh file: https://sort.veritas.com/data_collectors/download

2. Extract the installer package and run the installer:

[Image: Extract Installer Package]

NOTE: If you get the error “Cannot find perl to execute”, move the “bin” and “lib” folders from perl/RHEL6x8664/ to the perl directory. The installer looks for the Perl files directly under the perl folder, but they are placed under perl/RHEL6x8664.

[Image: Move bin lib folder]

3. Install the Veritas Volume Manager on Linux. You will see a prompt with many options, as shown in the image.

[Image: Install a Product]

i) Press “I” and “Enter” to select “Install a Product”

[Image: Symantec SF/HA]

ii) Press “Y” to accept the Terms and Conditions

[Image: Basic Foundation 6.2]

iii) Press “2” to install the recommended RPMs

Enter the system name (enter the hostname here). Once you enter the system name, the installer checks the system and reports the status in the last column. Sometimes it may report a failure because of missing RPMs, in which case it offers another option to install them via the YUM utility.

NOTE: If you have a YUM server, select “1” to install the missing required RPMs with yum. If you still get errors reporting that RPMs are missing and failed to install with yum, try the solution below. Sometimes the installation fails because of keys, and the example below may help correct it.

Start the installer again and check. If that solution didn’t help, try to download the RPMs manually and rerun the installer. Once complete, press “Enter” to continue.

[Image: Download RPMS]

After a few minutes, you should see the message that the Symantec Storage Foundation Basic installation is complete, as shown in the image below.

[Image: Installation completed]

Create Volumes and File Systems on RHEL7

1. Identify the correct disks using vxdisk. Ensure that the disk is detected and not mounted by checking with the fdisk and df -h commands. Then execute the command below to list the available disks under Veritas Volume Manager.

[Image: Identify the correct disks]

If you see the status “online invalid,” these disks are yet to be added to Veritas Volume Manager. Be careful: even mounted disks will show as invalid under Veritas Volume Manager, because VxVM has not initialized them.

NOTE: Sometimes the device column is shown in “Enclosure Based Names” format, for example:

[Image: Enclosure Based Names]

So, change the naming convention to Operating System Based Names to help identify the correct disk.

2. Change the “Enclosure Based Names” to “Operating System Based Names”

[Image: Change Enclosure Based Names]

To revert to “Enclosure Based Names”, use the command below:

[Image: Revert back to Enclosure Based Names]

Once you have identified the correct disks, initialize them using the vxdisksetup command.

For example, let’s take the sdb and sdc disks.

[Image: vxdisksetup command]

A disk whose status shows as online has been initialized and belongs to VxVM.

3. Create a disk group and add the new disks. Disk groups are similar to volume groups in LVM, so create a disk group called “testdg” and add the identified disks to it.

[Image: Testdg Disk Group]

Check the disk group properties.

Add a new Disk

Let’s assume that we have a new disk “sdd,” which needs to be added to the existing disk group “testdg”. So, we will see how to add a new disk into the existing Disk group. Initialize and add the disk as shown below.

[Image: Initialize and add the disk]

4. Create a volume on the Disk group: Let’s create a volume of 100MB within the disk group.

List the volume details using the vxlist command.

[Image: List volume details]

Note: If you get an error when you use the “vxlist” command, as shown below:

[Image: Use vxlist command]

Start the script below:

[Image: Below Script]

5. Create a File system on the volume.

[Image: Test the Volume]

Where /dev/vx/rdsk/testdg/vol1 is the raw device file for volume vol1.

6. Mount the file system using the mount command.

[Image: Mount the file system]

Verify the mounted file system using the mount and df commands.
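Since the original screenshots are not reproduced here, the following sketch lists what the command sequence for steps 1 to 6 typically looks like. It only builds and prints the command lines; nothing is executed. The disk media names (testdg01, testdg02) and the mount point /mnt/vol1 are assumptions, and the exact syntax should be checked against the vxdisksetup, vxdg, and vxassist manual pages for your release.

```python
def vxvm_setup_commands(disks, dg="testdg", vol="vol1", size="100m", mount_point="/mnt/vol1"):
    """Build (but do not run) the commands to initialize disks, create a disk group,
    create a volume, make a VxFS file system on it, and mount it."""
    # Initialize each disk for VxVM use
    cmds = [["vxdisksetup", "-i", d] for d in disks]
    # Disk media names such as testdg01=sdb are hypothetical naming choices
    media = [f"{dg}{i:02d}={d}" for i, d in enumerate(disks, start=1)]
    cmds.append(["vxdg", "init", dg, *media])
    cmds.append(["vxassist", "-g", dg, "make", vol, size])
    # mkfs uses the raw device; mount uses the block device
    cmds.append(["mkfs", "-t", "vxfs", f"/dev/vx/rdsk/{dg}/{vol}"])
    cmds.append(["mount", "-t", "vxfs", f"/dev/vx/dsk/{dg}/{vol}", mount_point])
    return cmds


for cmd in vxvm_setup_commands(["sdb", "sdc"]):
    print(" ".join(cmd))
```

Keeping the commands as argument lists makes it easy to review them first and, once verified, pass each one to `subprocess.run` on a test system.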

Online Resizing:

The “vxassist” and “vxresize” commands can resize a volume, that is, increase or reduce its size. Two keywords can be used with the “vxassist” command to find the space available in a disk group and so learn how far a volume can be increased or extended.

  • maxsize – Use this option to find the maximum size to which a new volume can be created with the free space available.
  • maxgrow – Use this option to find the size to which an existing volume can be grown.

1. How to find the total Disk group size in Veritas Volume Manager:

[Image: Find total disk group size]

The above output shows that the mytestdg disk group is initialized with the disk “sdb”, that the disk size is 0.91G, and that the size of the volume “testvol1” is 0.48G.

2. How to find the maximum free size available in the disk group to extend or increase.

[Image: Find maximum free size available]

Here mytestdg is our disk group name. The above output shows that we have 441 MB of free space, which can be used to create a new volume or extend the existing volume.

3. How to find the maximum size of an existing volume

[Image: Test the size of existing volume]

The current size of the volume is 0.48G, as found in step 1. The output above shows that the volume can be increased to 941 MB.

4. How to increase the volume size or extend the volume size in Veritas Volume Manager

Option 1: To extend the volume to a specific size, such as 500 MB, use the growto option, so your total volume size will be 500 MB.

[Image: Extend to increase volume size]

Option 2: To extend the volume by a specific size, such as 800 MB, use the growby option, so the specified size is added to your total volume size.

[Image: Add total volume size to specific size]

Resize the mounted volume on the fly without unmounting.

[Image: Resize the mounted volume size]

5. How to decrease the volume size or reduce the volume size.

Option 1: To reduce the volume to a specific size, such as 500 MB, use the shrinkto option, so your total volume size will be 500 MB.

[Image: Reduce or decrease the volume]

Option 2: To reduce the volume by a specific size, such as 800 MB, use the shrinkby option, so the specified size is subtracted from your total volume size.

[Image: Use the shrink option]

To resize the mounted volume on the fly without unmounting, use the command shown in the image below:

[Image: resize mounted volume size]
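The difference between the "to" and "by" variants is easy to mix up, so here is the arithmetic spelled out as a small sketch. The function only computes the resulting size; it does not call vxassist, and the starting sizes are made-up values.

```python
def resized_mb(current_mb, operation, amount_mb):
    """Return the volume size after growto/growby/shrinkto/shrinkby."""
    if operation in ("growto", "shrinkto"):
        new_size = amount_mb               # absolute target size
    elif operation == "growby":
        new_size = current_mb + amount_mb  # relative increase
    elif operation == "shrinkby":
        new_size = current_mb - amount_mb  # relative decrease
    else:
        raise ValueError(f"unknown operation: {operation}")

    if operation.startswith("grow") and new_size < current_mb:
        raise ValueError("a grow operation cannot make the volume smaller")
    if operation.startswith("shrink") and new_size > current_mb:
        raise ValueError("a shrink operation cannot make the volume larger")
    if new_size <= 0:
        raise ValueError("resulting size must be positive")
    return new_size


print(resized_mb(400, "growto", 500))     # 500
print(resized_mb(500, "growby", 800))     # 1300
print(resized_mb(1300, "shrinkto", 500))  # 500
print(resized_mb(1300, "shrinkby", 800))  # 500
```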

Key Points:

  • It is available for Windows, AIX, Solaris, Linux, and HP-UX. A modified version is bundled with HP-UX as its built-in volume manager.
  • The latest version is Veritas Volume Manager 7.4.1 (release date for Windows: February 2019).
  • VxVM supports a cluster file system with CFS and Oracle RAC.
  • Base VxVM allows you to mirror your boot disk without any additional license, but only for the root disk group (rootdg).
  • VxVM allows you to stripe and mirror and to convert between layered and non-layered layouts while preserving the data.


  • VxVM’s main disadvantage is its cost.
  • Additional licenses must be paid for every little feature.

Veritas Volume Manager provides manageability, availability, and performance enhancements for enterprise computing environments. It has benefits like Disk spanning, Load balancing, Complex multidisk configurations, Online administration, and High availability. I hope that this blog has been helpful!

A DoD client requested support with automated file transfers. The client has files placed in a common folder that can be accessed by the standard File Transfer Protocol (FTP). Given the FTP server’s connection information, the client requested that the files be moved to an Amazon Web Services (AWS) S3 bucket that their analysis tools are configured to use.

Automating the download and upload process would save users time by allowing a scheduled process to transfer data files. This can be achieved using a combination of AWS Lambda and EC2 services. AWS Lambda provides a plethora of triggering and scheduling options and the power to create EC2 instances. By creating an EC2 instance, a program or script can avoid Lambda’s limitations and perform programmatic tasks such as downloading and uploading. Additionally, this can be done using Terraform to allow for deployment in any AWS environment.

Writing a Script to Do the Work

Before using Terraform or the AWS Console, create a script that can log in to the FTP server, download files, and copy them to an S3 bucket. This can be done effectively with Python’s built-in ftplib and the AWS boto3 library. There are various libraries and examples online that show how to set up a Python script to download files from an FTP server and use the boto3 library to copy them to S3.

When writing the script, keep in mind that file size plays a significant role in how ftplib and boto3’s copy functions work. Anything over 5GB will need to be downloaded from the FTP server in chunks and uploaded with the multipart upload methods of the AWS API.
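To illustrate the size threshold, the sketch below decides between a single upload and a multipart upload and estimates the number of parts. The 64 MiB default part size and the function name `plan_upload` are arbitrary choices for the example; the limits used (5 GiB for a single PUT, parts of at least 5 MiB, at most 10,000 parts) are the documented S3 limits.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

SINGLE_PUT_LIMIT = 5 * GIB   # largest object a single PUT can carry
MIN_PART_SIZE = 5 * MIB      # smallest allowed part (except the last one)
MAX_PARTS = 10_000           # maximum number of parts per multipart upload


def ceil_div(a, b):
    return -(-a // b)


def plan_upload(size_bytes, part_size=64 * MIB):
    """Return ("single", 1) or ("multipart", number_of_parts) for a file of size_bytes."""
    if size_bytes <= SINGLE_PUT_LIMIT:
        return ("single", 1)
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size is below the 5 MiB minimum")
    # Grow the part size if the file would otherwise need more than 10,000 parts
    part_size = max(part_size, ceil_div(size_bytes, MAX_PARTS))
    return ("multipart", ceil_div(size_bytes, part_size))


print(plan_upload(2 * GIB))   # ('single', 1)
print(plan_upload(12 * GIB))  # ('multipart', 192)
```

In practice, boto3's managed transfer helpers such as `upload_file` and `upload_fileobj` switch to multipart uploads on their own for large files, so a calculation like this is mainly useful for estimating work or tuning the part size.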

Creating an Instance with Our Script Loaded

Amazon provides Amazon Machine Images (AMIs) to start up a basic instance. The provided Linux x86 AMI is the perfect starting place for creating a custom instance and, eventually, a custom AMI.

With Terraform, creating an instance is like creating any other module, requiring Identity and Access Management (IAM) permissions, security group settings, and other configuration settings. The following shows the items needed to make an EC2 instance with a key pair and permissions to write to S3, install Python 3.8 and libraries, and copy the script that does the file transfer into the ec2-user directory.

First, generate a key pair: a private key and a public key used to prove identity when connecting to an instance. The benefit of creating the key pair in the AWS Console is access to the generated .pem file. Having a local copy allows connecting to the instance via the command line, which is great for debugging but not for deployment. Terraform can generate a key pair and keep it in its state, which avoids passing sensitive information around.

# Generate a ssh key that lives in terraform
# https://registry.terraform.io/providers/hashicorp/tls/latest/docs/resources/private_key
resource "tls_private_key" "instance_private_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "instance_key_pair" {
  key_name   = "${var.key_name}"
  public_key = "${tls_private_key.instance_private_key.public_key_openssh}"
}

To set up the security group that the instance runs in, open the port for Secure Shell (SSH) and Secure Copy Protocol (SCP) so the script file(s) can be copied to the instance. A security group acts as a virtual firewall for your EC2 instances to control incoming and outgoing traffic. Then open other ports for ingress and egress as needed, e.g., 443 for HTTPS traffic. The security group requires the vpc_id for your project; this is the Virtual Private Cloud (VPC) in which the instance will run. The security group should match your VPC settings.

resource "aws_security_group" "instance_sg" {
  name   = "allow-all-sg"
  vpc_id = "${var.vpc_id}"
  ingress {
    description = "ssh/scp port"
    cidr_blocks = [""] # CIDR range is blank in the source; supply your own
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
  }
}

The IAM policy for the instance requires PutObject access to the S3 bucket. The Terraform module needs the S3 bucket name as an input variable, and an instance profile must be created. When an IAM role is created in the AWS Console, an instance profile is created automatically, but in Terraform it has to be defined explicitly.

#iam instance profile setup
resource "aws_iam_role" "instance_s3_access_iam_role" {
  name               = "instance_s3_access_iam_role"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

# The action list is missing from the source; s3:PutObject is the action named
# in the text above. Object-level actions need the "/*" object ARN.
resource "aws_iam_policy" "iam_policy_for_ftp_to_s3_instance" {
  name = "ftp_to_s3_access_policy"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::${var.s3_bucket}/*"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy_attachment" "ftp_to_s3" {
  role       = aws_iam_role.instance_s3_access_iam_role.name
  policy_arn = aws_iam_policy.iam_policy_for_ftp_to_s3_instance.arn
}

resource "aws_iam_instance_profile" "ftp_to_s3_instance_profile" {
  name = "ftp_to_s3_instance_profile"
  role = aws_iam_role.instance_s3_access_iam_role.name
}

Defining the instance to start from, and from which to create the custom AMI, requires the following variables in Terraform:

  • ami – the ID of the Linux x86 AMI
  • instance_type – the type of instance, e.g., t2.micro
  • subnet_id – the ID of the subnet in the VPC where the instance will run
  • key_name – the name of the key; it should match the key-pair name generated above or the one from the AWS Console (a variable reference works here too)

Define the connection and provisioner attributes to copy the Python script that does the file transfer to the ec2-user home folder. The connection uses the default ec2-user with the secure key and then copies over the Python file. If using the key downloaded from the AWS Console, point to the file with private_key = "${file("path/to/key-pair-file.pem")}".

Complete the instance setup with the correct Python version and libraries. The user_data attribute sends a bash script to install whatever is needed, in this case updating Python to 3.8 and installing the boto3 and paramiko libraries.

# Instance that we want to build out
resource "aws_instance" "ftp-to-s3-instance" {
  ami           = var.ami
  instance_type = var.instance_type
  subnet_id     = var.subnet_id
  key_name      = "${var.key_name}" #use your own key for testing
  # Instances launched into a VPC subnet take security group IDs here
  vpc_security_group_ids = ["${aws_security_group.instance_sg.id}"]
  iam_instance_profile   = "${aws_iam_instance_profile.ftp_to_s3_instance_profile.id}"

  # Copies the python file to /home/ec2-user
  # depending on how the install of python works we may need to change this location
  connection {
    type        = "ssh"
    user        = "ec2-user"
    host        = self.public_ip # a resource must refer to itself through "self"
    private_key = "${tls_private_key.instance_private_key.private_key_pem}"
  }

  provisioner "file" {
    source      = "${path.module}/ftp_to_s3.py"
    destination = "/home/ec2-user/ftp_to_s3.py"
  }

  user_data = <<EOF
#!/bin/bash
sudo amazon-linux-extras install -y python3.8
python3.8 -m pip install -U pip
pip3.8 --version
pip3.8 install boto3
pip3.8 install paramiko
EOF
}


The last step is to create the custom AMI. This will allow our Lambda to duplicate and launch as many of these instances as needed.

resource "aws_ami_from_instance" "ftp-to-s3-ami" {
  name               = "ftp-to-s3_ami"
  description        = "ftp transfer to s3 bucket python 3.8 script"
  source_instance_id = "${aws_instance.ftp-to-s3-instance.id}"

  depends_on = [aws_instance.ftp-to-s3-instance]

  tags = {
    Name = "ftp-to-s3-ami"
  }
}

Creating Instances on the Fly in Lambda

Using a Lambda function, which can be triggered in various ways, is a straightforward way to launch EC2 instances. The following Python code shows environment variables being passed in for use in the EC2 instance, both as environment variables in the instance and as arguments to the Python script. The variables needed in the Python script for this example are as follows:

  • FTP_HOST – the URL of the FTP server
  • FTP_PATH – the path to the files on the FTP server
  • FTP_USERNAME, FTP_PASSWORD, FTP_AUTH_KEY – used for any authentication with the FTP server
  • S3_BUCKET_NAME – the name of the bucket for the files
  • S3_PATH – the folder or path in the S3 bucket that files should be copied to
  • files_to_download – for this purpose, a Python list of dictionary objects with the filename and size of each file to download

For this example, the logic for checking for duplicate files is done before the Lambda that launches the transfer instance is called. This allows the script in the instance to remain singularly focused on downloading and uploading. It is important to note that the files_to_download variable is converted to a string, and the quotes are made into double quotes. Not doing this will make the single quotes disappear when passing to the EC2 instance.

The init_script variable will use the passed-in event variables to set up the environment variables and python script arguments. Just like when creating the instance, the user_data script is run by the instance’s root user. The root user will need to use the ec2-user’s python to run our script with the following bash command: PYTHONUSERBASE=/home/ec2-user/.local python3.8 /home/ec2-user/ftp_to_s3.py {s3_path} {files_to_download}.

    # convert to string with double quotes so it knows its a string
    files_to_download = ",".join(map('"{0}"'.format, files_to_download))
    vars = {
        "FTP_HOST": event["ftp_url"],
        "FTP_PATH": event["ftp_path"],
        "FTP_USERNAME": event["username"],
        "FTP_PASSWORD": event["password"],
        "FTP_AUTH_KEY": event["auth_key"],
        "S3_BUCKET_NAME": event["s3_bucket"],
        "files_to_download": files_to_download,
        "S3_PATH": event["s3_path"],
    }

    init_script = """#!/bin/bash
                /bin/echo "**************************"
                /bin/echo "* Running FTP to S3.     *"
                /bin/echo "**************************"
                export S3_BUCKET_NAME={S3_BUCKET_NAME}
                export PRODUCTS_TABLE={PRODUCTS_TABLE}
                export FTP_HOST={FTP_HOST}
                export FTP_USERNAME={FTP_USERNAME}
                export FTP_PASSWORD={FTP_PASSWORD}
                PYTHONUSERBASE=/home/ec2-user/.local python3.8 /home/ec2-user/ftp_to_s3.py {s3_path} {files_to_download}
                shutdown now -h""".format(
        # The arguments to format() are cut off in the source. PRODUCTS_TABLE is
        # not defined in this excerpt, so reading it from the event is an assumption.
        PRODUCTS_TABLE=event.get("products_table", ""),
        s3_path=vars["S3_PATH"],
        **vars
    )
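The quoting concern mentioned above can also be handled with the standard library. The sketch below is an alternative to the string join shown in the excerpt, not the author's method: it serializes the file list with `json.dumps` and protects each argument with `shlex.quote`, so the shell passes them through unchanged. The function name `build_command` and the sample values are hypothetical.

```python
import json
import shlex


def build_command(s3_path, files_to_download):
    """Build the command line that runs the transfer script with safely quoted arguments."""
    files_arg = json.dumps(files_to_download)  # JSON always uses double quotes
    return " ".join([
        "PYTHONUSERBASE=/home/ec2-user/.local",
        "python3.8",
        "/home/ec2-user/ftp_to_s3.py",
        shlex.quote(s3_path),
        shlex.quote(files_arg),
    ])


files = [{"filename": "a.csv", "size": 10}]
cmd = build_command("incoming/2021", files)
print(cmd)

# Round trip: what the shell would hand to the script parses back to the same list
assert json.loads(shlex.split(cmd)[-1]) == files
```

On the instance side, the script would then read the list back with `json.loads(sys.argv[2])`.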

Invoke the instance with the boto3 library providing the parameters for the custom image AMI, Instance type, key-pair, subnet, and instance profile, all defined by Terraform environment variables. Optionally, set the Volume size to 50GB from the default 8GB for larger files.

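instance = ec2.run_instances(
        # The AMI, instance type, key-pair, subnet, count, and UserData arguments are cut off in the source
        IamInstanceProfile={"Arn": INSTANCE_PROFILE},
        BlockDeviceMappings=[{"DeviceName": "/dev/xvda", "Ebs": {"VolumeSize": 50}}],
)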


After deploying to AWS, Terraform will have created a Lambda that invokes an EC2 instance running the script passed to it during its creation. Triggering the Lambda function to invoke the custom instance can be done from a DynamoDB Stream update, scheduled timer, or even another Lambda function. This provides flexibility on how and when the instance is called.

Ultimately, this solution provides a flexible means of downloading files from an FTP server. Changes to the Lambda that launches the instance could include splitting the file list across several smaller instances that run simultaneously, moving more files to the AWS S3 bucket faster. This greatly depends on the client’s needs and the cost of operating the AWS services.

Changes can also be made to the script downloading the files. One option would be to use a more robust FTP library than Python’s built-in one. Larger files may require more effort, as FTP servers can time out when network latency and file sizes come into play. Python’s ftplib does not auto-reconnect, nor does it keep track of incomplete file downloads.
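One way to cope with interrupted downloads using only ftplib is to resume from the size of the partial local file. The sketch below relies on the `rest` argument of `FTP.retrbinary`, which asks the server to restart the transfer at a byte offset; it assumes the server supports the REST command and that the remote file has not changed in the meantime. The function name `download_with_resume` is hypothetical.

```python
import os


def download_with_resume(ftp, remote_name, local_path, blocksize=8192):
    """Download remote_name to local_path, continuing from any partial local file.

    `ftp` is a connected, logged-in ftplib.FTP object (or anything offering a
    compatible retrbinary method). Returns the local file size after the transfer.
    """
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    # Append mode keeps the bytes that were already downloaded
    with open(local_path, "ab") as out:
        ftp.retrbinary(
            f"RETR {remote_name}",
            out.write,
            blocksize=blocksize,
            rest=offset or None,  # None means start from the beginning
        )
    return os.path.getsize(local_path)
```

A caller could wrap this in a retry loop that reconnects and calls it again after a timeout, so each attempt only fetches the bytes that are still missing.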

When it comes to Microsoft, people don’t generally think “Open Source” or “Linux Support”. But in recent years, Microsoft has come a long way. They’ve released many of their most commonly used frameworks under open source licenses, including ASP.NET MVC/Web API/Web Pages and Entity Framework!

Additionally, they’ve given first-class support for many non-Microsoft offerings, especially in Azure. Currently, this includes support in Azure for open source gems like Node.js, PHP, and, yes, even Linux. Heck, they even have an Openness logo:

In this post, I’ll walk you through setting up the Ubuntu Desktop on an Azure Virtual Machine and configure it so you can connect to it through Windows Remote Desktop. It’s a lot easier than you think!
