Elasticsearch 7.1 is part of the 7.x major release series, which brings improved resiliency, scalability, and more efficient query processing. I wanted to upgrade my AWS Elasticsearch (Amazon ES) cluster from 6.2 to 7.1, and I wanted to know whether an in-place upgrade with zero downtime is available. If not, what is the recommended strategy for performing this upgrade? And will any data be lost in the process?
In this article, I will explain how to upgrade your Amazon ES domain manually via the AWS Console and in an automated fashion using CloudFormation.
Amazon ES provides in-place upgrades, and this is the simplest and most straightforward approach. Once Amazon ES starts the upgrade, it can take from 15 minutes to several hours to complete. During this time, all cluster operations can be performed as normal, provided the upgrade was started on a healthy cluster. In testing, it was also observed that Kibana might be unavailable for a few minutes during the upgrade. It is recommended to take a manual snapshot of the ES domain and to run the pre-upgrade checks before initiating the upgrade. Note that the upgrade process is irreversible and cannot be paused or canceled once initiated; the AWS documentation on in-place upgrades has the details.
We can perform this upgrade manually via the AWS Console or in an automated fashion using a CloudFormation template.
Manual Upgrade via AWS Console
When I open the AWS Console, select my domain, and click ‘Upgrade Domain’, the highest version offered for an in-place upgrade is 6.8.
An in-place upgrade from 6.2 to 7.1 is therefore a two-step process:
- First, upgrade the domain from 6.2 to 6.8.
- Once the domain is on version 6.8, upgrade it to the final target version, 7.1.
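The two-step path can also be worked out programmatically instead of being read off the console. The following is a minimal Python sketch, not a definitive implementation: it assumes a compatibility list shaped like the `CompatibleElasticsearchVersions` field that boto3's `get_compatible_elasticsearch_versions` call returns. The helper names are hypothetical, and `SAMPLE_RESPONSE` is made-up illustrative data rather than the real AWS compatibility matrix.

```python
from collections import deque


def compat_map_from_response(response):
    """Convert a get_compatible_elasticsearch_versions-style response
    into a {source_version: [target_versions]} dictionary."""
    return {
        entry["SourceVersion"]: list(entry["TargetVersions"])
        for entry in response.get("CompatibleElasticsearchVersions", [])
    }


def plan_upgrade_path(compat, source, target):
    """Breadth-first search for the shortest chain of in-place upgrades.

    Returns a list of versions starting at `source` and ending at `target`,
    or None if no chain of upgrades connects them.
    """
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for next_version in compat.get(path[-1], []):
            if next_version not in seen:
                seen.add(next_version)
                queue.append(path + [next_version])
    return None


# Illustrative data only (an assumption, NOT the real AWS matrix).
SAMPLE_RESPONSE = {
    "CompatibleElasticsearchVersions": [
        {"SourceVersion": "6.2", "TargetVersions": ["6.3", "6.8"]},
        {"SourceVersion": "6.8", "TargetVersions": ["7.1"]},
    ]
}

if __name__ == "__main__":
    compat = compat_map_from_response(SAMPLE_RESPONSE)
    print(plan_upgrade_path(compat, "6.2", "7.1"))
```

With real data, the response from `boto3.client("es").get_compatible_elasticsearch_versions()` would be passed in place of `SAMPLE_RESPONSE`.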
It is recommended to take a manual snapshot before each upgrade.
You can upgrade the Elasticsearch version for your domain without creating a separate domain and migrating your data. The upgrade process first checks if the domain is eligible to upgrade. If this check succeeds, Amazon Elasticsearch Service takes a snapshot of the domain and initiates the upgrade. You can also perform an upgrade eligibility check without performing the actual upgrade.
After initiating an upgrade eligibility check, you can’t cancel it. You can continue to read and write data while the upgrade is in progress, but you can’t change your domain configuration.
The upgrade eligibility check takes a few minutes, and the results are stored in the ‘Upgrade history’ tab on the ES cluster overview screen.
If you decide to go ahead and upgrade your cluster following the two-step approach, you will see the upgrade summary in the ‘Upgrade history’ tab.
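The eligibility check can also be started outside the console. Below is a hedged Python sketch that assumes a boto3 `es` client is passed in. It relies on the client's `upgrade_elasticsearch_domain` call with `PerformCheckOnly=True` and on `get_upgrade_history`; the two helper function names are hypothetical.

```python
def start_eligibility_check(es_client, domain_name, target_version):
    """Run only the pre-upgrade eligibility check; no upgrade is started."""
    return es_client.upgrade_elasticsearch_domain(
        DomainName=domain_name,
        TargetVersion=target_version,
        PerformCheckOnly=True,  # check eligibility without upgrading
    )


def latest_upgrade_summary(es_client, domain_name):
    """Return (UpgradeName, UpgradeStatus) for the most recent entry in the
    domain's upgrade history, or None if the history is empty."""
    response = es_client.get_upgrade_history(DomainName=domain_name)
    histories = response.get("UpgradeHistories", [])
    if not histories:
        return None
    # Pick the newest entry explicitly instead of relying on list order.
    latest = max(histories, key=lambda entry: entry["StartTimestamp"])
    return latest["UpgradeName"], latest["UpgradeStatus"]


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   es = boto3.client("es")
#   start_eligibility_check(es, "titlesearch", "6.8")
#   print(latest_upgrade_summary(es, "titlesearch"))
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stub before pointing them at a real domain.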
Please note that the upgrade process can sometimes take hours to complete. In my case, it took close to 10 hours. Behind the scenes, a blue/green deployment is initiated to handle the upgrade.
A blue/green deployment of the cluster is triggered whenever there is a configuration change to the domain. During the upgrade, a new set of nodes is created, and data is migrated from the old nodes to the new nodes. This practice minimizes downtime and preserves the original environment in case the deployment to the new environment is unsuccessful. The deployment should not cause any issues with cluster activities, but you might sometimes notice degraded performance due to the additional data migration and shard balancing.
In-place upgrades are the recommended strategy, but there is also the alternative of creating a new 7.1 ES domain and restoring the data to it from a snapshot. This approach carries low risk because you are not making any changes to the existing, functioning domain, and if you see any issues in the new cluster, you can always go back to the old 6.2 cluster.
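Since an upgrade can run for hours, it can be convenient to poll its progress instead of watching the console. The sketch below is one possible approach in Python, assuming a boto3 `es` client whose `get_upgrade_status` call returns `UpgradeStep` and `StepStatus`. The function name, polling interval, and poll limit are assumptions chosen for illustration.

```python
import time

# Step statuses after which a step will not change any further.
FINAL_STATUSES = {"SUCCEEDED", "SUCCEEDED_WITH_ISSUES", "FAILED"}


def wait_for_upgrade(es_client, domain_name, poll_seconds=300,
                     max_polls=288, sleep=time.sleep):
    """Poll get_upgrade_status until the UPGRADE step finishes or a step fails.

    Returns the last (UpgradeStep, StepStatus) pair observed. Raises
    TimeoutError once max_polls is exhausted (defaults: 288 x 5 min = 24 h).
    """
    for _ in range(max_polls):
        status = es_client.get_upgrade_status(DomainName=domain_name)
        step, step_status = status["UpgradeStep"], status["StepStatus"]
        # A failure in any step (check, snapshot, upgrade) ends the process.
        if step_status == "FAILED":
            return step, step_status
        # The upgrade is done once the final UPGRADE step reaches an end state.
        if step == "UPGRADE" and step_status in FINAL_STATUSES:
            return step, step_status
        sleep(poll_seconds)
    raise TimeoutError(f"Upgrade of {domain_name} did not finish in time")


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   print(wait_for_upgrade(boto3.client("es"), "titlesearch"))
```

The `sleep` parameter exists only so the loop can be tested without real waiting.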
Automated Upgrade via CloudFormation
For any resource update, I prefer to use AWS CloudFormation, so that the process is automated as code and the same resource state is ensured across all environments.
The upgrade from 6.2 to 6.8 worked fine manually via the AWS Console, but it failed via CloudFormation with the error message: ‘CloudFormation cannot update a stack when a custom-named resource requires replacing. Rename and update the stack again.’
As the next step, I modified the domain name and deployed the changes. However, removing or changing the DomainName value in an existing CloudFormation template results in data loss: the existing domain ‘titlesearch’ was removed, and a new domain ‘titlesearch-ke92iyvwu5re’ was created with no documents.
This is clearly not the desired behavior, since we want neither a new domain nor data loss.
Fortunately, there is an easy fix: set EnableVersionUpgrade to true in the UpdatePolicy attribute of the domain resource. When EnableVersionUpgrade is set to false or is not specified, updating the Elasticsearch version results in replacement of the domain. Once I made this change, my existing Elasticsearch domain was upgraded without any issues. There was no need to modify the domain name, and there was no data loss, interruption, or downtime in the process. I would still recommend performing the upgrade during a time window when your Elasticsearch instance receives minimal requests.
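To show where the setting goes, here is a minimal sketch that builds such a template as a Python dictionary and prints it as JSON. The logical ID `ElasticsearchDomain` and the function name are assumptions made for this example, and a real template would also carry the remaining domain properties (cluster configuration, storage, access policies), which are omitted here.

```python
import json


def build_domain_template(domain_name, es_version):
    """Build a minimal CloudFormation template for an Amazon ES domain
    that permits in-place version upgrades."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # Logical ID chosen for this example (an assumption).
            "ElasticsearchDomain": {
                "Type": "AWS::Elasticsearch::Domain",
                "Properties": {
                    "DomainName": domain_name,
                    "ElasticsearchVersion": es_version,
                    # Other domain properties omitted for brevity.
                },
                # UpdatePolicy is a resource attribute: it sits next to
                # Properties, not inside it.
                "UpdatePolicy": {"EnableVersionUpgrade": True},
            }
        },
    }


if __name__ == "__main__":
    # Upgrade in two deployments: first "6.8", then "7.1".
    print(json.dumps(build_domain_template("titlesearch", "6.8"), indent=2))
```

Keeping DomainName unchanged and only bumping ElasticsearchVersion between deployments is what lets CloudFormation upgrade the existing domain rather than replace it.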
To sum it up, you can perform zero downtime version upgrades for your Elasticsearch clusters. In this article, I explained how you could do this upgrade seamlessly via AWS Console and CloudFormation. In case you bump into any issues with your update, please comment below, and I will be happy to assist.
If you are interested in learning more about Elasticsearch performance, you can read my article about the Five critical Elasticsearch metrics to monitor.