Reimagining Replication Agents on AEM as a Cloud Service
Co-author: Amol Anand
Part of evaluating a migration to AEM as a Cloud Service is examining your current AEM implementation for incompatibilities or areas that will require refactoring. This may include reviewing documentation and code, and running the Best Practices Analyzer (BPA) tool. One area that may be highlighted during your review is replication agents.
In the pre-Cloud Service model, numerous situations called for replication agents. This article won’t cover all use cases, but you can read more background here. With Cloud Service and its new architecture, the rules have changed in this area. Read on for some tips on navigating this change.
Author to Publish
Standard publishing, that is, copying content from an author environment to a publish environment, is handled out of the box with Cloud Service, with no custom agents required. See here.
Publish to Author
Sending content from a publish environment back to an author environment is known as reverse replication. This is not supported in Cloud Service, per the Notable Changes documentation. Pre-Cloud Service, this was typically (although not exclusively) done for one of two reasons.
The first reason is to support user-generated content (UGC). The alternative with Cloud Service is to externalize UGC out of AEM. Let AEM focus on managed content delivery, and treat UGC as an integration.
The second reason is to ensure consistency of user accounts across publish instances. When a site requires authentication, via SAML perhaps, nodes and properties representing a profile of the user are created in the content repository of the publish instance the user is browsing. The profile info can be reverse replicated back to the author environment for replication out to the other publish instance(s) serving the site in order to ensure consistency in case the user hits a different publish instance during a future browsing session. Cloud Service handles consistency of user profiles as described here without this reverse replication scheme.
Dispatcher Flush Agent
Content on Cloud Service is removed from the dispatcher cache by default when it is published. No custom agent required. See here for more details.
Author to External System
Credit goes to my colleague Amol Anand for the rest of this article!
Some redesign is required prior to migration if you are currently using a static agent to copy content to a filesystem for other applications to consume. There are two main options for redesigning this use case:
Convert from a Push to Pull model
External systems can pull the latest content from the publish tier in Cloud Service on a scheduled or on-demand basis.
Advantages: This eliminates custom code from AEM and ensures that the external system will get the latest content available from the CDN based on the TTLs and caching strategies put in place. This approach is highly scalable and performant and is not coupled to any one external system.
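As a rough illustration of the pull model, here is a minimal Python sketch of what an external system's scheduled job might do. It uses a standard HTTP conditional GET (If-None-Match), which is a technique chosen for this example rather than anything specific to AEM, and it assumes the publish tier or CDN returns an ETag header for the requested resource. The function name and the URL in the usage note are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional, Tuple


def fetch_if_changed(
    url: str,
    etag: Optional[str] = None,
    opener: Callable = urllib.request.urlopen,
) -> Tuple[Optional[bytes], Optional[str]]:
    """Fetch `url`, skipping the download if the content is unchanged.

    Returns (body, etag). `body` is None when the server answers
    304 Not Modified, in which case the previous ETag is returned.
    Assumption: the server sends an ETag header for this resource.
    """
    request = urllib.request.Request(url)
    if etag:
        # Ask the server to send the body only if it differs from our copy.
        request.add_header("If-None-Match", etag)
    try:
        with opener(request, timeout=10) as response:
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Our cached copy is still current.
            return None, etag
        raise
```

A scheduler (cron, for example) could call fetch_if_changed("https://publish.example.com/content/site/en/home.html", last_etag), store the returned ETag, and reprocess the content only when the body is not None.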
I/O Events integration
If a push to the external system is the only option due to external system limitations, you can set up an Adobe I/O Events integration as described here. It emits an event when content is published. The payload is small, but it contains enough information to make a request back to AEM, pull the content, and manipulate it as you wish.
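To make the "small payload, then pull" idea concrete, here is a short Python sketch that turns a publish event into the URLs to request from the publish tier. The payload shape used here (a "data" object holding a "paths" list) is hypothetical, as are the host name and the function name; check the event payloads your own integration actually receives before relying on any field names.

```python
from typing import Any, Dict, List

# Hypothetical publish-tier host; replace with your own environment's URL.
PUBLISH_HOST = "https://publish.example.com"


def content_urls_from_event(
    event: Dict[str, Any],
    host: str = PUBLISH_HOST,
    extension: str = ".html",
) -> List[str]:
    """Map a publish event to the URLs an external system should pull.

    Assumed (hypothetical) payload shape:
        {"data": {"paths": ["/content/site/en/home", ...]}}
    """
    paths = event.get("data", {}).get("paths", [])
    # Only follow paths under /content; ignore anything else in the event.
    return [
        f"{host.rstrip('/')}{path}{extension}"
        for path in paths
        if path.startswith("/content/")
    ]
```

The event itself carries no content in this sketch. It only tells the consumer what changed, and the consumer then fetches the rendered content over plain HTTP.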
There are two ways to process events:
- Webhook: You can create a webhook (Push model) to either run an App Builder (formerly known as Project Firefly) action or trigger the external system directly.
- Journal: Have the external system pull events from the I/O Events Journal (Pull model) and process the events accordingly.
You can also use both options simultaneously based on multiple external systems and their requirements.
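For the webhook option, the receiving endpoint has two jobs: answer the verification request that Adobe I/O Events sends when the webhook is registered (a GET carrying a challenge query parameter that must be echoed back), and accept the event POSTs that follow. The Python sketch below keeps the logic framework-neutral. The function name, its parameters, and the on_event callback are hypothetical, and signature verification of incoming events is deliberately left out, so treat this as a starting point rather than a production handler.

```python
import json
from typing import Callable, Dict, Optional, Tuple


def handle_webhook(
    method: str,
    query: Dict[str, str],
    body: Optional[bytes],
    on_event: Callable[[dict], None],
) -> Tuple[int, str]:
    """Return (status_code, response_body) for one webhook request."""
    if method == "GET":
        # Registration check: echo the challenge value back to the caller.
        challenge = query.get("challenge")
        if challenge is None:
            return 400, "missing challenge"
        return 200, challenge

    if method == "POST":
        try:
            event = json.loads(body or b"")
        except ValueError:
            return 400, "invalid JSON"
        if not isinstance(event, dict):
            return 400, "unexpected payload"
        # Hand off quickly; heavy work belongs in a queue or background job.
        on_event(event)
        return 200, "ok"

    return 405, "method not allowed"
```

Any web framework can wrap this by passing in the request method, parsed query string, and raw body. The on_event callback is where the external system would be triggered or the content pulled.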
Advantages: This allows for decoupling AEM from the external system so that multiple external systems can consume the same event. It is also asynchronous in nature and supports both a Push and Pull model.
It is worthwhile to learn about Adobe I/O for reasons beyond this replication replacement use case, as it is used in Cloud Service for things like tailing logs and Commerce Integration Framework and is extensible for other integration activities.