Written by Bård Farstad, Co-founder and Chairman of Product Innovation Board at eZ Systems

Content migration is an often underestimated aspect of a CMS project. Do you have the knowledge you need to make sure it doesn’t catch you by surprise? We spoke with a content migration expert and we’ll be sharing his insights later in this article.

You have a lot of content on your website, and automating its migration to another platform can be challenging, time-consuming and costly. Successful project managers understand this. But content migration is also an opportunity to identify the successful content on your website and revise the content that isn’t engaging your audience. Though it may take hard work, content migration can be more of a blessing than a burden.

A good way to begin a content migration is to revisit your content strategy and understand the "why" and "how" of your information architecture.

Defining Personas

A useful first step is defining personas, which means understanding the target users of your website. Creating strong personas will enable you to communicate better with those users. For example, at eZ, we could define a developer, an editorial user and a project manager for a creative agency. Each individual is looking for different information relating to our product.

A short list of goals for the defined personas could be:

Dave the Developer, who is looking for:

  • Main technical features overview
  • Software download to do development and customization
  • Documentation and tutorials
  • Forums to ask questions

Edith the Editor, who is looking for:

  • Information on the editorial interface features and user interface
  • Free trial to test the software
  • Capabilities to manage landing pages and other central features

Pamela the Project Manager, who is looking for:

  • Relevant industry references in the space she is executing a project
  • Matching capabilities compared to her requirements
  • Training and certification to support her team
  • Vendor contact information to go over the project requirements

When you define unique personas, it’s clear that the content required to reach your audiences varies greatly. But once you’ve nailed down what you want to communicate, the next step is to look at their user journeys.

Understanding User Journeys

User journeys are the navigation paths that users take in order to reach their desired destinations. A strong user journey will connect your marketing copy to calls to action such as: book a demo, download a trial, or contact us. Understanding your users’ journeys will help you build your information architecture and create strong relationships between different pieces of content through internal linking, navigation and related content.


By now you’ve likely begun wireframing and defining the composition of your site navigation, content and calls to action. Remember to take into account both the desktop and mobile versions of your site.

Wireframing helps copywriters understand the parameters within which they will be working. Most of the time, content will need to be rewritten to fit the new constraints of the website.

To Automate or Not?

At this point it’s usually clear whether or not you will need automated content migration. In many cases, it’s beneficial to skip automated content migration entirely by writing new copy or by reusing and reworking older copy.

So when should you automate content migration? We’ve spoken with one of our North American implementation partners to answer just that.

Peter Keung is the Managing Director at Mugo Web and has run a range of content migrations, both large and small, into (and out of) eZ Publish.

How do you make a distinction between content that needs to be migrated manually as opposed to automatically?

We start by asking questions like:

  • How much data is there in sheer volume?
  • How many different content types are there and how many pages of each type are there?
  • How should the content change from the old system to the new system?
  • How complex is the data?

A manual migration becomes more attractive when the complexity is editorially based (that is, whoever is doing the migration needs to manually review all of the content); an automated migration becomes more attractive when the complexity lies in the data itself and in internal data relationships that are tough for an editor to sort out. For example, archives of articles, products, books, inventory items and scientific data are all common candidates for automated migration.
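To make the volume and content-type questions above concrete, here is a minimal Python sketch of a content inventory. It assumes the source content has already been exported as a list of dictionaries that each carry a "type" key; that export format, the function name profile_inventory and the sample records are illustrative assumptions, not part of eZ Publish or any particular import tool.

```python
from collections import Counter


def profile_inventory(records):
    """Count records per content type and collect the field names seen for each type."""
    counts = Counter()
    fields = {}
    for record in records:
        # Records without a "type" key are grouped under "unknown".
        ctype = record.get("type", "unknown")
        counts[ctype] += 1
        fields.setdefault(ctype, set()).update(k for k in record if k != "type")
    return counts, fields


# Hypothetical export: the shape of these records is an assumption.
sample = [
    {"type": "article", "title": "Launch", "body": "..."},
    {"type": "article", "title": "Update", "body": "...", "author": "Edith"},
    {"type": "product", "name": "Widget", "sku": "W-1"},
]

counts, fields = profile_inventory(sample)
for ctype, n in counts.most_common():
    print(ctype, n, sorted(fields[ctype]))
```

A report like this will not settle the manual-versus-automated question on its own, but it gives the team real numbers to discuss. Fields that appear on only some records of a type are often an early hint of the inconsistencies described later in this interview.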

What lessons have you learned from your experiences with content migration?

Perhaps more so than on other elements of a project: plan for the unknown. Also, include generous QA time. It’s tempting to try to fully define a data import up front, but the nature of the problems is such that you have to actually do some of the real import work before you can uncover them.

Why is content migration such a difficult and complex challenge?

Data is not always structured cleanly. It isn’t always structured the way that you think it is upon first inspection or the way the people familiar with the content think it’s structured. The data model is often undocumented or incomplete, which can create a large number of problems.

There are almost always obstacles related to the source data not following a structured content model, leading to the discovery of edge cases where data is broken (and the client never knew it). This leads to a modification of the data model in the target system and of course a modification of your import scripts.

This is a problem you know you are going to have, but you don’t always grasp the magnitude of it. When you are doing a migration, you are rarely doing “just a migration.” There are different content types from different sources that all need to be consolidated into one platform. It makes things challenging and it makes estimating the timeframe of a content migration really difficult.

Often the client/users of the data won't realize all the improvements they want made until they are forced to help unpack the source data.

What are some mistakes you’ve made in the past during content migration that you could help others avoid?

Mistakes mostly come from failing to directly validate assumptions about the source data model. You should involve someone along the way who really understands the source data model, and if no such expert exists, be extra diligent about validating what others say about the source data model.
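One way to validate those assumptions directly is to write them down as data and run them against the real source records before building any import scripts. The Python sketch below assumes each source record is a dictionary with a "type" key; the ASSUMED_MODEL mapping, the find_violations function and the sample records are all hypothetical and only illustrate the idea.

```python
def find_violations(records, required_fields):
    """Return (index, problem) pairs for records that break the assumed model."""
    problems = []
    for index, record in enumerate(records):
        ctype = record.get("type")
        if ctype not in required_fields:
            problems.append((index, f"unknown content type: {ctype!r}"))
            continue
        for field in required_fields[ctype]:
            value = record.get(field)
            # Treat absent values and blank strings as missing.
            if value is None or (isinstance(value, str) and not value.strip()):
                problems.append((index, f"{ctype} is missing {field!r}"))
    return problems


# What the team believes the source looks like (an assumption to be tested).
ASSUMED_MODEL = {"article": ["title", "body"], "book": ["title", "isbn"]}

# Hypothetical source records, including some edge cases.
sample = [
    {"type": "article", "title": "Launch", "body": "Text"},
    {"type": "article", "title": "", "body": "Text"},
    {"type": "book", "title": "Guide"},
    {"type": "memo", "title": "Internal"},
]

for index, problem in find_violations(sample, ASSUMED_MODEL):
    print(f"record {index}: {problem}")
```

Running a check like this early and again after each import iteration turns "the data is probably clean" into a list of specific records to review with the people who know the content.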

What are some examples where you were able to successfully automate the content migration of a site?

  • A news publisher: We had a high volume of records. This was a good opportunity to drop a lot of unused data in a formerly bloated architecture while needing to preserve complex data relationships.
  • A book publisher: We needed to preserve complex data relationships and needed to re-use import tools to support an ongoing sync/bridge with another system, but still allow data to be natively editable and extended in the CMS.
  • A scientific database: For this case we had a high volume of records and complexity of data, the need to define a standard data structure, and the need to preserve complex data relationships.

What advice would you give a large company beginning a CMS content migration project?

There are several pieces of advice I would give:

  • Evaluate against manual migration at least as a sanity check.
  • Budget for the unknown more so than for other elements, and plan for multiple iterations of import and test cycles.
  • Try to get and validate a source and target data model up front.
  • Involve editors at every step.
  • Choose a CMS that separates content from data.
  • Document at every step.

What’s the number one key to success?

Treat a content migration project as its own mini-project rather than just checking off the box. Also, use Mugo's data_import extension for eZ Publish. :D
