Content migration from Drupal 7 to Contentful: A step-by-step guide

Published: July 19, 2022
Category: Guides

Folks, we have an important question. Are you considering migrating your content from Drupal to Contentful? Then this post will show you how. Carpe diem!

Drupal 7 is officially reaching end of life on November 1, 2023. After that point, all support for this legacy content management system will come to a close. 

So how should you plan for the change? Sure, you could start using the latest version of Drupal. Or — and maybe this is a wild and crazy idea, but hear us out — you could move everything over to a new CMS. 

After all, if you’ve taken this long to move on from Drupal 7, there’s likely a good reason why you were taking your time.

Let’s hazard a guess: Was it a feeling of dissatisfaction with using a monolithic CMS? Is it too complex to maintain? Do you desire a conscious uncoupling? To upgrade to something more versatile for your content needs?

It’s time to use Contentful! We have a lot of resources on this site to outline the benefits of adopting a headless CMS. But rather than dive into a discussion about content strategy and data migration, this post explains how to migrate content from Drupal 7 to Contentful. 

We’ll be outlining the workflow of a migration project step by step, together with examples, migration tools, and command line prompts. 

Before we dive in, there are two prerequisites in order for this guide to be useful. First, you have worked with Drupal 7. Second, you have some knowledge of the command line interface in Contentful. 

Ready? Let’s go!

Step 1: Content audit

A content audit is an essential stage of any migration process. Don’t embark on a migration plan without one unless you’re working purely on a “lift and shift” effort.

First, get a clear understanding of the content you want to migrate from Drupal 7. Extensive discussions with stakeholders will guide the process and establish the parameters of the content migration process. 

In this sample checklist, we cover questions like content types, taxonomies, user profiles, and more. The answers are unique to your requirements and your content, and should probably be tracked in an Excel spreadsheet or similar.

  • Content (types)

    • Create a full list of content types and their fields.

    • Once you know whether all content or just some content needs to be migrated, start thinking about which content to exclude within each content type.

      • For example, do you really need to migrate old content published before a certain date?

      • Do you need to migrate archived content from the old website too?

      • Consult your SEO data and Google Analytics to check visits for the path of each node, and ensure you only migrate valuable content. Look for indicators like:

        • Overall traffic

        • Inbound links

        • Time spent on pages

      • For removed content, it would make sense to start thinking about collecting the URL paths for 301 redirects.

  • Taxonomies

    • Which tags can be imported?

    • Will content with unique tags be imported too?

  • User profiles

    • These can’t be imported into Contentful’s user profiles, but rather into a new content type, e.g., Blog Authors.

  • Views that are used

    • Outdated views may hint at data that is maintained but never used on the website.

We recommend that you use a site crawler to identify all URLs on your Drupal 7 website. Some pages may already have their redirects in place. Once you have migrated all of the content and are ready for QA, re-running the crawl against that list of URL paths will reveal any broken links or missing redirects.
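To make the redirect step concrete, here is a minimal PHP sketch that compares the crawled paths with the paths you plan to migrate and lists what is left over. The function name and the sample paths are made up for illustration, and it assumes both lists are plain URL paths.

```php
<?php
/**
 * Return the old paths that will not exist after the migration and
 * therefore need a 301 redirect (or a deliberate removal).
 *
 * @param string[] $crawledPaths  paths found by the site crawler
 * @param string[] $migratedPaths paths of the content being migrated
 * @return string[]
 */
function pathsNeedingRedirect(array $crawledPaths, array $migratedPaths): array
{
    // Normalize to a single leading slash and no trailing slash.
    $normalize = function (string $path): string {
        return '/' . trim($path, '/');
    };
    $crawled = array_unique(array_map($normalize, $crawledPaths));
    $migrated = array_unique(array_map($normalize, $migratedPaths));

    // array_diff keeps the keys of the first array, so reindex the result.
    return array_values(array_diff($crawled, $migrated));
}

// Hypothetical sample data.
$crawled = ['/blog/hello-world', '/blog/old-news/', '/about'];
$migrated = ['blog/hello-world', '/about'];

print_r(pathsNeedingRedirect($crawled, $migrated));
```

The resulting list can go straight into the audit spreadsheet as the starting point for your redirect map.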

Step 2: Create content model

New content model

Now that the audit is done and you have an overview of your content inventory, you can create a new content model with your stakeholders to come up with a simple structure. 

In this example, we have a content model representing the structure of a blog:

Map Drupal content types to Contentful’s content model

This is the part that will help you move content from Drupal nodes into Contentful entries. In theory, you should be able to map each of the Drupal content type fields to an equivalent in Contentful content type fields.

Continuing with our example of the blog content model, this table represents a very simplified use case and mapping from the old CMS to the new system.

| Contentful 'Type' | Contentful Field | Drupal Content-Type | Drupal Field | Comment |
|---|---|---|---|---|
| Blog | Internal Title | | | Format: [Blog] - [Author name] - [Drupal Title] |
| Blog | Title | Blog | Title | |
| Blog | Slug | Blog | Path | |
| Blog | Body | Blog | Body | for the example we will just import the text |
| Blog | Author (reference) | Blog | Author (uid) | user id |
| Blog | Hero Image (asset) | Blog | Blog Main Image | |
| Blog | Published on | Blog | Published On | |
| Blog (Tags) | Tags | Taxonomy | Tags | export taxonomies |
| Author | Internal Title | | | Format: [Author] - [First + Last Name] |
| Author | First Name | user | Name (first part) | |
| Author | Name | user | Name (second part) | |
| Author | Email | user | Email | |
| Author | Photo (Asset) | | | no data to import |
| Author | Short Bio | n/a | Biography | |
| Author | Phone | | | no data to import |
| Author | Facebook | | | no data to import |
| Author | Twitter | | | no data to import |
| Author | Github | | | no data to import |

Step 3: Tags and taxonomies

In order to import taxonomies into Contentful, you must decide if you want to map your Drupal tags to Contentful’s built-in tags or model them as a separate content type.

Here’s a brief overview of the two approaches:

| Tags | Content Type |
|---|---|
| Already built into Contentful | Hierarchy is possible |
| Only simplified hierarchy possible | Cross content type queries are more complex |
| You can query content across content types | Nesting of entries will have an impact on GraphQL complexity |
| Content types do not need to be updated as the reference happens inside the Tags entry editor | Additional content type |

For this example, we’ll only be looking at tags within Contentful.

Tags

You can create all tags using the CLI and then map the Contentful tag IDs with the tags you need to import.

If you’d like to create tags using the CLI, you can create them as follows:

The important elements to pay attention to are:

  • The name as it appears in the web interface

  • The ID which is used in each entry (see below) to refer to the tags

  • Visibility (either public or private)
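As a rough sketch of those three elements, the PHP below builds tag definitions in the shape a Contentful space export uses for its top-level tags array. We are assuming here that the tags travel in the same JSON file as the rest of the import; the createTag helper, the term names, and the tag IDs are hypothetical.

```php
<?php
/**
 * Build one tag definition in the shape used by Contentful's export JSON.
 */
function createTag(string $id, string $name, string $visibility = 'private'): array
{
    if (!in_array($visibility, ['public', 'private'], true)) {
        throw new InvalidArgumentException("Unknown visibility: $visibility");
    }
    return [
        'sys' => [
            'id' => $id,                 // referenced later from each entry
            'type' => 'Tag',
            'visibility' => $visibility, // public or private
        ],
        'name' => $name,                 // label shown in the web interface
    ];
}

// Hypothetical Drupal taxonomy terms: term name => Contentful tag ID.
$terms = ['Content Strategy' => 'contentStrategy', 'Drupal' => 'drupal'];

$tags = [];
foreach ($terms as $name => $id) {
    $tags[] = createTag($id, $name, 'public');
}

echo json_encode(['tags' => $tags], JSON_PRETTY_PRINT), "\n";
```

Keeping the term-to-ID map in one place makes it easy to look up the right tag ID later when building each entry.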

For the import, we will need to generate the following construct for each tag within an entry:
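In Contentful's JSON, an entry refers to its tags through Link objects inside a metadata block. A minimal sketch of generating that block in PHP follows; the createTagMetadata helper and the tag IDs are our own placeholders.

```php
<?php
/**
 * Build the metadata block of an entry: one Link object per tag ID.
 *
 * @param string[] $tagIds IDs of tags that already exist (or are imported alongside)
 */
function createTagMetadata(array $tagIds): array
{
    $links = [];
    foreach (array_unique($tagIds) as $tagId) {
        $links[] = [
            'sys' => [
                'type' => 'Link',
                'linkType' => 'Tag',
                'id' => $tagId,
            ],
        ];
    }
    return ['tags' => $links];
}

// Hypothetical tag IDs for one blog post.
$metadata = createTagMetadata(['drupal', 'contentStrategy']);

echo json_encode(['metadata' => $metadata], JSON_PRETTY_PRINT), "\n";
```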

Step 4: Drupal modules and views

Modules

To create the JSON export functionality (and to create dummy content for test purposes) you need to install the following modules:

  • Views & Views UI

    • If you don’t have views enabled, then we’re not sure why you’re working with Drupal in the first place! =D

  • Ctools

    • Required by Devel and other modules.

  • UUID

    • Highly recommended for the export of data to follow Contentful’s guide on entry IDs.

  • Views data export

    • Allows for data export with some available options.

  • Views data export JSON (not covered by Drupal’s security advisory policy) or Views Datasource

    • Enhances the Views Data Export module by providing a JSON option.

  • Devel generate

    • Generate dummy users, nodes, and taxonomy terms.

  • Realistic Dummy Content

    • Used to generate dummy content for test purposes.

Views

To create an export view, simply create a new view with the content types you want to export, or start with a single one first to work through the process.

You should certainly make use of the filters to exclude old content (in line with the content audit you conducted earlier). For example:

  • Old, outdated content published before a certain date

  • Content that’s no longer being used or not published

  • Excluded via certain criteria (e.g. within a category, written by a certain author, and so on)

For our purposes, we’re only interested in:

  • Title

  • User UID

  • Body

  • Node UUID

  • Path

  • Content type

  • Updated date (used to display and order the blog post)

Blog Post

This screenshot illustrates how we’ll be migrating content in our example of a blog post:

Note that JSON is now available as an export format thanks to the modules we installed earlier.

With the above module and Drupal View, you’ll generate a very flat JSON object for the blog post:
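To give a feel for working with such a flat export, the sketch below decodes a made-up record and checks that every column the later steps rely on is present. The key names simply mirror the field labels listed above, the values are invented, and the top-level array shape is an assumption; your export module may wrap or name things differently.

```php
<?php
// Hypothetical record; key names mirror the view's field labels and
// every value is made up for illustration.
$json = <<<'JSON'
[
  {
    "Title": "Hello world",
    "User UID": "1",
    "Body": "First paragraph of the post.",
    "Node UUID": "6f1c2a9e-0b7d-4c55-9a3e-2d1f0c8b7a61",
    "Path": "blog/hello-world",
    "Content type": "Blog",
    "Updated date": "2022-07-19T10:30:00.000Z"
  }
]
JSON;

$nodes = json_decode($json, true);
if (!is_array($nodes)) {
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

// Fail early if the view is missing a column the transform step relies on.
$required = ['Title', 'User UID', 'Body', 'Node UUID', 'Path', 'Content type', 'Updated date'];
foreach ($nodes as $i => $node) {
    $missing = array_diff($required, array_keys($node));
    if ($missing) {
        throw new RuntimeException("Record $i lacks: " . implode(', ', $missing));
    }
}

echo count($nodes), " record(s) look complete\n";
```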

User / Author

We’ll also do the same for the user data:

Please note: Contentful doesn't apply content types to a user the way Drupal 7 does. We'll need to create a custom content type, person, which we can use later for importing.

Assets

In this example, we’ll get all assets without filtering.

This generates the following JSON:

Putting it all together

Finally, combine all of the data into one JSON object:
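One way to do the combining, sketched in PHP under the assumption that each view export has already been decoded into an array. The key names (blogPosts, users, assets) and the sample rows are placeholders of ours, not something the export modules produce.

```php
<?php
/**
 * Merge the three decoded view exports into a single structure.
 * The key names are placeholders chosen for this sketch.
 */
function combineExports(array $posts, array $users, array $files): array
{
    return [
        'blogPosts' => array_values($posts),
        'users' => array_values($users),
        'assets' => array_values($files),
    ];
}

// Hypothetical, heavily shortened sample rows.
$combined = combineExports(
    [['Title' => 'Hello world', 'User UID' => '1']],
    [['UID' => '1', 'Name' => 'Jane Doe']],
    [['Filename' => 'hero.png']]
);

echo json_encode($combined, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```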

Step 5: Transform

The data that we’re going to export will need to match exactly what the CLI importer can handle. 

Here is an example of what will be required for our blog post (with unnecessary data removed for the import).

Sample for assets:

You can generate the above by exporting sample content from your space by using the CLI. This will give you the general object structure you require for the import.

A useful tool for creating PHP objects from JSON is Convert JSON Object to PHP Array Online.

For each node that we want to migrate, we will need to create a content-type specific object, and need to adjust the following aspects from the schema above:

  • Space ID

  • Environment ID

  • Content-type ID

  • Metadata (if applicable)

  • Fields

    • These will be specific for each content type

    • May require localized content

    • Include assets

    • Include references to other entries

Mapping Drupal user IDs to Contentful user IDs is not possible

Unfortunately, an import will assign the user of the CMA key that was used during the migration as the author. Therefore, the following items are not needed in the import.

Transforming the JSON with PHP

First, let’s move the export file into our working directory, then create a simple PHP file. Within our new PHP file, let’s get the JSON:
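A minimal sketch of that first step, assuming PHP 7 or later. The file name drupal-export.json and the loadExport helper are placeholders; adjust them to your own setup.

```php
<?php
// Hypothetical file name; use whatever your Drupal view export produced.
$exportFile = __DIR__ . '/drupal-export.json';

/**
 * Read a JSON file and decode it into nested PHP arrays.
 */
function loadExport(string $path): array
{
    $raw = @file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }
    $data = json_decode($raw, true);
    if (!is_array($data)) {
        throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
    }
    return $data;
}

// Only run the demo when the export file is actually there.
if (is_file($exportFile)) {
    $export = loadExport($exportFile);
    echo count($export), " top-level item(s) loaded\n";
}
```

Decoding into associative arrays (the second argument of json_decode) keeps keys with spaces, such as "Updated date", easy to access.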

Let’s set some variables to reuse:

And now the fun part! Loop through the PHP object and create the structure we need for importing assets or entries:

createEntry generates the skeleton of an entry and calls another function to create the fields.

createFields just calls the specific functions for the different content types and returns the results for each.

Creates the fields for the blogPost content type. Note that for this example we are moving the title into the internal title.

Creates the fields for the person content type. Same as above, we’re just moving the name into the first name and internal name fields. If there are fields that you don't have data for, you can remove them from your code (see the image in this example). Phone, Facebook, etc. could also have been removed.
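Pulling the four functions together, here is a compact sketch of how they could look. It follows the general shape of entries in a Contentful space export, but the space ID, environment, locale, field IDs, and the Drupal key names are all assumptions for illustration; take the real values from your own content model and export.

```php
<?php
// Placeholders for this sketch: replace with your own values.
const SPACE_ID = 'yourSpaceId';
const ENVIRONMENT_ID = 'master';
const LOCALE = 'en-US';

/** Build a Link object as used throughout Contentful's JSON. */
function createLink(string $linkType, string $id): array
{
    return ['sys' => ['type' => 'Link', 'linkType' => $linkType, 'id' => $id]];
}

/** Wrap a value in the locale map that every field needs. */
function localized($value): array
{
    return [LOCALE => $value];
}

/** Fields for the blogPost content type (field IDs are hypothetical). */
function createBlogPostFields(array $node): array
{
    return [
        'internalTitle' => localized($node['Title']), // title doubles as internal title
        'title' => localized($node['Title']),
        'slug' => localized($node['Path']),
        'body' => localized($node['Body']),           // plain text only in this example
        'publishedOn' => localized($node['Updated date']),
    ];
}

/** Fields for the person content type (field IDs are hypothetical). */
function createPersonFields(array $user): array
{
    return [
        'internalName' => localized($user['Name']),
        'firstName' => localized($user['Name']),
        'email' => localized($user['Email']),
    ];
}

/** Dispatch to the content-type specific field builder. */
function createFields(string $contentTypeId, array $row): array
{
    switch ($contentTypeId) {
        case 'blogPost':
            return createBlogPostFields($row);
        case 'person':
            return createPersonFields($row);
        default:
            throw new InvalidArgumentException("No field mapping for $contentTypeId");
    }
}

/** Build the entry skeleton and fill in the fields. */
function createEntry(string $id, string $contentTypeId, array $row): array
{
    return [
        'metadata' => ['tags' => []],
        'sys' => [
            'space' => createLink('Space', SPACE_ID),
            'id' => $id,
            'type' => 'Entry',
            'environment' => createLink('Environment', ENVIRONMENT_ID),
            'contentType' => createLink('ContentType', $contentTypeId),
        ],
        'fields' => createFields($contentTypeId, $row),
    ];
}

// Hypothetical Drupal row.
$node = [
    'Node UUID' => '6f1c2a9e-0b7d-4c55-9a3e-2d1f0c8b7a61',
    'Title' => 'Hello world',
    'Path' => 'blog/hello-world',
    'Body' => 'First paragraph of the post.',
    'Updated date' => '2022-07-19T10:30:00.000Z',
];

echo json_encode(createEntry($node['Node UUID'], 'blogPost', $node), JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```

Keeping one small function per content type means a new content type only needs a new case in createFields.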

And back to the assets. With the export, we have the desired structure (less the non-required fields), and can call createAsset.

This fills the PHP object with the necessary data. The URL requests the data from a specific folder within the sites/default/files folder. Since we do not have a title for the image, we will be using the image name, but please note that it’s not required to have a title.
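A sketch of what createAsset could look like. The base URL is a made-up placeholder for your Drupal files folder, the locale is assumed to be en-US, and the file object uses the url, fileName, and contentType keys as they appear in a Contentful space export; compare this against a sample export from your own space before relying on it.

```php
<?php
const LOCALE = 'en-US';
// Hypothetical base URL of the Drupal files folder (should be reachable over HTTPS).
const FILES_BASE_URL = 'https://drupal.example.com/sites/default/files/';

/**
 * Build an asset for the import file from a Drupal file name.
 * Assumes a flat file name without subfolders.
 */
function createAsset(string $id, string $fileName, string $mimeType): array
{
    return [
        'sys' => ['id' => $id, 'type' => 'Asset'],
        'fields' => [
            // Drupal gives us no title here, so fall back to the file name.
            'title' => [LOCALE => $fileName],
            'file' => [
                LOCALE => [
                    'url' => FILES_BASE_URL . rawurlencode($fileName),
                    'fileName' => $fileName,
                    'contentType' => $mimeType,
                ],
            ],
        ],
    ];
}

// Hypothetical file from the assets export.
$asset = createAsset('hero-image-1', 'hero image.png', 'image/png');

echo json_encode($asset, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```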

In the end, let’s just spit the JSON out on the screen or put it into a file:
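For the output step, something along these lines should do. The encodeImport helper, the sample data, and the output file name are ours; the JSON_UNESCAPED_SLASHES flag keeps URLs readable.

```php
<?php
/**
 * Encode the import structure as JSON, failing loudly on encoding errors.
 */
function encodeImport(array $import): string
{
    $json = json_encode(
        $import,
        JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE
    );
    if ($json === false) {
        throw new RuntimeException('Encoding failed: ' . json_last_error_msg());
    }
    return $json;
}

// Hypothetical, heavily shortened import structure.
$import = [
    'entries' => [],
    'assets' => [
        ['fields' => ['file' => ['en-US' => [
            'url' => 'https://drupal.example.com/sites/default/files/hero.png',
        ]]]],
    ],
];

$json = encodeImport($import);
echo $json, "\n";

// To write a file instead of printing, uncomment:
// file_put_contents(__DIR__ . '/contentful-import.json', $json);
```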

Step 6: Import

By now, we should have a nicely formatted JSON that’s ready for import. We can use Contentful’s CLI for the actual import.
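If you want to trigger the import from the same PHP script, a small sketch like the one below assembles the contentful space import command with its --space-id, --environment-id, and --content-file options. The IDs and file name are placeholders, and it assumes the Contentful CLI is installed and you are already logged in.

```php
<?php
/**
 * Assemble the Contentful CLI import command with safely quoted arguments.
 */
function buildImportCommand(string $spaceId, string $environmentId, string $contentFile): string
{
    return sprintf(
        'contentful space import --space-id %s --environment-id %s --content-file %s',
        escapeshellarg($spaceId),
        escapeshellarg($environmentId),
        escapeshellarg($contentFile)
    );
}

// Placeholder values; replace with your own space, environment, and file.
$command = buildImportCommand('yourSpaceId', 'master', 'contentful-import.json');

echo $command, "\n";

// To run it from PHP rather than copying it into a terminal, uncomment:
// passthru($command, $exitCode);
```

Printing the command first and running it by hand is a sensible default while you are still testing against a sandbox environment.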

Given that you could also use the Drupal node IDs as the unique identifier in the JSON import file, a new entry would be created for each node, and there should not be any conflicts. We can also reimport the data if needed, as an additional import based on the same IDs overwrites the existing content.

Considerations 

The content migration in this example was only concerned with text from a blog. But what happens when you’re working with more than just text? 

The Drupal Body and Summary fields, plus many other Drupal fields, may contain HTML and possibly even CSS. Unfortunately, migrating content from free-form text fields to Contentful is not straightforward since Contentful uses a Rich Text Editor that stores the data in a JSON object.

But not to worry! A viable solution is to migrate your HTML to Contentful by using Turndown. This migration tool converts HTML to markdown, and then the next step would be to convert the markdown to Rich Text using a markdown converter.

Wrapping up and lessons learned

And there you have it! A successful content migration from Drupal 7 to Contentful. Doesn’t that feel so much better now? It's practically a whole new website!

Hopefully along the way you’ll have learned a few new things. Perhaps this exercise would be a good template for another content migration project you have to perform down the line, or you’ve picked up some tips and tricks for web content management in general.

Here’s a summary of some of the lessons we learned while conducting this exercise for ourselves:

  • PHP date formats are not the same as on Contentful.

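    • Datetime format used in PHP for the export: Y-m-d\TH:m:s.ms\Z. In PHP date formats, m is the month and s is the seconds (minutes are i), so the trailing .ms returns 4 digits where Contentful requires 3 digits for milliseconds, and the import fires a validation error. For the same reason, H:m:s puts the month where the minutes belong, so H:i:s is the safer choice.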

    • Quick fix: substr_replace($array["Updated date"],"",23,1)

  • Drupal node IDs are nice but do not conform to the standards of Contentful. It’s highly recommended to install the UUID module to be able to export them (fewer replacement functions if you do it in the code).

  • Carefully consider validations and regular expression validation when setting up your content model. Your migrated content will need to pass the validation for the import.

  • Helpful online tools:

  • json_encode: Make sure the JSON_UNESCAPED_SLASHES flag is set, or else it adds a backslash in front of every forward slash.

  • The file import requires you to have an SSL connection to your remote server (with the test server, we didn’t set this up initially).

  • Make sure to aggregate data in Drupal views to avoid duplicates in the export, which would otherwise trigger 409 errors during the import.

  • CLI: Ensure you have the latest package installed.
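On the date format lesson: as an alternative to trimming the string, the sketch below normalizes a Unix timestamp or a parseable date string into UTC with exactly three millisecond digits. It assumes PHP 7.0 or later (for the v format character), and the toContentfulDate helper is our own name.

```php
<?php
/**
 * Convert a Unix timestamp or date string to ISO 8601 in UTC with
 * exactly three millisecond digits.
 */
function toContentfulDate($value): string
{
    $date = is_numeric($value)
        ? new DateTime('@' . $value) // Unix timestamp, as Drupal stores node dates
        : new DateTime($value);      // any string DateTime can parse
    $date->setTimezone(new DateTimeZone('UTC'));

    // i = minutes, v = milliseconds; T and Z are escaped literals.
    return $date->format('Y-m-d\TH:i:s.v\Z');
}

echo toContentfulDate(1658226600), "\n";
echo toContentfulDate('2022-07-19 12:30:00 +02:00'), "\n";
```

Doing the conversion in the transform script means the Drupal view can export the raw timestamp and no string surgery is needed afterwards.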

The Contentful Professional Services team provides a Content Migration offering as part of a suite of training and consultation products. If you would like some help, don’t hesitate to drop us a line.
