Published on July 10, 2024
Editor’s note: this article originally appeared on the Avatria Insights blog and is republished with permission.
Here at Avatria, we are constantly looking to evolve the technologies we support. As we’ve seen an increasing demand for composable solutions across our implementations, we’ve had to become experts in a number of different platforms that fit within a composable architecture.
Enter Contentful, a composable content platform that allows for a ton of flexibility and freedom in building a scalable content model for your implementation needs. We’ve implemented Contentful as the CMS platform on a number of projects, and are excited to continue this partnership as composable requirements become more and more common.
We recently embarked on a total redesign of our internal site, avatria.com. Given our growing expertise in Contentful and the necessity of building out a number of site components, we decided to use the opportunity to migrate all of our website content from Wordpress into Contentful.
Though there were a few minor hiccups along the way, the overall process was extremely simple. If you’re interested in migrating your website from Wordpress to Contentful, this article will lay out the steps you need to take to ensure your migration process is as smooth as possible.
Wordpress's export functionality offers a quick way to export any content type from your Wordpress site. The steps below will walk through how to extract this data from your Wordpress implementation.
Log in to wp-admin
Navigate to Tools -> Export
Select either “Export all” or one of the content types listed as radio buttons, then click “Download Export File”
The result of these steps will be an XML file that will hold a series of objects from your Wordpress implementation. Below is a snippet from the XML file, where each item represents a blog post (or other content type, if you export more than just blog posts).
The goal for this part of the exercise is to determine how to convert these XML objects from Wordpress into your new content model within Contentful.
If possible, we recommend going through this process once you have a general sense of the design of your website, as it will help ensure that you develop a content model that supports all components and component types.
Because Contentful is extremely flexible, this is primarily a data modeling exercise, in which you should ask the following questions:
What are the key fields from the Wordpress objects that we want to have in our Contentful content model moving forward? What can we exclude?
Examples - title, image, date, author name, author image, slug, rich content
What are fields that we can set to default for each of the objects?
Featured Article
What are fields that can be used by references to the article?
Think of a tile that shows the article & a snippet of the article, providing a link to the article itself
In the case of a listing page of different blog posts that shows these tiles, it would be useful to have a summary, a link, tags for filters, etc.
What other features do you think you could use these objects in?
What needs to be manually authored and cannot be easily migrated?
For us, this was images, for reasons we'll discuss below
The better you can answer these questions, the more comprehensive and well-fitting your data model will be for your intended functionality. Of course there will always be new features and enhancements to implement in the future, but with a bit of forward thinking now, you'll have less work later.
For our purposes, we answered the questions above and came up with the following data model within Contentful for our blog and blog posts.
Note that this is just one part of our broader Contentful data model. Since blog posts are the only thing we needed to migrate from Wordpress, it is the only piece that we will be discussing in this post. We use the Content Page content model to support a number of different types of pages, including our blog posts.
Now that our content model is defined, we can populate it through a script that converts the Wordpress data into our Contentful models.
Contentful offers two main APIs for interacting with data within their platform:
Content Delivery API — a read-only API for retrieving content
Content Management API — create, edit, manage, and publish content
Because the migration needs to create entries rather than just read them, the Content Management API was the one we needed for this exercise.
Contentful has a number of different client libraries in various languages, which we reviewed on their platforms page. For the purpose of our migration, we used the Python SDK client library due to comfort with Python and the quick nature of this integration.
Below is a walkthrough of the script we developed.
The script starts with reading in the XML file using the library BeautifulSoup. This allows us to easily search through the file and grab elements that we need in an easy format.
Thus, we have stored all of our blog post items in an array named items.
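A minimal sketch of this loading step follows. The article's script used BeautifulSoup; this version swaps in the standard library's xml.etree.ElementTree so that it runs without extra packages. The sample XML is an invented two-post stand-in for a real export file, and SAMPLE_EXPORT and load_items are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for a Wordpress export file. A real export carries many
# more elements per <item>.
SAMPLE_EXPORT = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First Post</title>
      <link>https://www.example.com/news-and-insights/first-post/</link>
    </item>
    <item>
      <title>Second Post</title>
      <link>https://www.example.com/news-and-insights/second-post/</link>
    </item>
  </channel>
</rss>
"""


def load_items(xml_text):
    """Return every <item> element in the export as a list."""
    root = ET.fromstring(xml_text)
    return root.findall("./channel/item")


items = load_items(SAMPLE_EXPORT)
print(len(items), "items found")
```

For a real export you would read the downloaded file from disk and pass its contents to load_items in place of the sample string.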
Now we can loop through the items, extracting the information we need from each one:
Note that for some of the fields, such as the tags or the body, it’s not as simple as calling get_text(). Some of this is due to how the content has been encoded, while some depends on how the data is stored in the source system vs. how you want it to be stored in Contentful.
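As a rough illustration of per-item extraction, the sketch below pulls a few fields out of one item, including the tags and the body. It uses the standard library's ElementTree rather than the BeautifulSoup calls from the article, and it assumes the usual layout of a Wordpress export, where tags are category elements with domain="post_tag" and the body lives in content:encoded. The sample item and the extract_post helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used by Wordpress export files.
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample item, trimmed to the fields used below.
SAMPLE_ITEM = """<rss version="2.0"
    xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>First Post</title>
      <link>https://www.example.com/news-and-insights/first-post/</link>
      <dc:creator>jdoe</dc:creator>
      <category domain="post_tag" nicename="cms">CMS</category>
      <category domain="post_tag" nicename="composable">Composable</category>
      <category domain="category" nicename="insights">Insights</category>
      <content:encoded><![CDATA[<p>Hello world</p>]]></content:encoded>
    </item>
  </channel>
</rss>
"""


def extract_post(item):
    """Collect the fields we care about from one <item> element."""
    # Tags and categories share the <category> element; keep only the tags.
    tags = [
        c.text for c in item.findall("category") if c.get("domain") == "post_tag"
    ]
    return {
        "title": item.findtext("title"),
        "link": item.findtext("link"),
        "author": item.findtext("dc:creator", namespaces=NS),
        "tags": tags,
        # The body arrives as an HTML string and still needs converting into
        # whatever shape the target Contentful field expects.
        "body_html": item.findtext("content:encoded", namespaces=NS),
    }


item = ET.fromstring(SAMPLE_ITEM).find("./channel/item")
post = extract_post(item)
print(post["title"], post["tags"])
```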
For example, the blog post link object, obtained via item.find('link').get_text() as shown above, would come back with the following value:
https://www.avatria.com/news-and-insights/appointment-service-design-for-ecommerce-applications/
In our implementation, all we need is the URL suffix of news-and-insights/appointment-service-design-for-ecommerce-applications/. Thus, we will apply the following transformation to this field:
slug = item.find('link').get_text().rsplit("/")[-2]
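It is worth checking exactly what this one-liner returns. Because the URL ends with a slash, splitting on "/" and taking the second-to-last piece yields only the final path segment, without the news-and-insights/ prefix. The sketch below shows that result next to an alternative, based on urllib.parse, that keeps the whole path suffix; which one you want depends on how your front end builds URLs. The variable names are illustrative.

```python
from urllib.parse import urlparse

link = (
    "https://www.avatria.com/news-and-insights/"
    "appointment-service-design-for-ecommerce-applications/"
)

# The one-liner from the article: the last piece after splitting is an empty
# string (the URL ends with "/"), so [-2] is the final path segment.
last_segment = link.rsplit("/")[-2]

# Alternative: keep the full path and drop only the leading slash.
path_suffix = urlparse(link).path.lstrip("/")

print(last_segment)
print(path_suffix)
```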
In determining what additional transformations you need, it's worth experimenting with a manually created object to determine the behavior of each object with the front-end design.
Remember that the more conversion you can do as a part of this step, the less you’ll need to do manually in the future, and the lower your risk of errors in live content. The amount of effort worth putting into this conversion process likely depends on the number of resulting items. If you have 10 resulting objects, a manual update to a field in each is quite a bit less effort than if you have 100 resulting objects.
Recall from our content modeling above that we reference a number of nested items: (1) link, (2) article, (3) page metadata, (4) related insights. All of these objects will live within our Content Page that will ultimately serve as our blog post object.
Before we create them, we need to set up our connection to Contentful via the Contentful Management Python SDK.
from contentful_management import Client

client = Client('<CMA_TOKEN>')
space_id = '<SPACE_ID>'
environment_id = '<ENVIRONMENT_ID>'
You can find these values in the following locations:
CMA_TOKEN: From the Contentful dashboard, navigate to Settings > CMA Tokens and generate one for this exercise.
SPACE_ID: Check the URL while on the Contentful dashboard. It should look something like app.contentful.com/spaces/<space_id>/environments/...
ENVIRONMENT_ID: Select the hamburger menu above the Home button. The environment ID is what the environment alias is pointing to. For example, it should look something similar to main > main-source, where main-source would be your environment ID.
With this, we can use client to execute calls to Contentful to create entries moving forward.
Next, we need to create each of the nested objects so that our blog post item can reference them.
First we define a link object, which will be used for linking to the post from different pages throughout our site.
For a number of these attributes, we default the values to the title to allow for easier searching later on, as well as to allow us to differentiate between each Link item within our Contentful data.
We then call the client.entries method with our space_id and environment_id to create a link entry within our Contentful instance, passing along the attributes that we set above.
We save the entry id attribute from the resulting object created within Contentful to reference later on within the page creation call.
link_id = link_entry.id
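A sketch of what this step might look like is below. The content type ID and field IDs (link, internalName, text, url) are hypothetical placeholders for whatever your own model defines, and the locale is assumed to be en-US. The create call follows the SDK's client.entries(space_id, environment_id).create(entry_id, attributes) form; passing None as the entry ID is assumed here to let Contentful generate one.

```python
LOCALE = "en-US"  # assumed default locale of the space


def build_link_attributes(title, slug):
    """Build the payload for a link entry; every value is keyed by locale."""
    return {
        "content_type_id": "link",  # hypothetical content type ID
        "fields": {
            # Hypothetical field IDs. The title doubles as the internal name
            # so that entries are easy to search for and tell apart.
            "internalName": {LOCALE: title},
            "text": {LOCALE: title},
            "url": {LOCALE: slug},
        },
    }


def create_link_entry(client, space_id, environment_id, title, slug):
    """Create the link entry and return its ID for later references."""
    attributes = build_link_attributes(title, slug)
    # None as the entry ID is an assumption: Contentful then assigns the ID.
    link_entry = client.entries(space_id, environment_id).create(None, attributes)
    return link_entry.id
```

Keeping the payload builder separate from the API call makes it possible to inspect the attributes for one post before sending anything to Contentful.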
Following a similar process to the above, we need to create an insightsArticle object, which includes key aspects like the title, date, author, and a reference to the link object we just created in the previous step:
Once these attributes are set, we can call the client library to create the entry via the Content Management API.
Note we don’t publish this item because these article objects are used on a separate blog listing page that customers use to navigate to individual posts. Since the post itself has not been published yet, we create this object but leave it in Draft status so that the tile does not appear on the listing page.
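The article entry differs from the link entry mainly in that one of its fields points at another entry. In the Content Management API, such a reference is a small sys object of type Link. The sketch below assumes en-US as the locale, uses insightsArticle as the content type ID, and invents the field IDs (title, date, authorName, link); adjust all of these to your own model.

```python
LOCALE = "en-US"  # assumed default locale of the space


def entry_link(entry_id):
    """Reference to another entry, in the shape the Content Management API uses."""
    return {"sys": {"type": "Link", "linkType": "Entry", "id": entry_id}}


def build_article_attributes(title, date, author, link_id):
    """Build the payload for an article entry that references a link entry."""
    return {
        "content_type_id": "insightsArticle",  # assumed content type ID
        "fields": {
            # Hypothetical field IDs.
            "title": {LOCALE: title},
            "date": {LOCALE: date},
            "authorName": {LOCALE: author},
            "link": {LOCALE: entry_link(link_id)},
        },
    }


def create_draft_article(client, space_id, environment_id, **article):
    """Create the article entry without publishing it; return its ID."""
    attributes = build_article_attributes(**article)
    entry = client.entries(space_id, environment_id).create(None, attributes)
    # Deliberately no entry.publish() call, so the entry stays in Draft.
    return entry.id
```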
Creating a metadata object for each page allows you to group certain attributes like keywords, page ID, page description, etc. for each page without having to bloat the actual content within the page attributes themselves.
Thus, we populate this with some general attributes that will fit most of our blog posts, while including the title as the id attribute within this object.
Similar to before, we call the client.entries method to create this object, then store the ID for this metadata object for later use:
One of the components on our blog posts is a small carousel linking to other related blog posts. For the purpose of this migration, we want to set up a new relatedInsights object that will hold these references.
Note that we won’t populate the references themselves, as the related posts are new functionality, and can't be imported from Wordpress. Creating these references will be a content exercise later on. However, setting up the objects now will make this content exercise a lot easier in the future.
Below is the code we use to set the attributes and create this entry within Contentful, saving the resulting entry ID within insights_id for use later on.
Now that we have all of our nested items created, we can finalize the creation of our Content Page.
We first set up some default objects that will be included on every blog post:
Then we set the page attributes based on a combination of our default entry references & the entry references from the previous steps:
With the page attributes set, we can submit the request to Contentful to create our blog post page entry.
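To show how the pieces might come together, the sketch below assembles page attributes from the IDs saved in the earlier steps. Single references and lists of references use the same link shape; a list field simply holds several of them. The content type ID (contentPage) and field IDs are hypothetical, the locale is assumed to be en-US, and the body field is left out because its shape depends on how your model stores rich content.

```python
LOCALE = "en-US"  # assumed default locale of the space


def entry_link(entry_id):
    """Reference to another entry, in the shape the Content Management API uses."""
    return {"sys": {"type": "Link", "linkType": "Entry", "id": entry_id}}


def build_page_attributes(title, slug, metadata_id, article_id, insights_id,
                          default_ids=()):
    """Combine default components and per-post references into one payload."""
    # Default components first, then the related insights carousel.
    components = [entry_link(i) for i in (*default_ids, insights_id)]
    return {
        "content_type_id": "contentPage",  # hypothetical content type ID
        "fields": {
            # Hypothetical field IDs.
            "title": {LOCALE: title},
            "slug": {LOCALE: slug},
            "metadata": {LOCALE: entry_link(metadata_id)},
            "article": {LOCALE: entry_link(article_id)},
            "components": {LOCALE: components},
        },
    }


page = build_page_attributes(
    "My Post", "my-post", "meta1", "art1", "rel1", default_ids=("header1",)
)
print([c["sys"]["id"] for c in page["fields"]["components"][LOCALE]])
```

The resulting dictionary would then be passed to client.entries(space_id, environment_id).create(...) in the same way as the nested entries, again without a publish call.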
We decided not to publish this item automatically for a couple of reasons:
We wanted to ensure the content looked appropriate before making it live on our new site.
There were various content authoring steps (mainly image-based) that our script did not cover that we needed to do before publishing.
This is what a resulting blog post object looks like in practice with the data modeling above.
Shown is the Content Page, which contains all of the objects necessary to render our blog post appropriately.
All steps I listed above were very specific to both (a) what we were exporting from Wordpress and (b) what we’re importing into Contentful.
When you map out your own content authoring solution, the process should follow a similar structure, but the fields, the data model, and the way each field is populated will likely differ.
Given that, there are some things to consider if your implementation looks somewhat similar to ours:
Have an idea of what you’re going to migrate, then try it for one entry.
There are going to be issues. Before implementing the entire process, try the script first on each embedded object and iterate. What’s easy? What’s really difficult? Verify amongst the team as you progress with this single entry.
The structure of entry attributes is a bit wonky within the Contentful Python SDK.
This may be something that is similar in other SDKs, but I found that it’s necessary to mark every field with localization in order to import appropriately via the client.entries(…) call.
This included booleans, urls, etc. This wasn’t explicitly clear in the documentation and took a bit of trial / error & clever googling to find the solution.
Determine the balance of “manual” vs. “scripted” for your implementation.
For example, we decided to manage the migration of image assets manually. Given the relatively small number of posts, and the necessity of steps that would be difficult to automate (such as resizing images for the new design), we determined that the payoff was not worth the effort.
There may be similar content types that require manual authoring, and that’s okay. It’s about the balance of your team capabilities, timelines of go-live, etc.
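The localization point above can be sketched with a small helper that wraps every field value, including booleans and URLs, in a locale-keyed mapping before the payload is sent. The helper name localize, the en-US default, and the field names are assumptions for illustration.

```python
def localize(fields, locale="en-US"):
    """Wrap each field value in a {locale: value} mapping."""
    return {name: {locale: value} for name, value in fields.items()}


# Flat values as they come out of the Wordpress export...
flat_fields = {"title": "My Post", "featured": False, "url": "my-post"}

# ...and the shape expected when creating an entry.
localized_fields = localize(flat_fields)
print(localized_fields)
```

Applying the wrapper in one place keeps the rest of the script working with plain values and makes it harder to forget the locale on a single field.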
As a result of this migration process, our content model has become more dynamic and is flexible enough to support additional features we come up with for avatria.com.
Building our content model within Contentful allows us to have access to our content library through the Contentful API, which makes us much more flexible for other internal marketing material in the future.
If you’re interested in learning more about Avatria or our abilities in CMS-related implementations, feel free to reach out to us!