By Aleks Dubinskiy, on Apr 22, 2020

Bulk operations and programmatic content management: How DFDS lost its head, part two


At DFDS, we embarked on a digital transformation in 2017. We replaced our traditional CMS with Contentful, which we now use to build automation that helps us create and manage content at scale. You can read an introduction to our transformation and learn how content management fits into the headless CMS universe in the first post of this series.

Contentful Management API

In this post, I draw a line from content-related business objectives to code. I explore how a simple business objective can translate into a rather painful manual process, and then demonstrate how it can be delivered with only a few lines of code in the Contentful Content Management API (CMA). Working at a large enough scale, the CMA pays off rather quickly.

So fasten your seatbelts (or better yet, put on your diving gear) for this deep dive into the world of programmatic content management on Contentful.

Case study: Localized URLs

The introduction of localized URLs was an important objective during our transition to Contentful. For example, an English URL https://www.dfds.com/en/freight-shipping corresponds to https://www.dfds.com/de-de/frachtschifffahrt in the German locale.

In order to accomplish this, editors and translators needed to take up to 1,000 pages in English and translate their URLs for all of our 19 European languages (we did not localize URLs for the three Asian languages that we support). The translations were put into spreadsheets and sent to various departments for verification. Then we needed to import those 1,000 localized URLs back into Contentful.

The manual approach to localized URLs 

We initially took a manual approach. For each of the 1,000 rows in a spreadsheet, editors had to:

  1. Select the translated URL.

  2. Find the entry corresponding to the English URL.

  3. Copy/paste the translated URL at the right locale in Contentful.

Assuming an optimistic ten seconds per row, the whole batch added up to about 10,000 seconds, close to three hours of employee time for just one language. If this number doesn’t sound too bad, keep in mind that the process is quite error-prone and difficult to distribute across more than a few people at a time.
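The estimate is easy to reproduce. The short sketch below uses only the figures quoted above; the total for all 19 languages is a simple extrapolation that assumes every language has the full 1,000 rows.

```javascript
// Figures from the text; the 19-language total is an extrapolation that
// assumes the full 1,000 rows for every language.
const rows = 1000;          // URLs per language
const secondsPerRow = 10;   // optimistic manual effort per row
const languages = 19;       // European languages with localized URLs

const secondsPerLanguage = rows * secondsPerRow;
const hoursPerLanguage = secondsPerLanguage / 3600;
const hoursAllLanguages = hoursPerLanguage * languages;

console.log(secondsPerLanguage);            // prints 10000
console.log(hoursPerLanguage.toFixed(1));   // prints 2.8
console.log(hoursAllLanguages.toFixed(1));  // prints 52.8
```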

The automated approach using the CMA

To automate the import of a long list of localized URLs, we converted our spreadsheet to a JSON file that maps entry IDs to localized URLs. We were then able to use the CMA and JavaScript to write code along the lines of the following:

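const fs = require('fs');
const { createClient } = require('contentful-management');

// Set the localized slug on each entry, then update and publish it.
async function ImportURLs(env, localeCode, idToUrl) {
    for (const [id, localizedUrl] of Object.entries(idToUrl)) {
        try {
            let entry = await env.getEntry(id);
            entry.fields['slug'][localeCode] = localizedUrl;
            entry = await entry.update();
            await entry.publish();
        } catch (err) {
            // Log and continue so that one failed entry does not stop the batch.
            console.error(`Failed to import URL for entry ${id}:`, err);
        }
    }
}

// Replace the placeholder strings below with your own values.
const locale = '<your locale code>';
const client = createClient({ accessToken: '<your management token>' });
client.getSpace('<your space id>')
    .then(space => space.getEnvironment('<your environment id>'))
    .then(env => {
        const idsToUrls = JSON.parse(fs.readFileSync('<path to JSON file>', 'utf8'));
        // Return the promise so that failures reach the catch below.
        return ImportURLs(env, locale, idsToUrls);
    })
    .catch(console.error);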

This code takes about 3.5 minutes to execute, which is already close to fifty times faster than the manual approach. On top of that, we could easily reuse the code to import URLs for new languages, preview content before going live and run several iterations per language until everyone was happy with the result.
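The spreadsheet-to-JSON step can be scripted as well. The following is a minimal sketch, not the conversion used at DFDS. It assumes the spreadsheet was exported as CSV with a header row and two columns (entry ID, localized URL); the function name, the column layout, the sample IDs and the second sample slug are all made up for illustration.

```javascript
// Hypothetical input: a CSV export with a header row and two columns,
// "entryId,localizedUrl". Quoted fields and embedded commas are not handled.
function csvToIdUrlMap(csvText) {
    const lines = csvText.split(/\r?\n/).filter(line => line.trim() !== '');
    const idToUrl = {};
    for (const line of lines.slice(1)) {   // skip the header row
        const [id, url] = line.split(',').map(cell => cell.trim());
        if (!id || !url) continue;         // ignore incomplete rows
        idToUrl[id] = url;
    }
    return idToUrl;
}

// Made-up sample data: the entry IDs are placeholders.
const sample = 'entryId,localizedUrl\nabc123,frachtschifffahrt\ndef456,passagierfaehren\n';
console.log(JSON.stringify(csvToIdUrlMap(sample), null, 2));
```

The resulting object has the same shape as the ID-to-URL mapping that the import code reads from the JSON file. A real export with quoted cells would need a proper CSV parser.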

Several other automation use cases

The above example was just one of our first exposures to the power of the CMA. Soon after, we expanded the code base to take care of several other content-management problems.

In-text URL search and replace

In addition to URL localization, there were also in-text references that needed to be localized. As we had up to 1,000 URLs per locale and even more entries referencing those URLs, we benefited from automating this lookup, search and replace process.
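The core of such a search and replace can be sketched as a pure function over a text field. This is an illustration under assumptions rather than the code used at DFDS: the function names and the shape of the URL map (English URL to localized URL) are hypothetical.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace every occurrence of a known English URL with its localized
// counterpart. Longer URLs are tried first so that a URL which is a prefix
// of another known URL does not clobber the longer match.
function localizeUrlsInText(text, urlMap) {
    const sources = Object.keys(urlMap).sort((a, b) => b.length - a.length);
    if (sources.length === 0) return text;
    const pattern = new RegExp(sources.map(escapeRegExp).join('|'), 'g');
    return text.replace(pattern, match => urlMap[match]);
}

const urlMap = {
    'https://www.dfds.com/en/freight-shipping': 'https://www.dfds.com/de-de/frachtschifffahrt',
};
const body = 'See [freight](https://www.dfds.com/en/freight-shipping) for details.';
console.log(localizeUrlsInText(body, urlMap));
// prints: See [freight](https://www.dfds.com/de-de/frachtschifffahrt) for details.
```

Note that a URL in the text which extends a known URL with an unknown path suffix would still have its known prefix replaced, so a production version would likely need stricter boundary checks.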

Eliminate “dead” content

As our content evolved, we naturally retired or temporarily took down many pages. This built up a whole collection of entries that were most likely “dead,” meaning they would never be used again. However, they continued to show up in searches, slowing down editors. A manual, gradual cleanup would have been possible, but it would have been painful and inconsistent, and it would have required additional governance to make sure everyone on the project worked together efficiently.

After identifying the criteria for what we thought constituted dead content, we were able to archive 10-15% of our content. 
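The actual criteria are not spelled out here, so the sketch below uses purely hypothetical ones: an entry counts as dead if it is neither published nor archived and has not been updated for a given number of days. It relies on the `sys.publishedVersion`, `sys.archivedVersion` and `sys.updatedAt` properties that the CMA exposes on entries; the function name and the threshold are assumptions.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical "dead content" rule: unpublished, not yet archived, and
// untouched for more than maxIdleDays.
function isDeadEntry(entry, now, maxIdleDays) {
    const { publishedVersion, archivedVersion, updatedAt } = entry.sys;
    if (publishedVersion !== undefined) return false;  // still published
    if (archivedVersion !== undefined) return false;   // already archived
    const idleMs = now.getTime() - new Date(updatedAt).getTime();
    return idleMs > maxIdleDays * DAY_MS;
}

// Plain objects standing in for CMA entries; the IDs are made up.
const now = new Date('2020-04-22T00:00:00Z');
const entries = [
    { sys: { id: 'live', publishedVersion: 7, updatedAt: '2018-01-01T00:00:00Z' } },
    { sys: { id: 'stale-draft', updatedAt: '2018-01-01T00:00:00Z' } },
    { sys: { id: 'fresh-draft', updatedAt: '2020-04-01T00:00:00Z' } },
];
const deadIds = entries.filter(e => isDeadEntry(e, now, 365)).map(e => e.sys.id);
console.log(deadIds); // prints [ 'stale-draft' ]
```

With real entries fetched through the CMA, each match could then be archived with `entry.archive()`, ideally after a dry run that only lists the candidates for editors to review.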

Import translations from a third-party translation vendor

When working with our third-party translation vendor, VistaTec, we received our translations either as spreadsheets or JSON files. This included over 50,000 entries that needed to be transferred from the spreadsheets into Contentful. The manual workload for this extremely tedious, error-prone and repetitive exercise would have added up to a total of two human-years by conservative estimates.

We were able to build a translation system on top of the CMA, which helped us automate common translation tasks. It ended up taking about four months of development and a whopping 24 hours of execution time for the entire volume of content.

This effort has been a big investment with a sizable payoff, the details of which I will cover in the next post of the series.
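To give a flavor of what such an import involves, here is a minimal sketch and not the DFDS translation system. It assumes each translated row carries a hypothetical shape of `{ entryId, field, locale, text }`, groups the rows per entry so that each entry needs only one update, and writes the values into an object shaped like `entry.fields` in the CMA (`fields[fieldName][localeCode]`). The IDs, field names and locale codes are made up.

```javascript
// Group translation rows by entry so each entry is updated only once.
// Hypothetical row shape: { entryId, field, locale, text }.
function groupTranslationsByEntry(rows) {
    const byEntry = new Map();
    for (const { entryId, field, locale, text } of rows) {
        if (!byEntry.has(entryId)) byEntry.set(entryId, []);
        byEntry.get(entryId).push({ field, locale, text });
    }
    return byEntry;
}

// Write translations into a fields object shaped like entry.fields in the
// CMA: fields[fieldName][localeCode] = value. Mutates and returns `fields`.
function applyTranslations(fields, translations) {
    for (const { field, locale, text } of translations) {
        if (!fields[field]) fields[field] = {};
        fields[field][locale] = text;
    }
    return fields;
}

const rows = [
    { entryId: 'abc123', field: 'title', locale: 'de-DE', text: 'Frachtschifffahrt' },
    { entryId: 'abc123', field: 'slug', locale: 'de-DE', text: 'frachtschifffahrt' },
    { entryId: 'def456', field: 'title', locale: 'de-DE', text: 'Passagierfähren' },
];
const grouped = groupTranslationsByEntry(rows);
const fields = applyTranslations({ title: { 'en': 'Freight shipping' } }, grouped.get('abc123'));
console.log(JSON.stringify(fields, null, 2));
```

In a real import, the updated entry would then be saved and published in the same way as in the URL import shown earlier.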

The motivation for a programmatic approach

To most developers, the idea that content should be managed programmatically seems fairly obvious. However, before a problem lands on a developer’s desk, there needs to be an understanding between business stakeholders and content creators about what is possible and what the return on investment is. At DFDS, there was no previous experience with this kind of approach, since in our old traditional CMS, content was “organically grown” with no automation. Contentful’s CMA provided us with a toolbox for solving a wide range of content management problems.

Conclusion: quicker time to market and fewer headaches

The Contentful CMA proved to be a valuable tool for bulk content management operations, allowing us to translate real business objectives into programmatic solutions that scale well. Of our well over 100,000 entries, more than 66% were imported with the help of CMA-based scripts. This approach required some investment, but in the end it provided benefits in several areas.

The first and most obvious benefit was a dramatic improvement in the time to market for content-related tasks. Not only were we able to do bulk operations on content more quickly, but certain tasks that were previously prohibitively expensive became achievable with automation. The second benefit was a lower error rate, since performing thousands of repetitive actions by hand leads to fatigue and mistakes. Finally, the third benefit was being able to do more iterations on content with our content editors, because certain parts of the process were quick and streamlined.

One of the most important areas of content management at DFDS has been working with translations and automating translation workflows. Since this is a rather big topic for many companies, I will get into the details of that in the next post. 

Aleks Dubinskiy

Software developer at DFDS, technical writer, system analyst and future trends enthusiast.
