Defensive design and content model validation

Published on June 24, 2020

Humans get tired. We get distracted. We make mistakes. Even at peak mental capacity, we sometimes fail to communicate what we expect from each other. Foolproofing a complex process like content production might seem daunting, but a practice known as defensive design can guide us one step closer.

Preparing for entropy

The first principle of defensive design is to prepare for all eventualities so that you can guard against misuse of the final product. In the context of content production, you aim to maximize the quality and consistency of content.

In software engineering, defensive design is often referred to as defensive programming. The core idea of defensive programming is that any piece of functionality can only be used explicitly for its intended purpose. Everything that is not allowed is actively denied. Incidentally, that's exactly how managing user and group privileges works in Contentful.

Content modeling, on the other hand, follows the opposite paradigm. By default, everything is allowed within the confines of a field type unless explicitly prevented. While an unrestricted approach enables rapid prototyping, it can hurt data quality and the content lifecycle if proper validations are not implemented.

Content models are the foundation of any Contentful-based application. As with any structure, a poor foundation makes for a poor structure. Data quality degradation and content entropy compound over time unless kept in check.

In the simplest, most practical terms, defensive programming means donning one's proverbial tinfoil hat and expecting the worst possible outcome from each and every user input, almost as if it were a deliberate, malicious attempt to abuse the system.

It’s easier to lift overly strict validation rules than to compensate for unwanted results in content production by adding more validation rules afterwards. That’s not to say you shouldn’t add rules later when the need arises. I’m just saying that it’s usually more convenient (and cost-effective) to come prepared from day one.

Currency of the twenty-first century 

Content is the currency of the twenty-first century — just ask Google or Facebook. Quality, not just quantity, matters. So what exactly makes high-quality content from the developer’s perspective? Let me give you a few examples.

Integrity

Do all entries consistently include a value for the same field? Do semantically related fields consistently include values for each instance? For example, if an address is present, zip code, state and city should be present too.
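
As a rough illustration of the address example, the check below flags entries where only some of a group of related fields are filled in. It's a minimal sketch: the field names and the dictionary standing in for an entry are hypothetical, not Contentful's data format.

```python
from typing import Any, List, Mapping, Sequence

# Hypothetical field IDs for a group of semantically related fields.
ADDRESS_FIELDS = ("street", "zipCode", "state", "city")


def missing_related_fields(
    entry: Mapping[str, Any], group: Sequence[str] = ADDRESS_FIELDS
) -> List[str]:
    """Return the fields still missing when a group is only partly filled in.

    The result is empty when the group is either completely absent
    or completely present.
    """
    present = [name for name in group if entry.get(name) not in (None, "")]
    if not present or len(present) == len(group):
        return []
    return [name for name in group if name not in present]
```

An entry with only a street and a city would report the zip code and state as missing, while an entry with no address at all passes.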

Choosing the right editor interfaces contributes to building a content-editing experience that is both user-friendly and defensive.

State is a good example of when to use a predefined list instead of free text. In any given country, states (or comparable regions) usually number in the dozens and change infrequently. Having content creators choose from a predefined list removes the possibility of a typo and does not require them to touch the keyboard.
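
A list of allowed values boils down to a rule that rejects everything else. The sketch below assumes a rule shaped like {"in": [...]}, which is how I'd expect an "accept only specified values" validation to be represented; the evaluator function and the shortened list of states are illustrative only.

```python
from typing import Any, Mapping

# Shortened, illustrative list; a real model would include every state.
US_STATES = ["CA", "NY", "TX", "WA"]

# Assumed rule shape for an "accept only specified values" validation.
state_validation = {"in": US_STATES}


def passes_in_validation(value: Any, rule: Mapping[str, Any]) -> bool:
    """Return True only if the value is one of the rule's allowed values."""
    return value in rule["in"]
```

The comparison is exact, so a misspelling or a lowercase "ca" is rejected rather than silently stored.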

Consistency

Do field values follow the same pattern across all entries? For example, are all phone numbers formatted with an area code? Pro-Tip™: Use regular expressions.
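
Here's a minimal sketch of such a pattern. The format "(555) 123-4567" is an assumption chosen for illustration; adapt the expression to whatever convention your team agrees on.

```python
import re

# Assumed house format for illustration: "(555) 123-4567".
PHONE_PATTERN = re.compile(r"^\(\d{3}\) \d{3}-\d{4}$")


def is_valid_phone(value: str) -> bool:
    """Return True when the whole value matches the expected format."""
    return PHONE_PATTERN.fullmatch(value) is not None
```

A number without an area code, or one written with different separators, fails the check.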

The absence of a value should be consistent as well. NULL and an empty string are not equal. Neither are a missing value and the boolean value false, a distinction that's easy to miss with Contentful's boolean field type.
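
The distinction is easy to lose in code that relies on truthiness, where a missing value, an empty string and false all behave alike. The helpers below are a sketch of one way to keep them apart; the function names are hypothetical.

```python
from typing import Optional


def normalize_text(value: Optional[str]) -> Optional[str]:
    """Map empty or whitespace-only strings to None so 'no value' has one form."""
    if value is None or value.strip() == "":
        return None
    return value


def describe_boolean_field(value: Optional[bool]) -> str:
    """Distinguish a boolean field that was never set from an explicit False."""
    if value is None:
        return "not set"
    return "yes" if value else "no"
```

Checking for None explicitly, rather than writing "if not value", is what keeps an unanswered boolean from being read as a "no".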

Reusability and portability

Ideally, data should require as little processing and enriching as possible when transferred to another system or platform. However, it's difficult to predict what kind of datasets a future system or platform will need.

Structured data vocabularies such as Schema.org provide a great foundation for designing standardized, structured datasets. Want to design a content model for a corporation? Try using the Corporation schema as a starting point. As an added bonus, the schema also defines how the HTML or JSON-LD should be formatted on the front end, so bots can better understand your site semantically.
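
To make that concrete, the sketch below turns an entry into JSON-LD for the Corporation type from Schema.org. The entry's field names are hypothetical; the "@type" values and property names are the Schema.org ones.

```python
import json
from typing import Any, Dict, Mapping


def corporation_json_ld(entry: Mapping[str, Any]) -> str:
    """Serialize a hypothetical entry as Schema.org Corporation JSON-LD."""
    document: Dict[str, Any] = {
        "@context": "https://schema.org",
        "@type": "Corporation",
        "name": entry["name"],
        "url": entry["url"],
        "address": {
            "@type": "PostalAddress",
            "streetAddress": entry["street"],
            "addressLocality": entry["city"],
            "addressRegion": entry["state"],
            "postalCode": entry["zipCode"],
        },
    }
    return json.dumps(document, indent=2)
```

When the content model's fields already line up with the schema's properties, the mapping stays this trivial, which is the point of starting from the schema.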

Accessibility, interoperability and performance

Is the data available via standardized interfaces like a REST API or GraphQL? How many requests and additional operations must the user perform in order to retrieve the desired dataset? How quickly is the desired dataset constructed and retrieved?

The days of exporting and importing large data dumps on scheduled intervals have long passed. Modern microservice architecture is all about JIT (Just In Time): getting the data you need — only the data you need — as fast as possible and exactly when you need it.
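
With a REST API, asking for only the data you need usually means naming the fields in the request. The sketch below builds such a URL without sending it. The endpoint layout and the "select" and "content_type" parameters follow Contentful's Content Delivery API as I understand it, so verify them against the current documentation; the space ID, content type and field names are placeholders, and authentication is left out.

```python
from typing import Sequence
from urllib.parse import urlencode


def build_entries_url(
    space_id: str,
    content_type: str,
    fields: Sequence[str],
    environment: str = "master",
) -> str:
    """Build a delivery URL that requests only the named fields."""
    params = {
        "content_type": content_type,
        # Ask for specific fields instead of whole entries.
        "select": ",".join(f"fields.{name}" for name in fields),
    }
    base = (
        f"https://cdn.contentful.com/spaces/{space_id}"
        f"/environments/{environment}/entries"
    )
    # Keep commas unescaped so the field list stays readable.
    return f"{base}?{urlencode(params, safe=',')}"
```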

Murphy’s Law

As a software developer, I’ve spent over a decade designing content models, APIs, integrations and migrations for a broad range of industries. In every case, the applications accept user input, i.e. content created by a human being.

I’ve consequently become a firm believer in Murphy’s Law: if content creators are able to input invalid, incomplete or otherwise subpar content, they will. That’s just human nature.

Leading by example

Teams responsible for creating content don’t always share the same standards. It’s helpful when developers inform the content creators of what kind of input is expected from them.

Help text, found in the Appearance tab of the field's settings, is the easiest way to offer instructions. Help texts and validation rules complement each other: the former provides an example of the desired input, while the latter steps in when the content creator fails to emulate said example.

I also recommend providing custom error messages. Even with watertight validations, the lack of help texts and informative error messages can create a poor user experience.
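
The pairing can be sketched as follows. The dictionary is a hypothetical stand-in for a field's settings, not Contentful's actual format, and the five-digit ZIP code rule is just an example.

```python
import re
from typing import Mapping, Optional

# Hypothetical field settings: the help text shows the expected input,
# the message explains what went wrong when the input doesn't match.
ZIP_CODE_FIELD = {
    "help_text": "Five-digit ZIP code, for example 90210.",
    "pattern": r"\d{5}",
    "message": "ZIP code must be exactly five digits, for example 90210.",
}


def validation_error(
    value: str, field: Mapping[str, str] = ZIP_CODE_FIELD
) -> Optional[str]:
    """Return the field's custom error message, or None if the value is valid."""
    if re.fullmatch(field["pattern"], value):
        return None
    return field["message"]
```

A message that repeats the expected format tells the content creator how to fix the input, which a generic "invalid value" never does.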

Communication is key

Content modeling should ideally be an iterative and inclusive process. An ongoing dialogue between the content creators and the developers helps everyone understand each other's needs and workflows.

If a content model doesn't provide the fields needed to achieve the desired function, content creators tend to get creative and adapt any available field for a purpose the developer never intended. Likewise, if content creators perceive content models as rigid and immutable, they'll settle for a quick workaround and not say anything. The developer is left unaware of the content creator’s needs, even though tools like environments and the Migration CLI would have made testing and implementing content model changes a breeze.

Defensive design should never come at the expense of the user experience. As far as helping and informing goes, developers should strive to be the content creator’s best friend.
