When we launched the Contentful App Framework in February, you probably saw the various apps we made available in our marketplace, including Netlify, Optimizely and Jira. They are ready-to-use add-ons for your Contentful spaces that integrate the content editing experience with third-party tools.
But the main goal of our work on the App Framework is to create an open platform. We believe that we, as engineers at Contentful, shouldn’t be privileged over developers working for our customers. An external developer should be able to build any app that was part of the release using only public APIs, libraries and SDKs.
In this article, we’ll dive into the architecture and deployment of the AI Image Tagging app. After reading, you’ll be able to create your own version of the app or use it as the basis for a similar use case.
The AI Image Tagging app uses AI to automatically assign tags to images. The tags are visible and accessible through the Contentful web app entry editor, and tags are searchable in the search bar.
The app packages:
An API that accepts images and returns tags
An installation screen that creates a content type used for storing image metadata
An editorial widget that calls the API and updates image metadata with tags
Now, on to how we built it. This article will give you an architectural overview of the app. The full code is available on GitHub in our open-source repository.
At Contentful, we’re big believers in the “API-first” approach. It means that we build features first as API features and everything else follows: user interfaces, CLIs and SDKs.
We need an API that, given an image stored in Contentful, will return a list of labels describing the content of the image. Of course, the hardest part is to perform detection, but thankfully we don’t need to implement it on our own. Instead we can use AWS Rekognition, an AI-backed image recognition service.
The DetectLabels API method requires a base64-encoded binary of the image. We could transfer the whole image to our endpoint, but that would make calling the endpoint harder, especially from a browser.
To be smarter, we can inspect the payload of the Contentful Asset entity:
As you can see, the file.url property of the asset contains a URL. Its general format is:
So instead of sending the whole image to our API, we can send just the four identifiers listed above and fetch the image directly from the Images API in our service. The general flow looks like this:
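As a rough illustration of that flow, here is a minimal sketch. The host name, the segment names (spaceId, assetId, token, fileName) and the helpers fetchImage and detectLabels are our assumptions for the example, not names taken from the app's code. The two helpers are injected so the flow can be exercised without network access or AWS credentials.

```javascript
// Hypothetical sketch: the host and segment names below are assumptions.
const IMAGES_API_HOST = 'images.ctfassets.net';

// Split an asset's file.url (for example "//host/a/b/c/name.jpg") into four path segments.
function parseAssetUrl(fileUrl) {
  // The base URL only supplies a protocol for protocol-relative URLs.
  const url = new URL(fileUrl, 'https://placeholder.invalid');
  const segments = url.pathname.split('/').filter(Boolean);
  if (segments.length !== 4) {
    throw new Error(`Expected 4 path segments, got ${segments.length}`);
  }
  const [spaceId, assetId, token, fileName] = segments.map(decodeURIComponent);
  return { spaceId, assetId, token, fileName };
}

// Rebuild the image URL on the server from the four identifiers.
function buildImageUrl({ spaceId, assetId, token, fileName }) {
  const path = [spaceId, assetId, token, fileName]
    .map((segment) => encodeURIComponent(segment))
    .join('/');
  return `https://${IMAGES_API_HOST}/${path}`;
}

// Fetch the image, hand its bytes to the detection service, return label names.
// fetchImage(url) should resolve to the image bytes; detectLabels(bytes) should
// resolve to an array of objects with a Name property (both are stand-ins).
async function tagAsset(ids, { fetchImage, detectLabels }) {
  const bytes = await fetchImage(buildImageUrl(ids));
  const labels = await detectLabels(bytes);
  return labels.map((label) => label.Name);
}
```

In a real service, fetchImage would perform an HTTP request and detectLabels would call the image recognition service. Keeping them as parameters makes the flow easy to test with stubs.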
We created our API as a regular Express server. Thanks to this, you can run it locally and test it as you would test any other Node.js service.
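To make this concrete, here is a sketch of what an Express-style route handler for such an API could look like. The route shape, the parameter names and the injected tagAsset function are hypothetical. The handler only relies on req.params, res.status() and res.json(), so it can be tested with plain objects.

```javascript
// Build an Express-style handler. `tagAsset` is injected (hypothetical helper)
// so the route can be tested without calling any external service.
function createTagsHandler(tagAsset) {
  return async function tagsHandler(req, res) {
    const { spaceId, assetId, token, fileName } = req.params;
    if (![spaceId, assetId, token, fileName].every(Boolean)) {
      return res.status(400).json({ message: 'Missing asset identifiers' });
    }
    try {
      const tags = await tagAsset({ spaceId, assetId, token, fileName });
      return res.status(200).json({ tags });
    } catch (err) {
      // Do not leak internal error details to the caller.
      return res.status(500).json({ message: 'Tagging failed' });
    }
  };
}

// With Express installed, the handler could be mounted like this (assumed route):
//   const app = require('express')();
//   app.get('/tags/:spaceId/:assetId/:token/:fileName', createTagsHandler(tagAsset));
//   module.exports = app;
```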
But to make this API available to users, we need to deploy it to the public internet. This is where AWS Lambda comes in handy. Using a serverless technology to deploy our services reduces both maintenance overhead (no servers to manage) and cost (you pay only for real usage, not servers idling).
Thanks to AWS Serverless Express, any Express app can be wrapped to serve as a handler for a Lambda function.
With our server exported from app.js, we need less than 10 lines of code:
exports.handler is now ready to accept requests from API Gateway while running in the AWS Lambda environment.
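To show what "accepting requests from API Gateway" involves, here is a simplified stand-in for the translation a wrapper performs. It is not the AWS Serverless Express implementation. It assumes the REST proxy integration event format (httpMethod, path, queryStringParameters, body) and a hypothetical route function that returns a status and a payload.

```javascript
// Simplified illustration only: translate an API Gateway proxy event into a
// plain request object, run a route function, and shape the Lambda response.
function createLambdaHandler(route) {
  return async function handler(event) {
    const request = {
      method: event.httpMethod,
      path: event.path,
      // API Gateway sends null when there is no query string.
      query: event.queryStringParameters || {},
      body: event.body,
    };
    const { status, payload } = await route(request);
    return {
      statusCode: status,
      headers: { 'Content-Type': 'application/json' },
      // The proxy integration expects the body as a string.
      body: JSON.stringify(payload),
    };
  };
}
```

A real wrapper also forwards headers, handles binary bodies and feeds the request through the full Express middleware stack, which is why using an existing library is preferable to hand-rolling this.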
The last step is to deploy our service to AWS. One of the popular options is Serverless Framework, which takes care of packaging and creation of required resources. You only need to create a declarative configuration file, serverless.yml, to allow your service to call DetectLabels and route all the incoming traffic to our Express app:
Once we’re ready with the configuration, type sls deploy and see your API go live!
So far so good: we can call our public API to tag any of our images stored in Contentful. To expose this functionality to editors in spaces, we need to create a user interface. With some help from our team’s designer, we prepared the following screens:
Space administrators use the installation screen to enable AI tagging in the entry editor. Tags are stored in a field of a special content type that references an image and stores extra metadata, including the tags. The installation screen asks for a name and an ID for the content type and creates it automatically during the installation.
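A content type like the one described could be defined roughly as follows. This is a sketch: the field IDs (title, image, imageTags) and the helper name are our own choices for illustration, and the step that submits the definition to Contentful is left out.

```javascript
// Build a content type definition that references an image and stores its tags.
// Field IDs and names are hypothetical; only `name` and `id` come from the
// installation screen.
function buildImageContentType({ name, id }) {
  if (!name || !id) {
    throw new Error('Both a name and an ID are required');
  }
  return {
    sys: { id },
    name,
    displayField: 'title',
    fields: [
      { id: 'title', name: 'Title', type: 'Symbol', required: true },
      // A link to the asset being described.
      { id: 'image', name: 'Image', type: 'Link', linkType: 'Asset', required: true },
      // A list of short strings, one per tag.
      { id: 'imageTags', name: 'Image tags', type: 'Array', items: { type: 'Symbol' } },
    ],
  };
}
```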
Once the installation process is completed, editors can use auto-tagging in the entry editor. Clicking the “Auto-tag” button calls the API we created and populates the tag input with the returned tags. Editors should also be able to add tags manually.
To implement the frontend for our app, we use the SDK for getting the value of an asset that a user wants to tag. By adding Forma 36 to the mix, we can produce native-like user interfaces that are consistent with the look and feel of Contentful.
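The “Auto-tag” behavior can be sketched as follows. We assume a field object exposing getValue() and setValue(), as the SDK's field API does, and a hypothetical fetchTags function that calls the tagging API. Merging suggested tags with the ones already entered, rather than replacing them, is a design choice made for this example.

```javascript
// Combine manually entered tags with suggested ones, dropping duplicates
// (case-insensitive) and empty strings while preserving order.
function mergeTags(existing, suggested) {
  const seen = new Set();
  const merged = [];
  for (const tag of [...(existing || []), ...(suggested || [])]) {
    const trimmed = tag.trim();
    const key = trimmed.toLowerCase();
    if (key && !seen.has(key)) {
      seen.add(key);
      merged.push(trimmed);
    }
  }
  return merged;
}

// Handler for the "Auto-tag" button: fetch suggestions and update the field.
// `fetchTags` is a hypothetical function resolving to an array of strings.
async function autoTag(field, fetchTags) {
  const suggested = await fetchTags();
  const merged = mergeTags(field.getValue(), suggested);
  await field.setValue(merged);
  return merged;
}
```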
Because our AI Image Tagging app consists of two views, we need to provide components for two “locations”. A location represents a place in the Contentful web app that can be controlled by an app:
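Choosing which view to render for the current location might look like the sketch below. The location identifiers are written out as local constants standing in for the ones the SDK exports, and the view names are hypothetical. The only SDK call assumed is sdk.location.is().

```javascript
// Local stand-ins for the location identifiers exported by the SDK (assumed values).
const LOCATION_APP_CONFIG = 'app-config';
const LOCATION_ENTRY_FIELD = 'entry-field';

// Decide which of the app's two views to render for the current location.
// The returned names are placeholders for real UI components.
function pickView(sdk) {
  if (sdk.location.is(LOCATION_APP_CONFIG)) {
    return 'InstallationScreen';
  }
  if (sdk.location.is(LOCATION_ENTRY_FIELD)) {
    return 'TaggingWidget';
  }
  throw new Error('This app does not implement the current location');
}
```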
Once our frontend is ready, we need to make it publicly available again. Because we’re using Express, we can use the
Voilà! Now both the API and the frontend are served by our Express app, which we can deploy to AWS Lambda.
With both components deployed to AWS Lambda, the last step is to create an AppDefinition. An app definition is a Contentful entity that makes the custom app we’ve created available for installation in all space environments of an organization.
In the app management view, we can name the app, provide the URL of its deployment to AWS Lambda and select the locations it implements. In our case, we implemented the app configuration screen and an entry field widget operating on lists of short strings (the list of tags, provided either manually or by the AI auto-tagging API).
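For readers who prefer to see the same information as data, the settings above could be expressed roughly like this. The property names reflect our understanding of the app definition format and should be treated as assumptions. The app name and URL in the usage note are placeholders.

```javascript
// Sketch of an app definition payload: a name, the URL where the app is
// hosted, and the locations it implements. Property names are assumptions.
function buildAppDefinition({ name, src }) {
  // Throws a TypeError if `src` is not a valid absolute URL.
  const url = new URL(src);
  return {
    name,
    src: url.toString(),
    locations: [
      // The installation (configuration) screen.
      { location: 'app-config' },
      // The entry field widget, limited to lists of short strings.
      {
        location: 'entry-field',
        fieldTypes: [{ type: 'Array', items: { type: 'Symbol' } }],
      },
    ],
  };
}
```

For example, buildAppDefinition({ name: 'My tagging app', src: 'https://example.com/app' }) returns an object describing both locations for a placeholder deployment URL.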
In this article, we’ve presented the real-life process of architecting and building one of the 30+ integrations now available in the Contentful Marketplace. You can start using it right away here in the web app.
If you’re interested in all the technical details of implementation or want to fork the app and tweak it to fit your needs, there’s good news: all the code is open sourced on GitHub!