Since Alaska Dispatch News launched Google AMP, we’ve run into one issue that keeps popping up: journalists. AMP is quite restrictive, and sometimes human error can cause a document to become invalid. Our two biggest examples are malformed URLs and content pasted from another source, which often carries additional attributes that AMP doesn’t allow. While we’ve put in a number of restrictions on how content is filtered through to the AMP site, there’s only so much we can do before human intervention is required to solve the issue. But how do you know there’s an issue? Google Webmaster Tools reports AMP errors whenever it crawls the site, but that isn’t instant and not everyone has access to it. By the time you’re aware of a problem, you might have already missed the traffic spike the article could have produced. To make sure that all of our articles reach their full potential, we decided to create a Slack bot using Python.
Cloudflare & Chartbeat
Alaska Dispatch News uses a product called Chartbeat to monitor real-time site analytics on a day-to-day basis. It reports current visitors, their traffic source, distribution, you name it. The nice thing about Chartbeat is that it has an API method to fetch your highest-performing content. We already use this method to power other things, such as our “Most-Read” widget, so this is what we decided to use to power our bot.
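As a rough illustration of that step, here is a minimal sketch of fetching the top pages. The endpoint, the query parameter names, and the shape of the response (a "pages" list whose items carry a "path") are assumptions, so check them against your own Chartbeat account. The function names are made up for this example, and it sticks to the standard library to stay dependency-free, although requests would do the same job.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and parameter names; verify against the Chartbeat docs.
TOP_PAGES_ENDPOINT = "https://api.chartbeat.com/live/toppages/v3/"


def build_top_pages_url(api_key, host, limit=10):
    """Build the request URL for the top-pages query."""
    query = urlencode({"apikey": api_key, "host": host, "limit": limit})
    return f"{TOP_PAGES_ENDPOINT}?{query}"


def extract_paths(payload):
    """Pull page paths out of a response assumed to look like
    {"pages": [{"path": "..."}, ...]}; entries without a path are skipped."""
    return [page["path"] for page in payload.get("pages", []) if page.get("path")]


def fetch_top_paths(api_key, host, limit=10):
    """Fetch the current top pages and return their paths (network call)."""
    url = build_top_pages_url(api_key, host, limit)
    with urlopen(url, timeout=10) as response:
        return extract_paths(json.load(response))
```

Keeping the URL building and response parsing separate from the network call makes the pieces easy to check without hitting the API.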
During Google’s AMPConf it was announced that Cloudflare would be providing a free AMP validator API, which you can cURL with your AMP document path to see whether it validates. If it doesn’t, the API returns a piece of JSON containing the reason why. This is perfect for the bot, so once we have our URL we send it to the validate function, which does just that.
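A validate function along these lines could be sketched as follows. The endpoint (the document URL, minus its scheme, appended to amp.cloudflare.com/q/) and the response keys "valid" and "errors" are assumptions based on how the validator was described at launch, and the helper names are illustrative only.

```python
import json
from urllib.request import urlopen

# Assumed endpoint: the document URL, without its scheme, is appended to it.
VALIDATOR_ENDPOINT = "https://amp.cloudflare.com/q/"


def validator_url(document_url):
    """Strip the scheme from the AMP document URL and append it to the endpoint."""
    without_scheme = document_url.split("://", 1)[-1]
    return VALIDATOR_ENDPOINT + without_scheme


def summarize(result):
    """Return (is_valid, errors) from a response assumed to carry
    a boolean "valid" key and, on failure, an "errors" list."""
    return bool(result.get("valid")), result.get("errors", [])


def validate(document_url):
    """Ask the validator about one AMP document (network call)."""
    with urlopen(validator_url(document_url), timeout=10) as response:
        return summarize(json.load(response))
```

Calling validate() on an AMP article URL would then give back a flag plus the list of reasons, ready to be passed on to Slack.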
If the document fails validation, I use the slackweb package by Satoshi Innami to send a payload to Slack via a webhook with the information returned from the Cloudflare API. The message includes the error description, code, line and source. We also didn’t want warnings appearing, so we filter them out here using an if statement.
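To make that concrete, here is a hedged sketch of the reporting step. The error keys ("code", "line", "error") and especially the "severity" field used to spot warnings are assumptions about the validator's JSON, and the function names are invented for this example. The slackweb calls used, Slack(url=...) and notify(text=...), are the package's basic webhook interface.

```python
def is_warning(error):
    """Hypothetical check: assumes warnings are flagged by a "severity" field."""
    return str(error.get("severity", "")).upper() == "WARNING"


def build_failure_messages(source, errors):
    """Turn validator errors into one Slack message per error, skipping warnings."""
    messages = []
    for error in errors:
        if is_warning(error):
            continue  # warnings are deliberately not reported
        messages.append(
            f"AMP validation failed for {source}\n"
            f"{error.get('code', 'UNKNOWN')} (line {error.get('line', '?')}): "
            f"{error.get('error', 'no description')}"
        )
    return messages


def send_to_slack(webhook_url, messages):
    """Post each message to a Slack incoming webhook using slackweb."""
    import slackweb  # imported lazily so the formatting helper has no dependencies

    slack = slackweb.Slack(url=webhook_url)
    for text in messages:
        slack.notify(text=text)
```

Building the message text separately from sending it means the formatting can be checked without posting anything to a real channel.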
We wanted the bot to send a confirmation message if all of the articles it tested passed validation. To do this I set up counter variables (including passes) near the top of the script and used functions (including failed()) to increment them. Once each article has finished validating, I call a confirmation function which tallies up the results. If there are no errors it sends a confirmation message to Slack; if there are a large number of errors (4 or more) it sends an alert to our Developer user group on Slack, as it’s likely that something more serious is causing a large number of articles to become invalid.
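The tallying logic can be sketched like this. It is an illustration under assumptions rather than the bot's actual code: it bundles the counters into a small class instead of module-level variables, and the names Tally, errors, passed and confirmation are hypothetical. Only the threshold of 4 comes from the description above.

```python
from dataclasses import dataclass

DEVELOPER_ALERT_THRESHOLD = 4  # "4 or more" errors triggers the developer alert


@dataclass
class Tally:
    """Running counts of articles that passed or failed validation."""
    passes: int = 0
    errors: int = 0

    def passed(self):
        self.passes += 1

    def failed(self):
        self.errors += 1


def confirmation(tally, threshold=DEVELOPER_ALERT_THRESHOLD):
    """Return the summary message to send once every article has been checked,
    or None when there are some failures but too few to alert developers."""
    if tally.errors == 0:
        return f"All {tally.passes} articles passed AMP validation."
    if tally.errors >= threshold:
        return f"{tally.errors} articles failed AMP validation; alerting developers."
    return None
```

In the middle case (one to three failures) nothing extra is sent, since each failing article has already produced its own message.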
And with that, the ‘Google AMP Validator Cat’ bot was born. It’s set up to run hourly on a scheduler via Heroku, but can also be run locally.
I created this service with inspiration from the talk that Natalia Baltazar from The Guardian gave at AMPConf; it’s certainly worth checking out. I also used the following Python packages: requests and slackweb. It’s been a long time since I’ve written anything in Python, so I’m quite pleased with how useful this bot has already become, considering how simple it was to make.