Fetching Authenticated API Data with GitHub Actions
Overview
I've been running a lot lately. I'm training for my first-ever marathon, and so far it's been tough, but I'm proud of how far I've come in such a short period of time. It was only in October 2019 that I ran my first half marathon in Baltimore, and before then I didn't run or work out at all.
I track all of my runs on a website called Strava and thought it would be cool to show my latest run on my website. To make an API request to Strava you must provide an access token with the request. These access tokens expire every six hours, and a new one can be requested using the oauth/token endpoint. This is not ideal for me for a few reasons:
- I can't expose my access token, as it would allow anyone to make authorized requests on my account.
- Requesting a new token requires even more credentials, such as a client ID and secret.
- This page is hosted on GitHub Pages, which doesn't have a backend, so I can't proxy the API request at all.
I needed to find a way to securely request this data. Sure, I could create an AWS Lambda function or something similar, but that seemed like more hassle than it was worth.
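To make the expiry problem concrete, here is a minimal sketch of why a token baked into a static page goes stale. The `TokenResponse` shape, the `expires_at` field, and the `isExpired` helper are assumptions for illustration, not code from the action.

```typescript
// Assumed subset of the token response; only these two fields are used here.
interface TokenResponse {
  access_token: string
  expires_at: number // assumed: expiry time in seconds since the Unix epoch
}

const SIX_HOURS_IN_SECONDS = 6 * 60 * 60

// True once the current time has reached the token's expiry time.
function isExpired(token: TokenResponse, nowInSeconds: number): boolean {
  return nowInSeconds >= token.expires_at
}

// Placeholder values: a token issued at t = 0 that lasts six hours.
const token: TokenResponse = {
  access_token: 'placeholder',
  expires_at: SIX_HOURS_IN_SECONDS,
}

console.log(isExpired(token, 60 * 60)) // false: one hour in
console.log(isExpired(token, 7 * 60 * 60)) // true: seven hours in
```

Anything deployed once and left alone would be past that cutoff within the day, which is why the token has to be requested again by something that can keep the credentials private.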
Enter GitHub Actions
I already have some familiarity with GitHub Actions and thought this project would be a great use case for them, as I didn't care how up to date the data is. If it's a couple of hours behind, that's no big deal.
GitHub Actions, like other CI tools, lets you run code on any event that GitHub supports, such as pushes, pull requests, tags, and even a schedule. I thought a great way to retrieve this data would be to run a job every day that saves the running data as a .json file. I could then combine this workflow with the GitHub Pages Deploy Action that I built to push that data into my base branch. Once the base branch changes, the page that ingests the .json file is rebuilt automatically.
Building the Action
The action itself is quite simple. It makes a POST request for the token, followed by a GET request for the running data. At its core, the TypeScript action code looks something like the following.
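import { getInput } from '@actions/core'
import { promises as fs } from 'fs'

// Only the token field this example relies on.
interface Tokens {
  access_token: string
}

// Note: fetch is assumed to be available globally (for example, Node 18+).

// Retrieves the tokens for the data request.
export async function retrieveTokens(): Promise<Tokens> {
  // CONFIGURATION arrives as a JSON string from the workflow's `with` block.
  const configuration = JSON.parse(getInput('CONFIGURATION'))

  // fetch expects the body as a string, so serialize it if one was provided.
  if (configuration.body) {
    configuration.body = JSON.stringify(configuration.body)
  }

  const response = await fetch(
    'https://www.strava.com/api/v3/oauth/token',
    configuration,
  )

  return (await response.json()) as Tokens
}

// Retrieves the data and saves it to disk.
export async function retrieveData(): Promise<object> {
  try {
    const tokens = await retrieveTokens()
    const response = await fetch(
      'https://www.strava.com/api/v3/athlete/activities',
      {
        headers: {
          Authorization: `Bearer ${tokens.access_token}`,
        },
      },
    )

    const data = (await response.json()) as object

    // writeFile needs a string and an existing directory.
    await fs.mkdir('fetch-api-data-action', { recursive: true })
    await fs.writeFile(
      'fetch-api-data-action/data.json',
      JSON.stringify(data),
      'utf8',
    )

    return data
  } catch (error) {
    throw new Error(`There was an error fetching from the API: ${error}`)
  }
}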
The call to getInput retrieves data passed through the with parameter when the action is invoked in a workflow. These configurations are great because you can store secrets in the repository's Settings/Secrets menu and substitute them into a string with the ${{ secrets.secret_name }} syntax. GitHub will automatically mask these values if they show up in your logs.
name: Refresh Feed
on:
  schedule:
    - cron: 25 17 * * 0-6
jobs:
  refresh-feed:
    runs-on: ubuntu-latest
    steps:
      - name: Fetch API Data 📦
        uses: JamesIves/fetch-api-data-action@releases/v1
        with:
          CONFIGURATION: '{ "method": "POST", "body": {"client_id": "${{ secrets.client_id }}", "client_secret": "${{ secrets.client_secret }}"} }'
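For a sense of how a with value reaches the action code: the runner exposes each input as an environment variable named INPUT_ followed by the upper-cased input name, and getInput reads it from there. The sketch below mimics that lookup with a hypothetical readInput helper so it can run outside a workflow. It is a stand-in for illustration, not the toolkit's source, and the sample values are made up.

```typescript
// Hypothetical stand-in for getInput from @actions/core, assuming the runner
// passes `with:` values as INPUT_<NAME> environment variables.
type Env = Record<string, string | undefined>

function readInput(name: string, env: Env): string {
  const key = `INPUT_${name.replace(/ /g, '_').toUpperCase()}`
  return (env[key] || '').trim()
}

// Example values only; in a real run the secrets are substituted before
// the action ever sees the string.
const sampleEnv: Env = {
  INPUT_CONFIGURATION: '{ "method": "POST", "body": {"client_id": "abc"} }',
}

const configuration = JSON.parse(readInput('configuration', sampleEnv))
console.log(configuration.method) // POST
```

Because the input is always a string, structured options like the request configuration have to be passed as JSON and parsed inside the action.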
With that said and done I was able to retrieve my running data on a scheduled job without exposing my credentials. But why stop there?
Making the Action Reusable
With some minor refactoring I figured this action could be used to retrieve data from just about any REST API. To achieve that, I needed to work out the following:
- What endpoint does the user want to retrieve the data from?
- Do they need to make a token request first like the Strava API? If so how do I know what information from that request is needed in the data request?
I made some changes to my workflow configuration. The TOKEN_ENDPOINT request with its accompanying configuration is now made first (if specified), followed by the ENDPOINT data request. The only thing left to figure out is how to get the access token into the athlete/activities configuration.
name: Refresh Feed
on:
  schedule:
    - cron: 25 17 * * 0-6
jobs:
  refresh-feed:
    runs-on: ubuntu-latest
    steps:
      - name: Fetch API Data 📦
        uses: JamesIves/fetch-api-data-action@releases/v1
        with:
          TOKEN_ENDPOINT: https://www.strava.com/api/v3/oauth/token
          TOKEN_CONFIGURATION: '{ "method": "POST", "body": {"client_id": "${{ secrets.client_id }}", "client_secret": "${{ secrets.client_secret }}"} }'
          ENDPOINT: https://www.strava.com/api/v3/athlete/activities
          CONFIGURATION: '...'
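The ordering described above could be sketched roughly as follows. The Settings interface, the run function, and the injected Fetcher type are hypothetical names chosen for this illustration (the request function is passed in so the flow can be exercised without a network), and the example.com URLs are placeholders.

```typescript
// Hypothetical abstraction over fetch so the ordering can be tested offline.
type Fetcher = (url: string, configuration: string) => Promise<unknown>

interface Settings {
  tokenEndpoint?: string
  tokenConfiguration?: string
  endpoint: string
  configuration: string
}

async function run(
  settings: Settings,
  request: Fetcher,
): Promise<{ tokens: unknown; data: unknown }> {
  let tokens: unknown = {}

  // The token request only happens when a token endpoint was provided.
  if (settings.tokenEndpoint) {
    tokens = await request(
      settings.tokenEndpoint,
      settings.tokenConfiguration || '{}',
    )
  }

  // The data request always runs, and always after the token request.
  const data = await request(settings.endpoint, settings.configuration)
  return { tokens, data }
}

// Fake request function that records the order of calls.
const calls: string[] = []
const fakeRequest: Fetcher = async (url) => {
  calls.push(url)
  return { from: url }
}

run(
  {
    tokenEndpoint: 'https://example.com/token',
    tokenConfiguration: '{ "method": "POST" }',
    endpoint: 'https://example.com/data',
    configuration: '{ "method": "GET" }',
  },
  fakeRequest,
).then(() => console.log(calls))
```

Keeping the token step optional means the same code path serves APIs that need no authentication round trip at all.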
To use keys from the first request in the second, I decided to use mustache.js to replace strings wrapped in triple curly braces. Triple braces insert the value without the HTML escaping that double braces apply, so the token arrives unchanged. In my workflow configuration I can now reference keys from the token response to populate the data request.
import { render } from 'mustache'

const tokens = {
  data: {
    access_token: '123',
  },
}

const formatted = render(
  '{ "method": "GET", "headers": {"Authorization": "Bearer {{{ data.access_token }}}"} }',
  tokens,
)

console.log(formatted)
// '{ "method": "GET", "headers": {"Authorization": "Bearer 123"} }'
With this setup the workflow can look something like this:
name: Refresh Feed
on:
  schedule:
    - cron: 25 17 * * 0-6
jobs:
  refresh-feed:
    runs-on: ubuntu-latest
    steps:
      - name: Fetch API Data 📦
        uses: JamesIves/fetch-api-data-action@releases/v1
        with:
          # The token endpoint is requested first. This retrieves the access token for the other endpoint.
          TOKEN_ENDPOINT: https://www.strava.com/api/v3/oauth/token
          # The configuration contains secrets held in the Settings/Secrets menu of the repository.
          TOKEN_CONFIGURATION: '{ "method": "POST", "body": {"client_id": "${{ secrets.client_id }}", "client_secret": "${{ secrets.client_secret }}"} }'
          # Once the token request has completed, this endpoint is requested.
          ENDPOINT: https://www.strava.com/api/v3/athlete/activities
          # The bearer token here is returned from the TOKEN_ENDPOINT call. The returned data looks like so: {data: {access_token: '123'}}, meaning it can be accessed using the triple curly brace syntax.
          CONFIGURATION: '{ "method": "GET", "headers": {"Authorization": "Bearer {{{ data.access_token }}}"} }'
Looking good!
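Once the template is rendered, the resulting string still has to be parsed before it can be handed to a request as options. The sketch below shows that render-then-parse step. To stay self-contained it uses a small hypothetical renderTriple helper that only understands triple-brace dot paths; it is a stand-in for mustache's render, not a replacement for it.

```typescript
// Hypothetical stand-in for mustache's render, handling only {{{ a.b }}} paths.
function renderTriple(template: string, view: Record<string, unknown>): string {
  return template.replace(
    /\{\{\{\s*([\w.]+)\s*\}\}\}/g,
    (_match: string, path: string) => {
      // Walk the dot-separated path through the view object.
      const value = path
        .split('.')
        .reduce<unknown>(
          (current, key) =>
            current !== null && typeof current === 'object'
              ? (current as Record<string, unknown>)[key]
              : undefined,
          view,
        )
      // Missing keys render as an empty string.
      return value === undefined || value === null ? '' : String(value)
    },
  )
}

const tokenResponse = { data: { access_token: '123' } }
const template =
  '{ "method": "GET", "headers": {"Authorization": "Bearer {{{ data.access_token }}}"} }'

// Render first, then parse, so the result can be used as request options.
const requestOptions = JSON.parse(renderTriple(template, tokenResponse))
console.log(requestOptions.headers.Authorization) // Bearer 123
```

Rendering before parsing keeps the workflow input a plain string, which is all a with parameter can carry.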
Using the Action
To use the action on my own site, all I needed to do was create a feed.yml file in the .github/workflows directory. This workflow is set up to download the data from the Strava API on a schedule, and then it uses the GitHub Pages Deploy action to push the data into the correct folder in my repository.
name: Refresh Feed
on:
  schedule:
    - cron: 10 15 * * 0-6
jobs:
  refresh-feed:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout 🛎️
        uses: actions/checkout@v2
        with:
          persist-credentials: false
      - name: Fetch API Data 📦
        uses: JamesIves/fetch-api-data-action@releases/v1
        with:
          TOKEN_ENDPOINT: https://www.strava.com/api/v3/oauth/token
          TOKEN_CONFIGURATION: '{ "method": "POST", "body": {"client_id": "${{ secrets.client_id }}", "client_secret": "${{ secrets.client_secret }}"} }'
          ENDPOINT: https://www.strava.com/api/v3/athlete/activities
          CONFIGURATION: '{ "method": "GET", "headers": {"Authorization": "Bearer {{{ data.access_token }}}"} }'
          SAVE_NAME: strava
      - name: Build and Deploy 🚀
        uses: JamesIves/github-pages-deploy-action@releases/v3
        with:
          ACCESS_TOKEN: ${{ secrets.ACCESS_TOKEN }}
          BRANCH: master # Pushes the updates to the master branch.
          FOLDER: fetch-api-data-action # The location of the data.json file saved by the Fetch API Data action.
          TARGET_FOLDER: src/data # Saves the data into the 'data' directory on the master branch.
I have another workflow set up called build.yml that builds and deploys my page whenever a push is made to the master branch. This means that whenever the feed.yml workflow finishes running, its push triggers the build workflow, making the updates automatic.
name: Build and Deploy
on:
  push:
    branches:
      - master
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout 🛎️
        uses: actions/checkout@v2
        with:
          persist-credentials: false
      - name: Install 🔧
        run: |
          npm install
          npm run-script build
      - name: Build and Deploy 🚀
        uses: JamesIves/github-pages-deploy-action@releases/v3
        with:
          ACCESS_TOKEN: ${{ secrets.ACCESS_TOKEN }}
          BRANCH: gh-pages
          FOLDER: build
I can then use the data on my website by simply importing it. Depending on your project structure you may need to use Webpack or something similar to load the .json file type.
import data from './images/blog/2020-03-07-fetching-authenticated-api-data/data/strava.json'
const latestRun = data.filter((item) => item.type === 'Run')[0]
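A slightly more defensive version of that lookup might look like the sketch below, which handles a feed with no runs in it. The Activity interface lists only the fields used here, and the field meanings (distance in meters, newest activity first) are assumptions about the response rather than guarantees. The sample array stands in for the imported strava.json.

```typescript
// Assumed subset of an activity; only the fields used below are listed.
interface Activity {
  name: string
  type: string
  distance: number // assumed to be meters
}

// Returns the first run in the list, or undefined when there is none.
// Assumes the list is ordered newest first.
function findLatestRun(activities: Activity[]): Activity | undefined {
  return activities.find((item) => item.type === 'Run')
}

// Builds a short display string, with a fallback for an empty result.
function describeRun(run: Activity | undefined): string {
  if (!run) {
    return 'No runs yet'
  }
  const kilometers = (run.distance / 1000).toFixed(1)
  return `${run.name}: ${kilometers} km`
}

// Made-up sample data in place of the imported file.
const sample: Activity[] = [
  { name: 'Evening Ride', type: 'Ride', distance: 20000 },
  { name: 'Morning Run', type: 'Run', distance: 10000 },
]

console.log(describeRun(findLatestRun(sample))) // Morning Run: 10.0 km
```

Using find stops at the first match instead of filtering the whole list, and the undefined branch keeps the page from breaking when no run has been recorded.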
Hackathon
This was my entry to the 2020 GitHub Actions Hackathon, and it's available for you to use on the GitHub Marketplace. I had a ton of fun making this and I hope other people find it equally useful. If you encounter any problems using it, please create an issue.
Edit: Since the creation of this post I completed my first marathon!