How to Hide API Keys, Credentials, & Authentication Tokens on GitHub

With the rise of open source, more and more public repositories are being hosted on GitHub. Back in 2018, GitHub celebrated 100 million live repositories, and that number has only grown since. With such easy access to version control and open source, it’s important to make sure sensitive credentials and authentication tokens aren’t exposed to the public.

Exposed Credentials 

Let’s say I’m writing an application that takes advantage of data from an API call. For example, I could be targeting weather data from OpenWeatherMap:

GET "https://api.openweathermap.org/data/2.5/weather?&id=5128581&appid={YOUR API KEY}"

As this API call is prepared, it’s not uncommon to store your API key (your secret) in a variable in the same file. After all, it’s quick and easy during testing. 

Here’s an example using Python and a fake API key, though the approach in this post applies to any language:

# app.py
api_key = "12abc3d45ef6789012345g6789h0ij12"  # secret hardcoded in the source file
city_id = "5128581"

base_url = "https://api.openweathermap.org/data/2.5/weather?"
final_url = base_url + "appid=" + api_key + "&id=" + city_id

So, what’s the issue here? The problem is that when this code is pushed to a public GitHub repo, it’s now exposing the secret API key to the world. This is your private access token and should never be exposed outside of privileged users in your organization. 

Some additional issues caused by this exposure: 

  • You may be in breach of your license agreement with the API vendor 
  • Users of the stolen key may make rapid API calls outside the scope of your license, causing the API vendor to throttle your requests 
  • Most importantly, if the API key gives access to sensitive data (say, a cloud storage account) then you’ve effectively given the bad guys the keys to a data breach 

So, as you can see, it’s very important to make sure credentials, access tokens, API keys, etc. are all secured in code before being pushed to GitHub (or any other public-facing version control or code repo). 

Securing Credentials 

How can we solve this problem? The solution is simple: store API keys (and other sensitive credentials) in a config file that other code files import when necessary, and have version control ignore that file (e.g., with .gitignore). This allows your application to function as expected, while preventing any sensitive credentials from being pushed to GitHub.

Revisiting our earlier example, we can create a second code file named config.py and import it in any code that needs access to our API key:

# config.py 
api_key = "12abc3d45ef6789012345g6789h0ij12"
 
# app.py 
import config
 
city_id = "5128581"
 
base_url = "https://api.openweathermap.org/data/2.5/weather?" 
final_url = base_url + "appid=" + config.api_key + "&id=" + city_id

The change to app.py is the import of config and the use of config.api_key, which references the API key from config.py rather than assigning it to a variable directly in app.py.
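Because config.py is meant to stay out of version control, a fresh clone of the repo will not contain it, and a plain import of config will fail there. The following is a minimal sketch of one way to make that failure easier to diagnose. The helper names load_api_key and build_final_url are hypothetical and not part of the original example; the URL is built with the same string concatenation used above.

```python
import importlib

BASE_URL = "https://api.openweathermap.org/data/2.5/weather?"


def load_api_key(module_name="config"):
    """Return api_key from the untracked config module, or raise a clear error.

    load_api_key is a hypothetical helper, shown only for illustration.
    """
    try:
        config = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise RuntimeError(
            f"{module_name}.py was not found. Create it locally with an "
            "api_key variable; it is excluded from version control on purpose."
        ) from exc

    api_key = getattr(config, "api_key", "")
    if not api_key:
        raise RuntimeError(f"{module_name}.py does not define a non-empty api_key.")
    return api_key


def build_final_url(api_key, city_id):
    """Assemble the request URL the same way app.py does."""
    return BASE_URL + "appid=" + api_key + "&id=" + city_id
```

In app.py this could be used as build_final_url(load_api_key(), "5128581"), so a missing or empty config.py produces an explicit message instead of a bare import error.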

Now we can prevent config.py from being pushed to GitHub by adding it to our .gitignore file:

# .gitignore 
config.py 

When we push to GitHub, only app.py and .gitignore will be uploaded to the public repo. config.py, and all sensitive information contained in it, will not end up on GitHub. If sensitive data has been pushed to GitHub in the past, then GitHub has a guide for removing that data from a repository. 
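Before pushing, it can be worth confirming that the ignore rule actually matches. A minimal sketch, assuming git is installed and on the PATH: it calls the git check-ignore command, which exits with status 0 when a path is ignored and 1 when it is not. The function name is_ignored is hypothetical.

```python
import subprocess


def is_ignored(path, repo_dir="."):
    """Return True if git's ignore rules exclude `path` inside `repo_dir`.

    is_ignored is a hypothetical helper; it assumes git is available.
    """
    result = subprocess.run(
        ["git", "check-ignore", "--quiet", path],
        cwd=repo_dir,
        capture_output=True,
    )
    # Exit status 0 means ignored, 1 means not ignored; anything else is an error.
    if result.returncode not in (0, 1):
        raise RuntimeError(
            f"git check-ignore failed: {result.stderr.decode().strip()}"
        )
    return result.returncode == 0
```

With the .gitignore shown above, is_ignored("config.py") would be expected to return True and is_ignored("app.py") to return False.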

This is a simple solution, but a powerful best practice any time you’re dealing with credentials. In fact, I like to always apply this principle and never assign secure credentials to variables unless the file is excluded from version control. 
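One way to hold yourself to that principle is a quick check for credentials assigned directly to variables in files that will be committed. The sketch below is a rough heuristic only, not a substitute for a dedicated secret scanner: it flags variables whose names contain api_key, secret, token, or password and that are assigned a string literal. The name list and the function find_hardcoded_secrets are assumptions made for illustration.

```python
import re

# Matches lines such as: api_key = "abc123"
# The list of suspicious name fragments is an illustrative assumption.
SECRET_ASSIGNMENT = re.compile(
    r"""^\s*(\w*(?:api_key|secret|token|password)\w*)\s*=\s*["'][^"']+["']""",
    re.IGNORECASE | re.MULTILINE,
)


def find_hardcoded_secrets(source):
    """Return the names of secret-looking variables assigned a string literal."""
    return [match.group(1) for match in SECRET_ASSIGNMENT.finditer(source)]
```

Run against the original app.py, this would flag api_key; run against the revised app.py, which only reads config.api_key, it would flag nothing.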

Depending on your organizational standards, you may also want to apply this to self-hosted version control. In general, it’s a good idea to know exactly where sensitive credentials are stored, and pushing them to version control is often not a secure practice regardless of where version control is hosted (local, cloud, etc.).

How Stealthbits Can Help 

Stealthbits’ StealthAUDIT data access governance solution includes a sensitive data component that helps organizations identify where sensitive data is located, who has access to it, how it’s being accessed, and what they’re doing with it. 

StealthAUDIT includes: 

Host Discovery: Identify the different platforms within the network that may contain various unstructured and structured data repositories to ensure a comprehensive view of your organization’s sensitive data. 

Sensitive Data Discovery: Analyze content for patterns or keywords that match built-in or customized criteria. 

Remediation Actions: Automate all or portions of the tasks you need to perform to remediate sensitive data violations. 

Learn more about Stealthbits’ Data Access Governance here. 
