Create a dictionary of authors and author attributes and values for a journal article using the Scopus API and Python

As an exercise to brush up on my Python skills, I decided to tinker with the Scopus API. Scopus is an online database maintained by Elsevier that records and provides access to information about peer-reviewed publications. Not only does Scopus let users search for journal articles by keyword and various other criteria, but its web services also allow users to explore these articles as networks of articles, authors, institutions, and so forth. If you’re interested in the factors that drive scholarly publications, citations, or impact factors, this is a place to start.

The following code builds a dictionary of author information by requesting content through the abstract retrieval API. The request is made with the Python package requests, and the response is parsed with the package BeautifulSoup. Enjoy!

#### Import Python packages
import requests
from bs4 import BeautifulSoup

#### Set API key
my_api_key = 'YoUr_ApI_kEy'

#### Abstract retrieval API
# API documentation at
# Get article info using unique article ID
eid = '2-s2.0-84899659621'
url = '' + eid

header = {'Accept' : 'application/xml',
          'X-ELS-APIKey' : my_api_key}

resp = requests.get(url, headers=header)

print 'API Response code:', resp.status_code # Any code other than 200 indicates an API response error

# Write response to file
#with open(eid, 'w') as f:
#    f.write(resp.text.encode('utf-8'))

soup = BeautifulSoup(resp.content.decode('utf-8','ignore'), 'lxml')

soup_author_groups = soup.find_all('author-group')

print 'Number of author groups:', len(soup_author_groups)

author_dict = {}

# Traverse author groups
for i in soup_author_groups:

    # Traverse authors within author groups
    for j in i.find_all('author'):

        author_dict.update({j.attrs['auid']:j.attrs}) # Start from the dictionary of tag attributes
        j.contents.pop(-1) # Drop the last child of the author element
        # Traverse author contents within author
        for k in j.contents:

            # Key assumed to be the child's tag name (the key is missing in the source)
            author_dict[j.attrs['auid']].update({k.name : k.contents[0]})
print author_dict
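
To make the shape of the resulting dictionary concrete, here is a minimal offline sketch. It is not the code above: it swaps BeautifulSoup for the standard library's xml.etree.ElementTree so it runs without an API key or extra packages, and it is written for Python 3. The XML fragment is hand-written and simplified (the bibrecord wrapper, the auid values, and the names are all made up, and real Scopus responses are much larger), and the helper name build_author_dict is hypothetical. Unlike the code above, it keeps every child of each author element instead of popping the last one.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment for illustration; not a real Scopus response.
SAMPLE_XML = """
<bibrecord>
  <author-group>
    <author auid="111" seq="1">
      <initials>A.</initials>
      <surname>Example</surname>
    </author>
    <author auid="222" seq="2">
      <initials>B.</initials>
      <surname>Sample</surname>
    </author>
  </author-group>
  <author-group>
    <author auid="333" seq="3">
      <initials>C.</initials>
      <surname>Placeholder</surname>
    </author>
  </author-group>
</bibrecord>
"""


def build_author_dict(xml_text):
    """Map each author's auid to a dict of its attributes and child-tag text."""
    root = ET.fromstring(xml_text.strip())
    authors = {}
    # Traverse author groups, then authors within each group
    for group in root.iter('author-group'):
        for author in group.iter('author'):
            record = dict(author.attrib)  # copy so the parsed tree is not mutated
            for child in author:
                # Keep only children that carry non-empty text
                if child.text and child.text.strip():
                    record[child.tag] = child.text.strip()
            authors[author.attrib['auid']] = record
    return authors


author_dict = build_author_dict(SAMPLE_XML)
for auid in sorted(author_dict):
    print(auid, author_dict[auid]['surname'])
```

Each key is an author ID, and each value merges the tag attributes (auid, seq) with one entry per child tag, which is the same structure the loop above is meant to produce.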

13 thoughts on “Create a dictionary of authors and author attributes and values for a journal article using the Scopus API and Python”

  1. Please note that the author_list variable does not exist.
    Running your code returns nothing, just an empty list of authors.

      • I registered as a developer and got my API key. The only difference is that I didn’t register a new website; I registered a text mining project instead.

  2. Makes sense to start with something you know. Looks like you’re working with the Mears et al. (2014) article. What version of Python are you running? The code was written to work under Python 2.7.6.
