As an exercise to brush up my Python skills, I decided to tinker around with the Scopus API. Scopus is an online database maintained by Elsevier that records and provides access to information about peer-reviewed publications. Not only does Scopus let users search for journal articles by keyword and various other criteria, but its web services also allow users to explore these articles as networks of articles, authors, institutions, and so forth. If you’re interested in the factors that lead to scholarly publications, publication citations, or impact factors, this is a place to start.
The following code yields a dictionary of author information by requesting content through the abstract retrieval API. The request is made using the Python package requests and parsed using the package BeautifulSoup. Enjoy!
#### Import python packages
from __future__ import print_function  # lets the print() calls below run on Python 2.7 and 3
import requests
from bs4 import BeautifulSoup

#### Set API key
my_api_key = 'YoUr_ApI_kEy'

#### Abstract retrieval API
# API documentation at http://api.elsevier.com/documentation/AbstractRetrievalAPI.wadl
# Get article info using unique article ID
eid = '2-s2.0-84899659621'
url = 'http://api.elsevier.com/content/abstract/eid/' + eid
header = {'Accept' : 'application/xml', 'X-ELS-APIKey' : my_api_key}
resp = requests.get(url, headers=header)
print('API Response code:', resp.status_code)  # resp.status_code != 200 i.e. API response error

# Write response to file
#with open(eid, 'w') as f:
#    f.write(resp.text.encode('utf-8'))

soup = BeautifulSoup(resp.content.decode('utf-8','ignore'), 'lxml')
soup_author_groups = soup.find_all('author-group')
print('Number author groups:', len(soup_author_groups))

author_dict = {}
# Traverse author groups
for i in soup_author_groups:
    # Traverse authors within author groups
    for j in i.find_all('author'):
        author_dict.update({j.attrs['auid']:j.attrs})  # Return dictionary of attributes
        j.contents.pop(-1)  # Pop dictionary of attributes
        # Traverse author contents within author
        for k in j.contents:
            author_dict[j.attrs['auid']].update({k.name : k.contents[0]})

print(author_dict)
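If you want to try the dictionary-building logic without an API key, here is a minimal offline sketch. It makes several assumptions: it swaps BeautifulSoup for the standard library's xml.etree.ElementTree so that nothing needs to be installed, the sample XML is a made-up, heavily simplified stand-in rather than real Scopus output, and the function name build_author_dict is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified sample. This is NOT real Scopus output;
# the element names and values are invented for illustration only.
SAMPLE_XML = """
<response>
  <author-group>
    <author auid="111" seq="1">
      <surname>Doe</surname>
      <initials>J.</initials>
    </author>
    <author auid="222" seq="2">
      <surname>Roe</surname>
      <initials>R.</initials>
    </author>
  </author-group>
  <author-group>
    <author auid="333" seq="3">
      <surname>Poe</surname>
      <initials>E.</initials>
    </author>
  </author-group>
</response>
"""


def build_author_dict(xml_text):
    """Map each author's auid to its attributes plus child-element text."""
    root = ET.fromstring(xml_text.strip())
    authors = {}
    # Traverse author groups, then authors within each group
    for group in root.iter('author-group'):
        for author in group.iter('author'):
            record = dict(author.attrib)  # copy, so the parsed tree is not mutated
            # Add the text of each direct child element, skipping empty ones
            for child in author:
                if child.text and child.text.strip():
                    record[child.tag] = child.text.strip()
            authors[author.attrib['auid']] = record
    return authors


author_dict = build_author_dict(SAMPLE_XML)
print(author_dict['111'])
```

Copying the attributes into a new dictionary keeps the parsed tree untouched, so the same response can be traversed again later. Real responses are richer than this sample, so expect to adapt the element names.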
Please note that the author_list variable does not exist.
Running your code returns nothing, just an empty list of authors.
Thanks for the word of caution! That’s weird that the code isn’t working. It works for me. Did you replace the dummy API key with an API key acquired from Scopus?
I registered as a developer and got my API key. The only thing is that I registered it as a text mining project rather than as a new website.
I think I have a problem with my APIkey since I’m always getting the error code
API Response code: 403
It might be the API key or it might be something else. What kind of access do you have to Scopus? To access Scopus through the API, you or the institution you’re connecting from needs to have a paid subscription to the site.
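For what it’s worth, here is a rough sketch of how the response code could be turned into a readable hint before any parsing happens. The helper name explain_status is hypothetical, and the hints are only the generic HTTP meanings of these codes, not anything taken from the Scopus documentation, so treat them as a starting point for debugging.

```python
def explain_status(status_code):
    """Return a short, generic hint for an HTTP status code."""
    # Generic HTTP meanings only; assumed here, not taken from the Scopus docs.
    hints = {
        200: 'OK: the response body can be parsed.',
        401: 'Unauthorized: the API key is missing or was not accepted.',
        403: 'Forbidden: the server refused the request; check the key and your subscription access.',
        404: 'Not found: check the eid and the URL.',
        429: 'Too many requests: a quota or rate limit was hit.',
    }
    return hints.get(status_code, 'Unexpected status code: %d' % status_code)


print(explain_status(403))
```

In the script above, this could be called as explain_status(resp.status_code) right after the request, and the parsing step skipped whenever the code is not 200.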
I have access to Scopus through my University
What’s the eid of the article you want to look up?
This one:
eid = '2-s2.0-84899659621'
It’s actually my own article. I started from something I know in order to understand how it works.
Makes sense to start with something you know. Looks like you’re working with the Mears et al. (2014) article. What version of Python are you running? The code was written to work under Python 2.7.6.
I’m on a Mac using Python 2.7.10.
I don’t think it’s a Python problem; it may be related to the key I generated.
I don’t know how to solve this.
Maybe try getting a new key. When you run the code and it fails does it report an error message? I might be able to help more if I know more about how it’s not working.
Error message 403. I think I’m gonna give up.. :(
Get with the times and download 3.5.1