The Python Requests library allows you to send HTTP/1.1 requests to web servers and work with the responses. With a request you can send form data, headers, multipart files and more. In this blog we explain 6 different ways you can use the Requests library in your project to access the web.
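As a quick taste of how Requests assembles form data and headers, a request can be prepared without actually sending it. A minimal sketch, using a placeholder URL that is not a real endpoint:

```python
import requests

# Prepare (but do not send) a POST request, to inspect what would be transmitted.
# The URL here is only a placeholder.
req = requests.Request(
    'POST',
    'https://example.com/submit',
    headers={'User-Agent': 'my-app/1.0'},
    data={'name': 'alice', 'topic': 'python'},
)
prepared = req.prepare()
print(prepared.method)  # POST
print(prepared.body)    # the form data, URL-encoded: name=alice&topic=python
```

Calling requests.post(...) does the same preparation internally and then sends the result over the wire.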
1. Get HTML code of a website
We can use the Python Requests library to download and save the source files of a website. Keep in mind that we can only download the client-side files of a website, so no, you can't download the whole of Facebook and replicate it on your server.
This feature can come in handy if you are trying to save all the links of a website or want to download all the images on a particular page. A basic example of HTML retrieval is as follows.
# import requests library
import requests
# site whose HTML you want to retrieve
site = "https://skwow.github.io/"
# send the request and receive the response
r = requests.get(site)
# r.text contains the HTML you received from the server;
# now you can store it in a file or process it the way you want
with open("page.html", "w", encoding="utf-8") as f:
    f.write(r.text)
2. Download files
We can download files from the internet using the Requests library. Keep in mind that it only speaks HTTP/HTTPS, so it won't handle FTP URLs. A download can take time depending on the size of the file, so be patient if the script seems to be stuck at the requests.get() call.
# import requests library
import requests
# we are defining a function because we will be using it later as well
def dnload(fileUrl, name):  # takes the link to the file to be downloaded and the name to save it under
    # we send the request and receive the response in the variable file
    file = requests.get(fileUrl)
    # we create and open a file with the specified name and write the response bytes into it
    with open(name, 'wb') as f:
        f.write(file.content)
# now downloading a file is as simple as passing the URL and file name to this function
fileUrl = 'http://speedtest.ftp.otenet.gr/files/test10Mb.db'
dnload(fileUrl, 'test10Mb.db')
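For large files it is better not to hold the whole response in memory at once. A sketch of a chunked variant of the same idea, assuming you want to stream the body to disk (the function name dnload_streamed is our own, not part of Requests):

```python
import requests

def dnload_streamed(fileUrl, name, chunk_size=8192):
    # stream=True tells Requests not to download the body immediately;
    # iter_content() then yields it piece by piece
    with requests.get(fileUrl, stream=True) as r:
        r.raise_for_status()
        with open(name, 'wb') as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
```

For a 10 MB test file the difference is negligible, but for multi-gigabyte downloads streaming keeps memory use flat.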
3. Accessing APIs
The most widely used application of the Requests library is accessing APIs. We can call any available API using Requests. Some examples are:
Google Maps API – search for the details of a place
Weather APIs – ask any weather API for the current temperature or other data
Bitcoin APIs – keep track of the exchange rate of Bitcoin
and this endless list goes on…
We will see how to access a Bitcoin API, because it doesn't require authentication and is very easy.
# as always we start by importing requests
import requests
# send the request and store the response in a variable
r = requests.get("https://api.coindesk.com/v1/bpi/currentprice.json")
# this is a big JSON document; you can print it and find a way to extract the info we want
print("current price of Bitcoin : " + r.json()['bpi']['USD']['rate'])
current price of Bitcoin : 14,264.3038
And that’s the exchange rate at the time of writing this blog. 😉
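Many APIs also take query parameters. Rather than gluing the query string together by hand, you can pass a dict as params and let Requests encode it. A sketch using a made-up endpoint (api.example.com and its parameter names are placeholders, not a real API):

```python
import requests

# The endpoint and parameter names here are placeholders.
req = requests.Request(
    'GET',
    'https://api.example.com/v1/price',
    params={'currency': 'USD', 'format': 'json'},
)
url = req.prepare().url
print(url)  # https://api.example.com/v1/price?currency=USD&format=json
```

With a real API you would simply write requests.get(endpoint, params={...}) and read the response.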
4. Log into website (and extract data)
Using the Requests library we can log into a website directly from our code and perform several tasks, such as retrieving user info, downloading data, etc.
This can come in handy if you are collecting data for machine learning. It is also very useful for automation, for example fetching posts from another website so you can tweak and republish them (do make sure you have permission to reuse the content).
# import requests library
import requests
# send a request to the website with login credentials and store the response in a variable
r = requests.get('https://api.github.com/user', auth=('username', 'password'))
# retrieve the info you want; here we extract the profile pic URL of the user
avatarUrl = r.json()['avatar_url']
# then download the profile pic using the dnload function we created in part 2
dnload(avatarUrl, 'avatar.jpg')
And the pic is saved in the same directory as this script.
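The GitHub API accepts credentials through the auth argument, but many ordinary websites use a form-based login instead. For those, a requests.Session keeps cookies across requests, so logging in once carries over to later calls. A minimal sketch, assuming a /login path and 'username'/'password' field names, which you must check against the site's actual login form:

```python
import requests

def login(base_url, username, password):
    # the /login path and the field names 'username'/'password' are assumptions;
    # inspect the real login form to find the correct ones
    session = requests.Session()
    session.post(base_url + '/login',
                 data={'username': username, 'password': password})
    # the session now holds whatever cookies the site set at login,
    # so later session.get() calls are made as the logged-in user
    return session
```

After login, something like session.get(base_url + '/profile') would fetch pages only visible to the logged-in user.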
5. Diagnose problems with your website
We can easily see what's wrong with a particular website by typing its URL into a web browser, but sometimes we want to automate things. What if we want to perform a certain task automatically whenever something goes wrong with our website?
The status_code attribute of a response gives the HTTP status of a website. Here is an extensive list of all the HTTP status codes: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
We will now see a couple of examples of these HTTP codes.
# import requests library as always
import requests
# sites to check
sites = ["https://www.google.com",
         "https://www.google.com/thisnofcmspkgnskfs",
         "https://api.github.com/user"]
# loop through the sites and print their status codes
for site in sites:
    print(site + " : " + str(requests.get(site).status_code))
https://www.google.com : 200
https://www.google.com/thisnofcmspkgnskfs : 404
https://api.github.com/user : 401
200 means the site is OK
404 means the page was not found
401 means the request was unauthorized
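Instead of comparing status codes by hand, raise_for_status() turns any 4xx/5xx response into an exception, which makes the "do something automatically when the site breaks" case easy to write. A sketch (the helper name check_site is our own):

```python
import requests

def check_site(url):
    try:
        requests.get(url, timeout=5).raise_for_status()
        return 'OK'
    except requests.exceptions.HTTPError as err:
        # the server answered, but with a 4xx/5xx status
        return 'HTTP error: ' + str(err)
    except requests.exceptions.RequestException as err:
        # DNS failures, timeouts, refused connections, etc.
        return 'connection problem: ' + str(err)
```

Note that a nonexistent domain never returns a status code at all; it raises a ConnectionError, which the second except branch catches.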
6. Web scraping
Though there are excellent tools for web scraping, like Beautiful Soup and Scrapy, we can also scrape the web using Requests alone. However, this requires more effort, since you have to download the HTML and parse it yourself (if you choose not to use any other library). Here we will use lxml to parse the HTML response of the request.
# import the requests library and lxml's HTML parser
import requests
import lxml.html as parseHTML
# lxml.html parses the response of the request we sent to the website
dom = parseHTML.fromstring(requests.get('http://www.google.co.in').content)
# we save all the links from the HTML except for the ones pointing to the website itself
links = [x for x in dom.xpath('//a/@href') if '//' in x and 'www.google.com' not in x and 'www.google.co.in' not in x]
# and finally we print them
for link in links:
    print(link)
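Scraped hrefs are often relative paths ('/search?q=…'), which are not usable on their own. The standard library's urljoin resolves them against the page URL; this part needs no network at all:

```python
from urllib.parse import urljoin

base = 'http://www.google.co.in'
hrefs = ['/intl/en/about.html', 'https://mail.google.com/', '#top']
# urljoin leaves absolute URLs alone and resolves relative ones against base
absolute = [urljoin(base, h) for h in hrefs]
print(absolute)
```

Combined with the dnload function from part 2, this is enough to turn scraped image paths into downloadable URLs.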