
Python and Covid-19



Learning: Using BeautifulSoup in Python
Subject: Grab Data from a Website Using Python

In this article we will use Python code to fetch some data from a website. The data we are looking for is about Covid-19. We could use an API for this, but here I am using a web crawler built with BeautifulSoup. So let’s start.

Introduction: First we need a web page to use as a data source. A web search turns up many such pages; I selected this one (en.m.wikipedia.org/wiki/2019%E2%80%9320_coronavirus_pandemic_by_country_and_territory), Source: wikipedia.org. You need a basic knowledge of HTML to understand the tags. After reading the page's source code we find the following:
1. The HTML code structure contains 9 tables.
2. The data about Covid-19 is in table No. 2.
3. The table consists of several ‘tr’ tags. In HTML, a ‘tr’ tag is a table row.
4. Each ‘tr’ consists of several ‘td’ tags. In HTML, a ‘td’ tag is a data cell within the row (one cell per column).
5. Each ‘td’ may contain other tags or information, and some of that information is what we are looking for. We will talk about this later, below.
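
The structure described in the list above can be explored on a small scale. The following is a minimal sketch, assuming a made-up two-table HTML snippet (not the real Wikipedia page) and Python's built-in 'html.parser' so that no extra parser needs to be installed. The names sample_html and rows_as_text are only for illustration.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the real page: two tables, the data in the second.
sample_html = """
<table><tr><td>navigation</td></tr></table>
<table>
  <tr><th>Country</th><th>Cases</th><th>Deaths</th></tr>
  <tr><th><a title="Country A">A</a></th><td>100</td><td>5</td></tr>
  <tr><th><a title="Country B">B</a></th><td>40</td><td>1</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, 'html.parser')

tables = soup.find_all('table')      # every <table> in the document
data_table = tables[1]               # table No. 2 (index 1)
trs = data_table.find_all('tr')      # each <tr> is one row


def rows_as_text(rows):
    """Return the stripped text of every <td> cell, row by row."""
    return [[td.text.strip() for td in tr.find_all('td')] for tr in rows]


print(len(tables), 'tables,', len(trs), 'rows in table No. 2')
print(rows_as_text(trs))
```

The header row contains only 'th' cells, so it yields an empty list here. This is why code that reads 'td' cells usually skips the first rows of a table.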

Now we will talk about our Python application. It will display a menu on the screen and ask the user to select from it, and based on the selection we will run a function to do the work. In our case the menu has four choices:
1. The Totals in the world.
2. The covid-19 by countries.
3. Covid-19 Table.
9. Exit.

To start writing our code, first we need to import some libraries. Here is the code..

# To import needed Libraries

import requests, os, re
from bs4 import BeautifulSoup

The Menu: As we mentioned, we will have a Main-Menu to help the users with their selections. Here is the code:

# COVID-19 APP MAIN MENU

def covid19_menu() :

    while True :
        # This will Clear the terminal
        os.system('clear')  
        print ('\n    =====[ COVID-19 APP MAIN MENU ]=====')
        print('    1. The Total in the World.')
        print('    2. The covid-19 by Countries')
        print('    3. Covid-19 Table. ')
        print('    9. Exit')

        user_input = input('\n    Enter Your choice: ')

        if user_input in ['1','2','3','9'] :
            if user_input =='1' :
                covid19_world_total()

            if user_input =='2' :
                covid19_in_country()

            if user_input =='3' :
                covid19_table()

            if user_input =='9' :
                return

        else :
            print (' You Must select from the above Menu.')
            input('\n      ... Press any key ...')
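
As a design alternative to the chain of if statements above, the menu choices can be kept in a dictionary that maps each key to a function. This is only a sketch: run_choice and the stand-in functions below are hypothetical names, not part of the application.

```python
def show_world_total():
    return 'world total'


def show_by_country():
    return 'by country'


def show_table():
    return 'table'


# Map each menu key to the function that handles it (stand-ins for the real ones).
MENU_ACTIONS = {
    '1': show_world_total,
    '2': show_by_country,
    '3': show_table,
}


def run_choice(user_input):
    """Run the action for one menu choice.

    Returns the action's result, 'exit' for choice 9,
    or None when the choice is not on the menu.
    """
    if user_input == '9':
        return 'exit'
    action = MENU_ACTIONS.get(user_input)
    if action is None:
        return None
    return action()


print(run_choice('1'))
print(run_choice('x'))
```

With this layout, adding a menu entry means adding one line to the dictionary rather than another if block.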



Now we will work on the first function. Before that, we will write these lines of code to set my_url (the web page we want to read) and call BeautifulSoup on it, so that the whole page source is parsed into the variable soup:

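# To load the page in a variable soup
my_url ='https://en.m.wikipedia.org/wiki/2019%E2%80%9320_coronavirus_pandemic_by_country_and_territory'
# headers was not defined in the original listing; this User-Agent value is an assumed placeholder.
headers = {'User-Agent': 'Mozilla/5.0'}
source = requests.get(my_url, headers = headers).text
soup = BeautifulSoup(source, 'lxml')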


Now our first function will get the last update of the data (date and time). After searching the page source, I found that the page contains 19 ‘p’ tags and that the last-update statement is in the fifth paragraph. But in case the page is updated with more data or its structure changes, we will write code that searches for it. Here is the function to get the last-update statement.

# Function to get Last update statement

def get_last_update(): 
    update_date_time = soup.find_all('p')
    # I found the date in P = 5, but I will search all the P's in case the page changes
    for thep in update_date_time :
        if 'UTC on' in thep.text :
            last_update='The Data Updated on ' + thep.text[6:32]
            return last_update
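
The functions that follow use two module-level names, last_update and trs, that do not appear in the listings. Here is a minimal sketch of how they could be prepared, assuming trs holds the rows of table No. 2 and last_update holds the update statement found in the page. The sample HTML is made up, find_update_statement is a hypothetical stand-in for get_last_update, and 'html.parser' is used so the sketch runs without lxml.

```python
from bs4 import BeautifulSoup

# Made-up page: one paragraph with an update statement and two tables.
sample_html = """
<p>Intro paragraph.</p>
<p>Data as of 08:00 UTC on 16 April 2020.</p>
<table><tr><td>other table</td></tr></table>
<table>
  <tr><th>Location</th><th>Cases</th></tr>
  <tr><th>Totals</th><td></td></tr>
  <tr><th><a title="2020 coronavirus pandemic in A">A</a></th><td>100</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, 'html.parser')


def find_update_statement(page):
    """Return the text of the first <p> that mentions 'UTC on', else None."""
    for p in page.find_all('p'):
        if 'UTC on' in p.text:
            return p.text.strip()
    return None


# Assumed setup: computed once, then shared by the display functions.
last_update = find_update_statement(soup)
trs = soup.find_all('table')[1].find_all('tr')   # rows of table No. 2

print(last_update)
print(len(trs))
```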


Now, our next function will get the four grand totals for the world: total countries that have COVID-19, total cases of COVID-19, total deaths by COVID-19, and total recoveries from COVID-19. Here is the function:

# Function to get the covid19 World Totals

def covid19_world_total():
    os.system('clear')
    print('\n    ',last_update,'\n')
    # This section will fetch the Total Rows.
    total_rows = soup.find_all('th',class_='covid-total-row')
    # We know that total_rows has 4 items, so we will access them individually.
    print('\n   ::: TOTAL NUMBERS :::')
    print('   Total Countries has COVID-19:   ',(total_rows)[0].b.text)
    print('   Total Cases of COVID-19:        ',(total_rows)[1].b.text)
    print('   Total Deaths by COVID-19:       ',(total_rows)[2].b.text)
    print('   Total Recoveries from COVID-19: ',(total_rows)[3].b.text)
    input('\n    ... Press any Key ...')
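
The totals come back from the page as text, so they need cleaning before any arithmetic. Here is a small sketch, assuming the figures use commas as thousands separators and may carry a footnote marker such as [a]. The helper to_int is hypothetical, not part of the application.

```python
import re


def to_int(text):
    """Convert a scraped figure such as ' 1,234,567[a] ' to an int.

    Returns None when the text contains no digits (for example a dash
    used as a 'no data' placeholder).
    """
    cleaned = re.sub(r'\[.*?\]', '', text)   # drop footnote markers like [a]
    digits = re.sub(r'[^0-9]', '', cleaned)  # keep digits only
    return int(digits) if digits else None


print(to_int(' 1,234,567[a] '))
print(to_int('-'))
```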


The second function we will work on is covid19_table. This function prints a table of Covid-19 data containing country names, cases, deaths and recoveries, sorted by number of cases. In this version of the application we display only 10 records.


Enhancement: In a coming version the user will be able to select the number of records and the sort order.

# Function to print-out the Covid19 cases.     
def covid19_table(top =10):
    """
    This function prints the countries table sorted by number of cases.
    If the caller does not pass a number of countries, the default is the first 10.
    """
    os.system('clear')
    print('\n   The List of Countries has Covid-19, Sorted by Total Cases..')
    print('   ',last_update,'\n')
    
    r_spc = 30
    tab_offset = (r_spc-1)
    # To print the Table Header.
    print('-'*r_spc,'+','-'*20, '+','-'*21,'+','-'*20)
    print(' '*8,'{0:<20}  |       {1:<15}|      {2:<17}|     {3:<20}'.format('Countries','Cases','Deaths','Recoveries'))
    print('-'*r_spc,'+','-'*20, '+','-'*21,'+','-'*20)
    
    droped_data =[]
    c = 2   # Rank 1 is at trs[2], so we start from row 2.
    while c <= top + 1 :   # Rows 2 .. top+1 give exactly `top` countries.
        try:
            # This will print the country name.
            contry_name = trs[c].find('a',title=re.compile('2020 coronavirus pandemic in *'))
            tds = trs[c].find_all('td')
            c_offset = len(contry_name.text.strip())
            print('   ',c-1,'-',contry_name.text.strip(),' '*(tab_offset - c_offset),(tds[0].text).strip(),' '*(30 -(20 + len((tds[0].text).strip())) +10) ,(tds[1].text).strip(),' '*(30 -(20 + len((tds[1].text).strip())) +12),(tds[2].text).strip())
            
        except (AttributeError, IndexError) :
            # A row with no matching country link or too few cells is skipped.
            droped_data.append([c])
        c = c + 1    
    print('\n\n    We have {} Dropped Country(ies) due to some error'.format(len(droped_data)))
    input('\n   .. Press  Any Key ... ')
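
The manual spacing arithmetic above can also be left to format specifiers, which pad each column to a fixed width. Here is a minimal sketch with made-up figures. The function format_row and the column widths are illustrative choices, not the application's own.

```python
def format_row(rank, name, cases, deaths, recoveries):
    """Format one table line with fixed-width, aligned columns."""
    # '<' pads on the right (left-aligned), '>' pads on the left (right-aligned).
    return '{:>3} - {:<24}| {:>12} | {:>10} | {:>12}'.format(
        rank, name, cases, deaths, recoveries)


# Made-up rows, only to show the alignment.
rows = [
    (1, 'Country A', '100,000', '5,000', '40,000'),
    (2, 'Another Country', '9,500', '120', '3,000'),
]

for row in rows:
    print(format_row(*row))
```

Because every field has a declared width, the lines stay aligned regardless of how long each country name or number is, as long as it fits in its column.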




In the last function, we let the user select a country and then display the Covid-19 information for that selection. To do this, we first build a list of all country names and display it on the screen, asking the user to enter the index number of a country; then we look up that country and grab its Covid-19 data. Here is the code..

# Function to grab Covid19 Data by country. 
  
def covid19_in_country() :

    os.system('clear')
    print('\n    Covid19 Country Data.')
    print('\n    ',last_update,'\n')

    # First we will build a list of country names.
    c_name_list=[]
    total_rows = soup.find_all('th',class_='covid-total-row')
    # Strip thousands separators, if any, before converting to int.
    Total_Countries = int((total_rows)[0].b.text.replace(',',''))
    drops_c = 0
    for x in range (2,Total_Countries) :
        try:
            c_name_list.append([x,trs[x].find('a',title=re.compile('2020 coronavirus pandemic in *')).text])

        except (AttributeError, IndexError):
            # No matching country link in this row.
            drops_c +=1

    print('    We missed {} Countries due to some errors.'.format(drops_c))
    print('\n    Enter the Country Number to show its Data.')
    # Print four countries per line; slicing also keeps a shorter last line.
    for cou in range (0,(len(c_name_list)),4) :
        print ('    ',*c_name_list[cou:cou+4])

    while True :
        user_c_index = input('\n\n    Enter the Index number of the Country to see its Data. [e to Exit] ')
        if user_c_index in ['e','E'] :
            return
        try:
            row = trs[int(user_c_index)]
            # This will print the country name.
            print('\n\n     ',row.find('a',title=re.compile('2020 coronavirus pandemic in *')).text)

            # This will print the country information. 
            tds = row.find_all('td')
            print('        Country Rank is: ',int(user_c_index)-1)
            print('        Cases:',tds[0].text.strip())
            print('        Deaths:',tds[1].text.strip())
            print('        Recoveries:',tds[2].text.strip())
        except ValueError :
            print("\n    You must Enter a Number or 'e'/'E' to exit")
        except (AttributeError, IndexError) :
            print('\n    No Country Data found for that Index number.')
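
Two small pieces of this function can be tried in isolation: printing the list in groups of four without losing a short last group, and turning the user's entry into either an index or an exit request. This is a sketch with hypothetical helper names (chunks, parse_choice) and made-up country entries.

```python
def chunks(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def parse_choice(text):
    """Interpret the user's entry.

    Returns 'exit' for e/E, an int for a whole number, or None otherwise.
    """
    text = text.strip()
    if text in ('e', 'E'):
        return 'exit'
    try:
        return int(text)
    except ValueError:
        return None


names = [[2, 'A'], [3, 'B'], [4, 'C'], [5, 'D'], [6, 'E'], [7, 'F']]
for group in chunks(names, 4):
    print('    ', *group)

print(parse_choice('12'), parse_choice('e'), parse_choice('abc'))
```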



This application and its code were written for training and learning purposes. The owner/creator is not responsible for any missing or wrong data that may exist in the code and/or the data-sets.



By: Ali Radwani




  1. April 16, 2020 at 8:17 am
