Example usage of sgfixedincome_pkg (Scraper)

This Jupyter Notebook vignette shows how to use some of sgfixedincome_pkg’s bank fixed deposit website scraping functions. The functions shown here are primarily found in scraper.py and consolidate.py.

Extracting fixed deposit tables

Many banks have websites with tables containing fixed deposit rates details. These tables are often formatted to look something like this (see DBS bank’s page below):

DBS Website Screenshot

We provide functions to extract such tables from the website and to convert them into a pandas dataframe:

import sgfixedincome_pkg as sfi

# Get website as parsed HTML BeautifulSoup object
url = "https://www.dbs.com.sg/personal/rates-online/fixed-deposit-rate-singapore-dollar.page"
soup = sfi.fetch_webpage(url)

# Extract table element
table_class = "tbl-primary mBot-24"
tables = sfi.extract_table(soup, table_class)
table = tables[0] # Since there is only 1 table in this website with that class

# Convert table to pandas dataframe
raw_df = sfi.table_to_df(table)
raw_df
Period $1,000 - $9,999 $10,000 - $19,999 $20,000 - $49,999 $50,000 - $99,999 $100,000 - $249,999 $250,000 - $499,999 $500,000 - $999,999
0 1 mth 0.3000 0.3000 0.0500 0.0500 0.0500 0.0500 0.0500
1 2 mths 0.5000 0.5000 0.0500 0.0500 0.0500 0.0500 0.0500
2 3 mths 1.0000 1.0000 0.0500 0.0500 0.0500 0.0500 0.0500
3 4 mths 1.4000 1.4000 0.0500 0.0500 0.0500 0.0500 0.0500
4 5 mths 1.7500 1.7500 0.0500 0.0500 0.0500 0.0500 0.0500
5 6 mths 2.1500 2.1500 0.0500 0.0500 0.0500 0.0500 0.0500
6 7 mths 2.3000 2.3000 0.0500 0.0500 0.0500 0.0500 0.0500
7 8 mths 2.3500 2.3500 0.0500 0.0500 0.0500 0.0500 0.0500
8 9 mths 2.3500 2.3500 0.0500 0.0500 0.0500 0.0500 0.0500
9 10 mths 2.4000 2.4000 0.0500 0.0500 0.0500 0.0500 0.0500
10 11 mths 2.4000 2.4000 0.0500 0.0500 0.0500 0.0500 0.0500
11 12 mths 2.4500 2.4500 0.0500 0.0500 0.0500 0.0500 0.0500

Reshape raw tables

While these tables are easy to interpret visually, it is hard to analyze the data in this format, especially once we consider multiple similar tables from multiple banks. The reshape_table() reformats this information. The result is a table with tenure (in months), rate, deposit lower bound, and deposit upper bound columns:

sfi.reshape_table(raw_df)
Tenure Rate Deposit lower bound Deposit upper bound
0 1.0 0.30 1000.0 9999.0
1 1.0 0.30 10000.0 19999.0
2 1.0 0.05 20000.0 49999.0
3 1.0 0.05 50000.0 99999.0
4 1.0 0.05 100000.0 249999.0
... ... ... ... ...
79 12.0 0.05 20000.0 49999.0
80 12.0 0.05 50000.0 99999.0
81 12.0 0.05 100000.0 249999.0
82 12.0 0.05 250000.0 499999.0
83 12.0 0.05 500000.0 999999.0

84 rows × 4 columns

Entire single-website scrape in one function

Instead of having to write multiple lines of code, use the scrape_deposit_rates() function to carry out the entire process above, and add some additional information to the final dataframe:

url = "https://www.dbs.com.sg/personal/rates-online/fixed-deposit-rate-singapore-dollar.page"
table_class = "tbl-primary mBot-24"
provider = "DBS" # Add info on bank name into dataframe
sfi.scrape_deposit_rates(url, table_class, provider) # req_multiples defaults to None
Tenure Rate Deposit lower bound Deposit upper bound Required multiples Product provider Product
0 1.0 0.30 1000.0 9999.0 None DBS Fixed Deposit
1 1.0 0.30 10000.0 19999.0 None DBS Fixed Deposit
2 1.0 0.05 20000.0 49999.0 None DBS Fixed Deposit
3 1.0 0.05 50000.0 99999.0 None DBS Fixed Deposit
4 1.0 0.05 100000.0 249999.0 None DBS Fixed Deposit
... ... ... ... ... ... ... ...
79 12.0 0.05 20000.0 49999.0 None DBS Fixed Deposit
80 12.0 0.05 50000.0 99999.0 None DBS Fixed Deposit
81 12.0 0.05 100000.0 249999.0 None DBS Fixed Deposit
82 12.0 0.05 250000.0 499999.0 None DBS Fixed Deposit
83 12.0 0.05 500000.0 999999.0 None DBS Fixed Deposit

84 rows × 7 columns

Scrape multiple bank websites

Instead of scraping bank websites one by one, use create_banks_df() to scrape multiple bank websites and store data in a single dataframe. Information on failed scrapes are also provided (both the website that we failed to scrape from, and the error encountered). The function requires a list of tuples as input.

Below, we scrape data from DBS and OCBC’s website simultaneously, and attempt to scrape from UOB’s website with a wrong table class. We also include an optional req_multiples input for OCBC in the hypothetical case that they require investments in multiples of $500:

# Define inputs (a list of tuples)
scrape_inputs=[
    (
        "https://www.dbs.com.sg/personal/rates-online/fixed-deposit-rate-singapore-dollar.page",
        "tbl-primary mBot-24",
        "DBS"
    ),
    (
        "https://www.uob.com.sg/personal/online-rates/singapore-dollar-time-fixed-deposit-rates.page",
        "non_existent_table_class",
        "UOB"
    ),
    (
        "https://www.ocbc.com/personal-banking/deposits/fixed-deposit-sgd-interest-rates.page",
        "table__comparison-table",
        "OCBC",
        500
    )
]

combined_df, failed_providers = sfi.create_banks_df(scrape_inputs)
print(f"Failed scraping processes: {failed_providers}")
combined_df # Display final dataframe
Failed scraping processes: [{'product': 'UOB bank fixed deposit', 'error': 'Failed to scrape deposit rates for UOB: No tables found with the specified class.'}]
Tenure Rate Deposit lower bound Deposit upper bound Required multiples Product provider Product
0 1.0 0.30 1000.00 9999.0 None DBS Fixed Deposit
1 1.0 0.30 10000.00 19999.0 None DBS Fixed Deposit
2 1.0 0.05 20000.00 49999.0 None DBS Fixed Deposit
3 1.0 0.05 50000.00 99999.0 None DBS Fixed Deposit
4 1.0 0.05 100000.00 249999.0 None DBS Fixed Deposit
... ... ... ... ... ... ... ...
201 36.0 0.20 100000.00 249999.0 500 OCBC Fixed Deposit
202 36.0 0.20 250000.00 499999.0 500 OCBC Fixed Deposit
203 36.0 0.20 500000.00 999999.0 500 OCBC Fixed Deposit
204 48.0 3.00 5000.00 20000.0 500 OCBC Fixed Deposit
205 48.0 0.20 20000.01 50000.0 500 OCBC Fixed Deposit

206 rows × 7 columns