A web scraping Python script using the pandas and requests packages

I wrote a Python script that fetches a web page and extracts some baseball data from it.

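pandas has a function called “read_html” that parses HTML tables straight into DataFrames, so you don’t have to write the parsing code yourself with a package like BeautifulSoup! (pandas still needs an HTML parser such as lxml installed to do the actual work.)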
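
To give a feel for what “read_html” returns, here is a minimal sketch that runs without any network access. The HTML snippet and the player names in it are made up for illustration and are not from the site used in this post.

```python
import io

import pandas as pd

# Made-up sample HTML, used only for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>HR</th></tr>
  <tr><td>Player A</td><td>40</td></tr>
  <tr><td>Player B</td><td>35</td></tr>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds.
tables = pd.read_html(io.StringIO(SAMPLE_HTML))
df = tables[0]
print(df)
```

Because the result is always a list, you pick the table you want by index, even when the page has only one.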

The structure of the script is the following:

  1. Download the page's HTML with the requests package.
  2. Parse the HTML table into a DataFrame with pandas.
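  3. Write the result to a CSV file.

import io

import requests
import pandas as pd

URL = 'https://www.baseball-almanac.com/hitting/hihr5.shtml'

def get_table(html, attrs):
    # Return the first table whose HTML attributes match attrs.
    # header=1 takes the column names from the table's second row.
    # StringIO is used because recent pandas versions deprecate passing raw HTML strings.
    df = pd.read_html(io.StringIO(html), attrs=attrs, header=1)[0]
    return df

def main():
    # 1. Download the page.
    response = requests.get(URL)
    response.raise_for_status()  # stop early on an HTTP error
    # 2. Parse the table with class "boxed" into a DataFrame.
    df = get_table(response.text, {'class': 'boxed'})
    # 3. Write it to a CSV file without the index column.
    df.to_csv('HR Year-by-Year Leaders.csv', index=False)

if __name__ == '__main__':
    main()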
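
If you want to see what the attrs and header arguments do without hitting the website, the sketch below may help. The page content is made up, and the layout (a navigation table followed by a "boxed" table whose first row is a title) is only my assumption for the example, not a description of the real page.

```python
import io

import pandas as pd

# Made-up page with two tables; only the second has class="boxed".
SAMPLE_HTML = """
<table class="nav">
  <tr><td>Home</td><td>About</td></tr>
</table>
<table class="boxed">
  <tr><td colspan="2">Sample Leaders</td></tr>
  <tr><td>Year</td><td>HR</td></tr>
  <tr><td>2001</td><td>40</td></tr>
  <tr><td>2002</td><td>35</td></tr>
</table>
"""

# attrs keeps only tables with matching HTML attributes;
# header=1 skips the title row and uses the second row as column names.
tables = pd.read_html(io.StringIO(SAMPLE_HTML), attrs={'class': 'boxed'}, header=1)
df = tables[0]

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Filtering by attrs is handy on pages that use tables for layout, since otherwise you would have to guess which index in the list holds the data you care about.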

I hope this is helpful for anyone learning or using Python.

The source code is in my GitHub repo.