Python Quant Toolkit: “BeautifulSoup”

Dec 20, 2022 | EasyLanguage, Python

Many times multiple tools are needed to get the job done. Like a mechanic, a Quant needs tools to perform many programming tasks. In this post, I use a toolkit to build an EasyLanguage function that will test a date and determine if it is considered a Holiday in the eyes of the NYSE.

Why a holiday feature?

TradeStation will pump holiday data into a chart and then later go back and take it out of the database. Many times the data will only be removed from the daily database, but will persist in the intraday database. Many mechanical day traders do not want to trade a shortened holiday session or use its data for indicator/signal calculations. Here is an example of a gold chart that reflects Presidents’ Day data in the intraday data but not in the daily data.

Holiday example for trading on a TradeStation chart

This affects many stock index day traders, especially if automation is turned on. At the end of this post I provide a link to my YouTube channel for a full tutorial on using these tools to accomplish this task. It goes along with this post.

Get the data first

I searched the web for a list of historical holiday dates and came across this:

Holidays to trade on the NYSE

You might be able to find this in a more user-friendly format, but this was perfect for this post.

Extract data with “BeautifulSoup” in Python

This is where Python and the plethora of its libraries come in handy. I used pip to install the requests and the bs4 libraries. If this sounds like Latin to you drop me an email and I’ll shoot you some instructions on how to install these libraries. If you have Python, then you have the download/install tool known as pip.

Here is the Python code. Don’t worry, it’s pretty short:

# Created:     24/02/2020
# Copyright:   (c) George 2020
# Licence:     
#-------------------------------------------------------------------------------
import requests
from bs4 import BeautifulSoup

url = 'http://www.market-holidays.com/'
page = requests.get(url)                        # download the page
soup = BeautifulSoup(page.text, 'html.parser')  # parse the HTML text
print(soup.title.text)

all_tables = soup.find_all('table')             # every <table> on the page
print(len(all_tables))
print("***")

a = list()  # holiday names
b = list()  # holiday dates

# Walk the rows of each table (the last table on the page is skipped)
for numTables in range(len(all_tables)-1):
    for rows in all_tables[numTables].find_all('tr'):
        cells = rows.find_all('td')
        if len(cells) >= 2:                     # ignore rows without two data cells
            a.append(cells[0].text)
            b.append(cells[1].text)

# Print every name/date pair, separated by a hyphen
for j in range(len(a)):
    print(a[j],"-",b[j])

As you can see, this is very simple code. I first set the variable url to the website where the holidays are located. I googled how to do this, which is another cool thing about Python: tons of users. I pulled the data from the website and put it into the page object. The page object has several attributes (properties), and one of them is a text representation of the entire page. I pass this text to the BeautifulSoup library and tell it to parse the text with html.parser. In other words, it gets ready to extract certain values based on HTML tags. all_tables contains all the tables that BeautifulSoup parsed from the page text. Don’t worry about how this works, as it’s not important; just use it as a tool. In my younger days as a programmer I would have delved into how this works, but it wouldn’t be worth the time because I only need the data to accomplish my goal; this is one of the reasons why classically trained programmers never pick up on the concept of an object.

Now that I have all the tables in a list, I can loop through each row of each table. It looked like there were 9 rows and 2 columns in the different sections of the website, but I didn’t know for sure, so I let the library figure this out for me. I played around with the code and found that the first two columns of each table contained the name of the holiday and the date of the holiday. So I simply appended the text values of these columns to two lists: a and b. Finally, I print the contents of the two lists, separated by a hyphen, in the interpreter window.

At this point, I could just carry on in Python, create the EasyLanguage statements and fill in the data I need. But I wanted to play around with Excel in case readers didn’t want to go the Python route. You could also have used a powerful editor like Notepad++ to extract the data from the website instead of Python. GREP could have done this. GREP is a tool for finding (and, in editors that support it, replacing) expressions in a text file.
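As a rough illustration of the GREP-style route, here is a minimal sketch using Python's built-in re module. The HTML fragment and its two-cell row layout are assumptions made up for this example; the real page's markup may differ, and extract_rows is a hypothetical helper name.

```python
import re

# Hypothetical HTML fragment; the real page's markup may differ.
sample_html = """
<table>
<tr><td>New Year's Day</td><td>January 1, 2021</td></tr>
<tr><td>Good Friday</td><td>April 2, 2021</td></tr>
</table>
"""

# One row = two <td> cells; capture the text inside each cell.
row_pattern = re.compile(
    r"<tr>\s*<td>(.*?)</td>\s*<td>(.*?)</td>\s*</tr>", re.S
)

def extract_rows(html):
    """Return a list of (holiday name, date text) tuples found in html."""
    return [(name.strip(), date.strip())
            for name, date in row_pattern.findall(html)]

for name, date in extract_rows(sample_html):
    print(name, "-", date)
```

A regular expression like this is brittle compared with a real HTML parser, which is one reason to prefer BeautifulSoup when the page layout is not fully known.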

Use Excel to create a file for EasyLanguage

I created a new spreadsheet. I used Excel, but you could use any spreadsheet software. First I prototyped the code that I would need to encapsulate the data in array structures. This is what I want the code to look like:

Arrays: holidayName[300](""),holidayDate[300](0);

holidayName[1]="New Year's Day ";
holidayDate[1]=19900101;

These are just the first few lines of the function prototype. But a repeating pattern can be noticed. The array names stay the same: the only values that change are the array elements and the array indices. Computers love repetitiveness. I can use this information to build a spreadsheet – take a look.
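For readers who prefer to stay in Python, the repeating pattern can be sketched as a small generator of EasyLanguage statements. This is only an illustration under the assumption that the names and dates are already available as parallel lists, with dates as YYYYMMDD integers; easylanguage_lines is a hypothetical helper name.

```python
def easylanguage_lines(names, dates):
    """Build EasyLanguage array assignments from two parallel lists.

    names: holiday names (strings); dates: YYYYMMDD integers.
    Indices start at 1 to match the prototype shown in this post.
    """
    lines = ['Arrays: holidayName[300](""),holidayDate[300](0);']
    for index, (name, date) in enumerate(zip(names, dates), start=1):
        lines.append(f'holidayName[{index}]="{name}";')
        lines.append(f'holidayDate[{index}]={date};')
    return lines

# Illustrative values only
sample = easylanguage_lines(
    ["New Year's Day ", "Martin Luther King, Jr. Day "],
    [19900101, 19900115],
)
print("\n".join(sample))
```

Only the index, the name and the date change from line to line, which is exactly what makes the task easy to automate in either a script or a spreadsheet.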

Extraction of holidays for trading in Excel

I haven’t copied the data I pulled from Python yet. That will be step 2. Column A holds the first array name, holidayName (note that I put the left square bracket [ in the column as well). Column B will contain the array index, and this is a formula. Column C contains ]=". Column D will contain the actual name of the holiday and Column E contains the closing ";. Together these columns will build the statements for the array holidayName.

Columns G through K will construct the array holidayDate. Notice that column H is equal to column B, so whatever we do in column B (index) will be reflected in column H (index). So we’ve basically put all the EasyLanguage parts in columns A through K.

Extraction of holidays for trading in Excel 2

Excel provides tools for manipulating strings and text. I will use the Concat function to build my EasyLanguage. But before I can use Concat, everything I want to string together must be in a string or text format. The only column of the first five that is not a string is column B, so the first thing I need to do is convert it to text. First copy the column and paste special as values. Next, go to the Data tab and select Text to Columns.

Extraction of holidays for trading in Excel 3

You will be asked whether the data is fixed width or delimited; I don’t think it matters which you choose. In step 3, select Text.

Extraction of holidays for trading in Excel 4

The Text to Columns button will solve 90% of your formatting problems in Excel. Once you do this, you will notice that the numbers are left justified, which signifies a text format. Now let’s select another sheet in the workbook and paste the holiday data.

Copy holiday data to another worksheet:

  • New Years Day – January 1, 2021
  • Martin Luther King, Jr. Day – January 18, 2021
  • Washington’s Birthday (Presidents’ Day) – February 15, 2021
  • Good Friday – April 2, 2021
  • Memorial Day – May 31, 2021
  • Independence Day – July 5, 2021
  • Labor Day – September 6, 2021
  • Thanksgiving – November 25, 2021
  • Christmas – December 24, 2021
  • New Year’s Day – January 1, 2020
  • Martin Luther King, Jr. Day – January 20, 2020
  • Washington’s Birthday (Presidents’ Day) – February 17, 2020
  • Good Friday – April 10, 2020
  • Memorial Day – May 25, 2020

Extraction of holidays for trading in Excel 5

Text to columns to the rescue. Here I’ll separate the data with the “-” as the delimiter and tell Excel to import the second column in Date format as MDY.

Extraction of holidays for trading in Excel 6

Now, once the data is split accordingly into two correctly formatted columns, we need to convert the date column to a string.
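The same split-and-convert step can be sketched in Python for comparison. This is a minimal sketch, assuming each line looks like the holiday list shown in this post ("Name – Month D, YYYY") and that month names are in English; split_holiday is a hypothetical helper name.

```python
from datetime import datetime

def split_holiday(line, delimiter="–"):
    """Split 'Name – Month D, YYYY' into (name, 'YYYYMMDD' string).

    The default delimiter is the dash used in the list in this post;
    pass delimiter="-" for text printed with a plain hyphen.
    """
    # Split on the last delimiter only, in case a name contains one.
    name, date_text = line.rsplit(delimiter, 1)
    parsed = datetime.strptime(date_text.strip(), "%B %d, %Y")
    return name.strip(), parsed.strftime("%Y%m%d")

print(split_holiday("Good Friday – April 2, 2021"))
```

The date comes back as text rather than as a number, mirroring the Excel step of converting the date column to a string before concatenating.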

Extraction of holidays for trading in Excel 7

Now the last couple of steps are very easy. Once you’ve converted the date to a string, copy column A and paste it into column D of the first worksheet. Since this is text, you can just copy and paste. Now go back to Sheet 2, copy Column C and paste special [values] into Column J on Sheet 1. All we have to do now is concatenate the strings in columns A through E to get the EasyLanguage for the array holidayName. Columns G through K will be concatenated for the holidayDate array. Check it out.

Extraction of holidays for trading in Excel 8

Now create a function in the EasyLanguage editor, name it IsHoliday and have it return a boolean value. Then all you need to do is copy/paste columns F and L, and the website data is now available for use. Here is a part of the function code. Notice that I declare holidayNameStr as stringRef. I did this so that I could change the variable in the function and pass it back to the calling routine.

inputs : testDate(numericSeries),holidayNameStr(stringRef);
Arrays: holidayName[300](""),holidayDate[300](0);

holidayNameStr = "";
holidayName[1]="New Year's Day ";	holidayDate[1]=19900101;
holidayName[2]="Martin Luther King, Jr. Day ";	holidayDate[2]=19900115;
holidayName[3]="Washington's Birthday (Presidents' Day) ";	holidayDate[3]=19900219;
holidayName[4]="Good Friday ";	holidayDate[4]=19900413;
holidayName[5]="Memorial Day ";	holidayDate[5]=19900528;
holidayName[6]="Independence Day ";	holidayDate[6]=19900704;
holidayName[7]="Labor Day ";	holidayDate[7]=19900903;
holidayName[8]="Thanksgiving ";	holidayDate[8]=19901122;
holidayName[9]="New Year's Day ";	holidayDate[9]=19910101;
holidayName[10]="Martin Luther King, Jr. Day ";	holidayDate[10]=19910121;
holidayName[11]="Washington's Birthday (Presidents' Day) ";	holidayDate[11]=19910218;

// There are 287 holidays in the database.
// Here is the looping mechanism to compare the data that is passed
// to the database
vars: j(0);
IsHoliday = False;
For j=1 to 287
Begin
	If testDate = holidayDate[j] - 19000000 then
	Begin
		holidayNameStr = holidayName[j] + " " + numToStr(holidayDate[j],0);
		IsHoliday = True;
	end;
end;
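The loop subtracts 19000000 because EasyLanguage dates count years from 1900, so January 1, 1990 is 900101 and April 2, 2021 is 1210402. For readers more comfortable in Python, here is a minimal sketch of the same comparison. It is not the EasyLanguage function itself: to_easylanguage_date and is_holiday are hypothetical names, the holiday list is illustrative, and unlike the loop above this version returns at the first match.

```python
def to_easylanguage_date(yyyymmdd):
    """Convert a YYYYMMDD integer to EasyLanguage's date format (years counted from 1900)."""
    return yyyymmdd - 19000000

def is_holiday(test_date, holidays):
    """Return (True, label) if test_date (EasyLanguage format) is a holiday, else (False, "").

    holidays: list of (name, YYYYMMDD integer) pairs.
    """
    for name, date in holidays:
        if test_date == to_easylanguage_date(date):
            return True, f"{name} {date}"
    return False, ""

# Illustrative data only
holidays = [("New Year's Day", 19900101), ("Good Friday", 20210402)]
print(is_holiday(1210402, holidays))
print(is_holiday(1210405, holidays))
```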

I created this post to demonstrate the need to have several tools at your disposal if you really want to become a Quant programmer. How you use those tools is up to you. You will also be able to take bits and pieces of this post and use them in other ways to get the data you really need. I could have skipped the whole Excel part of the post and done the whole thing in Python. But I know a lot of Quants who love spreadsheets. You have to continually hone your craft in this business, and you can’t let a software application limit your creativity. If you have a problem, always be on the lookout for alternative platforms and/or languages that can help you solve it.

You can see a video by George Pruitt that expands on the information in this article.
