Coding contributions:
1. getFamaFrenchFactors
If you’ve dealt with Fama-French factors, you know that while Kenneth French’s data library is very useful, its files need a fair bit of cleaning, including:
- Deleting comments and descriptives.
- Splitting the data into annual, weekly, or monthly sets.
- Converting the factor returns to percentages.
The getFamaFrenchFactors package does much of this work for you and returns the cleaned, ready-to-use result as a pandas DataFrame in Python.
To install, open up your terminal / command prompt and type:
pip install getFamaFrenchFactors
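To give a feel for the cleaning involved, here is a minimal sketch in plain Python of the first two steps listed above: dropping the descriptive lines and splitting the rows by frequency. It is not the package's own code. The sample text, its numbers, and the helper name split_factor_file are made up for illustration, and the layout is only assumed to resemble a factor file from the data library. Unit conversion is left out, since it depends on the convention you want.

```python
# Illustrative sample only: the layout is an assumption and the numbers are made up.
RAW = """\
This file was created for illustration only.
The numbers below are made up.

,Mkt-RF,SMB,HML,RF
202001,   1.00,  -0.50,   0.25,   0.10
202002,  -2.00,   0.75,  -0.40,   0.10

 Annual Factors: January-December
,Mkt-RF,SMB,HML,RF
   2020,  12.00,   3.00,  -5.00,   1.20

Copyright notice line
"""


def split_factor_file(text):
    """Drop descriptive lines and split data rows into monthly and annual sets.

    Hypothetical helper: rows keyed by a 6-digit date (YYYYMM) are treated as
    monthly, rows keyed by a 4-digit date (YYYY) as annual.
    """
    monthly, annual = [], []
    header = None
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split(",")]
        # A header row starts with an empty cell followed by the factor names.
        if len(cells) > 1 and cells[0] == "":
            header = cells[1:]
            continue
        # Anything that is not a complete numeric data row is a comment or blank line.
        if header is None or len(cells) != len(header) + 1 or not cells[0].isdigit():
            continue
        row = {"date": cells[0]}
        row.update({name: float(value) for name, value in zip(header, cells[1:])})
        if len(cells[0]) == 6:
            monthly.append(row)
        elif len(cells[0]) == 4:
            annual.append(row)
    return monthly, annual


monthly, annual = split_factor_file(RAW)
print(monthly[0])
print(annual[0])
```

Each list of rows can be passed straight to pandas.DataFrame if you want a DataFrame at the end.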
2. CRSPcleaner
If you’re doing research in Finance, or working at a medium / large cap finance firm, there’s a good chance you’re dealing with data from CRSP via WRDS.
And I’m sure you’re well aware of the time and effort it takes to clean the dataset! Now there’s a slightly easier way. The CRSPcleaner package takes a CRSP database file in CSV format. It then:
- Shifts all dates to the end of the month, and sets them as pandas datetime objects.
- Converts all prices to absolute (positive) values.
- Removes all financial firms (SIC codes between 6000 and 7000 inclusive).
- Removes return outliers, using a somewhat arbitrary cutoff (monthly returns greater than 300%).
- Renames all columns to match PEP 8 naming conventions (lower case, separated by underscores), with “crsp” added to all column names to help identify the database source.
- Optionally, retrieves CIK identifiers for CRSP firms and keeps only firms that have one.
To install, open up your terminal / command prompt and type:
pip install CRSPcleaner
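For readers who want to see roughly what the non-optional steps listed above look like in pandas, here is a minimal sketch. It is not the package's implementation. The column names (PERMNO, date, SICCD, PRC, RET), the sample rows, the helper name clean_crsp, and the choice to add “crsp” as a prefix are all assumptions made for illustration. Returns are assumed to be stored as decimals, so 300% is 3.0.

```python
import pandas as pd

# Made-up rows; the column names are assumed, not taken from the package.
raw = pd.DataFrame({
    "PERMNO": [10001, 10001, 10002, 10003],
    "date": ["2020-01-15", "2020-02-14", "2020-01-31", "2020-01-31"],
    "SICCD": [3571, 3571, 6021, 2834],
    "PRC": [-12.5, 13.0, 40.0, 5.0],
    "RET": [0.02, 0.04, 0.01, 3.5],
})


def clean_crsp(df):
    """Hypothetical helper that applies the cleaning steps to a copy of df."""
    out = df.copy()
    # Parse dates and roll each one forward to its month end
    # (dates already on a month end are left unchanged).
    out["date"] = pd.to_datetime(out["date"]) + pd.offsets.MonthEnd(0)
    # Use absolute values so that all prices are positive.
    out["PRC"] = out["PRC"].abs()
    # Drop rows with SIC codes between 6000 and 7000 inclusive.
    out = out[~out["SICCD"].between(6000, 7000)]
    # Drop monthly returns greater than 300% (3.0 in decimal form);
    # rows with missing returns are kept.
    out = out[~(out["RET"] > 3.0)]
    # Lower-case the column names and add a "crsp_" prefix (prefix is an assumption).
    out.columns = ["crsp_" + name.lower() for name in out.columns]
    return out.reset_index(drop=True)


clean = clean_crsp(raw)
print(clean)
```

In this sample, the row with SIC code 6021 and the row with a 350% return are dropped, leaving the two rows for the first firm.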