In this homework you’re going to reinforce more of the Python skills you learned in class in Week 4: use Python to open and read a bunch of text files, process them in some way, and then do some basic analysis. Refer back to the in-class portion of Week 4 to help you. Once again, please alternate Markdown cells with Code cells where you explain each step.
The Data
For this week we’re going to use entries from the diary of Martha Ballard, a midwife from Maine made famous by Laurel Ulrich’s A Midwife’s Tale. A project at George Mason University digitized her diary and put it online. I’ve done some work on the entries, and am supplying you with two years’ worth of Ballard’s entries (1804 and 1805). Each of her entries for these two years is contained in a separate text file that I’ve already preprocessed, cleaned up, and put into all lowercase.
Set Up
- Create a new folder inside the `homework` folder on your computer called `week-06-homework`. Launch Anaconda Navigator and then create a new Jupyter Notebook inside this folder using the filename convention: `yourlastname-week-06-homework.ipynb`.
- Download Martha Ballard’s diary entries for 1804 and 1805 and put the file in your `week-06-homework` folder. Unzip this and rename the folder `data`. This is your directory of text files.
- In your Jupyter Notebook:
  - Import the `os` and `string` libraries
  - Use `os` to tell Python where to look for the data (text files)
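A minimal sketch of the import and path step, assuming your notebook sits in `week-06-homework` right next to the unzipped `data` folder. The variable name `data_dir` is just an illustrative choice, not something the assignment requires.

```python
import os
import string

# Assumption: the notebook lives in week-06-homework, alongside the "data" folder.
data_dir = os.path.join(os.getcwd(), "data")

# Check that Python is looking where you expect before going any further.
print(data_dir)
print(os.path.isdir(data_dir))
```

If the second line prints `False`, the folder is probably in a different place or has a different name than the path assumes.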
 
Wrangle the Data
The goal of this section is to take your hundreds of text files worth of diary entries and add them into two lists, one containing all of the diary entries for 1804 and one for 1805. You’re going to do this through the following steps:
- Make two new variables, `year_1804` and `year_1805`, and make them empty lists. This is where you’re going to be adding individual entries for that year as items in your list.
- Use `os.listdir()` to get a list of filenames contained in your data folder and assign it as a new variable
- Write a `for` loop to go through your list of filenames, `open()` each diary entry, and `read()` its contents.
- Inside that same `for` loop, then use an `if` statement to figure out which year the diary entry was written. Based on that, `append()` the entry to either your `year_1804` or `year_1805` list.
  - Hint: this is going to require a new function, but one that is related to the `if f.endswith('.txt'):` in your Week 4 exercise. Try Googling to see if you can figure out what function to use.
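The steps above could be sketched roughly as follows. This is one possible approach, not the only one. It assumes each filename begins with the four-digit year (the filenames below are hypothetical, so check how your own files are actually named), and it builds a tiny temporary folder of placeholder files so the example runs on its own. In your notebook you would point at your real `data` folder instead.

```python
import os
import tempfile

# Build a tiny stand-in for the data folder so this sketch runs anywhere.
# The filenames and text are placeholders, not real diary entries.
data_dir = tempfile.mkdtemp()
samples = {
    "1804-01-01.txt": "placeholder entry one.",
    "1804-01-02.txt": "placeholder entry two.",
    "1805-01-01.txt": "placeholder entry three.",
}
for name, text in samples.items():
    with open(os.path.join(data_dir, name), "w", encoding="utf-8") as f:
        f.write(text)

year_1804 = []
year_1805 = []

filenames = os.listdir(data_dir)
for filename in filenames:
    if not filename.endswith(".txt"):
        continue  # skip anything that is not a text file
    with open(os.path.join(data_dir, filename), encoding="utf-8") as f:
        entry = f.read()
    # Assumption: the year is the first four characters of the filename.
    if filename.startswith("1804"):
        year_1804.append(entry)
    elif filename.startswith("1805"):
        year_1805.append(entry)

print(len(year_1804), len(year_1805))
```

Opening each file with `with` closes it automatically once its contents have been read, which matters when you are looping over hundreds of files.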
 
Analyze the Data
Let’s do some basic analysis of Martha Ballard’s diary entries for these two years.
- In which year did she write more entries?
- What is Ballard’s longest entry that she wrote in 1804 or 1805?
- What is the shortest entry that she wrote?
- What is the average length of her entries in 1804 vs. 1805?
  - Note: this is a tricky one that requires some thinking outside of the box or importing a library you haven’t used yet - if you can’t figure it out, feel free to skip it
- What was the weather in Maine exactly 215 years ago from today? The goal is to generate a print statement that just prints out the sentence from that particular entry containing the weather.
  - Use a `for` loop to go back through your data files
  - Use an `if` statement to locate the correct text file based on its filename
  - `open()`, `read()`, and assign the contents of just that file to a new variable
  - Use the `split()` function to create a new list, with each individual item being a different sentence from her diary entry. Think about what character you want to “split” on.
  - `print()` just the sentence that talks about the weather. To do this, you’re going to tell Python which item inside your list of sentences you want to print using the brackets `[]` notation. Annoying Python feature: Python starts “counting” at 0 instead of 1. So to access the second item in your list, you’d use `print(somelist[1])` NOT `print(somelist[2])`. Fun right? :)
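Here is a hedged sketch of the building blocks these questions rely on, using short made-up lists in place of your real `year_1804` and `year_1805`. It assumes "length" means number of characters (you could count words instead) and that sentences end with a period followed by a space. Names like `all_entries` and `sentences` are illustrative only.

```python
# Made-up stand-ins for the two lists built earlier (not real diary text).
year_1804 = ["clear. at home.", "rain. i went out. came back at night."]
year_1805 = ["snow. at home all day."]

# Which year has more entries? Compare the lengths of the two lists.
print(len(year_1804), len(year_1805))

# Longest and shortest entry across both years, measured in characters.
all_entries = year_1804 + year_1805
longest = max(all_entries, key=len)
shortest = min(all_entries, key=len)

# Average length without extra libraries: total characters / number of entries.
average_1804 = sum(len(entry) for entry in year_1804) / len(year_1804)
average_1805 = sum(len(entry) for entry in year_1805) / len(year_1805)

# Split one entry into sentences on ". " and pick one by position.
# Positions count from 0, so [0] is the first sentence and [1] is the second.
sentences = year_1804[1].split(". ")
first_sentence = sentences[0]
second_sentence = sentences[1]
print(first_sentence)
print(second_sentence)
```

The standard library's `statistics.mean()` is another way to get the average: pass it a list of the entry lengths instead of dividing by hand.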