These exercises will help you become more familiar with reading files and manipulating strings with loops. Remember that strings have many useful methods, which are listed in the Python documentation, and that opening a file generally looks like the following:
inFile = open("<filename>", "r")
for line in inFile:
    # do something with each line
    pass
inFile.close()
TAKE NOTE: After you have read through a file once, reading from the same file object again returns no data. This happens because a file object is an iterator that keeps track of its position in the file, and after a full pass that position is at the end of the file. You can address this either by closing and re-opening the file each time, or by calling inFile.seek(0) to move back to the start of the file. More information on file IO in Python is available here: (https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files)
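The note above can be checked with a short experiment. This is a minimal sketch that writes its own throwaway file (via the standard tempfile module) so that it does not depend on the files accompanying the notebook; the variable names are illustrative only.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name

inFile = open(path, "r")
first_pass = [line for line in inFile]   # reads both lines
second_pass = [line for line in inFile]  # empty: position is at end of file
inFile.seek(0)                           # move back to the start of the file
third_pass = [line for line in inFile]   # reads both lines again
inFile.close()
os.remove(path)                          # clean up the throwaway file

print(len(first_pass), len(second_pass), len(third_pass))  # prints: 2 0 2
```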
Write code to open a file, read each line of the file (you can use the for loop iterator on the opened file object), and print each line to the screen.
As a test file, open and read the reversedTestFile.txt file accompanying this notebook. Also try the stringOps.txt and tweetBodies.txt files.
For each line in the test file you opened, use the len() function to calculate and print the number of characters in the line. Also, using nested for loops, count and print the number of characters in the line that are not spaces (" ").
Using the code you wrote in Exercise 2 and some basic math, calculate the average number of characters per line and the average number of words per line (assume the number of words in a line is one more than the number of spaces in it).
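As a worked example of the arithmetic, here is one possible sketch. It uses a hypothetical list of sample lines in place of a file, so the numbers are easy to check by hand. Note that lines read from a real file usually end with a newline character, which len() also counts unless you strip it first.

```python
# Hypothetical sample lines standing in for the lines read from a file.
lines = ["hello world", "a b c", "python"]

total_chars = 0
total_words = 0
for line in lines:
    spaces = 0
    for ch in line:            # inner loop: inspect each character
        if ch == " ":
            spaces += 1
    total_chars += len(line)
    total_words += spaces + 1  # words = one more than the number of spaces

avg_chars = total_chars / len(lines)
avg_words = total_words / len(lines)
print(avg_chars, avg_words)
```

Here the lines hold 11, 5, and 6 characters (22 in total) and 2, 3, and 1 words (6 in total), so the averages are 22/3 characters and 2.0 words per line.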
Write and call a function, called startsWith(), that searches for a given pattern of characters at the beginning of an input string. It should take as arguments an input string and a pattern, and return a boolean value indicating whether the input string starts with the given pattern. Use this function to print only the lines from a file that start with the given pattern.
As a test case, open stringOps.txt and print only the lines that start with "str.".
As an example, startsWith("str.lower() - Lowercase a string.", "str.") should return True.
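One possible way to approach this, shown only as a sketch, is to compare the two strings character by character and stop at the first mismatch. The parameter names text and pattern are illustrative choices.

```python
def startsWith(text, pattern):
    """Return True if text begins with pattern, otherwise False."""
    if len(pattern) > len(text):
        return False               # a longer pattern can never match
    for i in range(len(pattern)):
        if text[i] != pattern[i]:
            return False           # first mismatch settles the answer
    return True

print(startsWith("str.lower() - Lowercase a string.", "str."))  # True
print(startsWith("Lowercase a string.", "str."))                # False
```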
Using array indexing into a string, for loops, and string concatenation, write a function called reversed() that takes a string as an argument and returns the reversed version of that string.
As a test case, open reversedTestFile.txt and print the reversed lines to the screen.
As an example, reversed("Hello") should return "olleH".
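A minimal sketch of one approach walks the indices from the last character down to the first and concatenates as it goes. Be aware that defining a function named reversed() replaces Python's built-in reversed() for the rest of the session.

```python
def reversed(text):
    """Return text with its characters in reverse order."""
    result = ""
    # Count down from the last valid index to 0.
    for i in range(len(text) - 1, -1, -1):
        result = result + text[i]
    return result

print(reversed("Hello"))  # olleH
```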
Write a function, called find(), that takes two arguments, a search string and a pattern character, and returns the index of the first occurrence of that character in the search string. If the pattern does not exist in the search string, return -1. If the pattern occurs in the search string multiple times, return only the index of the first occurrence. Use this function to determine whether a line in a file contains a hashtag (as marked by #).
As a test case, open tweetBodies.txt and print only those lines with hashtags.
As an example, find("Pattern", "t") should return 2.
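As a sketch of one possible approach, the function can scan the indices in order and return as soon as it sees a match. The sample tweets below are hypothetical stand-ins for lines of tweetBodies.txt.

```python
def find(text, ch):
    """Return the index of the first occurrence of ch in text, or -1."""
    for i in range(len(text)):
        if text[i] == ch:
            return i   # stop at the first match
    return -1          # loop finished without finding ch

# Hypothetical sample lines standing in for the contents of a file.
tweets = ["Learning #python today", "No tags in this one"]
with_hashtags = [t for t in tweets if find(t, "#") != -1]

print(find("Pattern", "t"))  # 2
print(with_hashtags)         # ['Learning #python today']
```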
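Write a function, called partition(), that takes two arguments, a search string and a separator, and splits the input string at the first occurrence of that separator. Your function should return three items: the part of the string before the separator, the separator itself, and the part of the string after the separator. If the separator is not found, return the whole string and two empty strings.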
As an example, partition("Pattern", "t") should return ("Pa", "t", "tern").
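One way to sketch this uses slicing to compare each position against the separator, assuming the separator is a non-empty string (it may be longer than one character). The parameter names are illustrative.

```python
def partition(text, sep):
    """Split text at the first occurrence of sep.

    Returns (before, sep, after), or (text, "", "") if sep is not found.
    Assumes sep is a non-empty string.
    """
    n = len(sep)
    for i in range(len(text) - n + 1):
        if text[i:i + n] == sep:
            return (text[:i], sep, text[i + n:])
    return (text, "", "")

print(partition("Pattern", "t"))  # ('Pa', 't', 'tern')
print(partition("Pattern", "z"))  # ('Pattern', '', '')
```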
Use the functions you've developed above to find and print all hashtags present in each line of the tweetBodies.txt file. That is, for each line in the file, print only the hashtags that are present on that line and no other text.
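The sketch below shows one possible shape for this. To stay self-contained it uses Python's built-in str.partition() as a stand-in for the partition() function written earlier, and it assumes that a hashtag runs from the # up to the next space or the end of the line. The helper name hashtags() and the sample line are hypothetical.

```python
def hashtags(line):
    """Return a list of the hashtags in line, each with its leading '#'.

    Assumes a hashtag ends at the next space or at the end of the line.
    """
    tags = []
    rest = line
    while True:
        before, sep, after = rest.partition("#")
        if sep == "":
            break                            # no more '#' in the remainder
        tag, _, rest = after.partition(" ")  # hashtag ends at next space
        tag = tag.strip()                    # drop a trailing newline
        if tag:
            tags.append("#" + tag)
    return tags

# Hypothetical sample line standing in for a line of the file.
print(hashtags("Loving #python and #coding today\n"))
```

For the sample line, the function collects #python and #coding and ignores the rest of the text.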