INST326 HW01
Introduction and Programming Assessment
May 30, 2017
DUE: Tuesday, June 6
1 Introduce Yourself
- Come by my office hours and introduce yourself (see syllabus for office hours and location).
- Write a short summary of your programming background.
- Why are you taking this course?
- What do you expect to get out of this course?
- What challenges do you anticipate encountering during this course?
2 Python Installation
To ensure you have Python installed and working, please run the following code. Copy and paste the
program’s output into a file called hw_01.p02.txt.
3 Programming Assessment
- High-Level Assignment:
- Write a program, in a language of your choice, to read a text file and
output the top 10 most common words in the file and their counts.
- Requirements:
- Your program should satisfy the following requirements.
- The input file (i.e., the file to be read by the program) should be taken from
a command line argument. For example, if you wrote a Python program called
CommonToken.py, the user should be able to run it on a given text file by calling:
python CommonToken.py <filename.txt>.
- Your program should print the most common words in decreasing order
of frequency, one per line, where each line contains the word, a tab (or four spaces), and the
frequency of that word.
- Your program should be case insensitive (i.e., ignore word case) when counting
frequency, so the word “Hello” and the word “hello” are counted as the same word.
- Your program should also ignore punctuation. Be careful to ensure tokens that end
with a punctuation mark are converted to an alphabetic word (e.g., “goodbye.” should
be counted as “goodbye”).
- Your program should be sufficiently documented, so I can understand how your
program works.
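The requirements above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference solution: the function names normalize, top_words, and format_line are hypothetical, tokens are assumed to be separated by whitespace, and only ASCII punctuation (string.punctuation) is stripped. Curly quotes, hyphenated words, and contractions may therefore be counted differently from the expected output in Section 3.1.

```python
import string
from collections import Counter


def normalize(token):
    """Lowercase a token and strip ASCII punctuation from both ends."""
    return token.lower().strip(string.punctuation)


def top_words(text, n=10):
    """Return the n most common normalized words as (word, count) pairs."""
    words = (normalize(tok) for tok in text.split())
    # Skip tokens that were pure punctuation and are now empty.
    counts = Counter(word for word in words if word)
    return counts.most_common(n)


def format_line(word, count):
    """Format one output line: the word, a tab, and its frequency."""
    return f"{word}\t{count}"
```

To cover the command-line requirement, one option is to read the file path from sys.argv[1], pass the file's contents to top_words, and print format_line(word, count) for each returned pair.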
3.1 Testing
I have provided two text files and the appropriate output for you to use to check your work. File
test01_cc_sharealike.txt is a brief overview of the Creative Commons Sharealike license, and
test02_the_last_question.txt is a copy of Isaac Asimov’s “The Last Question” short
story.
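One way to check your work against the expected output is to compare it rank by rank. The sketch below is only an illustration: parse_output and differences are hypothetical helper names, and it assumes each output line holds a word, whitespace (a tab or spaces), and a count. Words with equal counts may legitimately appear in a different order, so a reported difference between tied words is not necessarily an error.

```python
from itertools import zip_longest


def parse_output(output):
    """Turn 'word<tab or spaces>count' lines into (word, count) pairs."""
    pairs = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        word, count = line.rsplit(None, 1)
        pairs.append((word, int(count)))
    return pairs


def differences(actual, expected):
    """Return (rank, actual_pair, expected_pair) for each rank that differs."""
    diffs = []
    pairs = zip_longest(actual, expected)
    for rank, (got, want) in enumerate(pairs, start=1):
        if got != want:
            diffs.append((rank, got, want))
    return diffs


# First three expected lines for test01_cc_sharealike.txt.
expected = [("the", 20), ("work", 9), ("or", 8)]
actual = parse_output("the\t20\nwork    9\nor\t8\n")
print(differences(actual, expected))
```

An empty list means the two outputs agree at every rank that was compared.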
3.1.1 Test 1: Expected Output for test01_cc_sharealike.txt
the    20
work    9
or    8
you    7
to    7
rights    6
of    5
any    5
in    5
license    4
3.1.2 Test 2: Expected Output for test02_the_last_question.txt
the    261
of    142
and    123
a    107
to    103
it    73
in    65
was    59
all    53
that    51
4 Submission
All submissions should be uploaded to ELMS following the instructions below.
- A text file answering the questions from Part 1, a text file containing your output from Part 2, and the
source code for your program should be submitted as a single ZIP file. This file should contain
all the uncompiled source code for your program, and the ZIP filename should be formatted
as:
<<lastname>>-hw01.zip
- Your code should be named something like CommonToken.py or similar depending on your
programming language of choice.
- Ensure your name, date and email address are at the top of all files and documents.