Programming Project: Word Percentages
Do writings by individual authors have statistical signatures? They certainly do, and while such signatures may say little about the quality of an author's art, they can say something about the literary styles of an era, and can even help clarify historical controversies about authorship. Statistical studies, for example, have shown that the Iliad and the Odyssey were not written by a single individual.
Project Objectives
1. Use inheritance to specialize the functionality provided by existing code.
2. Write statements that process lines of text from a file.
3. Use arrays to record observations about a data set.
4. Write a class that conforms to an existing specification.
For this assignment you are to create a program that analyzes samples of text -- novels perhaps, or newspaper articles -- and produces two statistics about these texts: word size frequency, and average word length.
The program consists of three classes: FileAccessor, WordPercentagesDriver and WordPercentages. For this project you will write the WordPercentages class, which must compile and work with the FileAccessor and WordPercentagesDriver classes provided. The FileAccessor class provides basic file I/O functionality. The driver class reads in the name of a file that contains the text to be analyzed, creates an instance of WordPercentages, obtains the statistics and prints them to the console.
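The relationship between the base class and your subclass can be pictured with a small stand-alone sketch. The names below (LineSource, CharCounter, InheritanceSketch) are hypothetical stand-ins, not the provided classes, and the sketch reads from an in-memory list rather than a file. It only illustrates the pattern: the base class owns the loop over lines, and the subclass supplies what happens to each line.

```java
import java.util.Arrays;
import java.util.List;

class InheritanceSketch {
    // Hypothetical stand-in for a base class such as FileAccessor:
    // it decides WHEN each line is handled, not HOW.
    static abstract class LineSource {
        private final List<String> lines;

        LineSource(List<String> lines) {
            this.lines = lines;
        }

        // Calls processLine once per line, in order.
        void processAll() {
            for (String line : lines) {
                processLine(line);
            }
        }

        // Subclasses specialize the behavior by implementing this method.
        protected abstract void processLine(String line);
    }

    // Hypothetical subclass: counts characters instead of words.
    static class CharCounter extends LineSource {
        private int chars = 0;

        CharCounter(List<String> lines) {
            super(lines);
        }

        @Override
        protected void processLine(String line) {
            chars += line.length();
        }

        int getChars() {
            return chars;
        }
    }

    public static void main(String[] args) {
        CharCounter counter = new CharCounter(Arrays.asList("ab", "cde"));
        counter.processAll();
        System.out.println(counter.getChars()); // 2 + 3 = 5
    }
}
```

The subclass never writes its own reading loop; it accumulates state in fields as each line arrives and exposes the result through getter methods afterwards.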
Here are some files to use for testing:
testFiles.zip (basically large pieces of text)
And the results:
results.txt
Note: your numbers may differ by ±0.01.
You can obtain interesting sample texts by, for example, visiting the Project Gutenberg website (gutenberg.org) and downloading books from there.
Your job, then, is to code a solution to this problem, and provide these two statistics - word size percentage, for word lengths from 1 to 15 and greater, and average word length.
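To make the two statistics concrete, here is a small worked sketch on a made-up list of five word lengths. The class and method names are illustrative only and are not part of the provided code. The sketch assumes a non-empty list of lengths of at least 1, and it assumes the average is taken over the true lengths, so a 17-letter word counts as 17 even though it lands in the "15 and greater" bucket.

```java
class StatsSketch {
    // Percentage of words at each length; index 15 collects lengths >= 15,
    // index 0 stays unused because no word has length 0.
    static double[] percentages(int[] wordLengths) {
        int[] counts = new int[16];
        for (int len : wordLengths) {
            counts[Math.min(len, 15)]++;
        }
        double[] pct = new double[16];
        for (int i = 1; i < pct.length; i++) {
            pct[i] = 100.0 * counts[i] / wordLengths.length;
        }
        return pct;
    }

    // Average word length: sum of all lengths divided by the number of words.
    static double average(int[] wordLengths) {
        int sum = 0;
        for (int len : wordLengths) {
            sum += len;
        }
        return (double) sum / wordLengths.length;
    }

    public static void main(String[] args) {
        int[] lengths = {3, 5, 5, 2, 17}; // made-up sample of five words
        double[] pct = percentages(lengths);
        System.out.println(pct[5]);           // 2 of 5 words -> 40.0
        System.out.println(pct[15]);          // 1 of 5 words -> 20.0
        System.out.println(average(lengths)); // 32 / 5 -> 6.4
    }
}
```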
The source code files provided are WordPercentagesDriver.java and FileAccessor.java.
/JavaCS1/src/scrabble/wordpercentages/FileAccessor.java
/JavaCS1/src/scrabble/wordpercentages/WordPercentagesDriver.java
Notice that the output formatting is NOT produced by the WordPercentages code. It is done by the printWordSizePercentages method in the driver class.
Project Requirements:
1. Your WordPercentages class must extend the FileAccessor class to read the lines of the text file. Points will be deducted if you do not do this properly.
2. Your WordPercentages class must have a constructor that takes the file name as a parameter.
3. You must define and implement the getWordPercentages method in your WordPercentages class which takes no parameters and returns an array of type double. This array contains 16 cells. The index of each cell is the length of the word, the value of the cell is the percentage of all words in the text that have that length. For example, if the cell at index 5 has the value 12.958167330677291, then approximately 13% of all words in the text had a length of 5. NOTE: The output is formatted in the printWordSizePercentages method to a precision of 2 decimal places. The values in your array are not formatted. Note that the cell at index 0 will not be used, since there are no words of length 0. The cell at index 15 will have the percentage of words of length 15 and greater.
4. You must define and implement the getAvgWordLength method in your WordPercentages class which takes no parameters and returns a single double value which is the average word length that was observed in the text.
5. You MUST use the String class method split with ONLY these delimiters to tokenize each line of text into words: split("[,.;:?!() ]") and no other filtering.
6. You must NOT include words of zero length in your calculations.
7. You must use an array to store the frequencies of word lengths.
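Requirements 5 and 6 interact: because the pattern matches a single delimiter character, two delimiters in a row (such as a comma followed by a space) produce an empty string between them. The sketch below shows this on a made-up line; SplitSketch and words are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SplitSketch {
    // Splits a line with the required delimiters and drops zero-length tokens.
    static List<String> words(String line) {
        List<String> result = new ArrayList<>();
        for (String token : line.split("[,.;:?!() ]")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "Hello, world! (again)";
        // The raw split yields 6 tokens: "Hello", "", "world", "", "", "again".
        // The empty string after the closing parenthesis is dropped by split
        // itself, since String.split discards trailing empty strings.
        System.out.println(line.split("[,.;:?!() ]").length); // 6
        System.out.println(words(line)); // [Hello, world, again]
    }
}
```

Characters outside the delimiter set, such as apostrophes and hyphens, stay inside the word, so "don't" counts as a word of length 5.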
Solution
Answer:
Save the text to be analyzed as a .txt file.
The three classes are FileAccessor, myWordPercentages (this solution's version of WordPercentages) and WordPercentagesDriver:
Class FileAccessor.java
package wordpercentagesdriver;
import java.io.*;
import java.util.Scanner;
public abstract class FileAccessor
{
    String fileName;
    Scanner scan;

    public FileAccessor(String f) throws IOException
    {
        fileName = f;
        scan = new Scanner(new FileReader(fileName));
    }

    public void processFile()
    {
        while(scan.hasNext())
        {
            processLine(scan.nextLine());
        }
        scan.close();
    }

    protected abstract void processLine(String line);

    public void writeToFile(String data, String fileName) throws IOException
    {
        try (PrintWriter pw = new PrintWriter(fileName)) {
            pw.print(data);
        }
    }
}
Class myWordPercentages.java
package wordpercentagesdriver;
import java.io.IOException;
// myWordPercentages extends FileAccessor, which reads the file and
// passes each line to processLine
public class myWordPercentages extends FileAccessor
{
    // mylength[i] counts words of length i; index 15 counts lengths of 15 or more
    int[] mylength = new int[16];
    int mytotalWords = 0;
    long mytotalLetters = 0;

    // constructor: FileAccessor opens the named file
    public myWordPercentages(String myS) throws IOException
    {
        super(myS);
    }

    // called by processFile() once for each line of the input file
    public void processLine(String line)
    {
        // tokenize with exactly the delimiters the assignment requires
        String[] words = line.split("[,.;:?!() ]");
        for (String word : words)
        {
            // adjacent delimiters produce empty tokens, which are not words
            if (word.length() == 0)
            {
                continue;
            }
            mytotalWords += 1;
            mytotalLetters += word.length();
            if (word.length() < 15)
            {
                mylength[word.length()] += 1;
            }
            else
            {
                mylength[15] += 1;
            }
        }
    }

    // percentage of all words that have each length; index 0 is unused
    public double[] getWordPercentages()
    {
        double[] mypercentages = new double[16];
        if (mytotalWords == 0)
        {
            return mypercentages;
        }
        for (int j = 1; j < mypercentages.length; j++)
        {
            mypercentages[j] = (100.0 * mylength[j]) / mytotalWords;
        }
        return mypercentages;
    }

    // average word length: total characters divided by total words.
    // Assumption: words longer than 15 count at their actual length.
    public double getAvgWordLength()
    {
        if (mytotalWords == 0)
        {
            return 0.0;
        }
        return (double) mytotalLetters / mytotalWords;
    }
}
Class WordPercentagesDriver.java
package wordpercentagesdriver;
import java.io.*;
import java.util.Scanner;
public class WordPercentagesDriver
{
    public static void main(String[] args) throws IOException
    {
        try
        {
            String fileName;
            Scanner scan = new Scanner(System.in);
            System.out.println("Enter a text file name to analyze:");
            fileName = scan.nextLine();
            System.out.println("Analyzed text: " + fileName);
            myWordPercentages wp = new myWordPercentages(fileName);
            wp.processFile();
            double [] results = wp.getWordPercentages();
            printWordSizePercentages(results);
            System.out.printf("average word length: %4.2f",wp.getAvgWordLength());
        }
        catch(Exception e)
        {
            System.out.println(e);
        }
    }

    public static void printWordSizePercentages(double[] data)
    {
        for(int i = 1; i < data.length; i++)
            if (i==data.length-1)
                System.out.printf("words of length " + (i) + " or greater: %4.2f%%\n",data[i]);
            else
                System.out.printf("words of length " + (i) + ": %4.2f%%\n",data[i]);
    }
}




