Reaction Statistics
Background
When collecting experimental data from chemical reactions, it’s often useful to generate statistics based on the data. One experimental measure is the reaction rate in moles per second, representing the amount of product formed per unit time. If we have a set of these reaction rates collected in a data file, we can calculate summary statistical information, such as the minimum and maximum values, the arithmetic mean, variance, and standard deviation.
Finding the minimum and maximum is straightforward: we scan through all the data, and keep track of the smallest and largest values encountered. The arithmetic mean (or average) is defined as:
m = (x1 + x2 + … + xn)/n
where n is the number of reaction rates, and xi represents one experimental reaction rate. Once you have the arithmetic mean, the variance can be calculated as the mean of the squares of the deviations from the mean:
v = ((x1 - m)^2 + (x2 - m)^2 + … + (xn - m)^2)/n
where n is the number of reaction rates, xi represents one experimental reaction rate, and m is the arithmetic mean of the reaction rates. Once you have the variance, you can calculate the standard deviation as:
s = sqrt(v)
Assignment
You will develop a C program that reads data from an input text file containing chemical reaction rates (in moles per second), and computes the minimum, maximum, arithmetic mean, variance, and standard deviation for that set of data. Your instructor will provide input text files, which will each contain a series of double values, each on a line of its own within the file. Your program will read one of these input files into an array of doubles (i.e., it will populate the array using the data values from the file). Your program will then calculate statistics using that array of doubles, and will write the results out to a separate output text file.
The goals of this assignment are to provide you with experience reading and writing text data files, provide you with experience passing an array into a function, and give you more experience organizing your program into separate C functions.
When defining your C functions, you may either:
Define the functions before they are used by any other functions, OR
Place function prototypes near the top of your code (after all #include directives), and then define the functions in any order.
Part 1 – Opening Files and Reading Data
Create a new Visual Studio Win32 Console project named reactionstats. Create a new C source file named project4.c within that project. At the top of the source file, #define _CRT_SECURE_NO_WARNINGS, and then include stdio.h, math.h, stdlib.h, stdbool.h, and float.h.
Inside your main function, define the following:
A one-dimensional array of 600 doubles. They do not need to be initialized to anything at this stage.
An integer variable to hold the number of elements in the array, initialized using the approach demonstrated in class, using sizeof.
A FILE pointer variable, which will refer to the input data text file. You will later store the return value of fopen into this pointer variable.
Another FILE pointer variable, which will refer to the output text file. You will later store the return value of fopen into this pointer variable.
Download the ZIP file, extract the two input data files from it, and place the two files in a known location on your local computer. You will need to know the exact full pathname of the files in a later step, so take note of where they are stored. Choose an output file name, where your program will write its statistics. Ideally, the folder/directory of the output file should be the same as the folder containing the input files, but the output file name itself must be different from either of the input file names. Note that you don't need to create the output file at this stage, because your program will create it.
In your main function, after the above definitions, call the fopen function to open the first input data file for reading, and store the result of fopen into one of your FILE pointer variables. Then call fopen again, this time to open your output file for writing. When you specify pathnames in fopen, be sure to use two backslashes for each single backslash that is part of the path name. For example, if your input file is on the F drive and the full pathname is "f:\data\reaction-data1.txt", use "f:\\data\\reaction-data1.txt" in the call to fopen. Verify that you have specified the pathnames to the first input file and your output file correctly before proceeding.
Defining the ReadDoublesIntoArray Function
Define a separate C function named ReadDoublesIntoArray. This function has the following prototype:
int ReadDoublesIntoArray(FILE *fp, double data[], int arrayElements); or
int ReadDoublesIntoArray(FILE *fp, double *data, int arrayElements);
The two notations are equivalent. You can use either notation in your function definition. The function receives as parameters the FILE pointer of a file that has already been opened for reading, the address of a one-dimensional array of doubles, and the number of available elements in that array. The function returns the number of data values actually read from the file. The function must behave as follows:
Validate the incoming parameters: If the FILE pointer is NULL, or the array pointer is NULL, or the number of array elements is less than 1, return –1 to the caller. Otherwise, continue to the next step.
Define an integer variable that will be used to contain the number of values read from the file. Inside a loop (of your choice), do the following:
Call fscanf to read one double value into one local double variable, and check that the fscanf function returns 1 (indicating success) before copying the value into an element of the array. If the call to fscanf fails (returns anything other than 1), immediately break out of the loop with a break statement. (Failure of fscanf typically means either you've reached the end of the file or the file contains something that isn't a double value.)
If the number of double values in the file exceeds the number of available elements in your array, you must stop reading the file (end the loop) when you reach the array's limit.
You must keep track of how many values were successfully read from the file and stored into the array.
After the loop is complete (either because fscanf reported failure or you reached the limit of available array elements), return to the caller the number of values successfully read from the file into the array.
That's all there is in the ReadDoublesIntoArray function definition. (Note that this function could be used to read data from a file into any array of doubles.)
Calling the ReadDoublesIntoArray Function
Back inside your main function, after the two fopen calls you wrote earlier, use an if statement to check whether either of the FILE pointers returned by fopen is NULL. If either one is NULL, print an appropriate error message to the console window, and return –1 from main. Otherwise, proceed to call your ReadDoublesIntoArray function, passing in your input FILE pointer, the address of your array of doubles, and the number of elements in the array (not a hard-coded value, but the variable containing the computed number of elements you defined earlier in main). Store the integer result returned by ReadDoublesIntoArray into a local integer variable.
Immediately after the call to ReadDoublesIntoArray, use fclose to close the input file (because you've already got all the data stored in your array), and then print the following to the console window:
Reaction count      = 516
First reaction rate = 48.1000 moles/sec
Last reaction rate  = 38.2220 moles/sec
where the first value is the value that was returned by ReadDoublesIntoArray (the number of values actually read from the file and stored into your array), the second value is the very first element of your array (which should be the same as the first value in the data file), and the third value is the last element containing data in your array (which should be the same as the last value in the data file, in this case).
Build and run your program, to verify that you are able to open the first input file, read the values, print these lines of output to the console window, and that the values are correct (they exactly match this sample output above, when using the first input data file).
Now in your code change the name of the input file you're opening to the second input data file. Build and run your program again, and verify that you get the following output in the console window:
Reaction count      = 600
First reaction rate = 48.1000 moles/sec
Last reaction rate  = 40.3800 moles/sec
In this case, using the second data file, which contains more than 600 values, you are testing to make sure that you don't place more than 600 values into your array. The second value is the first element of your array, and the third value is the last element of your array (which should be the same as the 600th value in the data file).
Confirm that you are getting the correct output in the console window for both data files, and fix any problems you find before proceeding to the next part.
Part 2 – Generating Statistics
Define a separate C function named GenerateStatistics. This function has the following prototype:
bool GenerateStatistics(FILE *fp, double data[], int dataElementsToProcess); or
bool GenerateStatistics(FILE *fp, double *data, int dataElementsToProcess);
You can use either notation in your function definition. The function receives as parameters the FILE pointer of a file that has already been opened for writing, the address of a one-dimensional array of doubles, and the number of elements in that array that contain valid data. The function returns true if it succeeds and false if it fails. The function must behave as follows:
Validate the incoming parameters: If the FILE pointer is NULL, or the array pointer is NULL, or the number of array elements is less than 1, return false to the caller. Otherwise, continue to the next step.
Define local double variables to hold each of the following: the sum of the values (initialized to 0), the maximum value (initialized to 0), the minimum value (initialized to the largest double value, DBL_MAX, defined in float.h), the mean, the variance, and the standard deviation.
Using a for loop (which needs its own loop variable), run through all valid elements of the array, starting from index 0. Inside the loop:
o Add the array element's value to the sum.
o Update the minimum and maximum variables. (You may use if statements for this, or you may use min and max, defined in stdlib.h.)
After the end of this loop, compute the mean using the formula on page 1, using the dataElementsToProcess parameter as the denominator.
Reset the sum variable to 0.
Using another for loop (which needs its own loop variable), run through all valid elements of the array, starting from index 0. Inside the loop:
o Use the sum variable to accumulate the numerator of the variance formula shown on page 1.
After the end of this loop, compute the variance by dividing the new sum by the dataElementsToProcess parameter.
Compute the standard deviation as shown on page 1, using the variance you've just calculated.
Using a series of calls to the fprintf function, passing in the output FILE pointer that was passed into GenerateStatistics, print the values of your variables in the following format in the output file, not to the console window. Here is a sample using the first input data file:
Reaction Rate Statistics:
Number of experiments = 516
Minimum rate          = 21.1300 moles/sec
Maximum rate          = 57.1300 moles/sec
Arithmetic mean       = 43.4559 moles/sec
Variance              = 15.2356
Standard deviation    =  3.9033 moles/sec
Return true to the caller.
All output is generated from within GenerateStatistics, and all of its output goes to the output file (not to the console window).
That's all there is in the GenerateStatistics function definition.
Calling the GenerateStatistics Function
Back inside your main function, in an if statement, check to see if the integer value you received from ReadDoublesIntoArray is greater than 0. If it's not, print an appropriate error message to the console window and return –1 from main. Otherwise, proceed to call GenerateStatistics, passing in your output FILE pointer, the address of your array, and the number of valid elements (which is the value you received from your earlier call to ReadDoublesIntoArray).
If the call to GenerateStatistics returns false, print an appropriate error message to the console window and return –1 from main. Otherwise, if GenerateStatistics returns true, use fclose to close the output file, and return 0 from main.
That's all there is in the main function.
Part 3 – Testing Your Program
Two data files are provided by your instructor for testing your program:
reaction-data1.txt contains fewer than 600 values.
reaction-data2.txt contains more than 600 values.
Be sure that your program behaves correctly with each of these two input data files. When using reaction-data1.txt, you should see the following in the output file:
Reaction Rate Statistics:
Number of experiments = 516
Minimum rate          = 21.1300 moles/sec
Maximum rate          = 57.1300 moles/sec
Arithmetic mean       = 43.4559 moles/sec
Variance              = 15.2356
Standard deviation    =  3.9033 moles/sec
When using reaction-data2.txt, you should see the following in the output file:
Reaction Rate Statistics:
Number of experiments = 600
Minimum rate          = 21.1300 moles/sec
Maximum rate          = 57.1300 moles/sec
Arithmetic mean       = 43.4415 moles/sec
Variance              = 15.1802
Standard deviation    =  3.8962 moles/sec
These are the actual results from the actual input data files supplied by your instructor, so that you can verify your program is working. When generating your output, pay attention to detail. Make sure your output file format conforms to these examples.
Before submitting your project, verify that your program’s output is correct, the alignment of decimal points in the output is correct, the number of decimal places shown is correct, etc. Also, be sure to make whatever additions are necessary to meet all of the requirements listed in the Project Submission Requirements document.
Solution
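#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <stdbool.h>
#include <float.h>

int ReadDoublesIntoArray(FILE *fp, double data[], int arrayElements);
bool GenerateStatistics(FILE *fp, double data[], int dataElementsToProcess);

int main(void)
{
    double arr[600];
    /* Number of elements (not bytes): total size divided by the size of one element. */
    int n = (int)(sizeof(arr) / sizeof(arr[0]));
    int result = 0;
    FILE *ip, *op;

    /* Replace these paths with the locations of your own files.
       Output.txt is a placeholder name; it must differ from the input file name. */
    ip = fopen("C:\\Users\\IBM_ADMIN\\Downloads\\Input.txt", "r");
    op = fopen("C:\\Users\\IBM_ADMIN\\Downloads\\Output.txt", "w");
    if (ip == NULL || op == NULL)
    {
        printf("Failed to open a file\n");
        if (ip != NULL)
            fclose(ip);
        if (op != NULL)
            fclose(op);
        return -1;
    }

    result = ReadDoublesIntoArray(ip, arr, n);
    fclose(ip);

    /* Check the count before touching arr[result - 1]. */
    if (result <= 0)
    {
        printf("No data could be read from the input file\n");
        fclose(op);
        return -1;
    }

    printf("Reaction count      = %d\n", result);
    printf("First reaction rate = %7.4f moles/sec\n", arr[0]);
    printf("Last reaction rate  = %7.4f moles/sec\n", arr[result - 1]);

    /* Pass the output file and the number of values actually read. */
    if (!GenerateStatistics(op, arr, result))
    {
        printf("Writing data to file failed\n");
        fclose(op);
        return -1;
    }

    fclose(op);
    return 0;
}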
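bool GenerateStatistics(FILE *fp, double data[], int dataElementsToProcess)
{
    double sum = 0, maxValue = 0, minValue = DBL_MAX;
    double mean, var, sd;
    int i;

    if ((fp == NULL) || (data == NULL) || (dataElementsToProcess < 1))
        return false;

    /* First pass: accumulate the sum and track the smallest and largest values. */
    for (i = 0; i < dataElementsToProcess; i++)
    {
        sum = sum + data[i];
        if (data[i] > maxValue)
            maxValue = data[i];
        if (data[i] < minValue)
            minValue = data[i];
    }
    mean = sum / dataElementsToProcess;

    /* Second pass: accumulate the squared deviations from the mean. */
    sum = 0;
    for (i = 0; i < dataElementsToProcess; i++)
    {
        sum = sum + pow(data[i] - mean, 2);
    }
    var = sum / dataElementsToProcess;
    sd = sqrt(var);

    /* All results go to the output file, not the console. */
    fprintf(fp, "Reaction Rate Statistics:\n");
    fprintf(fp, "Number of experiments = %d\n", dataElementsToProcess);
    fprintf(fp, "Minimum rate          = %7.4f moles/sec\n", minValue);
    fprintf(fp, "Maximum rate          = %7.4f moles/sec\n", maxValue);
    fprintf(fp, "Arithmetic mean       = %7.4f moles/sec\n", mean);
    fprintf(fp, "Variance              = %7.4f\n", var);
    fprintf(fp, "Standard deviation    = %7.4f moles/sec\n", sd);
    return true;
}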
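int ReadDoublesIntoArray(FILE *fp, double data[], int arrayElements)
{
    int count = 0;
    double value;

    if ((fp == NULL) || (data == NULL) || (arrayElements < 1))
        return -1;

    /* Stop at the array's limit, or as soon as fscanf fails to read a double. */
    while (count < arrayElements)
    {
        if (fscanf(fp, "%lf", &value) != 1)
            break;
        data[count] = value;
        count++;
    }
    return count;
}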







