The digits 0 through 9 appear about the same number of times among the first 100,000 digits. You probably expected this. But do all the two-digit strings (00, 01, 02, 03, … 98, 99) appear about the same number of times? (To be clear, pi begins with the two-digit strings 31, 14, 41, 15, etc.) Using a for-loop, move through the digits array looking at two-digit strings (there will be 99,999 of these), and count how many of each there are. You might want to make your own R script (File >> New File >> R Script), since this will likely take a few lines of code. What does your analysis reveal? [code, answer with some justification – do not write out the counts for all two-digit strings!] (If your code is correct, you should have 998 “00”s, 1027 “01”s, ….)
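The counting loop described above can be sketched as follows. This is a minimal sketch in bash rather than R, under the assumption that the digits are available as one string; `count_pairs` is a hypothetical helper name, and the sample passed to it is only the first 36 digits of pi, which is far too few to say anything about uniformity.

```bash
#!/bin/bash
# count_pairs DIGITS (hypothetical helper): print every overlapping
# two-digit string found in DIGITS together with how often it occurs.
count_pairs () {
    local digits=$1
    local i
    # a string of length L contains L-1 overlapping two-digit strings
    for (( i = 0; i < ${#digits} - 1; i++ ))
    do
        echo "${digits:i:2}"
    done | sort | uniq -c | awk '{ print $2, $1 }'
}

# first 36 digits of pi, decimal point removed
count_pairs "314159265358979323846264338327950288"
```

With the full 100,000 digits the same function would print up to 100 lines, so one would summarise the counts (for example smallest and largest) rather than list them all.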
Solution
Consider all numbers from 1 to 100. For the purpose of counting 5's this is the same as the two-digit strings 00 to 99, since neither 00 nor 100 contains a 5. Suppose we already know the answer for the range 0 to 9, which is 1. The range 0 to 9 repeats ten times in the last digit, once for each leading digit, and each repetition increases the number of 5's by 1. In addition, when the leading digit is 5, each of the ten numbers 50 to 59 contributes one more 5, so we have to add another 10. This gives us a total of (1)(10) + 10 = 20.
Similarly, the answer for the range 000 to 999 is (20)(10) + 100 = 300, and the answer for the range 0000 to 9999 is (300)(10) + 1000 = 4000.
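These small cases can be checked by brute force. The following is a minimal sketch, assuming bash and the standard `seq`, `tr` and `wc` utilities; `count_fives` is a hypothetical helper name, not part of the original solution.

```bash
#!/bin/bash
# count_fives N (hypothetical helper): count how often the digit 5
# is written when listing all numbers from 1 to 10^N.
count_fives () {
    local n=$1
    local upper=$(( 10 ** n ))
    # delete every character except 5, then count what is left;
    # the last tr strips the padding some wc versions add
    seq 1 "$upper" | tr -cd '5' | wc -c | tr -d ' '
}

for n in 1 2 3 4
do
    echo "n=$n: $(count_fives "$n")"
done
# prints 1, 20, 300 and 4000 for n = 1, 2, 3, 4
```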
In general, if you write all numbers from 1 to $10^n$, the digit 5 will be written $n \cdot 10^{n-1}$ times.
Proof:
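It suffices to consider the numbers from 00...000 to 99...999, i.e. all $n$-digit strings, where 0 is allowed as a leading digit: padding with zeros adds no 5's, and neither the string 00...000 nor the number $10^n$ contains a 5. For $n = 1$ the digit 5 is written exactly once, which matches $1 \cdot 10^0 = 1$.
Now suppose 5 is written $k$ times in all $(n-1)$-digit strings. Then it is written $k$ times in the trailing digits of all $n$-digit strings starting with $i$, for each digit $i$. If $i = 5$, we still get those $k$ 5's, plus an extra 5 for each of the $10^{n-1}$ strings that start with 5. In total our answer is $10k + 10^{n-1}$.
By the inductive hypothesis, $k = (n-1) \cdot 10^{n-2}$, so our answer is $10(n-1) \cdot 10^{n-2} + 10^{n-1} = (n-1) \cdot 10^{n-1} + 10^{n-1} = n \cdot 10^{n-1}$. This proves the statement.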
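The inductive step can also be checked numerically. The sketch below, assuming bash arithmetic, iterates the recurrence "ten times the previous count plus $10^{n-1}$" and compares it with the closed form; both function names are hypothetical.

```bash
#!/bin/bash
# fives_recurrence N (hypothetical helper): start from f(1) = 1 and
# apply f(m) = 10 * f(m-1) + 10^(m-1) up to m = N.
fives_recurrence () {
    local n=$1 m f=1
    for (( m = 2; m <= n; m++ ))
    do
        f=$(( 10 * f + 10 ** (m - 1) ))
    done
    echo "$f"
}

# fives_closed_form N (hypothetical helper): the closed form n * 10^(n-1)
fives_closed_form () {
    local n=$1
    echo $(( n * 10 ** (n - 1) ))
}

for n in 1 2 3 4 5 6
do
    echo "n=$n: recurrence=$(fives_recurrence "$n") closed form=$(fives_closed_form "$n")"
done
```

The two columns agree for every n tried, which is what the induction proves in general.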
Code:
#!/bin/bash
# Counting the number of lines in a list of files
# function version

# store the names of all C source and header files in $files, one per line
get_files () {
    files="$(ls *.[ch])"
}

# store the number of lines of the given file in $l
count_lines () {
    f=$1 # 1st argument is filename
    # number of lines; tr strips the padding some wc versions add
    l=$(wc -l < "$f" | tr -d ' ')
}

# the script should be called without arguments
if [ $# -ge 1 ]
then
    echo "Usage: $0"
    exit 1
fi

# split by newline
IFS=$'\012'

echo "$0 counts the lines of code"
l=0
n=0
s=0
get_files

# iterate over this list
for f in $files
do
    count_lines "$f"
    echo "$f: $l loc"
    file[$n]=$f
    lines[$n]=$l
    n=$(( n + 1 ))
    s=$(( s + l ))
done

echo "$n files in total, with $s lines in total"

# arrays are 0-indexed, so index 5 is the sixth file
# (both values are empty if fewer than six files were found)
i=5
echo "The $i-th file was ${file[$i]} with ${lines[$i]} lines."

