awk '{w++;s+=length($1)*$2;t+=$2}END{print "words ",t,w,t/w," chars ",s,s/t}'

                             words                letters            sentences
                      total   diff.  repeat     total  per word      total  words per
Shakespeare(all):    910655   23513    38.7   3696317      4.06      48101       18.9
Bible(KJV):          791829   12569    64.0   3224190      4.07      29811       26.6
Ulysses(Joyce):      268859   29020     9.3   1182287      4.40      23993       11.2
MobyDick:            214672   16693    12.8    935729      4.36      10119       21.2
PridePrejudice:      122816    6258    19.6    536408      4.37       7141       17.2
TimeMachine:          32775    4579     7.1    140653      4.29       1832       17.9
AliceWonderland:      27338    2573    10.6    107690      3.94       1637       16.7
Peter and Wendy:      47382    4971     9.5    195745      4.13       3330       14.2
SherlockAdventures:  104403    8333    12.5    431734      4.13       7289       14.3
Billy Budd:           30319    5857     5.2    143355       4.7       1195       25.3

The count of different words includes proper nouns and variants of a word.
The number of sentences is the sum of periods, !s, and ?s.
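The awk command above reads $1 as a word and $2 as its count, so it appears to expect one "word count" pair per line. The notes do not say how that input was built, so the following is only a sketch under stated assumptions: words are taken to be runs of ASCII letters, case is folded, and the function names word_freq, count_sentences, and word_stats are made up for illustration. Different tokenizing choices (apostrophes, hyphens, case) would give somewhat different totals from those in the table.

```shell
# Sketch only: the tokenizing rules and the function names are assumptions.

# word_freq FILE: print one "word count" line per different word.
word_freq() {
    tr -cs 'A-Za-z' '\n' < "$1" |   # every run of non-letters becomes a newline
        tr 'A-Z' 'a-z' |            # fold case so "The" and "the" match
        sort | uniq -c |            # count occurrences of each word
        awk 'NF == 2 {print $2, $1}'  # drop the empty token, swap to "word count"
}

# count_sentences FILE: count periods, !s, and ?s, as the notes describe.
count_sentences() {
    tr -cd '.!?' < "$1" | wc -c | tr -d ' '
}

# word_stats: the awk command from the notes, reading "word count" lines on stdin.
# Prints total words, different words, repeats, total letters, letters per word.
word_stats() {
    awk '{w++;s+=length($1)*$2;t+=$2}END{print "words ",t,w,t/w," chars ",s,s/t}'
}

# Usage (book.txt is a placeholder file name):
#   word_freq book.txt | word_stats
#   count_sentences book.txt
```

Words per sentence is then the total word count divided by the result of count_sentences.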