UNIX Time-Sharing System: Statistical Text Processing

Abstract

Several studies of the statistical properties of English text have used the UNIX system and UNIX programming tools. This paper describes several of the useful UNIX facilities for statistical studies and summarizes some studies that have been made at the character level, the character-string level, and the level of English words. The descriptions give a sample of the results obtained and constitute a short introduction, by case study, on how to use UNIX tools for studying the statistics of English.
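
As a rough illustration of the three levels named in the abstract, the sketch below counts single letters, within-word character strings of length n, and whole words in a short sample. It is not taken from the paper: it is written in Python rather than with the UNIX tools the paper describes, and the function name `text_statistics`, the parameter `n`, and the sample sentence are assumptions made for this example only.

```python
from collections import Counter
import re


def text_statistics(text, n=2):
    """Return (letter counts, n-gram counts, word counts) for a text sample.

    Hypothetical helper for illustration; not the method used in the paper.
    """
    # Fold case and keep only runs of ASCII letters as words.
    words = re.findall(r"[a-z]+", text.lower())

    # Character level: frequency of each letter across all words.
    char_counts = Counter("".join(words))

    # Character-string level: n-grams taken inside each word,
    # so no n-gram spans a word boundary.
    ngram_counts = Counter(
        word[i:i + n] for word in words for i in range(len(word) - n + 1)
    )

    # Word level: frequency of each distinct word.
    word_counts = Counter(words)

    return char_counts, ngram_counts, word_counts


sample = "the cat and the hat"
chars, digrams, words = text_statistics(sample)
print(words.most_common(1))  # [('the', 2)]
```

Restricting n-grams to the inside of words is one possible design choice; a study interested in letter transitions across word boundaries would count over the running text instead.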

Keywords

TEXT
UNIX
CHARACTER
GIVE
MADE
CONSTITUTE
WORDS
STRING
INTRODUCTION
ENGLISH
