This archive contains the dataset "Chronicling America Daily Words", which corresponds to the paper:

Dzogang, F. et al. (2016) Discovering Periodic Patterns in Historical News. Plos ONE.

Please use the following citation when citing the data:

@article{Dzogang2016PeriodicPatternsHistoricalNews,
  title={Discovering Periodic Patterns in Historical News},
  author={Dzogang, Fabon and Lansdall-Welfare, Thomas and {The FindMyPast Newspaper Team} and Cristianini, Nello},
  journal={Plos ONE},
  year={2016},
  publisher={Public Library of Science}
}

The content of the archive is listed below:

data/ (daily frequency records for 25K words)
- 25,001 files are released, corresponding to the daily frequency of the most published words in news content in the United States between 1st January 1836 and 31st December 1922. The frequency was measured from a representative set of newspapers across the United States at the time.
- Each file in this folder is named after the word it describes: `stem_identifier'.csv, where `stem_identifier' was computed with the Porter Stemming algorithm. All files have two columns delimited by a space: `YYYYMMDD frequency_count'.
- Days not listed in a file have a frequency of 0.
- To keep the series aligned across years, the 29th of February was not included in leap years.

daily_total_frequencies.csv (daily records of total frequency)
- 1 file containing the daily records of the total frequency of all words, with no gaps in time. A gap is defined as a contiguous nine-year interval where the frequency is 0.
- To keep the series aligned across years, the 29th of February was not included in leap years.
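
The file layout described above can be read with a few lines of Python. The sketch below is only an illustration and is not part of the released archive: the function names (parse_series, calendar_days, dense_series) are hypothetical, the sample lines are made up, and it assumes that frequency_count parses as a number. It turns `YYYYMMDD frequency_count' lines into a mapping, then builds a series with one entry per day between 1st January 1836 and 31st December 1922, filling days that are not listed with 0 and skipping the 29th of February.

```python
from datetime import date, timedelta


def parse_series(lines):
    """Parse `YYYYMMDD frequency_count' lines into a {date: count} mapping."""
    series = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        day, count = line.split()
        key = date(int(day[:4]), int(day[4:6]), int(day[6:8]))
        # Assumption: counts are numeric; float() accepts integers too.
        series[key] = float(count)
    return series


def calendar_days(start, end):
    """Yield every day from start to end inclusive, skipping 29 February."""
    day = start
    while day <= end:
        if not (day.month == 2 and day.day == 29):
            yield day
        day += timedelta(days=1)


def dense_series(series, start=date(1836, 1, 1), end=date(1922, 12, 31)):
    """Return (day, count) pairs for every day, using 0 for unlisted days."""
    return [(day, series.get(day, 0.0)) for day in calendar_days(start, end)]


# Made-up sample lines in the documented two-column format.
sample = ["18360101 3", "18360103 5"]
parsed = parse_series(sample)
filled = dense_series(parsed, date(1836, 1, 1), date(1836, 1, 4))
```

To read a real file, pass an open file object instead of the sample list, for example parse_series(open(path)), where path points to one of the files in data/. Because the 29th of February is skipped, every year contributes exactly 365 entries, so the same position in each year refers to the same calendar day.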