Dmitry Dolgov
2021-02-01
SwiftPredict is a text prediction model for mobile keyboards. Given some text input, it predicts the next word in the sequence, saving keystrokes and time.
Data set: plain text files harvested from Twitter, blogs and news sites
1. 556 MB
2. 2.4 million lines of text
3. 105 million words
Model: stupid backoff using 1- to 6-grams
1. 22.8% accuracy for top-3 predictions
2. 20,000-word dictionary
3. 226 million n-grams analyzed
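To illustrate how stupid backoff scoring over 1- to 6-grams can work, here is a minimal Python sketch. It is not the app's actual code: the function names (`build_counts`, `score`, `predict`), the backoff factor of 0.4 (the value commonly used with this method), and the toy corpus in the usage note are all assumptions made for illustration.

```python
from collections import Counter

ALPHA = 0.4  # assumed backoff factor; the app's actual value is not stated


def build_counts(tokens, max_order=6):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(word, context, counts, total_unigrams):
    """Stupid backoff score of `word` following `context`.

    Use the relative frequency at the longest context that has been seen
    together with `word`; each time the context is shortened by one word,
    multiply the score by ALPHA. Scores are not normalized probabilities.
    """
    context = tuple(context)
    penalty = 1.0
    while context:
        seen = counts.get(context + (word,), 0)
        if seen > 0:
            return penalty * seen / counts[context]
        context = context[1:]  # drop the oldest word and back off
        penalty *= ALPHA
    # Fall back to the unigram relative frequency.
    return penalty * counts.get((word,), 0) / total_unigrams


def predict(context, counts, vocab, total_unigrams, k=3, max_order=6):
    """Return the k highest-scoring next words (ties broken alphabetically)."""
    keep = max_order - 1  # a 6-gram model uses at most 5 context words
    context = tuple(context)[-keep:] if keep > 0 else ()
    ranked = sorted(vocab, key=lambda w: (-score(w, context, counts, total_unigrams), w))
    return ranked[:k]
```

For example, with the toy corpus `"the cat sat on the mat the cat ate".split()`, calling `predict(["the"], counts, vocab, len(tokens))` ranks "cat" first, because "the cat" occurs twice while "the mat" occurs once. Scoring every dictionary word on each call is fine for a sketch; a production model would more likely precompute the top candidates for each context.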
https://dmitrytoda.shinyapps.io/SwiftPredict/