SwiftPredict. Ngram text prediction model

Dmitry Dolgov
2021-02-01

SwiftPredict is a text prediction model for mobile keyboards. Given some text input, it produces the next word in the sequence, saving keystrokes and time.


Algorithm

  • Technology: implemented in R, using the quanteda package for text mining and data.table for data storage and search optimization
  • Data set: plain text files harvested from Twitter, blogs and news sites

    1. 556 MB
    2. 2.4 million lines of text
    3. 105 million words
    
  • Model: stupid backoff using 1- to 6-grams

    1. 22.8% accuracy for top-3 predictions
    2. 20,000-word dictionary
    3. 226 million ngrams analyzed
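
To make the backoff idea concrete, here is a minimal sketch of stupid backoff scoring. It is not the SwiftPredict code, which is written in R with quanteda and data.table; it is a small Python illustration under several assumptions: whitespace tokenization, a backoff factor of 0.4 (a commonly used default; the slides do not state the value used), and no dictionary pruning. The function names `build_counts`, `build_index` and `predict` are hypothetical.

```python
from collections import Counter, defaultdict

ALPHA = 0.4  # assumed backoff factor; the slides do not give the value used
MAX_N = 6    # the slides mention 1- to 6-grams


def build_counts(sentences, max_n=MAX_N):
    """Count every ngram of order 1..max_n, keyed by a tuple of words."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()  # assumption: naive whitespace tokens
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts


def build_index(counts):
    """Map each context (ngram minus its last word) to next-word counts."""
    index = defaultdict(Counter)
    for ngram, c in counts.items():
        index[ngram[:-1]][ngram[-1]] = c
    return index


def predict(text, counts, index, k=3, max_n=MAX_N):
    """Return up to k (word, score) pairs, best first, using stupid backoff."""
    words = text.lower().split()
    total_unigrams = sum(index[()].values())
    scores = {}
    penalty = 1.0
    # Try the longest usable context first, then drop its leftmost word.
    for ctx_len in range(min(max_n - 1, len(words)), -1, -1):
        context = tuple(words[len(words) - ctx_len:])
        followers = index.get(context)
        if followers:
            denom = counts[context] if context else total_unigrams
            for word, c in followers.items():
                # A word keeps the score from the longest context it was seen in.
                scores.setdefault(word, penalty * c / denom)
        penalty *= ALPHA  # each backoff step multiplies the score by ALPHA
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Toy corpus, made up for illustration only.
corpus = [
    "i love new york",
    "i love new york",
    "i love new cars",
    "you love old cars",
]
counts = build_counts(corpus)
index = build_index(counts)
print([w for w, _ in predict("i love new", counts, index)])
# prints ['york', 'cars', 'love']
```

In this toy run, "york" and "cars" are scored from the full three-word context, while "love" only appears after backing off to unigrams and is therefore penalized by ALPHA three times. The scores are relative frequencies, not normalized probabilities, which is what distinguishes stupid backoff from smoothed models.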
    

Proof of concept web app

https://dmitrytoda.shinyapps.io/SwiftPredict/

  • accepts arbitrary text as input
  • produces top-3 predictions
  • explains how the predictions were obtained