Answer:

It converts the microphone input into a stream of raw data, then compares that data against hundreds or even thousands of stored voice samples. The output is the best match, cleaned up and returned as a string of words.
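As a rough illustration of that convert-then-compare idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: the names `to_raw_data`, `distance`, `recognize`, and `VOICE_SAMPLES` are hypothetical, the "voice samples" are two made-up amplitude lists, and the feature used (average loudness per frame) is far simpler than what real speech recognizers use.

```python
import math

def to_raw_data(samples, frame_size=4):
    """Turn amplitude samples into a short list of numbers:
    the average absolute amplitude of each full frame (a toy feature)."""
    return [
        sum(abs(s) for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def distance(a, b):
    """Euclidean distance over the frames both feature lists share."""
    n = min(len(a), len(b))
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(n)))

def recognize(samples, voice_samples):
    """Return the word whose stored sample is closest to the input."""
    features = to_raw_data(samples)
    return min(voice_samples, key=lambda word: distance(features, voice_samples[word]))

# Hypothetical stored samples: "yes" is loud then quiet, "no" is quiet then loud.
VOICE_SAMPLES = {
    "yes": to_raw_data([0.9, 0.8, 0.9, 0.7, 0.1, 0.2, 0.1, 0.1]),
    "no": to_raw_data([0.1, 0.2, 0.1, 0.1, 0.9, 0.8, 0.9, 0.7]),
}

# Made-up "microphone input" that resembles the stored "yes" sample.
mic_input = [0.8, 0.9, 0.7, 0.8, 0.2, 0.1, 0.2, 0.1]
print(recognize(mic_input, VOICE_SAMPLES))  # prints "yes"
```

The sketch only shows the shape of the process: digitize the input, compare it against every stored sample, and return the word attached to the closest one.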