This is a hot topic now, and a lot is being written about it.
I work with speech recognition in conversational agents, and I can tell you it’s still far from perfect. Recognition errors are amusing in a chatbot, but the technology isn’t ready for mission-critical applications.
But voice search is not simply speech-to-text; it also implies text-to-speech for reading results back. For instance, all of the Twitter readers I’ve tried are still imperfect and have a hard time with hashtags, URLs, and SMS language, which doesn’t make for fun listening.
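To make the hashtag, URL, and SMS-language problem concrete, here is a minimal sketch of the kind of cleanup a reader could run before handing a tweet to a text-to-speech engine. Everything in it is an assumption for illustration: the `normalize_for_speech` name, the tiny `SMS_WORDS` table, and the choice to say "link" and "hashtag" aloud are not taken from any particular product.

```python
import re

# Hypothetical abbreviation table; a real reader would need a far larger one.
SMS_WORDS = {"u": "you", "gr8": "great", "thx": "thanks", "btw": "by the way"}

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
# Zero-width position between a lowercase and an uppercase letter,
# used to split CamelCase hashtags such as #VoiceSearch.
CAMEL_RE = re.compile(r"(?<=[a-z])(?=[A-Z])")


def normalize_for_speech(text: str) -> str:
    """Rewrite a tweet so that a TTS engine has a chance of reading it naturally."""
    # Spelling out a URL is useless to a listener, so announce it instead.
    text = URL_RE.sub("link", text)
    # "#VoiceSearch" becomes "hashtag Voice Search".
    text = HASHTAG_RE.sub(
        lambda m: "hashtag " + CAMEL_RE.sub(" ", m.group(1)), text
    )
    # Expand known SMS abbreviations word by word (exact matches only,
    # so "thx," with trailing punctuation is left alone in this sketch).
    words = [SMS_WORDS.get(w.lower(), w) for w in text.split()]
    return " ".join(words)


print(normalize_for_speech("thx u for the gr8 talk #VoiceSearch http://example.com/x"))
```

Even this toy version shows why the readers struggle: all-lowercase hashtags, punctuation attached to abbreviations, and ambiguous shorthand all need more than a lookup table.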
The big component that’s still missing in almost all applications is summarization. A listener can’t skim spoken output the way a reader skims a page, so any voice search will need to at least partially abbreviate and interpret, if not tailor, its results.
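As a rough illustration of what "abbreviate" could mean at its very simplest, the sketch below picks the sentences whose words occur most often in the text and reads only those. This is a naive frequency-based extractive heuristic chosen for brevity, not how any real voice product summarizes; the `summarize` function and its `max_sentences` parameter are hypothetical names.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[a-z']+")


def summarize(text: str, max_sentences: int = 1) -> str:
    """Keep the highest-scoring sentences, in their original order."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Count longer words only, as a crude stand-in for a stop-word list.
    freq = Counter(w for w in WORD_RE.findall(text.lower()) if len(w) > 3)

    def score(sentence: str) -> int:
        # Short words were never counted, so they contribute zero here.
        return sum(freq[w] for w in WORD_RE.findall(sentence.lower()))

    ranked = sorted(
        range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True
    )
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


results = (
    "Voice search is growing. "
    "Voice search needs summarization of search results. "
    "Cats sleep."
)
print(summarize(results, max_sentences=1))
```

Note that this scoring favors long sentences and does no interpreting or tailoring at all, which is exactly the harder part that is still missing.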
I imagine that in the medium-term future people will cease to surf the web as they do today, and a layer of (voice) agents will use the web as we know it as their backend.