Well, this isn't strictly on-topic for the site, but there's probably a lot of overlapping interest among people here in building a local smart speaker.
It came up in another thread with @KevinT, but was really not the topic of that thread, so I'm starting this one to have a better place to hold the discussion.
So the current "smart speakers" of yours are really just output devices? Is the MQTT-to-speech conversion internal to the phone, or does something else do that?
I'd definitely be interested in continuing the discussion. It's kind of off topic for this whole site, since the project's use of MySensors would be minimal. There would be a two-way integration, but it wouldn't be very tightly coupled.
My thought was a Raspberry Pi, or even a more powerful server if needed, for the speech recognition and the responses, and then just Raspberry Pi Zeros for the satellite devices. (They would be doing wake word detection and then simply passing audio back and forth, so not a lot of power is needed there.) It could be a pretty cheap solution overall, while keeping full privacy because everything would be done here at the house, not sent out to the cloud.
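To make the satellite idea a bit more concrete, here is a rough Python sketch of the loop each Pi Zero would run. It is only an illustration of the architecture described above, not code from any of the projects listed below. The names `Satellite`, `detect_wake_word`, `is_silence`, `send_to_server` and `max_silent_frames` are all made up for this example, and the real wake word detector and transport (MQTT, a socket, whatever) would have to be plugged in.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable


class State(Enum):
    IDLE = auto()       # only listening locally for the wake word
    STREAMING = auto()  # forwarding audio frames to the central server


@dataclass
class Satellite:
    # All three callables are hypothetical placeholders for real components.
    detect_wake_word: Callable[[bytes], bool]  # local wake word detector
    is_silence: Callable[[bytes], bool]        # end-of-speech check
    send_to_server: Callable[[bytes], None]    # transport to the main Pi/server
    max_silent_frames: int = 3                 # assumed cutoff, tune to taste
    state: State = State.IDLE
    _silent: int = 0

    def handle_frame(self, frame: bytes) -> None:
        """Process one chunk of microphone audio."""
        if self.state is State.IDLE:
            # Nothing leaves the device until the wake word is heard.
            if self.detect_wake_word(frame):
                self.state = State.STREAMING
                self._silent = 0
            return

        # Streaming: pass the audio along and watch for the end of speech.
        self.send_to_server(frame)
        self._silent = self._silent + 1 if self.is_silence(frame) else 0
        if self._silent >= self.max_silent_frames:
            self.state = State.IDLE
```

In use, you would call `handle_frame()` once per audio chunk read from the microphone. The point of the design is that the satellite does almost no work: the heavy speech recognition stays on the central box, and audio only goes over the network between the wake word and the end of the spoken command.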
I'm going to add links here to what I think are relevant projects.
https://snips.ai/ is what got me started on the idea and general architecture, but they got bought out and shut down before I could really get my whole plan off the ground. I was still fighting the learning curve and only had a single device kind-of working like I wanted when the announcement was made. Their website now is just a landing page that points to Sonos, so it's useless. For a while after the shutdown you could still view the old info there.
Project Alice is an open-source fork of what was snips. Snips had been mostly open-source, with the large exception of their web-configurator tool. Project Alice is not a finished product, but it looks like it's in a useable state, though it also looks like it's mostly the passion project of a single developer. I think this is the way that I'm going to go, once I get enough free time to wrap my mind around it all. https://github.com/project-alice-assistant/ProjectAlice https://community.projectalice.io/
Then there's Mycroft. https://mycroft.ai/ It looks reasonably good for a front-end, but they use Google for the speech recognition. At least they realize the privacy downside of this, and they aggregate everyone's speech snippets and proxy them through their own server, so that minimizes the amount of info that Google could get out of it. They have been supporting Mozilla's DeepSpeech project, but I don't think that you can send requests to that over the cloud. They do also say that if you have hardware with enough power to run the Mozilla service locally, you can set it to do that. https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/customizations/stt-engine#default-engine
While Mycroft will run on a Raspberry Pi, it needs to be a larger, more powerful one. It can't just run on a Zero. https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/get-mycroft/linux#system-requirements Although, now that the Zero 2 has come out, that might be enough.
Finally, just to be complete, there was the Jasper project. https://jasperproject.github.io/ I was especially interested in this one when it came out, both because it ran on a Raspberry Pi and because my son is named Jasper. I thought that would be cool. It seemed like a great start, but then it died off pretty quickly. I got the feeling that it was a student project, and that once the students were done with school it didn't go anywhere.
That could be totally false - I didn't follow it closely. But it's been basically abandoned for quite a few years now. I didn't understand the code well enough to just pick it up and run with it, so I never did anything other than read up on it every year or two.
So that's my list for now. If nothing else, it's nice to have this all in one place for future reference. Right now I still think I'll end up going with Project Alice, but with how long it takes me to get to these things it might be a long time before I actually get anything going. If anyone else has suggestions, or things that I've missed, I'd love to hear about them!