If an attacker with a large botnet suddenly starts running every node as a Tor relay, those relays will all be banned from the Tor network. The Tor directory authorities have a number of safeguards in place to stop someone from turning a huge botnet into a pile of Tor nodes overnight. The attacker would have to add the nodes slowly over time, because there is a limit on how many new nodes can join the network at once.

That said, the attacker would not even need to take their relays down one at a time, because they can see the data arrive at the other end. If you can see a packet transmitted through Tor at any point on its path, you can use a timing attack to identify that packet at any other point where you can see it. So the attacker in your proposed scenario can monitor both places simultaneously and immediately determine whether a packet being routed through one of their nodes is the same packet arriving with the voice data at the end of the circuit.
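
To make the timing attack concrete, here is a minimal sketch of how two observation points could be compared. It is not how any real attack tool works, just an illustration of the idea: bucket the packet arrival times seen at each point into short intervals, then check how well the two activity patterns line up when one is shifted by a small delay. The function names (`bin_counts`, `pearson`, `flow_similarity`) and all parameters (`bin_width`, `window`, `max_lag_bins`) are made up for this example.

```python
def bin_counts(timestamps, bin_width, n_bins):
    """Count how many packets fall into each time bucket."""
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t // bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def flow_similarity(seen_at_relay, seen_at_exit,
                    bin_width=0.1, window=10.0, max_lag_bins=5):
    """Best correlation between the two traffic patterns over small delays.

    Timestamps are seconds from the start of the observation window.
    All defaults are arbitrary values chosen for illustration.
    """
    n_bins = round(window / bin_width)
    relay_counts = bin_counts(seen_at_relay, bin_width, n_bins)
    exit_counts = bin_counts(seen_at_exit, bin_width, n_bins)
    best = -1.0
    for lag in range(max_lag_bins + 1):
        # Compare relay activity with exit activity `lag` buckets later,
        # to allow for the delay between the two observation points.
        score = pearson(relay_counts[:n_bins - lag], exit_counts[lag:])
        best = max(best, score)
    return best
```

Feeding it the arrival times from one of the attacker's relays and from the far end of the circuit gives a score near 1 when the two are the same flow seen with a small delay, and a score near 0 or below when they are unrelated. Real traffic has jitter, padding, and other flows mixed in, so this only shows the principle.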