Hello Stack Overflow Community,
I'm building a voice chatbot with OpenAI's API and have hit a snag. I understand how to build the chatbot itself, but I'm unsure how to add call receiving and answering. To give you a clearer picture, I'm considering two options:
Option 1: A basic setup without much online processing.
Option 2: A more advanced approach where most processing is done online, using VoIP services (like Twilio) for call handling, thereby bypassing the need to interface with Android directly.
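To make Option 2 concrete, here is a minimal sketch of the call flow as I understand it: Twilio requests a webhook when a call arrives, the webhook returns TwiML that greets the caller and gathers speech, and Twilio then posts the transcript (the `SpeechResult` form field) to a second URL. The `/respond` path and the `generate_reply` callback (where the OpenAI call would go) are assumptions of mine for illustration, and the web framework is left out, so this only builds the XML strings.

```python
import xml.etree.ElementTree as ET


def greeting_twiml(action_url: str) -> str:
    """Build TwiML that greets the caller and listens for a spoken reply."""
    response = ET.Element("Response")
    # <Gather input="speech"> asks Twilio to transcribe the caller and
    # POST the result to action_url.
    gather = ET.SubElement(
        response, "Gather", {"input": "speech", "action": action_url, "method": "POST"}
    )
    ET.SubElement(gather, "Say").text = "Hello, how can I help you?"
    return ET.tostring(response, encoding="unicode")


def reply_twiml(reply_text: str, action_url: str) -> str:
    """Build TwiML that speaks the bot's reply, then listens again."""
    response = ET.Element("Response")
    ET.SubElement(response, "Say").text = reply_text
    ET.SubElement(
        response, "Gather", {"input": "speech", "action": action_url, "method": "POST"}
    )
    return ET.tostring(response, encoding="unicode")


def handle_speech(form: dict, generate_reply, action_url: str) -> str:
    """Turn the posted form fields into the next TwiML response.

    generate_reply is a hypothetical callback (text in, text out) that
    would wrap the chatbot call.
    """
    heard = form.get("SpeechResult", "")
    reply = generate_reply(heard) if heard else "Sorry, I did not catch that."
    return reply_twiml(reply, action_url)


if __name__ == "__main__":
    print(greeting_twiml("/respond"))
    # Stand-in for the chatbot: echo what was heard.
    print(handle_speech({"SpeechResult": "hi"}, lambda text: f"You said {text}", "/respond"))
```

In this layout Twilio would handle both STT (via `Gather`) and TTS (via `Say`), so no Android integration is involved; swapping in separate STT/TTS providers would change the flow.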
I have a few specific concerns with Option 2:
VoIP Costs: I'm worried about the charges per minute and whether the call quality could potentially be inferior compared to regular telco calls. Also, is there a risk of latency or lag due to long-distance connections?
TTS and STT Services: The quality of modern text-to-speech and speech-to-text services is impressive, but they come with a usage-based cost (as far as I can tell, usually per character for TTS and per minute of audio for STT). This adds to the expense, and I'm also concerned about potential delays in data transmission over the internet.
Bandwidth Considerations: My current internet setup is 5G, offering speeds between 270 and 650 Mbps at a cost-effective price. I'm inclined to believe that this mobile connection might be more reliable than fiber internet for this purpose.
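As a rough back-of-the-envelope check on the bandwidth concern, the sketch below estimates what a single uncompressed voice stream needs. It assumes G.711 audio (64 kbit/s) sent in 20 ms RTP packets over IPv4/UDP, which is one common VoIP configuration and not necessarily what a given provider uses; link-layer overhead is ignored.

```python
def call_bandwidth_kbps(
    payload_bytes: int = 160,      # 20 ms of G.711 audio at 64 kbit/s
    header_bytes: int = 40,        # IPv4 (20) + UDP (8) + RTP (12)
    packets_per_second: int = 50,  # one packet every 20 ms
) -> float:
    """Approximate one-way IP bandwidth of a single voice stream in kbit/s."""
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


if __name__ == "__main__":
    per_direction = call_bandwidth_kbps()
    link_kbps = 270 * 1000  # lower end of the 270 to 650 Mbps range above
    print(f"one-way stream: {per_direction} kbit/s")
    print(f"share of link:  {per_direction / link_kbps:.4%}")
```

Under these assumptions a call uses a tiny fraction of the link, so raw throughput is unlikely to be the limiting factor; latency and jitter on the connection probably matter more.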
Could the community provide insights on the following:
- Estimated costs for **VoIP, TTS, and STT services**.
- Recommendations on managing potential delays and ensuring voice quality.
- Any personal experiences or advice on using OpenAI for such a project.
I'm also considering Option 1, which is a simpler setup, but I would like to explore Option 2 thoroughly first. Any advice or suggestions would be greatly appreciated!
Thanks in advance for your help and guidance!
What I've Tried:
- Researched VoIP services like Twilio for call handling.
- Considered using TTS (Text To Speech) and STT (Speech To Text) services for real-time communication.
- Explored the feasibility of using my current 5G internet setup for this project.
What I'm Expecting:
- A seamless integration where the chatbot can receive and answer calls.
- Minimal latency in voice transmission.
- Cost-effective solutions for VoIP, TTS, and STT services.
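To frame the cost question, this is a minimal sketch of how a per-call estimate could be put together. It assumes VoIP and STT are billed per minute and TTS per character, which varies by provider, and every rate in the example is a made-up placeholder, not a real price. The `Rates` class and `estimate_call_cost` function are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Per-unit prices in one currency; fill in from provider price lists."""
    voip_per_minute: float
    stt_per_minute: float
    tts_per_character: float
    llm_per_call: float = 0.0  # rough allowance for chatbot API usage


def estimate_call_cost(rates: Rates, call_minutes: float, spoken_reply_chars: int) -> float:
    """Sum the usage-based charges for one call."""
    return (
        rates.voip_per_minute * call_minutes
        + rates.stt_per_minute * call_minutes
        + rates.tts_per_character * spoken_reply_chars
        + rates.llm_per_call
    )


if __name__ == "__main__":
    # Placeholder numbers only; they do not reflect any provider's pricing.
    placeholder = Rates(voip_per_minute=0.01, stt_per_minute=0.01, tts_per_character=0.00002)
    cost = estimate_call_cost(placeholder, call_minutes=5, spoken_reply_chars=2000)
    print(f"estimated cost per call: {cost:.4f}")
```

Plugging in real rates would show which of the three services dominates the bill for a typical call length.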