We sat down with Edge Signal CTO Burak Cakmak to discuss the recent "Voice AI Project" in collaboration with Carleton University.
Burak: One of our customers had a unique use case: they wanted insight into what was being said in their stores—keywords, brand mentions, campaign names, and sentiment—without recording or transferring any audio files. This was crucial for maintaining privacy and confidentiality.
Edge computing became a must for this project. Beyond keyword and sentiment extraction, context was vital. For instance, how often is a product mentioned, and what’s the emotional tone? Are customers happy or upset? The challenge was achieving all this using resource-intensive large language models (LLMs) on edge devices, ensuring on-the-fly processing without saving audio recordings.
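To make the keyword and mention-frequency idea concrete, here is a minimal, hypothetical sketch. The product and campaign names are invented for illustration, and simple whole-word matching stands in for the LLM-based extraction the actual system uses:

```python
import re
from collections import Counter

def count_mentions(transcript: str, keywords: list[str]) -> Counter:
    """Count case-insensitive whole-word mentions of each keyword."""
    counts = Counter()
    lowered = transcript.lower()
    for kw in keywords:
        # \b guards against partial matches ("sale" inside "wholesale")
        pattern = r"\b" + re.escape(kw.lower()) + r"\b"
        counts[kw.lower()] = len(re.findall(pattern, lowered))
    return counts

# Hypothetical in-store transcript and campaign keywords
transcript = ("I love the Acme blender. Is the Spring Sale still on? "
              "The Acme blender is great.")
print(count_mentions(transcript, ["Acme blender", "Spring Sale"]))
# → Counter({'acme blender': 2, 'spring sale': 1})
```

A real deployment would feed the LLM the full conversation to judge emotional tone as well; counting alone cannot tell a happy mention from an upset one.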
Burak: The team from Carleton University was a great help! Aidan Lochbihler, Julien Lariviere-Chartier, and Phil Masson, under Dr. Bruce Wallace’s guidance, collaborated closely with our Edge Signal development team to design a system that could meet these requirements.
Achieving this required running LLMs directly on edge devices—no cloud processing. The biggest hurdles were optimizing these heavyweight models for resource-constrained devices and supporting multiple languages seamlessly. Thanks to everyone’s hard work, we proved it’s doable.
The collaboration was made possible in part through funding from the National Research Council of Canada Industrial Research Assistance Program (NRC IRAP) for Carleton University’s SAM3 innovation hub.
Burak: Privacy was non-negotiable. We used LLMs to transcribe audio on the fly without saving recordings. To improve transcription accuracy, we introduced a two-pillar approach to context generation.
By combining these approaches, we avoided retraining models for each customer across different countries and languages, a process that is both costly and impractical, while still delivering accurate results. This was a big accomplishment!
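The on-the-fly, no-saved-audio constraint can be illustrated with a sketch: audio arrives in chunks, each chunk is transcribed in memory, the raw samples are discarded immediately, and only text persists. Everything here is an assumption for illustration—the `transcribe` callable is a stub standing in for the on-device speech model, not Edge Signal's actual pipeline:

```python
from typing import Callable, Iterable, List

def process_stream(chunks: Iterable[bytes],
                   transcribe: Callable[[bytes], str]) -> List[str]:
    """Transcribe audio chunk by chunk, keeping only text in memory.

    No chunk is ever written to disk; each buffer is released as soon
    as its transcript is extracted, so raw audio never persists.
    """
    transcripts: List[str] = []
    for chunk in chunks:
        text = transcribe(chunk)   # on-device model call (stubbed here)
        del chunk                  # drop the raw audio immediately
        if text:
            transcripts.append(text)
    return transcripts

# Stub model: pretend each chunk decodes to a fixed phrase.
fake_model = lambda chunk: f"({len(chunk)} bytes of speech)"
audio_chunks = [b"\x00" * 1600, b"\x00" * 3200]  # e.g. 50 ms / 100 ms at 16 kHz mono
print(process_stream(audio_chunks, fake_model))
# → ['(1600 bytes of speech)', '(3200 bytes of speech)']
```

The design point is that the only durable artifact is the transcript; the audio exists just long enough to be decoded.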
Burak: Splitting audio channels was critical. By isolating individual speakers, we could give the LLMs cleaner, more structured input than a single undifferentiated transcript. This greatly enhanced the system’s ability to extract sentiment and context from conversations.
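At its simplest, channel splitting is just de-interleaving the PCM stream. A minimal sketch, assuming interleaved multi-channel samples with one speaker per channel (a simplification of the real setup, where speaker separation is harder than channel separation):

```python
def split_channels(samples: list[int], n_channels: int = 2) -> list[list[int]]:
    """De-interleave PCM samples: [L0, R0, L1, R1, ...] -> [[L...], [R...]]."""
    return [samples[ch::n_channels] for ch in range(n_channels)]

interleaved = [10, -10, 20, -20, 30, -30]  # alternating L/R sample pairs
left, right = split_channels(interleaved)
print(left, right)   # [10, 20, 30] [-10, -20, -30]
```

Each per-channel stream can then be transcribed separately, so the model sees one speaker at a time instead of overlapping voices.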
As for multilingual support, it was one of our toughest challenges. Edge devices have limited resources, yet we needed to accommodate multiple languages without compromising performance. Through careful optimization and collaboration with the Carleton team, we developed a solution that’s both efficient and scalable.
Burak: While the collaboration proved the concept, there’s still work to be done to make it production-grade. Optimization will be my team’s responsibility: refining the system to handle larger-scale deployments while maintaining speed and accuracy. Another important step will be fine-tuning the model to accommodate various accents, which will be a future requirement. We also plan to adapt our technology to support aging-in-place and assisted-living use cases.
This project demonstrated that edge computing and LLMs can coexist to deliver real-time insights while upholding privacy. It’s a game-changer for businesses seeking actionable data without compromising customer trust.
Burak: Partnering with Carleton University was a great experience. The team brought fresh perspectives and technical depth, helping us solve some of the most challenging aspects of the project. Together, we achieved something truly groundbreaking, and I’m excited to see how this technology evolves in the future.
This latest collaboration between Edge Signal and the team at Carleton University is driving innovation in voice AI by leveraging edge computing to create more secure, efficient, and advanced AI solutions.