Monday, February 20, 2023
Show HN: Whisper.cpp and YAKE to Analyse Voice Reflections [iOS] https://ift.tt/dY0XJEr
Six months ago, I went full-time indie, but I haven't released anything so far. The products just never felt good enough for me to publicly say this is what I'm doing now. To get out of this mindset, I decided to make an app for myself in a week, add monetization, release it and move on.

The app idea was simple: reflect on your day by answering the same four questions out loud. The answers are transcribed, and with regular use you can see what influences you the most and take action. Everything runs on-device, as otherwise I wouldn't feel comfortable sharing my thoughts.

I had all core features working within a day by simply modifying an existing example app. However, I was dissatisfied with iOS's built-in offline transcription due to a lack of punctuation and the speech recognition permission prompt, which made it seem like data would leave the device. I decided to use whisper.cpp [0] (small model) instead.

This change led to many others, as I now felt too little of the app's code was mine. For example:

- Added automatic mood analysis: first using sentiment analysis, then changed to a statistical approach.
- Show trends: first implemented TextRank to provide a summary for an individual day, then changed it to extract keywords to spot trends over weeks and months. Replaced TextRank with KeyBERT for speed and n-grams, then BERT-SQuAD, and ended on a modified YAKE [1] for subjectively better results. (Do you know of a better approach?)

As a result, this tiny app took me over a month, but it still has its flaws:

- Transcription is not live but performed on recordings, so if you immediately want the transcript of your most recent answer, you have to wait.
- Mood and keyphrase extraction are optimized for my languages and way of speaking, so they might not generalize well.
- Music in the background can result in nearly empty transcripts.

Nevertheless, after using the app regularly and enjoying it, I feel ready to release. Hope you will find the app useful too.

[0] Show HN: Whisper.cpp https://ift.tt/2daNBoE
[1] YAKE: https://ift.tt/sznPBYc

https://ift.tt/rtJMcjU
February 20, 2023 at 10:08AM
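For readers curious what "extract keywords to spot trends over weeks and months" might look like in practice, here is a minimal sketch. It is not the app's modified YAKE and not how the app is implemented. It is a much simpler frequency-based stand-in that counts stopword-free unigrams and bigrams per ISO week. The names `candidates` and `weekly_keywords`, the tiny `STOPWORDS` set, and the sample transcripts in the usage note are all made up for illustration.

```python
from collections import Counter, defaultdict
from datetime import date
import re

# Tiny illustrative stopword list (an assumption; a real app would use a fuller one).
STOPWORDS = {"a", "an", "and", "the", "i", "to", "of", "in", "was", "is", "it",
             "my", "me", "with", "for", "on", "at", "but", "so", "that", "this",
             "had", "felt", "today", "very", "after", "again"}


def candidates(text, max_n=2):
    """Yield 1..max_n-gram keyphrase candidates that contain no stopwords."""
    # Split on punctuation first so n-grams never span two sentences or clauses.
    for sentence in re.split(r"[.!?,;:]+", text.lower()):
        words = re.findall(r"[a-z']+", sentence)
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if not any(w in STOPWORDS for w in gram):
                    yield " ".join(gram)


def weekly_keywords(entries, top_k=3):
    """Map (ISO year, ISO week) -> list of the top_k (candidate, count) pairs.

    `entries` is an iterable of (datetime.date, transcript) pairs.
    """
    counts = defaultdict(Counter)
    for day, transcript in entries:
        iso = day.isocalendar()
        counts[(iso[0], iso[1])].update(candidates(transcript))
    return {week: c.most_common(top_k) for week, c in counts.items()}
```

Calling `weekly_keywords([(date(2023, 2, 13), "Slept badly. Long run in the evening."), (date(2023, 2, 20), "Good sleep. Long run felt easy.")])` groups the two hypothetical transcripts into ISO weeks 7 and 8 of 2023, so a phrase such as "long run" that recurs across weeks shows up as a trend. YAKE itself goes further than raw counts: it also weighs features such as word position, casing, and how a term is spread across sentences.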