Show HN: Live breath detection and biofeedback from a phone microphone

Original link: https://github.com/shiihaa-app/shiihaa-breath-detection

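This project presents a phone-based biofeedback system that uses the device microphone to detect breathing patterns in real time, with no wearable required and no distracting game elements. The system aims to build self-awareness. All audio processing happens on-device to protect user privacy; it analyses the spectral shape and energy envelope of breathing rather than speech content, and no raw audio is uploaded. The core pipeline combines signal processing, an adaptive state machine that tracks breathing phase, and a data-quality layer that rejects ambiguous signals to avoid wrong feedback. Machine learning is used to refine accuracy over time, but the core remains rule-based so that it stays reliable in messy real-world acoustic environments. The method is currently implemented in the app *shiihaa*, which aims to give gentle, responsive real-time feedback rather than generic guided breathing. It is a wellness and self-awareness aid, not a medical device, and is meant to help users find their own "resonance range" for staying calm. The project is still at the research stage and is being validated against clinical ground truth to establish how well it works on uncontrolled everyday hardware.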

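Felix Zeller, a Swiss family physician, has released a mobile app called **shii • haa**, with its breath-detection method documented in a public repository. The app uses the smartphone microphone to give real-time biofeedback on breathing patterns. Drawing on his experience in emergency and intensive care medicine, Zeller built the tool to foster self-awareness rather than a gamified experience. It uses signal processing, a state machine, and machine learning to analyse the rhythm, depth, and regularity of breathing. To protect user privacy, all audio processing happens on-device, and no raw audio or speech data is uploaded to a server. A built-in quality-assessment layer filters out background noise to keep the feedback accurate. Unlike many wellness apps that rely on rewards or performance scores, shii • haa focuses on mindfulness and encourages users simply to observe their own breathing habits. Zeller is currently looking for feedback from developers with expertise in signal processing, health UX, and mobile audio engineering to refine the project further.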

Original text

Live breath detection and biofeedback using a phone microphone.

Can an app use breathing feedback to increase self-awareness instead of becoming another distraction?

That question is the reason this project exists. Most "mindfulness" software ends up competing for attention rather than handing it back. We wanted to know whether the phone could do the opposite: stay quiet, listen to how you breathe, and reflect it back closely enough that you notice your own pattern, without a wearable, without a coach, without turning it into a game.

The hard part is the listening. A phone microphone in a real room is a messy signal: room tone, traffic, a fan, the phone resting on fabric, the person shifting position. Out of that, we try to recover where one breath ends and the next begins, and which phase you are in right now.

Reads audio from the phone microphone and processes it on-device.
Estimates the current breathing phase (inhale, exhale, and the transitions and holds between them) and tracks completed breath cycles.
Drives biofeedback: the interface responds to the breath in close to real time, so the signal you see or feel is your own.

No speech analysis. The pipeline works on the envelope and spectral shape of breathing, not on words. It is not built to recognise or transcribe anything you say.
No raw audio upload. Audio is analysed locally. The raw microphone stream does not leave the device.

Three layers sit on top of the raw microphone signal:

Signal processing. The audio stream is cut into short overlapping windows. For each window we derive an amplitude/energy measure and basic spectral features (where the energy sits in frequency, where the peaks are). Inhale tends to be more turbulent and higher in the spectrum; exhale tends to be lower and smoother. None of this is reliable on a single window; it only becomes useful across a sequence.
A breathing state machine. Phase isn't decided per window in isolation. A small state machine tracks the current phase and the plausible transitions out of it (inhale → exhale, exhale → hold, and so on), using adaptive thresholds that recalibrate as ambient conditions drift. This is what lets the system distinguish a genuine phase change from a momentary dip or spike.
A data-quality layer. Before a window is allowed to influence the output, it has to pass quality checks. Windows that are too noisy, too quiet, or acoustically ambiguous are rejected rather than guessed. The point is to fail honestly: a brief "not sure" is better than a confident wrong phase that the user can feel is wrong.
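
As a rough illustration of the signal-processing layer described above, the sketch below cuts a sample buffer into overlapping windows and derives an energy measure (RMS) and one basic spectral feature (the spectral centroid) per window. The function names, the Hann taper, and the naive DFT are assumptions made for this example; the source does not say which features or transforms the app actually uses.

```python
import cmath
import math

def frames(samples, size, hop):
    """Yield overlapping windows of `size` samples, advancing by `hop`."""
    for start in range(0, len(samples) - size + 1, hop):
        yield samples[start:start + size]

def rms(window):
    """Root-mean-square amplitude of one window (the energy measure)."""
    return math.sqrt(sum(x * x for x in window) / len(window))

def spectral_centroid(window, sample_rate):
    """Power-weighted mean frequency in Hz: 'where the energy sits'.

    Uses a naive O(n^2) DFT to stay dependency-free; a real pipeline
    would use an FFT. Returns 0.0 for an all-zero window.
    """
    n = len(window)
    # Hann taper reduces leakage caused by cutting the stream into windows.
    tapered = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
               for i, x in enumerate(window)]
    weighted = total = 0.0
    for k in range(1, n // 2):  # skip DC, stay below Nyquist
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(tapered))
        power = abs(coeff) ** 2
        weighted += power * (k * sample_rate / n)
        total += power
    return weighted / total if total > 0 else 0.0

if __name__ == "__main__":
    sr = 8000  # assumed sample rate for the demo
    tone = [math.sin(2 * math.pi * 1000 * i / sr) for i in range(512)]
    for w in frames(tone, size=128, hop=64):
        print(round(rms(w), 3), round(spectral_centroid(w, sr)))
```

A single window's numbers mean little on their own, which matches the point made above: the features only become useful when tracked across a sequence of windows.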

Machine learning is part of the picture, but in a deliberately bounded way: it is used to sharpen feedback and to improve the model over time from quality-checked examples, not as a black box that the whole detection depends on. The rule-based pipeline is what runs the live experience; ML refines it.
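
The rule-based part of the pipeline (a quality gate, an adaptive threshold, and a state machine that needs several agreeing windows before it changes phase) could look roughly like the sketch below. This is a toy under stated assumptions, not the app's implementation: the class name `BreathTracker`, every numeric threshold, and the simple centroid split between inhale and exhale are all invented for illustration, and the sketch does not restrict which transitions are plausible.

```python
from dataclasses import dataclass

INHALE, EXHALE, HOLD = "inhale", "exhale", "hold"

@dataclass
class BreathTracker:
    """Toy phase tracker; every default below is an illustrative guess."""
    centroid_split: float = 800.0  # Hz; above leans inhale, below leans exhale
    ambiguous_band: float = 100.0  # Hz around the split treated as "not sure"
    min_energy: float = 1e-4       # below this the signal is too quiet to trust
    max_energy: float = 0.9        # above this the window is probably clipped
    ratio: float = 3.0             # breath must exceed the noise floor by this factor
    confirm: int = 3               # consecutive agreeing windows needed to switch
    floor: float = 0.01            # running estimate of ambient energy
    phase: str = HOLD
    _candidate: str = HOLD
    _streak: int = 0

    def _classify(self, energy, centroid):
        """Vote for a phase, or return None if the window fails quality checks."""
        if energy < self.min_energy or energy > self.max_energy:
            return None  # too quiet or too loud: reject rather than guess
        if energy < self.floor * self.ratio:
            # Quiet window: let the noise floor drift toward the ambient level.
            self.floor = 0.95 * self.floor + 0.05 * energy
            return HOLD
        if abs(centroid - self.centroid_split) < self.ambiguous_band:
            return None  # acoustically ambiguous
        return INHALE if centroid > self.centroid_split else EXHALE

    def update(self, energy, centroid):
        """Feed one window's features; returns (current_phase, accepted)."""
        vote = self._classify(energy, centroid)
        if vote is None:
            return self.phase, False  # keep the last phase and flag the rejection
        if vote == self.phase:
            self._streak = 0          # a momentary dip or spike is forgotten
        elif vote == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = vote, 1
        if vote != self.phase and self._streak >= self.confirm:
            self.phase, self._streak = vote, 0
        return self.phase, True
```

The `accepted` flag is what would let an interface show a brief "not sure" instead of a confident wrong phase, and the `confirm` count is one simple way to tell a genuine phase change from a one-window glitch.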

A lot of the work is in the unglamorous part: handling real-world mobile audio quirks. Different phones, different microphone placements, the device flat on a table versus held in a hand, sudden transient sounds, automatic gain control fighting you. Most of the engineering effort went here rather than into the "interesting" signal-processing core.

Status and honesty about limits

This is a working approach that runs in a shipped app, not a finished science result. Microphone-only breath detection in uncontrolled conditions is genuinely hard, and published smartphone-only systems sit well below wearable-based ones. We are running a validation study against clinical ground truth to find out how good this actually is, and where it breaks. The method note and the research pitch in docs/ describe that in more detail, including the things we explicitly do not claim.

It is a wellness and self-awareness tool, not a medical device.

Where this connects to guided breathing

Breath detection isn't the end in itself; it's the feedback channel. During a session, the detected phase and how steady your breathing is can be reflected back to you in the moment, and that same signal can inform classical breathing designs: a preset can hint at when your pace has settled or drifted, instead of just counting seconds at you.
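
One simple way to turn tracked breath cycles into "pace" and "how steady" is sketched below: take the timestamps at which cycles completed, compute the mean cycle duration, and use the coefficient of variation as a steadiness measure. The function name and the choice of coefficient of variation are assumptions for this example; the source does not specify how the app measures steadiness.

```python
from statistics import mean, pstdev

def pace_and_steadiness(cycle_ends):
    """Summarise detected breath cycles.

    cycle_ends: increasing timestamps (seconds) at which each cycle completed.
    Returns (breaths_per_minute, coefficient_of_variation), or None when
    there are fewer than two complete cycle durations to compare.
    """
    durations = [b - a for a, b in zip(cycle_ends, cycle_ends[1:])]
    if len(durations) < 2:
        return None
    avg = mean(durations)
    # Lower coefficient of variation means a steadier breathing pace.
    return 60.0 / avg, pstdev(durations) / avg

if __name__ == "__main__":
    print(pace_and_steadiness([0, 10, 20, 30]))    # even ten-second cycles
    print(pace_and_steadiness([0, 4, 10, 14, 20])) # alternating 4 s and 6 s cycles
```

A preset could compare the second value against a tolerance to hint that the pace has settled or drifted; any such tolerance would need tuning against real sessions.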

The longer-term loop we're working toward is a personal resonance range: roughly, the slow breathing pace at which your own physiology settles most, often somewhere near six breaths a minute but individual. Detected breath stability, optionally combined with heart rate or HRV when a sensor is available, gives the raw material to estimate that range over time and feed it back into the guided patterns, so the pacing adapts to you rather than to a fixed number.

This is a direction, not a finished feature, and not a clinical claim. We don't diagnose anything, and we don't promise an optimised state, only that the same detection layer can make guided breathing a little less generic.

Does my audio get uploaded, or analysed for speech? No. The microphone stream is processed on-device and the raw audio does not leave the phone. The pipeline works on the energy envelope and spectral shape of breathing, not on words, so it isn't built to recognise or transcribe speech. What's kept for improving the model is quality-checked (waveform, phase-label) material held on-device until you explicitly confirm it, not a continuous recording.

Do I need an account to try it? Not for the core breathing biofeedback; that works without creating one. Some surrounding features in the app use an account; the part this repo is about does not.

Is this a medical device? Can it diagnose or treat anything? No. It's a wellness and self-awareness tool. It isn't a medical device, it doesn't diagnose or treat anything, and it isn't validated for clinical use. If you feel dizzy or uncomfortable during a session, stop and breathe normally. Slow or paced breathing makes some people lightheaded, and there's nothing to push through.

Does it make me do extreme or forced breathing? No. It supports several guided patterns, including classical designs with holds such as 4-7-8, but the point is to reflect your own breathing back to you, not to force fast or extreme breathing. You set the pace, and stopping when it feels off is always the right call.

Why an app on the phone instead of a terminal/CLI? The phone microphone is the sensor. The app exists because it needs microphone access and has to render the feedback in close to real time as you breathe. That loop is the whole experience, and it doesn't map onto a text terminal.

Do I need a chest strap or HRV sensor? No. The microphone alone drives the live biofeedback. A respiration belt or heart-rate strap is optional, used for the validation study and as an extra signal for resonance-range estimation, never a prerequisite. By default the guided patterns use fixed presets; the personal resonance range is a direction under validation, not a shipped, calibrated feature.

Is the app open source? This repository documents the method and the research pitch, not the full app source. The docs here are released under CC BY 4.0.

What feedback is useful? Specifics on the signal processing and failure modes, on health/wellness UX, and on mobile audio: Android/iOS capture quirks, automatic gain control, microphone differences across devices. Those are the parts where outside eyes help most.

shii·haa is the app this came out of. If you want to feel the biofeedback rather than read about it, try the app: shiihaa.app/download

Built by Felix Zeller.

Documentation in this repository is released under CC BY 4.0. See LICENSE.