Automatic adventitious respiratory sound analysis: A systematic review

Abstract 
Background

Automatic detection or classification of adventitious sounds is useful to assist physicians in diagnosing or monitoring diseases such as asthma, Chronic Obstructive Pulmonary Disease (COPD), and pneumonia. While computerised respiratory sound analysis, specifically for the detection or classification of adventitious sounds, has recently been the focus of an increasing number of studies, a standardised approach and comparison has not been well established.

Objective 

To provide a review of existing algorithms for the detection or classification of adventitious respiratory sounds. This systematic review provides a complete summary of methods used in the literature to give a baseline for future works.

Data sources

A systematic review of English articles published between 1938 and 2016, searched using the Scopus (1938-2016) and IEEExplore (1984-2016) databases. Additional articles were further obtained from the references listed in the articles found. Search terms included adventitious sound detection, adventitious sound classification, abnormal respiratory sound detection, abnormal respiratory sound classification, wheeze detection, wheeze classification, crackle detection, crackle classification, rhonchi detection, rhonchi classification, stridor detection, stridor classification, pleural rub detection, pleural rub classification, squawk detection, and squawk classification.

Study selection

Only articles that focused on adventitious sound detection or classification based on respiratory sounds, with performance reported and sufficient information provided to be approximately repeated, were included.

Data extraction

Investigators extracted data about the adventitious sound type analysed, approach and level of analysis, instrumentation or data source, location of sensor, amount of data obtained, data management, features, methods, and performance achieved.

Data synthesis

A total of 77 reports from the literature were included in this review. 55 (71.43%) of the studies focused on wheeze, 40 (51.95%) on crackle, 9 (11.69%) on stridor, 9 (11.69%) on rhonchi, and 18 (23.38%) on other sounds such as pleural rub and squawk, as well as on pathology. Instrumentation used to collect data included microphones, stethoscopes, and accelerometers. Several references obtained data from online repositories or book audio CD companions. Detection or classification methods varied from empirically determined thresholds to more complex machine learning techniques. Performance reported in the surveyed works was converted to accuracy measures for data synthesis.

Limitations

Direct comparison of the performance of surveyed works cannot be performed as the input data used by each was different. A standard validation method has not been established, resulting in different works using different methods and performance measure definitions.

Conclusion

A review of the literature was performed to summarise the different analysis approaches, features, and methods used. The performance of recent studies showed a high agreement with conventional non-automatic identification. This suggests that automated adventitious sound detection or classification is a promising solution to overcome the limitations of conventional auscultation and to assist in the monitoring of relevant diseases.

Introduction

Most diseases related to an obstructed or restricted respiratory system can be characterised from the sounds generated while breathing. These include asthma, COPD, and pneumonia, amongst others. Airway abnormalities can cause breathing sounds to be abnormal, for example through the absence of sounds or the addition of unusual ones. The latter are referred to as adventitious sounds. An expert can perform auscultation using a stethoscope to detect abnormalities in sounds and use this information when making a diagnosis. However, the correct detection of these sounds relies on both the presence of an “expert” and their degree of expertise.

While computerised respiratory sound analysis, specifically for the detection or classification of adventitious sounds, has been the focus of an increasing number of studies recently, a standardised approach and comparison has not been well established. Several reviews related to automatic adventitious sound analysis have been published [1–6]. The article in [1] provided a review of 49 articles which included the type of sensor, the data set, the features, the analysis techniques, and also the performance metrics used. The review categorised features into time-domain, frequency-domain, wavelet-domain, and a combination of different domains. Signal preprocessing techniques such as de-noising, resampling, and analogue prefiltering were also presented, as well as the number of sensors and their positioning. Information on analysis type, approach, and data management was not reviewed. The conclusion of this work was that multi-domain features have advantages in characterising different types of lung sounds.

A review of computerised respiratory sounds specifically in patients with COPD was done in [2]. This included a total of seven papers. The focus of this review was studies that tried to find the characteristics of adventitious sounds in COPD (wheeze, crackle, and rhonchi), including occurrence timing and the power spectrum.

The review in [3] provided information on machine learning techniques used in lung sound analysis. This covered types of analysis, sensor type, number of subjects, machine learning techniques used, and the outcome of each reference. A total of 34 studies were reviewed. The review concluded that artificial intelligence techniques are needed to improve accuracy and enable commercialisation as a product. Another review, published by the same group, provided a summary of 55 studies on computer-based respiratory sound analysis [4]. The review included analysis type, sensor used, data set used, sensor location, and method of analysis. This work provided several recommendations for sensor type, position, and the use of more advanced machine learning techniques.

The survey in [5] focused on automated wheeze detection for asthmatic patients, and provided a review on instrumentation, placement, processing methods, features used, and the outcome, of a total of 27 studies. The study recommended placing the stethoscope on the trachea as this preserves more frequency information when compared to the chest wall.

A systematic review and meta-analysis of computerised lung sound analysis to aid in the diagnosis of diseases was presented in [6]. A total of 8 articles were selected for this systematic review which consisted of studies on wheeze, crackle, and other adventitious sounds for specific diseases such as asthma and COPD. The review included the number of subjects, age range, gender ratio, methodology, case, recording device, algorithm, and type of sounds analysed. The quality of each study was assessed using the Newcastle-Ottawa Score (NOS). The NOS is normally used for assessing non-randomised studies including control-studies. Four of the selected articles were then used for meta-analysis. This obtained an average of 80% sensitivity and 85% specificity in abnormal sound detection.

This systematic review adds to these existing reviews by providing more thorough information in a standardised format, with more works being reviewed, and more recent developments included. The comparison of this work with the previously mentioned reviews can be seen below.

Objectives

The objective of this systematic review is to provide a summary of the existing literature on algorithms for the detection or classification of adventitious respiratory sounds. The review is organised as follows: A summary of normal and adventitious sound characteristics is provided initially. Types of analysis performed are discussed, including the adventitious sound types analysed, the approach of each analysis technique, and the level at which the analyses were performed. Instrumentation and data collection methods are also provided, including sensor type, number, and position, as well as the amount of data obtained. Several works obtained data for analysis from online repositories and book audio CD companions. These databases are listed as well. A summary of data management, features, and detection or classification methods is also presented, including the performance reported in each work. Overall, a total of 77 articles are considered. This systematic review provides a complete summary of methods used in the existing literature to give a baseline for future works.

Methods

The systematic review was performed following the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [7]. The PRISMA checklist is provided in S1 File.

Data sources and study selection

Studies included in this review are peer-reviewed articles written in English published between 1938 and 2016. The types of study are automatic detection or classification of adventitious sounds based on sound signal processing. No age limitation was considered as an eligibility criterion. Most data in the literature was taken from both healthy volunteers and patients with pulmonary diseases. The outcomes of the studies considered were reported as a performance measure of the automatic systems developed. The types of performance measures reported depend on the approach of each study.

The references for this review were searched using the SCOPUS (1938-2016) and IEEExplore (1984-2016) databases. Additional articles were obtained from the bibliographies of articles found. The date of the last search was 1st November 2016. Electronic search terms for these databases included adventitious sound detection, adventitious sound classification, abnormal respiratory sound detection, abnormal respiratory sound classification, wheeze detection, wheeze classification, crackle detection, crackle classification, rhonchi detection, rhonchi classification, stridor detection, stridor classification, pleural rub detection, pleural rub classification, squawk detection, and squawk classification. Articles which focused on adventitious sound detection or classification based on breath sounds, with performance reported, were identified from the search results. Screening was done by selecting articles based on the title and abstract. Further selection was performed on screened articles based on the eligibility criteria.

To ascertain the validity of the review, only peer-reviewed articles that provided sufficient information to approximately reproduce the results achieved were considered. Issues related to data collection and management, which may introduce bias within each study, were identified and reviewed. Thorough information on types of instrumentation or repository used, total number of data, and how the data were used are reported in the review.

Data extraction and synthesis

Data extraction was performed by the investigators on eligible articles. A data extraction form was created to obtain important information from these articles. Extracted data were summarised into tables and further described in Section Results. Investigators extracted data about the adventitious sound type analysed, approach and level of analysis, instrumentation, location of sensor, amount of data obtained and used, data management, features, methods, and performance achieved for each study. The principal summary measure used in this systematic review is the range of accuracy achieved by the reviewed algorithms for specific tasks.

A summary of normal and adventitious respiratory sounds and their characteristics is given prior to the review of the articles. This summary aims to provide insight into the sounds that need to be detected or classified. Limitations of conventional auscultation are discussed next. A short description of the available commercial devices for automatic respiratory sound analysis is provided. Studies on different adventitious sound types and analysis types are identified and summarised. The instrumentation used to collect data is also identified for each reference. The methods of analysis are discussed in separate sections, based on the techniques used to perform the detection or classification of adventitious sounds. The performance reported in the literature is transformed to overall accuracy where possible, for data synthesis. Balanced accuracy was used when sensitivity and specificity measures were reported instead of the overall accuracy.
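The balanced-accuracy conversion described above can be sketched as follows; the function and the example figures (the 80% sensitivity / 85% specificity quoted from the meta-analysis in [6]) are illustrative only, not a reconstruction of any surveyed study's computation.

```python
def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity, used when a study reports
    those two measures instead of an overall accuracy figure."""
    return (sensitivity + specificity) / 2.0

# Hypothetical usage with the figures reported by the meta-analysis in [6]:
score = balanced_accuracy(0.80, 0.85)
print(f"balanced accuracy: {score:.3f}")
```

Balanced accuracy, unlike overall accuracy, is insensitive to the class balance of the test data, which is one reason it is a reasonable common denominator across studies with very different data sets.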

Results

This section provides the results of the systematic review performed. The section is organised as follows: A summary of normal and abnormal breath sounds is first given. This is followed by an outline of the limitations of conventional auscultation to underline the need for automated detection or classification of adventitious sounds. Commercial devices related to respiratory sound analysis are also discussed in this section. The results of the systematic review are subsequently presented. These include explanations of the type of analysis, instrumentation, and methods.

Review of normal and abnormal respiratory sounds


Respiratory sounds are sounds generated by the respiratory system. These can usually be heard by performing auscultation. Auscultation is generally carried out to check physical health, and it involves listening to both cardiac and respiratory sounds. Respiratory sounds heard from auscultation can be normal or abnormal. Finding abnormal respiratory sounds and differentiating them from normal sounds is important, as abnormal sounds are characteristic of several serious diseases, such as asthma, COPD, and pneumonia.

Normal respiratory sounds. Normal respiratory sounds can be categorised based on the location where they are heard or generated. Depending on the auscultation location, different types of respiratory sounds have distinct characteristics such as duration, pitch, and sound quality. Normal respiratory sounds and their characteristics are briefly discussed below. A summary is also presented in Table 1.

• Vesicular Sounds
Normal vesicular sounds are soft, non-musical, and can be heard on auscultation performed over most of the lung fields. Vesicular breath sounds are audible during the whole inspiration phase. However, due to the passive nature, as well as the origin, of the sounds, they can only be heard in the early expiration phase [8]. Hence vesicular sounds are longer during inspiration than during expiration. The pitch as well as the intensity are also higher in the inspiration phase compared to expiration. While there is normally no pause between inspiration and expiration sounds in one cycle, different breath cycles are separated with a pause [8].

Vesicular sounds have a low pitch and very limited frequency range, usually with a drop in energy after around 100-200 Hz [9]. This is due to the chest wall acting like a low-pass filter on the sounds generated. The intensity of the vesicular sounds also varies depending on the part of the chest that auscultation is performed on [8].

• Bronchial Sounds
Normal bronchial sounds are heard over the large airways on the chest, specifically near the second and third intercostal space. Bronchial sounds are more hollow and high-pitched compared to vesicular sounds [8]. Bronchial sounds are audible during both inspiratory and expiratory phases [10]. In contrast with vesicular sounds, because the sounds originate in larger airways, the expiratory phase sounds are normally audible for longer than the inspiratory phase ones. The intensity of expiration phase sounds is also higher, compared to the intensity in the inspiration phase. Unlike in vesicular sounds, there is a short pause in between each cycle of breathing. Bronchial sounds contain more energy at a higher frequency bandwidth than vesicular sounds [8]. The sounds heard are usually high-pitched, loud, and tubular.

• Broncho-vesicular Sounds
Broncho-vesicular sounds are normally heard on the posterior chest between the scapulae, as well as in the centre part of the anterior chest. The quality of the sound is between bronchial and vesicular sounds. They are softer than bronchial sounds but still mimic tubular sounds. The inspiratory and expiratory phases can be heard as having similar durations [11].

• Tracheal Sounds
Tracheal sounds are harsh, loud, and usually have a high pitch [8]. The sounds are normally heard when auscultation is performed over the trachea, specifically on the suprasternal notch. The sounds heard are usually hollow and tubular as they are generated by turbulent airflow passing through the pharynx and glottis [10]. The gap between the inspiratory and expiratory phases in tracheal sounds is distinct, with both phases having a similar duration. The energy distribution in frequency is more spread when compared to the other normal sounds, with much energy in the higher-frequency components. The frequency range of normal tracheal sounds can reach up to 5,000 Hz, with an energy drop usually occurring from 800 Hz [12]. The sounds heard over the trachea have a high intensity and can give more information as they are not filtered by the chest wall.

• Mouth Sounds
Breath sounds heard from the mouth are produced by central airways, and caused by turbulent airflow below the glottis. Breath sounds from the mouth have a wide frequency range of 200 to 2,000 Hz [13]. The energy distribution is similar to that of white noise. For a healthy person, breath sounds from the mouth should be silent.

The comparison and summary of the types and characteristics of normal respiratory sounds can be seen in Table 1.

Different locations for auscultation provide different sound characteristics, even for normal breath sounds. This may cause automatic analysis of lung sounds to be more complex when signals are obtained from multiple locations.

Abnormal respiratory sounds. Abnormal breath sounds include the absence or reduced intensity of sounds while breathing, normal breath sounds heard in abnormal areas, as well as adventitious sounds. Adventitious sounds refer to sounds superimposed on normal breath sounds. These can be characterised based on the underlying conditions and hence be very useful in helping diagnosis. Adventitious sounds can be classified into two categories, continuous and discontinuous, based on their duration.
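The duration-based split can be expressed as a simple rule. The 250 ms and 25 ms thresholds come from the definitions cited in this review [14]; the function name and the "indeterminate" label for durations between the two thresholds are assumptions of this sketch, since the cited definitions leave that band open.

```python
def adventitious_sound_category(duration_ms: float) -> str:
    """Split an adventitious sound by duration, per [14]:
    continuous (CAS) above 250 ms, discontinuous (DAS) below 25 ms.
    Durations in between are not covered by these definitions."""
    if duration_ms > 250.0:
        return "continuous"
    if duration_ms < 25.0:
        return "discontinuous"
    return "indeterminate"

print(adventitious_sound_category(300.0))  # a wheeze-length segment
print(adventitious_sound_category(5.0))    # a fine-crackle-length segment
```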

• Continuous Adventitious Sounds
Continuous Adventitious Sounds (CAS) are abnormal sounds superimposed on normal breath sounds with durations of more than 250 ms [14]. Based on the pitch, CAS can be further categorised as high-pitched (Wheeze, Stridor, and Gasp) or low-pitched (Rhonchi and Squawk). Based on the associated condition and cause of the adventitious sounds, different types of CAS can also be distinguished.

• Wheeze and Rhonchi
Wheeze and rhonchi are both continuous adventitious sounds which can be heard during inspiration, mostly at expiration, or during both phases [10]. Wheeze is a high-pitched CAS while rhonchi are low-pitched. Wheeze sounds are caused by the airway narrowing which then causes an airflow limitation [15] while rhonchi are related to the thickening of mucus in the larger airways [16]. According to [17], although wheeze and rhonchi belong to CAS, they do not necessarily have durations of more than 250 ms. Some have reported that wheeze and rhonchi can have minimum durations of around 80 to 100 ms.

Wheeze and rhonchi both present as sinusoid-like signals, with frequency ranges between 100-1,000 Hz. Wheeze is defined as a high-pitched continuous sound with dominant frequency of a minimum of 400 Hz, while rhonchi is a low-pitched continuous sound with dominant frequency of a maximum of 200 Hz [14]. Both wheeze and rhonchi are musical sounds, usually with up to three harmonic frequencies [18].
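The dominant-frequency definitions above translate into a straightforward labelling rule. This sketch uses the thresholds quoted from [14] (wheeze at a minimum of 400 Hz, rhonchi at a maximum of 200 Hz); the handling of the 200-400 Hz band, which the cited definition leaves open, is an assumption of this sketch.

```python
def label_cas_by_dominant_frequency(dominant_freq_hz: float) -> str:
    """Label a continuous adventitious sound by its dominant frequency,
    following [14]: wheeze if >= 400 Hz, rhonchi if <= 200 Hz.
    Frequencies between the two thresholds are left undetermined."""
    if dominant_freq_hz >= 400.0:
        return "wheeze"
    if dominant_freq_hz <= 200.0:
        return "rhonchi"
    return "indeterminate"
```

In practice the dominant frequency would be estimated from a spectrogram of the detected CAS segment; that estimation step is outside the scope of this rule.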

Diseases associated with wheeze are asthma and COPD. If the wheeze is localised, it may be caused by a foreign body blocking the airway, like a tumour [10]. Rhonchi are associated with COPD and bronchitis due to the secretions in the bronchial tree [10].

• Stridor

Stridor is a type of CAS with a sibilant and musical quality, similar to wheeze. Stridor can mostly be heard in the inspiration phase although, on some occasions, it can be heard on expiration or even in both phases [10]. Different from wheeze, stridor sounds are generated by turbulent airflow in the larynx or bronchial tree, and are related to an upper airway obstruction. This is why stridor can be heard more clearly on the trachea, while wheezing can also be heard clearly by chest auscultation [19]. Stridor sounds are characterised by a high pitch of more than 500 Hz [10]. They are also normally harsher and louder than wheeze sounds. As a type of CAS, stridor sounds have a duration of more than 250 ms.

The differential diagnosis for stridor includes epiglottitis, croup, and laryngeal oedema. All of these conditions are related to upper airway obstructions. Stridor sounds can also be heard when there is a foreign body, such as a tumour, in the upper airway tract.

• Gasp
Inspiratory gasps can usually be heard after a bout of coughing, when a patient finally tries to inhale. The whoop sound of an inspiratory gasp is caused by fast-moving air through the respiratory tract. Whoop sounds typically have a high pitch and long duration, which makes inspiratory gasps belong to CAS. The whooping sound is a pathognomonic symptom of whooping cough (pertussis) [20]. This is the only disease associated with a whooping-sound inspiratory gasp.

• Squawk
Squawks are adventitious sounds that can be heard during the inspiratory phase. The sound is a mix of musical and non-musical components. Squawk is also called short wheeze, as the sound’s characteristics are similar to a low-pitched wheeze but with a shorter duration [8]. The pitch of squawk is normally between 200 and 300 Hz [10]. The sounds are generated by oscillation at the peripheral airways [21]. Squawk can usually be heard in patients with hypersensitivity pneumonia, although it has also been reported several times in patients with common pneumonia [22].

• Discontinuous Adventitious Sounds
Discontinuous Adventitious Sounds (DAS) are abnormal sounds superimposed on normal breath sounds with a short duration of less than 25 ms [14]. DAS can be further classified based on the source from where the sounds are generated.

• Fine Crackle
Fine crackle sounds are caused by explosive openings of the small airways. The sound is high-pitched (around 650 Hz) and has a short duration (around 5 ms) [23]. Crackle sounds are explosive and non-musical [8, 24]. Fine crackles are audible only at the late stages of inspiratory phases. Fine crackle sounds are usually associated with pneumonia, congestive heart failure, and lung fibrosis.

• Coarse Crackle
Coarse crackle sounds are generated by air bubbles in large bronchi. The sounds can be heard mostly during the early stages of inspiration, but are also audible at the expiratory stage. Coarse crackles have a low pitch, around 350 Hz, with a sound duration of around 15 ms [23]. Coarse crackle sounds can be heard on patients with chronic bronchitis, bronchiectasis, as well as COPD.
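The typical values quoted for fine crackles (around 650 Hz, around 5 ms) and coarse crackles (around 350 Hz, around 15 ms) [23] suggest a nearest-prototype sketch for separating the two. The prototypes restate the cited values; the scaling constants are rough choices of this sketch (the pitch and duration gaps between the prototypes), not values from the literature.

```python
# Typical crackle characteristics from [23].
FINE_PROTOTYPE = {"pitch_hz": 650.0, "duration_ms": 5.0}
COARSE_PROTOTYPE = {"pitch_hz": 350.0, "duration_ms": 15.0}

def label_crackle(pitch_hz: float, duration_ms: float) -> str:
    """Assign a detected crackle to the nearer prototype. Each dimension
    is scaled (pitch by the 300 Hz gap, duration by the 10 ms gap
    between prototypes) so the two features contribute comparably."""
    def distance(proto: dict) -> float:
        return (abs(pitch_hz - proto["pitch_hz"]) / 300.0
                + abs(duration_ms - proto["duration_ms"]) / 10.0)
    return ("fine" if distance(FINE_PROTOTYPE) < distance(COARSE_PROTOTYPE)
            else "coarse")
```

This kind of threshold-free comparison degrades gracefully for borderline crackles, whereas a hard pitch or duration cut-off would flip the label at an arbitrary point between the two typical values.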

• Pleural Rub
Pleural rub sounds are non-musical, rhythmic sounds, which are categorised as DAS as the duration of each rub is around 15 ms [10]. Pleural rub sounds are caused by the rubbing of the pleural membranes when breathing. The sound generated by the friction can be heard in both phases (biphasic), inspiration and expiration. Pleural rub sounds have a low pitch, normally below 350 Hz [10]. They are usually caused by inflammation of the pleural membrane [8]. A pleural tumour can also cause them [10].

Auscultation. Auscultation is the medical term referring to the use of a stethoscope or other tools to listen to the sounds generated from inside the body. It is used to help diagnose a vast number of conditions. Normally, auscultation is performed to listen to lung, cardiac, abdomen, and blood vessel sounds. Most of the time, auscultation is performed on the anterior and posterior chest [25].

The stethoscope used for auscultation usually consists of two parts, a diaphragm and a bell. The diaphragm is used to listen to high-pitched sounds while the bell is for low-pitched sounds. Auscultation is recommended to be performed in a quiet environment to enable the expert to listen to the sounds clearly [8].

Drawbacks and Limitations of Conventional Auscultation. The first limitation of conventional auscultation is that it cannot be performed frequently and thus cannot provide continuous monitoring. Auscultation needs to be performed by an expert, especially when trying to detect and determine abnormal sounds. This is very limiting, for example, in the case of asthma, because symptoms such as wheezes most often occur during the night. The requirements of performing auscultation in a quiet environment, and ideally with the patient in a still position, are also very restrictive.

The number of people capable of performing auscultation is also limited. An auscultation expert needs a lot of experience in order to determine the types of sounds heard and decide how this information can help in diagnosis or monitoring. Symptoms might be missed and their severity underestimated by both patients and physicians [26], resulting in proper care not being given.

能够听诊的人数也很有限。听诊专家需要有丰富的经验,以便能够确定听到的声音类型,并决定这些信息如何有助于诊断或监测。患者和医生可能会忽视症状,低估症状的严重程度[26],导致没有得到适当的护理。

Limitations of the human auditory system are also a drawback of conventional auscultation. A study in [27] advocates that conventional auscultation should not be used as a reference in research on automatic lung sound analysis. The intensity of respiratory sounds can mask the adventitious sounds, resulting in only normal sounds being heard. The varying amplitude of adventitious sounds may also cause the human ear to miss cases where the intensity is too low to be detected.

人类听觉系统的局限性也是传统听诊的一个缺点。[27]的研究主张在肺音自动分析研究中不应以常规听诊作为参考。呼吸声音的强度可以掩盖外来的声音,导致只听到正常的声音。不确定声音的不同振幅也可能导致人耳错过某些强度太低而无法检测到的情况。

These limitations and drawbacks hinder the effectiveness of conventional auscultation as a means of monitoring and managing symptoms. Automated lung sound analysis, specifically automatic detection and classification of adventitious sounds, could potentially overcome these limitations.

这些限制和缺点阻碍了常规听诊作为监测和管理症状的手段的有效性。自动肺音分析,特别是自动检测和分类外来音,可能会克服这些限制。

Available Automated Lung Sound Analysis Devices. Automatic lung sound analysis, aiming to overcome the limitations mentioned above, has been the recent focus of a significant amount of research, and some commercial systems for very specific applications are already on the market [25]. These include the Wheezometer [28], WHolter [29], VRI [30], LSA-2000 [31], LEOSound [32], Multichannel STG [33], STG for PC [34], and Handheld STG [35].

可用的自动肺音分析设备。自动肺音分析旨在克服上述局限性,已成为近期大量研究的焦点,一些用于非常特定应用的商业系统已经进入市场[25]。其中包括Wheezometer[28],Wholter[29],VRI[30],LSA-2000[31],LEOSound[32],Multichannel STG[33],PC STG[34]和 Handheld STG[35]。

Wheezometer and WHolter were developed by Karmelsonix (now Respiri). The Wheezometer measures the wheeze percentage using one sensor placed over the trachea. WHolter has a similar sensor and algorithm to Pulmotrack [36], but is intended for home monitoring use. The data recorded by WHolter are uploaded to a computer to be analysed. Vibration Response Imaging (VRI), developed by Deep Breeze, uses 34 or 40 sensors placed on the posterior chest. The device detects lung vibration energy and visualises it as a grayscale image. LSA-2000, by Kenzmedico, uses up to 4 sensors attached over the chest to identify interstitial pneumonia. LEOSound, developed by Heinen and Lowerstein, uses 3 sensors capable of storing data for wheeze and cough detection. Multichannel STG uses 14 sensors placed at multiple locations on the posterior chest and trachea, plus a sensor over the heart. The device can count crackles, rhonchi, and wheezes. Smaller versions of the STG use an electronic stethoscope coupled with either a PC (STG for PC) or a handheld device (Handheld STG).

Wheezometer和WHolter是由Karmelsonix(现为呼吸器)开发的。喘息计用于测量喘息百分比,并使用放置在气管上的一个传感器。WHolter具有与Pulmotrack相似的传感器和算法[36],但主要用于家庭监测。由WHolter记录的数据被上传到一台计算机上进行分析。由Deep Breeze开发的振动响应成像(VRI)使用34或40个传感器放置在后胸部。该设备能够检测肺部振动能量,并以灰度图像的形式将其可视化。Kenzmedico公司的LSA-2000使用多达4个附着在胸部的传感器来识别间质性肺炎。由Heinen和Lowerstein开发的LEOSound使用3个能够存储数据的传感器来检测喘息和咳嗽。多通道STG使用14个传感器,放置在胸部后部、气管和心脏上方的多个位置。该装置能够计数噼啪声、隆隆声和喘息声。较小版本的STG使用电子听诊器与PC (STG用于PC)或手持设备(Handheld STG)相结合。

Automated lung sound analysis devices should be easy to use, portable, and require as few sensors as possible [25]. The use of multiple sensors and bulky devices is neither suitable nor cost-effective for home monitoring purposes. With the exception of the Wheezometer, all the devices listed above are typically large and complex, but the Wheezometer can only provide spot-checks, not continuous monitoring. WHolter is portable but works as a data logger with a separate analysis device, while STG for PC and Handheld STG use an electronic stethoscope that is also not suitable for continuous monitoring. Thus, portable or wearable non-intrusive devices that can be used to monitor lung sounds without the help of experts are still needed.

自动肺声分析设备应易于使用、便携,并且需要尽可能少的传感器[25]。使用多个传感器和笨重的设备不适合用于家庭监测,而且成本效益不高。除Wheezometer外,上面列出的所有设备通常都是大型和复杂的,但这只能提供抽查,而不是连续监测。WHolter具有可移植性,但作为数据记录器与单独的分析设备一起工作。而用于PC的STG和Handheld  STG使用的电子听诊器也不适合连续监测。因此,在没有专家帮助的情况下,仍然需要便携式或可穿戴的非侵入式设备来监测肺音。

Other than the devices mentioned above, the development of algorithms to detect or classify lung sounds has been the focus of many research works. These works developed detection or classification methods by extracting certain features from the sounds. The detection and classification methods used vary from empirically determined rules to machine learning. A systematic review of automatic detection or classification of adventitious sounds is presented in the next subsection.

除了上述设备之外,检测或分类肺音的算法的开发一直是许多研究工作的重点。这些作品通过从声音中提取某些特征来开发检测或分类方法。所使用的检测和分类方法从经验确定到使用机器学习各不相同。一个系统的审查自动检测或分类的不定音将在下一小节中提出。

Review of algorithms for automatic adventitious respiratory sound analysis

This section reviews published studies on the detection or classification of adventitious respiratory sounds. The review is organised as follows. The types of sound being investigated are discussed first, followed by a discussion of the level at which the analysis is performed. The sensor types, number, and placement are reviewed next. Available online databases with recordings of adventitious sounds are then presented. The methodology of analysis is reviewed last, including the use of the data, validation, features, and the classification or detection methods used.

自动不定音呼吸音分析算法综述本节综述了已发表的关于不定音呼吸音的检测或分类的研究。检讨的安排如下:我们将首先讨论所研究的声音类型。接下来是对执行分析的级别的讨论。传感器的类型、数量和位置将在下面进行介绍。提供在线数据库与录音的外来声音。最后回顾了分析的方法,包括数据的使用、验证、特征和使用的分类或检测方法。

Study selection. A total of 77 full articles were included in this systematic review. Database searches on SCOPUS and IEEExplore, as well as citation tracking, identified a total of 1519 records. Removal of duplicates and non-accessible full-text articles left 1446 articles. Of these, 1297 articles were excluded based on title and abstract screening. From the screening, 149 full-text articles were then assessed for eligibility, and 72 studies were excluded. This study selection resulted in a total of 77 eligible full articles, which were all included in the review. The flow diagram for this study selection can be seen in Fig 1.

研究选择。这篇系统综述共收录了77篇完整的文章。在SCOPUS和IEEExplore上的数据库搜索以及引文跟踪共确定了1519条记录。删除了重复和不可访问的全文文章,留下1446篇文章。其中,1297篇文章根据标题和摘要筛选被排除在外。从筛选中,149篇全文文章被评估为合格,72项研究被排除在外。本研究共筛选出77篇符合条件的完整文章,全部纳入综述。本研究选择的流程图见图1。

Characteristics of studies included in this systematic review are given in Tables 3 and 4. The characteristics summarised for each work are: type of sound analysis, approach and level of analysis, instrumentation or database used to obtain data, and amount of data used in the analysis.

本系统综述纳入的研究特征见表3和表4。每项工作总结的特点是:声音分析的类型、分析的方法和水平、用于获取数据的仪器或数据库,以及分析中使用的数据量。

Types of sounds analysed. Although all eligible articles included in this review targeted adventitious sounds, different works had different specific aims. Hence, some of the works investigated one type of adventitious sound and compared it with normal breath sounds: this can be performed as a detection or classification scheme. Others reported the classification of several types of adventitious sounds. There were also works that classified the cause of adventitious sound generation.

分析声音的类型。虽然本综述纳入的所有符合条件的文章都针对外来声音,但不同的作品有不同的具体目的。因此,一些作品研究了一种类型的非定音,并将其与正常的呼吸音进行比较——这可以作为一种检测或分类方案。其他人则报告了几种类型的外来音的分类。也有作品对外来音产生的原因进行了分类。

Examples of the analysis performed in the published papers included: wheeze detection, wheeze classification against normal breath sounds, classification of monophonic and polyphonic wheezes, crackle detection in a recording, and classification of crackle and normal breath sounds. Beyond separate wheeze and crackle analysis, several works analysed combinations of adventitious sounds. Generally, the analysis was a classification task, such as: wheeze and rhonchi classification, classification of wheeze and crackle, wheeze and stridor classification, and other combinations. Another example was classification between sounds caused by airway obstruction and those of parenchymal origin. 55 (71.43%) of the studies focused on wheeze, 40 (51.95%) on crackle, 9 (11.69%) on stridor, 9 (11.69%) on rhonchi, and 18 (23.38%) on other sounds such as pleural rub and squawk, as well as the underlying pathology. A summary of the types of sounds analysed in each article can be seen in Table 3.

在已发表的论文中进行的分析示例包括:喘息检测,喘息与正常呼吸音的分类,单音和复音喘息的分类,录音中的裂纹检测,以及裂纹和正常呼吸音的分类。除了喘息声和噼啪声分析外,还在不同作品中结合进行了非定音分析。一般来说,分析是在分类任务上进行的,例如:喘息和rhonchi分类,喘息和噼啪分类,喘息和喘鸣分类,以及其他组合。另一个例子是气道阻塞或实质引起的声音的分类。喘息55篇(71.43%),噼啪40篇(51.95%),喘鸣9篇(11.69%),隆奇9篇(11.69%),其他胸膜摩擦声、嘎嘎声及病理18篇(23.38%)。表3是每篇文章中分析的声音类型的摘要。

Level of analysis. There are three different levels at which adventitious sound analysis can be performed. Several studies performed detection and classification of adventitious sounds at the segment level. For detection at the segment level, features are usually extracted on segments generated by signal windowing. Classification may also be performed at the segment level: random segments from both adventitious and normal sounds are obtained and used to perform the classification. Different from classification at the segment level, classification at the event level is usually done after obtaining manually isolated events of adventitious sounds and normal breath sounds. At the recording level, the task performed is usually the detection of events.

分析水平。可以进行三种不同级别的非定音分析。一些研究在音段水平上对不定音进行了检测和分类。对于段级检测,通常在信号窗生成的段上提取特征。分类也可以在分段级别进行。从非定音和正常音中获得随机片段,并用于执行这种分类。与片段级别的分类不同,事件级别的分类通常是在获得非定音和正常呼吸音的人工分离事件后进行的。在记录级别,执行的任务通常是检测事件。
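The segment-level analysis described above starts from signal windowing. A minimal sketch of the windowing step, assuming a fixed frame length and hop chosen for illustration only:

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping segments (frames) for
    segment-level analysis; trailing samples that do not fill a
    whole frame are dropped."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

# e.g. a 16-sample signal with 8-sample frames and 50% overlap
frames = frame_signal(np.arange(16, dtype=float), frame_len=8, hop=4)
# frames.shape == (3, 8)
```

Features would then be extracted from each row of `frames` and fed to a detector or classifier.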

Different levels of analysis result in different performance measures. At the segment level, one possible performance measure is to regard each segment as either true positive, true negative, false positive, or false negative. Another approach is to combine the detected segments, for example by taking a few consecutive detected segments as a positive event or by taking the mean values of extracted features. For the reported works using the event level (usually a classification task), the performance is measured from individually isolated events. Detection tasks performed at the recording level measure the performance at the event level. As for classifications performed at the recording level, the analysed recording will either be classified as containing abnormal sounds or as a normal recording. More detail on how each work in the literature performed analysis and measured the performance can be seen in Table 3.

不同层次的分析产生不同的性能度量。在片段层面,一种可能的绩效衡量方法是将每个片段视为真阳性、真阴性、假阳性或假阴性。另一种方法是将检测到的片段组合起来,例如将几个连续检测到的片段作为正事件,或者取提取特征的平均值。对于使用事件级别(通常是分类任务)报告的工作,性能是从单独孤立的事件度量的。在记录级别执行的检测任务度量事件级别的性能。至于在录音级别进行的分类,分析的录音将被分类为包含异常声音或正常录音。关于文献中每个工作如何进行分析和测量性能的更多细节见表3。
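These performance conventions can be made concrete in a short sketch, assuming binary per-segment labels. The run-length rule for merging consecutive positive segments into an event is an illustrative choice, not one prescribed by the reviewed works:

```python
import numpy as np

def segment_metrics(y_true, y_pred):
    """Segment-level sensitivity and specificity from binary labels
    (1 = adventitious, 0 = normal)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn), tn / (tn + fp)

def segments_to_events(y_pred, min_run=3):
    """Merge runs of >= min_run consecutive positive segments into
    (start, end) events, as in the combining approach described above."""
    events, run_start = [], None
    for i, v in enumerate(list(y_pred) + [0]):   # sentinel ends a trailing run
        if v and run_start is None:
            run_start = i
        elif not v and run_start is not None:
            if i - run_start >= min_run:
                events.append((run_start, i))
            run_start = None
    return events
```

For example, `segments_to_events([0, 1, 1, 1, 0, 1, 0], min_run=3)` keeps only the three-segment run and discards the isolated positive.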

Sensor and its placement. Most research works on adventitious sound analysis used data recorded from patients in hospital. The most common sensors used for data collection were microphones. The types of microphone mentioned were the SP0410HR5H-PB [114], KEC-2738 [115], TSD108 [116], Panasonic WM-61 [117], SONY ECM-44 BPT, and SONY ECM-77B [118]. Several articles also used microphones without mentioning the specific type. Electronic stethoscopes were also used by several researchers. These include the ThinkLab Rhythm:ds32a Digital Stethoscope [119], WelchAllyn Meditron Electronic Stethoscope [120], and Littmann 3M Electronic Stethoscope Models 4000 [121] and 3200 [122]. One paper used an accelerometer (BU-3173) [123] as a sensor. Other than the sensors above, several studies stated the use of either a microphone or stethoscope without specifically mentioning the type. In total, 31 studies used microphones and 21 studies used electronic stethoscopes.

传感器及其放置。大多数关于非定音分析的研究工作使用的是医院患者记录的数据。用于数据收集的最常用传感器是麦克风。所提到的麦克风型号为SP0410HR5H-PB[114]、KEC-2738[115]、TSD108[116]、松下WM-61[117]、索尼ECM-44 BPT和索尼ECM-77B[118]。有几篇文章也使用了麦克风,但没有具体提到麦克风的类型。电子听诊器也被一些研究人员使用。这些包括ThinkLab Rhythm:ds32a数字听诊器[119],WelchAllyn Meditron电子听诊器[120],以及Littmann 3M电子听诊器型号4000[121],3200[122]。一篇论文使用加速度计BU-3173[123]作为传感器。除了上述传感器,一些研究表明使用麦克风或听诊器,但没有具体提到类型。总共有31项研究使用麦克风,21项研究使用电子听诊器。

Conventional auscultation is usually performed on the anterior and posterior chest in order to obtain vesicular breath sounds. For the development of algorithms for the detection or classification of adventitious sounds, several studies used the trachea, specifically the suprasternal notch, as the location for the sensor. Mouth breath sounds were also used in one of the papers to detect wheezes.

常规听诊通常在前胸和后胸进行,以获得水泡呼吸音。为了开发检测或分类非定音的算法,一些研究使用气管,特别是胸骨上切迹作为传感器的位置。其中一篇论文还使用了口音来检测喘息声。

The number of sensors used to perform the analysis varies from a single sensor up to a set of 14. In some papers, although only one sensor was used, the sensor was not kept in a fixed position but was used to detect sounds from multiple locations, similar to conventional auscultation. This was generally the case when a digital stethoscope was used for data collection. A summary of the sensors used in each work can be seen in Table 4.

用于执行分析的传感器数量从一个传感器到14个传感器不等。在一些论文中,虽然只使用了一个传感器,但传感器并没有保持在固定位置,而是用于检测来自多个位置的声音,类似于传统听诊。当使用数字听诊器进行数据收集时,通常会出现这种情况。每项工作中使用的传感器的摘要见表4。

Databases. Several works used available databases as a source for analysis instead of collecting their own data. The databases used are from online repositories and from audio CD companion books. The online repositories available were from R.A.L.E [124], the East Tennessee State University repository [125], the Littmann repository [126], and SoundCloud [127]. The companion audio CDs used were from books such as Understanding Lung Sounds 2nd Edition [128], Understanding Lung Sounds 3rd Edition [129], Auscultation Skills: Breath and Heart Sounds [130], Fundamentals of Lung and Heart Sounds [131], Understanding Heart Sounds and Murmurs [132], Heart and Lung Sounds Reference Library [133], Secrets Heart & Lung Sounds Workshops [134], Lung Sounds: An Introduction to the Interpretation of the Auscultatory Finding [135], and The Chest: Its Signs and Sounds [136].

数据库。一些作品使用可用的数据库作为分析来源,而不是收集自己的数据。使用的数据库来自在线存储库和音频CD配套书籍。可用的在线存储库来自R.A.L.E[124]、东田纳西州立大学存储库[125]、Littmann存储库[126]和SoundCloud[127]。配套使用的音频cd来自以下书籍:《理解肺音第二版》[128]、《理解肺音第三版》[129]、《听诊技巧:呼吸和心音》[130]、《肺音和心音基础》[131]、《理解心音和杂音》[132]、《心肺音参考图书馆》[133]、《秘密心肺音工作坊》[134]、《肺音:听诊发现解释导论》[135]和《胸腔》:它的符号和声音[136]。

Breath sounds from online or book databases were taken from multiple locations, such as the chest, neck, and mouth. The sensor used for the data collection varied and included an electret microphone and accelerometer in [124], and the Littmann Digital Stethoscope in Littman repository [126].

从在线或书籍数据库中提取的呼吸音来自多个位置,如胸部、颈部和口腔。用于数据收集的传感器多种多样,包括[124]中的驻极体麦克风和加速度计,以及Littmann存储库中的Littmann数字听诊器[126]。

Method of analysis and performance. Algorithms developed to detect or classify adventitious sounds usually involve two steps. The first step is to extract the relevant features that will be used as detection or classification variables. The second step is to use detection or classification techniques on the data, based on the features extracted. In developing a detection or classification algorithm, especially if machine learning techniques are used, it is important to take note of how the data is used to train, test, and validate the algorithm. In this section, the literature published will be discussed. The following aspects were reviewed: features extracted; classifier or detection techniques used; how the training, testing, and validation was performed; as well as the performance achieved. The section is organised based on the classifier or detection techniques used. These are empirical rule-based (such as with thresholding or peak selection), Support Vector Machine (SVM), Artificial Neural Network (ANN) variant, and other techniques such as clustering and statistical models. Table 5 is provided to summarise the review.

分析和性能的方法。用于检测或分类非定音的算法通常包括两个步骤。第一步是提取将用作检测或分类变量的相关特征。第二步是基于提取的特征,对数据使用检测或分类技术。在开发检测或分类算法时,特别是在使用机器学习技术时,重要的是要注意如何使用数据来训练、测试和验证算法。在本节中,将讨论已发表的文献。综述了以下几个方面:特征提取;使用的分类器或检测技术;培训、测试和验证是如何进行的;以及所取得的成绩。本节是根据所使用的分类器或检测技术来组织的。这些是基于经验规则的(如阈值或峰值选择),支持向量机(SVM),人工神经网络(ANN)变体,以及其他技术,如聚类和统计模型。表5概述了审查情况。
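The two-step structure described above (feature extraction, then a decision rule) can be sketched as follows. The features and threshold values here are illustrative placeholders, not taken from any specific reviewed study:

```python
import numpy as np

def extract_features(segment):
    """Step 1 (illustrative): simple spectral features from one segment --
    spectral centroid (in bins) and spectral entropy."""
    spec = np.abs(np.fft.rfft(segment))
    p = spec / (spec.sum() + 1e-12)
    centroid = np.sum(np.arange(len(p)) * p)
    entropy = -np.sum(p * np.log2(p + 1e-12))
    return np.array([centroid, entropy])

def detect(segment, thresholds=(20.0, 5.0)):
    """Step 2 (illustrative): an empirical rule on the extracted features;
    the threshold values are arbitrary placeholders. A machine-learning
    classifier would replace this rule in the learning-based works."""
    centroid, entropy = extract_features(segment)
    return centroid < thresholds[0] and entropy < thresholds[1]
```

A tonal (wheeze-like) segment yields a low spectral entropy, while a noise-like breath segment yields a high one, which is what the simple rule exploits.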

Empirical Rule Based Methods. A study by [62] performed crackle classification. The data used included 50 crackle events and 50 normal breath sounds. The sounds were recorded using a Littmann 3M 4000 Electronic Stethoscope at multiple positions on the chest. The classification performed was based on the mathematical morphology of a crackle event in the spectrogram. The classification achieved 86% sensitivity with a specificity of 92%.

基于经验规则的方法。[62]进行了裂纹分类研究。使用的数据包括50个爆裂声事件和50个正常的呼吸音。使用Littmann 3M 4000电子听诊器在胸部多个位置记录声音。所执行的分类是基于在光谱图中的裂纹事件的数学形态学。该分类灵敏度为86%,特异性为92%。

Wheeze classification was performed in [95]. The data used for the study was obtained from [129]. A total of 17 recordings, with 7 normal and 10 containing wheezes were used. The classification performed was to determine whether a recording was normal or contained wheezes. The feature used was extracted based on the entropy of each frame of the segmented recording. The feature set was the ratio and difference of the maximum and minimum entropy of the segments of a recording. Based on an empirical threshold, the classification was performed. The study achieved 84.4% sensitivity and 80% specificity.

在[95]中进行了喘息分类。本研究使用的数据来自[129]。共17次录音,7次正常,10次含喘息。进行分类是为了确定录音是正常的还是包含喘息声。所使用的特征是基于分割记录的每一帧的熵提取的。特征集是记录片段的最大和最小熵的比值和差。基于经验阈值进行分类。该研究的敏感性为84.4%,特异性为80%。
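The entropy-based feature set described above can be sketched as follows, assuming per-frame spectral entropy and an FFT size chosen for illustration:

```python
import numpy as np

def frame_entropy(frame, n_fft=256):
    """Shannon entropy of the normalised magnitude spectrum of one frame."""
    spec = np.abs(np.fft.rfft(frame, n_fft))
    p = spec / (spec.sum() + 1e-12)
    return -np.sum(p * np.log2(p + 1e-12))

def entropy_features(frames):
    """Ratio and difference of the maximum and minimum frame entropies of
    a recording, in the spirit of the entropy-based features above."""
    h = np.array([frame_entropy(f) for f in frames])
    return h.max() / (h.min() + 1e-12), h.max() - h.min()
```

An empirical threshold on the resulting (ratio, difference) pair would then classify the whole recording.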

An empirical threshold was also used as a classifier by [50] to perform multi-class classification between wheeze, stridor, crackle, and normal events. This study was a continuation of [95] above. The data used for this study were obtained both from hospital and from the SoundCloud online repository with the search term 'lung sounds'. A total of 45 recordings were used, each containing several cycles of respiration. Similar to the algorithm in [95], entropy was extracted from the segmented recording. For the multi-class classification, two entropy-based features were extracted instead of just one as in the previous study. The entropy-based features were the difference and ratio of the maximum and minimum entropy of a segment in a recording. Similar to [95], the performance was measured by classifying a whole recording using the extracted features. The performance reported was 99% for stridor, 70% for wheeze, 87% for crackle, and 99% for normal sounds.

经验阈值也被[50]用作分类器,在喘息、喘鸣、噼啪和正常事件之间执行多类分类。本研究是上述[95]的延续。本研究使用的数据来自医院和Soundcloud在线存储库,搜索词为“肺音”。总共使用了45个记录,每个记录包含几个呼吸周期。与[95]中的算法类似,从分割的记录中提取熵。对于多类分类,我们提取了两个基于熵的特征,而不是像以前的研究那样只提取一个。基于熵的特征是记录中一个片段的最大和最小熵的差和比值。与[95]类似,性能是通过使用提取的特征对整个记录进行分类来衡量的。报告的表现为喘鸣99%,喘息70%,噼啪87%,正常声音99%。

A finding from [63] claimed that the delay coordinate can be used as a feature to perform a classification between wheeze events and normal breath sounds, achieving 98.39% overall accuracy. The underlying reason was that the wheeze sound signal is a sinusoid while a normal breath sound is noise-like. A threshold can be found to perform the classification based on the persistent homology of delay embeddings. Another study from the same group [73] previously focused on wheeze sound detection in a recording. The data used contained 6 wheeze events in a recording which could all be detected using an energy threshold classifier on certain frequency bands and wavelet packet decomposition.

[63]的一项研究表明,延迟坐标可以作为一个特征来对喘息事件和正常呼吸音进行分类,总体准确率达到98.39%。潜在的原因是喘息声信号是正弦波,而正常的呼吸声是噪音。基于延迟嵌入的持续同源性,可以找到一个阈值来执行分类。同一组的另一项研究[73]先前关注的是录音中的喘息声检测。所使用的数据在一个记录中包含6个喘息事件,这些事件都可以使用特定频段的能量阈值分类器和小波包分解来检测。

Wheeze detection was also studied by [77], with signals obtained using a stethoscope that was built using a microphone inside a chamber. The sounds were recorded from the neck. A total of 59 recordings, 25 with wheezes and 34 normal, from 8 young children were used for analysis. The feature used was the correlation coefficient, while the classifier was an empirically determined threshold. The features were extracted from each segment of a recording. Several consecutive high correlation coefficients were regarded as a wheeze event. Finally, each recording was classified as containing wheeze or being normal by using a threshold, calculated as the ratio between wheeze duration and normal respiratory duration. The performance achieved was 88% sensitivity with 94% specificity.

[77]也对喘息检测进行了研究,使用听诊器获得信号,听诊器使用室内的麦克风构建。声音是从脖子上录下来的。分析8例幼儿的59份录音,其中喘息25份,正常34份。所使用的特征是相关系数,而分类器是经验确定的阈值。特征是从录音的每个片段中提取出来的。几个连续的高相关系数被认为是一个喘息事件。最后,通过使用喘息持续时间与正常呼吸持续时间之间的比率计算阈值,将每个记录分类为包含喘息或正常呼吸。灵敏度为88%,特异度为94%。
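The correlation-based scheme above can be sketched as follows. The correlation threshold, minimum run length, and final duration-ratio rule are hypothetical values standing in for the empirically determined ones in [77]:

```python
import numpy as np

def wheeze_fraction(frames, corr_thresh=0.8, min_run=3):
    """Fraction of a recording covered by runs of >= min_run consecutive
    frames whose adjacent-frame correlation exceeds corr_thresh; a final
    threshold on this ratio would classify the recording as containing
    wheeze or being normal (threshold values are illustrative)."""
    corr = np.array([np.corrcoef(frames[i], frames[i + 1])[0, 1]
                     for i in range(len(frames) - 1)])
    high = corr > corr_thresh
    covered, run = 0, 0
    for v in np.append(high, False):   # sentinel closes a trailing run
        if v:
            run += 1
        else:
            if run >= min_run:
                covered += run
            run = 0
    return covered / max(len(high), 1)
```

Stationary tonal frames correlate strongly with their neighbours, so a sustained wheeze produces a long high-correlation run, whereas noise-like breath frames do not.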

The study in [74] also focused on wheeze detection. The wheeze sounds were recorded using a single digital stethoscope from multiple positions. In total, 40 recordings were used for the study. The features were obtained from time-frequency analysis, with a rule-based decision making, such as finding and selecting peaks based on energy threshold, derived from the algorithm developed by [101]. The study achieved 72.5% specificity with a sensitivity of 99.2%.

[74]的研究也侧重于喘息检测。喘息声是用一个数字听诊器从多个位置记录的。这项研究总共使用了40段录音。特征通过时频分析获得,基于规则的决策,如根据能量阈值寻找和选择峰值,源自[101]开发的算法。该研究的特异性为72.5%,敏感性为99.2%。
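Energy-based peak picking in the time-frequency plane, as used above, can be sketched with SciPy. This is a simplified stand-in for the rule-based selection of [101], with an arbitrary relative-energy threshold:

```python
import numpy as np
from scipy.signal import spectrogram

def energy_peaks(x, fs, thresh_db=-20):
    """Per-frame spectral peaks whose energy is within |thresh_db| dB of
    the spectrogram maximum -- a simplified stand-in for rule-based
    time-frequency peak selection (threshold is illustrative)."""
    f, t, S = spectrogram(x, fs, nperseg=256)
    S_db = 10 * np.log10(S + 1e-12)
    peaks = []
    for j in range(S.shape[1]):
        i = np.argmax(S_db[:, j])
        if S_db[i, j] > S_db.max() + thresh_db:
            peaks.append((t[j], f[i]))
    return peaks
```

A sustained wheeze shows up as a sequence of peaks at a nearly constant frequency; further rules (e.g. minimum duration) would turn such sequences into detected events.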

Classification of CAS and DAS against normal breath sounds was carried out by [80]. 47 recordings from an online repository [124] were used. These contained 10 normal, 20 CAS, and 17 DAS recordings. Two features were analysed in this study. The first was a similarity measure of segments in the recording using mutual information. The second was a weighted cepstral feature. The study claimed a high classification accuracy using a threshold classifier on the first feature, while a separability index of 1 was found using the second feature set for both CAS and DAS classification.

根据正常呼吸音对CAS和DAS进行分类[80]。使用了来自在线存储库的47段录音[124]。其中包括10个正常录音,20个CAS录音和17个DAS录音。本研究分析了两个特征。第一个特征是利用互信息对录音片段进行相似性度量。第二个特征是加权的倒谱特征。该研究声称,通过使用使用第一个特征的阈值分类器,分类精度很高,而使用CAS和DAS分类的第二组特征,可分离性指数为1。

Wheeze segment classification was performed in [81], also using a threshold-based classifier. A total of 180 segments were analysed. These contained 82 wheeze segments and 98 normal segments. The feature used in this study was the fractional Hilbert transform. The overall accuracy achieved was 90.5%. The same research group performed crackle detection also using the fractional Hilbert transform as a feature in [82]. The correlation coefficient was used as additional feature to detect crackle. The performance achieved was a sensitivity of 94.28% and Positive Predictive Value (PPV) of 97.05%, at the event level, on 10 short recordings with 33 crackle events.

在[81]中也使用基于阈值的分类器进行了喘息段分类。共分析了180个节段。这些包含82个喘息段和98个喘息段正常的部分。本研究中使用的特征是分数阶希尔伯特变换。总体准确度达到90.5%。同一研究小组在[82]中也使用分数阶希尔伯特变换作为特征进行裂纹检测。将相关系数作为裂纹检测的附加特征。在事件水平上,对10个短记录的33个裂纹事件的灵敏度为94.28%,阳性预测值(PPV)为97.05%。

Crackle detection was also performed in [56] by using thresholding on fractal dimension and the CORSA [143] criterion of crackle. A total of 24 recordings were used for the analysis, obtained using a stethoscope. The performance reported was an average sensitivity of 89 ± 10% and PPV of 95 ± 11%, at the event level, for different recordings.

在[56]中也使用分形维数阈值法和裂纹CORSA[143]准则进行了裂纹检测。共有24段录音用于分析,这些录音是通过听诊器获得的。在事件水平上,不同记录的平均灵敏度为89±10%,PPV为95±11%。

A study in [84] also performed crackle detection using a threshold-based classifier. The feature used in this study was the abnormality level. A total of 433 segments were used in the analysis with no further detail given. The performance reported was 84.5% accuracy.

[84]中的一项研究也使用基于阈值的分类器进行了裂纹检测。本研究使用的特征是异常水平。在分析中总共使用了433个片段,没有给出进一步的细节。所报告的性能准确率为84.5%。

Wheeze detection was performed in [83] using the Linear Predictive Coding (LPC) prediction error ratio as a feature. A total of 26 recordings were used for analysis, with 13 of them containing wheeze sounds. By using a threshold classifier on the prediction error, 70.9% sensitivity and 98.6% specificity at the event level was achieved.

在[83]中,使用线性预测编码(LPC)预测错误率作为特征来进行喘息检测。总共有26段录音被用于分析,其中13段包含喘息声。通过对预测误差使用阈值分类器,在事件水平上实现了70.9%的灵敏度和98.6%的特异性。
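The intuition behind the LPC prediction error feature is that a tonal wheeze is well predicted by a short linear predictor, while noise-like breath sound is not. A minimal sketch, assuming an autocorrelation-method LPC with an illustrative model order:

```python
import numpy as np

def lpc_error_ratio(x, order=8):
    """Ratio of LPC one-step prediction error energy to signal energy:
    low for tonal (wheeze-like) signals, near 1 for noise-like breath
    sounds. Coefficients come from the autocorrelation normal equations."""
    x = np.asarray(x, float) - np.mean(x)
    r = np.correlate(x, x, 'full')[len(x) - 1 : len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-8 * r[0] * np.eye(order)        # diagonal loading for stability
    a = np.linalg.solve(R, r[1 : order + 1])
    pred = np.convolve(x, np.r_[0.0, a])[: len(x)]   # one-step prediction
    err = x - pred
    return np.sum(err[order:] ** 2) / np.sum(x[order:] ** 2)
```

An empirical threshold on this ratio would then separate wheeze-containing segments from normal ones, in the spirit of [83].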

The work in [97] used peak selection based on time duration to perform wheeze detection. A total of 40 events were obtained from several databases. The only currently available database is [125]. From the 40 events, 19 of them were wheezes and 21 were normal respiratory sounds. The performance reported was 84% sensitivity and 86% specificity.

[97]中的工作使用基于持续时间的峰值选择来执行喘息检测。从几个数据库中总共获得了40个事件。目前唯一可用的数据库是[125]。在这40个事件中,19个是喘息,21个是正常的呼吸音。报告的表现敏感性为84%,特异性为86%。

Wheeze and normal respiratory event classification was performed in [98]. Signals from 14 volunteers were recorded using one SONY ECM-77B microphone. An additional 100 normal and 86 wheeze events from [129, 132] were obtained. The classification was done using distortion in histograms of sample entropy as a feature. Performance of 97.9% accuracy for expiration and 85.3% accuracy for inspiration phase, at the event level, was reported.

喘息和正常呼吸事件分类[98]。14名志愿者的信号用一台索尼ECM-77B麦克风记录下来。从[129,132]中获得了另外100例正常和86例喘息事件。分类是利用样本熵直方图的失真作为特征来完成的。据报道,在事件水平上,呼气期的准确率为97.9%,吸气期的准确率为85.3%。
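Sample entropy, the basis of the feature above, can be sketched directly from its definition. This is a generic SampEn implementation, not the exact histogram-distortion feature of [98]; `m` and `r` take their common default values:

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy SampEn(m, r); r is a fraction of the signal's
    standard deviation. Low values indicate regular (e.g. tonal)
    signals, high values noise-like ones."""
    x = np.asarray(x, float)
    tol = r * x.std()

    def match_count(mm):
        tpl = np.array([x[i : i + mm] for i in range(len(x) - mm + 1)])
        # pairwise Chebyshev distances between all templates
        d = np.max(np.abs(tpl[:, None, :] - tpl[None, :, :]), axis=2)
        return np.sum(d <= tol) - len(tpl)   # exclude self-matches

    b, a = match_count(m), match_count(m + 1)
    return -np.log(a / b)
```

A perfectly periodic signal has many template matches at both lengths, giving a SampEn near zero, while noise gives far fewer length-(m+1) matches and hence a much larger value.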

A threshold on fractal dimension was used to detect crackle segments in [100]. A total of 18 recordings with 182 crackle events were analysed. Sensitivity of 92.9% and PPV of 94.4% were achieved for crackle detection at the event level.

[100]采用分形维数阈值对裂纹段进行检测。共分析了18个记录的182个裂纹事件。灵敏度为92.9%,PPV为94.4%,在事件水平上实现了裂纹的检测。
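One common fractal dimension estimator for waveform segments is Katz's; the reviewed studies do not all specify which estimator they used, so the following is a representative sketch rather than the exact feature of [56] or [100]:

```python
import numpy as np

def katz_fd(x):
    """Katz fractal dimension of a 1-D segment; transient, jagged
    crackle-like segments score higher than smooth breath sound."""
    x = np.asarray(x, float)
    L = np.sum(np.abs(np.diff(x)))           # total curve length
    d = np.max(np.abs(x - x[0]))             # max distance from first point
    n = len(x) - 1                           # number of steps
    return np.log10(n) / (np.log10(n) + np.log10(d / L + 1e-12))
```

A threshold on this value per segment, combined with waveform criteria such as those in CORSA, would flag candidate crackles.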

The work in [106] performed wheeze detection with signals obtained from 16 asthmatic patients and 15 healthy volunteers. Data were recorded using one piezoelectric microphone placed on the neck. An energy threshold was used, achieving 100% sensitivity and specificity for high airflow at the event level. Wheeze detection was also the focus of the study in [101]. Signals from 13 volunteers containing 422 wheeze events were recorded using five SONY ECM-77B microphones placed on the neck, anterior, and posterior chest. Data from 10 out of the 13 volunteers were used as a test set containing 337 wheeze events. The detection was made by selecting peaks based on sets of rules. Sensitivity of 95.5 ± 4.8% and specificity of 93.7 ± 9.3% at the event level was achieved on the test set.

[106]中的工作使用从16名哮喘患者和15名健康志愿者获得的信号进行喘息检测。数据记录使用一个压电麦克风放置在脖子上。使用阈值能量,在事件水平上对高气流达到100%的灵敏度和特异性。喘息检测也是[101]研究的重点。来自13名志愿者的422个喘息事件的信号被记录下来,使用5个索尼ECM77B麦克风分别放置在颈部、胸部前部和胸部后部。13名志愿者中有10人的数据被用作包含337个喘息事件的测试集。基于规则集选择峰值进行检测。在测试集的事件水平上,灵敏度为95.5±4.8%,特异性为93.7±9.3%。

The work in [108] addressed both crackle segment detection and classification using signals obtained from the ACCP teaching tape. The feature used for detection was the correlation between a crackle signal in the time domain and a wavelet decomposition. The crackle segment detection achieved 99.8% accuracy. Classification between fine and coarse crackles was performed on the detected crackle segments. The article claimed that the achieved accuracy was "almost" 100%.

[108]中的研究使用ACCP教学磁带获得的信号对裂纹段进行检测和分类。用于检测的特征是裂纹信号在时域内与小波分解之间的相关性。裂纹段检测准确率达到99.8%。对检测到的裂纹段进行细裂纹和粗裂纹的分类。这篇文章声称,实现的准确率“几乎”是100%。

Prior to this, [112] also performed crackle detection and classification. A threshold on energy envelope was used to detect and isolate crackle segments. The detected crackles were further classified into fine or coarse by using crackle typical characteristics such as peak frequency and time duration. The algorithm was applied to signals from 9 patients obtained using a microphone. The study claimed to achieve 100% accuracy in classifying crackles into fine or coarse.

在此之前,[112]也进行了裂纹检测和分类。采用能量包络阈值对裂纹段进行检测和分离。利用裂纹的峰值频率和持续时间等典型特征,将检测到的裂纹进一步分为细裂纹和粗裂纹。该算法应用于使用麦克风获得的9名患者的信号。该研究声称,将裂纹分为细裂纹和粗裂纹的准确率达到100%。

Support Vector Machine Based Methods. The work in [37] used an SVM classifier to perform wheeze detection. The signals used were obtained with a single microphone (SP0410HR5H-PB) used to record mouth breath sounds. A total of 95 recordings were collected, 27 of them containing wheezes. 70 recordings, with wheezes in 20 of them, were used to train the SVM classifier, while the rest were used to test it. A separate set of 39 recordings with 10 wheezes was used as an additional test set. Spectral-based features were used for the classifier. The recordings were divided into segments and the features were extracted from each frame of the segmented recordings. Using this method, 71.4% sensitivity and 88.9% specificity were achieved on the validation set at the recording level. A Logistic Regression Model (LRM) classifier was also used, but SVM achieved a better overall performance.

基于支持向量机的方法。[37]中的工作使用SVM分类器进行喘息检测。使用单麦克风(SP0410HR5H-PB)获取信号,用于记录口腔呼吸音。总共收集了95个录音,其中27个包含喘息声。70个录音中有20个带有喘息的录音用于训练SVM分类器,其余的录音用于测试分类器。另一组有39段录音的10次喘息作为额外的测试集。基于光谱的特征被用于分类器。将录音分割成片段,并从每一帧的片段中提取特征。该方法在记录水平验证集上的灵敏度为71.4%,特异性为88.9%。虽然也使用了逻辑回归模型(Logistic Regression Model, LRM)分类器,但使用支持向量机(SVM)的结果总体性能更好。
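The SVM-based classification above can be illustrated with a small scikit-learn sketch on synthetic band-energy features. The data, feature layout, and kernel choice here are illustrative assumptions, not those of [37]:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
# synthetic 8-band spectral-energy features: wheeze-like recordings (class 1)
# concentrate extra energy in one band; normal recordings (class 0) do not
X0 = rng.normal(0.0, 1.0, (60, 8))
X1 = rng.normal(0.0, 1.0, (60, 8))
X1[:, 3] += 3.0
X = np.vstack([X0, X1])
y = np.r_[np.zeros(60, int), np.ones(60, int)]

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3,
                                      random_state=0, stratify=y)
clf = make_pipeline(StandardScaler(), SVC(kernel='rbf')).fit(Xtr, ytr)
sens = recall_score(yte, clf.predict(Xte))               # sensitivity
spec = recall_score(yte, clf.predict(Xte), pos_label=0)  # specificity
```

Feature scaling before the RBF SVM matters because spectral-energy features can differ by orders of magnitude across bands.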

A study in [41] used five TSD108 microphones to obtain recordings from 30 volunteers for CAS classification. In total, 870 inspiratory cycles were recorded, of which 485 contained CAS. Four of the sensors were placed on the back while one sensor was placed on the neck. From the 870 cycles, 1494 segments were obtained, 633 of them containing CAS. A feature set based on instantaneous frequency was extracted and an SVM classifier was used. To obtain the optimal SVM parameters, 10-fold cross-validation (CV) was used on 559 of the 870 recorded cycles. The SVM model was then developed using 100 iterations of random 65%-35% splits of the 1494 segments. If at least one segment in a cycle was classified as CAS, the whole cycle was classified as CAS. The best performance obtained was a sensitivity of 94.2% and a specificity of 96.1% at the cycle level.

[41]的一项研究使用5个TSD108麦克风获取了30名志愿者的录音,用于CAS分类。共记录了870个吸气周期,其中485个样本含有CAS。其中四个传感器安装在背部,一个传感器安装在颈部。从870个循环中得到1494个片段,其中633个含有CAS。提取基于瞬时频率的特征集,并采用支持向量机分类器进行分类。为了获得最佳的支持向量机参数,使用了10倍交叉验证(CV),使用了870个记录中的559个循环。然后使用100次迭代65%-35%的随机数据来开发SVM模型,从1494个片段中分离出来。如果一个周期中至少有一段被归类为CAS,则整个周期将被归类为CAS。在循环水平上获得的最佳性能是灵敏度为94.2%,特异性为96.1%。

The study in [38] used SVM to perform classification of recordings using a denoising autoencoder as feature set. The data for the study was recorded using a stethoscope on the neck, anterior, and posterior chest. A total of 227 recordings were obtained, 171 normal, 33 containing wheeze, 19 containing crackle, and 4 containing both wheeze and crackle. The performance achieved was 90% sensitivity with 64% specificity for wheeze and 90% sensitivity with 44% specificity for crackle at the recording level.

[38]中的研究使用SVM对录音进行分类,以去噪自编码器作为特征集。这项研究的数据是用听诊器在颈部、胸部前部和胸部后部记录的。共获得227条录音,其中正常171条,含喘息33条,含裂纹19条,同时含喘息和裂纹4条。在记录水平上,对喘息的灵敏度为90%,特异性为64%,对裂纹的灵敏度为90%,特异性为44%。

The same research group built a custom stethoscope and algorithm in [47] to perform wheeze detection. The detection scheme consisted of processing the spectrogram of the sound recordings, selecting potential wheezes using an energy threshold, and classifying the selected candidates to obtain the final recording-level result. The performance achieved was 86% accuracy at the recording level, taking into account the expected number of false positives.

同一研究小组在[47]中构建了一个定制听诊器和算法来进行喘息检测。所采用的检测方案是:对录音的声谱图进行处理,利用能量阈值选择潜在喘息,再对所选择的潜在喘息进行分类,得到录音级的最终分类结果。考虑到预期的误报数量,在录音水平上达到86%的准确率。

Classification of normal, wheeze, and crackle events was performed in [46] using k-Nearest Neighbour (k-NN) and SVM. A total of 600 events, with 200 normal, 200 wheezes, and 200 crackles, were obtained using fourteen SONY ECM-44 BPT microphones. Leave-one-out cross-validation (LOOCV) was used with energy and wavelet coefficients as features. The best performance, 95.17% average accuracy at the event level, was achieved using the SVM.

在[46]中,使用k-最近邻(k-NN)和SVM对正常、喘息和裂纹事件进行分类。使用14个索尼ECM-44 BPT麦克风共获得600个事件,其中200个正常,200个喘息和200个裂纹。采用以能量和小波系数为特征的留一交叉验证(LOOCV)。采用支持向量机的方法获得了最好的性能。在事件水平上,平均准确率为95.17%。
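The k-NN versus SVM comparison under LOOCV described above can be sketched as follows. The three-class toy data and the feature dimensionality are fabricated stand-ins for the energy and wavelet-coefficient features of [46]:

```python
# Minimal LOOCV comparison sketch; features and labels are synthetic,
# not the energy/wavelet features of the original study.
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Three separable toy classes: 0 = normal, 1 = wheeze, 2 = crackle.
X = rng.normal(size=(60, 4)) + np.repeat(np.arange(3), 20)[:, None]
y = np.repeat(np.arange(3), 20)

loo = LeaveOneOut()
acc_knn = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=loo).mean()
acc_svm = cross_val_score(SVC(), X, y, cv=loo).mean()
```

LOOCV fits the model once per sample, so it is only practical for the small event counts typical of these datasets.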

Differentiating between monophonic and polyphonic wheezes was performed in [59]. The wheezes were recorded using fourteen microphones (SONY ECM-44 BPT) positioned at multiple locations on the chest. A total of 7 recordings containing 121 monophonic and 110 polyphonic wheezes were used for analysis. An SVM was used as the classifier with quartile frequency ratio and mean crossing irregularity as features. The SVM accuracy reported was 69.29%. k-NN and Naive Bayes (NB) classifiers were also used. The best overall accuracy reported was 75.78%, achieved using k-NN.

单音和复音喘息的区分由[59]完成。使用放置在胸部多个位置的14个麦克风(SONY ECM-44 BPT)记录喘息声。共有7个录音包含121个单音和110个复音喘息被用于分析。采用支持向量机作为分类器,以四分位数频率比和平均交叉不规则度为特征。所报道的SVM性能准确率为69.29%。k-NN和朴素贝叶斯(NB)分类器也被使用。使用k-NN,报告的最佳总体准确率为75.78%。

Wheeze detection using Mel Frequency Cepstral Coefficients (MFCC), kurtosis, and entropy as features was developed in [53]. 45 recordings for the analysis were obtained using an accelerometer (BU-3173). Two parallel SVMs were used as classifiers with a final decision made using the product of the outputs of both. 21 recordings were used for training while the rest were used to test the model. 20%-80% data split was used for validation (repeated 20 times). The performance was reported as a reliability measure, which was defined as the true positive rate times the true negative rate. The reliability reported was 97.68%.

使用Mel频率倒谱系数(MFCC)、峰度和熵作为特征的喘息检测在[53]中得到了发展。使用加速度计(BU-3173)获得了45段用于分析的录音。使用两个并行支持向量机作为分类器,并使用两者输出的乘积做出最终决策。其中21条录音用于训练,其余录音用于测试模型。采用20%-80%的数据分割进行验证(重复20次)。该性能以可靠性度量报告,其定义为真阳性率乘以真阴性率。报告的可靠性为97.68%。
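The product-of-outputs decision rule used in [53] can be sketched with two probability-calibrated SVMs whose class probabilities are multiplied before thresholding. The two feature groups, the data, and the decision threshold below are all illustrative assumptions, not the MFCC/kurtosis/entropy features of the study:

```python
# Sketch of a two-parallel-SVM decision by product of outputs.
# Feature groups, data, and the 0.25 threshold are hypothetical.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)

X1 = rng.normal(size=(80, 3))   # e.g. MFCC-like feature group
X2 = rng.normal(size=(80, 2))   # e.g. kurtosis/entropy-like group
y = (X1[:, 0] + X2[:, 0] > 0).astype(int)  # 1 = wheeze present

svm1 = SVC(probability=True, random_state=0).fit(X1, y)
svm2 = SVC(probability=True, random_state=0).fit(X2, y)

# Final decision from the product of the two SVM posterior outputs.
p = svm1.predict_proba(X1)[:, 1] * svm2.predict_proba(X2)[:, 1]
decision = (p > 0.25).astype(int)
```

Multiplying the two outputs acts like an AND-style fusion: both classifiers must assign reasonably high probability before a wheeze is declared.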

Another wheeze and normal sound classifier was developed in [61]. The detection was performed at the segment level, with data obtained from online repositories [125, 126] and the authors' own recordings. The data used contained 130 wheeze segments and 130 normal segments. An SVM was used as the classifier, with audio spectral envelope variation and a tonality index as features. A 10-fold CV was performed, with a reported accuracy of 93%.

另一种喘息和正常音分类器是在[61]中开发的。检测是在段级别进行的,数据来自在线存储库[125,126]和作者自己的录音。使用的数据包含130个喘息段和130个正常段。以音频频谱包络变化和调性指数为特征,采用支持向量机作为分类器。进行了10折交叉验证,准确度为93%。

A C-weighted SVM was used in [58] to perform wheeze detection. Data for the study was obtained from [124], which included 26 recordings. A total of 1188 segments were annotated; 290 of them were wheeze segments. Leave-two-out cross-validation (LTOCV) was used such that one normal and one wheeze segment were used as the test set. MFCC, wavelet packet transform, and Fourier transform features were used and compared. The performance achieved with MFCC features was 81.5 ± 10% sensitivity and 82.6 ± 7% specificity for detecting wheeze segments.

在[58]中使用了C加权支持向量机进行喘息检测。本研究的数据来自[124],包括26段录音。共标注1188段,其中290个是喘息段。留二交叉验证(LTOCV)的使用方式是:各取一个正常片段和一个喘息片段作为测试集。采用了MFCC、小波包变换和傅立叶变换特征并进行了比较。MFCC检测喘息片段的灵敏度为81.5±10%,特异性为82.6±7%。

Crackle and rhonchi classification was presented in [64]. 60 recordings were used for analysis, obtained using a WelchAllyn electronic stethoscope at multiple positions on the back. The frequency ratio, the average and exchange time of the instantaneous frequency, and eigenvalues were used for feature extraction. The feature set was extracted from each frame of the segmented recordings. 5-fold CV was used with an SVM as the classifier. The performance was obtained for each of the features, with one-versus-one and one-versus-all SVM classifiers. The accuracy was above 80% in all cases.

Crackle和rhonchi分类在文献[64]中提出。使用WelchAllyn电子听诊器在背部多个位置获得60段录音用于分析。利用瞬时频率的频率比、平均时间和交换时间以及特征值进行特征提取。从分段录音的每一帧中提取特征集。使用5折交叉验证和SVM作为分类器。分别使用各个特征,以一对一和一对多的SVM分类器获得性能。所有情况下的准确率均在80%以上。
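The one-versus-one versus one-versus-all comparison mentioned above can be sketched with sklearn's multiclass wrappers around a binary SVM. The three-class toy data is a fabricated stand-in for the instantaneous-frequency features of [64]:

```python
# Sketch comparing one-vs-one and one-vs-rest SVM decompositions
# under 5-fold CV; data and features are synthetic placeholders.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(3)

# Toy classes: 0 = normal, 1 = crackle, 2 = rhonchi.
X = rng.normal(size=(90, 3)) + np.repeat(np.arange(3), 30)[:, None]
y = np.repeat(np.arange(3), 30)

acc_ovo = cross_val_score(OneVsOneClassifier(SVC()), X, y, cv=5).mean()
acc_ovr = cross_val_score(OneVsRestClassifier(SVC()), X, y, cv=5).mean()
```

One-versus-one trains a binary SVM per class pair (3 models here), while one-versus-rest trains one per class; for small class counts the two usually perform similarly.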

The work in [65] developed new features to perform CAS classification. The CAS analysed were wheeze, stridor, and rhonchi. Data for the study was obtained from both volunteers and databases. The volunteers' signals were recorded using a SONY ECM-77B microphone positioned on the trachea, while the databases used were from [129, 131, 132]. From the data collection, 339 events were obtained. The data from the databases contained 239 events. A feature set of size 5 was obtained after performing feature selection. The features were extracted based on instantaneous kurtosis, discriminating functions, and sample entropy. LOOCV was used with an SVM classifier, achieving accuracy of 97.7% for the inspiration cycle and 98.8% for the expiration cycle.

[65]中的工作开发了新的特征来执行CAS分类。所分析的CAS包括喘息音、喘鸣音和干啰音。这项研究的数据来自志愿者和数据库。志愿者的信号是通过放置在气管上的索尼ECM-77B麦克风记录的,而使用的数据库来自[129,131,132]。从数据采集中获得了339个事件,数据库中的数据包含239个事件。进行特征选择后,得到一个大小为5的特征集。基于瞬时峰度、判别函数和样本熵提取特征。LOOCV与SVM分类器结合使用,对吸气周期的准确率为97.7%,对呼气周期的准确率为98.8%。

Unlike the other works discussed here, the study in [75] performed classification on the cause of adventitious sounds. The two classes were airway obstruction and parenchymal pathology. The data used for the study was obtained from [124], which contained 68 recordings: 17 normal, 26 with airway obstruction, and 25 with parenchymal pathology. The classification was performed with a 60%-40% train-validation split, repeated 25 times. MFCC were used as features with an SVM classifier, achieving an accuracy of 94.11% for classifying normal recordings, 92.31% for airway obstruction pathology, and 88% for parenchymal pathology.

与本文其他研究不同的是,[75]的研究对不定音的成因进行了分类。分为气道梗阻和实质病理两类。本研究使用的数据来自[124],其中包含68段录音。正常17例,气道梗阻26例,实质病理25例。以60%-40%的训练验证集进行分类,重复25次。将MFCC作为特征与SVM分类器结合使用,对正常记录的分类准确率为94.11%,对气道阻塞病理的分类准确率为92.31%,对实质病理的分类准确率为88%。

An SVM classifier was also used in [76] to perform classification between crackle and normal sounds. Signals were obtained using fourteen SONY ECM-44 BPT microphones positioned on the chest. A total of 6000 segments, 3000 of them crackle sounds, were extracted from 26 different recordings. The data were split evenly for training, testing, and validation of the SVM model. Multilayer Perceptron (MLP) and k-NN methods were also used for the classification. The performance was reported separately for each classifier. The study found that the SVM was superior to the k-NN and MLP, with an overall accuracy of 97.5% and sensitivity of 97.3%.

[76]中也使用SVM分类器对裂纹声和正常声音进行分类。通过放置在胸部的14个索尼ECM-44 BPT麦克风获得信号。从26段不同的录音中提取了6000段,其中3000段是噼啪声。将数据均匀分割,用于SVM模型的训练、测试和验证。多层感知器(MLP)和k-NN方法也被用于分类。每个分类器的性能分别报告。研究发现,SVM优于k-NN和MLP,总体准确率为97.5%,灵敏度为97.3%。

Another work which used a SVM as classifier was [78]. The focus of this study was to perform classification between normal and abnormal breath sounds. A ThinkLab digital stethoscope was used to obtain 28 recordings for the analysis. Out of the 28 recordings, 10 of them were normal, 10 contained wheezes, and 8 had crackles. A cortical model of the recordings was extracted as a feature, and 10-fold CV was performed. The performance achieved was 89.44% for sensitivity and 80.5% for specificity.

另一项使用SVM作为分类器的工作是[78]。本研究的重点是对正常呼吸音和异常呼吸音进行分类。使用ThinkLab数字听诊器获取28段录音进行分析。在28个录音中,10个是正常的,10个有喘息声,8个有噼啪声。提取录音的皮质模型作为特征,并进行10折交叉验证。灵敏度为89.44%,特异性为80.5%。

Artificial Neural Network Variant Methods. An MLP was used in [102] to perform classification of respiratory sounds from 20 healthy volunteers, 18 patients with an obstructive disorder, and 19 patients with a restrictive disorder. A 50%-50% train-test split was used with Auto Regressive (AR) parameters and cepstral coefficients as features. The performance achieved with the cepstral coefficient feature set was a 10-20% average misclassification error on the test set at the event level. Further post-processing was performed to increase the accuracy of the classification at the recording level.

人工神经网络变体方法。[102]中使用MLP对20名健康志愿者、18名阻塞性疾病患者和19名限制性疾病患者的呼吸音进行分类。采用50%-50%的训练-测试集划分,以自回归(AR)参数和倒谱系数为特征。对于倒谱系数特征集,在事件级别的测试集上取得了10-20%的平均误分类误差。进一步进行后处理以提高记录级别分类的准确性。
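A minimal sketch of the MLP setup above, with a 50%-50% train-test split and misclassification error on the held-out half. The three-class toy features stand in for the AR parameters and cepstral coefficients; the network size is an assumption:

```python
# Sketch of an MLP classifier with a 50%-50% split; data, feature
# dimension, and hidden-layer size are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(4)

# Toy classes: 0 = healthy, 1 = obstructive, 2 = restrictive.
X = rng.normal(size=(120, 6)) + np.repeat(np.arange(3), 40)[:, None]
y = np.repeat(np.arange(3), 40)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X_tr, y_tr)
err = 1.0 - mlp.score(X_te, y_te)  # misclassification error on the test half
```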

An MLP classifier was also used in [94] to perform the classification of wheeze and normal events. The data for the classification was obtained from the online repository [124], Ausculta pulmonar, and the IMD 420-C review of lung sounds. A total of 28 recordings with 40 wheeze events and 72 normal events were used to test the MLP classifier. For the MLP training, 40 separate events were used, 20 of them wheeze events. A feature set of size 20 was extracted. The features were obtained from the amplitude and frequency of the 10 largest edges in a pre-processed spectrogram. The spectrogram of each event was pre-processed using a Laplacian mask. The result of the MLP wheeze classifier was an 86.1% sensitivity and an 82.5% specificity.

MLP分类器在[94]中也被用于对喘息和正常事件进行分类。分类数据来自在线资料库[124]、Ausculta pulmonar和IMD 420-C肺音复查。共有28个记录,40个喘息事件和72个正常事件被用来测试MLP分类器。对于MLP训练,使用了40个独立的事件,其中20个是喘息事件。提取了一组大小为20的特征。这些特征是由预处理谱图中10个最大边缘的幅度和频率得到的。每个事件的谱图使用拉普拉斯掩模进行预处理。MLP喘息分类器的灵敏度为86.1%,特异性为82.5%。

The work in [69] also used an MLP to perform the classification of wheeze, crackle, and normal breath sounds. The data was obtained from an online repository [124]. 13 events, 4 containing wheeze, 4 containing crackle, and 5 normal, were used with the LOOCV technique. The features used were 13 MFCCs. The recordings were first windowed and each segment was classified using the MLP. The event classification was then performed based on the segment classification: an event was assigned to a class if most of its segments were classified as that class. The event classification achieved individual accuracies of 100% for wheeze, 75% for crackle, and 80% for normal sounds.

[69]中的工作也使用MLP对喘息、噼啪声和正常呼吸音进行分类。数据来自在线存储库[124]。共13个事件(4个含喘息、4个含噼啪声、5个正常),采用LOOCV技术。所使用的特征是13个MFCC。录音首先被加窗分段,每个片段使用MLP进行分类。在分段分类的基础上进行事件分类:如果一个事件的大部分片段都被归类为某个类别,那么该事件就被归类为该类别。事件分类在喘息声、噼啪声和正常声音上分别达到100%、75%和80%的准确率。

An MLP was also used as a classifier in [45]. The features used were 20 MFCCs. The data used for the study was obtained from an online repository [124], IIT Kharagpur, and the Institute of Pulmocare and Research, Kolkata. 30 recordings containing 72 events were obtained, with 24 of them normal, 24 containing wheezes, and 24 others with crackle events. The LOOCV technique was used, achieving a 97.83% overall accuracy of classification. Other cepstral-based features were also discussed, such as: Linear Prediction Cepstral Coefficient (LPCC), Perceptual Linear Prediction Cepstral Coefficient (PLPCC), Linear Frequency Cepstral Coefficient (LFCC) and Inverted MFCC. These cepstral features were compared with wavelet-based features. The study concluded that cepstral-based features achieved better accuracy than wavelet-based ones.

MLP在[45]中也被用作分类器。所使用的特征是20个MFCC。研究使用的数据来自在线存储库[124]、IIT Kharagpur以及加尔各答的Institute of Pulmocare and Research。获得了包含72个事件的30个录音,其中24个正常,24个包含喘息,另外24个包含噼啪事件。使用LOOCV技术,总体分类准确率达到97.83%。本文还讨论了其他基于倒谱的特征,如线性预测倒谱系数(LPCC)、感知线性预测倒谱系数(PLPCC)、线性频率倒谱系数(LFCC)和倒MFCC。将这些倒谱特征与基于小波的特征进行比较。研究表明,基于倒谱的特征比基于小波的特征具有更好的准确性。

The study in [55] used a Fuzzy Neural Network (FNN) to perform classification of abnormal and normal breath sounds. The normal breath sounds in the study consisted of bronchovesicular, normal bronchial, normal bronchophony, and normal egophony sounds. The abnormal sounds included crackles, wheezes, abnormal bronchial sounds, stridors, bronchophony caused by consolidation, and egophony. The sounds were obtained from the audio CD companions of [129, 130], which contain 28 recordings. The data was split into a 70%-15%-15% train-test-validation set. The features were extracted from the power spectral density of each event. The power spectrum was averaged into 32 frequency ranges, such that the feature vector was of size 32. The performance on the test set was 97.8% sensitivity with 100% specificity for abnormal sound classification.

[55]的研究使用模糊神经网络(Fuzzy Neural Network, FNN)对异常和正常呼吸音进行分类。正常呼吸音包括支气管肺泡音、正常支气管音、正常支气管语音和正常羊鸣音。异常音包括噼啪声、喘息声、异常支气管音、喘鸣声、实变所致的支气管语音和羊鸣音。这些声音来自[129,130]书籍附带的音频CD,其中包含28个录音。将数据分成70%-15%-15%的训练-测试-验证集。从每个事件的功率谱密度中提取特征。将功率谱平均为32个频率范围,得到大小为32的特征向量。对异常声音分类,测试集上的性能为灵敏度97.8%,特异性100%。

A back propagation neural network (BPNN) was used in [107] to perform the classification of abnormal and normal respiratory sounds. Data was recorded using two LS-60 microphones placed on the anterior chest. Additional data from [129, 131] were also obtained. The best performance achieved for abnormal respiratory sound classification at the event level was a sensitivity of 59% and a specificity of 81% for the recorded sounds, and a sensitivity of 87% and a specificity of 95% for the additional CD data. The feature used was the averaged power spectrum.

[107]中使用反向传播神经网络(BPNN)对异常和正常呼吸音进行分类。数据使用放置在前胸的两个LS-60麦克风记录,还从[129,131]中获得了额外数据。在事件水平上对异常呼吸音分类,所获得的最佳性能是:对录制声音的灵敏度为59%、特异性为81%;对CD附加数据的灵敏度为87%、特异性为95%。所使用的特征是平均功率谱。

The study in [104] used BPNN to perform segment classification of crackle and noncrackle. Data was recorded using 25 microphones placed on the posterior chest of 10 healthy volunteers and 19 patients. 912 segments, of which 456 were normal and 456 were abnormal, were used to train the BPNN. 114 segments were used for validation while another separate 114 segments were used as a test set. A multi-variate AR model was used as a feature, achieving 80.7% sensitivity and 84.21% specificity, at the segment level on the validation set.

[104]的研究使用BPNN对裂纹和非裂纹片段进行分类。研究人员将25个麦克风放置在10名健康志愿者和19名患者的后胸部以记录数据。使用912个片段进行BPNN的训练,其中456个为正常片段,456个为异常片段。114个片段用于验证,而另外单独的114个片段用作测试集。使用多变量AR模型作为特征,在验证集的片段水平上实现了80.7%的灵敏度和84.21%的特异性。

A BPNN was also used in [49] to perform recording classification. The study used 58 recordings, 32 of them containing wheezes, obtained using an ECM microphone. 13 wheeze and 10 normal recordings were used for training, while the rest were used to test the neural network. Before applying the BPNN, potential wheeze episodes were first selected from the recordings by using the Order Truncate Average (OTA) method to preserve peaks. The peaks were further processed using a threshold to obtain potential wheezes, which were then classified using the BPNN. The features used were the duration, frequency range, boundary, normalised power spectra, and slope of the potential wheeze. The performance claimed by the study was high, with a sensitivity of 94.6% and a specificity of 100% for wheeze recording classification.

[49]也使用BPNN进行记录分类。该研究使用了58段录音,其中32段包含喘息,使用ECM麦克风获得。13个喘息录音和10个正常录音用于训练,其余录音用于测试神经网络。在使用BPNN之前,首先使用顺序截断平均(OTA)方法从录音中选择潜在的喘息事件以保留峰值。使用阈值对峰值进行进一步处理以获得潜在的喘息。然后使用BPNN对这些潜在的喘息进行分类。使用的特征是潜在喘息的持续时间、频率范围、边界、归一化功率谱和斜率。该研究声称的性能很高,对喘息记录分类的灵敏度为94.6%,特异性为100%。

Extreme Learning Machine (ELM) was used to perform classification between abnormal and normal sounds in [68]. The abnormal sounds analysed included wheeze, crackle, and squawk sounds. The data was taken using a microphone placed on the trachea. A total of 30 recordings were obtained, from which 120 cycles were annotated. A 5-fold CV technique was used for the classifier. The feature vector for the classification consisted of lacunarity, sample entropy, kurtosis, and skewness of the event power spectrum. An SVM classifier was also discussed in this study. With the whole feature set, the ELM classifier achieved 86.30% sensitivity and 86.90% specificity, while the SVM achieved 86.30% sensitivity and 85.80% specificity.

[68]中使用极限学习机(Extreme Learning Machine, ELM)对异常音和正常音进行分类。分析的异常声音包括喘息声、噼啪声和吱吱声。数据是通过放置在气管上的麦克风获取的。共获得30条录音,其中标注了120个周期。分类器采用5折交叉验证技术。分类的特征向量由事件功率谱的空隙度(lacunarity)、样本熵、峰度和偏度组成。本文还对支持向量机分类器进行了讨论。当使用全部特征集时,ELM分类器的灵敏度为86.30%,特异性为86.90%;SVM分类器在使用所有特征时,灵敏度为86.30%,特异性为85.80%。

The work in [72] performed an analysis of wheeze and crackle using signals from patients with tuberculosis. The recordings for the analysis were taken using 7 microphones positioned on the neck, chest, and back. Signals from 60 volunteers were obtained. An Artificial Neural Network (ANN) was used with 75% of the data for training and 25% to test the model. The classification performed was to check if a recording was from a patient with tuberculosis or a normal one. The presence of a wheeze was detected by evaluating the spectrogram while crackles were identified using wavelet-based features for the ANN. The performance obtained was a sensitivity of 80% with a specificity of 67% in detecting tuberculosis.

[72]中的工作使用结核病患者的信号对喘息和噼啪声进行了分析。用于分析的录音是通过放置在颈部、胸部和背部的7个麦克风采集的。获得了60名志愿者的信号。使用人工神经网络(ANN),75%的数据用于训练,25%的数据用于测试模型。所进行的分类是判断一段录音来自结核病患者还是正常人。通过评估声谱图来检测喘息的存在,而噼啪声则通过输入ANN的基于小波的特征来识别。检测结核的灵敏度为80%,特异性为67%。

An ANN was used in [71] to perform event classification of respiratory sounds containing wheezes and crackles. Data was obtained from [129]. A total of 92 events with 27 normal, 31 crackles, and 34 wheezes were obtained. 60 events were used for training, 14 events were used for validation, and 18 events were used for the test set. A wavelet packet transform was used as the feature set, achieving a 98.89% best average accuracy for Symlet-10 wavelet base on the test set.

[71]中使用人工神经网络对包含喘息和噼啪声的呼吸音进行事件分类。数据来源于[129]。共获得92个事件,27个正常,31个裂纹,34个喘息。60个事件用于训练,14个事件用于验证,18个事件用于测试集。使用小波包变换作为特征集,在测试集上,Symlet-10小波基的最佳平均准确率达到98.89%。

Multiple variants of ANNs were used and compared in [91]. The classification task performed was to differentiate wheeze, crackle, stridor, squawk, pleural rub, and other types of sounds using a MLP, a Grow and Learn network (GAL), and an Incremental Supervised Neural Network (ISNN). A total of 360 events from 36 recordings were obtained. An averaged power spectrum was used as a feature, achieving a best accuracy of 98% for the ISNN classifier on a test set of 180 events.

[91]中使用并比较了多种ANN变体。所执行的分类任务是使用MLP、Grow and Learn网络(GAL)和增量监督神经网络(ISNN)来区分喘息、爆裂声、喘鸣声、嘎嘎声、胸膜摩擦音和其他类型的声音。从36个记录中总共获得了360个事件。使用平均功率谱作为特征,在180个事件的测试集上实现了ISNN分类器98%的最佳精度。

The study in [110] used a Learning Vector Quantisation (LVQ) to detect wheeze and crackle segments. The feature used was a wavelet packet decomposition. Signals recorded from the chest of four healthy volunteers and nine patients were used for the analysis. A 50%-50% train-test data split was used. This study reported a performance of 59% for sensitivity and 24% for PPV for wheeze detection. For fine crackle detection, only 19% for sensitivity and 6% for PPV was achieved while 58% for sensitivity and 18% for PPV was the reported performance for coarse crackle detection.

[110]中的研究使用了学习向量量化(LVQ)来检测喘息和裂纹段。所使用的特征是小波包分解。从4名健康志愿者和9名患者的胸部记录的信号被用于分析。采用50%-50%训练-测试数据分割。该研究报告喘息检测的灵敏度为59%,PPV为24%。对于细裂纹检测,仅实现了19%的灵敏度和6%的PPV;而粗裂纹检测的灵敏度为58%,PPV为18%。

Wheeze segment classification using several ANN variants was performed in [111]. The ANN variants used were BPNN, Radial Basis Function (RBF), Self-organising Map (SOM), and LVQ. A total of 710 segments, with 375 containing wheezes, were used for the classification. The data was split into three sets, one training and two test sets. The training set consisted of 242 segments where 128 of them contained wheezes. The first test set consisted of 233 segments with 107 wheeze segments, while the second test set had 235 segments with 140 wheeze segments. The feature used for the neural networks was extracted from the power spectrum of the segments. Highest overall accuracy of 93% on the first and 96% on the second test sets were achieved using LVQ.

在[111]中使用几个人工神经网络变体进行了喘息段分类。使用的ANN变体有BPNN、径向基函数(RBF)、自组织映射(SOM)和LVQ。共有710个片段,其中375个包含喘息,用于分类。数据被分成三组,一个训练集和两个测试集。训练集由242个片段组成,其中128个片段包含喘息声。第一个测试集由233个片段组成,其中有107个喘息片段;第二个测试集有235个片段,其中有140个喘息片段。神经网络所使用的特征是从片段的功率谱中提取出来的。使用LVQ实现了第一个测试集的93%和第二个测试集的96%的最高总体准确率。

Gaussian Mixture Model Based Methods. MFCC coupled with a Gaussian Mixture Model (GMM) were used in [99] to perform classification of wheeze and normal sounds. Data for the study was taken from 30 volunteers. The instrumentation used to record respiratory sounds was an ECM microphone and a 3M Littmann Classic S.E. stethoscope placed on the neck. The study reported an accuracy of 94.9% for detection at the segment level. This approach of using MFCC with a GMM was also taken in [40] for wheeze detection, by a different group. The data for analysis was recorded from 18 volunteers, nine of them asthmatic. 88.1% sensitivity and 99.5% specificity were reported.

基于高斯混合模型的方法。[99]中使用了MFCC与高斯混合模型(GMM)相结合来对喘息声和正常声音进行分类。这项研究的数据来自30名志愿者。用于记录呼吸声音的仪器是ECM麦克风和放置在颈部的3M Littmann Classic S.E.听诊器。该研究报告了在片段水平检测的准确率为94.9%。在[40]中,另一组也采用了MFCC与GMM相结合的方法进行喘息检测。用于分析的数据来自18名志愿者,其中9人患有哮喘。敏感度为88.1%,特异度为99.5%。
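A common way to use GMMs for this kind of classification, sketched below under stated assumptions, is to fit one mixture per class on MFCC-like vectors and label a test frame by whichever class model gives the higher log-likelihood. The Gaussian toy data here is a placeholder for real MFCC vectors, and the two-component mixtures are an arbitrary choice:

```python
# Per-class GMM likelihood classifier sketch; data is synthetic
# 13-dimensional "MFCC-like" vectors, not real lung sounds.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)

X_normal = rng.normal(0.0, 1.0, size=(100, 13))
X_wheeze = rng.normal(1.5, 1.0, size=(100, 13))

gmm_n = GaussianMixture(n_components=2, random_state=0).fit(X_normal)
gmm_w = GaussianMixture(n_components=2, random_state=0).fit(X_wheeze)

# Classify held-back samples by the higher per-class log-likelihood.
X_test = np.vstack([X_normal[:10], X_wheeze[:10]])
pred = (gmm_w.score_samples(X_test) > gmm_n.score_samples(X_test)).astype(int)
```

With class priors included this is a Bayes decision rule; the sketch omits priors, which is equivalent to assuming the classes are equally likely.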

A GMM was also used in [90] to perform wheeze segment detection. A total of 24 recordings, with 12 wheezing and 12 normal recordings were obtained from [124] and the ASTRA database CD. The recordings were segmented. 985 wheeze and 1822 normal segments were obtained. Several feature sets were extracted for the classification. The feature sets extracted were based on the Fourier transform, LPC, wavelet transform, and MFCC. The use of an ANN and Vector Quantisation (VQ) as detection techniques was also discussed. The LOOCV technique was used, achieving a sensitivity of 94.6% and a 91.9% specificity, when MFCC was used as a feature with GMM clustering.

在[90]中也使用了GMM来执行喘息段检测。从[124]和ASTRA数据库CD中共获得24段录音,其中12段喘息和12段正常录音。录音被分割。共获得985个喘息段和1822个正常段。提取了多个特征集用于分类。提取的特征集基于傅里叶变换、LPC、小波变换和MFCC。本文还讨论了人工神经网络和矢量量化(VQ)作为检测技术的应用。使用LOOCV技术,当MFCC作为GMM聚类的特征时,灵敏度为94.6%,特异性为91.9%。

Another implementation, using a GMM with MFCC as features, was presented in [87]. The clustering was performed to separate crackle, wheeze, and stridor sounds. The sound recordings were obtained from an online repository [124]. LOOCV was used with 13 MFCC features. The performance was reported individually as a measure of accuracy of the CV result. The accuracy obtained was 46.1% for the normal data, 98% for crackle, 50% for asthma, and 26.9% for wheeze.

[87]中提出了另一种实现,使用以MFCC为特征的GMM。进行聚类以区分噼啪声、喘息声和喘鸣声。录音资料来自在线存储库[124]。LOOCV与13个MFCC特征结合使用。作为对CV结果准确性的测量,分别报告了各类的表现。正常数据的准确率为46.1%,噼啪声为98%,哮喘为50%,喘息为26.9%。

GMM was also used in [51] to separate crackle from normal recordings. 41 recordings, 14 of them containing crackle sounds, were used for classification. Spectral-based features were used for the clustering. The performance claimed was 92.85% sensitivity with 100% specificity.

在[51]中也使用GMM来区分含爆裂音的录音和正常录音。41段录音用于分类,其中14段含有爆裂声。基于频谱的特征被用于聚类。据称灵敏度为92.85%,特异度为100%。

The study in [57] compared the performance of a GMM and an SVM for the classification of normal and abnormal recordings. An AR model was used as the feature set, with LOOCV. The data used was 40 recordings obtained with fourteen SONY ECM-44 BPT microphones placed on the posterior chest. A best total accuracy of 90% was achieved using the GMM.

[57]的研究比较了GMM和SVM在正常和异常记录分类方面的性能。使用LOOCV将AR模型作为特征集。使用的数据是由14个索尼ECM-44 BPT麦克风获得的40次录音,放置在胸部后侧。使用GMM获得的最佳总准确度为90%。

A clustering-based classifier similar to a GMM was used in [113] to perform the classification of events based on the underlying pathology. A total of 147 sound events were obtained from [136]. The types of sound observed included normal sounds from varying positions and the sounds of an asthma patient. LPC was used as feature vector for the classification. 42 events were used to find parameters for the clustering-based classifier, based on minimum distance metric; while 105 events were used to test the obtained model. An overall accuracy of 95.24% was achieved as only 5 events were misclassified.

[113]中使用了类似于GMM的基于聚类的分类器,根据潜在病理对事件进行分类。从[136]中共获得147个声事件。观察到的声音类型包括来自不同位置的正常声音和哮喘患者的声音。采用LPC作为特征向量进行分类。基于最小距离度量,利用42个事件寻找基于聚类的分类器参数;而105个事件被用来测试得到的模型。总体准确率为95.24%,只有5个事件被错误分类。

Random Forest Based Methods. A Random Forest (RF) was used in [54] to perform wheeze detection. The dataset used was obtained using a Littmann 3M 4000 Electronic Stethoscope at multiple positions on the chest and back of the patient. The signals were obtained from 12 volunteers, and consisted of a total of 24 recordings, in which 113 wheeze events were annotated. The features used for detection were musical features and the spectrogram signature of wheezes, which included peak selection. The potential wheezes were classified using a RF with the 10-fold CV technique. The performance achieved was 90.9% ± 2% sensitivity and 99.4% ± 1% specificity for the RF wheeze detector. An LRM was also used in the study with the same feature set, achieving 82.7% ± 2% sensitivity and 98.1% ± 1% specificity.

基于随机森林的方法。在[54]中使用随机森林(Random Forest, RF)进行喘息检测。使用的数据集是在患者胸部和背部的多个位置使用Littmann 3M 4000电子听诊器获得的。这些信号来自12名志愿者,共包括24段录音,其中标注了113个喘息事件。用于检测的特征是音乐特征和喘息的声谱图特征,其中包括峰值选择。使用RF与10折交叉验证技术对潜在的喘息进行分类。RF喘息检测器的灵敏度为90.9%±2%,特异度为99.4%±1%。研究中还使用了相同特征集的LRM,其灵敏度为82.7%±2%,特异性为98.1%±1%。
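The RF-with-10-fold-CV setup above can be sketched in a few lines; the data and feature dimensionality are synthetic placeholders for the musical and spectrogram-signature features of [54]:

```python
# Random forest wheeze-detector sketch under 10-fold CV;
# features and labels are fabricated for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)

X = rng.normal(size=(200, 5))                    # toy feature vectors
y = (X[:, 0] + X[:, 1] > 0).astype(int)          # 1 = potential wheeze is real

rf = RandomForestClassifier(n_estimators=100, random_state=0)
acc = cross_val_score(rf, X, y, cv=10).mean()
```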

k-Nearest Neighbour Based Methods. The work in [96] used a k-NN method and achieved a 92% sensitivity and a 100% specificity on a test set at the recording level. Classification was performed to differentiate between pathological and normal recordings. Sounds from 65 volunteers recorded using a SONY ECM-44 microphone placed on the posterior chest were used. Data from 40 volunteers was then used as a training set with the LOOCV technique, and the rest was used for test set. AR coefficients were used as a feature set.

基于k近邻的方法。[96]中的工作使用k-NN方法,在记录水平的测试集上实现了92%的灵敏度和100%的特异性。进行分类以区分病理和正常记录。65名志愿者的声音是用索尼ECM-44麦克风录制的,麦克风放置在胸部后部。然后将40名志愿者的数据用作LOOCV技术的训练集,其余的用作测试集。使用AR系数作为特征集。
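The AR-coefficients-as-features idea used in [96] can be sketched with a Yule-Walker estimate feeding a k-NN classifier. The signals below are synthetic sinusoids standing in for lung sounds, and the AR order of 4 is an arbitrary choice:

```python
# Sketch: Yule-Walker AR coefficients as features for k-NN.
# Signals, AR order, and classes are illustrative assumptions.
import numpy as np
from scipy.linalg import solve_toeplitz
from sklearn.neighbors import KNeighborsClassifier

def ar_coeffs(x, order=4):
    """Yule-Walker AR coefficients of a 1-D signal."""
    x = x - x.mean()
    # Autocorrelation lags r_0 .. r_order.
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order] / len(x)
    # Solve the symmetric Toeplitz system R a = [r_1 .. r_order].
    return solve_toeplitz(r[:-1], r[1:])

rng = np.random.default_rng(7)
sigs, labels = [], []
for k in range(40):
    t = np.arange(512)
    f = 0.05 if k < 20 else 0.15        # two spectrally distinct toy classes
    sigs.append(np.sin(2 * np.pi * f * t) + 0.3 * rng.normal(size=512))
    labels.append(int(k >= 20))

X = np.array([ar_coeffs(s) for s in sigs])
knn = KNeighborsClassifier(n_neighbors=3).fit(X, labels)
acc = knn.score(X, labels)              # training accuracy, for illustration
```

AR coefficients compactly summarise a signal's spectral envelope, which is why they recur as features throughout the studies reviewed here.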

The study in [44] used higher order statistics to perform the classification of vesicular, fine and coarse crackle, and monophonic and polyphonic wheeze sounds. The classifier used was a combination of k-NN and NB. The k-NN classifier was used to separate normal, crackle, and wheeze sounds, while two separate NB classifiers were used to further separate fine from coarse crackle and monophonic from polyphonic wheeze. A total of 219 events, with 71 normal, 39 each for fine and coarse crackles, and 35 each for monophonic and polyphonic wheezes, were used for training. The test was performed using 99 separate events containing 31 normal, 18 each for fine and coarse crackles, and 16 each for monophonic and polyphonic wheezes. 2nd, 3rd, and 4th order cumulants were extracted for each segment and used as features for the classification, with a total of 800 features per segment. Feature selection was performed using a Genetic Algorithm (GA), which was found to perform better than Fisher's Discriminant Ratio (FDR). The classification accuracy obtained was 94.4 ± 1.5% for vesicular sounds, 91.9 ± 2.8% for fine crackles, 90.8 ± 3.2% for coarse crackles, 91.9 ± 2.3% for monophonic wheezes, and 90.3 ± 3.3% for polyphonic wheezes.

[44]的研究使用高阶统计量对肺泡音、细与粗爆裂音、单音与复音喘息声进行了分类。使用的分类器是k-NN和NB的组合。k-NN分类器用于分离正常、爆裂和喘息声,而两个独立的NB分类器用于进一步区分细与粗爆裂音以及单音与复音喘息。共219个事件用于训练,其中正常71个,细、粗爆裂音各39个,单音和复音喘息各35个。测试使用另外99个事件进行,其中包含31个正常、细与粗爆裂音各18个、单音与复音喘息各16个。提取每个片段的二、三、四阶累积量作为分类特征,每个片段共提取800个特征。使用遗传算法(GA)进行特征选择,发现其性能优于Fisher判别比(FDR)。肺泡音的分类准确率为94.4±1.5%,细爆裂音为91.9±2.8%,粗爆裂音为90.8±3.2%,单音喘息为91.9±2.3%,复音喘息为90.3±3.3%。

Adventitious sound classification was performed using a k-NN classifier in [86]. A total of 585 events, with 264 of them normal, 132 polyphonic wheeze, 93 monophonic wheeze, and 96 stridor events were used for the classification. The recordings were obtained using one SONY ECM-77B microphone. Databases [129, 131, 132] were also used for sounds. LOOCV was used with features extracted based on temporal spectral dominance spectrogram. The performance achieved was 92.4 ± 2.9% overall accuracy.

在[86]中使用k-NN分类器进行了不定音分类。共585个事件,其中正常事件264个,复音喘息事件132个,单音喘息事件93个,喘鸣事件96个。录音是使用一个索尼ECM-77B麦克风获得的。数据库[129,131,132]也用于声音。LOOCV与基于时间谱优势谱图提取的特征结合使用。总体准确率为92.4±2.9%。

The same research group as above used a new classification approach, similar to the k-NN method, called Empirical Classification in [85]. The classifier performed similarly to k-NN, but instead of just checking local similarity by measuring distance, global similarity was checked based on the variance difference. The feature used for the study was a multi-scale PCA. The classification was performed on data obtained using one SONY ECM-77B microphone placed on the neck. More data were also included from several audio CD companions of books [129, 131, 132]. A total of 689 events, including 130 normal, 413 CAS, and 146 DAS events, were obtained. The accuracy achieved was 97.3 ± 2.7% for classification between normal and CAS, and 98.34% between normal and the combination of CAS and DAS.

与上述相同的研究小组使用了一种类似于k-NN方法的新分类方法,称为经验分类[85]。该分类器的性能与k-NN类似,但不是仅仅通过测量距离来检查局部相似性,而是基于方差差异来检查全局相似性。用于研究的特征是多尺度PCA。对通过放置在颈部的一个SONY ECM-77B麦克风获得的数据进行分类。更多的数据也包含在一些书籍附带的音频CD中[129,131,132]。共获得689个事件,包括130个正常事件、413个CAS事件和146个DAS事件。在正常和CAS之间的分类准确率为97.3±2.7%,在正常与CAS和DAS组合之间的分类准确率为98.34%。

Classification of recordings based on the underlying pathology was performed in [109]. Signals were recorded using two microphones on multiple positions on the chests of 69 volunteers. 28 of the volunteers were obstructive airway disease patients, while 23 of them had restrictive airway disease. At the segment level, LOOCV using a k-NN classifier with an AR model as a feature was performed. A multinomial classifier was employed on the result from each segment to determine the pathology of corresponding respiratory events. The final recording classification was then obtained from voting results of each event. The study achieved an overall accuracy of 71.07% at classifying recordings based on the disease.

文献[109]根据基础病理对录音进行分类。研究人员在69名志愿者胸前的多个位置安装了两个麦克风,记录下了这些信号。其中28名志愿者患有阻塞性气道疾病,23名患有限制性气道疾病。在片段级别,使用以AR模型为特征的k-NN分类器执行LOOCV。对每个片段的结果采用多项分类器来确定相应呼吸事件的病理。然后根据每个事件的投票结果得出最终的记录分类。该研究在基于疾病的记录分类方面达到了71.07%的总体准确率。

Hidden Markov Model Based Methods. Hidden Markov Models (HMM) were mainly used by studies from the same research group, as in [43, 52, 60, 79, 93]. The work in [93] used a HMM to perform the classification of abnormal and normal breath sounds. The data used was obtained from 162 volunteers, where 109 of them were patients with emphysema pulmonum. The data was segmented into a total of 1544 events, where 554 of them corresponded to abnormal sounds. The data was recorded using either a condenser or a piezoelectric microphone. LOOCV was carried out and the performance achieved was 93.2% for sensitivity at a 64.8% specificity.

基于隐马尔可夫模型的方法。隐马尔可夫模型(Hidden Markov Models, HMM)主要由同一研究组的研究使用,如[43,52,60,79,93]。[93]中的工作使用HMM对异常和正常呼吸音进行分类。所使用的数据来自162名志愿者,其中109名是肺气肿患者。这些数据被分割成总共1544个事件,其中554个对应异常声音。数据是用电容式或压电式麦克风记录的。进行LOOCV,灵敏度为93.2%,特异度为64.8%。
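The core of HMM-based classification is computing the likelihood of an observation sequence under a model per class and picking the larger one. The sketch below implements the discrete forward algorithm in plain numpy on a tiny hand-made pair of two-state models; the transition and emission matrices and the quantised symbols are entirely hypothetical, not parameters from these studies:

```python
# Forward-algorithm sketch for two-class HMM scoring.
# All model parameters and the observation sequence are toy values.
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward recursion."""
    alpha = pi * B[:, obs[0]]
    logp = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict, then weight by emission
        logp += np.log(alpha.sum())
        alpha = alpha / alpha.sum()     # rescale to avoid underflow
    return logp

pi = np.array([0.5, 0.5])                       # initial state probabilities
A = np.array([[0.9, 0.1], [0.1, 0.9]])          # state transitions
B_normal = np.array([[0.8, 0.2], [0.6, 0.4]])   # emissions: mostly symbol 0
B_abnormal = np.array([[0.2, 0.8], [0.4, 0.6]]) # emissions: mostly symbol 1

obs = np.array([1, 1, 0, 1, 1, 1])              # quantised feature symbols
label = int(forward_loglik(obs, pi, A, B_abnormal)
            > forward_loglik(obs, pi, A, B_normal))  # 1 = abnormal wins
```

In practice the per-class models would be trained with Baum-Welch on continuous (e.g. MFCC) observations, but the decision rule is the same likelihood comparison.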

The classification of abnormal respiratory sounds was also the focus in [79], but a new feature was added to improve performance. The duration distribution of noise and abnormal respiratory sounds was used to reduce false alarms caused by noise. The performance achieved by using LOOCV was 88.7% sensitivity and 91.5% specificity for the classification of abnormal versus normal events. Classification of recordings as normal or abnormal was also performed, achieving an 87% sensitivity and an 81% specificity at recognising abnormal recordings.

异常呼吸音的分类也是[79]的重点,但增加了一个新特征以提高性能。利用噪声和异常呼吸音的持续时间分布来减少噪声引起的误报。使用LOOCV对异常与正常事件进行分类,灵敏度为88.7%,特异性为91.5%。研究还对录音进行了正常或异常分类,在识别异常录音方面达到87%的灵敏度和81%的特异性。

MFCC were used as features in [43, 52, 60]. An electronic stethoscope was used to obtain data for the analysis. The correlation score with other auscultation points and segments was used as an additional feature to enhance the performance of the HMM in [52]; while [60] used a HMM, which could automatically adapt to different patients by including high-confident previously classified segments to retrain the model. Best sensitivity of 91.10% and specificity of 93.43%, using 8 auscultation points, at the event level, were achieved in [52]; while 89.4% sensitivity and 80.9% specificity, at the event level, were achieved in [60]. The study in [43] combined the timing of occurrence and joint probability of different segments as additional features, achieving a best accuracy of 82.82% at the segment level.

在[43,52,60]中使用MFCC作为特征。使用电子听诊器获取数据进行分析。在[52]中,与其他听诊点和听诊段的相关评分被用作增强HMM性能的附加特征;而[60]使用HMM,它可以通过包含高置信度的先前分类片段来重新训练模型,从而自动适应不同的患者。在事件水平上,使用8个听诊点的最佳灵敏度为91.10%,特异性为93.43% [52];而在事件水平上,[60]的敏感性为89.4%,特异性为80.9%。[43]的研究将不同片段的发生时间和联合概率作为附加特征,在片段水平上获得了82.82%的最佳准确率。
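
MFCC extraction reduces each frame to a compact description of its spectral envelope. A from-scratch sketch for a single frame is shown below; the filter-bank size, coefficient count, and window are arbitrary assumptions, and production code would normally delegate this to an audio library:

```python
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frame, sr, n_filters=20, n_coeffs=13):
    """Minimal MFCC of one frame: power spectrum -> mel filter bank -> log -> DCT."""
    spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    # Triangular filters spaced evenly on the mel scale.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2))
    energies = np.empty(n_filters)
    for i in range(n_filters):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = np.clip((freqs - lo) / (mid - lo), 0.0, 1.0)
        falling = np.clip((hi - freqs) / (hi - mid), 0.0, 1.0)
        energies[i] = np.sum(spectrum * np.minimum(rising, falling))
    return dct(np.log(energies + 1e-10), norm="ortho")[:n_coeffs]

frame = np.sin(2 * np.pi * 440 * np.arange(512) / 8000)  # one synthetic frame
coeffs = mfcc(frame, sr=8000)
print(coeffs.shape)
```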

Logistic Regression Model Based Methods. A LRM was used in [67] to perform crackle detection. Two recordings, obtained from [124], were used in the study. LOOCV was used as a validation method. The performance, reported as the Matthews Correlation Coefficient (MCC), was 80%. The detection was performed using wavelet, entropy, empirical mode decomposition, Teager energy, and fractal dimension as features. The same group then again employed a LRM to perform crackle detection, but using different sets of features [42]. 10-fold CV was performed on 40 recordings obtained from 20 volunteers using a Littmann 3M 3200 stethoscope. The data contained 400 crackle events. The addition of musical features to the feature set resulted in 76 ± 23% sensitivity and 77 ± 22% PPV, at the segment level.

基于逻辑回归模型的方法。[67]使用LRM进行爆裂音(crackle)检测。研究使用了来自[124]的两段录音,并以LOOCV作为验证方法。以马修斯相关系数(MCC)衡量的性能为80%。检测采用小波、熵、经验模态分解、Teager能量和分形维数作为特征。随后,同一研究组再次使用LRM进行爆裂音检测,但使用了不同的特征集[42]。研究使用Littmann 3M 3200听诊器采集了20名志愿者的40段录音,并进行10折交叉验证。数据包含400个爆裂音事件。将音乐特征添加到特征集中后,在片段水平上灵敏度为76±23%,PPV为77±22%。
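
A toy version of the LRM-plus-MCC evaluation can make the pipeline concrete. Only the classifier (logistic regression), the validation scheme (LOOCV), and the metric (MCC) mirror [67]; the synthetic "crackle" transient and the two features below are assumptions chosen so that impulsive segments stand out:

```python
import numpy as np
from scipy.stats import kurtosis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import matthews_corrcoef

rng = np.random.default_rng(1)

def segment(with_crackle):
    """Synthetic 100-sample segment; a 'crackle' is a brief impulsive burst."""
    x = rng.normal(0.0, 1.0, 100)
    if with_crackle:
        x[40:44] += np.array([6.0, -5.0, 4.0, -3.0])
    return x

segments = [segment(i % 2 == 0) for i in range(40)]
y = np.array([i % 2 == 0 for i in range(40)], dtype=int)
# Kurtosis and peak amplitude both respond strongly to impulsive transients.
X = np.array([[kurtosis(s), np.max(np.abs(s))] for s in segments])

pred = cross_val_predict(LogisticRegression(), X, y, cv=LeaveOneOut())
mcc = matthews_corrcoef(y, pred)
print(f"LOOCV MCC: {mcc:.2f}")
```

MCC is preferred over raw accuracy here because it stays informative even when the two classes are imbalanced.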

Discriminant Analysis Based Methods. A discriminant function was used as a crackle event classification method in [105]. The classification was performed to separate coarse and fine crackles. Recordings from 2 volunteers, with 238 coarse and 153 fine crackles, were used in the analysis. Features were extracted using a wavelet network. The classification model was tested on 158 coarse and 73 fine crackles, and achieved an accuracy of 70% and 84% respectively.

基于判别分析的方法。[105]使用判别函数作为爆裂音事件的分类方法,用于区分粗爆裂音和细爆裂音。分析使用了来自2名志愿者的录音,其中包含238个粗爆裂音和153个细爆裂音。特征通过小波网络提取。该分类模型在158个粗爆裂音和73个细爆裂音上进行了测试,准确率分别为70%和84%。

Fisher Discriminant Analysis (FDA) was used as a wheeze and normal sound classifier in [89]. Data from 7 volunteers were recorded using fourteen SONY ECM-44 BPT microphones positioned on the chest. The data used for classification were extracted from the recordings in the form of 246 wheeze and 246 normal segments. The feature set comprised kurtosis, Rényi entropy, frequency power ratio, and mean crossing irregularity. The performance reported in the study was a 93.5% accuracy.

Fisher判别分析(Fisher Discriminant Analysis, FDA)在[89]中被用作喘息音与正常音的分类器。7名志愿者的数据由放置在胸部的14个SONY ECM-44 BPT麦克风记录。用于分类的数据以246个喘息片段和246个正常片段的形式从录音中提取。特征集包括峰度、Rényi熵、频率功率比和平均过零不规则度。该研究报告的准确率为93.5%。
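
Two of the features named above can be computed directly from the spectrum, which is enough for a small discriminant-analysis sketch. Here a wheeze is modelled as a narrow tonal component in noise; the sampling rate, segment length, wheeze model, and hold-out split are assumptions, and kurtosis and mean crossing irregularity are omitted for brevity:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
sr = 4000

def segment(wheeze):
    """Noise segment, with a narrow tonal component added for the wheeze class."""
    t = np.arange(1024) / sr
    noise = rng.normal(0.0, 1.0, 1024)
    return noise + (3.0 * np.sin(2 * np.pi * 400 * t) if wheeze else 0.0)

def features(x):
    psd = np.abs(np.fft.rfft(x)) ** 2
    p = psd / psd.sum()
    renyi = -np.log(np.sum(p ** 2))            # order-2 Renyi entropy of the spectrum
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    band = (freqs > 100) & (freqs < 1000)
    power_ratio = psd[band].sum() / psd.sum()  # fraction of power in the wheeze band
    return [renyi, power_ratio]

X = np.array([features(segment(i % 2 == 0)) for i in range(60)])
y = np.array([i % 2 == 0 for i in range(60)], dtype=int)

clf = LinearDiscriminantAnalysis().fit(X[:40], y[:40])  # train on the first 40 segments
acc = clf.score(X[40:], y[40:])                          # evaluate on the held-out 20
print(f"hold-out accuracy: {acc:.2f}")
```

A tonal wheeze concentrates the spectrum into a few bins, lowering the Rényi entropy and raising the in-band power ratio, which is what makes the two classes linearly separable.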

A study in [92] performed the classification of squawks and crackles using discriminant analysis. Lacunarity was used as a feature to detect squawk and crackle data obtained from audio CD book companions [129, 132, 134, 135]. The data used comprised 25 recordings with 136 fine crackle, 93 coarse crackle, and 133 squawk events. The data was separated into 75%-25% train-test sets and the process repeated 200 times. The maximum mean accuracy achieved at the segment level was 99.75%.

[92]中的一项研究使用判别分析对短啸音(squawk)和爆裂音进行了分类。该研究以空隙度(lacunarity)作为特征,检测从有声CD配套读物[129,132,134,135]中获得的短啸音和爆裂音数据。所用数据为25段录音,包含136个细爆裂音、93个粗爆裂音和133个短啸音事件。数据按75%-25%划分为训练集和测试集,该过程重复200次。在片段水平上达到的最高平均准确率为99.75%。
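
Lacunarity quantifies the "gappiness" of a signal: with the gliding-box method it is the second moment of the box mass divided by the squared first moment, so an impulsive, crackle-like signal scores much higher than a homogeneous one. A minimal 1-D sketch (box size and the toy signals are assumptions):

```python
import numpy as np

def lacunarity(x, box_size):
    """Gliding-box lacunarity of a non-negative 1-D sequence:
    E[M^2] / E[M]^2, where M is the mass inside each sliding box."""
    masses = np.convolve(x, np.ones(box_size), mode="valid")
    return np.mean(masses ** 2) / np.mean(masses) ** 2

rng = np.random.default_rng(4)
homogeneous = np.abs(rng.normal(1.0, 0.1, 500))  # energy spread evenly
spiky = np.zeros(500)                            # crackle-like: isolated bursts
spiky[rng.integers(0, 500, 10)] = 10.0

print(f"homogeneous: {lacunarity(homogeneous, 16):.2f}")
print(f"spiky:       {lacunarity(spiky, 16):.2f}")
```

A homogeneous signal gives a value near 1, while sparse bursts inflate the second moment well above it.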

Edge Detection on Spectrogram Image Based Methods. Image processing on the spectrogram of sound recordings was used as a wheeze detection technique in [103]. Wheeze detection was performed on recordings taken from 16 volunteers using one KEC-2738 microphone placed on the neck. Edge detection was applied to obtain horizontal edges, which were then processed further to detect wheezes. The study claimed to achieve sensitivity and specificity values above 89%.

基于谱图图像边缘检测的方法。[103]将录音声谱图上的图像处理作为一种喘息检测技术。使用放置在脖子上的KEC-2738麦克风对16名志愿者的录音进行喘息检测。通过边缘检测获得水平边缘,然后对水平边缘进行进一步处理以检测喘息。该研究声称达到89%以上的灵敏度和特异性值。
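
The idea of finding horizontal edges can be sketched without full image processing: a wheeze shows up in the spectrogram as a frequency bin whose energy stays above the per-frame noise floor for a sustained run of frames. The synthetic signal, threshold factor, and duration criterion below are assumptions; [103] used proper edge-detection operators on the spectrogram image:

```python
import numpy as np
from scipy.signal import spectrogram

rng = np.random.default_rng(5)
sr = 4000
t = np.arange(2 * sr) / sr
x = rng.normal(0.0, 0.5, len(t))
x[2000:6000] += np.sin(2 * np.pi * 400 * t[2000:6000])  # 1 s tonal "wheeze"

f, frames, S = spectrogram(x, fs=sr, nperseg=256, noverlap=128)
# Horizontal-ridge criterion: a bin is "active" in a frame if it exceeds
# several times that frame's median (noise-floor) level.
active = S > 5 * np.median(S, axis=0, keepdims=True)
frames_active = active.sum(axis=1)                # active frames per frequency bin
wheeze_bins = f[frames_active > 0.25 * len(frames)]
print("sustained tonal energy near (Hz):", wheeze_bins)
```

Thresholding against the per-frame median makes the rule robust to slow changes in overall loudness, since the noise floor is re-estimated in every frame.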

Synthesis of results

The results achieved by the studies reviewed were synthesised as a measure of accuracy range of the algorithms. The synthesis was performed on groups of studies with the same sound type analysed, approach, and level of analysis. The groups considered for the synthesis were wheeze event detection (WED) and wheeze segment detection (WSD), classification between wheeze and other sound at segment (WSC) and event level (WEC), and classification between crackle and other sound at event level (CEC). The studies included in the analysis were articles with relevant information on the dataset size. Performance at the recording level is not analysed further, because for monitoring purposes only segment or event analysis is relevant. Other types were not considered for the synthesis due to the small number of studies having been reported. The summary of accuracy measures synthesised can be seen in Table 6.

将所审查研究取得的结果综合为各算法准确率范围的度量。综合在所分析声音类型、方法和分析水平相同的研究组上进行。综合考虑的组包括:喘息事件检测(WED)与喘息片段检测(WSD),片段水平(WSC)与事件水平(WEC)上喘息与其他声音的分类,以及事件水平上爆裂音与其他声音的分类(CEC)。纳入分析的研究是提供了数据集规模相关信息的文章。记录水平的性能未作进一步分析,因为就监测目的而言,只有片段或事件水平的分析是相关的。其他类型因已报道的研究数量太少而未纳入综合。综合的准确率度量汇总见表6。

Wheeze segment detection reported in [40, 54, 58, 90, 110] achieved an accuracy of 71.2 − 97.9%. At the event level, the achieved accuracy range for wheeze detection by studies in [73, 74, 97, 101, 106] was 79.6 − 100%. Crackle detection at the segment level achieved an accuracy range of 62.7 − 99.8% in studies by [42, 84, 92, 108, 110]. For classification purposes, to differentiate between segments containing wheezes and not, the accuracy achieved by [61, 81, 89, 99, 111] was 90.5 − 96%. For classification between wheeze event and other types of sound, the accuracy of studies in [41, 46, 55, 59, 63, 65, 69, 83, 86, 91, 94, 98] was between 75.78 − 100%. Crackle event classification, as reported in [46, 55, 62, 91], achieved an 89 − 98.15% accuracy range. Based on the accuracy range reported, both wheeze and crackle sound automatic analyses showed that high agreement with the expert can be achieved under controlled conditions.

[40,54,58,90,110]中报道的喘息片段检测准确率为71.2−97.9%。在事件水平上,[73,74,97,101,106]的研究中喘息检测的准确率范围为79.6−100%。在[42,84,92,108,110]的研究中,片段水平的爆裂音检测准确率范围为62.7−99.8%。在分类方面,为区分包含与不包含喘息的片段,[61,81,89,99,111]达到的准确率为90.5−96%。对于喘息事件与其他类型声音的分类,[41,46,55,59,63,65,69,83,86,91,94,98]的研究准确率在75.78−100%之间。据[46,55,62,91]报道,爆裂音事件分类的准确率范围为89−98.15%。根据报告的准确率范围,喘息音和爆裂音的自动分析均表明,在受控条件下可以达到与专家的高度一致。

Discussion

The systematic review of algorithm development for adventitious sounds analysis is discussed in this section. This discussion is followed by a summary of the main findings, challenges, and future work in developing automatic adventitious respiratory sound analysis methods. Limitations and conclusions of this systematic review are finally given.

本节讨论了不定音分析算法开发的系统综述。在此讨论之后,总结了开发自动不定呼吸声分析方法的主要发现、挑战和未来工作。最后给出了系统综述的局限性和结论。

Development of automated adventitious sound analysis algorithms

There are two approaches in automated adventitious sound analysis, as can be seen in Table 3. The first approach is to perform detection, while classification is the second approach. The difference between these two approaches lies in the purpose of the analysis. The purpose of the detection approach is to determine whether or not adventitious sounds exist in a sound signal. The purpose of the classification approach is to determine if a certain sound signal belongs to a certain class.

在自动不定音分析中有两种方法,如表3所示。第一种方法是执行检测,第二种方法是分类。这两种方法的区别在于分析的目的。检测方法的目的是确定声音信号中是否存在非定音。分类方法的目的是确定某种声音信号是否属于某一类。

For an automated symptom monitoring and management tool, real-time adventitious sound monitoring may be needed. The development of real-time processing could allow for the timely identification of diseases, as well as changes in their severity. This functionality is important, for example, for the early detection and prevention of exacerbations. A detection approach could be used directly, as it generally works at the segment level, allowing for the development of real-time processing in a straightforward manner. For a classification approach to be used for monitoring, each breath cycle needs to be automatically segmented first, and isolated events need to be extracted. It is worth taking into account, however, that both approaches can be challenging in real life scenarios—as opposed to the controlled conditions normally used to extract data for algorithm development—due to the presence of strong acoustic artefacts that will corrupt the signal of interest [144].

对于自动化的症状监测和管理工具,可能需要实时的非定音监测。实时处理的发展可以及时识别疾病及其严重程度的变化。这一功能很重要,例如可用于病情恶化的早期发现和预防。检测方法通常在片段水平上工作,可以直接使用,便于以直接的方式开发实时处理。而要将分类方法用于监测,首先需要对每个呼吸周期进行自动分割,并提取孤立事件。然而值得注意的是,与通常用于为算法开发提取数据的受控条件不同,这两种方法在现实场景中都可能具有挑战性,因为存在会破坏目标信号的强声学伪影[144]。

Different sound types are related to different diagnoses. In the papers reviewed, a focus was given to wheeze and crackle analysis. A limited number of references used egophony, squawk, and pleural rub sounds in their analyses. It is also possible to perform analysis on how the adventitious sounds were generated, such as in [70].

不同的声音类型与不同的诊断有关。所回顾的论文重点关注喘息音和爆裂音分析。少数文献在分析中使用了羊鸣音(egophony)、短啸音(squawk)以及胸膜摩擦音。也可以对非定音的产生机制进行分析,如[70]。

Stethoscopes and microphones were generally used as the instrumentation to collect data for analysis. Several references also used data acquired from databases, which were mostly recorded using a digital stethoscope. Using a stethoscope for monitoring purposes may not be practical, as this is not a viable solution for continuous sensing. Using a microphone attached to the body, as in several references, would be a more desired approach, since this could potentially be done without disrupting the patient’s normal activities.

通常使用听诊器和麦克风作为收集数据进行分析的仪器。一些文献也使用了从数据库中获取的数据,这些数据大多是使用数字听诊器记录的。使用听诊器进行监测可能不实际,因为这不是连续传感的可行解决方案。在几个参考文献中,使用连接在身体上的麦克风将是一种更理想的方法,因为这可能在不干扰患者正常活动的情况下完成。

The number of sensors, as well as the positioning of those sensors in the reviewed literature, was also provided (in Table 4). The works which used stethoscopes as the instrument to collect data mostly performed data collection from multiple positions on the body. For a device to be non-intrusive and easy to use, it is important that the analysis is performed on data obtained from a single location. This will also greatly increase the probability of patient compliance.

文献综述中还提供了传感器的数量及其放置位置(见表4)。使用听诊器作为数据采集工具的研究大多从身体的多个位置采集数据。对于一个非侵入性且易于使用的设备而言,重要的是对从单个位置获得的数据进行分析,这也将大大提高患者依从的可能性。

The positions most often used for sensor placement in the reviewed literature were the anterior and posterior chest wall. These locations are used in the conventional auscultation method. However, as discussed in the previous sections, the chest wall acts as a low-pass filter, which limits the frequency range of the sounds heard. Another problem is that sounds heard from the chest are limited during the expiration phase. This will reduce the amount of information which can be used for analysis. Collecting data from the trachea may, in some cases, be a better option, as the dynamic range is wider, the sounds generated contain energy at higher frequencies, and the sound intensity is louder.

在所回顾的文献中,最常用的传感器放置位置是前胸壁和后胸壁,这也是常规听诊方法所用的位置。然而,如前几节所述,胸壁相当于一个低通滤波器,限制了所听到声音的频率范围。另一个问题是,在呼气阶段从胸部听到的声音有限,这将减少可用于分析的信息量。在某些情况下,从气管采集数据可能是更好的选择,因为其动态范围更宽,产生的声音包含更高频率的能量,且声音强度更大。

Obtaining data from different patients is also important, to be able to generalise the algorithms developed. Analysis performed using training and test sets from the same patients may cause an algorithm to be patient-specific and reduce the generality of the model. Obtaining more data may also give more insight into the relevance or importance of newly found features. It may also be useful to carry out research on whether the characteristics of adventitious sounds are, for example, population or disease severity specific.

从不同患者那里获得数据也很重要,以便能够推广所开发的算法。使用来自相同患者的训练集和测试集进行的分析可能会导致算法针对患者,并降低模型的通用性。获得更多的数据还可以对新发现的特征的相关性或重要性有更多的了解。研究不定音的特征是否是特定于人群或疾病严重程度的,也可能是有用的。

Machine learning techniques have attracted considerable interest and, as seen in the previous section, are used by most reported works. SVM and ANN variants were mostly used as classification methods. In these, it is important to find features that can differentiate between normal and abnormal segments for the detection or classification method to perform well. The complexity of a method is not only influenced by the type of detection or classification method used, but also by the complexity of the feature extraction. Using a high number of features may cause the detection or classification to over-fit the current data, resulting in a method that does not generalise to new data.

机器学习技术受到广泛关注,如上一节所示,大多数已报道的研究都使用了这些技术。支持向量机和人工神经网络的变体是最常用的分类方法。其中,重要的是找到能够区分正常与异常片段的特征,以使检测或分类方法表现良好。方法的复杂性不仅受所用检测或分类方法类型的影响,还受特征提取复杂性的影响。使用过多特征可能导致检测或分类过拟合当前数据,使方法无法推广到新数据。

Challenges and future works

Adventitious sound monitoring is an integral part of the management of diseases such as asthma and COPD. Regular monitoring of lung function and of symptoms such as wheezes, crackles, cough, and breathlessness is needed for disease management, and could potentially be used for exacerbation prediction. However, continuous monitoring and management of adventitious sounds are challenging tasks to accomplish. Significant research is still needed to overcome these challenges. The focus of future work could be divided into several main categories, as follows.

非定音监测是哮喘和慢性阻塞性肺病(COPD)等疾病管理的重要组成部分。疾病管理需要定期监测肺功能以及喘息、爆裂音、咳嗽和呼吸困难等症状,这些监测还可能用于病情恶化预测。然而,对非定音的持续监测和管理是一项具有挑战性的任务,仍需大量研究来克服这些挑战。未来工作的重点可分为以下几大类。

Algorithms for adventitious sound analysis could be improved further. Algorithms developed need to have a high accuracy to detect or classify adventitious sounds. More research could be carried out to find new features with high correlation with adventitious sounds characteristics; aiming to achieve high performance measures, even in real life scenarios in which the signals are going to be far more corrupted than those used in controlled experiments for algorithm development. Better signal to noise ratio could also improve analysis performance.

非定音分析算法可以进一步改进。所开发的算法需要具有较高的准确性来检测或分类非定音。可以开展更多研究,寻找与非定音特性高度相关的新特征,力求在现实场景中也能达到较高的性能指标,因为现实信号会比算法开发所用受控实验中的信号受到严重得多的干扰。更好的信噪比也能提高分析性能。

Most literature reviewed reported a high performance measure, but many of the works reported performance on CV sets instead of separate test sets. The problem stated in most published literature was lack of data, which caused LOOCV to be often used as a validation method. Performance measures obtained from cross-validation, especially those used for parameter tuning and model selection, can introduce high variance thus making the model unreliable [145–147]. In future works, particularly for machine learning based algorithms, it is recommended to report performance on a separate test set instead of a CV set. A separate test set contains new information not seen in model training and parameter optimisation and can give a more objective performance measure which will prevent over-fitting problems.

所回顾的大多数文献报告了较高的性能指标,但许多工作报告的是交叉验证(CV)集而非独立测试集上的性能。大多数已发表文献指出的问题是数据缺乏,这导致LOOCV经常被用作验证方法。通过交叉验证获得的性能指标,特别是用于参数调整和模型选择的指标,可能引入高方差,从而使模型不可靠[145-147]。在未来的工作中,特别是对于基于机器学习的算法,建议在独立测试集而非CV集上报告性能。独立测试集包含模型训练和参数优化中未见过的新信息,可以给出更客观的性能度量,从而防止过拟合问题。
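
The recommended protocol, tuning with cross-validation on the training portion only and reporting once on a held-out test set, can be sketched with scikit-learn; the dataset, model, and parameter grid are arbitrary placeholders:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
# Hold out the test set BEFORE any tuning; CV only ever sees the training part.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5).fit(X_tr, y_tr)
cv_score = search.best_score_          # used for model selection; optimistic
test_acc = search.score(X_te, y_te)    # reported once; the more objective estimate
print(f"best CV score: {cv_score:.2f}, held-out score: {test_acc:.2f}")
```

Because the CV score is the quantity that was maximised during tuning, only the held-out score should be quoted as the final performance figure.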

Increasing the performance of algorithms for adventitious sound analysis is important to assure the validity of the systems developed. Algorithm validity is important because doctors and patients tend to underestimate the severity of present symptoms [26]. With accurate detection of symptoms, the device developed could be used as a reference to the required treatment based on actual severity. This will ensure that the disease is properly treated and managed.

提高非定声分析算法的性能对于保证所开发系统的有效性是非常重要的。算法有效性很重要,因为医生和患者往往会低估当前症状的严重程度[26]。通过对症状的准确检测,开发的设备可以根据实际严重程度作为所需治疗的参考。这将确保该疾病得到适当治疗和管理。

Another important research focus should be on making a device that can be used by patients. There are several devices available to perform monitoring of symptoms and lung function at home, but these are mostly complex and large [25]. An optimal device should be portable and easy to use, so that patient compliance in self-monitoring can be assured. In some cases, symptoms most often occur at night. Hence, an automated device that can continuously monitor symptoms without the need for expert intervention is necessary. The size, number, and positioning of the sensors will also influence usability. More complex systems will be harder to use, and hence the intended purpose may not be achieved. Newly developed devices also need to be non-intrusive, so that they can be used without causing a disruption to daily activities.

另一个重要的研究重点应该是制造一种可供患者使用的设备。已有几种设备可用于在家中监测症状和肺功能,但它们大多复杂而庞大[25]。理想的设备应便于携带和使用,以确保患者在自我监测中的依从性。在某些情况下,症状最常发生于夜间。因此,需要一种无需专家干预即可连续监测症状的自动化设备。传感器的大小、数量和位置也会影响可用性;系统越复杂就越难使用,从而可能无法达到预期目的。新开发的设备还需要是非侵入性的,以便在不干扰日常活动的情况下使用。

Using existing adventitious sound detection and classification algorithms as foundations, new algorithms can be developed to perform exacerbation prediction. Exacerbation prevention can help patients avoid worsening of conditions and adverse effects on the respiratory system.

以现有的非定音检测和分类算法为基础,可以进一步开发新的算法来进行病情恶化预测。预防恶化可以帮助患者避免病情加重以及对呼吸系统的不利影响。

One of the main drawbacks of conventional auscultation is that it cannot be performed frequently [25]. As symptoms such as wheeze generally occur at night, an ideal device will be able to monitor these symptoms during the night. Power consumption issues need to be taken into account in future works, as well as the storage capacity in the device. The data could be processed so that only the results of symptoms monitoring are stored, or if possible, raw data can be saved for future reference.

传统听诊的主要缺点之一是不能经常听诊[25]。由于诸如喘息之类的症状通常发生在夜间,因此理想的设备将能够在夜间监测这些症状。在未来的工作中需要考虑功耗问题,以及设备中的存储容量。可以对数据进行处理,以便仅存储症状监测的结果,或者如果可能,可以保存原始数据以供将来参考。

Study limitations

The metrics used for this systematic review have been the measurement and comparison of accuracy. The main limitation of this study at the outcome level is that the data used in each published reference was different. Each work performed analysis on data from a different population, obtained with different collection methods. A standard validation and data management method has not been established; different methods were used across studies. Outcome measure definitions also varied between different works. At the review level, the main limitation is the difficulty in assessing the quality of the different studies, as there is no standardised criterion yet.

用于本系统评价的指标是测量和比较准确性。本研究在结果水平上的主要局限性是每个发表的参考文献中使用的数据不同。每项工作都对来自不同人群的数据进行分析,并采用不同的收集方法获得数据。未建立标准的验证和数据管理方法;不同的研究使用了不同的方法。结果测量的定义在不同的研究中也有所不同。在审查层面,主要的限制是难以评估不同研究的质量,因为目前还没有标准化的标准。

Conclusion

This systematic review provided an introduction to the types of respiratory sounds and their analysis, with a focus on automatic adventitious sound detection or classification for disease monitoring and management.

本系统综述介绍了呼吸音的类型及其分析,重点是用于疾病监测和管理的非定音自动检测与分类。

The characteristics of normal and abnormal breath sounds, specifically adventitious sounds, were discussed. Several types of normal breath sounds based on their location were summarised. Adventitious sound definitions and characteristics were also reviewed. Diseases related to some of the adventitious sounds were briefly introduced.

讨论了正常呼吸音和异常呼吸音的特点,特别是非定音。总结了几种不同位置的正常呼吸音。本文还回顾了非定音的定义和特征。简要介绍了与一些异音有关的疾病。

References to algorithms development for adventitious sound detection or classification were also reviewed. For each paper the type of sound, approach, level of analysis, instrumentation, sensor number and positioning, total amount of data, features, methods, and performance were provided and summarised.

本文还回顾了有关非定音检测或分类算法发展的文献。对于每篇论文,提供并总结了声音的类型、方法、分析水平、仪器、传感器数量和定位、数据总量、特征、方法和性能。

Overall, based on the accuracy metric used in this systematic review, algorithms for automatic detection or classification of adventitious sounds achieved high agreement with the expert under controlled conditions. This makes automated adventitious sounds detection or classification a promising solution to overcome the limitations of conventional auscultation. Recommendations for future research and development would be:

• To pay increased attention to how to split the data for algorithm development in order to avoid under-fitting, over-fitting or patient specific results.

• To focus on increasing performance, ensuring usability and availability of sensors.

• To add functionality leading, for example, to exacerbation prediction.

• To carry out algorithms’ validation in real life use scenarios.

总体而言,基于本系统综述所用的准确率指标,在受控条件下,非定音自动检测或分类算法与专家达到了高度一致。这使得非定音自动检测或分类成为克服常规听诊局限的一种有前景的解决方案。对未来研究和开发的建议如下:

•更加关注如何分割数据进行算法开发,以避免欠拟合、过拟合或患者特定的结果。

•专注于提高性能,确保易用性和传感器的可获得性。

•增加新功能,例如病情恶化预测。

•在实际使用场景中对算法进行验证。

版权声明:本文为博主原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。
本文链接:https://blog.csdn.net/weixin_53937854/article/details/132889775
