
Date: 2024-05-09



COM4511/COM6511 Speech Technology - Practical Exercise -
Keyword Search
Anton Ragni
Note that for any module assignment full marks will only be obtained for outstanding performance that
goes well beyond the questions asked. The marks allocated for each assignment are 20%. The marks will be
assigned according to the following general criteria. For every assignment handed in:
1. Fulfilling the basic requirements (5%)
Full marks will be given for completing the work as described, in both the source code and the results.
2. Submitting high quality documentation (5%)
Full marks will be given to a write-up that is at the highest standard of technical writing and illustration.
3. Showing good reasoning (5%)
Full marks will be given if the experiments and the outcomes are explained to the best standard.
4. Going beyond what was asked (5%)
Full marks will be given for interesting ideas on how to extend the work that are well motivated and
described.
1 Background
The aim of this task is to build and investigate the simplest form of a keyword search (KWS) system, which makes it possible to find information
in large volumes of spoken data. The figure below shows an example of a typical KWS system, which consists of an index and
a search module. The index provides a compact representation of spoken data. Given a set of keywords, the search module
queries the index to retrieve all possible occurrences ranked according to likelihood. The quality of a KWS system is assessed based
on how accurately it can retrieve all true occurrences of keywords.
[Figure: block diagram of a typical KWS system with blocks labelled Keywords, Search, Index and Results]
A number of index representations have been proposed and examined for KWS. Most popular representations are derived
from the output of an automatic speech recognition (ASR) system. Various forms of output have been examined. These differ
in terms of the amount of information retained regarding the content of spoken data. The simplest form is the most likely word
sequence or 1-best. Additional information such as start and end times, and recognition confidence may also be provided for
each word. Given a collection of 1-best sequences, the following index can be constructed
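$$
\begin{array}{ll}
w_1 & (f_{1,1}, s_{1,1}, e_{1,1}) \;\ldots\; (f_{1,n_1}, s_{1,n_1}, e_{1,n_1}) \\
w_2 & (f_{2,1}, s_{2,1}, e_{2,1}) \;\ldots\; (f_{2,n_2}, s_{2,n_2}, e_{2,n_2}) \\
\vdots & \\
w_N & (f_{N,1}, s_{N,1}, e_{N,1}) \;\ldots\; (f_{N,n_N}, s_{N,n_N}, e_{N,n_N})
\end{array}
\tag{1}
$$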
where $w_i$ is a word, $n_i$ is the number of times word $w_i$ occurs, $f_{i,j}$ is the file where word $w_i$ occurs for the j-th time, and $s_{i,j}$ and $e_{i,j}$
are the corresponding start and end times. Searching such an index for single-word keywords can be as simple as finding the correct row (e.g. k)
and returning all of its tuples $(f_{k,1}, s_{k,1}, e_{k,1}), \ldots, (f_{k,n_k}, s_{k,n_k}, e_{k,n_k})$.
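A minimal sketch of such an index in Python is given here, assuming the 1-best output has already been read into (file, start, duration, word) tuples. The function name `build_index`, the lower-casing of words and the fourth record are illustrative assumptions and are not part of the task description; the first three records reuse the values from the CTM excerpt in the Resources section.

```python
from collections import defaultdict


def build_index(records):
    """Map each word to the list of (file, start, end) tuples where it occurs."""
    index = defaultdict(list)
    for file_name, start, duration, word in records:
        # End time is start plus duration; words are lower-cased (an assumption).
        index[word.lower()].append((file_name, start, start + duration))
    return index


# The last record is hypothetical, added so that one word occurs twice.
records = [
    ("7654", 11.34, 0.20, "YES"),
    ("7654", 12.00, 0.34, "YOU"),
    ("7654", 13.30, 0.50, "CAN"),
    ("7654", 15.10, 0.25, "YES"),
]
index = build_index(records)

# Single-word search: find the row for the keyword and return all its tuples.
occurrences = index.get("yes", [])
```

Using a dictionary keyed by word makes the row lookup a constant-time operation on average, which is one way of reading "finding the correct row".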
The search module is expected to retrieve all possible keyword occurrences. If the ASR system made no mistakes, such a module
could be created rather trivially. To account for possible retrieval errors, the search module provides each potential occurrence
with a relevance score. Relevance scores reflect confidence in a given occurrence being relevant. Occurrences with extremely
low relevance scores may be eliminated. If these scores are accurate each eliminated occurrence will decrease the number of
false alarms. If not then the number of misses will increase. What exactly an extremely low score is may not be very easy
to determine. Multiple factors may affect a relevance score: confidence score, duration, word confusability, word context,
keyword length. Therefore, simple relevance scores, such as those based on confidence scores, may have a wide dynamic range
and may be incomparable across different keywords. In order to ensure that relevance scores are comparable among different
keywords they need to be calibrated. A simple calibration scheme is called sum-to-one (STO) normalisation
$$
\hat{r}_{i,j} = \frac{r_{i,j}^{\gamma}}{\sum_{k=1}^{n_i} r_{i,k}^{\gamma}}
\tag{2}
$$
where $r_{i,j}$ is the original relevance score for the j-th occurrence of the i-th keyword and γ is a scale factor that either sharpens or
flattens the distribution of relevance scores. More complex schemes have also been examined. Given a set of occurrences with
associated relevance scores, there are several options available for eliminating spurious occurrences. One popular approach
is thresholding. Given a global or keyword-specific threshold, any occurrence whose score falls below it is eliminated. Simple calibration
schemes such as STO require thresholds to be estimated on a development set and adjusted to different collection sizes. More
complex approaches such as Keyword Specific Thresholding (KST) yield a fixed threshold across different keywords and
collection sizes.
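The STO normalisation and a global threshold can be sketched as follows. This is an illustration only: the function name `sto_normalise`, the example scores and the threshold value 0.2 are assumptions, and returning zeros when all scores are zero is a choice made here to avoid division by zero.

```python
def sto_normalise(scores, gamma=1.0):
    """Sum-to-one normalisation of the relevance scores of one keyword."""
    powered = [s ** gamma for s in scores]
    total = sum(powered)
    if total == 0:
        # Assumption: an all-zero row stays all-zero rather than raising.
        return [0.0 for _ in scores]
    return [p / total for p in powered]


# Hypothetical confidence scores for three occurrences of one keyword.
scores = [0.5, 0.7, 0.1]

flat = sto_normalise(scores, gamma=0.5)   # gamma < 1 flattens the distribution
sharp = sto_normalise(scores, gamma=2.0)  # gamma > 1 sharpens it

# Global thresholding: occurrences scoring below the threshold are eliminated.
threshold = 0.2  # hypothetical value; in practice estimated on a development set
kept = [r for r in sto_normalise(scores) if r >= threshold]
```

With these scores the lowest-scoring occurrence falls below the threshold and is dropped, while the other two are kept.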
Accuracy of KWS systems can be assessed in multiple ways. Standard approaches include precision (proportion of relevant retrieved occurrences among all retrieved occurrences), recall (proportion of relevant retrieved occurrences among all
relevant occurrences), mean average precision and term weighted value. A collection of precision and recall values computed
for different thresholds yields a precision-recall (PR) curve. The area under the PR curve (AUC) provides a threshold-independent summary statistic for comparing different retrieval approaches. The mean average precision (mAP) is another popular,
threshold-independent, precision-based metric. Consider a KWS system returning 3 correct and 4 incorrect occurrences arranged according to relevance score as follows: ✓ , ✗ , ✗ , ✓ , ✓ , ✗ , ✗ , where ✓ stands for a correct occurrence and ✗ stands
for an incorrect occurrence. The precision at each rank (from 1 to 7), counted as zero at ranks holding an incorrect occurrence, is 1/1, 0/2, 0/3, 2/4, 3/5, 0/6, 0/7. If the number of true correct
occurrences is 3, the mean average precision for this keyword is (1/1 + 2/4 + 3/5)/3 = 0.7. A collection-level mAP can be computed by averaging
keyword specific mAPs. Once a KWS system operates at a reasonable AUC or mAP level it is possible to use term weighted
value (TWV) to assess accuracy of thresholding. The TWV is defined by
$$
\mathrm{TWV}(K, \theta) = 1 - \frac{1}{|K|} \sum_{k \in K} \left[ P_{\mathrm{miss}}(k, \theta) + \beta P_{\mathrm{fa}}(k, \theta) \right]
\tag{3}
$$
where k ∈ K is a keyword, Pmiss and Pfa are probabilities of miss and false alarm, β is a penalty assigned to false alarms.
These probabilities can be computed by
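$$
P_{\mathrm{miss}}(k, \theta) = \frac{N_{\mathrm{miss}}(k, \theta)}{N_{\mathrm{correct}}(k)}
\tag{4}
$$
$$
P_{\mathrm{fa}}(k, \theta) = \frac{N_{\mathrm{fa}}(k, \theta)}{N_{\mathrm{trial}}(k)}
\tag{5}
$$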
where N<event> is a number of events. The number of trials is given by
$$
N_{\mathrm{trial}}(k) = T - N_{\mathrm{correct}}(k)
\tag{6}
$$
where T is the duration of speech in seconds.
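The TWV definition can be turned into a short function, sketched below under stated assumptions. The per-keyword counts are taken as already computed at a fixed threshold θ, the function name and the example counts are hypothetical, and keywords without any true occurrence are skipped because the miss probability is undefined for them (the text above does not say how to treat them).

```python
def term_weighted_value(stats, duration, beta=20.0):
    """TWV from per-keyword counts at one fixed threshold.

    stats maps keyword -> (n_correct, n_miss, n_fa);
    duration is the amount of speech T in seconds.
    """
    # Assumption: keywords with no true occurrence are excluded from the average.
    scored = {k: v for k, v in stats.items() if v[0] > 0}
    if not scored:
        raise ValueError("no keyword has a true occurrence")
    penalty = 0.0
    for n_correct, n_miss, n_fa in scored.values():
        p_miss = n_miss / n_correct          # miss probability
        n_trial = duration - n_correct       # number of trials
        p_fa = n_fa / n_trial                # false alarm probability
        penalty += p_miss + beta * p_fa
    return 1.0 - penalty / len(scored)


# Hypothetical counts and duration, chosen only to keep the arithmetic simple.
stats = {"yes": (10, 2, 5), "dog walking": (4, 1, 0), "unseen": (0, 0, 3)}
value = term_weighted_value(stats, duration=1000.0)
```

A system with no misses and no false alarms scores 1, a system that returns nothing scores 0, and the value can become negative when false alarms dominate.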
2 Objective
Given a collection of 1-bests, write code that retrieves all possible occurrences of each keyword in the provided keyword list. Describe the search
process including index format, handling of multi-word keywords, criterion for matching, relevance score calibration and
threshold setting methodology. Write code to assess retrieval performance using reference transcriptions according to the AUC,
mAP and TWV criteria using β = 20. Comment on the differences between these criteria, including the impact of the parameter β.
Start and end times of hypothesised occurrences must be within 0.5 seconds of true occurrences to be considered for matching.
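One possible reading of the 0.5 second matching criterion is sketched below. It assumes that both the start and the end time must lie within the tolerance, that the file must be identical, and that each true occurrence can be matched by at most one hypothesis; the one-to-one rule and the function names are assumptions, not requirements stated above.

```python
def is_match(hyp, ref, tolerance=0.5):
    """hyp and ref are (file, start, end) tuples for the same keyword."""
    return (
        hyp[0] == ref[0]
        and abs(hyp[1] - ref[1]) <= tolerance
        and abs(hyp[2] - ref[2]) <= tolerance
    )


def label_hypotheses(hyps, refs, tolerance=0.5):
    """Label each hypothesis as correct or not; return labels and miss count."""
    unused = list(refs)
    labels = []
    for hyp in hyps:  # hyps are assumed to be sorted by relevance score
        hit = next((r for r in unused if is_match(hyp, r, tolerance)), None)
        if hit is not None:
            unused.remove(hit)  # a true occurrence is consumed once matched
        labels.append(hit is not None)
    return labels, len(unused)


# Hypothetical occurrences of one keyword.
refs = [("2345", 1.0, 1.4), ("2345", 7.2, 7.6)]
hyps = [("2345", 1.2, 1.5), ("2345", 1.3, 1.6), ("2345", 9.0, 9.4)]
labels, n_miss = label_hypotheses(hyps, refs)
```

Here the first hypothesis is correct, the second is a duplicate detection counted as a false alarm, the third is a false alarm, and one true occurrence is missed.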
3 Marking scheme
Two critical elements are assessed: retrieval (65%) and assessment (35%). Note: Even if you cannot complete this task as a
whole you can certainly provide a description of what you were planning to accomplish.
1. Retrieval
1.1 Index Write code that can take the provided CTM files (and any other file you deem relevant) and create indices in
your own format. For example, if the Python language is used then the execution of your code may look like
python index.py dev.ctm dev.index
where dev.ctm is a CTM file and dev.index is an index.
Marks are distributed based on handling of multi-word keywords
• Efficient handling of single-word keywords
• No ability to handle multi-word keywords
• Inefficient ability to handle multi-word keywords
• Or efficient ability to handle multi-word keywords
1.2 Search Write code that can take the provided keyword file and index file (and any other file you deem relevant)
and produce a list of occurrences for each provided keyword. For example, if the Python language is used then the
execution of your code may look like
python search.py dev.index keywords dev.occ
where dev.index is an index, keywords is a list of keywords, dev.occ is a list of occurrences for each
keyword.
Marks are distributed based on handling of multi-word keywords
• Efficient handling of single-word keywords
• No ability to handle multi-word keywords
• Inefficient ability to handle multi-word keywords
• Or efficient ability to handle multi-word keywords
1.3 Description Provide a technical description of the following elements
• Index file format
• Handling multi-word keywords
• Criterion for matching keywords to possible occurrences
• Search process
• Score calibration
• Threshold setting
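As an illustration of one way multi-word keywords might be handled, the sketch below stores the position of every word within its file, so that a phrase is found by checking consecutive positions instead of rescanning the transcripts. The positional layout, the function names and the example records are assumptions; the task leaves the index format open, and no limit on the pause between consecutive words is applied here.

```python
from collections import defaultdict


def build_positional_index(records):
    """records: (file, start, end, word) tuples in time order within each file."""
    index = defaultdict(dict)
    counters = defaultdict(int)
    for file_name, start, end, word in records:
        pos = counters[file_name]       # running word position in this file
        counters[file_name] += 1
        index[word.lower()][(file_name, pos)] = (start, end)
    return index


def search(index, keyword):
    """Return (file, start, end) for every occurrence of a 1..n word keyword."""
    words = keyword.lower().split()
    if not words:
        return []
    hits = []
    for (file_name, pos), (start, end) in index.get(words[0], {}).items():
        for offset, word in enumerate(words[1:], start=1):
            span = index.get(word, {}).get((file_name, pos + offset))
            if span is None:
                break                   # the next word does not follow directly
            end = span[1]               # extend the occurrence to this word
        else:
            hits.append((file_name, start, end))
    return hits


# Hypothetical word-level records.
records = [
    ("2345", 2.10, 2.30, "dog"),
    ("2345", 2.30, 2.60, "walking"),
    ("2345", 2.60, 2.70, "is"),
    ("2345", 5.00, 5.20, "dog"),
    ("2345", 5.20, 5.40, "barks"),
]
index = build_positional_index(records)
phrase_hits = search(index, "dog walking")
word_hits = search(index, "dog")
```

Each additional word in the keyword costs one dictionary lookup per candidate start position, so the search time depends on how often the first word occurs and not on the size of the collection.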
2. Assessment Write code that can take the provided keyword file, the list of found keyword occurrences and the corresponding reference transcript file in STM format and compute the metrics described in the Background section. For
instance, if the Python language is used then the execution of your code may look like
python <metric>.py keywords dev.occ dev.stm
where <metric> is one of precision-recall, mAP and TWV, keywords is the provided keyword file, dev.occ is the
list of found keyword occurrences and dev.stm is the reference transcript file.
Hint: In order to simplify assessment, consider converting the reference transcript from STM file format to CTM file format.
Using the indexing and search code above, obtain a list of true occurrences. The list of found keyword occurrences can then
be assessed more easily by comparing it with the list of true occurrences rather than with the reference transcript file in STM
file format.
2.1 Implementation
• AUC Integrate an existing implementation of AUC computation into your code. For example, for the Python
language such an implementation is available in the sklearn package.
• mAP Write your own implementation or integrate any freely available one.
• TWV Write your own implementation or integrate any freely available one.
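For the mAP item, a minimal sketch of an own implementation is given here. It follows the worked example in the Background section, where precision is accumulated only at ranks holding a correct occurrence and divided by the number of true occurrences. The function names are illustrative, and skipping keywords that have no true occurrence is an assumption.

```python
def average_precision(ranked_labels, n_true):
    """ranked_labels: True/False per returned occurrence, best score first."""
    hits = 0
    total = 0.0
    for rank, correct in enumerate(ranked_labels, start=1):
        if correct:
            hits += 1
            total += hits / rank   # precision at this rank
    return total / n_true if n_true else 0.0


def mean_average_precision(per_keyword):
    """per_keyword: iterable of (ranked_labels, n_true) pairs, one per keyword."""
    # Assumption: keywords with no true occurrence are left out of the mean.
    values = [average_precision(l, n) for l, n in per_keyword if n > 0]
    return sum(values) / len(values) if values else 0.0


# The ranking from the Background section: 3 correct among 7 returned.
example = [True, False, False, True, True, False, False]
ap = average_precision(example, n_true=3)   # (1/1 + 2/4 + 3/5) / 3, about 0.7
```

Dividing by the number of true occurrences, not by the number of correct ones returned, means that a true occurrence the system never retrieves still lowers the score.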
2.2 Description
• AUC Plot the precision-recall curve. Report the AUC value. Discuss performance in the high precision and low
recall area. Discuss performance in the high recall and low precision area. Suggest which keyword search
applications might be interested in a good performance specifically in those two areas (either high precision
and low recall, or high recall and low precision).
• mAP Report mAP value. Report mAP value for each keyword length (1-word, 2-word, etc.). Compare and
discuss differences in mAP values.
• TWV Report TWV value. Report TWV value for each keyword length (1-word, 2-word, etc.). Compare and
discuss differences in TWV values. Plot TWV values for a range of threshold values. Report maximum TWV
value or MTWV. Report actual TWV value or ATWV obtained with a method used for threshold selection.
• Comparison Describe the use of AUC, mAP and TWV in the development of your KWS approach. Compare
these metrics and discuss their advantages and disadvantages.
4 Hand-in procedure
All outcomes, however complete, are to be submitted jointly in the form of a package file (zip/tar/gzip) that includes
directories for each task which contain the associated required files. Submission will be performed via MOLE.
5 Resources
Three resources are provided for this task:
• 1-best transcripts in NIST CTM file format (dev.ctm, eval.ctm). The CTM file format consists of multiple records
of the following form
<F> <H> <T> <D> <W> <C>
where <F> is an audio file name, <H> is a channel, <T> is a start time in seconds, <D> is a duration in seconds, <W> is a
word, <C> is a confidence score. Each record corresponds to one recognised word. Any blank lines or lines starting with
;; are ignored. An excerpt from a CTM file is shown below
7654 A 11.34 0.2 YES 0.5
7654 A 12.00 0.34 YOU 0.7
7654 A 13.30 0.5 CAN 0.1
• Reference transcript in NIST STM file format (dev.stm, eval.stm). The STM file format consists of multiple records
of the following form
<F> <H> <S> <T> <E> <L> <W>...<W>
where <S> is a speaker, <T> is a start time in seconds, <E> is an end time in seconds, <L> is a topic and <W>...<W> is a word sequence. Each record corresponds to
one manually transcribed segment of an audio file. An excerpt from an STM file is shown below
2345 A 2345-a 0.10 2.03 <soap> uh huh yes i thought
2345 A 2345-b 2.10 3.04 <soap> dog walking is a very
2345 A 2345-a 3.50 4.59 <soap> yes but it’s worth it
Note that exact start and end times for each word are not available. Use uniform segmentation as an approximation, i.e. divide the duration of each segment equally among its words. The
duration of speech in dev.stm and eval.stm is estimated to be 57474.2 and 25694.3 seconds, respectively.
• Keyword list keywords. Each keyword contains one or more words as shown below
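Returning to the CTM and STM formats described above, the records can be read with a few lines of Python. The sketch below parses one CTM record and splits one STM segment into word-level times by uniform segmentation. The function names and the returned tuple layouts are assumptions made for illustration, and the sketch does not cover the keyword file, whose example is not reproduced here.

```python
def parse_ctm_line(line):
    """Return (file, channel, start, duration, word, confidence) or None."""
    line = line.strip()
    if not line or line.startswith(";;"):
        return None  # blank lines and ';;' comment lines are ignored
    f, h, t, d, w, c = line.split()
    return f, h, float(t), float(d), w, float(c)


def stm_to_word_times(line):
    """Split one STM segment into (file, start, end, word) tuples.

    Every word receives an equal share of the segment duration.
    """
    f, h, speaker, t, e, label, *words = line.split()
    if not words:
        return []
    start, end = float(t), float(e)
    step = (end - start) / len(words)
    return [
        (f, start + i * step, start + (i + 1) * step, w)
        for i, w in enumerate(words)
    ]


# Records taken from the excerpts above.
ctm_record = parse_ctm_line("7654 A 11.34 0.2 YES 0.5")
stm_words = stm_to_word_times("2345 A 2345-a 0.10 2.03 <soap> uh huh yes i thought")
```

For the STM excerpt the segment lasts 1.93 seconds and holds five words, so each word is assigned 0.386 seconds.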