CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part of speech tagging: three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then next to NLTK (https://www.nltk.org/), you will also need to install the conllu package (https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested, you can download the entire set of treebanks from https://universaldependencies.org/.
Parameter estimation
First, we write code to estimate the transition probabilities and the emission probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from a training corpus from Universal Dependencies. Do not forget to involve the start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, you can for this assignment take the implementation of
Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK,
this seems to be the most robust one). An example of use for emission probabilities is
in file smoothing.py; one can similarly apply smoothing to transition probabilities.
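Without reproducing smoothing.py itself, the idea can be sketched as follows. The variable names (`tagged_sents`, `words_by_tag`) and the choice `bins=1e5` are illustrative assumptions, not taken from the provided files; only `FreqDist` and `WittenBellProbDist` are the NLTK classes named in this handout.

```python
# Sketch: one Witten-Bell-smoothed emission distribution per tag.
# Requires NLTK. `tagged_sents` is a toy stand-in for a training corpus.
from nltk import FreqDist, WittenBellProbDist

tagged_sents = [
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
]

# Group the emitted words by their tag.
words_by_tag = {}
for sent in tagged_sents:
    for word, tag in sent:
        words_by_tag.setdefault(tag, []).append(word)

# One smoothed distribution per tag; `bins` reserves probability
# mass for words never seen with that tag.
emissions = {tag: WittenBellProbDist(FreqDist(words), bins=1e5)
             for tag, words in words_by_tag.items()}

print(emissions["NOUN"].prob("cat"))      # seen with NOUN: larger probability
print(emissions["NOUN"].prob("unicorn"))  # unseen: small but non-zero
```

Transition probabilities can be estimated the same way, with tag bigrams (including the ⟨s⟩ and ⟨/s⟩ markers) in place of (tag, word) pairs.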
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine for each i = 1, …, n, in this order:

    t̂_i = argmax_{t_i} P(t_i | t̂_{i−1}) · P(w_i | t_i)
assuming t̂_0 is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker ⟨/s⟩ is not even used here.
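The eager rule above can be sketched in a few lines. Here `trans` and `emis` are assumed to be dictionaries mapping a tag (or the marker "&lt;s&gt;") to smoothed distributions with a `.prob` method; these names are placeholders, not part of the provided code.

```python
def eager_tag(sentence, tags, trans, emis):
    """Choose each tag greedily from the previously chosen tag."""
    prev = "<s>"                     # start-of-sentence marker
    result = []
    for word in sentence:
        # argmax over t of P(t | prev) * P(word | t)
        best = max(tags, key=lambda t: trans[prev].prob(t) * emis[t].prob(word))
        result.append(best)
        prev = best                  # commit immediately; never revised
    return result
```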
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
    t̂_1 ⋯ t̂_n = argmax_{t_1 ⋯ t_n} ( ∏_{i=1}^{n} P(t_i | t_{i−1}) · P(w_i | t_i) ) · P(t_{n+1} | t_n)
where the tokens of the input sentence are w_1 ⋯ w_n, and t_0 = ⟨s⟩ and t_{n+1} = ⟨/s⟩ are the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
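A minimal sketch of Viterbi in log space follows, under the same assumptions as before: `trans` and `emis` are placeholder dictionaries of smoothed distributions with a `.prob` method (NLTK distributions also offer `.logprob`), and "&lt;s&gt;"/"&lt;/s&gt;" stand for the sentence markers.

```python
import math

def viterbi_tag(sentence, tags, trans, emis):
    """Most probable tag sequence, computed with log probabilities."""
    # delta[t]: best log probability of a tag sequence for the prefix
    # seen so far that ends in tag t; back[i][t]: its predecessor tag.
    delta = {t: math.log(trans["<s>"].prob(t)) + math.log(emis[t].prob(sentence[0]))
             for t in tags}
    back = []
    for word in sentence[1:]:
        prev_delta = delta
        step = {}
        delta = {}
        for t in tags:
            # Best predecessor for tag t at this position.
            best = max(tags, key=lambda p: prev_delta[p] + math.log(trans[p].prob(t)))
            step[t] = best
            delta[t] = (prev_delta[best] + math.log(trans[best].prob(t))
                        + math.log(emis[t].prob(word)))
        back.append(step)
    # Fold in the transition to the end-of-sentence marker.
    last = max(tags, key=lambda t: delta[t] + math.log(trans[t].prob("</s>")))
    # Follow back-pointers to recover the sequence.
    seq = [last]
    for step in reversed(back):
        seq.append(step[seq[-1]])
    return list(reversed(seq))
```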
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token
individually. That is, for each i, we compute:

    t̂_i = argmax_{t_i} ∑_{t_1 ⋯ t_{i−1} t_{i+1} ⋯ t_n} ( ∏_{k=1}^{n} P(t_k | t_{k−1}) · P(w_k | t_k) ) · P(t_{n+1} | t_n)
To compute this effectively, we need to use forward and backward values, as discussed
in the lectures on the Baum-Welch algorithm, making use of the fact that the above is
equivalent to:
    t̂_i = argmax_{t_i} ( ∑_{t_1 ⋯ t_{i−1}} ∏_{k=1}^{i} P(t_k | t_{k−1}) · P(w_k | t_k) ) · ( ∑_{t_{i+1} ⋯ t_n} ∏_{k=i+1}^{n} P(t_k | t_{k−1}) · P(w_k | t_k) ) · P(t_{n+1} | t_n)
The computation of forward values is very similar to the Viterbi algorithm, so you
may want to copy and change the code you already had, replacing statements that
maximise by corresponding statements that sum values together. Computation of
backward values is similar to computation of forward values.
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
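The forward recurrence and the log-sum-exp trick can be sketched together as follows; this is not the provided logsumexptrick.py, but an illustration of the same idea. As before, `trans` and `emis` are assumed placeholder dictionaries of distributions with a `.prob` method.

```python
import math

def logsumexp(logps):
    """log(sum(exp(x) for x in logps)), computed without underflow:
    factor out the maximum before exponentiating."""
    m = max(logps)
    if m == float("-inf"):           # all terms are zero probability
        return m
    return m + math.log(sum(math.exp(x - m) for x in logps))

def forward(sentence, tags, trans, emis):
    """alpha[i][t]: log of the summed probability of all tag sequences
    for the prefix ending at token i that end in tag t. The Viterbi
    recurrence with max replaced by a (log-space) sum."""
    alpha = [{t: math.log(trans["<s>"].prob(t)) + math.log(emis[t].prob(sentence[0]))
              for t in tags}]
    for word in sentence[1:]:
        prev = alpha[-1]
        alpha.append({t: logsumexp([prev[p] + math.log(trans[p].prob(t))
                                    for p in tags])
                         + math.log(emis[t].prob(word))
                      for t in tags})
    return alpha
```

Backward values follow the same pattern, run from the end of the sentence towards the start, with the transition to ⟨/s⟩ as the base case.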
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
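The evaluation itself is a token-level accuracy; a sketch, assuming each algorithm is wrapped as a function from a list of tokens to a list of tags (the names here are illustrative, not from the provided code):

```python
def accuracy(test_sents, tagger):
    """Percentage of tokens whose predicted tag equals the gold tag.

    test_sents: list of sentences, each a list of (word, gold_tag) pairs;
    tagger: function from a list of tokens to a list of tags.
    """
    correct = total = 0
    for sent in test_sents:
        words = [w for w, _ in sent]
        gold = [g for _, g in sent]
        predicted = tagger(words)
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(sent)
    return 100.0 * correct / total
```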
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but please do not include the entire set of hundreds of treebanks from Universal Dependencies, because this would be a huge waste of disk space and bandwidth for the marker.
Marking is in line with the General Mark Descriptors (see pointers below). Evidence of an acceptable attempt (up to 7 marks) could be code that is not functional but
nonetheless demonstrates some understanding of POS tagging. Evidence of a reasonable attempt (up to 10 marks) could be code that implements Algorithm 1. Evidence
of a competent attempt addressing most requirements (up to 13 marks) could be fully
correct code in good style, implementing Algorithms 1 and 2 and a brief report. Evidence of a good attempt meeting nearly all requirements (up to 16 marks) could be
a good implementation of Algorithms 1 and 2, plus an informative report discussing
meaningful experiments. Evidence of an excellent attempt with no significant defects
(up to 18 marks) requires an excellent implementation of all three algorithms, and a
report that discusses thorough experiments and analysis of inherent properties of the
algorithms, as well as awareness of linguistic background discussed in the lectures. An
exceptional achievement (up to 20 marks) in addition requires exceptional understanding of the subject matter, evidenced by experiments, their analysis and reflection in
the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding
rather than for lots of busywork. Especially understanding of language and how
language works from the perspective of the HMM model is what this practical
should be about.
• Avoid Python virtual environments. These blow up the size of the files that
markers need to download. If you feel the need for Python virtual environments,
then you are probably overdoing it, and mistake this practical for a software
engineering project, which it most definitely is not. The code that you upload
would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will
likely have installed already, but avoid anything more exotic. Assume a version
of Python3 that is the one on the lab machines or older; the marker may not
have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.