subject
Computers and Technology, 03.03.2020 01:28 196336

In word segmentation, you are given as input a string of alphabetical characters ([a-z]) without whitespace, and your goal is to insert spaces into this string such that the result is the most fluent according to the language model.

a. Consider the following greedy algorithm: Begin at the front of the string. Find the ending position for the next word that minimizes the language model cost. Repeat, beginning at the end of this chosen segment.
Show that this greedy search is suboptimal. In particular, provide an example input string on which the greedy approach would fail to find the lowest-cost segmentation of the input.
In creating this example, you are free to design the n-gram cost function (both the choice of n and the cost of any n-gram sequences) but costs must be positive and lower cost should indicate better fluency. Note that the cost function doesn't need to be explicitly defined. You can just point out the relative cost of different word sequences that are relevant to the example you provide. And your example should be based on a realistic English word sequence — don't simply use abstract symbols with designated costs.

b. Implement an algorithm that, unlike greedy, finds the optimal word segmentation of an input character sequence. Your algorithm will consider costs based simply on a unigram cost function.
Before jumping into code, you should think about how to frame this problem as a state-space search problem. How would you represent a state? What are the successors of a state? What are the state transition costs? (You don't need to answer these questions in your writeup.)
Fill in the member functions of the SegmentationProblem class and the segmentWords function. The argument unigramCost is a function that takes in a single string representing a word and outputs its unigram cost. You can assume that all the inputs would be in lower case. The function segmentWords should return the segmented sentence with spaces as delimiters, i. e. ' '.join(words).
For convenience, you can actually run python submission. py to enter a console in which you can type character sequences that will be segmented by your implementation of segmentWords. To request a segmentation, type seg mystring into the prompt. For example:

>> seg
Query (seg):
this is not my beautiful house

Console commands other than seg — namely ins and both — will be used for the upcoming parts of the assignment. Other commands that might help with debugging can be found by typing help at the prompt.
Hint: You are encouraged to refer to NumberLineSearchProblem and GridSearchProblem implemented in util. py for reference. They don't contribute to testing your submitted code but only serve as a guideline for what your code should look like.
Hint: the final actions that ucs (a UniformCostSearch object) takes can be accessed through ucs. actions.

ansver
Answers: 2

Other questions on the subject: Computers and Technology

image
Computers and Technology, 24.06.2019 03:00, greenhappypiggies
Using a conditional expression, write a statement that increments numusers if updatedirection is 1, otherwise decrements numusers. ex: if numusers is 8 and updatedirection is 1, numusers becomes 9; if updatedirection is 0, numusers becomes 7.
Answers: 1
image
Computers and Technology, 24.06.2019 12:00, tipbri6380
An npn transistor is correctly biased and turned on if the a. base is negative. b. collector is negative. c. collector is positive with respect to the emitter and negative with respect to the base. d. collector is the most positive lead followed by the base.
Answers: 1
image
Computers and Technology, 25.06.2019 03:00, reearamrup27
Match the categories in the first column with examples in the second column. 1. good for watching movies 2. maximum power with small size 3. older style mobile devices that may or may not have internet connectivity tablet computer a.)pda b.)smart phone c.)tablet computer
Answers: 1
image
Computers and Technology, 25.06.2019 06:20, joe7977
If you want to change the speed of a layer's horizontal scrolling, what should you change? a. the x coefficient b. the y coefficient c. the virtual width d. the order of the game's layers select the best answer from the choices provided
Answers: 2
You know the right answer?
In word segmentation, you are given as input a string of alphabetical characters ([a-z]) without whi...

Questions in other subjects:

Konu
English, 30.06.2019 01:00