subject
Mathematics, 12.11.2019 00:31 124319

For a given day i, we let yi = 1 if the ground-level ozone concentration near some city (houston, in our data) is at a dangerously high level. this is called an "ozone day". we let yi = 0if the ozone concentration is low enough to be considered safe. we want to predict yi from more easily measured "features" describing atmospheric pollutant levels and meteorological conditions (temperature, humidity, wind speed, there are a total of m = 72 of these features collected each day, which we denote by xi = {xij j = 1, m}. each feature xij e r is a real number, and we will thus use a gaussian distribution to model these continuous random variables. we will build a "naive bayes" classifier, which predicts observation i to be an ozone day if p(yį = 1| xi) > p(yi = 0 | x; ), and a non-ozone day otherwise.
using bayes rule, this classifier is equivalent to one that chooses y; = 1 if and only if py(1)fx|v(xi | 1) py(0)fx|v(xi|0) fx(x; ) fx(xi) in py(1) + in fxy(xi|1) > in py(0) + in fx|v(xi|0).

in this equation, py(yi) is the probability mass function that defines the prior probability of ozone and non-ozone days. the conditional probability density function fxy(ti yi) describes the distribution of the m = 72 environmental features, which we assume depends on the type of day. we make two simplifying assumptions about these densities: the features x; are conditionally independent given y, and their distributions are gaussian. thus:

м. fx y(xi| 1) = ii 1 exp{_ (tij – h1; 1.1 2007, exp{-"*"2013" }; fx/y(xi|0) = ii 1 17270 exp exp{-(tij – 203} given y; = 1, xij is gaussian with mean mi; and variance on. given y, = 0, xij is gaussian with mean mo and variance go. there are a total of 2m mean parameters and 2m variance parameters, since every feature xij has a distinct distribution for each of the two classes.

a) derive equations for in fxy (u; | 1) and in fxy(x; 0), the (natural) logarithms of the conditional probability density functions in equations (2,3). for numerical robustness, simplify your answer so that it does not involve the exponential function.

because ozone days are relatively rare, a classifier that always predicts yi = 0 would be correct over 95% of the time, but would obviously not be practically useful for reducing ozone hazard. to evaluate our classifiers, we will thus separately compute the numbers of false alarms (predictions of ozone days when in reality y = 0) and missed detections (predictions of non-ozone days when in reality y; = 1). we are willing to allow some false alarms as long as there are very few missed detections. for all parts below, assume that the mean parameters muje moj are set to match the mean of the empirical distribution of the training data. the demo code computes these means.

b) start by assuming the classes are equally probable (py(1) = py(0) = 1/2), and have unit variance (01 = 0; = 1). write code to compute the log conditional densities from part (a). then using equation (1), classify each test example. report your classification accuracy, and the numbers of false alarms and missed detections. hint: your classifer should have fewer than 10 missed detections.

c) rather than assuming features have variance one, set the variance parameters oli, equal to the variance of the empirical distribution of the training data. classify each test example using equation (1) with these variance estimates. report your classification accuracy, and the numbers of false alarms and missed detections.

d) rather than assuming the classes are equally probable, estimate py(1) as the fraction of training examples that are ozone days. classify each test example using equation (1) with this informative class prior, and the variances from part (c). report your classification accuracy, and the numbers of false alarms and missed detections.

ansver
Answers: 1

Other questions on the subject: Mathematics

image
Mathematics, 21.06.2019 15:40, pookie879
What term best describes a line and a point that lie in the same plane? a. congruent b. coplanar c. collinear d. equal
Answers: 1
image
Mathematics, 21.06.2019 16:40, shikiaanthony
You have 3 boxes, one "strawberries"; one "mentos" and one "mixed".but you know that all the labels are in incorrect order .how do you know witch is witch?
Answers: 1
image
Mathematics, 21.06.2019 19:00, 592400014353
The test scores of 32 students are listed below. construct a boxplot for the data set and include the values of the 5-number summary. 32 37 41 44 46 48 53 55 57 57 59 63 65 66 68 69 70 71 74 74 75 77 78 79 81 82 83 86 89 92 95 99
Answers: 1
image
Mathematics, 21.06.2019 19:50, gomez36495983
Raj encoded a secret phrase using matrix multiplication. using a = 1, b = 2, c = 3, and so on, he multiplied the clear text code for each letter by the matrix to get a matrix that represents the encoded text. the matrix representing the encoded text is . what is the secret phrase? determine the location of spaces after you decode the text. yummy is the corn the tomato is red the corn is yummy red is the tomato
Answers: 2
You know the right answer?
For a given day i, we let yi = 1 if the ground-level ozone concentration near some city (houston, in...

Questions in other subjects:

Konu
Social Studies, 01.11.2021 21:40
Konu
Mathematics, 01.11.2021 21:50
Konu
Mathematics, 01.11.2021 21:50