
Assume you have the following code:

/* Accumulate in temporary */
void inner4(vec_ptr u, vec_ptr v, data_t *dest)
{
    long int i;
    int length = vec_length(u);
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    for (i = 0; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
Suppose you modify the code to use 4-way loop unrolling and four parallel accumulators. Measurements of this function on the x86-64 architecture show that it achieves a CPE (cycles per element) of 2.0 for all data types.
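For reference, the modification described above might look roughly like the following. This is only a sketch under assumptions: the name inner4_unroll4x4 is hypothetical, data_t is set to long purely for illustration, and the function takes raw arrays and a length instead of going through vec_length and get_vec_start, which the question does not define.

```c
/* Assumption: data_t is the element type; long is chosen only for illustration. */
typedef long data_t;

/* Sketch of 4-way unrolling with four parallel accumulators.
   The name inner4_unroll4x4 and the raw-array interface are hypothetical. */
void inner4_unroll4x4(const data_t *udata, const data_t *vdata,
                      long length, data_t *dest)
{
    long i;
    long limit = length - 3;   /* last index at which a full group of 4 fits */
    data_t sum0 = (data_t) 0;
    data_t sum1 = (data_t) 0;
    data_t sum2 = (data_t) 0;
    data_t sum3 = (data_t) 0;

    /* Main loop: four independent dependency chains per iteration */
    for (i = 0; i < limit; i += 4) {
        sum0 = sum0 + udata[i]     * vdata[i];
        sum1 = sum1 + udata[i + 1] * vdata[i + 1];
        sum2 = sum2 + udata[i + 2] * vdata[i + 2];
        sum3 = sum3 + udata[i + 3] * vdata[i + 3];
    }

    /* Finish any remaining elements one at a time */
    for (; i < length; i++) {
        sum0 = sum0 + udata[i] * vdata[i];
    }

    /* Combine the partial sums */
    *dest = sum0 + sum1 + sum2 + sum3;
}
```

Each iteration still loads eight values (four from udata, four from vdata), so the number of loads per element is unchanged by the unrolling. For floating-point data_t, splitting the sum across accumulators reorders the additions, so the result may differ slightly in rounding from the original loop.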

Assuming the model of the Intel i7 architecture shown in class (one branch unit, two arithmetic units, one load unit, and one store unit), the performance of this loop with any arithmetic operation cannot get below 2.0 CPE because of:
a) the number of available registers
b) the number of available load units
c) the number of available integer units
d) the number of available floating point units
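One way to reason about the 2.0 floor is a simple throughput bound. The sketch below assumes that the single load unit can start at most one load per clock cycle (an assumption about the class model, not stated in the question) and uses the fact that each element requires two loads, udata[i] and vdata[i]:

```latex
\[
\text{CPE} \;\ge\; \frac{\text{loads per element}}{\text{load units} \times \text{loads per unit per cycle}}
\;=\; \frac{2}{1 \times 1} \;=\; 2.0
\]
```

Under that assumption the bound does not depend on the arithmetic operation or on the data type, which is consistent with the measured CPE of 2.0 for all data types.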

When the same 4x4 code is compiled for the IA32 architecture, it achieves a CPE of 2.75, worse than the CPE of 2.25 achieved with just four-way unrolling. The most likely reason this occurs is:
a) the number of available registers
b) the number of available load units
c) the number of available integer units
d) the number of available floating point units
