Question

Given:

64-bit logical address

Page size = 4 KB

Page table entry size = 4 bytes

Bonus #8

Find how many levels of hierarchical paging are needed. Follow the same approach as in the attached sample exercise.

Fig: 1
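One common way to work this out is sketched below in Python. It assumes that the page table at every level must fit in a single page and that all sizes are powers of two. The attached sample exercise is not reproduced here, so its method may differ in detail, and the function name `paging_levels` is purely illustrative.

```python
def paging_levels(address_bits: int, page_size: int, pte_size: int) -> int:
    """Levels needed if each page-table node must fit in one page (assumption)."""
    offset_bits = page_size.bit_length() - 1          # 4 KB page -> 12 offset bits
    entries_per_table = page_size // pte_size         # 4096 / 4 = 1024 entries
    index_bits = entries_per_table.bit_length() - 1   # 1024 entries -> 10 bits per level
    page_number_bits = address_bits - offset_bits     # 64 - 12 = 52 bits to translate
    # Ceiling division: a partly used outermost table still counts as a level.
    return -(-page_number_bits // index_bits)


levels = paging_levels(64, 4 * 1024, 4)
print(levels)  # 6
```

Under these assumptions the 52 page-number bits split into five 10-bit indices plus a 2-bit outermost index, which gives six levels.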


Most Viewed Questions Of Computer Organisation And Architecture

5. (16 points) Consider deadlock avoidance with single resources. Assume we have 4 processes P1, P2, P3, P4 and 4 resources r1, r2, r3, r4. All 4 processes are writers (i.e. no resource can be shared among processes). Suppose the processes acquire and release the resources in the following order, and assume the OS has complete knowledge of the schedule beforehand (we only show the first 8 requests). If a request is denied, the process immediately quits and releases all resources it is holding or waiting for (if any). If a process has to wait for a resource because another process is holding it, the avoidance algorithm can still grant the request and allow that process to wait. All waiting queues for each resource are FIFO. a. Draw the resource allocation graph before the first request is made. b. Will the third request (P3 requests r3) be granted? Explain your answer using the resource allocation graph. c. Is there any request after that one that will be denied? If so, show the first request that will be denied and explain why using the resource allocation graph. If not, show the resource allocation graph at the end of the 8th request.


Consider the assembly code below and assume that AX initially contains ABCD: MOV BX, 1234; MOV AX, [BX]; MOV AX, BX. The contents of register AX after the execution of the above code are: a. unknown b. AX = (1234)16 c. AX = (ABXX)16, where X is any hex value (0 - F) d. AX = (XX00)16, where X is any hex value (0 - F)


Consider the assembly code below and assume that AX initially contains ABCD: MOV BX, 1234; MOV AL, [BX]; ADD AL, CL. The contents of register AX after execution of the above code are: a. AX = (ABCD)16 b. unknown c. AX = (0000)16 d. AX = (ABXX)16, where X is any hex value (0 - F) e. AX = (XX00)16, where X is any hex value (0 - F)
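A minimal Python sketch of the register behaviour in this question, assuming 8086-style 16-bit registers where AL is the low byte of AX and all literals are hexadecimal. The memory byte at [BX] and the value of CL are not given, so the sketch tries every possible value. The helper `run` is hypothetical.

```python
def run(ax: int, mem_byte: int, cl: int) -> int:
    """Model MOV AL, [BX] followed by ADD AL, CL on a 16-bit AX."""
    ah = (ax >> 8) & 0xFF        # high byte of AX is never written by these instructions
    al = mem_byte & 0xFF         # MOV AL, [BX]: AL takes the (unknown) memory byte
    al = (al + cl) & 0xFF        # ADD AL, CL: any carry goes to the flags, not into AH
    return (ah << 8) | al


# Collect the high byte of AX over every possible memory byte and CL value.
high_bytes = {run(0xABCD, m, c) >> 8 for m in range(256) for c in range(256)}
print(sorted(hex(h) for h in high_bytes))  # ['0xab']
```

Under these assumptions the high byte stays AB while the low byte depends on unknown values, which corresponds to the (ABXX)16 form.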


8) For a vector addition, assume that the vector length is 8000, each thread calculates one output element, and the thread block size is 1024 threads. The programmer configures the kernel launch to have a minimal number of thread blocks to cover all output elements. How many threads will be in the grid? (a) 8000 (b) 8196 (c) 8192 (d) 8200
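The grid size here follows from rounding the element count up to a whole number of blocks. A short Python sketch of that arithmetic (the function name `threads_in_grid` is illustrative, not part of any CUDA API):

```python
def threads_in_grid(n_elements: int, block_size: int) -> int:
    """Total threads launched when using the fewest blocks that cover n_elements."""
    blocks = -(-n_elements // block_size)   # ceiling division: 8000 / 1024 -> 8 blocks
    return blocks * block_size              # every block launches its full thread count


print(threads_in_grid(8000, 1024))  # 8192
```

The last block is only partly needed, which is why kernels of this kind usually guard the element access with a bounds check on the global index.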


Consider the following sequence of instructions to compute x² + 4x + 1 for each element x in a vector stored in V0.
    multvv.d V1, V0, V0   # V1 = x^2
    multvs.d V2, V0, 4    # V2 = 4x
    addvv.d  V3, V1, V2   # V3 = x^2 + 4x
    addvs.d  V4, V3, 1    # V4 = x^2 + 4x + 1
Assume that we have a vector processor with two vector multiplication units whose latency is 7 and two vector addition units whose latency is 6. Let n = 32 represent the length of the vector supported on our processor. The processor has fully pipelined vector execution units. The vector processor supports chaining. How many clock cycles will this instruction sequence take? a. 48 cycles b. 32 cycles c. 52 cycles d. 120 cycles e. 51 cycles f. 100 cycles g. 200 cycles h. 150 cycles i. 74 cycles


Consider the assembly code below and assume that all memory locations below EA = 490A0 store the value (35)16, and higher addresses store (45)16. DS = 4000; DI = 90A0; DX = 0000 (initially). MOV AX, 5678; MOV BX, 1232; ADD AX, BX; MOV CX, [DI+8]; ADD DX, AX. The contents of registers AX and DX after execution of the above code are one of: AX = (5678)16 and DX = (0000)16; AX = (68AA)16 and DX = (68AA)16; AX = (AA68)16 and DX = (68AA)16; AX = (68AA)16 and DX = (0000)16.
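The register arithmetic in this question can be traced with a few lines of Python, assuming 16-bit registers and hexadecimal literals (an assumption based on the hex notation in the answer options). The memory load only changes CX, so it is noted in a comment and otherwise skipped.

```python
MASK16 = 0xFFFF          # registers are assumed to be 16 bits wide

ax = 0x5678              # MOV AX, 5678
bx = 0x1232              # MOV BX, 1232
dx = 0x0000              # DX is initially 0000

ax = (ax + bx) & MASK16  # ADD AX, BX
# MOV CX, [DI+8] loads CX from memory; AX and DX are unaffected.
dx = (dx + ax) & MASK16  # ADD DX, AX

print(f"AX={ax:04X} DX={dx:04X}")  # AX=68AA DX=68AA
```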


Q5. [5 Pts: 2, 3 pts, 10 minutes] 1. Write a CUDA kernel incrementing a float array A of size N for a 1D grid, using 1D thread blocks, and assuming that each thread increments one element. Solution: __global__ void increment_On_Device(float *A_d, int N) {


Design and develop a piece of software which interacts directly with computer hardware, including parallel architectures. You are required to deliver a software solution with a report (1500 words). You should ensure the following are included in your development (this list is not exhaustive):
Part A (Design, implement and evaluate programs):
• You can select an application of your choice and parallelize it (e.g. image filtering, discrete wavelet transform, matrix multiplication, discrete cosine transform, etc.).
• You can use any programming language (Python, Java, C/C++, etc.) with which you are conversant, and submit your source code.
• You are free to use any hardware (CPU, GPU, or APU).
• You are free to use any operating system (Linux, Windows, etc.).
• You can use external libraries such as OpenMP, OpenCL, CUDA, etc.
Part B (Report, 1500 words): You are required to submit a report of about 1500 words along with the code (both the sequential version and the parallel version). You also need to provide a demo video/presentation of your working code. Your report should contain at least the following information:
• Summary or Introduction
• Programming language and hardware details: in this section, you should include details about the programming language and hardware. This is also the section in which to mention external libraries.


9) Assume that a kernel is launched with 1000 thread blocks, each of which has 512 threads. If a variable is declared as a shared memory variable, how many versions of the variable will be created through the lifetime of the execution of the kernel? (a) 1 (b) 1000 (c) 512 (d) 512000
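In the CUDA memory model, a shared memory variable has one instance per thread block, while an automatic (register) variable has one instance per thread. A small Python sketch of that counting rule, with a hypothetical helper `variable_instances` and scope labels chosen only for illustration:

```python
def variable_instances(num_blocks: int, threads_per_block: int, scope: str) -> int:
    """Count how many copies of a variable exist over a kernel launch."""
    if scope == "shared":
        return num_blocks                      # one copy per thread block
    if scope == "thread":
        return num_blocks * threads_per_block  # one copy per thread
    raise ValueError(f"unknown scope: {scope}")


print(variable_instances(1000, 512, "shared"))  # 1000
print(variable_instances(1000, 512, "thread"))  # 512000
```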


Assume that the vector processor supports chaining. How long would it take the original loop to execute on this processor? a. 52 cycles b. 148 cycles c. 74 cycles (marked wrong) d. 200 cycles (marked wrong) e. 32 cycles f. 48 cycles g. 120 cycles h. 100 cycles i. 150 cycles