Question

1 Introduction

Analyzing the Mariana Trench

1.1 The Mariana Trench

The Mariana Trench is the deepest trench in the world, lying in the Pacific Ocean between Japan and Papua New Guinea. Because it is

so deep, and the sides so steep, it is a unique environment on earth and the subject of much scientific inquiry.

The United States National Oceanic and Atmospheric Administration (NOAA) has conducted a series of bathymetric investigations

of the Mariana Trench, collecting high-resolution data of the region. This data is often difficult to work with, containing an extremely

high number of data points. You've been asked to reduce the size of the data in a way that

maintains the overall structure of the trench

and perform some scientific analysis.

1.2 The Singular Value Decomposition

To perform this science, we will need to compute the Singular Value Decomposition (SVD) of the dataset. The SVD is a special

factorization of a given matrix A. Similarly to standard numbers, which can be factored several ways, for example,

123-4-3-2-2 6-4 11/11

a matrix can have many different factorizations, with many different purposes and benefits. The SVD factoriz

A into three parts,

223

A = UEVT,

where U and V are orthogonal matrices (that is, UTU = I, UUTI, VTV = I and VVT = I) and Σ is a diagonal matrix.

A is mxn

U is mxm

Σ is mxn

VT is nxn

Each of the three factor matrices tells you information about the matrix. Σ contains the "singular values", which are the square roots of

the eigenvalues of ATA, and the columns of V are the associated eigenvectors. U holds the same information about the eigenvectors of

AAT.

breaks a given matrix

Physically speaking, this representation tells you how the matrix A acts on vectors. The columns of V represent a basis for R",

and the columns of U are a basis for R". A is a matrix which maps the basis V to the basis U, scaled by the singular values on the

diagonal of E. Written mathematically, that is

AX = A (G₁V₁ +₂V₂ + ... + ₁₂V₁) = c

V) = c2U + c2U,+...+c Στη

1.3 The Incomplete SVD

This representation is helpful for more than just a deeper understanding of the matrix A, though. If we order the singular values and

vectors so that Σ₁1 ≥ 22 ..., then we can make the approximation

C₁E11Ū₁ + €₂22Ū₂ + ··· + CkEkkŰk + ... +€mΣmmŪm ≈ 1 Σ11Ū₁ + €₂Σ22Ū₂ + ... + CkEkkŰk

for some k

the matrix A works so that we are losing as little information as possible.

CONTINUED/nThis is called the Incomplete Singular Value Decomposition. The core idea behind using the Incomplete SVD is that the columns of

U and V corresponding to large diagonal elements of Σ essentially matter more, and we can save a lot of energy by using tall, skinny

approximations to U and V which throw away the columns we don't care about.

A is mxn

U is mxk

A = importdata ('mariana_depth.csv');

lor = importdata ('mariana_longitude.csv

lat importdata ('mariana_latitude.csv

Σ is kxk

1.4 Acquiring the data

NOAA hosts bathymetric datasets of the Mariana trench. A cleaned and lightly sampled version of that dataset has been posted to the

Canvas page for this course. The units for the depths are meters with latitude and longitude in degrees. The original data set is 25 times

the size given here, and is difficult to work with on personal computers. Download the three files

mariana_depth.csv

mariana_latitude.csv

mariana_longitude.csv

from the Project Module in Canvas and put them in your working directory. CSV files are easy to import into

For example, in MATLAB one would use the commands

VT is kxn

coding languages.

2 Questions

When writing your report, do not simply number and answer the questions below. sure you answer all the questions, but remember

your write-up needs to have the look and feel of a scientific report.

2.1 Investigating the Trench

1. Import the data, and display it as an image in your report. Be sure to label your axes and have latitude increase to the north (in the

positive y-direction) and longitude increase to the east (in the positive x-direction).

2. How deep is the deepest point sampled in the trench? What are its latitude and longitude?

3. The ocean floor can be nominally defined in this region at a depth of 6 kilometers. What is the average depth of the trench, that is,

the mean depth among all points deeper than the ocean floor?

2.2 Computing Eigenvectors

Before running your code on the trench depth data, it is good to develop the code using a smallish matrix for which you know the

eigenvalues and eigenvectors. Do not include the testing data in the report.

CONTINUED

1. Code your own algorithm to find the first eigenvector and eigenvalue of ATA, where A denotes the depth matrix. To do this,

(a) First begin with a random guess vector of the correct size and magnitude one.

(b) Apply ATA to that vector, and then divide the result by its magnitude to make sure it's still magnitude one. That is your

updated guess.

(c) Repeat the two steps above until the vector stops changing./nPut another way.

(a) ₁ = a random unit vector (magnitude 1).

(b) un+1 = A¹Aun/A¹Aun||

(c) Repeat until n+1- un is small. It should only take around ten steps.

Note that the notation ||u| denotes the norm (magnitude) of the vector u that has N components, and is defined as

Call the eigenvector you have found at the end V₁. Provide the eigenvalue and a plot of the eigenvector V₁ in your report. The

z-axis of this plot should go from 1 to N, and the y-axis should be the corresponding value of the component of the eigenvector.

Also provide a hypothesis for why this method of finding the eigenvector works. HINT: Note that the eigenvectors of ATA

constitute a basis of R", and try expressing u, in terms of that basis.

2. We can also compute the i largest eigenvalues and associated eigenvectors of the system by using the fact that every eigenvector of

a symmetric matrix (such as ATA) must be orthogonal to all the previous ones, with a process called Gram-Schmidt Orthogonal-

ization. The basic idea is that we make sure that the current eigenvector we are building is orthogonal to every other eigenvector

we have built so far. In other words, if we are currently working on constructing V, un,

V₁ = ₁V₂ = = V₁-1=0.

From a next-guess vector +1 = ATA, that we have not touched with this Gram-Schmidt method yet,

it to the i previously computed eigenvectors V₁, V2, ... V, by computing

un+1 = u n+1

i-1

-Σ(˜‚+ V‚) V₂.

j=1

These components we are subtracting from +1 are called the "projection of +1 onto

Using this idea, alter your code to compute the i= 50 largest eigenvalues and associated

ization step should be done after you multiply your previous guess vector by ATA, but before you divide your new guess by its

own magnitude.

(a) Initialize V₁ to V50 as a matrix of zeros.

(b) For i = 1 to 50,

eigenvectors of ATA. This orthogonal-

i. ₁ a random unit vector (magnitude 1)

ii. +1A¹Au,

iii. Un+1 = +1-EV

iv. Un+1 = Un+1/|un+1||

v. repeat ii-iv until || un+1 - || is small (less than 10-² or 10-3 should be fine for us), and store the final un+1 as V₁.

CUoulder

Store the eigenvectors you compute as the columns of a 1440 x 50 matrix and call it V. Provide a semilog plot of the eigenvalues

you find.

2.3 The Incomplete SVD Decomposition

1. Construct a 50 x 50 matrix with the square roots of the eigenvalues you found in the previous part on the diagonal. Call that

matrix E. Likewise, define a matrix U so that each column of U is equal to A times the associated column of V, divided by the

associated element of Σ.

2. The V, E and U matrices you have constructed constitute the incomplete SVD decomposition of the matrix A. The idea is that

we've captured the relevant features of A without using so much storage. To test this, first calculate how many numbers we need

to save in total between the U. Σ and V matrices we've built, and compare that to the total number of elements of A.

3. Now, provide a picture like you did of the matrix A in Question 2.1, but now of the matrix product UZVT. Describe the

relationship between these two images.

4. Try using fewer columns of U and V and a similarly smaller and see how the image holds up. If your computer can handle it,

try using more than 50. Describe how the number of singular values used affects the image.

CONTINUED/n3 Submission

Your group's report needs to be submitted to Gradescope as a single pdf file. Add all groups members to the Gradescope submission in

addition to including them in the actual report. Any code that you include should be included within the single pdf as an appendix. Be

sure to type up your final report. Further details on project write-ups can be found in the Project Information Modele in Canvas.

Question image 1Question image 2Question image 3Question image 4