FINAL EXAM This is the evolving page of the "Final Exam" . New Rules: 1) You must work individually for the rest of the semester. 2) You can consult with me, but not with each other. 3) You must declare whatever external resources you use: web pages, books, papers, advice from external sources (in case you happen on someone who gives you unwanted advice). 4) You are under honor code to follow 1)-3). No longer post your answers on your web page. Send them to me by email instead. ---------------------------------------------------------------- Problems/Questions: 1) Python semantics: Interpret the following three lines of Python code: a = [1,2,3] b = a del a What are the two possible semantics of these three statements? Which one was chosen by the Python designer? What Python statements are needed to achieve the results of the second possibility? 2) The low-level nature of C: What would one have to do in C to reconstruct something like the following Python statement? a = [1,2,3] Consider the case where you know at compile time how long "a" is, and the more difficult case where you do not know. Explain in English; don't write a program. If you don't know enough C, explain what facility C would have to offer. 3) Write a Python one-liner that creates a list containing 100 samples from the standard Gaussian, conditional on being in the interval [0,2]. It is sufficient if you achieve the 100 size with very high probability. Once you try to solve the problem you will understand what this cryptic remark means. 4) Download the file "Numeric-22.0.tar.gz" by going to, for example, http://prdownloads.sourceforge.net/numpy/Numeric-22.0.tar.gz?use_mirror=telia Alternatively, go to "www.python.org/" -> Vaults of Parnassus -> math -> -> Numerical Python -> numpy ... "Download" -> Numeric-22.0.tar.gz -> Reston, VA (e.g.) Save to this directory: c:/cygwin/lib/python2.2/site-packages/ In a bash say: cd c:/cygwin/lib/python2.2/site-packages/ gunzip Numeric-22.0.tar.gz tar xfv Numeric-22.0.tar cd Numeric-22.0/ python setup.py install If this does not work, let me know. If it works, you should be able to run python and import the modules Numeric, LinearAlgebra, RandomArray Abbreviate them as "N", "L", and "R" in what follows. 5) Generate a Numeric matrix "mat1" of size 3x3 containing the integers 1...9. Define "mat2" to be the slice "mat1[0:2,:]". Assign 100 to "mat2[0,0]". Check "mat1". Then generate the same submatrix of "mat1" with "N.take" and call it "mat3". Assign 200 to "mat3[0,1]". Check "mat1". Comment on the semantics of slicing and "take"ing. 6) What is the largest size of Numeric vector (= 1-dim array) that you can reasonably create and sort in Python using Numeric? Once you run into memory errors, quit Python, re-enter and try again. How much physical memory does your machine have? How does the size of the largest array compare with the size of the memory? 7) Recall: A database index is a pairing of keys (= values of an attribute) and addresses (of cases) such that finding the addresses of cases with a given key is quick. We use indices that are sorted on the keys, for bisection search. Implement a class that is a database consisting of two attributes such as "attr1" and "attr2", implemented as vectors: attr1 = R.permutation(100000) + 10000000 attr2 = R.permutation(100000) + 20000000 The i'th case has values "attr1[i-1]" and "attr2[i-1]" on the two attributes. On instantiation with two attribute vectors, the class instance should build two indices for looking up the second attribute given a key on the first, and vice versa. Use the function "N.argsort" to build the indices, and the function "N.searchsorted" for methods that do the lookup. Implement each index not as a list of pairs, but as two conformal integer vectors, one for the sorted keys, the other for the corresponding addresses. 8) When implementing an orthogonal projection, it is not recommended to actually form P = U U^t, where U is an orthonormal frame. Write a function a) that takes an argument U and a vector x; b) that checks whether U is an orthonormal frame and x is conforming with U, and bails out if not; c) that projects x onto the space spanned by the columns of U without computing U U^t. 9) Someone who does not know multiple linear regression very well thinks that x_1 + x_2 is a very meaningful quantity and should be added to the predictors x_1, x_2,..., x_p as a p+1'st predictor. He doesn't tell you, but out of caution you run an svd on the Nx(p+1) matrix. What will you discover?