Assessment 1

  1. Classification accuracy is the most straightforward measure of classifier performance.
    Explain why we need so many other performance measures, such as sensitivity,
    specificity, and area under the curve. (5%) Provide an example where classification accuracy
    might be misleading and these other measures are more appropriate. (5%)
  2. List the limitations of the bag-of-words representation for document analysis. (5%)
  3. Describe what Term Frequency-Inverse Document Frequency (TF-IDF) weighting is and what
    problem it tries to solve. (5%)
  4. Describe the purpose of Principal Component Analysis (PCA) and how it works. (5%) Do you
    think PCA is a dimensionality reduction technique? (5%)
  5. Describe the purpose of t-SNE and how it works. (10%)
  6. In your own words, describe what deep learning is. (5%)
  7. A regular feedforward artificial neural network uses multiple layers of artificial
    neurons to map the input feature variables to the output variables, and the neurons in adjacent
    layers are fully connected with each other. The mapping from any layer (except the top layer)
    to the layer above consists of two operations: 1) a linear transformation (weighted sum); 2) a
    nonlinear activation. Explain why the nonlinear activation is needed (hint: think about what
    would happen if there were no nonlinear activation). (10%)
  8. As introduced in class, in a convolutional neural network (CNN), each neuron in a
    layer (except the top layer) is connected to only a subset of the neurons in the layer below.
    This subset corresponds to a local spatial region in the neuron matrix. Explain the benefits of
    this design compared with full cross-layer connections. (10%)
  9. Suppose we have an image illustrated as follows.
    The size of the image is 7×8, where each cell represents a pixel. The image is purely black and
    white: white pixels have intensity value 255, and black pixels have intensity value 0. If we
    use each of the following convolution filters to convolve the entire image, write down the
    resultant image after the convolution and explain the effect of each filter.
    (20% = 5% × 4)
    (a) (b) (c) (d)
  10. Both PCA and the autoencoder can be viewed as methods for learning new
    representations of data vectors. Perform a head-to-head comparison between PCA
    and the autoencoder and explain the potential pros and cons of each. (5%)
  11. An autoencoder has two phases: the encoding phase learns a compact representation of the
    original data vector through a multi-layer fully connected feedforward neural network, and the
    decoding phase maps the compact representation back to the original data space with the
    transpose (reverse) of the encoding network. The autoencoder is designed for vector-based data
    representations. Now think about images. Suppose we want to design a CNN-based autoencoder that
    first transforms the input image into a compact vector-based representation through several
    CNN layers (convolution + activation and pooling) in the encoding phase and then
    decodes that representation back to the original image space. The model parameters are
    optimized by minimizing the reconstruction loss between the original image and the decoded
    image. Describe how you would design the decoding phase and its key operations. (10%)
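
As a reminder of the mechanics behind the image-filtering question (question 9), the sliding-window operation can be sketched as below. This is only a minimal sketch and not the expected answer. The image layout (7 rows by 8 columns, left half black, right half white) and the 3×3 filter are hypothetical stand-ins, and the names convolve2d and vertical_edge are made up for illustration. It assumes stride 1 and no padding, and by default it applies the filter without flipping it, as CNN layers usually do; pass flip=True for the textbook definition of convolution.

```python
def convolve2d(image, kernel, flip=False):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    if flip:
        # Mathematical convolution rotates the kernel by 180 degrees first.
        kernel = [row[::-1] for row in kernel[::-1]]
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Weighted sum of the window whose top-left corner is (r, c).
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical image: 7 rows, 8 columns, left half black (0), right half white (255).
image = [[0] * 4 + [255] * 4 for _ in range(7)]

# Hypothetical 3x3 filter that responds to left-to-right intensity changes.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

result = convolve2d(image, vertical_edge)
for row in result:
    print(row)
```

With these assumed inputs the output is 5×6, and it is nonzero only where the window straddles the black-to-white boundary. The values fall outside the 0 to 255 range, so a written answer should state whether results are left as is, clipped, or rescaled.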
