FORECASTING ASSIGNMENT 2022

MATH6011
Your coursework must be submitted electronically via Blackboard by 3pm on Friday
March 25th. Any work handed in after this time will be subject to the following
penalties: 10% of your marks lost per working day up to 5 working days. Do not write
your names anywhere on your work, as marking will be anonymous. Your student
IDs should be included in the filenames but not your name; see further instructions
on file naming and labelling in Section 3 below. An extension, for bona fide reasons,
may be allowed by prior agreement, but only well before the deadline; you can contact
the Student Office if you would like to apply for an extension. Computer crashes or
file losses a day or two before the deadline will not be an acceptable reason for an
extension. It is therefore advisable to keep back-up copies of your work. Components
of the project will receive different weightings in producing your final mark: 40
marks for the exponential smoothing part, 20 for ARIMA, 20 for regression, 10 for
the presentation slide, and 10 marks for the overall organization of your submitted
material, including the description of your codes/files.
You are expected to complete the assessment in groups of 3 students; working alone or
in a group of 2 could be accepted if there is a valid reason to do so. Please email the
lecturers as soon as possible and no later than 2 weeks before the submission deadline
if you are not able to form a group of 3 to complete the assignment. All students in
each group will receive the same mark for their work; for any late submission, the whole
group will incur the penalties indicated above.

  1. Background and analysis
    In light of the recent United Nations Climate Change Conference that took place in Glasgow,
    Scotland, from 31 October to 13 November 2021, the UK government through its new Clean
    Green Initiative has employed you as a consultant. Your task is to forecast the behaviour
    of a number of key environmental indicators until December 2022, to help support the
    decision process for new policies to support the country’s efforts to reduce the impact of
    climate change. The data is provided by a number of public organizations, including the
    Meteorological (Met) Office and the Office for National Statistics (ONS).
    1.1. How to get the data. From the four weblinks given below, download the data sets
    and save them in xlsx or xls format. The resulting files might have multiple columns or
    sheets; follow the corresponding instructions to access the data necessary for your analysis.
    Copy the data sets from the required columns as described below; i.e., MSTA, CH4, GMAF,
    and ET12, scrolling down, where necessary, to find the monthly observations.
    (A) Global Mean Surface Temperature Anomaly (MSTA) in °C:
    https://www.metoffice.gov.uk/hadobs/hadcrut5/data/current/download.html
    MSTA: to get the data, see first table monthly box in the Global row; select the CSV-file type
    – the data is located in the Anomaly column.
    (Source: The Meteorological Office, abbreviated as the Met Office, which is the United Kingdom’s national weather service).
    (B) Global Monthly Atmospheric Carbon Dioxide Levels (CH4):
    https://gml.noaa.gov/webdata/ccgg/trends/ch4/ch4_mm_gl.txt
    CH4: see average monthly values in 4th column.
    (Source: Global Monitoring Laboratory of the USA National Oceanic and Atmospheric
    Administration, an American scientific and regulatory agency within the United States Department of Commerce).
    It is recommended that for the time series in (B), you copy the data into text files, using, for
    example, Notepad, and then open the text files using Excel, as space delimited. The files can
    then be saved as Excel workbooks.
    (C) International Passenger Survey, UK visits abroad (GMAF):
    https://www.ons.gov.uk/peoplepopulationandcommunity/leisureandtourism/datasets/internationalpassengersurveytimeseriesspreadsheet
    GMAF: select the xlsx file; see data in GMAF column, scrolling down to the monthly data.
    (Source: UK Office for National Statistics).
    (D) UK inland monthly energy consumption (ET12), in million tonnes of oil equivalent; the xls
    file can be downloaded by clicking on the corresponding entry at this link:
    https://www.gov.uk/government/statistics/total-energy-section-1-energy-trends
    ET12: use the data in the Total unadjusted column of the Month worksheet.
    (Source: Department for Business, Energy & Industrial Strategy).
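    As an alternative to opening the text file for series (B) in Excel, the same
    space-delimited layout can be read directly in Python. The sketch below is only an
    illustration: the function name parse_monthly_text is hypothetical, the sample rows are
    made-up placeholders rather than real measurements, and it assumes that the first two
    columns hold the year and month and that comment lines start with "#".

```python
from typing import List, Tuple


def parse_monthly_text(text: str, value_col: int = 3) -> List[Tuple[int, int, float]]:
    """Return (year, month, value) rows from whitespace-delimited text.

    Blank lines and lines starting with '#' are skipped (assumed to be comments).
    value_col is zero-based, so the default 3 selects the 4th column.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()  # split on any run of spaces or tabs
        rows.append((int(parts[0]), int(parts[1]), float(parts[value_col])))
    return rows


# Made-up placeholder rows, NOT real measurements.
sample = """# year month decimal average
2000 1 2000.042 10.0
2000 2 2000.125 10.5
"""

rows = parse_monthly_text(sample)
print(rows)
```

    In practice the text would come from the downloaded file (for example via
    open(path).read()), and the parsed rows could then be written to the required workbook.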
    1.2. Tasks. As so often happens in the real world, the data sets are of different lengths.
    You will have to use your own judgment in inspecting and preparing the data before carrying
    out any technical analysis. The analysis is in three parts:
    (a) You are asked to take all four series separately and to forecast monthly behaviour until
    December 2022, using exponential smoothing-type forecasting methods.
    (b) The Clean Green Initiative team have been satisfied in the past with exponential
    smoothing-type forecasting methods and are happy to see these techniques used in the
    analysis. However, they are interested in the possible use of the ARIMA methodology to
    predict MSTA. You are asked to fit an ARIMA model to MSTA and to compare ARIMA
    forecasting with an exponential smoothing method. You should make a recommendation
    as to the future use of ARIMA on this time series.
    (c) The Clean Green Initiative team is interested to know whether global temperatures (that
    is, series MSTA) are affected by carbon dioxide levels, international air travel, and the consumption of fuels (as exemplified by series CH4, GMAF, and ET12). Develop a multiple
    regression model, use it for prediction of MSTA until December 2022, and report on whether
    you think the model is satisfactory or not.
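    Because the four series are of different lengths, a regression such as the one in (c)
    needs them restricted to the months that all of them share. A minimal sketch of that
    alignment step follows; the function name align_on_common_months and the sample values
    are hypothetical, and each series is assumed to be stored as a dictionary keyed by
    (year, month).

```python
from typing import Dict, List, Tuple

Month = Tuple[int, int]  # (year, month)


def align_on_common_months(
    series: Dict[str, Dict[Month, float]]
) -> Tuple[List[Month], Dict[str, List[float]]]:
    """Keep only the months present in every series, in chronological order."""
    common = set.intersection(*(set(values) for values in series.values()))
    months = sorted(common)
    aligned = {name: [values[m] for m in months] for name, values in series.items()}
    return months, aligned


# Made-up placeholder values, NOT real observations.
data = {
    "MSTA": {(2000, 1): 0.1, (2000, 2): 0.2, (2000, 3): 0.3},
    "CH4": {(2000, 2): 5.0, (2000, 3): 6.0, (2000, 4): 7.0},
}

months, aligned = align_on_common_months(data)
print(months)
print(aligned)
```

    The aligned lists have equal length and matching order, so they can be passed as
    columns to whichever regression routine is used.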
  2. What you must produce
    You must produce a technical report describing all the analysis done to select the most suitable forecasting methods, as well as the results obtained. The report must be accompanied
    by a single-page slide summarizing your main results, and also the codes used to perform
    the technical analysis, as well as the resulting graphs. More details on each of the aspects
    of the work are given in the next subsections.
    2.1. The technical report. The technical report must follow the structure described in
    Subsection 2.5. It should address the three parts of the analysis: exponential smoothing,
    ARIMA, and regression. For each part, give details of the preliminary analysis, data preparation, models chosen and analysis carried out. Also describe why each model was built and
    explain the analysis carried out, including an evaluation of the effectiveness of the models.
    2.2. Single presentation slide. The executive board members of the Clean Green Initiative are particularly interested in knowing how the three methods (exponential smoothing,
    ARIMA, and regression) perform on the four variables mentioned above; i.e., MSTA, CH4,
    GMAF, and ET12. You are asked to produce a single-page slide summarizing the main
    results of your analysis, in order to enable them to quickly grasp the results without necessarily having to read your technical report. Where necessary, attention should be given to
    the comparison of the performance of the methods, while highlighting the best results. This
    slide will be judged on the suitability of its presentational style, clarity, and quality.
    2.3. Python codes. You must also prepare and submit python codes that you use to generate the results that will be included in your technical report. If any preliminary operations
    on your data are needed before applying/developing a python code for your analysis, it is
    fine to include this in the corresponding excel file containing your data sets. However, you
    must complete all the main tasks of your analysis using python. You can use the codes from
    the course, use different ones or develop your own. Marking on this aspect of your work will
    not be based on how well you can program in python, but rather on the functionality of your
    codes and their relevance in the corresponding analysis.
    To help us easily know what you do in each code, you must produce a single page document,
    as Appendix A to your technical report, giving a brief one- or two-sentence description of
    what each code does. If you do any preliminary operations on your data in the excel file containing
    your data set, a line or two should also be included to describe this.
    2.4. Analysis and forecast graphs. You are expected to produce graphs to illustrate your
    analysis in the technical report. Do not include these graphs in the main part of the report
    (Sections 1 – 3; see details in next subsection), but rather, put all of them in Appendix B.
    You are allowed up to 12 pages for the graphs produced for your analysis. Organize the
    graphs in three main parts, each corresponding to one of the main sections of the technical
    report. Also number each of your graphs accordingly to be able to easily refer to them, as
    necessary, in Sections 1, 2, and 3. You do not need to repeat graphs in Appendix B. For
    example, if you want to refer to a graph under the ARIMA section, which was already done
    in the section dedicated to exponential smoothing, you are encouraged to instead use the
    figure number of that specific graph rather than repeating the graph again.
    2.5. Organizing your technical report. The report must be organized as follows:
    1. Exponential smoothing (maximal length: 4 pages; total marks: 40)
    Marks to be attributed based on how well you articulate the following aspects:
    • Describe data preparation (and its effects) prior to the implementation of
    exponential smoothing methods.
    • Describe preliminary analysis undertaken (and conclusions drawn) prior to
    the implementation of exponential smoothing methods.
    • Give details of how exponential smoothing models were selected for each of
    the time series, and how effective these methods are at forecasting.
    • Clarity and quality of presentation.
    • Functionality of python codes.
    • Quality and suitability of illustrative or forecast result graphs.
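    As an illustration of the simplest member of the exponential smoothing family, the
    sketch below implements simple exponential smoothing and picks the smoothing constant
    from a small grid by minimising the sum of squared one-step-ahead errors. It is not
    taken from the course codes: the function names, the grid, and the sample series are
    hypothetical, and the level is assumed to be initialised at the first observation.
    Simple exponential smoothing has no trend or seasonal component, so series showing
    either would need a richer method such as Holt or Holt-Winters.

```python
from typing import List, Sequence


def simple_exp_smoothing(y: Sequence[float], alpha: float) -> List[float]:
    """Return levels l_t = alpha * y_t + (1 - alpha) * l_{t-1}, with l_0 = y_0."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    levels = [float(y[0])]
    for obs in y[1:]:
        levels.append(alpha * obs + (1.0 - alpha) * levels[-1])
    return levels


def sse_one_step(y: Sequence[float], alpha: float) -> float:
    """Sum of squared one-step-ahead errors; the forecast of y[t] is the level at t-1."""
    levels = simple_exp_smoothing(y, alpha)
    return sum((y[t] - levels[t - 1]) ** 2 for t in range(1, len(y)))


series = [10.0, 12.0, 11.0, 13.0]  # made-up values, NOT real data

levels = simple_exp_smoothing(series, alpha=0.5)
flat_forecast = levels[-1]  # the same value is forecast for every future month

grid = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical candidate smoothing constants
best_alpha = min(grid, key=lambda a: sse_one_step(series, a))
print(levels, flat_forecast, best_alpha)
```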
    2. ARIMA forecasting (maximal length: 2 pages; total marks: 20)
    Marks to be attributed based on how well you articulate the following aspects:
    • Describe any data preparation prior to ARIMA, and its effects.
    • Describe preliminary analysis undertaken prior to ARIMA modelling, and
    the conclusions drawn.
    • Give details of how an ARIMA model was selected, tested, and its effectiveness evaluated.
    • Compare ARIMA and exponential smoothing forecasting, both in general
    terms and in this particular instance.
    • Clarity and quality of presentation.
    • Functionality of python codes.
    • Quality and suitability of illustrative or forecast result graphs.
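    One common way to compare two forecasting methods in a particular instance is to hold
    back the last few observations, forecast them with each method, and compare error
    measures. The sketch below computes the mean absolute error (MAE) and root mean squared
    error (RMSE); the hold-out approach is an assumption rather than a requirement of this
    brief, and all numbers are made-up placeholders.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


# Made-up hold-out values and forecasts, NOT real results.
actual = [1.0, 2.0, 3.0]
forecast_a = [1.0, 2.0, 5.0]  # hypothetical method A: one large miss
forecast_b = [2.0, 3.0, 4.0]  # hypothetical method B: several small misses

for name, fc in (("A", forecast_a), ("B", forecast_b)):
    print(name, round(mae(actual, fc), 3), round(rmse(actual, fc), 3))
```

    In this toy case method A has the lower MAE but the higher RMSE, because RMSE penalises
    large individual errors more heavily; reporting both measures makes such differences
    visible.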
    3. Regression prediction (maximal length: 2 pages; total marks: 20)
    Marks to be attributed based on how well you articulate the following aspects:
    • Describe any data preparation prior to regression.
    • Describe any preliminary analysis undertaken prior to regression and the
    conclusions drawn.
    • Give details of how a regression model has been selected and comment on
    its suitability for prediction.
    • Clarity and quality of presentation.
    • Functionality of python codes.
    • Quality and suitability of illustrative or forecast result graphs.
    Appendix A: Code descriptions (maximal length: 2 pages; full marks: 10), etc.
    Marks here will be attributed based on the overall organization of the material
    that you submit, and on how clear, informative, and concise your description is of
    what each of your python codes (or excel file, in case any preliminary operations
    are carried out there) does.
    Appendix B: Analysis and forecast graphs (maximal length: 12 pages)
    This appendix should be organised in 3 sections, with the first, second, and third one
    dedicated to graphs related to the exponential smoothing, ARIMA, and regression
    methods, respectively. As you can see above, marks dedicated to this appendix are
    attributed under the corresponding sections; i.e., Sections 1, 2, and 3, respectively.
    In summary, the following guidelines must be followed while producing the technical report:
    • The technical report must be organized as described above, with maximum 22 pages
    in total: maximum 10 pages in total for Sections 1, 2, 3, and Appendix A; and
    maximum 12 pages dedicated to the analysis and forecast graphs (Appendix B).
    • Do not include graphs in Sections 1, 2, and 3. All graphs should be included under
    Appendix B with appropriate numbering, in order to easily refer to them in your
    discussions under Sections 1, 2, and 3.
    • No theory of forecasting or repetition of material from lectures is required, unless
    you have used models not included in the notes.
    • Formal English should be used, avoiding contractions (such as “doesn’t”), slang,
    and casual vocabulary.
    • In Sections 1, 2, and 3, references to codes developed/used for specific tasks can
    be made by using the corresponding code’s name. But no other details of python
    modelling are needed in those sections.
    • At most 2 sentences are needed in Appendix A to explain what each python code (or
    excel file, if necessary) does.
    • Feel free to include subsections to Sections 1, 2, 3, and Appendices A and B, if they
    seem necessary to help make some parts clearer.
    • No introduction, table of contents or conclusions should be written for the report.
  3. Submission
    All submissions should be done under the corresponding assignment tab on Blackboard.
    Submit one zipped folder (.zip), not an archived file (.rar), without internal folders, which
    contains a pdf copy of the technical report, a pdf copy of the single slide, and four spreadsheets
    with the data sets provided for the analysis. You should also include an adequate number of
    files with your python codes. Remember not to put your name anywhere on your work, as
    marking is anonymous. Include the ID numbers of all the students in your group in
    your technical report and single-page slide; use the following naming pattern (involving
    the IDs of all the students in your group) for all the files to be submitted via Blackboard:
    • 1 pdf file with the technical report:
    – TechnicalReport StudentID1 StudentID2 StudentID3.pdf
    • 1 pdf file for the single slide (convert the slide to pdf):
    – Slide StudentID1 StudentID2 StudentID3.pdf
    • 4 data files:
    – MSTAdata StudentID1 StudentID2 StudentID3.xlsx
    – CH4data StudentID1 StudentID2 StudentID3.xlsx
    – GMAFdata StudentID1 StudentID2 StudentID3.xlsx
    – ET12data StudentID1 StudentID2 StudentID3.xlsx
    • Python codes: each file name should have three components, with first one related
    to the corresponding methodology, second to the specific task, and the third being
    the student IDs. For example, if you produce/use a code to illustrate something
    related to the exponential smoothing, ARIMA, or regression methods, you should
    respectively apply the following naming pattern to your files:
    – ExpSmooth MSTATimePlot StudentID1 StudentID2 StudentID3.py
    – ARIMA ACFPlot StudentID1 StudentID2 StudentID3.py
    – Regression Correlation StudentID1 StudentID2 StudentID3.py
    The middle terms MSTATimePlot, ACFPlot, and Correlation are related to specific
    tasks that could be carried out under the corresponding parts. This middle term
    should not exceed fifteen characters.
