Question

Module Number: SWE6204
Module Name: Machine Learning
Year/Semester: 2023-24 / Semester 2
Module Tutor/s: MD Maksudur Mazumder
Assessment Number: 1
Assessment Type and Weighting: Report of 4000 words (+/- 10%), 100%
Assessment Name: Coursework Portfolio
Assessment Submission Date: Week 15

Learning Outcomes assessed:
LO1: Develop an understanding of a wide selection of Machine Learning Algorithms.
LO2: Identify fundamental issues of applying Machine Learning in designing and implementing real-world applications.
LO3: Demonstrate the application of machine learning algorithms to solve real-world problems.
LO4: Critically evaluate the performance of machine learning solutions and identify the scope of improvements and optimisations.
LO5: Identify social and ethical issues/implications in the application of machine learning.
LO6: Critically summarise the entire project, and prepare a concise presentation summarising the key findings, challenges faced, and lessons learned during the project.

Task 1 - Machine Learning [40 marks]

a. There are three main types of Machine Learning (ML): Supervised Learning, Unsupervised Learning, and Reinforcement Learning. Assume you are given the following scenarios. Your task is to identify which type of ML you can apply to each scenario and to explain why in your own words. [LO1] (4 marks)
i. Imagine you work for a financial consulting firm, and one of your responsibilities is to develop a predictive model that can forecast stock prices based on various financial indicators. This model will serve as a valuable tool for investors and traders to make informed decisions about buying or selling stocks, ultimately helping them maximise their returns. (1 mark)
ii. Imagine you work for a healthcare organisation, and your objective is to create a patient segmentation plan to optimise and improve patient care and treatment strategies.
To effectively tailor medical services, the goal is to identify distinct groups of patients with similar health profiles or medical needs. This segmentation will help the healthcare facility allocate resources efficiently and provide personalised care to each patient group. (1 mark)
iii. Imagine that you work for a social media platform, and your role involves creating a system that can automatically identify and flag inappropriate content posted by users. This system is crucial for maintaining a safe and enjoyable online environment, as it helps in swiftly removing offensive or harmful material from the platform. (1 mark)
iv. Imagine you're tasked with creating an autonomous delivery drone system for a futuristic logistics company. This drone must learn how to efficiently navigate a complex urban environment, follow aviation regulations, and make intelligent decisions on-the-fly. (1 mark)

b. What is the difference between the K-means clustering algorithm and k-nearest neighbours (KNN) classification? [LO1] (8 marks)

c. Can you explain what a 'loss' is in machine learning, and how to calculate it for linear regression? [LO2, LO4] (4 marks)

d. Explain "Overfitting" in Machine Learning. [LO2, LO4] (8 marks)

e. Given the following training (T) and validation (V) error curves, what actions would you take, if any, to improve performance, given that m is the number of training pairs being used? Each point of the curve is obtained by training until convergence. Provide an explanation for your reasoning. [LO2, LO4]
I. (4 marks) [Figure: error plotted against m, with validation (V) and training (T) curves; annotated "Relatively small error"]
II. (4 marks) [Figure: error plotted against m, with validation (V) and training (T) curves; annotated "Relatively large error"]

f. Can you give an example, with an explanation, of each of the challenges and risks involved in the application of AI/machine learning listed in the following table? [LO5] (8 marks)

Challenge/Risks | Example with an explanation
Bias can affect the results |
Errors may cause harm |
A solution may not work for everyone |
Who's liable for AI-driven decisions? |
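
As a hedged illustration of the quantity question c refers to, the sketch below computes a loss for a one-feature linear model. It assumes mean squared error, which is a common choice for linear regression; the brief itself does not fix a particular loss. The function names and the toy numbers are hypothetical and not part of the brief.

```python
def predict(w, b, xs):
    """Linear model: y_hat = w * x + b for every input x."""
    return [w * x + b for x in xs]


def mse_loss(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    n = len(y_true)
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n


# Toy data (hypothetical) that follows y = 2x exactly.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# The true slope reproduces the targets, so every residual is zero.
loss_good = mse_loss(ys, predict(2.0, 0.0, xs))

# A slope of 1 leaves residuals of 1, 2 and 3, so the loss is (1 + 4 + 9) / 3.
loss_bad = mse_loss(ys, predict(1.0, 0.0, xs))

print(loss_good, loss_bad)
```

Training a linear regression model amounts to choosing w and b so that this number is as small as possible on the training data.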
Task 2 - Predicting House Prices Using Regression Techniques [60 marks]

Scenario: You are a bachelor's student enrolled in a Computer Science program at a university in the UK. As part of your undergraduate studies, you have been tasked with a Machine Learning project. The objective of this project is to create a predictive model capable of accurately estimating property prices using a range of input features. You are provided with an example of Python source code that generates a sample dataset comprising details about houses in a specific city, including the size of the house, number of bedrooms, number of bathrooms, location, and other significant features. Your task is to build a Regression Model that can effectively predict the selling price of a house given its features.

Assignment Tasks:

1. Data Exploration and Pre-processing [LO3]: (10 marks)
i. Load/import the dataset from house_prices_dataset.csv, examine its structure, and print the first 20 rows. (5 marks)
ii. Visualise data for features 'size', 'bedrooms', 'location', and 'prices' using appropriate plots or graphs. (5 marks)

2. Model Selection and Evaluation [LO3, LO4]: (30 marks)
i. Split the dataset into training and testing sets using an appropriate ratio. For example, use 65% of the data for training and 35% for testing. (5 marks)
ii. Select at least one regression algorithm (e.g., Linear Regression, Decision Tree Regression, Random Forest Regression) to build predictive models. (10 marks)
iii. Train each model using the training data and evaluate its performance using appropriate evaluation metrics such as Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared. (15 marks)

3. Model Fine-tuning and Optimisation [LO3, LO4]: (10 marks)
i. Perform hyperparameter tuning on the selected regression model using techniques like Grid Search or Random Search. (10 marks)

4. Conclusion and Presentation [LO6]: (10 marks)
i. Summarise the entire project, including the problem statement, data exploration, model selection, optimisation, and interpretation of results. (5 marks)
ii. Prepare a concise presentation summarising the key findings, challenges faced, and lessons learned during the project. (5 marks)

Note: You must use the Python programming language, and you are free to use any machine learning libraries (e.g., scikit-learn, TensorFlow) to complete the assignment. Remember to document your code, provide appropriate comments, and include necessary visualisations to support your analysis.

Here is example code showing how you can generate a sample dataset using Python:

import numpy as np
import pandas as pd

# Set random seed for reproducibility
np.random.seed(42)

# Generate synthetic data for house prices
num_samples = 1000

# Features
size = np.random.randint(500, 3000, size=num_samples)
bedrooms = np.random.randint(1, 6, size=num_samples)
bathrooms = np.random.randint(1, 4, size=num_samples)
location = np.random.choice(['City Centre', 'Suburbs', 'Rural Area'], size=num_samples)

# Target variable: base price plus a contribution from each numeric feature
# (location does not enter the price formula in this generator)
prices = 50000 + (size * 100) + (bedrooms * 20000) + (bathrooms * 15000)
# Add Gaussian noise. This builds a new float array: an in-place "+=" on the
# integer array fails with a NumPy casting error.
prices = prices + np.random.normal(0, 20000, size=num_samples)

# Create DataFrame
data = pd.DataFrame({
    'Size': size,
    'Bedrooms': bedrooms,
    'Bathrooms': bathrooms,
    'Location': location,
    'Price': prices
})

# Save the dataset to a CSV file
data.to_csv('house_prices_dataset.csv', index=False)

Assessment Deliverables:
You are required to produce a report (+/- 4000 words) that discusses all the above factors in Task 1 and Task 2.

Formatting requirements
· The reference list must include a minimum of 5-10 academic sources, with a minimum of 3 peer-reviewed academic journals. Harvard referencing format must be used to credit secondary research sources. In-text citations should be included within your discussion (where relevant) using the author-date format, and full reference details should be included in your bibliography.
· Diagrams should be captioned and discussed in the body of your report.
· A table of contents should be included.
· Page numbers should be inserted in the centre of the footer.
· The student ID number must be placed in the header of each page.

Submission
Please submit to the Turnitin assignment section through Moodle.

Grading
A percentage mark will be provided based on the General Assessment Guidelines for Written Assessments. Grading is as follows:
A: 70 - 100%
B: 60 - 69%
C: 50 - 59%
D: 40 - 49%
F: Below 40%

Glossary:
· Analyse: Break an issue or topic into smaller parts by looking in depth at each part. Support each part with arguments and evidence for and against (pros and cons).
· Critically Evaluate/Analyse: When you critically evaluate, you look at the arguments for and against an issue. You look at the strengths and weaknesses of the arguments. This could be from an article you read in a journal or from a textbook.
· Discuss: When you discuss, you look at both sides of an argument. You look at the reasons why it is important (for), then at the reasons why it is not important (against).
· Explain: When you explain, you must say why it is important or not important.
· Evaluate: When you evaluate, you look at the arguments for and against an issue.
· Describe: Give an account or representation of something in words.
· Identify: When you identify, you look at the most important points.
· Define: State or describe the nature, scope, or meaning.
· Implement: Put into action/use/effect.
· Compare: Identify similarities and differences.
· Explore: To find out about.
· Recommend: Suggest/put forward as being appropriate, with reasons why.
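
The Task 2 steps in the brief (split, train, evaluate, tune) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a model answer: the data is rebuilt in memory with the same formula as the brief's generator rather than read from house_prices_dataset.csv, Location is one-hot encoded, and the decision-tree grid values are hypothetical. It assumes NumPy, pandas and scikit-learn are installed.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: regenerate data shaped like the brief's dataset in memory
# instead of loading house_prices_dataset.csv from disk.
rng = np.random.default_rng(42)
n = 1000
data = pd.DataFrame({
    "Size": rng.integers(500, 3000, size=n),
    "Bedrooms": rng.integers(1, 6, size=n),
    "Bathrooms": rng.integers(1, 4, size=n),
    "Location": rng.choice(["City Centre", "Suburbs", "Rural Area"], size=n),
})
data["Price"] = (50000 + data["Size"] * 100 + data["Bedrooms"] * 20000
                 + data["Bathrooms"] * 15000 + rng.normal(0, 20000, size=n))

# One-hot encode the categorical column so the regressors get numeric input.
X = pd.get_dummies(data.drop(columns="Price"), columns=["Location"],
                   drop_first=True, dtype=float)
y = data["Price"]

# 65% training / 35% testing, the example ratio given in the brief.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, random_state=42)


def evaluate(model):
    """Return MSE, RMSE and R-squared of a fitted model on the test split."""
    pred = model.predict(X_test)
    mse = mean_squared_error(y_test, pred)
    return {"MSE": mse, "RMSE": float(np.sqrt(mse)), "R2": r2_score(y_test, pred)}


# Baseline model: ordinary least-squares linear regression.
linear = LinearRegression().fit(X_train, y_train)
linear_scores = evaluate(linear)

# Grid search over two decision-tree hyperparameters (hypothetical grid values),
# scored by cross-validated negative MSE on the training split only.
search = GridSearchCV(
    DecisionTreeRegressor(random_state=42),
    param_grid={"max_depth": [3, 5, 8], "min_samples_leaf": [1, 5, 20]},
    scoring="neg_mean_squared_error",
    cv=5,
)
search.fit(X_train, y_train)
tree_scores = evaluate(search.best_estimator_)

print("Linear regression:", linear_scores)
print("Tuned decision tree:", search.best_params_, tree_scores)
```

Because the synthetic price is a linear function of the numeric features plus noise, a linear model is a natural baseline here; the tuned tree is included only to show where Grid Search fits into the workflow.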
GENERAL ASSESSMENT GUIDELINES - LEVEL HE4

Criteria: Relevance (learning outcomes must be met for an overall pass); Knowledge and understanding; Analysis, Creativity and Problem-Solving; Self-awareness and Reflection; Research/Referencing; Written English; Presentation and Structure.

Class I (Exceptional Quality) 85% - 100%
Relevance: Work is directly relevant and expertly addresses the requirements of the brief. Learning outcomes are met.
Knowledge and understanding: Demonstrates breadth of knowledge and understanding of theory and practice beyond the threshold expectation for the level. Demonstrates excellent understanding of key concepts in different contexts.
Analysis, Creativity and Problem-Solving: Presents an excellent and cohesive appraisal of findings through the critical analysis of information. Draws clear, justified and thoughtful conclusions. Demonstrates creative flair, originality and initiative. Demonstrates a critical understanding of problem-solving approaches and applies strong problem-solving skills.
Self-awareness and Reflection: Provides insightful reflection and self-awareness in relation to the outcomes of own work and personal responsibility.
Research/Referencing: A wide range of contemporary and relevant reference sources selected and drawn upon. Sources cited accurately in both the body of text and in the Reference List/Bibliography.
Written English: Writing style is clear and appropriate to the requirements of the assessment. An exceptionally well written answer with competent spelling, grammar and punctuation. For example, paragraphs are well structured and include linking and signposting. Sentences are complete and different types are used. A wide range of appropriate vocabulary is used.
Presentation and Structure: The presentational style and layout are correct for the type of assignment. Evidence of planning and logically structured. Where relevant, there is effective placement of, and reference to, figures, tables and images.

Class I (Excellent Quality) 70% - 84%
Relevance: Work is relevant and comprehensively addresses the requirements of the brief. Learning outcomes are met.
Knowledge and understanding: Demonstrates an excellent breadth of knowledge and understanding of theory and practice for this level. Demonstrates in-depth understanding of key concepts.
Analysis, Creativity and Problem-Solving: Presents an excellent and cohesive discussion of findings through the interpretation and evaluation of information sources. Draws clear, justified and thoughtful conclusions. Demonstrates clearly creativity and initiative. Applies excellent problem-solving skills.
Self-awareness and Reflection: Provides excellent reflection and self-awareness in relation to the outcomes of own work and personal responsibility.
Research/Referencing: A range of contemporary and relevant reference sources selected and drawn upon. Sources cited accurately in both the body of text and in the Reference List/Bibliography.
Written English: Writing style is clear and appropriate to the requirements of the assessment. An excellently well written answer with competent spelling, grammar and punctuation. For example, paragraphs are well structured and include linking and signposting. Sentences are complete and different types are used. A wide range of appropriate vocabulary is used.
Presentation and Structure: The presentational style and layout are correct for the type of assignment. Evidence of planning and logically structured. Where relevant, there is effective placement of, and reference to, figures, tables and images.

Class II/i (Very Good Quality) 60% - 69%
Relevance: Work is relevant and addresses most of the requirements of the brief well. Learning outcomes are met.
Knowledge and understanding: Demonstrates a thorough breadth of knowledge and understanding of theory and practice for this level. Demonstrates very good understanding of key concepts.
Analysis, Creativity and Problem-Solving: Presents a perceptive and cohesive discussion of findings through the interpretation and evaluation of information sources. Draws clear and justified conclusions. Demonstrates creativity and initiative. Applies strong problem-solving skills.
Self-awareness and Reflection: Provides justified reflection and self-awareness in relation to the outcomes of own work and personal responsibility, as required by the assessment.
Research/Referencing: A range of appropriate reference sources selected and drawn upon. Sources cited accurately in the main in the text and in the Reference List/Bibliography.
Written English: Writing style is clear and appropriate to the requirements of the assessment. A very well written answer with competent spelling, grammar and punctuation. For example, paragraphs are well structured and include linking and signposting. Sentences are complete and different types are used. A range of appropriate vocabulary is used.
Presentation and Structure: The presentational style and layout are correct for the type of assignment. Evidence of planning and logically structured in the main. Where relevant, there is effective placement of figures, tables and images.

Class II/ii (Good Quality) 50% - 59%
Relevance: Work addresses key requirements of the brief. Some irrelevant content. Learning outcomes are met.
Knowledge and understanding: Demonstrates a sound breadth of knowledge and understanding of theory and practice for this level. Demonstrates sound understanding of key concepts.
Analysis, Creativity and Problem-Solving: Presents a logical discussion of findings through the interpretation and evaluation of information sources. Draws clear and justified conclusions. Demonstrates some creativity and initiative. Applies sound problem-solving skills.
Self-awareness and Reflection: Provides valid reflection and self-awareness in relation to the outcomes of own work and personal responsibility, as required by the assessment.
Research/Referencing: Relevant reference sources selected and drawn upon. Some sources accurately cited in both the body of text and in the Reference List/Bibliography.
Written English: Writing style is mostly appropriate to the requirements of the assessment. Grammar, spelling and punctuation are generally competent and minor lapses do not pose difficulty for the reader. Paragraphs are structured and include some linking and signposting. Sentences are complete. A range of appropriate vocabulary is used.
Presentation and Structure: The presentational style and layout are largely correct for the type of assignment. Logically structured in the most part. Where relevant, effective placement of some figures, tables and images.

Class III (Satisfactory Quality) 40% - 49%
Relevance: Work addresses the requirements of the brief, although superficially in places. Some irrelevant content. Learning outcomes are met.
Knowledge and understanding: Demonstrates a sufficient breadth of knowledge and understanding of theory and practice for this level. Demonstrates a sufficient understanding of key concepts.
Analysis, Creativity and Problem-Solving: Presents a valid discussion of findings through the interpretation and evaluation of information sources. Draws justified conclusions. Demonstrates creativity and initiative in places. Applies sufficient problem-solving skills.
Self-awareness and Reflection: Provides some reflection and self-awareness in relation to the outcomes of own work and personal responsibility, as required by the assessment.
Research/Referencing: Some relevant reference sources selected and drawn upon. Some weaknesses in referencing technique.
Written English: Writing style is occasionally not appropriate for the assessment. Grammar, spelling and punctuation are generally competent, but may pose minor difficulties for the reader. Some paragraphs may lack structure, and there is limited linking and signposting. Some appropriate vocabulary is used.
Presentation and Structure: The presentational style and layout are largely correct for the type of assignment. Adequately structured. Inclusion of some figures, tables and images but not always relevant and/or clear.

Borderline Fail 35% - 39%
Relevance: Work addresses only some of the requirements of the brief. Irrelevant and superficial content. One or more learning outcomes have not been met.
Knowledge and understanding: Demonstrates limited knowledge and understanding of theory and practice for this level. Demonstrates a lack of understanding of key concepts.
Analysis, Creativity and Problem-Solving: Presents a limited discussion of findings through the interpretation of information sources. Draws some irrelevant conclusions. Creativity and initiative are lacking. Problem-solving skills are lacking.
Self-awareness and Reflection: Provides limited reflection and self-awareness in relation to the outcomes of own work and personal responsibility, when required.
Research/Referencing: Sources selected are limited and lack relevance. Poor referencing technique employed.
Written English: Writing style is unclear and does not match the requirements of the assessment in question. Deficiencies in spelling, grammar and punctuation make reading difficult and arguments unclear in places. Paragraphs are poorly structured.
Presentation and Structure: For the type of assignment the presentational style, layout and/or structure are lacking. Figures, tables and images included when required but these lack clarity and relevance.

Fail <34%
Relevance: Work does not address the requirements of the brief. Irrelevant and superficial content. One or more learning outcomes have not been met.
Knowledge and understanding: Demonstrates inadequate knowledge and understanding of theory and practice for this level. Demonstrates insufficient understanding of key concepts.
Analysis, Creativity and Problem-Solving: Presents a limited discussion of findings with little consideration of the quality of information drawn upon. Draws irrelevant conclusions. Creativity, initiative and problem-solving skills are absent.
Self-awareness and Reflection: Provides inadequate reflection and self-awareness in relation to the outcomes of own work and personal responsibility, when required.
Research/Referencing: There is an absence of relevant sources. Poor referencing technique employed.
Written English: Writing style is unclear and does not match the requirements of the assessment in question. Deficiencies in spelling, grammar and punctuation make reading difficult and arguments unclear. Unstructured paragraphs.
Presentation and Structure: For the type of assignment the presentational style, layout and/or structure are lacking. Figures, tables and images are absent when required or lack relevance/clarity.