Boost your learning journey with 24/7 access to skilled experts offering unmatched machine learning homework help

**Q1:** Assignment 5: Matrix as a Linear Transformation

**Q2:** Q1. Consider the problem where we want to predict the gender of a person from a set of input parameters, namely height, weight, and age.

**Q3:** Q2. Using the data from Problem 2, build a Gaussian Naïve Bayes classifier for this problem. For this you have to learn Gaussian distribution parameters for each input data feature, i.e. for p(height|W), p(height|M), p(weight|W), p(weight|M), p(age|W), and p(age|M).
a) Learn/derive the parameters for the Gaussian Naïve Bayes classifier for the data from Question 2 a) and apply them to the same target as in Problem 1 a).
b) Implement the Gaussian Naïve Bayes classifier for this problem.
c) Repeat the experiments in parts 1 c) and 1 d) with the Gaussian Naïve Bayes classifier. Discuss the results, in particular with respect to the performance difference between using all features and using only height and weight.
d) Same as 1 d) but with Naïve Bayes.
e) Compare the results of the two classifiers (i.e., the results from 1 c) and 1 d) with the ones from 2 c) and 2 d)) and discuss reasons why one might perform better than the other.

**Q4:** For this programming assignment you will implement the Naive Bayes algorithm from scratch, together with the functions to evaluate it with k-fold cross validation (also from scratch). You can use the code in the following tutorial to get started and to get ideas for your implementation of the Naive Bayes algorithm, but please enhance it as much as you can (there are many things you can do to enhance it, such as those mentioned at the end of the tutorial).

**Q5:** Q1. Consider the problem where we want to predict the gender of a person from a set of input parameters, namely height, weight, and age.
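The per-class, per-feature Gaussian parameter learning described in Q3 can be sketched from scratch with NumPy. This is only an illustrative sketch; the height/weight/age values below are invented, not the assignment's dataset:

```python
import numpy as np

def fit_gnb(X, y):
    """Learn per-class priors and per-feature Gaussian (mean, std) parameters."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (len(Xc) / len(X),        # prior p(c)
                     Xc.mean(axis=0),          # per-feature means
                     Xc.std(axis=0, ddof=1))   # per-feature sample stds
    return params

def predict_gnb(params, x):
    """Pick the class maximizing log prior + sum of per-feature log densities."""
    best, best_score = None, -np.inf
    for c, (prior, mu, sigma) in params.items():
        logp = np.log(prior) - 0.5 * np.sum(
            np.log(2 * np.pi * sigma**2) + ((x - mu) / sigma)**2)
        if logp > best_score:
            best, best_score = c, logp
    return best

# Invented (height m, weight kg, age yr) samples with labels 'M'/'W'
X = np.array([[1.80, 82, 30], [1.75, 78, 35], [1.85, 90, 28],
              [1.60, 55, 32], [1.65, 60, 27], [1.58, 52, 40]])
y = np.array(['M', 'M', 'M', 'W', 'W', 'W'])
model = fit_gnb(X, y)
print(predict_gnb(model, np.array([1.78, 80, 31])))  # 'M' on this toy data
```

The naive-Bayes independence assumption is what lets the joint likelihood factor into one univariate Gaussian per feature.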
a) Using Cartesian distance, Manhattan distance, and Minkowski distance of order 3 as the similarity measures, show the results of the gender prediction for the Evaluation data listed below the generated training data, for values of K of 1, 3, and 7. Include the intermediate steps (i.e., distance calculation, neighbor selection, and prediction).
b) Implement the KNN algorithm for this problem. Your implementation should work with different training data sets as well as different values of K, and should allow you to input a data point for the prediction.
c) To evaluate the performance of the KNN algorithm (using the Euclidean distance metric), implement a leave-one-out evaluation routine for your algorithm. In leave-one-out validation, we repeatedly evaluate the algorithm by removing one data point from the training set, training the algorithm on the remaining data set, and then testing it on the point we removed to see if the label matches or not. Repeating this for each of the data points gives us an estimate of the percentage of erroneous predictions the algorithm makes, and thus a measure of the accuracy of the algorithm for the given data. Apply your leave-one-out validation with your KNN algorithm to the dataset for Question 1 c) for values of K of 1, 3, 5, 7, 9, and 11 and report the results. For which value of K do you get the best performance?
d) Repeat the prediction and validation you performed in Question 1 c) using KNN when the age data is removed (i.e., when only the height and weight features are used as part of the distance calculation in the KNN algorithm). Report the results and compare the performance without the age attribute with the results from Question 1 c). Discuss the results. What do the results tell you about the data?

**Q6:** 6. (Programming) You need to implement the kNN algorithm as in the slides.
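A minimal NumPy sketch of the kNN prediction and the leave-one-out loop asked for above (Euclidean distance; the toy height/weight/age data is invented for illustration):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest neighbors by Euclidean distance."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]

def leave_one_out_accuracy(X, y, k):
    """Hold out each point in turn, train on the rest, and score the held-out point."""
    hits = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        hits += knn_predict(X[mask], y[mask], X[i], k) == y[i]
    return hits / len(X)

# Invented (height, weight, age) data with labels
X = np.array([[1.80, 82, 30], [1.75, 78, 35], [1.85, 90, 28], [1.70, 75, 33],
              [1.60, 55, 32], [1.65, 60, 27], [1.58, 52, 40], [1.63, 57, 29]])
y = np.array(['M', 'M', 'M', 'M', 'W', 'W', 'W', 'W'])
for k in (1, 3, 7):
    print(k, leave_one_out_accuracy(X, y, k))
```

Note that on raw features the weight column dominates the distance, which is exactly the kind of effect the feature-removal experiment in part d) probes.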
The data we use for binary classification tasks is the UCI a4a data.

**Q7:** Question 1. Download the SGEMM GPU kernel performance dataset from https://archive.ics.uci.edu/ml/datasets/SGEMM+GPU+kernel+performance. Understand the dataset by performing exploratory analysis. Prepare the target parameter by taking the average of the THREE (3) runs with long performance times. Design a linear regression model to estimate the target using only THREE (3) attributes from the dataset. Discuss your results, the relevant performance metrics, and the impact of normalizing the dataset.

**Q8:** Question 2. Load the wine dataset from the sklearn package. Perform exploratory data analysis and design a simple TWO (2) layer neural network for the classification. Compare the performance with the Naïve Bayes algorithm. Train the neural network such that it has better or equal performance to the Naïve Bayes algorithm.

**Q9:** Question 3. Download the MAGIC gamma telescope 2004 dataset available on Kaggle (https://www.kaggle.com/abhinand05/magic-gamma-telescope-dataset). Prepare the dataset and perform exploratory data analysis. Set up a random forest algorithm for identifying whether a pattern was caused by a gamma signal or not. Propose optimal values for the depth and number of trees in the random forest. Assess and compare the performance of the optimized random forest with the Naïve Bayes algorithm. Discuss the performance metrics and the computational complexity.

**Q10:** Question 4. Use the Fashion MNIST dataset from the keras package. Perform exploratory data analysis. Show a random set of FIVE (5) images from each class in the dataset with their corresponding class names. Prepare the dataset by normalizing the pixel values to be between 0 and 1. Design a CNN with TWO (2) convolutional layers and FOUR (4) dense layers (including the final output layer). Employ 'ReLU' activation and 'MaxPooling'.
Keep 15% of the train dataset for validation. Rate the performance of the algorithm and provide the necessary plots. Pick a random image from the test dataset, pass it to the algorithm, and compare the algorithm's output with the actual class label.

**Q11:** Question 5. Select any stock listed on the Singapore stock exchange. Using Yahoo Finance, download the daily stock data (Open, High, Low, Close, Adj Close, Volume) from 1 Jan 2020 to 3 Jan 2022. Use data until 31 Dec 2020 for training and the remaining data for testing. You must select the stock such that the data is available from 1 Jan 2020 to 3 Jan 2022. Use the previous 30 days of stock information to predict the next day's stock price. Use the data in the 'High' column to predict the price, i.e., the next day's high price of the stock. Design an LSTM network to do the predictions. You are required to use an LSTM with a cell state of at least 60 dimensions and do at least 50 epochs of training. Rate the performance of the LSTM model and provide the necessary plots.

**Q12:** This is a machine learning model in Python using scikit-learn to classify handwritten Arabic letters. There are two files: the train data and the test data. The code is available, and we need to optimize it so that under box number 6, when we do the cross validation of the model, the accuracy is in the high 80s to low 90s. We should tune the hyperparameters and improve the pipeline as needed. Anything from scikit-learn is allowed, but nothing more. As the code stands, the model accuracy is 79%. The goal is to modify the code to get a model accuracy in the high 80s to low 90s. In box 3 of the code are the hyperparameters that need to be tuned and the pipeline that might need to be modified. A voting model can be used to get high accuracy. We need to improve the model accuracy over the existing code.
Info about the dataset: The dataset is composed of 16,800 characters written by 60 participants; the age range is between 19 and 40 years, and 90% of participants are right-handed. Each participant wrote each character (from 'alef' to 'yeh') ten times on two forms. The forms were scanned at a resolution of 300 dpi. The dataset is partitioned into two sets: a training set (13,440 characters, 480 images per class) and a test set (3,360 characters, 120 images per class). The writers of the training set and test set are exclusive, and the ordering of writers assigned to the test set was randomized to make sure that the test-set writers were not from a single institution (to ensure variability of the test set). Goal: build an image classifier to classify handwritten Arabic characters using scikit-learn, with a model accuracy in the high 80s (e.g., 89%) or low 90s (e.g., 92%). This is all about tuning the hyperparameters and the model pipeline.

**Q13:** This is a machine learning model in Python using scikit-learn to classify handwritten Arabic letters. There are two files: the train data and the test data. The code is available, and we need to optimize it so that under box number 6, when we do the cross validation of the model, the accuracy is in the high 80s to low 90s. We should tune the hyperparameters and improve the pipeline as needed. Anything from scikit-learn is allowed, but nothing more.
As the code stands, the model accuracy is 79%. The goal is to modify the code to get a model accuracy in the high 80s to low 90s. In box 3 of the code are the hyperparameters that need to be tuned and the pipeline that might need to be modified. A voting model can be used to get high accuracy. We need to improve the model accuracy over the existing code. The dataset is the same handwritten Arabic character dataset described in Q12 (16,800 characters by 60 writers; 13,440 training and 3,360 test images, with writer-exclusive splits). Goal: build an image classifier to classify handwritten Arabic language characters using scikit-learn.
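One hedged sketch of the pipeline-plus-voting tuning that Q12/Q13 ask for, shown on sklearn's bundled digits data as a stand-in, since the Arabic-letters files are not available here; the estimator choices and grid values are illustrative assumptions, not the assignment's actual box-3 settings:

```python
# Tuning sketch: scaling -> soft-voting ensemble -> small hyperparameter grid.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]  # small subset so the sketch runs quickly

# Soft-voting ensemble of three differently biased classifiers
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("vote", VotingClassifier(
        estimators=[("svc", SVC(probability=True)),
                    ("rf", RandomForestClassifier(random_state=0)),
                    ("knn", KNeighborsClassifier())],
        voting="soft")),
])

# A small grid over hyperparameters that usually matter most
grid = GridSearchCV(pipe, {
    "vote__svc__C": [1, 10],
    "vote__knn__n_neighbors": [3, 5],
}, cv=3, n_jobs=-1)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

The nested `vote__svc__C` naming is how scikit-learn routes grid parameters through a Pipeline step into a named VotingClassifier estimator.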
The model accuracy has to be in the high 80s (e.g., 89%) or low 90s (e.g., 92%). This is all about tuning the hyperparameters and the model pipeline.

**Q14:** There are four folders; each folder contains a set of exercises, and the expected results are written at the top of each ipynb. Some files are just example solutions. Day 1 is all about fitting a linear regression or logistic regression to the data, and determining the decision boundaries. Day 2: use neural networks to solve simple classification examples. Day 3: use a convolutional neural network with PyTorch, with one example solution. Day 4: deep learning; the solution is ready, we just add the testing data, test the built model, and output a submission file with labels.

**Q15:** The main aim of this project is to analyze a movie review's textual content in order to determine its underlying sentiment. In this project, we try to classify whether a person liked the movie or not based on the review they give for the movie. 1) You need to develop Python code to calculate the sentiment using NLP analysis, and you should use a CNN and logistic regression. 2) You need to create a report on what you have done in the code, and you also need to explain how our work differs from the references we have taken (the references are in the document).

**Q16:** Programming Assignment 2. For this programming assignment you will implement the LeNet-5 CNN using either PyTorch or TensorFlow, but not Keras. You can look at other implementations on the internet but please, when coding, use your personal coding style and add references to your sources. The goal of this implementation is that you completely understand what happens in the code, because our TA will ask you questions about it when reviewing your assignment (you need to make an appointment with your TA for this). Here is an implementation in PyTorch:
- implementing-yann-lecuns-lenet-5-in-pytorch-5e05a0911320
- lenet5_pytorch.ipynb

Here is an implementation in TensorFlow (careful: the tutorial and implementation don't match; I couldn't find the pair from the same author):
- lenet-with-tensorflow-a35da0d503df
- 6751b1b92fe8f4ff617f10c7f9f9d315

Test your implementation with the MNIST dataset from Kaggle.
Submission:
- Code of your implementation of LeNet-5.
- Brief report of the results on the MNIST dataset.
- An analysis of your results on the MNIST dataset.
TA Review:
- You will show your implementation to our TA and he will ask you details about how LeNet works in order to grade you.
NOTES: 1. DO NOT JUST COPY THE CODE FROM THE TUTORIAL

**Q17:** Linear Regression:
1. Consider a simplified fitting problem in the frequency domain where we are looking to find the best fit of data with a set of periodic (trigonometric) basis functions of the form 1, sin²(k·x), sin²(2k·x), ..., where k is effectively the frequency increment. The resulting function for a given "frequency increment" k, "function depth" d, and parameter vector θ is then:

y = θ₀ · 1 + Σ_{i=1..d} θᵢ · sin²(i·k·x)

For example, if k = 1 and d = 1, your basis (feature) functions are: 1, sin²(x); if k = 1 and d = 2, they are: 1, sin²(x), sin²(2x); if k = 3 and d = 4, they are: 1, sin²(3·1·x), sin²(3·2·x), sin²(3·3·x), sin²(3·4·x). This means that this problem can be solved using linear regression, as the function is linear in terms of the parameters θ. Try "frequency increment" k from 1-10 as part of the data generation process described above.
a) Implement a linear regression learner to solve this best-fit problem for 1-dimensional data. Make sure your implementation can handle fits for different "function depths" (at least to "depth" 6).
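The fit in part a) reduces to ordinary least squares over a design matrix of sin² features. A minimal NumPy sketch; the data below is synthetic, generated from the same function family rather than from the assignment's dataset:

```python
import numpy as np

def design_matrix(x, k, d):
    """Columns: 1, sin^2(k x), sin^2(2k x), ..., sin^2(d k x)."""
    cols = [np.ones_like(x)] + [np.sin(i * k * x) ** 2 for i in range(1, d + 1)]
    return np.column_stack(cols)

def fit(x, y, k, d):
    """Least-squares estimate of theta for a given frequency increment and depth."""
    theta, *_ = np.linalg.lstsq(design_matrix(x, k, d), y, rcond=None)
    return theta

def predict(x, theta, k):
    return design_matrix(x, k, len(theta) - 1) @ theta

# Synthetic data drawn from the model with k=2, d=2, theta = (0.5, 1.0, -2.0)
rng = np.random.default_rng(0)
x = rng.uniform(0, 3, 100)
y = 0.5 + 1.0 * np.sin(2 * x) ** 2 - 2.0 * np.sin(4 * x) ** 2
theta = fit(x, y, k=2, d=2)
print(np.round(theta, 3))  # recovers [ 0.5  1.  -2. ] on this noiseless data
```

Because the model is linear in θ, the same `lstsq` call handles any depth d; only the design matrix changes.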
b) Apply your regression learner to the data set that was generated for Question 1 b) and plot the resulting function for "function depths" 0, 1, 2, 3, 4, 5, and 6. Plot the resulting function together with the data points.
c) Evaluate your regression functions by computing the error on the test data points that were generated for Question 1 c). Compare the error results and try to determine for which "function depths" overfitting might be a problem. Which "function depth" would you consider the best prediction function, and why? For which values of k and d do you get the minimum error?
d) Repeat the experiment and evaluation of parts b) and c) using only the first 20 elements of the training data set of part b) and the test set of part c). What differences do you see, and why might they occur?

Locally Weighted Linear Regression:
2. Another way to address nonlinear functions with a lower likelihood of overfitting is the use of locally weighted linear regression, where the neighborhood function addresses non-linearity and the feature vector stays simple. In this case we assume that we will use only the raw feature x as well as the bias (i.e., a constant feature 1). Thus the locally applied regression function is:

y = θ₀ + θ₁ · x

As discussed in class, locally weighted linear regression solves a linear regression problem for each query point, deriving a local approximation of the shape of the function at that point (as well as of its value). To achieve this, it uses a modified error function that applies a weight to each data point's error that is related to its distance from the query point. Here we will assume that the weight function for the i-th data point x⁽ⁱ⁾ and query point x is:

w⁽ⁱ⁾(x) = e^(−(x⁽ⁱ⁾ − x)² / γ)

Use γ = 0.204, where γ is a measure of the "locality" of the weight function, indicating how fast the influence of a data point changes with its distance from the query point.
a) Implement a locally weighted linear regression learner to solve the best-fit problem for 1-dimensional data.
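A minimal sketch of the locally weighted learner in part a), assuming the Gaussian-style weight w⁽ⁱ⁾(x) = exp(−(x⁽ⁱ⁾ − x)²/γ) with γ = 0.204 as given; the data below is invented for illustration:

```python
import numpy as np

GAMMA = 0.204  # locality parameter from the assignment

def lwlr_predict(x_train, y_train, xq, gamma=GAMMA):
    """Solve a weighted least-squares fit of y = t0 + t1*x at query point xq."""
    w = np.exp(-((x_train - xq) ** 2) / gamma)        # per-point weights
    Phi = np.column_stack([np.ones_like(x_train), x_train])
    W = np.diag(w)
    theta = np.linalg.solve(Phi.T @ W @ Phi, Phi.T @ W @ y_train)
    return theta[0] + theta[1] * xq

# Sanity check: on data from a line, the local fit reproduces the line exactly
x = np.linspace(0, 2, 50)
y = 3.0 + 2.0 * x
print(lwlr_predict(x, y, 0.7))  # 4.4
```

Note that a fresh weighted regression is solved per query point, which is why there is no separate training step.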
b) Apply your locally weighted linear regression learner to the data set that was generated for Question 1 b) and plot the resulting function together with the data points.
c) Evaluate the locally weighted linear regression on the test data from Question 1 c). How does the performance compare to the results from Question 1 c)?
d) Repeat the experiment and evaluation of parts b) and c) using only the first 20 elements of the training data set. How does the performance compare to the results from Question 1 d)? Why might this be the case?
e) Given the results from parts c) and d), do you believe the data set you used was actually derived from a function that is consistent with the function format in Question 1? Justify your answer.

Logistic Regression:
3. Consider again the problem from Questions 1 and 2 in the first assignment, where we want to predict the gender of a person from a set of input parameters, namely height, weight, and age. Assume the same datasets you generated for the first assignment. Use a learning rate of 0.01 and try different values for the number of iterations.
a) Implement logistic regression to classify this data (use the individual data elements, i.e., height, weight, and age, as features). Your implementation should take different data sets as input for learning.
b) Plot the resulting separating surface together with the data. To do this plotting you need to project the data and function into one or more 2D spaces. The best visual results come from projecting along the separating hyperplane (i.e., into a space described by the normal of the hyperplane and one of the dimensions within the hyperplane).
c) Evaluate the performance of your logistic regression classifier in the same way as for Project 1, using leave-one-out validation, and compare the results with the ones for KNN and Naïve Bayes. Discuss what differences exist and why one method might outperform the others for this problem.
d) Repeat the evaluation and comparison from part c) with the age feature removed. Again, discuss what differences exist and why one method might outperform the others in this case.

**Q18:** CSE 6363 - Machine Learning.
Data Set: Use the dataset given at the bottom of this file.
Do Not Use: You are not allowed to use any ML libraries other than NumPy. You cannot use sklearn or any ML library; if used, you will receive a penalty of 90 points. You cannot use pandas; if used, you will receive a penalty of 20 points.
Libraries: You are allowed to use NumPy and math. You can use matplotlib to plot graphs. If you want to use any other library apart from these, please check with your GTA and get their approval.
Where to code:
1. We will provide you with a directory structure with Python files for each part of every question. You must write your code in these files.
2. It will contain a script to execute the files. You must run this script and verify that your code runs before you submit. To run the script you must make it executable first, or else you will get a permission-denied error.

**Q19:** 1. Design and develop a text classifier which can be used as an Amazon review categorizer. Your classifier must be trainable to classify reviews into one of two classes: positive and negative reviews. A description can be found in the readme file. Please note that we are using only the test set, as the full dataset is huge; this test set contains 400k data points.
a) The data set can be found on Canvas.
b) Use the TfidfVectorizer found in the scikit-learn library in Python to vectorize the dataset.
c) Use GaussianNB for the classifier.
d) Calculate the accuracy of the model. You need to use data partitioning to create a train set and a test set from the data set given.
e) Input a sample text and determine the class of the text provided.

**Q20:** Use the dataset given at the bottom of this file.
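A minimal sketch of the Q19 pipeline (TfidfVectorizer + GaussianNB) on an invented toy corpus, since the real Amazon review data isn't available here. One practical detail: GaussianNB needs dense input, so the sparse TF-IDF matrix is converted with `toarray()`:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Invented toy reviews standing in for the 400k-point Amazon set
reviews = ["great product, loved it", "terrible quality, broke fast",
           "loved the fast delivery", "awful, would not buy again",
           "great value, works great", "broke on arrival, terrible"]
labels = ["pos", "neg", "pos", "neg", "pos", "neg"]

vec = TfidfVectorizer()
X = vec.fit_transform(reviews).toarray()          # densify for GaussianNB
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.33,
                                          random_state=0, stratify=labels)
clf = GaussianNB().fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))

# Part e): classify a sample text
sample = vec.transform(["terrible, broke immediately"]).toarray()
print(clf.predict(sample)[0])
```

On the real 400k-review set, densifying the full TF-IDF matrix may not fit in memory, so limiting `max_features` in TfidfVectorizer (or switching to a sparse-friendly variant such as MultinomialNB) is worth considering.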

- C/C++
- Java
- Python
- Agile Software Development
- Android App Development
- Artificial Intelligence
- Assembly Programming
- Big Data
- C#
- Cloud Computing
- Compiler Design
- Computer Graphics
- Computer Networks
- Computer Organisation And Architecture
- Cryptography
- Cyber Security
- Data Mining
- Data Science
- Data Structures And Algo
- Data Warehousing
- DBMS
- Deep Learning
- Distributed Computing
- Formal Language Automata
- Haskell Programming
- Internet Of Things
- Machine Learning
- Mobile Computing
- Multimedia Technology
- Natural Language Processing
- Object Oriented Analysis And Design
- Operating System
- Programming Language Principle And Paradigm
- Prolog Programming
- Real Time System
- Software Engineering
- Web Designing And Development
- Design And Analysis Of Algorithms
- React
- Coding

TutorBin believes that distance should never be a barrier to learning. With over 500,000 orders and 100,000 happy customers, TutorBin has become a name that keeps learning fun in the UK, USA, Canada, Australia, Singapore, and UAE.