Recent Question/Assignment

The purpose of this project is to follow the process of going from data to knowledge using a data
set that applies to a real-world problem. For this project, you will form teams of 4 to 5 students.
Your team’s objective is to locate a data set that helps solve a specific problem. When
choosing a data set, think about an impactful problem that working with it could address
and how the data will let you work towards solving that problem. You may use any data
mining software packages or libraries for the data mining tasks and any programming
language for cleaning and pre-processing the data.
Project Report [8-10 pages, 2500-3500 words]
• Executive Summary – give a high-level description of the problem and what will be
included in the remainder of the report. Be sure to mention the overall results of the project
as part of this summary!
• Problem Description - give a 2-3 sentence description of the problem your team solved.
• Data Exploration
o Source(s) of the data – provide link(s) and any background on the data that might
be of interest. For example, some data is “simulated” and did not come from a real-
world source.
o Number of records
o Attribute description – begin with a high-level description of each attribute,
including what the attribute is, its type (continuous, discrete). Using a table to do this
is ideal. Next, for each attribute, provide summary statistics (min, max, mean,
standard deviation, frequency, mode). There should be figures/plots for at least some
of the attributes, especially ones that appear to be interesting.
o Missing values – did you find any missing values? If so, how do you plan to deal
with these?
o Outliers – did you find any outliers? If so, how do you plan to deal with these?
• Data Preprocessing – which preprocessing steps did you use? Below is some guidance on
what to write about if you performed any of the following steps. Only include
preprocessing steps that you did as part of your project.
o Discretization - show the discretization scheme you used and the new distribution
for any attributes you discretized.
o Sampling - describe the process you used and show the summary statistics of the
attributes for the sampled data set.
o Aggregation – describe the process used for aggregating data within an attribute
and show the new distribution. Give reasoning for why you did this.
o Dimensionality Reduction/Feature Selection – how and why did you decide to remove
features from the data set? What was the result of doing so?
• Data Mining Techniques/Algorithms Used - Describe the techniques and algorithms used
at a high level and why you decided to use them.
• Results - Perform an appropriate analysis of results. For example, discuss errors in
classification models using confusion matrices. The purpose here is to compare the results
obtained from various models or approaches that you tried.
• Conclusions and Lessons Learned – What are the major takeaways from this project in
terms of how well you were able to solve the problem you stated? What did you learn from
working on the project together as a team?
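As a rough illustration of the quantitative items requested above (summary statistics, missing values, outliers, and a confusion matrix), the following Python sketch uses only the standard library. The sample values, the function names, and the 1.5 × IQR outlier rule are assumptions made for this example; they are not part of the assignment, and any of the tools listed in this brief would serve equally well.

```python
from collections import Counter
from statistics import mean, mode, quantiles, stdev


def summarize(values):
    """Summary statistics for one numeric attribute; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "stdev": stdev(present),  # sample standard deviation
        "mode": mode(present),
    }


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR], a common rule of thumb."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # quartile cut points
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in present if v < low or v > high]


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs from a classifier's output."""
    return Counter(zip(actual, predicted))


# Hypothetical attribute values, invented purely for illustration.
ages = [23, 25, 25, 31, None, 29, 27, 95]
stats = summarize(ages)
outliers = iqr_outliers(ages)

# Hypothetical true and predicted labels for a binary classifier.
actual = ["yes", "yes", "no", "no", "yes"]
predicted = ["yes", "no", "no", "yes", "yes"]
matrix = confusion_matrix(actual, predicted)
```

With these invented values, `stats["missing"]` is 1, `outliers` is `[95]`, and `matrix[("yes", "yes")]` is 2. How to handle the flagged missing value and outlier (drop, impute, or keep) remains a decision the report should justify.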
Data Mining Tools/Languages/Libraries
• R - https://www.r-project.org
• Weka - https://www.cs.waikato.ac.nz/ml/index.html
• PowerBI - https://powerbi.microsoft.com/en-us/
Examples of Past Projects
• IMDB movie earnings prediction - https://www.imdb.com/interfaces/
• Credit card customer default prediction - https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients
• Predicting success of Kickstarter campaigns - https://www.kaggle.com/kemical/kickstarter-projects
• Predicting video game sales - https://www.kaggle.com/rush4ratio/video-game-sales-with-ratings
