
PFDA assignment question file
Attr.txt (this file describes the attributes and must be used for them in RStudio)
Student.csv (this file is loaded into RStudio as the data to analyse)
I am required to create my own questions and analyses in my assignment. I need 4 questions followed by 5 analyses that are relevant to, and can be applied to, the attributes in the Attr.txt file.
Degree Students Academic Performance Dataset
For the assignment, you are asked to explore the application of data analytics techniques to the dataset which is provided. You must study data problems related to the dataset, giving special consideration to the unique properties of the problem domain, and testing one or more techniques on it.
Your analysis should be deep and detailed, and it must go further than what has already been covered in this course. You must adopt the data Exploration, Manipulation, Transformation and Visualization concepts to guide you through the solution process. It is very important to explain and justify the techniques that have been chosen.
You may also need to pre-process your data to get it into an appropriate format. The assignment should apply a number of techniques, categorized under different criteria, with a detailed exploration of the commands used under each criterion. Outline the findings, analyse them and justify them with an appropriate graph. A supporting document is also needed to present the graphs and code, using R programming concepts.
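As a rough illustration of the pre-processing step, the sketch below reads a CSV file into R, inspects it and tidies it before analysis. The column names (studytime, schoolsup, L3) and the tiny inline sample are assumptions made up for this example; the real names come from Attr.txt and the real data from Student.csv. Dropping rows with a missing score is only one possible policy.

```r
# Hypothetical sample standing in for Student.csv; column names are assumed.
sample_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "studytime,schoolsup,L3",
  "2,yes,14",
  "3,no,",
  "1,yes,9"
), sample_csv)

# Import: keep strings as character so they can be converted deliberately.
students <- read.csv(sample_csv, stringsAsFactors = FALSE)

# Inspect the structure and count missing values per column.
str(students)
missing_per_column <- colSums(is.na(students))
print(missing_per_column)

# Clean: drop rows with a missing final score (one possible policy).
students_clean <- students[!is.na(students$L3), ]

# Transform: turn the yes/no text column into a logical column.
students_clean$schoolsup <- students_clean$schoolsup == "yes"
```

With the real file, the same steps would start from read.csv("Student.csv") and use whichever columns Attr.txt actually defines.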

This assignment will help you to explore and analyse a set of data and reconstruct it into meaningful representations for decision making.
3.0 TYPE
Individual Assignment
This dataset contains the three-year final scores of degree students at the end of their academic programs, together with several features that might or might not affect these students' performance. It includes their personal details, academic performance, family backgrounds and daily routines. You need to analyse how these factors influence their academic performance. The dataset contains 33 columns: 16 with integer values, 9 with string values and 8 with boolean values.
The dataset should be explored using the data exploration, manipulation, transformation, and visualization techniques covered in the course. As an additional feature, you must explore further concepts that can improve the retrieval effects.
The dataset provided for this assignment relates to degree students' academic performance. It contains 32 columns and 396 rows. It contains data related to their academic results, family background and daily routine.
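One way the manipulation, transformation and visualization techniques might fit together is sketched below in base R. The data frame and its column names (studytime, L1, L2, L3) are hypothetical stand-ins, not taken from Attr.txt, and the plot is written to a temporary PDF only so that the sketch runs without a display.

```r
# Hypothetical data; column names are assumed, not taken from Attr.txt.
students <- data.frame(
  studytime = c(1, 2, 2, 3, 3, 4),
  L1 = c(8, 10, 12, 13, 15, 17),
  L2 = c(9, 11, 12, 14, 14, 18),
  L3 = c(9, 10, 13, 14, 16, 18)
)

# Transformation: derive an average score across the three levels.
students$avg_score <- rowMeans(students[, c("L1", "L2", "L3")])

# Manipulation: mean Level 3 score for each study-time group.
mean_by_studytime <- aggregate(L3 ~ studytime, data = students, FUN = mean)
print(mean_by_studytime)

# Visualization: box plot of Level 3 score per study-time group.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
boxplot(L3 ~ studytime, data = students,
        xlab = "Study time (assumed scale)", ylab = "Level 3 score",
        main = "Level 3 score by study time")
dev.off()
```

In RStudio the pdf() and dev.off() calls can be dropped so the plot appears in the Plots pane instead.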
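Because the brief quotes both 33 and 32 columns, it may be worth checking the dimensions and column types directly once the data is loaded. The sketch below does this on a small hypothetical data frame; the column names are made up for illustration. If the boolean attributes are stored as text such as yes/no (an assumption), R will report them as character until they are converted.

```r
# Hypothetical three-column data frame; the real one is read from Student.csv.
students <- data.frame(
  age = c(18L, 19L, 20L),
  school = c("A", "B", "A"),
  internet = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Number of rows and columns.
print(dim(students))

# Class of each column, then a tally of how many columns share each class.
column_types <- sapply(students, class)
type_counts <- table(column_types)
print(type_counts)
```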

• The R program should compile and be executed without errors.
• Validation should be done for each entry from the users to avoid logical errors.
• No duplication is allowed in dataset.
• You should:
o Include good programming practices such as comments, variable naming conventions and indentation.
o Carry out additional research on the Internet to deepen your knowledge of the given dataset when examining the data.
• The analysis should be meaningful and effective in providing the information for the decision making.
• Any additional features implemented must improve the retrieval effects.
• In a situation where a student:
o Fails to demo the assignment, the overall marks awarded for the assignment will be adjusted to 50% of the overall existing marks if they are more than 50%.
o Is found to be involved in plagiarism, the offence will be dealt with in accordance with APU and Staffordshire University regulations on plagiarism.
• The complete RScript (source code) and report must be submitted to APU Learning Management System (Moodle).
• RScript (Program Code):
o Name the file under your name and TP number.
o Start the first two lines in your program by typing your name and TP number. For example:
o For each question example, give an id and explain what you want to know. For example:
# Question 1: How the student achieves better score in Level 3.
# Analysis 1-1: Find the relationship between extra education support with L1 and L2 score…
# Analysis 1-2: Find the relationship between study time and ….
# Analysis 1-3: Find the relationship between …
o For each extra feature example, give an id and provide the explanation.
# Extra feature 1
# comments about the extra feature
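The script conventions above, together with the duplicate and validation requirements, could be laid out roughly as in the following skeleton. The name and TP number lines are placeholders, the data frame and its columns (schoolsup, L1, L3) are hypothetical, and validate_column is a made-up helper for this sketch rather than a prescribed function.

```r
# <Student name>  (placeholder)
# <TP number>     (placeholder)

# Hypothetical data with one repeated row; real data comes from Student.csv.
students <- data.frame(
  schoolsup = c("yes", "no", "no", "yes"),
  L1 = c(12, 9, 9, 15),
  L3 = c(13, 10, 10, 16)
)

# Remove exact duplicate rows, since no duplication is allowed in the dataset.
students <- students[!duplicated(students), ]

# Validate a user-supplied column name before using it (helper name is made up).
validate_column <- function(data, column) {
  if (!is.character(column) || length(column) != 1 || !(column %in% names(data))) {
    stop("Unknown column: ", paste(column, collapse = ", "))
  }
  column
}

# Question 1: How does the student achieve a better score in Level 3?
# Analysis 1-1: Relationship between extra education support and L1 score.
score_column <- validate_column(students, "L1")
support_means <- tapply(students[[score_column]], students$schoolsup, mean)
print(support_means)
```

Calling validate_column(students, "age") on this data stops with an error, which is one simple way to catch a mistyped column name before it causes a logical error later in the script.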
• As part of the assessment, you must submit the project report in printed and softcopy form, which should have the following format:
A) Cover Page:
All reports must be prepared with a front cover. A protective transparent plastic sheet can be placed in front of the report to protect the front cover. The front cover should be presented with the following details:
o Module
o Coursework Title
o Intake
o Student name and id
o Date Assigned (the date the report was handed out).
o Date Completed (the date the report is due to be handed in).
B) Contents:
o Introduction and assumptions (if any)
o Data import / Cleaning / pre-processing / transformation
o Each question must start in a separate page and contains:
▪ Analysis Techniques - data exploration / manipulation / visualization
▪ Screenshot of source code with the explanation.
▪ Screenshot of output/plot with the explanation.
▪ Outline the findings based on the results obtained.
o The extra feature explanation must be in a separate page and contains:
▪ Screenshot of source code with the explanation.
▪ Screenshot of output/plot with the explanation.
▪ Explain how adding this extra feature can improve the results.

C) Conclusion
D) References
• The font size used in the report must be 12pt and the font is Times New Roman. Full source code is not allowed to be included in the report. The report must be typed and clearly printed.
• You may source algorithms and information from the Internet or books. Proper referencing of the resources should be evident in the document.
• All references must be made using the Harvard Naming Convention as shown below:

The theory was first propounded in 1970 (Larsen, A.E. 1971), but since then has been refuted; M.K. Larsen (1983) is among those most energetic in their opposition……….
* Following source code obtained from (Danang, S.N. 2002)
int noshape=2;
• List of references at the end of your document or source code must be specified in the following format:
Larsen, A.E. 1971, A Guide to the Aquatic Science Literature, McGraw-Hill, London.
Larsen, M.K. 1983, British Medical Journal [Online], Available from (Accessed 19 November 1995)
Danang, S.N., 2002, Finding Similar Images [Online], The Code Project, Available from, [Accessed 14th September 2006]
• Further information on other types of citation is available in Petrie, A., 2003, UWE Library Services Study Skills: How to reference [online], England, University of the West of England, Available from, [Accessed 4th September 2003].
The assignment assessment consists of three components: Coding (50%), Documentation (30%) and Presentation (20%). Details of the division for each component are as follows:
Coding (50%)
Criteria                Marks Allocated
Data Exploration        10%
Data Manipulation       10%
Data Transformation
Data Visualization      20%

Documentation (30%)
Criteria
Structure of the report and references
• Description and justification of the R concepts incorporated.
• Program output screenshots, graphs
Project description, limitation, and conclusion

Presentation (20%)
Criteria                Marks Allocated
• Ability to answer questions asked by the lecturer about the work done and presented        20%
The program for this assignment should be written in R using RStudio.
• You are expected to maintain the utmost level of academic integrity for the duration of the course.
• Plagiarism is a serious offence and will be dealt with according to APU and Staffordshire University regulations on plagiarism.