Data Science Practical Exam Work: Data Set Preprocessing in Orange Tool & Visualization in Power BI

Maharshi Relia
2 min read · Nov 16, 2021

18IT110: Maharshi Chetan Relia

Task 1: What needs to be done to improve the accuracy of the classification result for the given dataset?

o Encoding
o Normalization
o Missing value handling
o Feature Selection
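
To make the four steps above concrete, here is a minimal sketch in plain Python. It is not what Orange does internally; the toy rows, the column names, and the helper functions are all hypothetical and chosen only for illustration. In a notebook one would more likely reach for pandas and scikit-learn instead of hand-written helpers.

```python
from statistics import mean, pvariance

# Toy rows (hypothetical values, not taken from the actual dataset).
rows = [
    {"word_freq": 1.0, "capital_run": 10.0, "constant": 1.0, "label": "spam"},
    {"word_freq": None, "capital_run": 30.0, "constant": 1.0, "label": "ham"},
    {"word_freq": 3.0, "capital_run": 20.0, "constant": 1.0, "label": "spam"},
]
numeric = ["word_freq", "capital_run", "constant"]


def impute_mean(rows, col):
    """Missing value handling: replace None with the column mean."""
    col_mean = mean(r[col] for r in rows if r[col] is not None)
    for r in rows:
        if r[col] is None:
            r[col] = col_mean


def min_max(rows, col):
    """Normalization: rescale a column to the [0, 1] range."""
    lo = min(r[col] for r in rows)
    hi = max(r[col] for r in rows)
    span = hi - lo
    for r in rows:
        r[col] = (r[col] - lo) / span if span else 0.0


def encode_label(rows, col, positive):
    """Encoding: map a two-valued text column to 1/0."""
    for r in rows:
        r[col] = 1 if r[col] == positive else 0


def drop_constant(rows, cols):
    """Feature selection: keep only columns with non-zero variance."""
    return [c for c in cols if pvariance([r[c] for r in rows]) > 0]


for c in numeric:
    impute_mean(rows, c)
selected = drop_constant(rows, numeric)
for c in selected:
    min_max(rows, c)
encode_label(rows, "label", positive="spam")
```

Dropping constant columns is only one simple form of feature selection; ranking features by a score against the class label is a common alternative.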

Orange Tool

Orange is an open-source data visualization, machine learning and data mining toolkit. It features a visual programming front-end for explorative rapid qualitative data analysis and interactive data visualization.

Preprocessing Done

Preprocessing is crucial for achieving better-quality analysis results. The Preprocess widget offers several preprocessing methods that can be combined in a single preprocessing pipeline. Some methods are available as separate widgets, which offer advanced techniques and greater parameter tuning.
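
The idea of combining several methods into one pipeline can be sketched as simple function composition. This is an illustration of the concept only, not the Preprocess widget's API; `make_pipeline`, `fill_missing`, and `scale_to_unit` are hypothetical names.

```python
from functools import reduce


def fill_missing(values, default=0.0):
    """Replace missing entries (None) with a default value."""
    return [default if v is None else v for v in values]


def scale_to_unit(values):
    """Rescale values to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def make_pipeline(*steps):
    """Return one function that applies the given steps left to right."""
    return lambda values: reduce(lambda acc, step: step(acc), steps, values)


# Order matters: scaling would fail on None, so missing values are filled first.
preprocess = make_pipeline(fill_missing, scale_to_unit)
result = preprocess([4.0, None, 2.0])
```

Here the missing entry is filled with 0.0 before scaling, so the order of the steps changes the result.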

Task 2: Dashboard of the preprocessed dataset in Power BI

Power BI is a business analytics service by Microsoft. It aims to provide interactive visualizations and business intelligence capabilities with an interface simple enough for end users to create their own reports and dashboards.

Dashboard with entities
Visualization of entire Data set
Pie-Chart of Capital gain and character frequency
Column Chart of entire dataset
Statistics

Conclusion

Thus, we preprocessed the Spambase dataset in Orange; the same can be done in Google Colab or Jupyter. We then imported the resulting CSV file into Power BI and built the visualizations with the relevant entities and statistics.
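
For readers taking the notebook route, a minimal sketch of producing a CSV that Power BI can import might look like the following. The rows and column names are hypothetical placeholders, and the text is built in memory here so the example stays self-contained.

```python
import csv
import io

# Hypothetical preprocessed rows; real data would come from the earlier steps.
rows = [
    {"word_freq": 0.0, "capital_run": 0.0, "label": 1},
    {"word_freq": 0.5, "capital_run": 1.0, "label": 0},
]


def to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(rows)
```

Writing `csv_text` to a file (any name will do) gives a CSV that can be loaded through Power BI's Text/CSV data source.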

