Decision Tree: Notes and Interview Questions
What is a Decision Tree?

A decision tree is a flowchart-like tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of that test, and each leaf node (terminal node) holds a class label.

Advantages of Decision Tree.
- Simple to understand, interpret and visualize.
- Can be used for both classification and regression problems.
- Handles both continuous and categorical variables.
- Requires no feature scaling, because it uses a rule-based approach instead of distance calculations.
- Captures non-linear relationships between the features and the target.
- Can handle missing values automatically in some implementations, although support varies by library.
- Relatively robust to outliers in the input features, since a split depends on the ordering of values rather than on their magnitude.

Disadvantages of Decision Tree.
- Prone to overfitting, especially when the tree is grown deep without pruning, which leads to poor predictions on unseen data.
- Because of this overfitting, the model tends to have high variance: its predictions can change substantially with the training sample, which increases the error of the final estimate.
- Adding a new data point can lead to regeneration of the overall tree and all nodes...
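
To make the flowchart picture from the definition concrete, here is a minimal sketch of a hand-built tree and the traversal used to classify a sample. It is illustrative only: the attribute names (`age`, `income`), the thresholds, the class labels, and the `Node`, `Leaf` and `predict` names are all assumptions made for this example, and no tree is being learned from data here.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # class label held by a terminal node


@dataclass
class Node:
    attribute: str    # attribute tested at this internal node
    threshold: float  # the test is: sample[attribute] <= threshold
    if_true: Union["Node", Leaf]   # branch taken when the test passes
    if_false: Union["Node", Leaf]  # branch taken when the test fails


def predict(tree, sample):
    """Follow one branch per test until a leaf is reached."""
    node = tree
    while isinstance(node, Node):
        if sample[node.attribute] <= node.threshold:
            node = node.if_true
        else:
            node = node.if_false
    return node.label


# Hypothetical tree: test age first, then income for older samples.
tree = Node(
    attribute="age",
    threshold=30.0,
    if_true=Leaf("no"),
    if_false=Node(
        attribute="income",
        threshold=50000.0,
        if_true=Leaf("no"),
        if_false=Leaf("yes"),
    ),
)

print(predict(tree, {"age": 25, "income": 80000}))  # no
print(predict(tree, {"age": 40, "income": 80000}))  # yes
```

Each prediction is just a sequence of comparisons against thresholds, which is why the notes above say no feature scaling is needed: rescaling a feature and its threshold by the same positive factor leaves every comparison unchanged.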