Data Understanding

Dataset description

This is a user behavior analysis problem framed as binary classification: the goal is to predict whether a user intends to perform a transaction. A user who does not perform a transaction is assigned to the negative class, while a user who performs a transaction is assigned to the positive class.

The data contains features for 12,330 sessions, each created when a user visited the company website. Each session belongs to a different user. The data covers a period of one year to avoid bias toward a specific campaign, special day, user profile, or period. Of all observations, 84.5% are labeled as the negative class and the remaining 15.5% as the positive class.
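
To make the class imbalance concrete, the following minimal sketch derives approximate class counts and the accuracy of a trivial majority-class predictor from the two figures stated above. The function name `class_summary` is illustrative, and the counts are only approximate because the 84.5% figure is rounded.

```python
# Figures taken from the description above.
TOTAL_SESSIONS = 12330
NEGATIVE_SHARE = 0.845  # rounded percentage, so derived counts are approximate


def class_summary(total, negative_share):
    """Return approximate class counts and the majority-class baseline."""
    negatives = round(total * negative_share)
    positives = total - negatives
    return {
        "negatives": negatives,
        "positives": positives,
        # How many negative sessions there are per positive session.
        "imbalance_ratio": negatives / positives,
        # Accuracy of a model that always predicts the negative class.
        "baseline_accuracy": negatives / total,
    }


summary = class_summary(TOTAL_SESSIONS, NEGATIVE_SHARE)
print(summary)
```

A model that always predicts the negative class would already reach roughly 84.5% accuracy, so accuracy alone is a weak yardstick here; metrics that account for the positive class, such as recall or F1, are likely to be more informative.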

The dataset has 18 features. They cover the types of pages visited, metrics collected from Google Analytics, information about the period in which the transaction was performed, and general information about the user's system. The dataset contains both numerical and categorical features.
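
Because numerical and categorical features usually need different preprocessing, a first step is often to separate the two groups. The sketch below shows one simple way to do this with the standard library only. The column names and values in `SAMPLE_CSV` are hypothetical placeholders, not the actual columns of the dataset, and the rule "a column is numerical if every value parses as a number" is an assumption made for illustration.

```python
import csv
import io

# Toy sample with HYPOTHETICAL column names and values (not the real dataset).
SAMPLE_CSV = """page_views,avg_duration,month,returning_visitor,purchased
3,12.5,Feb,TRUE,FALSE
10,240.0,Nov,FALSE,TRUE
1,0.0,Mar,TRUE,FALSE
"""


def is_number(value):
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def split_feature_types(rows, target):
    """Split feature columns into numerical and categorical lists."""
    numerical, categorical = [], []
    for column in rows[0]:
        if column == target:
            continue  # the label is not a feature
        if all(is_number(row[column]) for row in rows):
            numerical.append(column)
        else:
            categorical.append(column)
    return numerical, categorical


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
numerical, categorical = split_feature_types(rows, target="purchased")
print("numerical:", numerical)
print("categorical:", categorical)
```

This heuristic treats integer-coded categories (for example, a numeric region or browser identifier) as numerical, so the resulting lists should be checked against the dataset documentation before being used for preprocessing.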