This is an example of regression. The saved 'Beijing PM 2.5 pollutant' data set is used for the demo. The original dataset and its description can be found at the UCI dataset repository. This hourly data set contains the PM2.5 readings from the US Embassy in Beijing. 'pm2.5' is the output variable and the rest are input variables. Descriptions of each variable are listed here:

Although there is not much to change in this demo, I still want to keep it dynamic to prove the concept. The only interactive input here is the percentage of the training set (for now?). All the plots are generated in real time once the 'load' button is pressed. The tricky part here is the 'cbwd' variable. As a categorical variable, it should be encoded properly before training. Also be careful of the dummy variable trap: if every category gets its own 0/1 column, those columns always sum to 1 and are perfectly collinear with the intercept, so one category should be dropped and treated as the baseline.
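To make the encoding step concrete, here is a minimal sketch in plain Python. It is not the code behind this demo: the helper names `one_hot_drop_first` and `split_by_percentage` are hypothetical, the 'cbwd' sample values are made up for illustration, and the split is assumed to be taken in row order without shuffling, which the demo may or may not do.

```python
def one_hot_drop_first(values):
    """Encode categorical values as 0/1 columns, dropping the first level.

    Dropping one level avoids the dummy variable trap: the dropped level
    becomes the baseline and is represented by a row of all zeros.
    """
    levels = sorted(set(values))
    kept = levels[1:]  # the first level (in sorted order) is the baseline
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


def split_by_percentage(rows, train_pct):
    """Split rows in order: the first train_pct percent train, the rest test.

    Assumption: an in-order split with no shuffling.
    """
    if not (isinstance(train_pct, int) and 10 <= train_pct <= 90):
        raise ValueError("train_pct must be an integer between 10 and 90")
    cut = len(rows) * train_pct // 100
    return rows[:cut], rows[cut:]


# Made-up sample of wind-direction labels, for illustration only.
cbwd = ["NW", "cv", "NE", "SE", "NW"]
kept_levels, encoded = one_hot_drop_first(cbwd)
train, test = split_by_percentage(encoded, 80)
```

With four distinct labels, only three columns are produced, and the baseline label is encoded as all zeros. Which level is dropped does not change the fitted predictions of a linear model with an intercept, only how the coefficients are interpreted.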


Percentage of training set: % (integer only, 10~90)

Method: