Collaborative Filtering (Recommender) - Python Code Part 2

Collaborative Filtering Using GraphLab for Implicit Dataset

Objective : Apply a recommender of collaborative Filtering to propose recommendation to new users (with no transaction)

Note: Graphlab's recommender is able to compare all suitable/ relevant model, and select the best model as a recommender.

Step 1 of 4: Upload Relevant Libraries

Note: Need to sign up for academic license for the use of Graphlab's Library & Python 2 only.

In [1]:
import pandas as pd
    import graphlab as gl
    

Step 2 of 4: Download & Prepare Dataset

Note: print the data to check if the dataset has been imported correctly

In [2]:
df = pd.read_csv('dataset\dataset02_master.csv', sep = ',')
    print(df.head(5)) ## check data
    
D:\Program Files\Anaconda2\envs\gl-env\lib\site-packages\IPython\core\interactiveshell.py:2723: DtypeWarning: Columns (8) have mixed types. Specify dtype option on import or set low_memory=False.
      interactivity=interactivity, compiler=compiler, result=result)
    
     Type    Card_ID  SegmentNo Gender  Age Age_Grp  Length_of_Membership_MTH  \
    0  Active  104829316          5      F   40   35-44                         0
    1  Active  101480021          5      M   44   35-44                         0
    2  Active  104219628          5      M   21   15-24                         0
    3  Active  104219628          5      M   21   15-24                         0
    4  Active  106272169          5      M   29   25-34                         0

      Membership_Grp           pdt_type  total_count
    0        <= 1 YR    MP3Players_high          2.0
    1        <= 1 YR    MP3Players_high          2.0
    2        <= 1 YR  MP3Players_medium          2.0
    3        <= 1 YR    MP3Players_high          1.0
    4        <= 1 YR    Hardware_medium          1.0
    
2 datasets are required - (1) recommendation filtering , (2) User's information
In [3]:
# Step 1: Prepare Datatset 

    ## Set 1, dataset for recommendation filtering
    df_all = df[['Card_ID','pdt_type']].astype(str)

    ## Set 2, dataset for user info
    df_user_data = df[['Card_ID','Gender', 'Age', 'Age_Grp',
           'Length_of_Membership_MTH', 'Membership_Grp','Type']].\
           drop_duplicates().reset_index(drop = True)
    df_user_data.astype(str)

    ## convert into S-Frame
    df_all_SFrame = gl.SFrame(df_all)
    df_user_data_SFrame = gl.SFrame(df_user_data)
    
This non-commercial license of GraphLab Create for academic use is assigned to hanying.ong.2015@mitb.smu.edu.sg and will expire on June 10, 2018.
    
[INFO] graphlab.cython.cy_server: GraphLab Create v2.1 started. Logging: C:\Users\HANYIN~1.201\AppData\Local\Temp\graphlab_server_1497427481.log.0
    

Step 3 of 4 : Create recommendation model

In [4]:
all_model = gl.recommender.create \
                    (df_all_SFrame, user_id='Card_ID', item_id='pdt_type',\
                    user_data=df_user_data_SFrame)

    recs_final  = all_model.recommend()
    results_final= recs_final.to_dataframe()
    
Recsys training: model = ranking_factorization_recommender
Preparing data set.
    Data has 251946 observations with 280264 users and 22 items.
    Data prepared in: 1.48602s
Training ranking_factorization_recommender for recommendations.
+--------------------------------+--------------------------------------------------+----------+
| Parameter                      | Description                                      | Value    |
+--------------------------------+--------------------------------------------------+----------+
| num_factors                    | Factor Dimension                                 | 32       |
| regularization                 | L2 Regularization on Factors                     | 1e-009   |
| solver                         | Solver used for training                         | adagrad  |
| linear_regularization          | L2 Regularization on Linear Coefficients         | 1e-009   |
| binary_target                  | Assume Binary Targets                            | True     |
| side_data_factorization        | Assign Factors for Side Data                     | True     |
| max_iterations                 | Maximum Number of Iterations                     | 25       |
+--------------------------------+--------------------------------------------------+----------+
  Optimizing model using SGD; tuning step size.
  Using 31493 / 251946 points for tuning the step size.
+---------+-------------------+------------------------------------------+
| Attempt | Initial Step Size | Estimated Objective Value                |
+---------+-------------------+------------------------------------------+
| 0       | 6.25              | 0.133722                                 |
| 1       | 3.125             | 0.026804                                 |
| 2       | 1.5625            | 0.0121303                                |
| 3       | 0.78125           | 0.0123325                                |
| 4       | 0.390625          | 0.0141736                                |
| 5       | 0.195312          | 0.0284249                                |
+---------+-------------------+------------------------------------------+
| Final   | 1.5625            | 0.0121303                                |
+---------+-------------------+------------------------------------------+
Starting Optimization.
+---------+--------------+-------------------+-----------------------------------+-------------+
| Iter.   | Elapsed Time | Approx. Objective | Approx. Training Predictive Error | Step Size   |
+---------+--------------+-------------------+-----------------------------------+-------------+
| Initial | 0us          | 1.38643           | 0.693178                          |             |
+---------+--------------+-------------------+-----------------------------------+-------------+
| 1       | 2.38s        | 1.6155            | 0.794203                          | 1.5625      |
| 2       | 4.65s        | 0.771283          | 0.375829                          | 1.5625      |
| 3       | 6.57s        | 0.568471          | 0.276656                          | 1.5625      |
| 4       | 8.40s        | 0.354027          | 0.165047                          | 1.5625      |
| 5       | 10.19s       | 0.20638           | 0.089717                          | 1.5625      |
| 6       | 12.52s       | 0.117082          | 0.0436964                         | 1.5625      |
| 7       | 14.75s       | 0.0669788         | 0.0200838                         | 1.5625      |
| 8       | 19.25s       | 0.0405952         | 0.00932651                        | 1.5625      |
| 9       | 22.82s       | 0.0289993         | 0.00520643                        | 1.5625      |
| 10      | 26.31s       | 0.0228424         | 0.00329324                        | 1.5625      |
| 11      | 29.06s       | 0.0191343         | 0.00220472                        | 1.5625      |
| 12      | 32.37s       | 0.0174462         | 0.00167625                        | 1.5625      |
| 13      | 35.07s       | 0.0157746         | 0.00126697                        | 1.5625      |
| 14      | 37.53s       | 0.0145576         | 0.00102637                        | 1.5625      |
| 15      | 40.25s       | 0.0139252         | 0.000754554                       | 1.5625      |
| 16      | 42.31s       | 0.0134434         | 0.0006686                         | 1.5625      |
| 17      | 44.74s       | 0.0130922         | 0.000610974                       | 1.5625      |
| 18      | 47.50s       | 0.0125131         | 0.00045735                        | 1.5625      |
| 19      | 51.19s       | 0.0125824         | 0.000387206                       | 1.5625      |
| 20      | 53.99s       | 0.0121075         | 0.000340692                       | 1.5625      |
| 21      | 55.94s       | 0.0119453         | 0.000302641                       | 1.5625      |
| 22      | 58.24s       | 0.0118297         | 0.000253013                       | 1.5625      |
| 23      | 1m 1s        | 0.0115955         | 0.000221591                       | 1.5625      |
| 24      | 1m 3s        | 0.0115526         | 0.000197239                       | 1.5625      |
| 25      | 1m 6s        | 0.0114544         | 0.000177922                       | 1.5625      |
+---------+--------------+-------------------+-----------------------------------+-------------+
Optimization Complete: Maximum number of passes through the data reached.
Computing final objective value and training Predictive Error.
       Final objective value: 0.0114165
       Final training Predictive Error: 0.000161572
WARNING: Differing categorical key types present in list or dictionary on column Card_ID; promoting all to string type.
recommendations finished on 1000/280264 queries. users per second: 64036.9
recommendations finished on 2000/280264 queries. users per second: 20644.3
recommendations finished on 3000/280264 queries. users per second: 27300.5
recommendations finished on 4000/280264 queries. users per second: 32023.6
recommendations finished on 5000/280264 queries. users per second: 28745.4
recommendations finished on 6000/280264 queries. users per second: 31096.3
recommendations finished on 7000/280264 queries. users per second: 29167
recommendations finished on 8000/280264 queries. users per second: 29195.8
recommendations finished on 9000/280264 queries. users per second: 27773.8
recommendations finished on 10000/280264 queries. users per second: 28163.8
recommendations finished on 11000/280264 queries. users per second: 27562.4
recommendations finished on 12000/280264 queries. users per second: 28428.6
recommendations finished on 13000/280264 queries. users per second: 27887.3
recommendations finished on 14000/280264 queries. users per second: 28330.3
recommendations finished on 15000/280264 queries. users per second: 27312.5
recommendations finished on 16000/280264 queries. users per second: 28308
recommendations finished on 17000/280264 queries. users per second: 27364.2
recommendations finished on 18000/280264 queries. users per second: 27938
recommendations finished on 19000/280264 queries. users per second: 28053
recommendations finished on 20000/280264 queries. users per second: 26063.3
recommendations finished on 21000/280264 queries. users per second: 26464.7
recommendations finished on 22000/280264 queries. users per second: 26674.3
recommendations finished on 23000/280264 queries. users per second: 25580.7
recommendations finished on 24000/280264 queries. users per second: 26083.9
recommendations finished on 25000/280264 queries. users per second: 25046
recommendations finished on 26000/280264 queries. users per second: 25610.8
recommendations finished on 27000/280264 queries. users per second: 26056.5
recommendations finished on 28000/280264 queries. users per second: 24882.9
recommendations finished on 29000/280264 queries. users per second: 25385.8
recommendations finished on 30000/280264 queries. users per second: 25906.6
recommendations finished on 31000/280264 queries. users per second: 25240
recommendations finished on 32000/280264 queries. users per second: 25677.4
recommendations finished on 33000/280264 queries. users per second: 25438.1
recommendations finished on 34000/280264 queries. users per second: 25654.8
recommendations finished on 35000/280264 queries. users per second: 25654
recommendations finished on 36000/280264 queries. users per second: 25745
recommendations finished on 37000/280264 queries. users per second: 25958.2
recommendations finished on 38000/280264 queries. users per second: 25914.2
recommendations finished on 39000/280264 queries. users per second: 25872.2
recommendations finished on 40000/280264 queries. users per second: 26187.6
recommendations finished on 41000/280264 queries. users per second: 25958.1
recommendations finished on 42000/280264 queries. users per second: 26341
recommendations finished on 43000/280264 queries. users per second: 26683.3
recommendations finished on 44000/280264 queries. users per second: 26276.1
recommendations finished on 45000/280264 queries. users per second: 26540.1
recommendations finished on 46000/280264 queries. users per second: 26420.3
recommendations finished on 47000/280264 queries. users per second: 26657.6
recommendations finished on 48000/280264 queries. users per second: 26964.4
recommendations finished on 49000/280264 queries. users per second: 26942.7
recommendations finished on 50000/280264 queries. users per second: 27075.1
recommendations finished on 51000/280264 queries. users per second: 27102.9
recommendations finished on 52000/280264 queries. users per second: 27314.7
recommendations finished on 53000/280264 queries. users per second: 27252.7
recommendations finished on 54000/280264 queries. users per second: 27372.3
recommendations finished on 55000/280264 queries. users per second: 27401.2
recommendations finished on 56000/280264 queries. users per second: 27488.1
recommendations finished on 57000/280264 queries. users per second: 27626.3
recommendations finished on 58000/280264 queries. users per second: 27576
recommendations finished on 59000/280264 queries. users per second: 27656.4
recommendations finished on 60000/280264 queries. users per second: 27696.7
recommendations finished on 61000/280264 queries. users per second: 27672.5
recommendations finished on 62000/280264 queries. users per second: 27898.1
recommendations finished on 63000/280264 queries. users per second: 28132.8
recommendations finished on 64000/280264 queries. users per second: 27844.9
recommendations finished on 65000/280264 queries. users per second: 28036.2
recommendations finished on 66000/280264 queries. users per second: 27913.1
recommendations finished on 67000/280264 queries. users per second: 28004.3
recommendations finished on 68000/280264 queries. users per second: 27920.2
recommendations finished on 69000/280264 queries. users per second: 28052.9
recommendations finished on 70000/280264 queries. users per second: 28034.7
recommendations finished on 71000/280264 queries. users per second: 28265.3
recommendations finished on 72000/280264 queries. users per second: 28114.6
recommendations finished on 73000/280264 queries. users per second: 28372
recommendations finished on 74000/280264 queries. users per second: 28627
recommendations finished on 75000/280264 queries. users per second: 28398.2
recommendations finished on 76000/280264 queries. users per second: 28625
recommendations finished on 77000/280264 queries. users per second: 28763
recommendations finished on 78000/280264 queries. users per second: 28963.5
recommendations finished on 79000/280264 queries. users per second: 28937
recommendations finished on 80000/280264 queries. users per second: 29068.8
recommendations finished on 81000/280264 queries. users per second: 29251.4
recommendations finished on 82000/280264 queries. users per second: 29118
recommendations finished on 83000/280264 queries. users per second: 29234.1
recommendations finished on 84000/280264 queries. users per second: 29236
recommendations finished on 85000/280264 queries. users per second: 29298.2
recommendations finished on 86000/280264 queries. users per second: 28837.2
recommendations finished on 87000/280264 queries. users per second: 28760.6
recommendations finished on 88000/280264 queries. users per second: 28941.5
recommendations finished on 89000/280264 queries. users per second: 28323.6
recommendations finished on 90000/280264 queries. users per second: 28450.1
recommendations finished on 91000/280264 queries. users per second: 28621.3
recommendations finished on 92000/280264 queries. users per second: 28104.3
recommendations finished on 93000/280264 queries. users per second: 27982
recommendations finished on 94000/280264 queries. users per second: 28088.3
recommendations finished on 95000/280264 queries. users per second: 28226.9
recommendations finished on 96000/280264 queries. users per second: 28180.4
recommendations finished on 97000/280264 queries. users per second: 28249.8
recommendations finished on 98000/280264 queries. users per second: 28310.2
recommendations finished on 99000/280264 queries. users per second: 28263.9
recommendations finished on 100000/280264 queries. users per second: 28379.2
recommendations finished on 101000/280264 queries. users per second: 28317.3
recommendations finished on 102000/280264 queries. users per second: 28351.1
recommendations finished on 103000/280264 queries. users per second: 28368.6
recommendations finished on 104000/280264 queries. users per second: 28239.3
recommendations finished on 105000/280264 queries. users per second: 28102.3
recommendations finished on 106000/280264 queries. users per second: 28106.3
recommendations finished on 107000/280264 queries. users per second: 27955.8
recommendations finished on 108000/280264 queries. users per second: 27910.6
recommendations finished on 109000/280264 queries. users per second: 27952.1
recommendations finished on 110000/280264 queries. users per second: 27943.3
recommendations finished on 111000/280264 queries. users per second: 27955.6
recommendations finished on 112000/280264 queries. users per second: 28101.2
recommendations finished on 113000/280264 queries. users per second: 28077.1
recommendations finished on 114000/280264 queries. users per second: 28171.5
recommendations finished on 115000/280264 queries. users per second: 28327.6
recommendations finished on 116000/280264 queries. users per second: 28219.3
recommendations finished on 117000/280264 queries. users per second: 28338.3
recommendations finished on 118000/280264 queries. users per second: 28272.2
recommendations finished on 119000/280264 queries. users per second: 28409.5
recommendations finished on 120000/280264 queries. users per second: 28532.4
recommendations finished on 121000/280264 queries. users per second: 28498.9
recommendations finished on 122000/280264 queries. users per second: 28579.6
recommendations finished on 123000/280264 queries. users per second: 28532.8
recommendations finished on 124000/280264 queries. users per second: 28625.4
recommendations finished on 125000/280264 queries. users per second: 28637.9
recommendations finished on 126000/280264 queries. users per second: 28728.6
recommendations finished on 127000/280264 queries. users per second: 28785.8
recommendations finished on 128000/280264 queries. users per second: 28849
recommendations finished on 129000/280264 queries. users per second: 28795.1
recommendations finished on 130000/280264 queries. users per second: 28723.2
recommendations finished on 131000/280264 queries. users per second: 28785
recommendations finished on 132000/280264 queries. users per second: 28770.7
recommendations finished on 133000/280264 queries. users per second: 28806.4
recommendations finished on 134000/280264 queries. users per second: 28816.9
recommendations finished on 135000/280264 queries. users per second: 28845.7
recommendations finished on 136000/280264 queries. users per second: 28904.9
recommendations finished on 137000/280264 queries. users per second: 28914.4
recommendations finished on 138000/280264 queries. users per second: 28942
recommendations finished on 139000/280264 queries. users per second: 29005.7
recommendations finished on 140000/280264 queries. users per second: 28847
recommendations finished on 141000/280264 queries. users per second: 28803.6
recommendations finished on 142000/280264 queries. users per second: 28872.1
recommendations finished on 143000/280264 queries. users per second: 28893.2
recommendations finished on 144000/280264 queries. users per second: 28908.1
recommendations finished on 145000/280264 queries. users per second: 28986.6
recommendations finished on 146000/280264 queries. users per second: 29052.8
recommendations finished on 147000/280264 queries. users per second: 28872.4
recommendations finished on 148000/280264 queries. users per second: 28785.9
recommendations finished on 149000/280264 queries. users per second: 28762
recommendations finished on 150000/280264 queries. users per second: 28804.8
recommendations finished on 151000/280264 queries. users per second: 28639.4
recommendations finished on 152000/280264 queries. users per second: 28622.2
recommendations finished on 153000/280264 queries. users per second: 28675.4
recommendations finished on 154000/280264 queries. users per second: 28441.4
recommendations finished on 155000/280264 queries. users per second: 28536.4
recommendations finished on 156000/280264 queries. users per second: 28494.8
recommendations finished on 157000/280264 queries. users per second: 28474.4
recommendations finished on 158000/280264 queries. users per second: 28382.8
recommendations finished on 159000/280264 queries. users per second: 28390.3
recommendations finished on 160000/280264 queries. users per second: 28391.3
recommendations finished on 161000/280264 queries. users per second: 28422.3
recommendations finished on 162000/280264 queries. users per second: 28360.9
recommendations finished on 163000/280264 queries. users per second: 28451.2
recommendations finished on 164000/280264 queries. users per second: 28545.9
recommendations finished on 165000/280264 queries. users per second: 28486.9
recommendations finished on 166000/280264 queries. users per second: 28484.6
recommendations finished on 167000/280264 queries. users per second: 28524
recommendations finished on 168000/280264 queries. users per second: 28509.5
recommendations finished on 169000/280264 queries. users per second: 28591.9
recommendations finished on 170000/280264 queries. users per second: 28659.2
recommendations finished on 171000/280264 queries. users per second: 28692.2
recommendations finished on 172000/280264 queries. users per second: 28768.3
recommendations finished on 173000/280264 queries. users per second: 28767
recommendations finished on 174000/280264 queries. users per second: 28770.5
recommendations finished on 175000/280264 queries. users per second: 28755.1
recommendations finished on 176000/280264 queries. users per second: 28442.3
recommendations finished on 177000/280264 queries. users per second: 28069.5
recommendations finished on 178000/280264 queries. users per second: 27856.8
recommendations finished on 179000/280264 queries. users per second: 27640.1
recommendations finished on 180000/280264 queries. users per second: 27443.6
recommendations finished on 181000/280264 queries. users per second: 27048.8
recommendations finished on 182000/280264 queries. users per second: 26745.8
recommendations finished on 183000/280264 queries. users per second: 26254.4
recommendations finished on 184000/280264 queries. users per second: 25776.5
recommendations finished on 185000/280264 queries. users per second: 25368.9
recommendations finished on 186000/280264 queries. users per second: 24907.8
recommendations finished on 187000/280264 queries. users per second: 24631.6
recommendations finished on 188000/280264 queries. users per second: 24249.1
recommendations finished on 189000/280264 queries. users per second: 24044.3
recommendations finished on 190000/280264 queries. users per second: 23888.7
recommendations finished on 191000/280264 queries. users per second: 23751.2
recommendations finished on 192000/280264 queries. users per second: 23590.2
recommendations finished on 193000/280264 queries. users per second: 23399.2
recommendations finished on 194000/280264 queries. users per second: 23083.6
recommendations finished on 195000/280264 queries. users per second: 22704
recommendations finished on 196000/280264 queries. users per second: 22386.5
recommendations finished on 197000/280264 queries. users per second: 22238.7
recommendations finished on 198000/280264 queries. users per second: 22059.7
recommendations finished on 199000/280264 queries. users per second: 21627.4
recommendations finished on 200000/280264 queries. users per second: 21467.7
recommendations finished on 201000/280264 queries. users per second: 21300.4
recommendations finished on 202000/280264 queries. users per second: 21154.4
recommendations finished on 203000/280264 queries. users per second: 21059.6
recommendations finished on 204000/280264 queries. users per second: 20930.8
recommendations finished on 205000/280264 queries. users per second: 20789
recommendations finished on 206000/280264 queries. users per second: 20565.3
recommendations finished on 207000/280264 queries. users per second: 20294.1
recommendations finished on 208000/280264 queries. users per second: 20119.9
recommendations finished on 209000/280264 queries. users per second: 20001.2
recommendations finished on 210000/280264 queries. users per second: 19834.1
recommendations finished on 211000/280264 queries. users per second: 19682
recommendations finished on 212000/280264 queries. users per second: 19584.8
recommendations finished on 213000/280264 queries. users per second: 19528.1
recommendations finished on 214000/280264 queries. users per second: 19470.4
recommendations finished on 215000/280264 queries. users per second: 19386.6
recommendations finished on 216000/280264 queries. users per second: 19274.6
recommendations finished on 217000/280264 queries. users per second: 19193.9
recommendations finished on 218000/280264 queries. users per second: 19059.6
recommendations finished on 219000/280264 queries. users per second: 18868
recommendations finished on 220000/280264 queries. users per second: 18649.1
recommendations finished on 221000/280264 queries. users per second: 18458
recommendations finished on 222000/280264 queries. users per second: 18290.3
recommendations finished on 223000/280264 queries. users per second: 18139.5
recommendations finished on 224000/280264 queries. users per second: 17998.1
recommendations finished on 225000/280264 queries. users per second: 17930
recommendations finished on 226000/280264 queries. users per second: 17860.1
recommendations finished on 227000/280264 queries. users per second: 17797.9
recommendations finished on 228000/280264 queries. users per second: 17701.1
recommendations finished on 229000/280264 queries. users per second: 17610.6
recommendations finished on 230000/280264 queries. users per second: 17520.1
recommendations finished on 231000/280264 queries. users per second: 17409.5
recommendations finished on 232000/280264 queries. users per second: 17182.4
recommendations finished on 233000/280264 queries. users per second: 17006.7
recommendations finished on 234000/280264 queries. users per second: 16891.9
recommendations finished on 235000/280264 queries. users per second: 16719.3
recommendations finished on 236000/280264 queries. users per second: 16641.6
recommendations finished on 237000/280264 queries. users per second: 16554.4
recommendations finished on 238000/280264 queries. users per second: 16492.8
recommendations finished on 239000/280264 queries. users per second: 16406.3
recommendations finished on 240000/280264 queries. users per second: 16318
recommendations finished on 241000/280264 queries. users per second: 16266.4
recommendations finished on 242000/280264 queries. users per second: 16243.6
recommendations finished on 243000/280264 queries. users per second: 16077.9
recommendations finished on 244000/280264 queries. users per second: 15885.1
recommendations finished on 245000/280264 queries. users per second: 15711.7
recommendations finished on 246000/280264 queries. users per second: 15581.8
recommendations finished on 247000/280264 queries. users per second: 15491
recommendations finished on 248000/280264 queries. users per second: 15446.3
recommendations finished on 249000/280264 queries. users per second: 15397.7
recommendations finished on 250000/280264 queries. users per second: 15364.1
recommendations finished on 251000/280264 queries. users per second: 15270.6
recommendations finished on 252000/280264 queries. users per second: 15179.9
recommendations finished on 253000/280264 queries. users per second: 15109.8
recommendations finished on 254000/280264 queries. users per second: 15032.2
recommendations finished on 255000/280264 queries. users per second: 14915.5
recommendations finished on 256000/280264 queries. users per second: 14825.4
recommendations finished on 257000/280264 queries. users per second: 14768.8
recommendations finished on 258000/280264 queries. users per second: 14699.9
recommendations finished on 259000/280264 queries. users per second: 14618.4
recommendations finished on 260000/280264 queries. users per second: 14588.7
recommendations finished on 261000/280264 queries. users per second: 14563.8
recommendations finished on 262000/280264 queries. users per second: 14538.7
recommendations finished on 263000/280264 queries. users per second: 14499.2
recommendations finished on 264000/280264 queries. users per second: 14476.4
recommendations finished on 265000/280264 queries. users per second: 14440
recommendations finished on 266000/280264 queries. users per second: 14414.3
recommendations finished on 267000/280264 queries. users per second: 14403.6
recommendations finished on 268000/280264 queries. users per second: 14278.1
recommendations finished on 269000/280264 queries. users per second: 14168.2
recommendations finished on 270000/280264 queries. users per second: 14132.2
recommendations finished on 271000/280264 queries. users per second: 14073.8
recommendations finished on 272000/280264 queries. users per second: 14055.8
recommendations finished on 273000/280264 queries. users per second: 14011.2
recommendations finished on 274000/280264 queries. users per second: 13975.5
recommendations finished on 275000/280264 queries. users per second: 13935.9
recommendations finished on 276000/280264 queries. users per second: 13901.5
recommendations finished on 277000/280264 queries. users per second: 13880.9
recommendations finished on 278000/280264 queries. users per second: 13852.4
recommendations finished on 279000/280264 queries. users per second: 13788.4
recommendations finished on 280000/280264 queries. users per second: 13529.5

Step 4 of 4 : Keep records of "New Users Only" (Cluster 7)

In [5]:
## keep only customers from cluster 7 : New customers

    df_user_data_7 = df[df.SegmentNo == 7].astype(str)
    results_final_7 = df_user_data_7['Card_ID'].isin(results_final['Card_ID'])

    results_final_7 = results_final[results_final['Card_ID'].\
                                           isin(df_user_data_7['Card_ID'])].\
                                           reset_index(drop = True)
    
In [6]:
## output into csv file for further analysis & visualization in tableau
    df_user_data_7.to_csv('dataset\dataset_final_all_user.csv', header = True, index= True, sep='\t', encoding='utf-8')
    results_final_7.to_csv('dataset\dataset_final_all.csv', header = True, index= True, sep='\t', encoding='utf-8')