cwb_xgboost ~~~~~~~~~~~~~~ This page provides a description of the cwb_xgboost command. | | **Usage** | | This command is used to perform the XGBoost post-production analysis and includes all the XGBoost functionalities added to cWB. | | **Syntax** | - ``cwb_xgboost`` (no arguments) Prints help. - ``cwb_xgboost xgb_type 'xgb_options'`` Runs the required XGBoost functionality. | | **Further information** | - ``xgb_type = datasplit/tuning/training/mlabel(prediction)/report`` The different XGBoost functionality types available with ``cwb_xgboost`` - For each ``xgb_type``, the following ``xgb_options`` can be passed while running the ``cwb_xgboost`` command. | .. code-block:: bash - datasplit - xgb_options = '--mlabel merge_label --ifrac (trainig[ifrac]/testing[1-ifrac]) (--ulabel user_label --verbose false/true)' (--cfile chunk list --wavecuts training selection cuts (wave)) (--mdccuts training selection cuts (mdc)) - tuning - xgb_options = '--nfile wave_bkg.root/.py --sfile wave_sim.root /.py --ofile output.txt --search bbh/imbhb/blf/bhf/bld --nfrac bkg_fraction_selection --sfrac sim_fraction_selection (--config xgb_config.py --verbose false/true) --learning-rate 0.03 --max-depth 8 --min-child-weight 5.0 --gamma 1 --colsample-bytree 0.9 --subsample 0.6' - training - xgb_options = '--nfile wave_bkg.root/.py --sfile wave_sim.root/.py --model xgb_model.dat --search bbh/imbhb/blf/bhf/bld --nfrac bkg_fraction_selection --sfrac sim_fraction_selection (--config xgb_config.py --verbose false/true)' --dump false/true)' - mlabel (prediction) - xgb_options = '--model xgb_model.dat --ulabel user_label --rthr output_rhor_threshold' --search bbh/imbhb/blf/bhf/bld (--fgetrhor user_getrhor --verbose false/true --config xgb_config.py)' - report - xgb_options = '--type tuning/prediction(training not yet implemented, all training report plots are stored when --dump true is set during training) (--ulabel user_label --verbose false/true --config config.py)' note: --config option is used only for prediction. To get more help type: 'cwb_xgboost report help' | | **Examples** | - Datasplit ``cwb_xgboost datasplit '--mlabel M1.V_hvetoLH.C_xgb_cut --ifrac -100 --verbose true --cfile $CWB_CONFIG/O3a/CHUNKS/BBH//Chunk_List.lst'`` - Tuning ``cwb_xgboost tuning '--nfile nfile.py --sfile sfile.py --ofile data/output.out --nfrac 0.2 --sfrac 0.2 --search bbh --config config/user_xgboost_config_r1.py --learning-rate 0.03 --max-depth 13 --min-child-weight 10.0 --gamma 2 --colsample-bytree 1 --subsample 0.6 --balance-slope q=1 --balance-balance A=10 --caps-rho0 11'`` - Training ``cwb_xgboost training '--nfile nfile.py --sfile sfile.py --model xgb/v1/xgb_model_name_v1.dat --nfrac 1.0 --sfrac 1.0 --search bbh --verbose true --dump true --config config/user_xgboost_config_v1.py'`` - Prediction/Testing ``cwb_xgboost M1.V_hvetoLH.C_xgb_cut.XK_Test_Y100_TXXXX '--model /out_path/XGB/model_dir_XGB/xgb/v1/xgb_model_name_v1.dat --ulabel v1 --rthr 0 --search bbh'`` - Report ``cwb_xgboost report '--type prediction --subtype hrho/lrho/hchirp/qaqp/roc/efreq --config report_config.py'`` ``cwb_xgboost report '--type tuning --ulabel r2 --verbose true'`` | For more report help: ``cwb_xgboost report help`` .. code-block:: bash # ----------------------------------------------------------------------------------------------------------- # # Report Prediction Example: # # cwb_xgboost report '--type prediction --subtype type --config report_config.py' # # where: type = hrho, lrho, hchirp, qaqp, roc, efreq # # to select a more subtype plots uses (eg: roc,efreq): # # cwb_xgboost report '--type prediction --subtype roc/efreq --config report_config.py' # # ----------------------------------------------------------------------------------------------------------- # -------------------------------------------------------------------------------- # report configuration file: report_config.py # -------------------------------------------------------------------------------- # ---------------------------------- # input root files # ---------------------------------- wave_rhor_fname = 'merge/wave_wlabel.M1.V_hvetoLH.XGB_rthr3_v0.S_ifar.root' # wave root file with xgb statistic (all) wave_rho0_fname = 'merge/wave_wlabel.M1.V_hvetoLH.S_bin1a_cut.S_bin1b_cut.S_bin2_cut.root' # wave root file with pp-cuts statistic (roc/efreq) mdc_rhor_fname=wave_rhor_fname.replace("wave_","mdc_") # mdc root file with xgb statistic (roc) mdc_rho0_fname=wave_rho0_fname.replace("wave_","mdc_") # mdc root file with pp-cuts statistic (roc) sim_rhor_fname = wave_rhor_fname bkg_rhor_fname = 'path/wlabel/merge/wave_wlabel.M1.V_hvetoLH.C_xgb_cut.X_Test_F50.XGB_rthr3_v0.root' # background root file (hrho/lrho/hchirp/qaqp) # ----------------------------------------------------------- # roc: efficiency vs far # # plot_fname: output plot file name # far_inf: far inferior limit # far_sup: far superior limit # ----------------------------------------------------------- PLOT['roc'] = 'plot_fname=xgb/v0/roc_new_format.png far_inf=0.001 far_sup=10' ROC['GA:rhor'] = 'wave_fname='+wave_rhor_fname+' line_color=red wave_cuts=type[1]>=1&&type[1]<=4&&factor!=0' ROC['GA:rho0'] = 'wave_fname='+wave_rho0_fname+' line_color=blue wave_cuts=type[1]>=1&&type[1]<=4&&factor!=0 trials_factor=3' ROC['GA:rhor'] += ' mdc_fname='+mdc_rhor_fname+' mdc_cuts=type>=1&&type<=4&&factor!=0' ROC['GA:rho0'] += ' mdc_fname='+mdc_rho0_fname+' mdc_cuts=type>=1&&type<=4&&factor!=0' # ----------------------------------------------------------- # qaqp: Qa vs Qp # # plot_fname: output plot file name # rho_name: rho name (default = 'rho0') used to select the background entries (rho_name>rho_thr) # rho_thr: rho threshold (default = 8) used to select the background entries (rho_name>rho_thr) # rho_label: rho label (default = 'rho0') used in the legend # qfactor: qfactor value (default = 0.15) used to draw the dashed line Qa = qfactor/(Qp-qoffset) # qoffset: qoffset value (default = 0.8) used to draw the dashed line Qa = qfactor/(Qp-qoffset) # ----------------------------------------------------------- PLOT['qaqp'] = 'plot_fname=xgb/v0/qaqp_new_format.png rho_name=rho0 rho_thr=9 rho_label=rho0 qfactor=0.15 qoffset=0.6 qa_sup=6.0 qp_sup=10.0' QAQP['sim'] = 'wave_fname='+wave_rhor_fname+' wave_cuts=type[1]>=1&&type[1]<=4&&factor!=0' QAQP['bkg'] = 'wave_fname='+bkg_rhor_fname+' marker_color=red marker_size=150' # ----------------------------------------------------------- # efreq: efficiency vs frequency[0] # # plot_fname: output plot file name # fmin: xaxis minimum frequency # fmax: xaxis maximum frequency # fbin: xaxis number of bins # ifar: ifar threshold # ----------------------------------------------------------- PLOT['efreq'] = 'plot_fname=xgb/v0/efreq_new_format.png fmin=20 fmax=1000 fbin=44 ifar=100' EFREQ['rho_r'] = 'ifname='+wave_rhor_fname+' color=red' EFREQ['rho_0'] = 'ifname='+wave_rho0_fname+' color=blue' # ----------------------------------------------------------- # hrho: sim/bkg rho histogram # # plot_fname: output plot file name # rho_label: rho label (default = 'rho0') used in the legend # ----------------------------------------------------------- PLOT['hrho'] = 'plot_fname=xgb/v0/hrho_new_format.png rho_label=$rho_r$' HRHO['sim'] = 'ifname='+sim_rhor_fname+' rhoid=1 color=green' HRHO['bkg'] = 'ifname='+bkg_rhor_fname+' rhoid=1 color=blue' # ----------------------------------------------------------- # lrho: likelihood vs rho # # plot_fname: output plot file name # rho_label: rho label (default = 'rho0') used in the legend # ----------------------------------------------------------- PLOT['lrho'] = 'plot_fname=xgb/v0/lrho_new_format.png rho_label=$rho_r$' LRHO['sim'] = 'ifname='+sim_rhor_fname+' rhoid=1 color=green' LRHO['bkg'] = 'ifname='+bkg_rhor_fname+' rhoid=1 color=blue rho0_capvalue=8' # ----------------------------------------------------------- # hchirp: sim/bkg chirp histogram # # plot_fname: output plot file name # rho_label: rho label (default = 'rho0') used in the legend # ----------------------------------------------------------- PLOT['hchirp'] = 'plot_fname=xgb/v0/hchirp_new_format.png rho_label=$rho_r$' HCHIRP['sim'] = 'ifname='+sim_rhor_fname+' color=green' HCHIRP['bkg'] = 'ifname='+bkg_rhor_fname+' color=blue'