TopJetCombination



Top Jet Combination

This study is currently done with sisCone05 jets calibrated at the L2L3 level.
The selection of jets happens in different steps:

0. Jet cleaning

electron removal done in PAT-L0

1. Jet rejection

use of the JetRejectorTool (ported in CMSSW_2_2_3 by Ilaria)

2. Selection of a collection of 4 jets

tools & study performed by Gregory

3. Selection of a combination in a 4 jets collection: MVA package

Tuning of variables for MVA package (Joris)

  • A list of >80 variables is implemented (the code is compatible with TQAF tag V04-07-01) and can be found in
/user/jmmaes/CMSSW/CMSSW_2_2_4_v1/CMSSW_2_2_4/src/TopQuarkAnalysis/TopTools/interface/TtSemiLepJetComb.h
Due to recent changes in TQAF, this file has since become incompatible.

  • For this list, the MVA trainer is used to produce the signal and background distributions in two cases:
    • case1: The 4 top jet candidates are selected from the 6 highest-pt jets. The 4 jets are chosen according to the chi-square method in Gregory's jet selection tool (see previous section)
    • case2: The 4 top jet candidates are the 4 highest-pt jets
The complete cfg file used to produce the root files containing the signal and background plots can be found in the following location (X is the case number):
/user/jmmaes/CMSSW/CMSSW_2_2_4_v1/CMSSW_2_2_4/src/TopQuarkAnalysis/TopJetCombination/test/caseX/ttSemiLepJetCombMVATrainer_muons_cfg.py
  • The root files created are in
/msa1/TopGroup/Results/TopJetCombination/SValues/root/train_monitoring_caseX.root
  • For every variable in the list a separation power S is calculated. Two definitions of S are used:
    • The overlapping area of the (normalized) signal and background histograms (surface)
    • The maximum, over a scan, of the product of the signal efficiency and (1 - background efficiency) (pte)
The code for calculating S can be found in
/user/jmmaes/CMSSW/CMSSW_2_2_4_v1/CMSSW_2_2_4/src/Svalue/DistributionAnalyzer.C
The list of S values for each of the 2 cases X and each of the 2 definitions Y is located in
/msa1/TopGroup/Results/TopJetCombination/SValues/table/caseX_Y.txt
The main conclusion is that the ordering of the variables does not change significantly when the other definition is used. Likewise, when the 4 jets are selected from the 6 highest-pt jets, the relative ranking of the variables with respect to separation power stays the same.
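The two definitions of S above can be illustrated with a minimal Python sketch. This is not the code in DistributionAnalyzer.C; the function names, the use of numpy, and the choice to scan the cut over bin edges in both directions are assumptions made for illustration only.

```python
import numpy as np

def normalize(hist):
    """Scale a histogram (sequence of bin contents) to unit area."""
    hist = np.asarray(hist, dtype=float)
    total = hist.sum()
    if total <= 0:
        raise ValueError("histogram is empty")
    return hist / total

def overlap_area(sig, bkg):
    """'surface': area shared by the normalized signal and background histograms."""
    s, b = normalize(sig), normalize(bkg)
    return float(np.minimum(s, b).sum())

def max_eff_product(sig, bkg):
    """'pte': maximum over cut positions of eff_sig * (1 - eff_bkg).

    Assumption: the cut is scanned over bin edges, and both cut directions
    (keep values above the cut, keep values below the cut) are tried.
    """
    s, b = normalize(sig), normalize(bkg)
    # Fraction of events in bins >= i, for every edge i (last entry: keep nothing).
    eff_s_above = np.concatenate([np.cumsum(s[::-1])[::-1], [0.0]])
    eff_b_above = np.concatenate([np.cumsum(b[::-1])[::-1], [0.0]])
    keep_above = eff_s_above * (1.0 - eff_b_above)
    # Keeping bins < i instead: eff_sig = 1 - eff_s_above, 1 - eff_bkg = eff_b_above.
    keep_below = (1.0 - eff_s_above) * eff_b_above
    return float(max(keep_above.max(), keep_below.max()))

# Toy histograms (made-up bin contents, not taken from the study).
toy_sig = [1, 3, 8, 12, 6]
toy_bkg = [9, 10, 6, 3, 2]
print("surface:", overlap_area(toy_sig, toy_bkg))
print("pte    :", max_eff_product(toy_sig, toy_bkg))
```

Note that the two numbers run in opposite directions: a smaller overlapping area and a larger efficiency product both indicate better separation, so the rankings have to be compared with that in mind.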
  • To select O(5) variables, the correlation between them has to be taken into account. This list is also in the text file mentioned above.
  • The 5 selected variables are (the numbering refers to the full list of 81 variables):
    • 1. mass of hadronic W (massHadW)
    • 80. sum of b-tag discriminant value of leptonic b and hadronic b (sumBTag1HadBLepB)
    • 78. relative pt of the hadronic top (relPtHadTop)
    • 53. delta theta between leptonic b and hadronic top (deltaThetaLepBHadTop)
    • 50. delta theta between leptonic b and muon (deltaThetaLepBLepton)
  • The plots of the variables Z can be found in
/msa1/TopGroup/Results/TopJetCombination/SValues/image/varZ_caseX.png
  • The correlation between these 5 variables is computed from Petra's train_monitoring.root files. Note that the numbering of the variables differs from the previous section. The plots for both cases X can be found in
/msa1/TopGroup/Results/TopJetCombination/SValues/image/ProcMatrix_rot_caseX.png
  • The numbering of the variables is (1:"massHadW",2:"deltaThetaLepBLepton",3:"deltaThetaLepBHadTop",4:"relPtHadTop",5:"sumBTag1HadBLepB")
  • The explicit correlation values can be found in the txt files
/msa1/TopGroup/Results/TopJetCombination/SValues/table/caseX_corr.txt
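As an illustration of how a ranking by separation power can be combined with the correlations, here is a hedged Python sketch of a greedy selection. The page does not state how the 5 variables were actually chosen, so the greedy procedure, the function names, the 0.5 correlation threshold, and the toy inputs are all assumptions.

```python
import numpy as np

def correlation_matrix(samples):
    """Pearson correlation matrix for an array of shape (n_events, n_variables)."""
    return np.corrcoef(np.asarray(samples, dtype=float), rowvar=False)

def select_variables(names, s_values, corr, max_vars=5, max_abs_corr=0.5):
    """Greedy pick: walk the variables from best to worst S and keep one only
    if its |correlation| with every variable already kept is below threshold.

    Assumptions: larger S means better separation, and max_abs_corr=0.5 is an
    arbitrary illustrative threshold.
    """
    corr = np.asarray(corr, dtype=float)
    order = np.argsort(s_values)[::-1]  # indices sorted by decreasing S
    chosen = []
    for i in order:
        if all(abs(corr[i, j]) <= max_abs_corr for j in chosen):
            chosen.append(int(i))
        if len(chosen) == max_vars:
            break
    return [names[i] for i in chosen]

# Toy inputs (hypothetical names and numbers, not results of the study).
toy_names = ["var_a", "var_b", "var_c"]
toy_s = [0.9, 0.8, 0.7]
toy_corr = [[1.0, 0.9, 0.1],
            [0.9, 1.0, 0.2],
            [0.1, 0.2, 1.0]]
print(select_variables(toy_names, toy_s, toy_corr, max_vars=2))
```

In the toy input, var_b is skipped despite its high S because it is strongly correlated with var_a, which is the effect the correlation check is meant to capture.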

Study of MVA tool (Petra)

  • MVA tools: the NN & LR studies are performed by Petra

