TopJetCombination
Top Jet Combination
This study is currently done with sisCone05 jets calibrated at the L2L3 level.
The selection of jets happens in different steps:
0. Jet cleaning
electron removal done in PAT-L0
1. Jet rejection
use of the JetRejectorTool (ported to CMSSW_2_2_3 by Ilaria)
2. Selection of a collection of 4 jets
tools & study performed by Gregory
3. Selection of a combination in a 4-jet collection: MVA package
Tuning of variables for MVA package (Joris)
- A list of >80 variables is implemented (the code is compatible with TQAF tag V04-07-01) and can be found in
/user/jmmaes/CMSSW/CMSSW_2_2_4_v1/CMSSW_2_2_4/src/TopQuarkAnalysis/TopTools/interface/TtSemiLepJetComb.h
Due to recent changes in TQAF, this file has become incompatible with more recent TQAF versions.
- For this list, the MVA trainer is used to produce the signal and background distributions in two cases:
- case1: The 4 top jet candidates are selected from the 6 jets with the highest pt. The 4 jets are chosen according to the chi-square method in Gregory's jet selection tool (see previous section).
- case2: The 4 top jet candidates are the 4 highest-pt jets.
The complete cfg file used to produce the ROOT files containing the signal and background plots can be found in
/user/jmmaes/CMSSW/CMSSW_2_2_4_v1/CMSSW_2_2_4/src/TopQuarkAnalysis/TopJetCombination/test/caseX/ttSemiLepJetCombMVATrainer_muons_cfg.py
- The root files created are in
/msa1/TopGroup/Results/TopJetCombination/SValues/root/train_monitoring_caseX.root
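The difference between the two cases can be illustrated with a small stand-alone sketch. It is not the code of the jet selection tool: the jet representation (a dict with `id` and `pt`), the function names and in particular `toy_score` are hypothetical, and `toy_score` only stands in for the real chi-square, which is not reproduced here.

```python
from itertools import combinations


def four_highest_pt(jets):
    """case2: keep the 4 jets with the highest pt."""
    return sorted(jets, key=lambda j: j["pt"], reverse=True)[:4]


def four_from_six(jets, score):
    """case1: among the 6 highest-pt jets, keep the 4-jet subset with the lowest score.

    `score` is a placeholder for the chi-square of the jet selection tool.
    """
    six = sorted(jets, key=lambda j: j["pt"], reverse=True)[:6]
    # 6 choose 4 = 15 candidate subsets are compared
    return list(min(combinations(six, 4), key=score))


def toy_score(subset):
    # Hypothetical placeholder, NOT the real chi-square:
    # prefers a summed pt close to 200 GeV.
    return (sum(j["pt"] for j in subset) - 200.0) ** 2


# Toy jets (pt in GeV); the values are made up for illustration only.
jets = [{"id": i, "pt": pt}
        for i, pt in enumerate([95.0, 30.0, 72.0, 41.0, 55.0, 28.0, 20.0])]

case2 = four_highest_pt(jets)
case1 = four_from_six(jets, toy_score)
print("case2 jet ids:", [j["id"] for j in case2])
print("case1 jet ids:", [j["id"] for j in case1])
```

With a realistic chi-square the two cases can pick different jets, which is why the separation power of the variables is compared for both.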
- For every variable in the list a separation power S is calculated. Two definitions of S are used:
- The overlapping area of the (normalized) signal and background histograms (surface)
- The maximum, over a scan of the cut value, of the product of the signal efficiency and (1 - background efficiency) (pte)
The code for calculating S can be found in
/user/jmmaes/CMSSW/CMSSW_2_2_4_v1/CMSSW_2_2_4/src/Svalue/DistributionAnalyzer.C
The lists of S values for each of the 2 cases X and each of the 2 definitions Y are located in
/msa1/TopGroup/Results/TopJetCombination/SValues/table/caseX_Y.txt
The main conclusion is that the ordering of the variables does not change significantly when the other definition is used. Also, when the 4 jets are selected from the 6 highest-pt jets, the relative ranking of the variables with respect to separation power stays the same.
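The two definitions of S can be sketched on binned histograms as follows. This is a minimal illustration, not a transcription of DistributionAnalyzer.C: the function names are hypothetical, histograms are plain lists of bin contents, and the scan is assumed to run over bin edges with both cut directions tried.

```python
def normalize(hist):
    """Scale bin contents so that they sum to 1."""
    total = float(sum(hist))
    return [h / total for h in hist]


def overlap_area(sig, bkg):
    """'surface': area shared by the normalized signal and background histograms."""
    s, b = normalize(sig), normalize(bkg)
    return sum(min(si, bi) for si, bi in zip(s, b))


def max_eff_product(sig, bkg):
    """'pte': maximum over cut positions of eff_sig * (1 - eff_bkg).

    Assumption: cuts are placed on bin edges, and both 'keep above'
    and 'keep below' cuts are tried at every edge.
    """
    s, b = normalize(sig), normalize(bkg)
    best = 0.0
    eff_s = eff_b = 1.0  # a cut below the first bin keeps everything
    for si, bi in zip(s, b):
        eff_s -= si  # fraction of signal above the upper edge of this bin
        eff_b -= bi  # fraction of background above the same edge
        keep_above = eff_s * (1.0 - eff_b)
        keep_below = (1.0 - eff_s) * eff_b
        best = max(best, keep_above, keep_below)
    return best


# Toy histograms with made-up bin contents.
sig = [0, 1, 4, 5]
bkg = [5, 3, 2, 0]
print("surface:", overlap_area(sig, bkg))
print("pte:", max_eff_product(sig, bkg))
```

In this sketch a smaller overlap and a larger pte both indicate better separation. Whether the tables quote the overlap itself or one minus the overlap is not stated here, so the direction of the ranking should be checked against the actual code.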
- To select O(5) variables, the correlations between them have to be taken into account. This list is also in the text file mentioned above.
- The 5 selected variables are listed below (the numbering refers to the full list of 81 variables):
- 1. mass of hadronic W (massHadW)
- 80. sum of b-tag discriminant value of leptonic b and hadronic b (sumBTag1HadBLepB)
- 78. relative pt of the hadronic top (relPtHadTop)
- 53. delta theta between leptonic b and hadronic top (deltaThetaLepBHadTop)
- 50. delta theta between leptonic b and muon (deltaThetaLepBLepton)
- The plot of each variable Z, for each case X, can be found in
/msa1/TopGroup/Results/TopJetCombination/SValues/image/varZ_caseX.png
- The correlation between these 5 variables is computed from Petra's train_monitoring.root files. The numbering of the variables differs from that of the previous section. The plots for both cases X can be found in
/msa1/TopGroup/Results/TopJetCombination/SValues/image/ProcMatrix_rot_caseX.png
- The numbering of the variables is (1:"massHadW",2:"deltaThetaLepBLepton",3:"deltaThetaLepBHadTop",4:"relPtHadTop",5:"sumBTag1HadBLepB")
- The explicit correlation values can be found in the txt files
/msa1/TopGroup/Results/TopJetCombination/SValues/table/caseX_corr.txt
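As an illustration of what such a correlation table contains, the sketch below computes a correlation matrix from per-combination values of the variables. It assumes the linear (Pearson) correlation coefficient, which is not confirmed by the text above; the function names and the toy values are hypothetical and do not come from the train_monitoring.root files.

```python
from math import sqrt


def pearson(x, y):
    """Linear (Pearson) correlation coefficient of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {name_a: {name_b: correlation}} for all pairs of variables."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Toy values, made up for illustration only (one entry per jet combination).
columns = {
    "massHadW": [80.1, 75.3, 90.2, 60.5, 85.0],
    "relPtHadTop": [0.40, 0.60, 0.30, 0.80, 0.35],
    "sumBTag1HadBLepB": [1.2, 0.4, 1.9, 0.1, 1.5],
}

matrix = correlation_matrix(columns)
for a in columns:
    print(a, " ".join("%+.2f" % matrix[a][b] for b in columns))
```

Variables with a large absolute correlation carry largely the same information, so picking both adds little to the MVA.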
Study of MVA tool (Petra)
- MVA tools: the NN and LR studies are performed by Petra