This lesson is in the early stages of development (Alpha version)

Trigger Exercise for the CMSDAS @ LPC

Introduction

Overview

Teaching: 30 min
Exercises: 0 min
Questions
Objectives
  • Overview of CMS trigger system

We will get an overview of the CMS trigger system, both L1 and HLT, via slides.

Key Points

  • You will be familiar with basic concepts and terminology related to triggering, so that you can apply them in your work (e.g. turn-on, prescale, matching…)

  • You will be able to describe the main features of the CMS trigger system

  • You will have an overview of the CMS trigger menu, allowing you to identify suitable triggers for your physics analysis

  • You will be familiar with the analysis tools needed to access trigger-related information in CMS datasets

  • You will know how to perform trigger efficiency measurements


HLT timing studies

Overview

Teaching: 30 min
Exercises: 0 min
Questions
Objectives
  • Measure the time it takes to run an HLT menu

  • Reference repo: link

Prerequisites

A CERN account with access to lxplus - that’s it!

Instructions

Exercise 1: CPU/GPU Timing measurements without creation of a CMSSW environment

  1. Log in to lxplus and clone the timing repository somewhere (e.g. in your EOS space)

     git clone https://gitlab.cern.ch/cms-tsg/steam/timing.git
     cd timing
    
  2. Submit a timing job to the timing machine using CMSSW_14_0_11, the GRun menu V173 and the default dataset on the timing machine.

     python3 submit.py /dev/CMSSW_14_0_0/GRun/V173 --cmssw CMSSW_14_0_11 --tag YOUR_TAG_HERE
    

    If you have 2FA enabled, lxplus9 may require the tsgauth package (pip3 install tsgauth==0.10.2). This can be fixed using the lines below.

     python3 -m venv venv
     source venv/bin/activate
     pip3 install --upgrade pip
     pip3 install tsgauth==0.10.2
    

    lxplus9 will also give you a link (like https://auth.cern.ch/auth/realms/cern/...) which should be copied and pasted into a browser to grant access.

  3. Check the status of your job using the job_manager.py script.

     python3 job_manager.py
    

    It takes around 20-30 minutes to run.

  4. Re-submit your job using the --rerun option, followed by the job ID of the first submitted job. This will re-submit the first job with the exact same parameters and can be useful if you want to re-run a job multiple times to get an idea of the variance of the timing measurements. This also leads to the program re-using the same CMSSW area as before on the timing machine, so it saves some disk space there. Also make sure to add a new --tag to your job so you can distinguish the two in the job queue.

     python3 submit.py --rerun JOB_ID --tag YOUR_NEW_TAG_HERE
    
  5. Remove the recently added job from the queue using the job_manager.py script and the --rm option.

     python3 job_manager.py --rm JOB_ID_OF_RESUBMITTED_JOB
    

    NOTE: It is currently not possible to cancel an already running job. Only queued jobs can be cancelled.

  6. Submit another job with the same settings as before, but now only using the CPUs by adding the --cpu-only option. This will run the same job, but only on the CPUs of the timing machine. This is useful to compare the performance of the CPUs and GPUs.

     python3 submit.py /dev/CMSSW_14_0_0/GRun/V173 --cmssw CMSSW_14_0_11 --cpu-only --tag YOUR_CPU_JOB_TAG_HERE
    
  7. Once your jobs have finished, get the reports for one of the jobs using the job_manager.py script and the --report option.

     python3 job_manager.py --report JOB_ID
    

    This will download a tar.gz file containing output and error files for:

    • the creation of the CMSSW environment on the timing machine (i.e. the scram proj/cmsrel step)
    • the building of the CMSSW environment including the merging of provided pull requests etc. (i.e. the scram b step)
    • the benchmarking script (found in the run_benchmark.py file) which initializes and controls the settings of the actual timing measurements on multiple CPUs/GPUs using the patatrack-scripts repo
    • the final cmsRun of the timing job

    NOTE: All of these steps are taken care of by the server for you as a user, and the reports are just helpful tools for the occasional case when a measurement crashes.

  8. Investigate the results using the timing GUI

    • Use the timing GUI to compare your CPU-only job with your GPU job. The timing GUI can be accessed from this Link. Once you arrive at the GUI, click the “open” button and find your CERN username, then click the little arrow/triangle next to the box to get a drop-down list which contains all your finished measurements. Tick the boxes of the GPU and the CPU measurements and hit the “OK” button.
    • The results take some time to load. Once you see your results, click on the “TIMING” tab to see the different timing distributions. You should be able to identify the fast peaks and tracking/PF tails that were discussed in the slides.
    • In the lower panel, you see the timing of the individual paths of the menu. Click on one of the job ID names to sort the path timing in ascending/descending order for that respective measurement. Investigate which paths are the fastest and the most time-consuming ones and whether that agrees with what we discussed in the slides. You can also click on a path/row to see the timing distribution of that path (left panel) and the average timing of each module inside that path (right panel). In the right panel, it’s sometimes difficult to see which module you’re looking at, so you can also hover over the “bins” and some information for that bin will pop up, including the module name.
    • Finally, let’s find the path with the largest timing difference between CPU and GPU measurements. To do this, switch on the “Display Diff” button in the top right of the lower panel. This will cause the column of the second measurement to show the timing difference with respect to the column of the first measurement. Note that this also works when more than two measurements are selected for comparison: the first selected measurement column is always the “reference” and the other columns show the difference relative to it. Now you can sort ascending/descending by timing difference by clicking the job name of the second measurement. Which paths have the largest timing difference? Is this what you would expect? You can also look at the timing distribution of a path by clicking on its name.

Exercise 2 (Bonus): CPU/GPU Timing measurements with creation of CMSSW area

It is also possible to submit a timing job with a previously created CMSSW area. When you do this, your area will get “cloned” to the timing machine and it will run a measurement using the cloned area. Since the arealess submission already allows for a lot of flexibility, this is not really necessary and the arealess submission is the recommended way to submit jobs. However, if you run some very specialized expert workflows that for some reason require additional tinkering with the CMSSW area that is not covered by the timing code, it can sometimes still be useful.

  1. Create a CMSSW area on lxplus and build it using the GRun menu V173 and the default dataset on the timing machine.

     cmsrel CMSSW_14_0_11
     cd CMSSW_14_0_11/src
     cmsenv
     git cms-init
     scram build -j 4
    
  2. Download the same GRun menu as in Exercise 1 using the hltGetConfiguration command. When you do this outside this exercise, make sure you choose the correct globaltag and l1 menu for your use case. For this exercise you can copy/paste the command below.

     hltGetConfiguration /dev/CMSSW_14_0_0/GRun/V173 --globaltag 140X_dataRun3_HLT_v3 --data --process TIMING --full --offline --output minimal --type GRun --max-events 20000 --era Run3 --timing --l1 L1Menu_Collisions2024_v1_3_0-d1_xml > hlt.py
    
  3. Clone the timing repository.

     git clone https://gitlab.cern.ch/cms-tsg/steam/timing.git
    
  4. Submit your config to the timing server. Many of the options that we needed in the arealess submission are not needed anymore, since they are already specified in the config.

     python3 ./timing/submit.py hlt.py --tag YOUR_NEW_TAG_HERE
    
  5. Again you can investigate the job using the job_manager.py script.

     python3 ./timing/job_manager.py
    
  6. Once the job is done, you can investigate the results in the timing GUI. For example, you can check if your result of the arealess submission coincides with your result of the submission with an area (a variance of about 1-3% is normal).

Key Points


Finding information about a trigger path

Overview

Teaching: 10 min
Exercises: 10 min
Questions
Objectives
  • Learn different ways to look up information about a trigger path

Prerequisites

Set up your machine following instructions in setup first.

Find the L1 seed of the MET HLT path

There are different ways to find out what the L1 seed of a specific HLT path is. Two of them will be tried in this exercise:

In the context of this exercise, we will retrieve the information of the HLT_PFMET170_HBHECleaned_v* path from OMS for a specific run (run 284043) and HLT configuration (/cdaq/physics/Run2016/25ns15e33/v4.2.3/HLT/V2).

OMS method

As a first step, connect to OMS

Then move to the L1 Trigger rate page in OMS and search for the L1 seeds there to have a look at the rates and prescales of these L1 seeds.
(Note that you might need to increase the number of rows at the bottom of the page to find them.)

Web-based confdb

The web-based confdb is a web-based GUI of the database that holds all of the CMS HLT configurations.
One can inspect any given HLT menu without downloading it or connecting to the database.
More information about the web-based confdb can be found in these slides.

Let’s try to find the /cdaq/physics/Run2016/25ns15e33/v4.2.3/HLT/V2 menu in the GUI.

After retrieving the menu from the DB, you can see all of its paths.

Checklist

You should see the same L1 seeds that you find from OMS.

Inspecting a HLT configuration

hltGetConfiguration is the official command to retrieve an HLT configuration from the database.

Caution

Since hltGetConfiguration involves connecting to the confdb database directly, this command should never be used in a large number of jobs or in a programmatic loop. It should only be used interactively.

ssh -f -N -D 1080 <yourUserName>@lxplus.cern.ch 
hltConfigFromDB --configName --adg /cdaq/physics/Run2016/25ns15e33/v4.2.3/HLT/V2 --dbproxy --dbproxyhost localhost --dbproxyport 1080 > dump_hlt_online_2016G.py

If you have connection issues, simply inspect a dump that has been downloaded for you:

xrdcp root://cmseos.fnal.gov//store/user/cmsdas/2023/short_exercises/Trigger/dump_hlt_online_2016G.py .

Then, inspect the HLT configuration dump_hlt_online_2016G.py to find the information about the L1 seed of the path:

grep 'process.HLT_PFMET170_HBHECleaned_v9' dump_hlt_online_2016G.py

You should see the expected output below:

process.HLT_PFMET170_HBHECleaned_v9 = cms.Path(process.HLTBeginSequence+process.hltL1sETM50ToETM120+process.hltPrePFMET170HBHECleaned+process.HLTRecoMETSequence+process.hltMET90+process.HLTHBHENoiseCleanerSequence+process.hltMetClean+process.hltMETClean80+process.HLTAK4PFJetsSequence+process.hltPFMETProducer+process.hltPFMET170+process.HLTEndSequence)
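If you want to pick the L1 seed module out of a path definition like the one above programmatically, here is a minimal sketch (pure Python; it relies on the naming convention that L1-seed filter modules start with hltL1s, and the truncated path string below is only illustrative):

```python
import re

# A shortened version of the cms.Path definition dumped above, as a plain string.
path_def = ("process.HLTBeginSequence+process.hltL1sETM50ToETM120+"
            "process.hltPrePFMET170HBHECleaned+process.HLTRecoMETSequence")

# Module names follow "process."; L1 seed filters conventionally start with "hltL1s".
modules = re.findall(r"process\.(\w+)", path_def)
l1_seed_modules = [m for m in modules if m.startswith("hltL1s")]
print(l1_seed_modules)  # ['hltL1sETM50ToETM120']
```

The same grep/regex idea works on the full dump file if you want to list the L1 seeds of every path at once.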

Questions

To conclude, answer the following questions:

  • Which was the lowest-threshold L1 seed active in the L1 menu?
  • Which is the lowest-threshold unprescaled L1 seed?
  • What is the name of the HLT module that contains the L1 seeding information?

Key Points


Measuring trigger efficiencies

Overview

Teaching: 30 min
Exercises: 0 min
Questions
Objectives
  • Learn how to access the trigger information stored in MiniAOD and NanoAOD

  • Learn what trigger objects are and how to access them

  • Measure trigger efficiency using the tag-and-probe method

Prerequisites

Set up your machine following instructions in setup first.

Objective

The goal of the following exercises is to learn how to access and play with the trigger objects in our data and compute the efficiency of a specific HLT path and look also at its Level 1 (L1) seed.

The focus will be on a HLT path used during the 2016 data-taking to select events with a certain amount of missing transverse energy: HLT_PFMET170_HBHECleaned_v*.

Compute a MET trigger efficiency

We will first run this exercise on MiniAOD format, then run it again on NanoAOD.

MiniAOD

The MINIAOD format was introduced at the beginning of Run 2 to reduce the information and file size with respect to the AOD format.
For Run 2 analyses, most of the analysis groups in CMS skimmed the centrally produced MiniAOD files into smaller, analysis-specific ROOT Ntuples.
This means that several redundant versions of Ntuples for different analysis groups are stored in the limited CMS storage space.

MiniAOD events contain two trigger products that we will need in these exercises.
The TriggerResults product contains trigger bits for each HLT path, whereas the TriggerObjectStandAlone product contains the trigger objects used at HLT.
In addition, the trigger prescales, L1 trigger decisions, and L1 objects are stored in MiniAOD.
A more detailed description of the trigger-related MiniAOD event content can be found here.

In this exercise we work with a skimmed MiniAOD file. (In case you are wondering where this skimmed file came from: it has been created using the configuration in ShortExerciseTrigger/test/skim_pfmet100.py, which selects events with offline MET above a threshold of 100 GeV.)

Inspect MiniAOD content

First, inspect the contents of the skimmed MiniAOD input file as follows:

edmDumpEventContent root://cmseos.fnal.gov//store/user/cmsdas/2023/short_exercises/Trigger/skim_pfmet100_SingleElectron_2016G_ReReco_87k.root --regex=Trigger

You can also inspect the full file content by dropping the --regex parameter.

As you see, there are indeed multiple TriggerResults products here, as well as other trigger-related collections.
We will learn how to interact with these two products and how to use their packed information in our physics analyses in this and the following exercises.

Extract the MET turn-on of the MET HLT path

Glimpse through the configuration file in test/ana_METMiniAOD.py and try to get a general idea of what it does.
Lines 45-48 of this configuration file show that it invokes an analyzer module called METTrigAnalyzerMiniAOD, and gives it two HLT paths as input parameters (with a specific version number that keeps track of minor updates to the HLT path from one menu to another one):

process.metTrigAnalyzerMiniAOD = cms.EDAnalyzer("METTrigAnalyzerMiniAOD")
process.metTrigAnalyzerMiniAOD.refTriggerName = cms.untracked.string("HLT_Ele27_eta2p1_WPTight_Gsf_v7")
process.metTrigAnalyzerMiniAOD.sigTriggerName = cms.untracked.string("HLT_PFMET170_HBHECleaned_v6")

We will use the METTrigAnalyzerMiniAOD analyzer module to perform a simple trigger efficiency measurement.
We will use HLT_Ele27_eta2p1_WPTight_Gsf as a reference trigger to measure the trigger efficiency of the HLT_PFMET170_HBHECleaned signal trigger.

Next, check through the code in plugins/METTrigAnalyzerMiniAOD.cc and try to get a general idea of what it does.

Then run the configuration file as follows:

cd test
voms-proxy-init --voms cms
cmsRun ana_METMiniAOD.py 

Use the TBrowser to explore the file histos_METTrigAnalyzer.root and look at the histograms h_met_all and h_met_passtrig.

Question

Can you explain the shape of each distribution?

The plotting macro plot_trigeff_met.C uses the TEfficiency class to create an efficiency plot using the two histograms produced above.

Inspect the code in plot_trigeff_met.C and then run this macro as follows:

root -l plot_trigeff_met.C

The resulting turn-on is the result of our efficiency measurement.
It shows the trigger efficiency of the signal trigger (vertical axis) in bins of offline-reconstructed MET (horizontal axis).
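To make the mechanics concrete, the per-bin efficiency is simply the number of events passing both the reference and signal triggers divided by the number passing the reference trigger, in each offline-MET bin. A minimal pure-Python sketch with made-up counts (the real measurement uses the h_met_all and h_met_passtrig histograms and TEfficiency):

```python
import math

# Hypothetical per-bin event counts (NOT the actual exercise output):
# n_all  = events passing the reference trigger,
# n_pass = those also passing the signal trigger.
met_bin_edges = [100, 150, 200, 250, 300]   # offline MET [GeV]
n_all  = [500, 400, 300, 200]
n_pass = [ 50, 240, 285, 198]

for lo, hi, npass, nall in zip(met_bin_edges[:-1], met_bin_edges[1:], n_pass, n_all):
    eff = npass / nall
    # simple binomial uncertainty; TEfficiency uses Clopper-Pearson intervals by default
    err = math.sqrt(eff * (1 - eff) / nall)
    print(f"MET {lo}-{hi} GeV: eff = {eff:.2f} +/- {err:.2f}")
```

The rising efficiency toward 1.0 in the highest bins is exactly the "turn-on" shape you should see in the plot.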

Question

Inspect the resulting efficiency plot. At which value of offline MET does the trigger turn-on reach its maximal value, and the flat “plateau” region start?

NanoAOD

A centrally maintained NanoAOD format was proposed in 2018, aiming for a common Ntuple format that can be used by most of the CMS analysis groups. Information about the NanoAOD format can be found here.

In this exercise, we will repeat the trigger efficiency measurement using a NanoAOD input file (whose events differ from those in the MiniAOD file used earlier) and the related tools.

First, check out the NanoAOD-tools package for reading NanoAOD files:

cd $CMSSW_BASE/src
git clone https://github.com/cms-nanoAOD/nanoAOD-tools.git PhysicsTools/NanoAODTools
cd PhysicsTools/NanoAODTools
scram b -j 4
cd $CMSSW_BASE/src/LPCTriggerHATS/ShortExerciseTrigger/test

Copy the skimmed NanoAOD file to your working directory as follows:

xrdcp root://cmseos.fnal.gov//store/user/cmsdas/2023/short_exercises/Trigger/New_NanoAOD_M1000.root .

First run the code (written using centrally available NanoAOD-tools packages) to produce the histograms required to compute the efficiency.
Then run the code that takes the ratio of the two histograms, plots the efficiency, and stores it as a PDF:

python MET_Efficiency_NanoAOD_gp.py
python MET_Efficiency_Plotting.py

Question

Inspect the resulting efficiency plot. Though we used a different sample, is the shape of this result consistent with what you obtained in MiniAOD?

Key Points


Accessing trigger objects

Overview

Teaching: 30 min
Exercises: 0 min
Questions
Objectives
  • Learn how to access the trigger information stored in MiniAOD and NanoAOD

  • Learn what trigger objects are and how to access them

  • Measure trigger efficiency using the tag-and-probe method

MiniAOD

In the efficiency measurement exercise, we performed a simple efficiency measurement using the TriggerResults product in a MiniAOD file and a NanoAOD file.
Sometimes, we want to know exactly which objects were reconstructed and used in the HLT path.
This is what the TriggerObjectStandAloneCollection contains: the actual physics objects reconstructed at the HLT.

Have a look at the code in plugins/SingleMuTrigAnalyzerMiniAOD.cc, especially the analyze function starting from line 95, as well as the configuration file ana_SingleMuMiniAOD.py.

Additional info

The configuration file shows that this time, too, we are using as input a skimmed MiniAOD file, called skim_dimu20_SingleMuon_2016G_ReReco_180k.root. In case you are wondering again, this skim has been produced with the configuration skim_dimu20.py, which requires two offline muons with pt above 20 GeV.

This time we don’t have a signal trigger and a separate reference trigger. Instead, we focus only on one single-muon trigger, namely HLT_IsoMu24:

process.singleMuTrigAnalyzerMiniAOD.triggerName = cms.untracked.string("HLT_IsoMu24_v2")

The SingleMuTrigAnalyzerMiniAOD.cc analyzer is longer and more complicated than the one in the efficiency measurement exercise, and we will discuss it not only in this exercise, but also in the two that follow. So don’t worry if some parts look somewhat mysterious at first.

As in the efficiency measurement exercise, in line 129 the name of the HLT path (triggerName_) is used to retrieve the corresponding index (triggerIndex):

const unsigned int triggerIndex(hltConfig_.triggerIndex(triggerName_));

which is then used (in line 148) to access the HLT decision to accept or reject the event:

  bool accept = triggerResultsHandle_->accept(triggerIndex);

You can find some new, more interesting tricks on lines 180-190. There we create an empty vector trigMuons, loop over all trigger objects (that is, physics objects reconstructed at HLT), select the objects that HLT classified as muons that would pass our single-muon trigger, and add the four-vectors of these trigger-level muon objects to the trigMuons vector:

  std::vector<LorentzVector> trigMuons;
  if (verbose_) cout << "found trigger muons:" << endl;
  const edm::TriggerNames &names = iEvent.triggerNames(*triggerResultsHandle_);
  for (pat::TriggerObjectStandAlone obj : *triggerOSA_) {
    obj.unpackPathNames(names);
    if ( !obj.id(83) ) continue; // muon type id
    if ( !obj.hasPathName( triggerName_, true, true ) ) continue; // checks if object is associated to last filter (true) and L3 filter (true)
    trigMuons.push_back(LorentzVector(obj.p4()));
    if (verbose_) cout << "  - pt: " << obj.pt() << ", eta: " << obj.eta() << ", phi: " << obj.phi() << endl;
  } // loop on trigger objects

Run the configuration file as follows:

cd $CMSSW_BASE/src/LPCTriggerHATS/ShortExerciseTrigger/test
cmsRun ana_SingleMuMiniAOD.py 

The code will output a file called histos_SingleMuTrigAnalyzer.root, which we will inspect in the next exercise.

NanoAOD

Trigger objects are also stored in NanoAOD. Here is a short example of accessing them:

from coffea.nanoevents import NanoEventsFactory, NanoAODSchema
fpath="root://cmseos.fnal.gov//store/user/cmsdas/2023/short_exercises/Trigger/New_NanoAOD_M1000.root"
events = NanoEventsFactory.from_root(fpath,schemaclass=NanoAODSchema).events()

Trigger objects are one of the fields of the event:

trig = events.TrigObj
trig.fields
['pt',
 'eta',
 'phi',
 'l1pt',
 'l1pt_2',
 'l2pt',
 'id',
 'l1iso',
 'l1charge',
 'filterBits']

The trigger objects have an id attribute, so you can easily select trigger objects of the same type (e.g. trigger muons):

trigMuon = trig[trig.id==13]
# ID of the object: 11 = Electron (PixelMatched e/gamma), 22 = Photon (PixelMatch-vetoed e/gamma), 13 = Muon, 15 = Tau, 1 = Jet, 6 = FatJet, 2 = MET, 3 = HT, 4 = MHT
# Note that id matches the pdgId of the particles

Filter bits

The information on which trigger filters the trigger object has passed is stored compactly in the filterBits field.
The filterBits encoding can be found in the documentation links below.

For example, for muons, the meaning of the bits is the following:

  1    = TrkIsoVVL,
  2    = Iso, 
  4    = OverlapFilter PFTau,
  8    = 1mu,
  16   = 2mu,
  32   = 1mu-1e, 
  64   = 1mu-1tau,
  128  = 3mu, 
  256  = 2mu-1e,
  512  = 1mu-2e,
  1024 = 1mu (Mu50), 
  2048 = 1mu (Mu100)

Therefore, if you want to find trigger muons in Iso paths:

filterBit = 2
passIso = (trigMuon.filterBits & filterBit) == filterBit # mask for each trigMuon
IsoTrigMuon = trigMuon[passIso] 

How can you find the trigger muons in 1mu and Iso paths?
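As a hint, multiple bit requirements are combined with a bitwise OR of the desired values, and the combined mask is checked with a bitwise AND. A pure-Python sketch with hypothetical filterBits values (in the actual NanoAOD code you would apply the same mask to trigMuon.filterBits):

```python
# Filter-bit values for muon trigger objects (from the table above):
ISO   = 2   # Iso
ONEMU = 8   # 1mu
mask = ISO | ONEMU  # = 10: require both bits to be set

# Hypothetical filterBits values for a few trigger muons:
filter_bits = [2, 8, 10, 14, 1]

# A muon passes only if ALL bits in the mask are set in its filterBits.
passes = [(b & mask) == mask for b in filter_bits]
print(passes)  # [False, False, True, True, False]
```

Note that checking `(b & mask) != 0` instead would select muons passing either requirement, not both.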

References

Key Points


Matching trigger objects

Overview

Teaching: 10 min
Exercises: 0 min
Questions
Objectives
  • Learn how to access the trigger information stored in MiniAOD and NanoAOD

  • Learn what trigger objects are and how to access them

  • Measure trigger efficiency using the tag-and-probe method

Match trigger objects to offline objects

Trigger objects are physics objects (electrons, muons, jets, MET, etc.) reconstructed at HLT and used for making the HLT decision of each HLT path. Offline objects are reconstructed offline, which in general gives a more precise reconstruction. If we want to know whether an offline object (e.g. a muon) is responsible for firing the HLT, we can look for a trigger muon within a certain dR of the offline muon.

Collection names

We can learn from the example in the analyze function in SingleMuTrigAnalyzerMiniAOD.cc.
The offline objects are reconstructed primary vertices (offlineSlimmedPrimaryVertices) and offline-reconstructed muons (slimmedMuons).
The trigger objects are in the trigMuons vector, which we obtained from the TriggerObjectStandAloneCollection in the last episode.
Here we are not interested in primary vertices per se; we just need to get them because later we will use them in the identification of “tight” muons.

Matching by momentum direction dR

Question

Take a look at lines 231-286 in SingleMuTrigAnalyzerMiniAOD.cc, where you can find two for loops that iterate over the offline muons.

Since the second for loop is inside the first one, together these two for loops are going over all possible pairs of offline muons. The first muon in each pair is called “tag”, and the second is called “probe”, for reasons that will soon become apparent. Inside the loop, several selection criteria are placed on muons, such as cuts on pt and eta, tight identification, and isolation requirements.

Now that we have both trigger-level muons and offline-reconstructed muons, we can compare them with each other. In order to do this, we need to figure out which trigger muon corresponds to which offline muon. This is called trigger matching. In the code, trigger matching is performed both for tag muons (lines 245-248) and probe muons (lines 277-280). For tag muons, the matching code looks like this:

   bool trigmatch_tag = false;
   for (unsigned int itrig=0; itrig < trigMuons.size(); ++itrig) {
      if (ROOT::Math::VectorUtil::DeltaR(muon_tag->p4(),trigMuons.at(itrig)) < dr_trigmatch) trigmatch_tag = true;
   }

Since the reconstruction of the muon trajectory is less precise at HLT than in the offline reconstruction, there might be a small difference between the directions of the trigger muon and the offline muon, even if they correspond to the same original real muon. Therefore a DeltaR cone of 0.2 is used to geometrically match the trigger object to the offline object (dr_trigmatch = 0.2).
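The geometric matching itself is straightforward to sketch in pure Python (the eta/phi values below are hypothetical; in the analyzer this is done by ROOT::Math::VectorUtil::DeltaR on the four-vectors):

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with delta-phi wrapped into [-pi, pi]."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    deta = eta1 - eta2
    return math.hypot(deta, dphi)

# Hypothetical offline muon and trigger muon directions:
offline = (1.20, 0.50)   # (eta, phi)
trig    = (1.25, 0.45)

matched = delta_r(*offline, *trig) < 0.2   # dr_trigmatch = 0.2
print(matched)  # True
```

The phi wrapping matters: two muons at phi = 3.1 and phi = -3.1 are nearly collinear, and a naive subtraction would badly overestimate their distance.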

Finally, if the two muons pass the selections in the code, and the tag muon passes the trigger matching, the invariant mass of this dimuon system is calculated and saved in a histogram:

     LorentzVector dimuon = LorentzVector(muon_tag->p4() + muon_probe->p4());
     hists_1d_["h_mll_allpairs"]->Fill(dimuon.M());
     if (verbose_) cout << " - probe dimuon mass: " << dimuon.M()  << endl;
     if (dimuon.M() < 81. || dimuon.M() > 101.) continue;
     hists_1d_["h_mll_cut"]->Fill(dimuon.M());
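The invariant mass computed above can be sketched in pure Python from two (E, px, py, pz) four-vectors (the muon four-vectors below are hypothetical and chosen to land near the Z mass; the analyzer simply sums the p4() objects):

```python
import math

def invariant_mass(p4a, p4b):
    """Dimuon invariant mass from two (E, px, py, pz) four-vectors [GeV]."""
    e  = p4a[0] + p4b[0]
    px = p4a[1] + p4b[1]
    py = p4a[2] + p4b[2]
    pz = p4a[3] + p4b[3]
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

# Two hypothetical back-to-back 45.6 GeV muons (muon mass neglected):
mu1 = (45.6,  45.6, 0.0, 0.0)
mu2 = (45.6, -45.6, 0.0, 0.0)

m = invariant_mass(mu1, mu2)
print(round(m, 1))  # 91.2 -- near the Z mass, inside the 81-101 GeV window
```

Pairs whose mass falls in the 81-101 GeV window are kept as likely Z -> mumu candidates, which is what makes the tag-and-probe selection clean.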

Use the TBrowser to open the file histos_SingleMuTrigAnalyzer.root and have a look at the histogram h_mll_allpairs, which shows the dimuon invariant mass distribution of muon pairs that passed the selections.

Question

Can you explain the shape of the distribution?

Key Points