Here is the minimal set of steps, organized by priority:

a.) Produce an agreed-upon set of analysis tasks (suggested below) so we can both see and agree on what should be tried to reduce the combinatorics.

b.) Complete the decomposition of the mass spectrum, achieving combinatorics on a similar scale to what was seen previously with 906.

c.) Demonstrate the error propagation analysis for the physics extraction (the steps are already defined, assuming b is complete).

d.) Complete, with help from me, the QTracker analysis note as described on the main page, with each section as a stand-alone unit.

e.) If we run out of time having completed a-d, we are done; if we can achieve anything further from the previous page, great!

f.) Contingency plan: If we try what I'm suggesting and run out of time without success, and assuming the Event Finder has been tailored for public use as a stand-alone utility, we present that as the final product: a tool for SeaQuest/SpinQuest.

Here is what I'm suggesting:

I believe we need to meet more frequently, at least two or three times a week, with you showing results (plots and histograms; analysis scripts are fine too, as long as I don't have to run them on a beast) and test analyses, in slides or posted on the webpage, of what we previously agreed to look at, until things are on track.


Here are the critical tasks I think we should focus on:

We must compare to ktracker's set of cuts and produce all histograms for checking. I think you mostly have this; I just haven't seen the results. It is very important to make histograms of what has been done, checked, and tested, showing how things changed or didn't. The cuts I'm talking about are just the basic timing cuts, hodo masking, intensity, fiducial, and all the basic things done in every 906 analysis, which have nothing to do with the internal workings of ktracker.

The point of this type of analysis is to have as much control as possible over the information going into and out of the dataset, along with all the other variables needed for testing and diagnostics. We need to know all the cuts normally taken in the standard ktracker analysis - before, during, and after reconstruction. The easiest way to do this is to make your own mass distribution with ktracker, applying all the normal cuts that Kei and Hellen used, so that you understand which essential cuts are required to get the mass distribution we are comparing to. Then study those in a similar way in the QTracker approach. Obviously, you cannot cut on the chi2 fit results as in ktracker, but there are analogues in QTracker; we need to see step by step that our analysis cuts are doing what they are supposed to at each step, and find out which step has the weakness (or weaknesses).
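As a concrete starting point, here is a minimal sketch of that cut-by-cut bookkeeping in Python. The column names, thresholds, and file format are hypothetical placeholders; the real cuts and variables have to come from the Kei/Hellen analysis we are reproducing.

```python
# Minimal sketch of cut-by-cut bookkeeping for the mass spectrum.
# Column names (mass, in_time, intensity, z_vertex, ...) are hypothetical
# placeholders; substitute whatever variables your ntuple actually carries.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def apply_cuts_sequentially(df, cuts):
    """Apply named cuts one at a time, recording the surviving events after each."""
    stages = [("no cuts", df)]
    for name, mask_fn in cuts:
        df = df[mask_fn(df)]
        stages.append((name, df))
    return stages

# Hypothetical stand-ins for the standard 906-style cuts (timing, hodo masking,
# intensity, fiducial); the exact variables and thresholds must come from the
# analysis we are comparing to.
cuts = [
    ("timing",    lambda d: d["in_time"] == 1),
    ("hodo mask", lambda d: d["hodo_masked"] == 1),
    ("intensity", lambda d: d["intensity"] < 60000),
    ("fiducial",  lambda d: d["z_vertex"].between(-300.0, 0.0)),
]

dimuons = pd.read_csv("dimuons.csv")          # or however the candidates are stored
stages = apply_cuts_sequentially(dimuons, cuts)

bins = np.linspace(0, 10, 100)
for name, d in stages:
    plt.hist(d["mass"], bins=bins, histtype="step", label=f"{name} ({len(d)})")
plt.xlabel("dimuon mass [GeV]")
plt.ylabel("counts")
plt.legend()
plt.savefig("mass_cut_flow.png")
```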

The point of making combinatoric background MC is, again, to have more control. I hope it's clear that with an algorithm like ktracker it is easier to get away with using real experimental data for an analysis like that, but for AI it is very dangerous unless there is a way to get information about what is going into it and to have some control over the variation, so that it can be used in studies and the resulting model can be studied analytically based on those inputs. The way I would make this MC is to use the real data to determine the critical features. It should be clear that the way partial-track hit patterns are produced must not be determinable, or the AI will find the pattern and build in a bias; also, the degree of partial-ness of a track need not be correlated with its origin, since many tracks can easily be confused as one, and this has to be properly simulated and built into the MC. We know we are getting there when an AI built to separate the two can't tell the difference between them. In terms of getting the distributions right, this is actually the easy part: you can use the real data to produce the same distributions via multidimensional Von Neumann rejection sampling from a large uniform distribution created with the right hit patterning from single muons over a broad range of momenta. The final model should not depend on the shape of these distributions, since each trigger makes them different, so the idea would be to vary them and ensure there is no embedded bias based on the distribution, as it has nothing to do with the selection criterion of being an incomplete track. In any case, there are AI tools to help with data matching. This could be as simple or as complex as we choose. Getting some simple but working MC of this type would be pretty quick. This was always my strong suggestion. Not sure why you didn't try it; maybe there is a good reason. Here I'll offer a few options that do not strictly rely on this type of MC.
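To make the rejection-sampling step concrete, here is a minimal sketch, assuming the critical features of each partial-track candidate have been reduced to a feature vector. The feature names, binning, and toy distributions below are all hypothetical.

```python
# Minimal sketch of Von Neumann (acceptance-rejection) sampling used to reshape
# a broad "proposal" sample of partial-track hit patterns so that chosen features
# match the real data. Feature choices and binning here are placeholders.
import numpy as np

def rejection_reweight(proposal_feats, data_feats, bins):
    """Accept/reject proposal events so their feature histogram matches the data's."""
    h_data, edges = np.histogramdd(data_feats, bins=bins, density=True)
    h_prop, _     = np.histogramdd(proposal_feats, bins=edges, density=True)
    ratio = np.divide(h_data, h_prop, out=np.zeros_like(h_data), where=h_prop > 0)
    ratio /= ratio.max()                      # acceptance probability <= 1
    # Locate each proposal event's bin and accept it with probability ratio[bin]
    idx = [np.clip(np.digitize(proposal_feats[:, k], edges[k]) - 1, 0, len(edges[k]) - 2)
           for k in range(proposal_feats.shape[1])]
    accept_prob = ratio[tuple(idx)]
    return np.random.uniform(size=len(proposal_feats)) < accept_prob

# Toy example: match (occupancy, x-slope) of generated partial tracks to real data.
proposal = np.column_stack([np.random.uniform(0, 200, 100000),
                            np.random.uniform(-0.2, 0.2, 100000)])
data     = np.column_stack([np.random.normal(80, 20, 20000),
                            np.random.normal(0.0, 0.05, 20000)])
keep = rejection_reweight(proposal, data, bins=(40, 40))
background_mc = proposal[keep]
```

Varying the target distributions fed to this step (trigger by trigger) is then just a matter of swapping out the data sample, which is what lets us check that no distribution-shape bias gets embedded in the trained model.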

There seems to be a lot of information missing from QTracker that would really help us now, primarily around the quality of each track. How many hits are missing? How far away were the hits? Which hit sequences give a high probability of a quality reconstruction? Without these types of metrics, it's difficult to say, track by track, whether we are working with something useful or misleading.
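As a sketch of what I mean by a per-track quality record: the plane count, hit structure, and the way the expected positions are obtained below are placeholders; the real versions would come from the Track Finder output.

```python
# Minimal sketch of a per-track quality record. N_PLANES and the dict-based
# hit representation are assumptions, not the actual QTracker data format.
from dataclasses import dataclass
import numpy as np

N_PLANES = 30   # hypothetical number of chamber planes a complete track should cross

@dataclass
class TrackQuality:
    n_hits: int            # hits actually assigned to the track
    n_missing: int         # planes with no assigned hit
    mean_residual: float   # average distance between hit and expected position
    max_residual: float

def evaluate_track(hit_positions, expected_positions):
    """hit_positions: {plane_id: measured position}; expected_positions: {plane_id: predicted}."""
    residuals = [abs(hit_positions[p] - expected_positions[p])
                 for p in hit_positions if p in expected_positions]
    return TrackQuality(
        n_hits=len(hit_positions),
        n_missing=N_PLANES - len(hit_positions),
        mean_residual=float(np.mean(residuals)) if residuals else np.inf,
        max_residual=float(np.max(residuals)) if residuals else np.inf,
    )
```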

The original concept of QTracker was to be modular, such that each independent part would accomplish a particular task and could, if needed, work as a stand-alone function/class. As an example, the Event Filter could help anyone filter events for their analysis, or help filter OM/OR events, if it is made to be used that way. This also makes each module easier to test directly and to turn into something useful. We should at least try this with two of the QTracker models. The Track Finder also fits into this category if it's coupled with a Track Filter.
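For illustration, here is a minimal sketch of a stand-alone Event Filter wrapper. The model path, input format, threshold, and the assumption that the model is a Keras model are mine, not fixed by QTracker.

```python
# Minimal sketch of a stand-alone Event Filter module: it loads its own weights
# and exposes a single call that others can use without the rest of QTracker.
import numpy as np
import tensorflow as tf   # assuming the QTracker models are saved Keras models

class EventFilter:
    def __init__(self, model_path="event_filter.keras", threshold=0.5):
        self.model = tf.keras.models.load_model(model_path)
        self.threshold = threshold

    def __call__(self, hit_matrices):
        """hit_matrices: array of per-event detector hit matrices; returns a boolean keep-mask."""
        scores = self.model.predict(np.asarray(hit_matrices), verbose=0)
        return scores.ravel() >= self.threshold

# Usage: someone filtering their own events for an analysis.
# keep = EventFilter()(my_hit_matrices)
# my_events = my_events[keep]
```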

The Track Finder should be followed by a Track Filter. After track candidates are found, they should be filtered based on a quality metric, as well as by checking whether the track hit patterns actually match something coming from the target. This may be two distinct cuts, one from the final track quality metric and one from a target-dump classifier (they could go together if the metric is part of the classifier training). There is certainly some major overlap of target and dump hits that makes this part challenging, but this can be iterated as well. Perhaps one filter before reconstruction and one after, which also uses the final 3-momentum and vertex information and therefore has a narrowed training space.
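A minimal sketch of the two-stage filter, with the feature names and the choice of a simple scikit-learn classifier as stand-ins for whatever is actually trained:

```python
# Minimal sketch: cut on a per-track quality score, then on a target/dump
# classifier probability. The quality threshold and probability cut are
# placeholders to be tuned on MC.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def track_filter(track_features, quality, clf, q_max=5.0, p_target_min=0.8):
    """track_features: per-track feature array; quality: quality score (lower = better);
    clf: trained target/dump classifier returning P(target)."""
    good_quality = quality < q_max
    p_target = clf.predict_proba(track_features)[:, 1]
    return good_quality & (p_target > p_target_min)

# Training the target/dump classifier on labeled MC tracks (1 = target, 0 = dump):
# clf = GradientBoostingClassifier().fit(mc_features, mc_labels)
# keep = track_filter(found_track_features, found_quality, clf)
```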

My bet is that if we focus on the above, we will not need much else to bring things together. The response function I mentioned could also help. Recall that the response function is just generic lingo for a quantitative evaluation of the goodness of fit. Such a thing is really needed for reconstruction evaluation, since there is no chi2 to cut on. For something like the 4-momentum or vertex, this could be calculated for the vector as a whole or for the individual components. In the analysis, a person may want to choose a high confidence level (CL) and get rid of all fits that are not within that CL, or keep everything as long as it's not a total garbage fit.

Assuming the regression performed well, and that for the MC validation data the true and predicted values are quite close, the next step would be to quantify this event by event. One way is to again use a bootstrap method: regenerate the training dataset and retrain the model multiple times. For each event, you then predict using these multiple models to get a distribution of predictions. The confidence interval can be estimated from the percentiles of this distribution (e.g., the 2.5th and 97.5th percentiles for a 95% confidence interval). A simpler, quick-and-dirty way might be to use a big MC set, measure the difference between true and predicted values from a single model, and calculate the MSE for each 4-vector over the large MC dataset. Then train a model to predict the MSE based on the output of the original model (the 4-vector components). You can still produce a CL this way too, if needed. Having this response function built from the reconstruction of tracks (or dimuons) coming only from the target would lead to dump tracks (or dimuons) having a lower CL. Not having these types of evaluations may be part of the reason we are struggling with the combinatoric backgrounds. We have no way of knowing how good the track or the final reconstruction is. QTracker thinks there are a lot of good tracks worth reconstructing that are just random junk. That is something we can train against, evaluate, and then cut on.
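Here is a minimal sketch of the bootstrap version of that confidence interval, with a small scikit-learn regressor standing in for the real QTracker reconstruction network and placeholder training arrays:

```python
# Minimal sketch of a bootstrap per-event confidence interval: retrain on
# resampled training sets, collect the spread of predictions per event, and
# cut on the width of the interval. The regressor and array names are stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def bootstrap_predictions(X_train, y_train, X_test, n_models=20, seed=0):
    """Retrain on bootstrap-resampled training sets and collect per-event predictions."""
    rng = np.random.default_rng(seed)
    preds = []
    for i in range(n_models):
        idx = rng.integers(0, len(X_train), size=len(X_train))   # resample with replacement
        model = RandomForestRegressor(n_estimators=100, random_state=i)
        model.fit(X_train[idx], y_train[idx])
        preds.append(model.predict(X_test))
    return np.stack(preds)                    # shape: (n_models, n_events)

# 95% confidence interval per event from the 2.5th and 97.5th percentiles:
# preds = bootstrap_predictions(X_train, y_train, X_test)
# lo, hi = np.percentile(preds, [2.5, 97.5], axis=0)
# keep = (hi - lo) < max_allowed_width   # drop events with poorly constrained reconstruction
```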


