1  Machine Learning and its application in Muon Tomography

Machine-learning-inspired techniques can be applied to Muon Tomography (MT) to resolve many of its associated problems. The two problems of interest to us are low-resolution tomograms and data inefficiency. In this notebook, I discuss the use of ML as applied to MT to solve the problem of data inefficiency.

1.1  Data Inefficiency Problem

pic

We define "good muon events" as events with 4 by 4 coincidence hits. Such events have well-defined spatial points in both the top and bottom trays. For example, for the hit on the right, the points $A(x_1, y_1)$ and $B(x_2, y_2)$ are known. Due to the limited capabilities of our instruments, such events are rare (less than 40% with our best setup), with many events following the pattern of the hit on the left.

$$C(x_3, y_3), \quad D(x_4, ?)$$

We know that $y_4 \in (y_{min}, y_{max})$ and that it is discernible only to within $y_{res}$, the limited hardware spatial resolution of the telescope. The question is: what is $y_4$ likely to be?

Once we know how to address this question, we can drastically reduce our data redundancy and improve the overall resolution and efficiency of our scheme.

1.2  Track reconstruction using non-perfect events

Each event is assigned a probability that describes the likelihood of using information from that event to reconstruct the muon track. Such a scheme means that events with 4 by 4 coincidence hits have a probability of 1 and events with no hits have probability 0. Events with 3 by 4 hits are thus also assigned a probability.

Assigning this probability requires neural networks, specifically RNNs and LSTMs. The idea is as follows:

  1. Contextualize the entire dataset into an RNN framework.
  2. Calculate key statistics of the overall dataset and of individual events.
  3. Locate events with missing information (i.e. not perfect 4 by 4 coincidence hits).
  4. Assign probabilities to candidates for the missing information using holistic statistics, TDC data, and the angular distribution of muons.
  5. Use RNNs and LSTMs to predict the missing datum in 3 by 4 events using the probabilistic approach.

1.3  Probability distribution of the muon hit of a single spatial dimension

Using the coordinate with full information ($C$ in our case) and the other known spatial coordinate ($x_4$), we assign a probability to each candidate $y_i \in (y_{min}, y_{max})$. Formally, this can be written as a conditional probability in the sense of Bayes' theorem.

$$P(y_4 = y_i \mid x_3, y_3, x_4)$$

Now, the question is how do we create such a probability distribution? What controlled/measured factors can help us make such decisions?

1.3.1  No information known

When we know no previous/extra information about the muon hit, one can imagine that the distribution is uniform over the space of the missing spatial dimension.

$$P(y_4 = y_i \mid x_3, y_3, x_4) \sim U(y_{min}, y_{max})$$
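As a minimal sketch of this uniform case, the snippet below assigns equal probability to every discrete candidate. It assumes the candidates are spaced by the hardware resolution and that both endpoints are allowed; the function name and the numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def uniform_prior(y_min, y_max, y_res):
    """Return the candidate y values and a uniform probability for each.

    Assumption: candidates lie on a grid of spacing y_res from y_min to
    y_max inclusive. The source only states that y4 lies in (y_min, y_max).
    """
    candidates = np.arange(y_min, y_max + y_res / 2, y_res)
    probs = np.full(candidates.size, 1.0 / candidates.size)
    return candidates, probs

# Illustrative numbers only: 8 candidate positions, each with probability 1/8.
candidates, probs = uniform_prior(0.0, 7.0, 1.0)
```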

1.3.2  One spatial coordinate and one single spatial dimension known

The distribution for such a case is much more complicated. The following are some ways of generating the distribution.

1.3.2.1  Exact Aggregate Data Approach

In this approach, we assign probabilities using the set of "good" events that share the exact known coordinates with the event under consideration. The probability is the ratio of two aggregate event counts: the number of good events with those coordinates and $y_4 = y_i$, divided by the number of good events with those coordinates.

$$f_1(y_i) = P(y_4 = y_i) = \frac{E(y_i, x_3, y_3, x_4)}{E(x_3, y_3, x_4)}$$
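The ratio above can be sketched as a simple counting exercise. This is an illustration under assumptions, not the notebook's implementation: good events are represented as hypothetical tuples `(x3, y3, x4, y4)`, and an event with no matching good events gets probability 0 for every candidate.

```python
from collections import Counter

def exact_aggregate(good_events, x3, y3, x4, candidates):
    """f1: fraction of matching good events that have each candidate y4.

    Assumption: each good event is a tuple (x3, y3, x4, y4).
    """
    counts = Counter(
        ev[3] for ev in good_events if (ev[0], ev[1], ev[2]) == (x3, y3, x4)
    )
    total = sum(counts.values())  # E(x3, y3, x4)
    if total == 0:
        # No good event shares these coordinates, so nothing can be inferred.
        return {y: 0.0 for y in candidates}
    return {y: counts[y] / total for y in candidates}  # E(y_i, ...) / E(...)

# Made-up data: three good events match (1, 2, 1); two have y4 = 2, one has y4 = 3.
good_events = [(1, 2, 1, 2), (1, 2, 1, 2), (1, 2, 1, 3), (0, 0, 1, 2)]
f1 = exact_aggregate(good_events, 1, 2, 1, candidates=[1, 2, 3])
```

With this toy data, `f1[2]` is 2/3, `f1[3]` is 1/3, and `f1[1]` is 0.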

1.3.2.2  Similar Aggregate Data Approach

This approach is similar to the Exact Aggregate Data Approach, with the added consideration that neighboring events influence the statistic. For a tolerance unit of $\chi$, the probability distribution is as follows.

$$f_2(y_i) = P(y_4 = y_i) = \frac{E(y_i, x_3 + \chi, y_3 + \chi, x_4 + \chi)}{E(x_3 + \chi, y_3 + \chi, x_4 + \chi)}$$
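A minimal sketch of the neighbor-tolerant count follows. It reads the tolerance as "each known coordinate within $\chi$ of the event's value", which is one interpretation of the formula and an assumption here; the tuple layout `(x3, y3, x4, y4)` and the data are hypothetical.

```python
def similar_aggregate(good_events, x3, y3, x4, candidates, chi):
    """f2: like the exact approach, but good events within chi also count.

    Assumptions: events are tuples (x3, y3, x4, y4), and "within chi" is
    applied independently to each of the three known coordinates.
    """
    near = [
        ev for ev in good_events
        if abs(ev[0] - x3) <= chi
        and abs(ev[1] - y3) <= chi
        and abs(ev[2] - x4) <= chi
    ]
    if not near:
        # No neighboring good events, so nothing can be inferred.
        return {y: 0.0 for y in candidates}
    return {y: sum(1 for ev in near if ev[3] == y) / len(near) for y in candidates}

# Made-up data: the first three events fall within chi = 1 of (1, 2, 1).
good_events_f2 = [(1, 2, 1, 2), (2, 2, 1, 2), (1, 3, 1, 3), (5, 5, 5, 1)]
f2 = similar_aggregate(good_events_f2, 1, 2, 1, candidates=[1, 2, 3], chi=1)
```

Setting `chi=0` reduces this to the exact approach, which is a quick way to check the two against each other.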

1.3.2.3  Shift of TDC Approach

Since we have demonstrated that TDC values are correlated with the transverse distance of the muon hit along the scintillator bar, it is possible to extrapolate 2D information from such values. In this approach, we do just that to determine the missing spatial dimension.

Let's consider the following case of event D.

Note: the X and Y directions have been mistakenly reversed in the plots here.

pic2

We classify the muon hits along the readout channel of event D into 4 arbitrary groups: W, Q, Y, and T. We know that there is a relationship between the average TDC measured at one of the channels of this group and the measured distance along the axis, as illustrated by the following figure.

pic3

The discernible peaks/means here represent the different "classes". The vector of these statistics is directly correlated with the value of the missing orthogonal dimension. Using the linear relationship $y_p = m\mu + c$, where $\mu$ is the mean TDC statistic, we can predict the missing dimension ($y_4$ in our case).

pic4

We define a function $g(y_4 = y_i)$ such that

$$g(y_4 = y_i) = \frac{\left| (m\mu_{known} + c) - y_i \right|}{y_{min} + y_{max}}$$

Thus,

$$f_3(y_i) = P(y_4 = y_i \mid \mu = \mu_{known}) = \begin{cases} 1 - g(y_4 = y_i) & g(y_4 = y_i) \le 1 \\ 0 & g(y_4 = y_i) > 1 \end{cases}$$
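The TDC-based distribution can be sketched as follows. The slope `m`, intercept `c`, and all numeric values are made up for illustration; in practice they would come from a fit of position against mean TDC. The function name is hypothetical.

```python
def tdc_probability(mu_known, m, c, candidates, y_min, y_max):
    """f3: score each candidate by its distance from the TDC-predicted position.

    Uses y_p = m * mu + c for the prediction and g = |y_p - y_i| / (y_min + y_max).
    Assumption: y_min + y_max is nonzero, otherwise g is undefined.
    """
    y_pred = m * mu_known + c
    probs = {}
    for y in candidates:
        g = abs(y_pred - y) / (y_min + y_max)
        probs[y] = 1.0 - g if g <= 1.0 else 0.0
    return probs

# Illustrative numbers: predicted position is 0.5 * 4.0 + 1.0 = 3.0.
f3 = tdc_probability(mu_known=4.0, m=0.5, c=1.0, candidates=[1, 3, 5], y_min=0.0, y_max=8.0)
```

Here the candidate at the predicted position gets 1.0, and candidates two units away get 0.75. Note that these values are scores in [0, 1] rather than a normalized distribution; they do not sum to 1 across candidates.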

1.3.2.4  Most Common Zenith Angle Approach

In this approach, we select the missing spatial dimension such that the resulting angle resembles the most common zenith angle, $\alpha_{mean}$, with $\alpha$ being the vector of angles computed from the data set.

$$\alpha_{mean} = E(\alpha)$$

Since the space of $y_i$ is discrete, it is possible to generate a zenith angle distribution over that space, given that the set $x_3, y_3, x_4$ is known, as is the case for the event of interest.

For example, let's consider such an arbitrary distribution.

Example

We define a function $d(y_i)$ (given $x_3, y_3, x_4$) as follows:

$$d(y_i \mid x_3, y_3, x_4) = \frac{\left| \alpha_{mean} - \alpha_i(y_i) \right|}{\alpha_{mean}}$$

Thus, using this function we generate a probability distribution that selects for the most common zenith angle.

$$f_4(y_i) = P(y_4 = y_i) = 1 - d(y_i \mid x_3, y_3, x_4)$$
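A small sketch of the zenith-angle score is given below. The geometry is an assumption: the zenith angle is taken from the horizontal displacement between the two hits and a vertical tray separation, a parameter that does not appear in the text above. Function names and values are hypothetical.

```python
import math

def zenith_angle(x3, y3, x4, y4, tray_separation):
    """Angle from the vertical of the line joining (x3, y3) and (x4, y4).

    Assumption: the two trays are parallel and tray_separation apart.
    """
    horizontal = math.hypot(x4 - x3, y4 - y3)
    return math.atan2(horizontal, tray_separation)

def zenith_probability(x3, y3, x4, candidates, alpha_mean, tray_separation):
    """f4 = 1 - d, with d = |alpha_mean - alpha_i| / alpha_mean."""
    probs = {}
    for y in candidates:
        alpha = zenith_angle(x3, y3, x4, y, tray_separation)
        d = abs(alpha_mean - alpha) / alpha_mean
        probs[y] = 1.0 - d  # left unclipped, as in the formula
    return probs

# Illustrative numbers: with alpha_mean = 45 degrees, y4 = 1 reproduces it exactly.
f4 = zenith_probability(0.0, 0.0, 0.0, [0, 1, 2], alpha_mean=math.pi / 4, tray_separation=1.0)
```

One thing the sketch makes visible: $1 - d$ becomes negative whenever a candidate's angle exceeds $2\alpha_{mean}$. The formula as written does not say how to handle that case, so the code leaves the value as-is.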

The best method should be some combination of all of these approaches. This is where ML comes in. The ML scheme would solve for the coefficients of such a weighted sum so as to maximize the resolution of the tomogram and minimize data inefficiency.

1.4  Applying ML

1.4.1  Comprehensive Probability Function

$$P(y_4 = y_i) = A f_1(y_i) + B f_2(y_i) + C f_3(y_i) + D f_4(y_i)$$

Let θ be the vector of the coefficients.

$$\theta = \begin{bmatrix} A & B & C & D \end{bmatrix}$$
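The weighted sum can be sketched as a single dot product between $\theta$ and the stacked values of the four approaches. The array layout (one row per approach, one column per candidate) and the numbers are assumptions made for illustration.

```python
import numpy as np

def combined_probability(theta, f_values):
    """Weighted sum A*f1 + B*f2 + C*f3 + D*f4 for every candidate at once.

    theta:    shape (4,), the coefficients [A, B, C, D].
    f_values: shape (4, n_candidates), row k holds f_{k+1}(y_i).
    """
    theta = np.asarray(theta, dtype=float)
    f_values = np.asarray(f_values, dtype=float)
    return theta @ f_values

# Made-up values of f1..f4 over three candidates, with equal weights.
f_values = [
    [0.0, 0.5, 0.5],  # f1
    [0.0, 0.5, 0.5],  # f2
    [0.5, 1.0, 0.5],  # f3
    [0.0, 1.0, 0.5],  # f4
]
combined = combined_probability([0.25, 0.25, 0.25, 0.25], f_values)
```

With these inputs the result is `[0.125, 0.75, 0.5]`, so the second candidate is favored. The output is not normalized; whether to normalize it is left open here because the formula does not specify it.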

1.4.2  Objective Function for Training

Here, $E_{total}$ is the total number of events in the data set and $E_{used}(\theta)$ is the number of events that are used for analysis purposes, as dictated by $\theta$.

$$\underset{\theta}{\text{minimize}} \;\; E_{total} - E_{used}(\theta)$$
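To make the objective concrete, the sketch below evaluates $E_{total} - E_{used}(\theta)$ for a given $\theta$. The rule for deciding whether an imperfect event is "used" is not specified above, so the code adopts a purely hypothetical one: the event counts as used when its best candidate's combined probability reaches a threshold. The function names, the threshold, and the data are all illustrative.

```python
import numpy as np

def events_used(theta, f_per_event, n_good, threshold=0.5):
    """E_used(theta) under a hypothetical threshold rule.

    Assumption: good events are always used, and an imperfect event is used
    only if its best candidate's weighted sum is at least `threshold`.
    f_per_event holds one (4, n_candidates) array per imperfect event.
    """
    theta = np.asarray(theta, dtype=float)
    used = n_good
    for f_values in f_per_event:
        combined = theta @ np.asarray(f_values, dtype=float)
        if combined.max() >= threshold:
            used += 1
    return used

def objective(theta, f_per_event, n_good, threshold=0.5):
    """E_total - E_used(theta), the quantity to be minimized."""
    e_total = n_good + len(f_per_event)
    return e_total - events_used(theta, f_per_event, n_good, threshold)

# Made-up data: 3 good events and 2 imperfect events.
event_a = [[0.0, 0.5, 0.5], [0.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.0, 1.0, 0.5]]
event_b = [[0.2, 0.2, 0.2]] * 4
loss_equal = objective([0.25, 0.25, 0.25, 0.25], [event_a, event_b], n_good=3)
loss_large = objective([1.0, 1.0, 1.0, 1.0], [event_a, event_b], n_good=3)
```

With equal weights only the first imperfect event passes the threshold, so the loss is 1; with all weights set to 1 both pass and the loss is 0. This shows that, under the assumed rule, the objective alone can be driven down simply by scaling $\theta$ up, which is consistent with pairing it with the tomogram-resolution goal mentioned earlier rather than optimizing it in isolation.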