Machine learning

From SEG Wiki

Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to effectively perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task.[1]

What is Machine learning

The machine learning concept refers to computational methods (Figure 1) that allow machines to organize large amounts of data in order to produce accurate classifications or predictions. Machine learning processes rely on algorithms that examine the data and find relations or patterns within it.

ML Applications

Machine learning is used in different sectors for various reasons. Trading systems can be calibrated to identify new investment opportunities. Marketing and e-commerce platforms can be tuned to provide accurate and personalized recommendations to their users based on the users’ internet search history or previous transactions [2]. Previous applications of machine learning to seismic reflection data focus on the detection of geologic structures, such as faults and salt bodies (e.g., Huang et al., 2017 [3]), and on unsupervised seismic facies classification, in which an algorithm chooses the number and types of facies (e.g., Coléou et al. [4]). Although early studies primarily used clustering algorithms to classify seismic data, recent studies focus on the application of artificial neural networks (e.g., Huang et al. [3]), as discussed in Zhao et al. [5].

Figure 1: Examples of supervised and unsupervised machine learning techniques. [6]

Figure 2: Machine learning categorized in terms of familiar geophysics problems. [7]

ML algorithms are being used in the Oil and Gas sector for various applications such as:

Although ML algorithms appeared decades ago, SEG members started publishing about them in the mid-1990s. In 2019, in response to "the digital transformation of Oil and Gas", AAPG, SEG, and SPE decided to organize the first conference fully dedicated to the topic: "Energy in Data".


In general, machine learning methods can be subdivided into two end members: unsupervised and supervised methods (Figure 2).

Unsupervised learning

Unsupervised learning methods are those in which the data used for training have not been classified or labeled. In other words, the machine attempts to find similarities, patterns, or relations among the inputs by deciphering hidden or intrinsic structure in the data, often reducing its dimensionality. We are in some sense working blind: the term unsupervised is used because we lack a response variable that can supervise (or teach or evaluate) our analysis [6]. In these methods, error is difficult to determine. Examples of unsupervised learning techniques covered below are principal component analysis (PCA) and self-organizing maps (SOM).

Supervised learning

Supervised machine learning methods, also known as feed-forward methods [5], are those where both the inputs and the outputs are already known for the training data. The models are therefore trained to predict a known result. The goal is to fit a model that relates the response to the predictors, either to accurately predict the response for other observations or to understand the relationship between the response and the predictors (inference) [6]. The evaluation of these methods relies on error measures (e.g., misclassification rate, expected loss, confusion matrix). An example of a supervised learning technique covered below is the convolutional neural network (CNN).

Machine learning methods can be subdivided in a number of ways: algorithm architecture, loss function, learning style, label data type, and so on. However, for readers new to machine learning, it is simplest to categorize machine learning by the data set you have (inputs) and by the kind of data you want to end with (outputs). Figure 2 illustrates machine learning categorized by the kinds of problems commonly solved by geophysicists. [7]
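
The error measures named above for supervised methods (misclassification rate and confusion matrix) can be illustrated with a minimal sketch. The labels and predictions below are hypothetical values chosen only for illustration, and the helper names `confusion_matrix` and `misclassification_rate` are assumptions of this sketch, not functions from any cited work.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return nested counts: matrix[true_label][predicted_label]."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def misclassification_rate(y_true, y_pred):
    """Fraction of observations whose predicted label differs from the true one."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


# Hypothetical interpreter labels and model predictions for eight samples.
y_true = ["salt", "salt", "salt", "not_salt",
          "not_salt", "not_salt", "not_salt", "salt"]
y_pred = ["salt", "not_salt", "salt", "not_salt",
          "salt", "not_salt", "not_salt", "salt"]

cm = confusion_matrix(y_true, y_pred, labels=["salt", "not_salt"])
rate = misclassification_rate(y_true, y_pred)

for true_label, row in cm.items():
    print(true_label, row)
print("misclassification rate:", rate)
```

The diagonal of the matrix counts correct predictions and the off-diagonal entries count errors, so the misclassification rate is the off-diagonal total divided by the number of samples.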

Principal Component Analysis (PCA)

Principal component analysis (PCA) is one of the oldest multivariate analysis techniques; it was first introduced by Pearson in 1901 [6]. This quantitative unsupervised ML technique is a mathematical procedure that transforms a set of variables into a smaller number of variables called principal components, which are defined by the eigenvectors of the data covariance matrix. For this reason it is classified as a dimensionality reduction technique. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component (orthogonal to all preceding ones) accounts for as much of the remaining variability as possible (Guo et al., 2009; Haykin, 2009, in [5]) (Figure 3).
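
As a rough illustration of the procedure described above, the sketch below centers a small data set, builds its covariance matrix, and estimates the first principal component. It is a minimal sketch under stated assumptions: the two-attribute data are hypothetical, and power iteration is used here as a simple stand-in for a full eigendecomposition (it only recovers the leading eigenvector, and it assumes the starting vector is not orthogonal to that eigenvector). In practice a numerical library would normally be used.

```python
import math


def center(rows):
    """Subtract the mean of each variable (column) from every observation."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def first_principal_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov, estimated by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d          # arbitrary unit-length start (assumption)
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return v, eigenvalue


# Hypothetical measurements of two strongly correlated attributes.
data = [[2.0, 1.9], [4.0, 4.1], [6.0, 6.2], [8.0, 7.8], [10.0, 10.0]]

centered = center(data)
cov = covariance(centered)
pc1, variance1 = first_principal_component(cov)

# Share of the total variance (trace of cov) captured by the first component.
explained = variance1 / sum(cov[i][i] for i in range(len(cov)))

# Projecting onto pc1 reduces each 2D observation to a single score.
scores = [sum(a * b for a, b in zip(row, pc1)) for row in centered]

print("first principal component:", pc1)
print("explained variance ratio:", explained)
```

Because the two attributes are almost perfectly correlated, nearly all of the variability lies along one direction, which is why a single score per observation retains most of the information.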

Figure 3: Representation of principal components (eigenvectors) in a set of data. [6]

Self Organizing Maps (SOM)

SOMs are a manifold projection technique first described by Teuvo Kohonen [8], and are also known as Kohonen maps. A SOM is a type of neural network [4] originally developed for pattern recognition and nowadays used to cluster multidimensional data. Self-organizing maps learn the latent space by a recursive clustering algorithm. An initial manifold is selected and uniformly populated with cluster centers. The observed waveforms are then recursively entered into the model in random order. Each observation is mapped to a neighborhood of closest clusters defined by point-to-cluster distances, and the clusters are subsequently updated, thereby pulling the latent space to better fit the data. Its advantage over the commonly used k-means algorithm is that it produces ordered clusters, which can be displayed with an ordered color map [4], and it is this ordering that justifies categorizing SOM as a method of latent space modeling. While the clusters themselves are defined in the original n-dimensional space, they are mapped into a lower-dimensional, typically one- or two-dimensional, latent space. A latent space is a lower-dimensional space into which the original input data are projected. Analyzing data in a particular latent space may reveal data properties that are easily overlooked in the original space (Figure 4).
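
The recursive update described above can be sketched as follows. This is a minimal illustration, not the implementation used in the cited studies. It assumes a one-dimensional latent space (a chain of ten nodes), hypothetical 2D data lying on a half circle, a Gaussian neighborhood function, and linearly decaying learning rate and neighborhood radius. All function names and parameter values are choices made for this sketch.

```python
import math
import random


def nearest_node(nodes, x):
    """Index of the node (cluster center) closest to observation x."""
    return min(range(len(nodes)), key=lambda k: math.dist(nodes[k], x))


def quantization_error(nodes, data):
    """Mean distance from each observation to its closest node."""
    return sum(math.dist(nodes[nearest_node(nodes, x)], x) for x in data) / len(data)


def train_som(data, nodes, epochs=50, lr0=0.5, sigma0=3.0, sigma_end=0.5, seed=0):
    """Return trained copies of the nodes; node k has latent coordinate k."""
    rng = random.Random(seed)
    nodes = [list(n) for n in nodes]          # leave the initial model untouched
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                               # learning rate decays
        sigma = sigma0 * (1.0 - frac) + sigma_end * frac      # neighborhood shrinks
        samples = data[:]
        rng.shuffle(samples)                  # observations enter in random order
        for x in samples:
            best = nearest_node(nodes, x)
            for k, node in enumerate(nodes):
                # Neighbors of the winning node in latent space are also updated.
                h = math.exp(-((k - best) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(node)):
                    node[d] += lr * h * (x[d] - node[d])
    return nodes


# Hypothetical data: 60 points on a half circle in a 2D attribute space.
data = [[math.cos(math.pi * i / 59), math.sin(math.pi * i / 59)] for i in range(60)]

# Initial manifold: a straight line uniformly populated with ten cluster centers.
mean_y = sum(p[1] for p in data) / len(data)
initial = [[-1.0 + 2.0 * k / 9, mean_y] for k in range(10)]

trained = train_som(data, initial)
error_before = quantization_error(initial, data)
error_after = quantization_error(trained, data)

print("quantization error before:", error_before)
print("quantization error after:", error_after)
```

Because neighboring nodes in the chain are updated together, the straight initial line is deformed toward the curved data, and the index of each node along the chain serves as the 1D latent coordinate that can be color-coded.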

Figure 4: (a) A distribution of data points in 3D attribute space. The statistics of this distribution can be defined by the covariance matrix. (b) k-means will cluster data into a user-defined number of distributions (four in this example) based on the Mahalanobis distance measure. (c) The plane that best fits these data is defined by the first two eigenvectors of the covariance matrix. The projection of the 3D data onto this plane provides the first two principal components of the data, as well as the initial model for our SOM and GTM algorithms. (d) SOM and GTM deform the initial 2D plane into a 2D manifold that better fits the data. Each point on the deformed 2D manifold is in turn mapped to a 2D rectangular latent space. Clusters are color-coded or interactively defined on this latent space. [5]

Figure 5: How SOM works (10 seismic attributes). [9]

Recent work using self-organizing maps, multiattribute analysis, and PCA has revealed geologic features that were not previously identified or easily interpreted from the seismic data. The ultimate goal of this multiattribute analysis is to enable the geoscientist to produce a more accurate interpretation and reduce exploration and development risk [10]. The SOM, a form of unsupervised neural network, has proven able to take many of these seismic attributes and produce meaningful and easily interpretable results (Figure 5). SOM analysis reveals the natural clustering and patterns in data and has been beneficial in defining stratigraphy, seismic facies (Figure 6), direct hydrocarbon indicator features, and aspects of shale plays, such as fault/fracture trends and sweet spots. With modern visualization capabilities and the application of 2D color maps, SOM routinely identifies meaningful geologic patterns [9].

Figure 6: (a) SOM clusters extracted on a horizon slice close to the base of the Kora volcano, indicated by the purple pick in (b). The colors in the horizon slice indicate similar facies (similar colors = similar facies). Notice the purplish colors in a fan-like geometry with a ~2 km scour, suggesting TP 3 are associated with a landslide from the west flank of the Kora volcano. (b) A vertical slice through the seismic amplitude volume inside the subaqueous flow showing the extension of TP 3. (c) A vertical slice perpendicular to that in (b) through co-rendered amplitude and SOM clusters. Notice the distinct purple facies associated with the landslide or volcanic mass transport deposit (VMTD). [11]

Convolutional Neural Net (CNN)

CNNs belong to the family of deep learning methods, which discover intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change the internal parameters that are used to compute the representation in each layer from the representation in the previous layer. The architecture of a typical ConvNet (Figure 7) is structured as a series of stages. The first few stages are composed of two types of layers: convolutional layers and pooling layers. Units in a convolutional layer are organized in feature maps, within which each unit is connected to local patches in the feature maps of the previous layer through a set of weights called a filter bank. The result of this locally weighted sum is then passed through a non-linearity such as a ReLU. All units in a feature map share the same filter bank. Different feature maps in a layer use different filter banks. The reason for this architecture is twofold. First, in array data such as images, local groups of values are often highly correlated, forming distinctive local motifs that are easily detected. Second, the local statistics of images and other signals are invariant to location. In other words, if a motif can appear in one part of the image, it could appear anywhere, hence the idea of units at different locations sharing the same weights and detecting the same pattern in different parts of the array. Mathematically, the filtering operation performed by a feature map is a discrete convolution, hence the name. [12]
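
The three operations described above (a shared filter slid over local patches, a ReLU non-linearity, and pooling) can be sketched for a single feature map as follows. This is a minimal illustration using plain Python lists. The 5 x 5 image and the 2 x 2 edge-detecting filter are hypothetical, and the filter is applied without flipping (cross-correlation), which is the usual convention in deep learning software. In a trained network the filter weights would be learned by backpropagation rather than set by hand.

```python
def conv2d_valid(image, kernel):
    """Slide one shared filter over every local patch (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    """Element-wise non-linearity: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Keep the largest response in each non-overlapping 2 x 2 block."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]


# Hypothetical 5 x 5 image containing a vertical edge between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

# Hand-set filter that responds where values increase from left to right.
kernel = [[-1.0, 1.0],
          [-1.0, 1.0]]

feature_map = relu(conv2d_valid(image, kernel))   # 4 x 4 feature map
pooled = max_pool_2x2(feature_map)                # 2 x 2 after pooling

for row in feature_map:
    print(row)
print(pooled)
```

The same four weights are reused at every position, so the edge is detected wherever it occurs in the image, and pooling keeps the strongest response while reducing the size of the map.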

Figure 7: CNN architecture. [3]

Waldeland and Solberg [13] performed salt classification using deep learning, demonstrating how CNNs can classify salt bodies in seismic data. They trained a CNN on one manually labeled slice of a 3D seismic cube and used the network to automatically classify the full 3D salt body (Figure 8) without needing seismic attributes. This shows how machine learning techniques are advancing and becoming more frequently used: they help define features accurately, which facilitates interpretation and saves the interpreter time.

Figure 8: Left: one slice of a 3D seismic volume with two class labels, salt (red) and not salt (green). This is the training data. Right: extracted 3D salt body in the same data set, colored by elevation. [13]

See also

Seismic attributes, Principal component analysis, Self organizing map, Unsupervised learning, Supervised learning

External links


References

  3. Huang, L., Dong, X., and Clee, E., 2017, A scalable deep learning platform for identifying geologic features from seismic attributes: The Leading Edge, Special section: Data analytics and machine learning, Vol. 521, p. 433-444.
  4. Coléou, T., Poupon, M., and Azbel, K., 2003, Unsupervised seismic facies classification: A review and comparison of techniques and implementation: The Leading Edge, 22, 942–953, doi: 10.1190/1.1623635.
  5. Zhao, T., Jayaram, V., Roy, A., and Marfurt, K., 2015, A comparison of classification techniques for seismic facies recognition: Interpretation, Special section: Pattern recognition and machine learning, doi: 10.1190/INT-2015-0044.1.
  6. Nicholson, C., 2016, Data analytics lectures: The University of Oklahoma, Gallogly College of Engineering, School of Industrial and Systems Engineering.
  7. Dahl, M., Geo, P., 2018, A gentle introduction to machine learning: Recorder, Vol. 43, No. 01.
  8. Kohonen, T., 1982, Self-organized formation of topologically correct feature maps: Biological Cybernetics, 43, 59–69.
  9. Roden, R., and Sacrey, D., 2016, Seismic interpretation with machine learning: GeoExpro, Vol. 13, No. 6, p. 50-53.
  10. Roden, R., Smith, T., and Sacrey, D., 2015, Geologic pattern recognition from seismic attributes: Principal component analysis and self-organizing maps: Interpretation, Vol. 3, No. 4.
  11. Infante, L., and Marfurt, K., 2019, Using machine learning as an aid to seismic geomorphology, which attributes are the best input?: Interpretation, Vol. 7, No. 3, p. SE1–SE18.
  12. LeCun, Y., Bengio, Y., and Hinton, G., 2015, Deep learning: Nature, doi: 10.1038/nature14539.
  13. Waldeland, A., and Solberg, A., 2017, Salt classification using deep learning.

This page was authored by a student at the University of Oklahoma. This page was completed by December 1 2019.
