Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task effectively without explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task.
What is Machine learning
The machine learning concept refers to computational methods that allow machines to organize large amounts of data in order to classify or predict accurately. Machine learning processes comprise complex algorithms or source code that examine the data and find relations or patterns within it.
Machine learning is used in different sectors for various reasons. Trading systems can be calibrated to identify new investment opportunities. Marketing and e-commerce platforms can be tuned to provide accurate and personalized recommendations to their users based on the users' internet search history or previous transactions.
Previous applications of machine learning to seismic reflection data focus on the detection of geologic structures, such as faults and salt bodies (e.g., Huang et al., 2017), and on unsupervised seismic facies classification, in which an algorithm chooses the number and types of facies (e.g., Coléou et al., 2003). Although early studies primarily used clustering algorithms to classify seismic data, recent studies focus on the application of artificial neural networks (e.g., Huang et al., 2017; Zhao et al., 2015).
ML algorithms are being used in the Oil and Gas sector for various applications such as:
- Facies classification
- Quantitative interpretation
- Geobody interpretation
- Micro-seismic event detection
- Velocity picking
- Image analysis of rock thin sections
- Seismic processing, such as ground-roll noise attenuation
Although ML algorithms appeared decades ago, SEG members started publishing about them in the mid-1990s. In 2019, in response to "the digital transformation of Oil and Gas", the AAPG, SEG, and SPE decided to organize the first conference fully dedicated to the topic: "Energy in Data".
Generally, machine learning methods can be divided into two end members: unsupervised and supervised methods (Figure 1).
Unsupervised learning methods are those in which the training data have not been classified or labeled. In other words, the machine attempts to find similarities, patterns, or relations among the inputs by deciphering hidden or intrinsic structures in the data and by reducing dimensionality. We are in some sense working blind: the term unsupervised is used because we lack a response variable that can supervise (or teach or evaluate) our analysis. In these methods, error is difficult to determine. Some examples of unsupervised learning techniques are:
- Association mining
Supervised machine learning methods are those where both inputs and outputs are already known, so the models are trained to predict a known result. The goal is to fit a model that relates the response to the predictors, either to accurately predict the response for other observations or to understand the relationship between the response and the predictors (inference). The evaluation of these methods relies on error measures (e.g., MSE, misclassification rate, expected loss, confusion matrix). Some examples of supervised learning techniques are:
- Decision trees
- Linear regression (OLS, PLS, ridge, lasso, elastic net)
- Logistic regression (including penalized variants)
- Support vector machines
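The error measures mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names and the small regression and facies-label arrays are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted responses."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix: rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def misclassification_rate(cm):
    """Fraction of samples that fall off the diagonal of the confusion matrix."""
    return float(1.0 - np.trace(cm) / cm.sum())

# Hypothetical regression responses and predictions.
mse = mean_squared_error([2.0, 3.0, 5.0], [2.5, 3.0, 4.0])

# Hypothetical facies labels (three classes: 0, 1, 2).
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(true_labels, pred_labels, n_classes=3)
rate = misclassification_rate(cm)

print("MSE:", mse)
print("Confusion matrix:\n", cm)
print("Misclassification rate:", rate)
```

In practice these measures are computed on data held out from training, so that they estimate how the model behaves on new observations.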
Machine learning methods can be subdivided in a number of ways: algorithm architecture, loss function, learning style, label data type, and so on. However, for readers new to machine learning, it is simplest to categorize machine learning by the data set you have (inputs) and by the kind of data you want to end with (outputs). Figure 2 illustrates machine learning categorized by the kinds of problems commonly solved by geophysicists. 
Principal Component Analysis (PCA)
Principal component analysis (PCA) is one of the oldest multivariate analysis techniques. It was first introduced by Pearson in 1901. This quantitative unsupervised ML technique is a mathematical procedure that transforms a set of variables into a smaller number of variables called principal components, whose directions are given by eigenvectors of the data covariance matrix. For this reason it is classed as a dimensionality reduction technique. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component (orthogonal to each preceding one) accounts for as much of the remaining variability as possible (Guo et al., 2009; Haykin, 2009) (Figure 3).
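As a rough sketch of the procedure, PCA can be computed from the singular value decomposition of the mean-centered data matrix. The function name `pca` and the synthetic three-attribute data set below are assumptions made for illustration; they do not come from the cited studies.

```python
import numpy as np

def pca(X, n_components):
    """Return principal directions, explained-variance ratios, and projected scores."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each variable
    # Rows of Vt are orthonormal eigenvectors of the covariance matrix,
    # ordered from largest to smallest variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variance = s ** 2 / (X.shape[0] - 1)         # variance along each direction
    ratio = variance / variance.sum()
    components = Vt[:n_components]
    scores = Xc @ components.T                   # data in the reduced space
    return components, ratio[:n_components], scores

# Hypothetical data: 200 samples of 3 strongly correlated "attributes".
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([latent, 2.0 * latent, -latent]) + 0.05 * rng.normal(size=(200, 3))

components, ratio, scores = pca(X, n_components=2)
print("Explained variance ratio:", ratio)
```

Because the three synthetic attributes are driven by a single latent variable, the first component should carry almost all of the variance, which is the situation in which reducing dimensionality loses little information.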
Self-Organizing Maps (SOM)
SOMs are a manifold projection technique first described by Teuvo Kohonen (1982) and are also known as Kohonen maps. A SOM is a type of neural network originally developed for pattern recognition and is nowadays used to cluster multidimensional data.
Self-organizing maps learn the latent space by a recursive clustering algorithm. An initial manifold is selected and uniformly populated with cluster centers. The observed waveforms are then recursively entered into the model in random order. Each observation is mapped to a neighborhood of closest clusters defined by point-to-cluster distances, and the clusters are subsequently updated, thereby pulling the latent space to better fit the data. Its advantage over the commonly used k-means algorithm is that it produces ordered clusters, which can be displayed with an ordered color map, and it is this ordering that justifies categorizing SOM as a method of latent space modeling. While the clusters themselves are defined in the original n-dimensional space, they are mapped into a lower-dimensional, typically one- or two-dimensional, latent space. A latent space is a lower-dimensional space into which the original input data are projected. Analyzing data in a particular latent space may reveal data properties that are easily overlooked in the original space (Figure 4).
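The training loop described above can be sketched as a one-dimensional SOM in NumPy. This is a simplified illustration under stated assumptions: the function names, the linear decay of the learning rate and neighborhood width, the Gaussian neighborhood, and the synthetic arc-shaped data are all hypothetical choices, not the settings of any published workflow.

```python
import numpy as np

def train_som(data, n_nodes=8, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a 1D SOM; returns the cluster centers in the original data space."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Initial manifold: a straight line between the data minimum and maximum,
    # uniformly populated with cluster centers.
    lo, hi = data.min(axis=0), data.max(axis=0)
    t = np.linspace(0.0, 1.0, n_nodes)[:, None]
    weights = lo + t * (hi - lo)
    grid = np.arange(n_nodes)                    # node positions in latent space
    for i in range(n_iter):
        frac = i / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays over time
        sigma = sigma0 * (1.0 - frac) + 0.01     # neighborhood shrinks over time
        x = data[rng.integers(len(data))]        # observations enter in random order
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # closest center
        # Neighbors of the best-matching unit in latent space are updated too,
        # which is what keeps the clusters ordered.
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def assign(data, weights):
    """Index of the nearest cluster center for every sample."""
    d = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(d, axis=1)

# Hypothetical data: noisy points along a half circle.
rng = np.random.default_rng(1)
angles = rng.uniform(0.0, np.pi, size=300)
data = np.column_stack([np.cos(angles), np.sin(angles)])
data += 0.02 * rng.normal(size=data.shape)

weights = train_som(data)
labels = assign(data, weights)
print("Samples per cluster:", np.bincount(labels, minlength=len(weights)))
```

The cluster index returned by `assign` is the latent-space coordinate of each sample, so it can be plotted directly against an ordered color map.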
Recent work using SOM and PCA has revealed geologic features that were not previously identified or easily interpreted from the seismic data. The ultimate goal of this multiattribute analysis is to enable the geoscientist to produce a more accurate interpretation and reduce exploration and development risk.
The SOM, a form of unsupervised neural network, has proven able to take many of these seismic attributes and produce meaningful and easily interpretable results (Figure 5). SOM analysis reveals the natural clustering and patterns in data and has been beneficial in defining stratigraphy, seismic facies (Figure 6), direct hydrocarbon indicator features, and aspects of shale plays, such as fault/fracture trends and sweet spots. With modern visualization capabilities and the application of 2D color maps, SOM routinely identifies meaningful geologic patterns.
Convolutional Neural Networks (CNN)
CNNs belong to the family of deep learning methods, which discover intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change the internal parameters used to compute the representation in each layer from the representation in the previous layer. The architecture of a typical ConvNet (Figure 7) is structured as a series of stages. The first few stages are composed of two types of layers: convolutional layers and pooling layers. Units in a convolutional layer are organized in feature maps, within which each unit is connected to local patches in the feature maps of the previous layer through a set of weights called a filter bank. The result of this locally weighted sum is then passed through a non-linearity such as a ReLU. All units in a feature map share the same filter bank. Different feature maps in a layer use different filter banks. The reason for this architecture is twofold. First, in array data such as images, local groups of values are often highly correlated, forming distinctive local motifs that are easily detected. Second, the local statistics of images and other signals are invariant to location. In other words, if a motif can appear in one part of the image, it could appear anywhere, hence the idea of units at different locations sharing the same weights and detecting the same pattern in different parts of the array. Mathematically, the filtering operation performed by a feature map is a discrete convolution, hence the name.
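A single convolution, ReLU, and pooling stage can be sketched in NumPy as follows. This is a minimal illustration with assumptions: the function names, the 6 x 6 toy image, and the hand-set vertical-edge filter are hypothetical, and in a real CNN the filter weights are learned by backpropagation rather than fixed. As in most deep learning libraries, the filter is applied without flipping (cross-correlation).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Locally weighted sum over every patch, using one shared filter (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Non-linearity applied to the locally weighted sums."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """Keep the maximum of each non-overlapping 2 x 2 block."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    blocks = x[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Hypothetical 6 x 6 image: zeros on the left half, ones on the right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-set filter that responds to a dark-to-bright vertical edge.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = relu(conv2d_valid(image, kernel))
pooled = max_pool2x2(feature_map)
print(feature_map)
print(pooled)
```

Because the same `kernel` is applied at every position, the edge is detected wherever it occurs in the image, which is the weight-sharing idea described above.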
Waldeland and Solberg (2017) performed salt classification using deep learning, demonstrating how CNNs can classify salt bodies in seismic data. They trained a CNN on one manually labeled slice of a 3D seismic cube and used the network to automatically classify the full 3D salt body (Figure 8) without needing seismic attributes. This shows how machine learning techniques are advancing and becoming more widely used; they help define features accurately, which facilitates interpretation and saves the interpreter time.
- Seismic attributes
- Principal component analysis: https://wiki.seg.org/wiki/Dictionary:Principal_component_analysis_(PCA)
- Self organizing map: https://wiki.seg.org/wiki/Self_Organizing_Map_and_Multi-attribute_Analysis
- Unsupervised learning: https://wiki.seg.org/wiki/Dictionary:Learning,_supervised/unsupervised
- Supervised learning: https://wiki.seg.org/wiki/Dictionary:Learning,_supervised/unsupervised
- AASPI Website
- Geophysical Insights - Self-Organizing Map
- Wikipedia - Machine Learning
- Nicholson, C., 2016. Data Analytic lectures. The University of Oklahoma, Gallogly College of Engineering, School of Industrial and systems engineering
- Dahl, M., Geo, P., 2018, A Gentle introduction to machine learning. Recorder. Vol 43. No 01. https://csegrecorder.com/articles/view/a-gentle-introduction-to-machine-learning
- Huang, L., Dong, X., Clee, E., 2017. A scalable deep learning platform for identifying geologic features from seismic attributes. The Leading Edge. Special section: Data analytics and machine learning. Vol 521, 433-444p https://library.seg.org/doi/10.1190/tle36030249.1
- Coléou, T., M. Poupon, and K. Azbel, 2003, Unsupervised seismic facies classification: A review and comparison of techniques and implementation: The Leading Edge, 22, 942–953, doi: 10.1190/1.1623635. https://www.researchgate.net/publication/276950950_Unsupervised_seismic_facies_classification_A_review_and_comparison_of_techniques_and_implementation
- Zhao, T; Jayaram, V; Roy, A and Marfurt, K, 2015, A comparison of classification techniques for seismic facies recognition: Interpretation. Special edition. Pattern recognition and machine learning. DOI: 10.1190/INT-2015-0044.1. https://www.researchgate.net/publication/281783417_A_comparison_of_classification_techniques_for_seismic_facies_recognition
- Kohonen, T., 1982, Self-organized formation of topologically correct feature maps: Biological Cybernetics, 43, 59–69 https://link.springer.com/article/10.1007/BF00337288
- Roden, R., and Sacrey, D., 2016. Seismic interpretation with machine learning. GeoExpro. Vol 13. No 6. pp 50-53 https://www.geoexpro.com/articles/2017/01/seismic-interpretation-with-machine-learning
- Roden, R., Smith, T., & Sacrey, D. (2015). Geologic pattern recognition from seismic attributes: Principal component analysis and self-organizing maps. Interpretation. Vol, 3. No 4. http://dx.doi.org/10.1190/INT-2015-0037.1.
- Infante, L., Marfurt, K., 2019. Using Machine learning as an aid to seismic geomorphology, which attributes are the best input? Interpretation, Vol.7, No 3. p. SE1–SE18, 29 FIGS. http://dx.doi.org/10.1190/INT-2018-0096.1
- LeCun, Y., Bengio, Y., Hinton, G, 2015. Deep Learning. NATURE. doi:10.1038/nature14539 https://www.nature.com/articles/nature14539
- Waldeland, A., Solberg, A., 2017. Salt Classification Using Deep Learning. http://www.earthdoc.org/publication/publicationdetails/?publication=88635