Using PyCHNO to Generate Training-Image Datasets for Machine Learning Ichnology

Eric Timmer¹, Murray Gingras², Calla Knudson², John-Paul Zonneveld²

¹Alberta Geological Survey;
²University of Alberta

May 19-22 2019 – 2019 AAPG Annual Convention and Exhibition, San Antonio, Texas

Posted: June 30, 2019

Abstract

Deep-learning models trained on labeled image datasets can successfully classify unlabeled images (e.g., identifying cats in images). However, for this method to recognize images correctly, thousands of labeled training images are typically required.

Considering that a labeled ichnofossil image library does not exist, this approach is generally deemed unsuitable for the automated detection of trace fossils. Ichnofossil detection is further complicated by the variability of trace-fossil cuts, size, and deformation, as well as lithological and regional changes in core. Therefore, an accurate deep-learning model likely requires tens of thousands of training images.

Sedimentary core datasets sometimes include thousands of preserved trace fossils. Using the open-source ichnology data-collection software PyCHNO, a skilled ichnology worker can rapidly click on, label, and reference thousands of trace fossils from core images. Here we present a software add-on that extracts thousands of labeled trace-fossil images from a PyCHNO-collected core dataset in order to generate a labeled database for machine-learning approaches. In addition, images from zones containing no trace fossils can be extracted from core datasets for solving bioturbated vs. unbioturbated zone problems. We use an example from the Cretaceous McMurray Formation of NE Alberta to demonstrate this approach.
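
The extraction step described above can be illustrated with a minimal sketch. The abstract does not describe PyCHNO's export format or the add-on's interface, so the `Pick` record, the function names, and the fixed patch size below are all hypothetical; the ichnogenus names are illustrative labels only. The sketch computes a fixed-size crop box around each clicked position, which could then be passed to an image library (for example, Pillow's `Image.crop`, which accepts a left, upper, right, lower box).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pick:
    """Hypothetical click record: pixel position plus a trace-fossil label."""
    x: int
    y: int
    label: str


def patch_box(pick, image_width, image_height, size):
    """Return a (left, top, right, bottom) box of side `size` around the pick.

    Near an image edge the box is shifted rather than shrunk, so every
    extracted patch has the same dimensions.
    """
    if size > image_width or size > image_height:
        raise ValueError("patch size exceeds image dimensions")
    half = size // 2
    left = min(max(pick.x - half, 0), image_width - size)
    top = min(max(pick.y - half, 0), image_height - size)
    return (left, top, left + size, top + size)


def labeled_boxes(picks, image_width, image_height, size=64):
    """Pair each pick's label with its crop box."""
    return [(p.label, patch_box(p, image_width, image_height, size)) for p in picks]


# Illustrative picks on a 300 x 1000 pixel core image (values are made up).
picks = [Pick(100, 200, "Planolites"), Pick(5, 10, "Skolithos")]
boxes = labeled_boxes(picks, image_width=300, image_height=1000, size=64)
```

Shifting the box at the image margin, instead of clipping it, is one possible design choice: it keeps all patches the same size, which most image classifiers expect, at the cost of the trace fossil sitting off-centre in patches near the core edge.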

Using this approach, a few core-image datasets from wells in close proximity can be used to generate a labeled training dataset comprising thousands of images. These images can subsequently be input into a deep-learning framework (e.g., Keras) to generate a trace-fossil identification model, which can potentially be used to label, at the very least, bioturbated vs. unbioturbated zones in unlabeled core images.
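
Before labeled patches are passed to a deep-learning framework, they are commonly divided into training and validation subsets. The abstract does not say how, or whether, the authors do this; the sketch below is an assumption showing one standard option, a per-label (stratified) split, using only the Python standard library. The record format `(patch_id, label)` and the function name are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(records, val_fraction=0.2, seed=0):
    """Split (patch_id, label) records into training and validation lists.

    Each label is split separately, so any label with at least two
    examples appears in both subsets.
    """
    by_label = defaultdict(list)
    for patch_id, label in records:
        by_label[label].append(patch_id)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for label in sorted(by_label):
        ids = by_label[label]
        rng.shuffle(ids)
        if len(ids) >= 2:
            n_val = int(round(len(ids) * val_fraction))
            n_val = min(max(n_val, 1), len(ids) - 1)
        else:
            n_val = 0  # a single example stays in the training set
        val += [(i, label) for i in ids[:n_val]]
        train += [(i, label) for i in ids[n_val:]]
    return train, val


# Illustrative records for the bioturbated vs. unbioturbated problem.
records = [(f"b{i}", "bioturbated") for i in range(10)]
records += [(f"u{i}", "unbioturbated") for i in range(5)]
train, val = stratified_split(records, val_fraction=0.2, seed=0)
```

Splitting per label guards against a rare ichnogenus ending up entirely in one subset, which matters when some trace fossils are represented by only a handful of picks.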

It is important to note that the proposed approach requires an expert to collect PyCHNO data; the quality of the model depends on the skill of the ichnologist identifying trace fossils. In addition, generated models may not initially be well suited to regional or cross-formational labeling of trace fossils from core images. PyCHNO can also be used to collect sedimentary structures, and can similarly be used to generate sedimentary-structure image training datasets. The proposed method has the potential to significantly increase the power of core-image datasets by allowing ichnologists to focus on interpreting trace-fossil distributions rather than spending considerable time collecting data.
