Source Rock and Organic Matter Data Extracted From Geoscience Literature by Using Watson (IBM)

AAPG Hedberg Conference, The Evolution of Petroleum Systems Analysis

Datapages, Inc.


Basin modelling is now considered a commodity in oil and gas exploration, but too much human time is still spent gathering information and input data before a model is built. In this feasibility study, we concentrated on data mining for source rocks and petroleum systems. A huge mass of information on source rock intervals and organic matter is available in the scientific literature, company reports, and oral presentation files. However, searching this mass of documents for such detailed knowledge has become a challenge. We evaluated two of the Watson (IBM) machine learning capabilities: image recognition (Watson Visual Recognition, WVR) and natural language understanding (Watson Knowledge Studio, WKS). Because we observed that most geological information is found in figures, WVR was trained to identify and sort images and charts from scientific publications in the field of petroleum system analysis. WVR proved very efficient at distinguishing petroleum system charts, stratigraphic columns, burial curves, and well logs from other types of graphs. Automatic text recognition on these images was also tested. WKS, in turn, was trained to understand the semantic framework of textual knowledge related to source rocks, in order to search a text for terms representing a petroleum basin, a stratigraphic interval, or a source rock, as well as information about source rock lithology, organic matter type, and maturity. WVR and WKS were then combined to build advanced queries that help to find much more detailed and petroleum-technology-oriented knowledge than an everyday search engine.
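To illustrate how outputs of an image classifier and a text annotator might be combined into one query, the following is a minimal sketch in plain Python. It does not call any Watson API and is not the implementation used in the study. The `Document` class, the `query` function, the class labels (such as `burial_curve`), the entity types, and the sample records are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """One publication with hypothetical outputs of both models."""
    title: str
    # Figure classes an image classifier assigned to the document's figures.
    figure_classes: set = field(default_factory=set)
    # Entity type -> set of mentions a text annotator found in the document.
    entities: dict = field(default_factory=dict)


def query(docs, figure_class=None, **entity_filters):
    """Return titles of documents that contain a figure of the requested
    class (if given) and mention every requested entity value."""
    hits = []
    for doc in docs:
        if figure_class is not None and figure_class not in doc.figure_classes:
            continue
        if all(value in doc.entities.get(etype, set())
               for etype, value in entity_filters.items()):
            hits.append(doc.title)
    return hits


# Invented sample records; not data from the study.
docs = [
    Document("Paper A", {"burial_curve", "well_log"},
             {"basin": {"Basin X"}, "organic_matter_type": {"Type II"}}),
    Document("Paper B", {"stratigraphic_column"},
             {"basin": {"Basin X"}, "organic_matter_type": {"Type III"}}),
]

# Documents about Basin X that also contain a burial curve.
result = query(docs, figure_class="burial_curve", basin="Basin X")
```

In this sketch, a query that names both a figure class and an entity value narrows the results further than either filter alone, which is the kind of combined search the abstract describes.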