Rapid Exploration in a Mature Area Incorporating Data Partitioning in Northwest Kansas: Improving the Resolution of Statistical Analysis of Big Data

AAPG Annual Convention and Exhibition

Abstract

With the rapidly changing environment associated with leasing new viable oil and gas plays in a mature region, attention has turned to the statistical analysis of all of the available data (i.e., big data). This has occurred at scales from the basin-wide down to the individual play. For the most part, the tools used are statistical, geostatistical and multivariate. Oftentimes, the user is either given a toolset in a larger program or works with one of the many fine programs available on the market. Even with a deep understanding of the myriad assumptions associated with these approaches, it is difficult to extract all but the most observable results from the various types of geologic data, and even more difficult to quantify the economic confidence of any result. A common workflow is to gather the best data available (geologic, geophysical, geochemical, log-based and so forth), create multiple layers/surfaces of geo-located information and then perform some appropriate multivariate analysis. Many of the assumptions associated with the common multivariate techniques require that the data be derived from either one or a fixed number of known populations. With big data, verification of these assumptions is often overlooked, and statistically ambiguous or difficult-to-validate results are common. An extra step needs to be added to the workflow in these cases - partitioning the data in an appropriate way and then analyzing each partition separately. Recognizing when this partitioning is needed via visual, statistical, geostatistical and deterministic techniques was a large part of the study described below. The ‘big data’ consisted of well-based and geophysical data (gravity and magnetic) from several counties in Northwest Kansas. In this area, the early Paleozoic rocks appear to be dominated by basement tectonics at the time of deposition, whereas the later Paleozoic formations appear to simply overlie these rocks.
Once visual hints indicated that data partitioning was appropriate, computer programs designed for data partitioning (Polytopic Vector Analysis-based programs and Fuzzy Clustering, including Fuzzy N-Varieties) were applied. Partitioning the data sharpened the multivariate analysis so that a more refined probability of success could be defined. This same workflow can be applied to the analysis of basins as well.
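
The partition-then-analyze step can be illustrated with a minimal sketch. It is not the software used in the study: it assumes plain fuzzy c-means as a stand-in for the Polytopic Vector Analysis and Fuzzy N-Varieties programs named above, and it runs on small synthetic data invented for illustration. All function and variable names (`fuzzy_c_means`, `pearson`, `points`) are hypothetical.

```python
import math
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Basic fuzzy c-means (assumed stand-in, not the study's software).

    Returns (centers, memberships); memberships[i][k] is the degree to
    which point i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    memberships = []
    for _ in range(n_iter):
        # Membership update: inverse-distance weighting with fuzzifier m.
        memberships = []
        for p in points:
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Center update: mean of the points weighted by membership**m.
        new_centers = []
        for k in range(n_clusters):
            weights = [u[k] ** m for u in memberships]
            wsum = sum(weights)
            new_centers.append([
                sum(w * p[j] for w, p in zip(weights, points)) / wsum
                for j in range(len(points[0]))
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic data (invented): two populations with opposite trends.
population_a = [(float(i), float(i)) for i in range(10)]              # y rises with x
population_b = [(20.0 + i, 40.0 - i) for i in range(10)]              # y falls with x
points = population_a + population_b

# Pooled analysis: one correlation for all points, ignoring populations.
pooled_r = pearson([p[0] for p in points], [p[1] for p in points])

# Partition first, then analyze each partition separately.
centers, memberships = fuzzy_c_means(points, n_clusters=2)
labels = [max(range(len(u)), key=u.__getitem__) for u in memberships]
partition_r = {}
for k in sorted(set(labels)):
    members = [p for p, lab in zip(points, labels) if lab == k]
    partition_r[k] = pearson([p[0] for p in members], [p[1] for p in members])

print("pooled r:", pooled_r)
print("per-partition r:", partition_r)
```

In this toy case the pooled correlation blends two populations with opposite trends, while the per-partition correlations recover each trend separately. Real well-based, gravity and magnetic layers would need scaling and a justified choice of the number of partitions before any such comparison is meaningful.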