The impact of automation frameworks on today's data science competencies
Authors: Maria Potanin, Maike Holtkemper, Christian Beecks
Abstract: The digital transformation has rapidly changed today's working environment. The increasing demand for AI solutions, especially in skilled professions, is deepening the relationships between humans and smart machines. These new relationships require the skills of today's workforce to evolve and transform. Furthermore, technologies can be used to close potential skills gaps and thus make the interaction and collaboration between humans and technology work as well as possible. Among the many digital competencies required in today's working environment, Data Science competencies such as data preprocessing, feature engineering and model generation are essential for handling and analyzing small to big data sets arising in various data spaces. These Data Science competencies combine statistics, computer science and expert knowledge to gain insights from data and to develop data-driven solutions. With the amount of data increasing every day, there is a growing demand for subject matter experts with Data Science skills who help companies stay competitive by making data-based decisions, identifying future trends and patterns, optimizing processes, or developing new products and services. The competencies needed to accomplish these tasks are as multidisciplinary as the field of Data Science itself. They range from the ability to write in multiple programming languages, apply mathematical and statistical concepts, and use machine learning algorithms and techniques, through the ability to understand industry-specific requirements and to critically question and visualize results, to the ability to work in a team and interact socially. These requirements are placed on employees, no single one of whom can cover such a wide range of competencies.
Since the combination of these competencies is in high demand on the job market, but the small number of highly talented employees cannot meet this demand, intelligent solutions are being sought that can replace some of these competencies. The goal of this paper is therefore to investigate whether AI-based automation frameworks incorporate Data Science competencies in their design. It serves as an impulse for further discourse on whether AI-based automation frameworks can empower humans in acquiring Data Science competencies. For this purpose, four of 17 automation frameworks were selected in an initial analysis because they most strongly exhibited the following characteristics: automation, flexibility, ease of use and interoperability. These are AutoPrep, AutoGluon, AutoClust and DeepEye. Based on the EDISON Data Science Framework, the research papers describing these frameworks were coded by means of a qualitative content analysis, and the resulting data were then analyzed and discussed. Initial results show that automation frameworks can make the skills needed to apply Data Science competencies available to companies and workers in the short term. This concerns components from the EDISON competence groups Data Management, Data Analysis, Data Modeling and Communication.
Keywords: Data Science competencies, automation frameworks, EDISON