Efficient Inductive Logic Programming based on Predictive A*-like Algorithm

Open Access
Conference Proceedings
Authors: Moeko OkawaraJunji FukuharaMUNEHIRO TAKIMOTOTsutomu KumazawaYasushi Kambayashi

Abstract: Various machine learning (ML) techniques have been developed widely over the last decade. Especially, deep learning (DL) contributes to ML for creating a lot of structured data such as tables from unstructured data such as images and sounds. The results have led to a lot of successes in engineering, but most of their decisions and actions are hard to be explained or verified. On the other hand, as a perfectly explainable ML approach, i.e., inductive logic programming (ILP), has been used in data mining. ILP, which is based on the first order predicate logic, is one of the symbolic approaches that is useful to deal with structured data and the relations between them. In a practical sense, we can add the results generated by ILP into given background knowledge, and make the knowledge database rich. Thus, ILP becomes more important for data mining than before, and we can extract meaningful relations between the structured data. However, contrary to DL, it is not easy for ILP to perform a learning process efficiently, because we cannot make ILP processes uniformly executable in parallel on GPU. One learning process corresponds to an inductive prediction process, where training samples correspond to positive and negative examples. In the process, ILP explores hypothesis candidates while calculating a cover set that is a set of examples deduced from each candidate. Notice that from the finally obtained hypothesis, the positive examples should be deduced, and the negative ones should not be deduced with the background knowledge. The cover set is known to be uniformly calculated in relational operations, which are executed on GPU or a relational database management system (RDBMS) such as SQL. Since modern RDBMSs can not only manage memory operations safely but also execute SQL in parallel utilizing GPU. Thus, we can partially execute ILP in parallel. But the overhead of launching the procedure for each cover set calculation is heavy and we cannot ignore the significance of the total overhead. In order to mitigate this problem, we propose an extension of the algorithm for searching a hypothesis in Progol, which is one of the most popular ILP systems. Progol uses A*-like algorithm for searching a hypothesis. The algorithm incrementally refines each hypothesis candidate through adding a literal to it, calculating its cover set in order to check whether it satisfies the condition as a hypothesis. In our approach, our algorithm simultaneously performs several refinements with high possibility as a hypothesis. We call it predictive refinement. Even though the refinements may include redundant ones because the same hypothesis may be found earlier, the predictive refinement reduces a lot of overhead cost for launching the procedure of cover set calculation. Thus our algorithm can generate a hypothesis more efficiently than the conventional search algorithm. We have extended Progol to implement the predictive refinement and cover set calculation of generating hypothesis candidates on PostgreSQL. We have successfully demonstrated that our extended Progol works significantly well to obtain practical experimental results.

Keywords: inductive logic programming, cover set, SQL

DOI: 10.54941/ahfe1002934

Cite this paper: