Software package Master: A powerful tool for data mining

Master is a software package built on a suite of experimental design, optimization and machine learning methods. It is a comprehensive system that combines orthogonal design, statistical analysis, data visualization, pattern recognition, regression analysis, artificial neural networks (ANN) and support vector machines (SVM). Master has been used successfully to solve problems in three areas: (1) materials design, including the composition design of rare-earth-containing phosphors and of cathode materials for Ni/H batteries, and the optimization of high-temperature superconductors and VPTC ceramic semiconductors; (2) molecular design, including the structure-activity relationships of antagonists, the molecular screening of triazoles, and the structure-property relationships of azo dyestuffs; and (3) industrial optimization, including the nitriding technique used in crankshaft production and springback prediction in sheet metal forming.

Support Vector Machine Applied in Chemistry and Chemical Engineering

The support vector machine (SVM) is especially well suited to finding regularities in small data sets, i.e., data sets with few samples, and yields models with good generalization ability. SVM may therefore become a useful tool in chemistry and chemical engineering, where many tasks require extracting useful information from small training sets.
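As a rough illustration of fitting an SVM to a handful of samples and checking its generalization, the sketch below trains a linear SVM by Pegasos-style subgradient descent on the hinge loss and scores it by leave-one-out cross-validation. The abstract does not say which SVM formulation or kernel was used, so the linear model, the regularization constant `lam`, the epoch count and the eight toy samples are all assumptions made for illustration, not data or settings from the work described.

```python
# Minimal sketch: linear SVM (hinge loss, Pegasos-style updates) evaluated by
# leave-one-out cross-validation on a small, made-up two-feature data set.

def train_linear_svm(samples, labels, lam=0.01, epochs=200):
    """Return weights (last entry is the bias) for labels in {-1, +1}."""
    w = [0.0] * (len(samples[0]) + 1)      # +1 for a constant bias feature
    t = 0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            xa = list(x) + [1.0]           # augment with the bias feature
            margin = y * sum(wi * xi for wi, xi in zip(w, xa))
            shrink = 1.0 - 1.0 / t         # equals 1 - eta * lam
            w = [shrink * wi for wi in w]  # regularization step
            if margin < 1.0:               # sample violates the margin
                w = [wi + eta * y * xi for wi, xi in zip(w, xa)]
    return w

def predict(w, x):
    score = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1

def leave_one_out_accuracy(samples, labels):
    """Train on all samples but one, test on the held-out one, and average."""
    hits = 0
    for i in range(len(samples)):
        w = train_linear_svm(samples[:i] + samples[i + 1:],
                             labels[:i] + labels[i + 1:])
        hits += predict(w, samples[i]) == labels[i]
    return hits / len(samples)

# Hypothetical toy data: two descriptors per sample, class +1 or -1.
X = [(2.0, 1.5), (1.0, 2.5), (3.0, 1.0), (2.5, 2.0),
     (-1.5, -2.0), (-2.0, -1.0), (-1.0, -3.0), (-2.5, -1.5)]
y = [1, 1, 1, 1, -1, -1, -1, -1]

if __name__ == "__main__":
    print("leave-one-out accuracy:", leave_one_out_accuracy(X, y))
```

Leave-one-out validation is a common choice when samples are scarce, because every sample serves once as a test point. Real chemical data would usually also need feature scaling and, often, a nonlinear kernel.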

Computer prediction on formation of ternary intermetallic compounds

Predicting the formation of ternary intermetallic compounds in ternary alloy systems is an important and difficult task in the calculation of ternary alloy phase diagrams. In this work, the regularities of ternary intermetallic compound formation were found by pattern-recognition methods using Villars's system of atomic parameters. The representative points of alloy systems that form a ternary compound and those of systems that do not are distributed in different regions of the multi-dimensional space spanned by the atomic parameters or their functions. A hyperpolyhedron model obtained by pattern-recognition methods describes the boundaries of the compound-forming zone in this space and gives good predictions of ternary intermetallic compound formation. Based on these regularities, the formation of ternary intermetallic compounds can be predicted by computer using the atomic parameter-pattern recognition method.
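The idea of bounding the compound-forming zone can be sketched with the simplest kind of hyperpolyhedron, an axis-aligned box fitted around the known compound-forming systems. The model in the paper is built by pattern-recognition methods and is presumably more elaborate, so this is only a simplified stand-in. The function names and the three-component descriptor vectors are hypothetical and are not Villars's atomic parameters.

```python
# Minimal sketch: an axis-aligned box as a stand-in for a hyperpolyhedron that
# encloses the region where ternary compounds form. All numbers are made up.

def fit_box(points):
    """Return one (low, high) bound per axis enclosing all given points."""
    dims = range(len(points[0]))
    return [(min(p[d] for p in points), max(p[d] for p in points))
            for d in dims]

def inside(box, point):
    """True if the point lies within every per-axis bound of the box."""
    return all(lo <= v <= hi for (lo, hi), v in zip(box, point))

# Hypothetical descriptor vectors of systems known to form a ternary compound.
forming_systems = [(0.2, 1.1, 3.0), (0.4, 1.3, 3.5), (0.3, 0.9, 2.8)]
box = fit_box(forming_systems)

# Predict formation for two hypothetical candidate systems.
candidate_a = (0.3, 1.0, 3.0)   # within all bounds
candidate_b = (0.9, 1.0, 3.0)   # first descriptor exceeds its upper bound

if __name__ == "__main__":
    print("candidate_a predicted to form a compound:", inside(box, candidate_a))
    print("candidate_b predicted to form a compound:", inside(box, candidate_b))
```

A box can only cut parallel to the axes. A general hyperpolyhedron would replace the per-axis bounds with a set of linear inequalities, which allows oblique boundaries between the forming and non-forming regions.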

Application of Data Mining Optimized System (DMOS) in Chemical Industry

Drawing on our many years of experience in industrial optimization, and on the operating model of overseas companies that provide advanced optimal control for the petrochemical industry, we have developed a software series, DMOS. It includes a bank of data mining methods: pattern recognition methods, support vector machines, ANN and linear/nonlinear regression methods are organized into a data-processing flow sheet. DMOS can process a user's data to produce an operational version of the software, which can be used directly for optimal control in factories. The development and application of DMOS is a decisive step toward the commercialization of industrial optimization work in our country.
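The abstract does not describe how DMOS chains its methods, so the following is only one plausible reading of a "data-processing flow sheet": several candidate models are fitted to the same data and the one with the lowest leave-one-out error is kept. The two candidates (a constant-mean model and a one-variable least-squares line), the function names and the data are assumptions chosen to keep the sketch short. They are not DMOS components.

```python
# Minimal sketch: pick the best of several candidate models by leave-one-out
# error. Candidates and data are hypothetical, not taken from DMOS.

def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Ordinary least-squares straight line in one variable."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def loo_error(fit, xs, ys):
    """Mean squared leave-one-out prediction error for one fitting routine."""
    total = 0.0
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (model(xs[i]) - ys[i]) ** 2
    return total / len(xs)

def select_model(candidates, xs, ys):
    """Score every candidate and return the best name plus all scores."""
    scores = {name: loo_error(fit, xs, ys) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical process data: one operating variable and one target.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
candidates = {"mean": fit_mean, "linear": fit_line}
best_name, scores = select_model(candidates, xs, ys)

if __name__ == "__main__":
    print("selected model:", best_name)
```

Further methods, such as an SVM or a neural network, could be added as extra entries in `candidates`, provided each entry returns a callable predictor.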