Lecture Notes in Computer Science (Proceedings of the 11th International Conference on Database and Expert Systems Applications - DEXA'2000), Springer Verlag, Vol. 1873, pp 509-525, London, UK, Sept. 4 - 8, 2000 ------------------------------------------------------------------------- Cost Estimation for Large Queries via Fractional Analysis and Markov Chain in Dynamic Multidatabase Environments Qiang Zhu S. Motheramgari Yu Sun Department of Computer and Information Science The University of Michigan -Dearborn Dearborn, MI 48128, U.S.A. {qzhu,motheram,yusun}@umich.edu Abstract Research on query cost estimation for local database systems in a multidatabase system (MDBS) has attracted many researchers in the database area recently. Obtaining good query cost estimates is crucial for performing effective global query optimization in an MDBS. However, most techniques suggested so far, including database calibrating and query sampling, consider only a static multidatabase environment. Recently, we proposed a qualitative approach to developing cost models for a dynamic multidatabase environment. It has been shown that this approach is promising in estimating the cost of a query run in any given contention state for a dynamic environment. However, a large (cost) query in practice may experience multiple contention states during its execution, which cannot be directly handled by the qualitative approach. In this paper, we propose two new techniques, i.e., fractional analysis and probabilistic approach, to solve the problem. The fractional analysis technique, which is suitable for a system environment that changes contention states gradually and smoothly, estimates a query cost by analyzing its fractions. The probabilistic approach, which is suitable for a system environment that changes contention states rapidly and randomly, utilizes the theory of Markov chains to estimate a query cost. Cost estimation formulas for both approaches are derived. Their properties are studied. Our experimental results demonstrate that the suggested techniques are quite promising in estimating costs for large queries in a dynamic multidatabase environment.