ŞEHİR e-arşiv

Estimating the selectivity of LIKE queries using pattern-based histograms

dc.contributor.author Aytimur, Mehmet
dc.contributor.author Çakmak, Ali
dc.date.accessioned 2019-01-23T07:11:32Z
dc.date.available 2019-01-23T07:11:32Z
dc.date.issued 2018
dc.identifier.citation Aytimur, Mehmet; Çakmak, Ali. (2018). Estimating the selectivity of LIKE queries using pattern-based histograms. Turkish Journal of Electrical Engineering and Computer Sciences, 26(6), pp. 3319-3334. en_US
dc.identifier.issn 1300-0632
dc.identifier.uri http://hdl.handle.net/11498/55992
dc.description.abstract Accurate estimation of query cost and execution time is one of the major success indicators for database management systems. SQL allows the expression of flexible queries on text-formatted data. The LIKE operator is used to search for a specified pattern (e.g., LIKE "luck%") in a string database. The query optimizer must estimate the selectivity of such flexible predicates in order to choose an efficient execution plan. In this paper, we study the problem of estimating the selectivity of a LIKE query predicate over a bag of strings. We propose a new type of pattern-based histogram structure to summarize the data distribution in a particular column. More specifically, we first mine sequential patterns over a given string database and then construct a special histogram out of the mined patterns. At query optimization time, pattern-based histograms are exploited to estimate the selectivity of a LIKE predicate. The experimental results on a real dataset from DBLP show that the proposed technique outperforms the state of the art for generic LIKE queries of the form %s_1%s_2%...%s_n%, where each s_i represents one or more characters. Moreover, the proposed histogram structure requires more than two orders of magnitude less memory, and its estimation time is almost an order of magnitude lower than that of the state of the art. en_US
dc.language.iso eng en_US
dc.publisher TÜBİTAK en_US
dc.relation.isversionof 10.3906/elk-1806-96 en_US
dc.rights info:eu-repo/semantics/embargoedAccess en_US
dc.subject Histograms en_US
dc.subject Data Management en_US
dc.subject Sequence Mining en_US
dc.subject Sequences (Mathematics) en_US
dc.subject Mathematical Optimization en_US
dc.subject Histogramlar en_US
dc.subject Veri Yönetimi en_US
dc.subject Dizi Madenciliği en_US
dc.subject Dizi (Matematik) en_US
dc.subject Matematiksel Optimizasyon en_US
dc.title Estimating the selectivity of LIKE queries using pattern-based histograms en_US
dc.type Article en_US
dc.contributor.authorID 31867 en_US
dc.relation.journal Turkish Journal of Electrical Engineering and Computer Sciences en_US
dc.contributor.department İstanbul Şehir University. College of Engineering and Natural Sciences. Department of Computer Science. en_US
dc.identifier.volume 26 en_US
dc.identifier.issue 6 en_US
dc.identifier.endpage 3334 en_US
dc.identifier.startpage 3319 en_US
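
The abstract refers to the selectivity of a LIKE predicate over a bag of strings. As a point of reference only, the sketch below computes that quantity exactly by scanning every string, which is the value an estimator such as the pattern-based histogram is meant to approximate without a scan. It is not the authors' method. The function names (like_to_regex, exact_selectivity) and the sample titles are hypothetical, and the sketch assumes case-sensitive matching with no ESCAPE clause, which may differ from a given DBMS.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern into a compiled regular expression.

    Assumption: '%' matches any run of characters (including none),
    '_' matches exactly one character, and there is no ESCAPE clause.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    # DOTALL so that '%' and '_' also match newline characters.
    return re.compile("".join(parts), re.DOTALL)


def exact_selectivity(strings, pattern: str) -> float:
    """Fraction of strings in the bag that satisfy the LIKE pattern."""
    if not strings:
        return 0.0
    rx = like_to_regex(pattern)
    matches = sum(1 for s in strings if rx.fullmatch(s))
    return matches / len(strings)


# Hypothetical sample data, not taken from the DBLP dataset.
titles = ["lucky strike", "good luck", "luck", "database systems"]
print(exact_selectivity(titles, "luck%"))    # 0.5
print(exact_selectivity(titles, "%luck%"))   # 0.75
```

The exact scan costs time proportional to the size of the column, which is why an optimizer would rely on a compact summary structure instead.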

