dc.contributor.advisor |
Bulut, Ahmet |
|
dc.contributor.author |
Nabi-Abdolyousefi, Razieh |
|
dc.date.accessioned |
2015-01-20T13:46:31Z |
|
dc.date.available |
2015-01-20T13:46:31Z |
|
dc.date.issued |
2015-01-20 |
|
dc.date.submitted |
2015-01-06 |
|
dc.identifier.other |
000110157002 |
|
dc.identifier.uri |
http://hdl.handle.net/11498/3024 |
|
dc.description.abstract |
Search engines hold online auctions among search advertisers who are bidding for the
advertisement slots in the search engine results pages. Search engines employ a
pay-per-click model in which advertisers are charged whenever their ads are clicked by users.
If a user clicks on an ad and then takes a particular action, which the corresponding
advertiser has defined as valuable to her business, such as an online purchase, or signing
up for a newsletter, or a phone call, then the user’s action is counted as a conversion. A
naive estimate of the conversion rate (CR) of an ad is the average number of conversions
per click. The average number of clicks and the average position of the ad also affect its
conversion rate. However, all such ad statistics are heuristics at best. The challenge here
is that no performance statistics have accrued for newly created ads. In order
to get any kind of performance data, new ads have to be advertised first and precious
marketing dollars have to be spent. If CR estimates are precise, then advertisers can
manage their campaigns more effectively and obtain a better return on their
investments. Alternatively, one can use the available data for the existing ads and engineer
a set of features that best characterize conversions for an advertisement campaign in
general. We took the second approach and used probabilistic inference for extracting
text features. Using these text features, we built a prediction model to estimate the
true CRs of unknown ads. Our experimental results demonstrated that such text features
improved the accuracy of our predictions. Furthermore, hybrid models that combine
text and numeric features achieved superior predictive power compared to models using only
text features or only numeric features. |
en_US |
dc.description.tableofcontents |
Contents
Declaration of Authorship
Abstract
Öz
Acknowledgments
List of Figures
List of Tables
1 Introduction
1.1 Online Advertising Overview
1.2 Goal and Motivations
1.3 Outline
2 Literature Review
3 Probabilistic Models
3.1 Topic Models and LDA
3.1.1 Introduction to LDA
3.1.2 Graphical Representation of LDA
3.1.2.1 Bayesian Networks
3.1.2.2 Dirichlet Distribution
3.1.2.3 LDA Posteriori
3.1.3 Approximation of Posterior Inference
3.1.3.1 Markov chain Monte Carlo (MCMC)
3.1.3.2 LDA Posterior Approximation
4 Methodology
4.1 Problem Statement
4.2 Feature Extraction
4.2.1 Numeric Features
4.2.2 Average CR of Keywords in Same Ad-group
4.2.3 Average CR of Keywords in Same Campaign
4.2.4 Match Types
4.2.5 Non-linear Features
4.2.6 Topic based Features
4.3 Model Construction
5 Experiments
5.1 Experiment Settings
5.1.1 Data Description
5.1.2 Pre-processing Data
5.1.2.1 Cleaning Data
5.1.2.2 Scaling Attributes
5.1.2.3 Training, Cross Validation, and Testing
5.1.3 Evaluation Rules
5.1.3.1 Evaluation Protocols
5.1.3.2 Evaluation Metrics
5.2 Experimental Results
6 Conclusion
A Terms and Evaluation Metrics
A.1 Evaluation Metrics Used in Ranked Results
A.2 Evaluation Metrics Used in Regression Predictions
Bibliography |
en_US |
dc.language.iso |
eng |
en_US |
dc.rights |
info:eu-repo/semantics/embargoedAccess |
en_US |
dc.subject |
Internet Advertising |
en_US |
dc.subject |
İnternet Reklamcılığı |
en_US |
dc.title |
Conversion rate prediction in search engine marketing |
en_US |
dc.type |
Thesis |
en_US |
dc.contributor.department |
The Graduate School of Natural and Applied Sciences of Istanbul Sehir University |
en_US |