Any suggestions for developing an ad system like Google's AdSense?
To show a best-match ad each time, there are at least these things to do:
- retrieve the main information of the current page
- get an ad that's related to the information retrieved above
But the above is almost impossible for a non-search-engine company.
So what's a practical way for a non-Google company to approach a best-match ad system?
Comments (2)
You basically can't do point 1 in real time -- the time interval is just too short. So you need to analyze beforehand all the pages you're going to be serving ads on, and store that information in a way that it can be rapidly accessed at ad-serving time.
That doesn't necessarily imply "being a search engine company": presumably you're not going to serve ads on billions of different URLs, after all, but only on a far smaller number of URLs that belong to your company or its partners. So you can presumably also get collaboration from the URLs' owners: e.g., you don't need a general spider but can rely on the owners using the sitemaps protocol properly to let you know about new, updated or removed URLs, you can trust each page's keywords, title and headers to provide important info, and so forth.
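To make the "trust each page's keywords, title and headers" idea concrete, here is a minimal sketch of the offline analysis step using only Python's standard library. The field weights, the `PageSignalParser` class and the `extract_keywords` helper are my own illustrative assumptions, not part of any existing ad system; real pages would also need stop-word removal and better tokenization.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical weights: how much a term counts depending on where it appears.
FIELD_WEIGHTS = {"keywords": 3.0, "title": 2.0, "header": 1.0}
HEADER_TAGS = {"h1", "h2", "h3"}


class PageSignalParser(HTMLParser):
    """Collects terms from <title>, <h1>-<h3> and <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.weights = Counter()
        self._field = None  # field whose text is currently being read

    def _add(self, text, field):
        # Naive tokenization: lowercase, treat commas as spaces, split.
        for term in text.lower().replace(",", " ").split():
            self.weights[term] += FIELD_WEIGHTS[field]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            self._add(attrs.get("content") or "", "keywords")
        elif tag == "title":
            self._field = "title"
        elif tag in HEADER_TAGS:
            self._field = "header"

    def handle_endtag(self, tag):
        if tag == "title" or tag in HEADER_TAGS:
            self._field = None

    def handle_data(self, data):
        if self._field:
            self._add(data, self._field)


def extract_keywords(html):
    """Return {term: weight} for one page, computed ahead of serving time."""
    parser = PageSignalParser()
    parser.feed(html)
    parser.close()
    return dict(parser.weights)
```

You would run something like this once per URL whenever the owner's sitemap reports a new or updated page, and store the result for lookup at ad-serving time.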
So with a relatively small number of servers (say a few dozen, maybe in EC2 or another "cloud" service) you can keep an in-memory distributed hash table mapping URLs to (for example) sets of related keywords and weights for the keywords' relative importance, and a similar table for candidate ads. Indeed, if you don't have a "real-time auction" aspect to your system, you might even get away with precomputing a URL-to-ads correspondence (presumably you do want to do some dynamic adjustment, auction-wise or other, but with some reasonable approximation that can be modeled as a simple incremental op on the precomputed correspondence).
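A rough sketch of those two tables and the precomputed URL-to-ads correspondence, assuming plain dicts stand in for the distributed hash table and a simple weighted keyword overlap stands in for the relevance score. All names here (`page_index`, `ad_index`, `precompute_matches`, `pick_ad`, `bid_boost`) and the sample data are hypothetical.

```python
# URL -> {keyword: weight}, built offline. A plain dict stands in for the
# in-memory distributed hash table described above.
page_index = {
    "https://example.com/rome": {"flights": 6.0, "rome": 1.0},
    "https://example.com/pasta": {"pasta": 4.0, "rome": 1.0},
}

# Ad id -> {keyword: weight}, the "similar table for candidate ads".
ad_index = {
    "ad-airline": {"flights": 1.0, "travel": 0.5},
    "ad-cookbook": {"pasta": 1.0, "recipes": 0.5},
    "ad-rome-tours": {"rome": 2.0, "travel": 1.0},
}


def score(page_kw, ad_kw):
    """Weighted overlap of the keywords shared by a page and an ad."""
    return sum(w * ad_kw[k] for k, w in page_kw.items() if k in ad_kw)


def precompute_matches(page_index, ad_index, top_n=3):
    """Offline step: for each URL keep the top_n ads with a nonzero score."""
    matches = {}
    for url, page_kw in page_index.items():
        ranked = sorted(
            ((score(page_kw, ad_kw), ad_id) for ad_id, ad_kw in ad_index.items()),
            reverse=True,
        )
        matches[url] = [(ad_id, s) for s, ad_id in ranked[:top_n] if s > 0]
    return matches


def pick_ad(matches, url, bid_boost=None):
    """Serve-time step: one lookup plus a cheap adjustment.

    bid_boost is a hypothetical {ad_id: multiplier} map standing in for the
    dynamic (auction-like) adjustment on top of the precomputed scores.
    """
    candidates = matches.get(url, [])
    if not candidates:
        return None
    bid_boost = bid_boost or {}
    return max(candidates, key=lambda c: c[1] * bid_boost.get(c[0], 1.0))[0]


matches = precompute_matches(page_index, ad_index)
```

At request time only `pick_ad` runs, so the per-request cost is a hash lookup and a scan over a handful of candidates. The expensive scoring happens in `precompute_matches`, which can be rerun whenever pages or ads change.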
If you do need to scale to serving ads on billions of URLs, then you do need a far more sophisticated approach than can be effectively summarized in an SO answer -- but then, if that's the scale of your ambition, you had better put together an engineering team that's not daunted by the task (and far more than a few dozen servers ;-).
You would need the customer to tell you what their pages are about when they sign up to have advertisements placed on their site. You're also going to need to be very good at JavaScript so you can keep track of how many times an advertisement is viewed. Try looking at the code used by existing ad companies. It's very complicated...