Removing uploaded files from Google when an item expires

Posted 2024-08-21 23:22:12 · 337 words · 6 views · 0 comments

We're using the Google CSE (Custom Search Engine) paid service to index content on our website. The site is built mostly of PHP pages assembled from include files, but some dynamic pages pull information from a database into a single page template (new releases, for example). The issue is this: I can set an expiry date on content in the database, so that, say, "id=2" brings up a "This content is expired" notice. However, if ID 2 had an uploaded PDF attached to it, the PDF file remains in the search index.

I know I could write a cleanup script and have cron run it: the script would look at the DB, find expired content, check whether any uploaded files were attached, and either rename or remove them. But there has to be a better solution (I hope).
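For reference, the cron-driven cleanup described above might look roughly like the following. This is only a minimal sketch in Python rather than the site's PHP, and it assumes a hypothetical schema that is not given in the question: a `content` table with `id` and `expires_at` (UTC text in `YYYY-MM-DD HH:MM:SS` form) and an `uploads` table with `content_id` and `filename`. The function name `remove_expired_uploads` is likewise made up for illustration.

```python
import os
import sqlite3
from datetime import datetime, timezone


def remove_expired_uploads(conn, upload_dir, now=None):
    """Delete files attached to expired content; return the paths removed.

    Assumes hypothetical tables content(id, expires_at) and
    uploads(content_id, filename); adjust to the real schema.
    """
    now = now or datetime.now(timezone.utc)
    rows = conn.execute(
        "SELECT u.filename FROM uploads u "
        "JOIN content c ON c.id = u.content_id "
        "WHERE c.expires_at <= ?",
        (now.strftime("%Y-%m-%d %H:%M:%S"),),
    ).fetchall()
    removed = []
    for (filename,) in rows:
        # basename() keeps a stored name from pointing outside upload_dir
        path = os.path.join(upload_dir, os.path.basename(filename))
        if os.path.exists(path):
            os.remove(path)
            removed.append(path)
    return removed
```

Run from cron, a script like this would call the function once per pass; renaming instead of deleting would be a one-line change from `os.remove` to `os.rename`.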

Please let me know if you have encountered this in the past, and what you suggest.

Thanks,
D.

Comments (2)

以往的大感动 2024-08-28 23:22:12

Unfortunately, there's no way to give you a precise answer at this time: we don't know how your PDFs are "attached" to your pages or how your DB is structured.

The best solution would be to create a robots.txt file that blocks the URLs of the particular PDF files you want removed. Google should then drop them from the index on its next pass (usually in about an hour).

http://www.robotstxt.org/
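As a rough illustration of the robots.txt approach, the file could be generated from a list of expired PDF paths. The sketch below is in Python; the function name `build_robots_txt` and the example paths are hypothetical, and it only builds the file contents. How quickly the search index reflects the rules depends on the crawler.

```python
def build_robots_txt(expired_paths):
    """Return robots.txt text that disallows each expired PDF path."""
    # Normalize to a single leading slash, drop duplicates, keep a stable order
    paths = sorted({"/" + p.lstrip("/") for p in expired_paths})
    lines = ["User-agent: *"]
    for path in paths:
        # Disallow rules are matched against the URL path on the site
        lines.append("Disallow: " + path)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical upload paths, for illustration only
    print(build_robots_txt(["uploads/release-2.pdf", "/uploads/release-1.pdf"]))
```

The output would then be written to the site's root as `robots.txt`, for example from the same job that marks content as expired.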

压抑⊿情绪 2024-08-28 23:22:12

What we ended up doing was tying a check script to the upload script: once the current upload completed, the old files were "unlinked" and their DB records were deleted.

For us, this works because it's an "add one/remove one" situation: we want a set number of items to appear in a rolling order.
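A check like that might be sketched as follows, in Python rather than PHP. It assumes a hypothetical `uploads` table with an auto-incrementing `id` (so a higher id means a newer upload) and a `filename` column; the function name `prune_old_uploads` and the `keep` parameter are made up for illustration and are not from the answer.

```python
import os
import sqlite3


def prune_old_uploads(conn, upload_dir, keep=5):
    """Keep the `keep` newest uploads; unlink older files and delete their rows.

    Assumes a hypothetical table uploads(id, filename) where a larger id
    means a more recent upload. Returns the filenames that were pruned.
    """
    old = conn.execute(
        # LIMIT -1 means "no limit" in SQLite; OFFSET skips the newest rows
        "SELECT id, filename FROM uploads ORDER BY id DESC LIMIT -1 OFFSET ?",
        (keep,),
    ).fetchall()
    for row_id, filename in old:
        path = os.path.join(upload_dir, os.path.basename(filename))
        if os.path.exists(path):
            os.remove(path)  # the "unlink" step
        conn.execute("DELETE FROM uploads WHERE id = ?", (row_id,))
    conn.commit()
    return [filename for _, filename in old]
```

Calling this at the end of the upload handler gives the rolling behaviour: each new upload pushes the oldest one out once the count exceeds `keep`.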
