"The Social Network" programming puzzle

Posted 2024-10-31 01:06:46

There's a neat sequence in the movie The Social Network in which the character writes a Perl script to grab images from sorority web servers on campus. His goal is to get a picture of every member of each sorority, missing as few members as possible. Typically, this just involves grabbing the pictures from a public directory or jumping through small hoops, such as an empty search that returns all members, but he describes one really interesting setup and never gives a solution for it.

One sorority's site allows for searching and returns the pictures for matching members. However, if a search returns more than 20 matches, nothing is displayed.

Assuming no other way to access the pictures and without a list of the names of sorority members, is there an elegant way to get at least a majority of member pictures in this case? Or any way at all?

Edit: Here's a link to the scene from the movie, slightly cut up to show only the coding parts.

Comments (5)

别再吹冷风 2024-11-07 01:06:46

Pick up the phone, ask for a campus directory and feed those names into the sorority's search to get members back one at a time.

This is, after all, the social network.

兔小萌 2024-11-07 01:06:46

I haven't seen the movie, but let me state some assumptions:

  • You have a search field which searches by name.
  • You don't know the names of any sorority members.
  • There is no other way to access the pictures (except the search box).

In this scenario, I do not think there is an elegant answer. This may be one of those cinematic "I translate the ancient tablet in an unknown language" sort of moments. My guess is your best bet would be a brute-force search.

  1. If you search by common names (first and last), you could get a majority of the members.
  2. If you have the time and the will, an actual brute force (letter by letter, etc.) would eventually fill in the gaps left by item 1.

Edit:
Also, in theory, if the sorority is large enough and everyone is named "Jane Smith", there would be no solution.
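The degenerate case can be checked with a small sketch. Everything here is made up for illustration: the roster, the `search` helper, and the assumption that the site does a case-insensitive substring match on the name.

```python
LIMIT = 20  # the site shows nothing when more than LIMIT members match


def search(roster, term):
    """Hypothetical stand-in for the site: substring match on the name."""
    hits = [member for member in roster if term.lower() in member[1].lower()]
    return hits if len(hits) <= LIMIT else []


# 21 made-up members with distinct ids but the same name.
roster = [(i, "Jane Smith") for i in range(21)]

# Every query that matches one of them matches all 21, so try every
# substring of the shared name and keep the ones that return anything.
name = "jane smith"
substrings = {name[i:j] for i in range(len(name)) for j in range(i + 1, len(name) + 1)}
visible = [q for q in substrings if search(roster, q)]
```

Under these assumptions `visible` stays empty, because any term either matches nobody or matches all 21 members at once. With 20 members of the same name the search would still show them.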

自控 2024-11-07 01:06:46
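import string

ALPHABET = string.ascii_lowercase
ABORT_SIZE = 4  # example value; stop extending prefixes at this length

def findall(prefix):
    # search() is the site query, assumed to return a set of matches that is
    # empty when nothing matches or when more than 20 members match.
    res = set()
    for char in ALPHABET:
        sresults = search(prefix + char)
        if len(sresults) == 0 and len(prefix) < ABORT_SIZE:
            # Empty may mean "too many matches", so try longer prefixes.
            res |= findall(prefix + char)
        else:
            res |= sresults
    return res

findall("")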

Note that this will take a long time if names are not roughly evenly distributed: if more than 20 people match "abc" and one matches "abd", it will enumerate every extension of "ab", including all the prefixes that match nobody, up to ABORT_SIZE. You can adjust ABORT_SIZE to balance time against completeness. It should be higher than the logarithm of n to base |alphabet|, where n is the (unknown) number of final results.
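The recursive prefix search above can be tried offline against a mock site. This is only a sketch under stated assumptions: `make_search`, the roster, and `max_len` (the longest prefix queried, playing the role of ABORT_SIZE) are invented here, and the real site may not match on name prefixes at all.

```python
import string

LIMIT = 20  # the site shows nothing when more than LIMIT members match


def make_search(roster):
    """Build a hypothetical stand-in for the site's search box.

    Assumes the search matches on name prefix, which the thread does not state.
    """
    def search(prefix):
        hits = {name for name in roster if name.startswith(prefix)}
        return hits if len(hits) <= LIMIT else set()
    return search


def findall(search, prefix="", max_len=4, alphabet=string.ascii_lowercase):
    """Collect every member reachable with prefixes of at most max_len letters."""
    found = set()
    for char in alphabet:
        hits = search(prefix + char)
        if not hits and len(prefix) + 1 < max_len:
            # Empty is ambiguous (no match or too many), so try longer prefixes.
            found |= findall(search, prefix + char, max_len, alphabet)
        else:
            found |= hits
    return found


# 60 made-up members: 50 share the prefix "an", 10 are spread over other letters.
roster = {"an" + a + b for a in "abcde" for b in "abcdefghij"}
roster |= {c + "ella" for c in "bcdfghjklm"}
search = make_search(roster)

shallow = findall(search, max_len=2)  # too shallow to split the "an" cluster
deep = findall(search, max_len=3)     # three letters are enough here
```

With this roster, `max_len=2` recovers only the 10 evenly spread names, while `max_len=3` recovers all 60. The price is that every empty prefix is also expanded, which is the time cost described above.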

我只土不豪 2024-11-07 01:06:46

I guess you could use a kind of "dictionary attack". Did you know that you can download lists of last and/or first names from the U.S. Census Bureau? My other option was what @phihag proposed.
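A minimal sketch of that dictionary approach, run against a mock site rather than a real one. The roster, the candidate list, and both helper functions are hypothetical. A real run would load a much larger list of first and last names from a file.

```python
LIMIT = 20  # the site shows nothing when more than LIMIT members match


def make_search(roster):
    """Hypothetical stand-in for the site: case-insensitive substring match."""
    def search(term):
        hits = {name for name in roster if term.lower() in name.lower()}
        return hits if len(hits) <= LIMIT else set()
    return search


def dictionary_attack(search, candidates):
    """Query each candidate once; map each member to the first term that found her."""
    found = {}
    for term in candidates:
        for member in search(term):
            found.setdefault(member, term)
    return found


# Made-up roster and a tiny candidate list, for illustration only.
roster = {"Alice Smith", "Alice Jones", "Maria Garcia", "Zofia Nowak"}
candidates = ["Alice", "Maria", "Smith", "Emily"]

found = dictionary_attack(make_search(roster), candidates)
missed = roster - set(found)  # members whose names are not in the list
```

The `missed` set shows the weakness of the approach: anyone whose first and last names are both absent from the candidate list is never found, which is where a letter-by-letter search could fill in.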

违心° 2024-11-07 01:06:46

I've got the book "The Accidental Billionaires" here and it quotes directly from his LiveJournal, and ... that didn't happen. You do know that the guy who wrote the script knows absolutely nothing about code, right?

From what I'm reading, he used something from LWP to spider 'Lowell' and 'Adams', then he got to 'Dunster', which is the one with the 20-results problem, and he writes "I'll come back later". So for all I can tell, he may have just given up.

He also writes that "It's taking a few tries to compile the script".
