检测和存储路径组合以供稍后分析的最佳方法

发布于 2024-09-30 12:42:48 字数 957 浏览 6 评论 0原文

我正在寻找有关如何存储用户路径模式的想法/示例 - 目标是分析他们的行为,并在我们能够以某种方式检测到它们时对“最常用的路径”进行优化。

例如。他们在做什么之后执行哪个操作,以便我们稍后可以检查某些操作是否一遍又一遍地执行 - 因此开发一个快捷方式或将某些操作组装成一个组合的多重操作。

我的第一个猜测是某种“简单日志”,可能以某种 SQL 方式存储,我们可以将每个操作保留为索引,然后记录所有内容。

问题是路径/操作可能会动态更改 - 即使在记录时 - 因此我们在稍后查找模式时也需要能够照顾到这一事实。

您会先记录所有“重要”内容,然后在一段时间后对每一个细节进行后期处理,还是您对其他策略有丰富的经验?

我担心的是,这会占用大量空间,同时每天记录 1000 个用户,持续一个月或更长时间。

希望这是有道理的,我很好奇是否有人可以提供示例代码、伪代码或者有用的链接。

我们的工具将是 C#、SQL 数据库、XML 和 .NET 3.5 - 如果需要,客户还可以获得 .NET 4.0。

按照我们期望的模式示例

...
User #1001: A-B-A-A-A-B-C-E-F-G-H-A-A-A-C-B-A
User #1002: B-A-A-B-C-E-F
User #1003: F-B-B-A-E-C-A-A-A   
User #1002: C-E-F
...

等。没有真正的方法知道他们下一步要做什么,也不知道他们会使用多少个,他们会多久做一次。

第二个目标,如果可能的话,如果我们稍后添加一个名为 G 的新“动作”(只是示例来说明,将有数百个动作),我们如何检测这些新行为对先前模式的影响。

为了更好地解释它,我的想法是用某种方法来检测“模式中的模式”,有点像压缩的工作原理,以便“重复模式”是斑点的。我们不知道这些模式可能持续多长时间,也不知道它们出现的频率。我们如何将其分解为“小片段”——您认为最好的方法是什么?

I am searching for ideas/examples on how to store path patterns from users - with the goal of analysing their behaviours and optimizing on "most used path" when we can detect them somehow.

Eg. which action do they do after what, so that we later on can check to see if certain actions are done over and over again - therefore developing a shortcut or assembling some of the actions into a combined multiaction.

My first guess would be some sort of "simple log", perhaps stored in some SQL-manner, where we can keep each action as an index and then just record everything.

Problem is that the path/action might be dynamically changed - even while logging - so we need to be able to take care of this fact too, when looking for patterns later.

Would you log everthing "bigtime" first and then POST-process every bit of details after some time or do you have great experience with other tactics?

My worry is that this is going to take up space, BIG TIME while logging 1000 users each day for a month or more.

Hope this makes sense and I am curious to see if anyone can provide sample code, pseudocode or perhaps links to something usefull.

Our tools will be C#, SQL-database, XML and .NET 3.5 - clients could also get .NET 4.0 if needed.

Patterns examples as we expect them

...
User #1001: A-B-A-A-A-B-C-E-F-G-H-A-A-A-C-B-A
User #1002: B-A-A-B-C-E-F
User #1003: F-B-B-A-E-C-A-A-A   
User #1002: C-E-F
...

etc. no real way to know what they do next nor how many they will use, how often they will do it.

A secondary goal, if possible, if we later on add a new "action" called G (just sample to illustrate, there will be hundreds of actions) how could we detect these new behaviours influence on the previous patterns.

To explain it better, my thought here would be some way to detect "patterns within patterns", sort of like how compressions work, so that "repeative patterns" are spottet. We dont know how long these patterns might be, nor how often they might come. How do we break this down into "small bits and pieces" - whats the best approach you think?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

撑一把青伞 2024-10-07 12:42:48

我不确定你所说的路径是什么意思,但是,如果你给路径中的每个动作一个唯一的符号,你可以将问题减少到最长公共子串或子序列。

或者有一张包含该操作发生次数的路径图。每次发生特定路径时,都会增加该路径的计数。然后排序以找到最常见的。

I am not sure what you mean by path, but, if you gave every action in a path a unique symbol, you could reduce the problem to longest common substring or subsequence.

Or have a map of paths to the number of times that action occurred. Every time a certain path happens, increment the count for that path. Then sort to find the most common.

云裳 2024-10-07 12:42:48

到目前为止的伪想法/实现

  1. 将用户操作记录到一个列表/一系列操作中,批量有点风格(文本文件/SQL - 不管怎样,只需存储整个内容以进行后处理)< /p>

  2. 开始计算每个“1 个操作”、“2 个操作” ”、“3 个动作”直到一定数量(假设 30 个级别)

  3. < p>通过为某些操作(可能是产生最终结果的操作)提供重要值来对它们进行排序

也许是一个有用的结果?

如果我们计算所有[A]、[AA]、[AB]、[AC]、[AAA]、[AAB]等,它会形成一个又长又好的列表其中的操作经常在行中使用,这是正确的方向,因为如果其中一些结果变得太高,我们可能需要更短的路径。那么问题是,什么动作太少需要优化,什么是需要搜索的最长的动作列表?我的猜测是我们需要先进行计数,然后检查数字。

问题是,这将是我们正在开发的分析工具的一部分,并且在实施之前我们没有数据,因此我们不知道在实际完成之前要寻找什么。嗯...想知道这个问题是否真的有答案。

Pseudo idea/implementation so far

  1. Log ever users action into a list/series of actions, bulk kinda style (textfiles/SQL - what ever, just store the whole thing for post-processing)

  2. start counting every "1 action", "2 actions", "3 actions" up til a certain amount (lets say 30 levels)

  3. sort them all, by giving values of importants to some of the actions (might be those producing end results)

A usefull result perhaps?

If we count all [A], [A-A], [A-B], [A-C], [A-A-A], [A-A-B] etc. its going to make a LONG and fine list of which actions are used in row frequently, and thats in the right direction, because if some of these results gets too high, we might need a shorter path. Problem is then, whats too few actions to be optimized and whats the longest needed actionlist to search for? My guess is that we need to do this counting first, then examine the numbers.

Problem is that this would be part of an analyzing tool we are developing and we dont have data until implementation, so we dont know what to look for before its actually done. hmm... wondering if there really IS an answer to this one.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文