How to add missing lines (URLs containing page numbers) to an array (like seq in Linux)
I have an array consisting of URLs of the form:
$URLs = @("https://somesite.com/folder1/page/1/"
,"https://somesite.com/folder222/page/1/"
,"https://somesite.com/folder222/page/2/"
,"https://somesite.com/folder444/page/1/"
,"https://somesite.com/folder444/page/3/"
,"https://somesite.com/folderBBB/page/1/"
,"https://somesite.com/folderBBB/page/5/")
Each folder always has /page/1/. I need to add (or reconstruct) all missing URLs between page 1 and the highest page, so the array ends up like this:
$URLs = @("https://somesite.com/folder1/page/1/"
,"https://somesite.com/folder222/page/1/"
,"https://somesite.com/folder222/page/2/"
,"https://somesite.com/folder444/page/1/"
,"https://somesite.com/folder444/page/2/"
,"https://somesite.com/folder444/page/3/"
,"https://somesite.com/folderBBB/page/1/"
,"https://somesite.com/folderBBB/page/2/"
,"https://somesite.com/folderBBB/page/3/"
,"https://somesite.com/folderBBB/page/4/"
,"https://somesite.com/folderBBB/page/5/")
I'd imagine the pseudo-code would be something like:
- For each folder, extract the highest page number:
  hxxps://somesite.com/folderBBB/page/5/
- Expand this out from (5) to (1):
  hxxps://somesite.com/folderBBB/page/1/
  hxxps://somesite.com/folderBBB/page/2/
  hxxps://somesite.com/folderBBB/page/3/
  hxxps://somesite.com/folderBBB/page/4/
  hxxps://somesite.com/folderBBB/page/5/
- Output this into an array
Any pointers would be welcome!
Comments (1)
You can use a pipeline-based solution via the Group-Object cmdlet as follows:

    $URLs |
      Group-Object { $_ -replace '[^/]+/$' } | # Group by shared prefix
      ForEach-Object {
        # Extract the start and end number for the group at hand.
        [int] $from, [int] $to =
          ($_.Group[0], $_.Group[-1]) -replace '^.+/([^/]+)/$', '$1'
        # Generate the output URLs.
        # You can assign the entire pipeline to a variable
        # ($generatedUrls = $URLs | ...) to capture them in an array.
        foreach ($i in $from..$to) { $_.Name + $i + '/' }
      }

Note:
- The assumption is that the first and last element in each group of URLs that share the same prefix always contain the start and end point of the desired enumeration, respectively.
- If that assumption doesn't hold, use the following instead of the [int] $from, [int] $to = ... statement:

    $minMax = $_.Group -replace '^.+/([^/]+)/$', '$1' | Measure-Object -Minimum -Maximum
    $from, $to = $minMax.Minimum, $minMax.Maximum

The regex-based -replace operator is used for two things:
- -replace '[^/]+/$' eliminates the last component from each URL, so as to group them by their shared prefix.
- -replace '^.+/([^/]+)/$', '$1' effectively extracts the last component from each given URL, i.e. the numbers that represent the start and end point of the desired enumeration.

Procedural alternative:
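A procedural version could look like the following minimal sketch. It is an illustration rather than the answer's original code: the function name Expand-PageUrls and the ordered-hashtable bookkeeping are assumptions made for this example. It also assumes that every URL ends in /<number>/ and, as the question states, that numbering for each folder starts at page 1.

```powershell
# Hypothetical helper (not from the original answer): for each URL prefix,
# emits every page URL from 1 up to the highest page number seen.
function Expand-PageUrls {
    param([string[]] $Urls)

    # Ordered hashtable: URL prefix (everything before the page number) -> highest page.
    # [ordered] keeps the folders in the order in which they first appear.
    $maxPage = [ordered] @{}

    foreach ($url in $Urls) {
        # Assumption: each URL ends in '/<number>/'; anything else is skipped.
        if ($url -match '^(.+/)(\d+)/$') {
            $prefix = $Matches[1]
            $page = [int] $Matches[2]
            if ((-not $maxPage.Contains($prefix)) -or ($page -gt $maxPage[$prefix])) {
                $maxPage[$prefix] = $page
            }
        }
    }

    foreach ($prefix in $maxPage.Keys) {
        $last = $maxPage[$prefix]
        # Assumption from the question: page 1 always exists, so start at 1.
        foreach ($i in 1..$last) { "$prefix$i/" }
    }
}

# Example use: capture the expanded list in an array.
# $generatedUrls = @(Expand-PageUrls $URLs)
```

Unlike the pipeline solution, this version does not depend on the order of the input URLs, because it tracks the maximum page number per prefix explicitly.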