Compare keywords in a page or CSV file: PHP? Bash?

Posted on 2024-11-18 03:20:57

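I have a series of keywords in an HTML web page. They are comma separated, so I could convert them to CSV, and I would like to know which keywords are NOT in another CSV file that is displayed as an HTML web page. How would you do that comparison? I have ideas with MySQL and tables, but this is CSV or HTML source. Thanks!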


二智少女猫性小仙女 2024-11-25 03:20:57

In Python, given two CSV files, a.csv and b.csv, this script creates a new file out.csv (overwriting it if it already exists) that contains every keyword in a.csv that is not found in b.csv.

from urllib.request import urlretrieve

# Download the remote keyword list and save it locally as b.csv
url = 'http://www.website.com/x.csv'
urlretrieve(url, 'b.csv')

# Read both comma-separated lists, trimming whitespace around each keyword
with open('a.csv', 'r') as file_a:
    list_a = [x.strip() for x in file_a.read().split(',')]
with open('b.csv', 'r') as file_b:
    list_b = [x.strip() for x in file_b.read().split(',')]

# Keywords in a.csv but not in b.csv; swap the operands for the reverse
list_out = sorted(set(list_a) - set(list_b))

with open('out.csv', 'w') as file_out:
    file_out.write(','.join(list_out))
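
A set difference does not keep the keywords in the order they had in a.csv, and it treats "Apple" and "apple" as different keywords. If either matters, a minimal sketch along the following lines may help. The function name `keywords_missing_from` and its `ignore_case` flag are made up for this illustration and are not part of the answer above.

```python
def keywords_missing_from(source, reference, ignore_case=False):
    """Return keywords of `source` absent from `reference`, in source order."""
    def normalize(keyword):
        keyword = keyword.strip()
        return keyword.lower() if ignore_case else keyword

    # Keywords already known, in normalized form
    seen = {normalize(k) for k in reference.split(',')}
    missing = []
    for raw in source.split(','):
        key = normalize(raw)
        if key and key not in seen:
            missing.append(raw.strip())
            seen.add(key)  # also drops repeats within `source`
    return missing


print(keywords_missing_from('pear, Apple, fig, pear', 'apple, fig', ignore_case=True))
```

The two arguments are the raw comma-separated strings, so the contents of a.csv and b.csv can be passed in directly after reading them.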


没有你我更好 2024-11-25 03:20:57

If it is just a list of keywords, do a search and replace (you can use sed) that turns every comma into a newline, so that you end up with a file containing one keyword per line. Do that to both versions of the list. Then use the "join" command:

join -v 1 leftfile rightfile

This reports all the entries in leftfile that are not in rightfile. Don't forget to sort both files first, or join won't work. There is a command-line tool for sorting too (not surprisingly, it is called "sort"). Note that join matches on the first whitespace-separated field by default, so this works as written only for single-word keywords.
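
To show why both inputs have to be sorted, here is a rough Python sketch of the same idea: split into one keyword per entry, sort, then walk both lists in step. It is an illustration of the approach, not a reimplementation of join, and the helper names `to_sorted_lines` and `left_only` are invented for this example. It compares whole keywords, so multi-word keywords are handled.

```python
def to_sorted_lines(csv_text):
    """Split a comma-separated string into trimmed, sorted keywords."""
    return sorted(k.strip() for k in csv_text.split(',') if k.strip())


def left_only(left, right):
    """Return entries of sorted list `left` that never appear in sorted list `right`."""
    result = []
    j = 0
    for item in left:
        # Advance the right-hand cursor past everything smaller than item.
        # This only works because both lists are sorted.
        while j < len(right) and right[j] < item:
            j += 1
        if j == len(right) or right[j] != item:
            result.append(item)
    return result


left = to_sorted_lines('pear, apple, red apple, fig')
right = to_sorted_lines('fig, apple')
print('\n'.join(left_only(left, right)))
```

Because the right-hand cursor only ever moves forward, an unsorted input would make it skip past matches, which is the same reason join needs sorted files.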


与他有关 2024-11-25 03:20:57

PHP solution.
Get the keywords as strings, convert them to arrays, and use the array_diff function:

<?php
$csv1 = 'a1, a2, a3, a4';
$csv2 = 'a1, a4';

// Split on commas and trim the whitespace around each keyword
$csv1_arr = array_map('trim', explode(',', $csv1));
$csv2_arr = array_map('trim', explode(',', $csv2));

// Values of $csv1_arr that do not appear in $csv2_arr (original keys are kept)
$diff = array_diff($csv1_arr, $csv2_arr);
print_r($diff);
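
The snippet above starts from keyword strings that are already in hand. If the keywords still sit inside HTML source, as in the question, the markup has to be stripped first. The following is a minimal sketch of that step, written in Python with the standard library's `html.parser`. It assumes the visible text of the page consists only of the comma-separated keywords; the names `TextExtractor` and `keywords_from_html` are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring script and style."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ('script', 'style'):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ('script', 'style') and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def keywords_from_html(html):
    """Return the comma-separated keywords found in the text of `html`."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps text from adjacent elements apart,
    # at the cost of splitting a word that is broken up by inline tags.
    text = ' '.join(parser.chunks)
    # Collapse runs of whitespace inside each keyword and drop empty entries
    return [' '.join(k.split()) for k in text.split(',') if k.strip()]


print(keywords_from_html('<p>a1, a2, <b>a3</b>, a4</p>'))
```

The resulting list can be joined with commas again, or compared directly with a second list obtained the same way.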
