Best way to get a distinct list of phone numbers (without removing original formatting)?
We have a master Person record and one (or more) duplicate Person records, and we are merging their data, prioritising the master over the duplicate(s).
When it comes to phone numbers, the goal is to merge their data, with a single phone number going into the Phone field and any other phone numbers going into a notes field (so as not to discard them completely). Records may or may not contain a phone number.
For neatness we don't want to add to the notes field a bunch of numbers which are basically the same. So we don't want the field to contain:
(1234) 123123
1234 123123
This would be easy if we could just discard the formatting and spaces but we need to retain those (except for white space on the beginning/end).
We started by creating a Structure (not sure why we have a Structure versus a Class, but anyway)
    Friend Structure PhoneNumber
        ' The number exactly as entered (formatting preserved).
        Private _Raw As String
        Public Property Raw() As String
            Get
                Return _Raw
            End Get
            Set(ByVal value As String)
                _Raw = value
            End Set
        End Property

        ' The number with every non-digit character removed.
        Private _Stripped As String
        Public Property Stripped() As String
            Get
                Return _Stripped
            End Get
            Set(ByVal value As String)
                _Stripped = value
            End Set
        End Property

        Sub New(ByVal num As String)
            Raw = num
            Dim RegexObj As New System.Text.RegularExpressions.Regex("[^\d]")
            Stripped = RegexObj.Replace(num, "")
            ' Debug output: shows the raw and stripped values.
            MsgBox(num & vbCrLf & Stripped)
        End Sub
    End Structure
Then, the merge code looks like this:
    Dim phones As New List(Of PhoneNumber)
    If master.Phone.Trim.Length > 1 Then
        phones.Add(New PhoneNumber(master.Phone.Trim))
    End If
    For Each x As Person In duplicates
        If x.Phone.Trim.Length > 1 And Not phones.Contains(New PhoneNumber(x.Phone.Trim)) Then
            phones.Add(New PhoneNumber(x.Phone.Trim))
        End If
    Next
    If phones.Count > 0 Then
        master.Phone = phones(0).Raw
    End If
    For i = 1 To phones.Count - 1
        master.Notes &= vbCrLf & "Alt. Phone: " & phones(i).Raw
    Next
But, obviously, the problem here is that it's allowing the duplicates.
We kind of want the Contains to match on "stripped" values only, but of course it doesn't know to do that.
This already seems like too much code for such a minor feature, but at the moment we're looking at writing something (in the Structure?) that will replace the Contains and match on stripped only. Is there a neater way?
Code is in VB, but C# answers welcome.
Remember too that we have to prioritise the master, so if we use LINQ and Distinct we need to ensure we don't lose the sort order (that's my understanding).
Comments (1)
Figured out a better way to do this was to use a Dictionary. That way we can do without the Structure and use Dictionary lookups on both the Key (the stripped phone number) and the Value (the formatted original). Something like this:
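The answer's own snippet is not reproduced here. As a rough illustration only, a minimal sketch of the Dictionary idea might look like the following. The names StripNonDigits and DistinctPhones and the sample numbers are hypothetical and do not come from the original post. The sketch also keeps a separate List for ordering, because the enumeration order of a Dictionary is not documented as guaranteed and the master's number has to stay first.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module MergePhonesSketch

    ' Hypothetical helper: keeps digits only, so "(1234) 123123" and
    ' "1234 123123" both reduce to the same comparison key.
    Function StripNonDigits(ByVal raw As String) As String
        Return Regex.Replace(raw, "[^\d]", "")
    End Function

    ' Hypothetical helper: returns the distinct numbers in the order given
    ' (pass the master's number first), with the original formatting kept.
    Function DistinctPhones(ByVal candidates As IEnumerable(Of String)) As List(Of String)
        Dim seen As New Dictionary(Of String, String)  ' stripped key -> raw value
        Dim ordered As New List(Of String)             ' preserves priority order
        For Each candidate As String In candidates
            If candidate Is Nothing Then Continue For
            Dim raw As String = candidate.Trim()
            Dim key As String = StripNonDigits(raw)
            ' Skip entries with no digits and numbers already seen.
            If key.Length > 0 AndAlso Not seen.ContainsKey(key) Then
                seen.Add(key, raw)
                ordered.Add(raw)
            End If
        Next
        Return ordered
    End Function

    Sub Main()
        ' Assumed sample data: master's number first, then the duplicates' numbers.
        Dim phones As List(Of String) = DistinctPhones(
            New String() {"(1234) 123123", "1234 123123", " 5678 000111 "})

        If phones.Count > 0 Then
            Console.WriteLine("Phone: " & phones(0))
        End If
        For i As Integer = 1 To phones.Count - 1
            Console.WriteLine("Alt. Phone: " & phones(i))
        Next
    End Sub

End Module
```

In the merge code from the question, the two Console.WriteLine calls would be replaced by the assignments to master.Phone and master.Notes. If the raw values are never looked up by key, a HashSet(Of String) of stripped keys would serve equally well in place of the Dictionary.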