打乱 SQL Server 中的列?

发布于 2024-08-10 11:01:43 字数 164 浏览 8 评论 0原文

我们有一个网络应用程序,想向潜在客户演示,但我们最好的方法是使用现有数据,以获得完整的体验。当然,我们不想使用应用程序中可见的实际客户名称或地址等来执行此操作。 SQL Server 中有没有一种简单的方法来随机化或打乱 varchar 或文本字段?

无论如何,这些列都不是键,无论是主键还是外键。

We have a web app we'd like to demo to prospects, but our best way of doing so is with existing data, for a full experience. Certainly, we don't want to do this with actual customer names or addresses, etc visible in the app. Is there an easy way in SQL Server to randomize or scramble a varchar or text field?

None of these columns are keys in anyway, either primary or foreign.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

方觉久 2024-08-17 11:01:43

这是一个迟到的答案,但我对有关此事的任何互联网搜索都不满意。下面是一个示例,它将打乱客户表中的名字和姓氏以创建新名称:

--Replace Customers with your table name
select * from Customers

--Be sure int match your id column datatype
Declare @id int

--Add a WHERE here to select just a subset of your table
DECLARE mycursor CURSOR FOR SELECT id FROM Customers
OPEN mycursor
FETCH NEXT FROM mycursor INTO @id;

WHILE (@@FETCH_STATUS = 0) 
BEGIN
    --We loop
    --Warning: NEWID() is generated once per query, so we update the fullname in two queries.
    UPDATE Customers
        SET FirstName = (SELECT TOP 1 FirstName FROM Customers ORDER BY NEWID())
    WHERE id = @id
    UPDATE Customers
        SET LastName = (SELECT TOP 1 LastName FROM Customers ORDER BY NEWID())
    WHERE id = @id
    FETCH NEXT FROM mycursor INTO @id;
END

CLOSE mycursor;
DEALLOCATE mycursor;

select * from Customers

This is a late answer but I was not satisfied with any internet search on this matter. Here's an example that will shuffle the firstname and lastname in a customers table to create new names:

--Replace Customers with your table name
select * from Customers

--Be sure int match your id column datatype
Declare @id int

--Add a WHERE here to select just a subset of your table
DECLARE mycursor CURSOR FOR SELECT id FROM Customers
OPEN mycursor
FETCH NEXT FROM mycursor INTO @id;

WHILE (@@FETCH_STATUS = 0) 
BEGIN
    --We loop
    --Warning: NEWID() is generated once per query, so we update the fullname in two queries.
    UPDATE Customers
        SET FirstName = (SELECT TOP 1 FirstName FROM Customers ORDER BY NEWID())
    WHERE id = @id
    UPDATE Customers
        SET LastName = (SELECT TOP 1 LastName FROM Customers ORDER BY NEWID())
    WHERE id = @id
    FETCH NEXT FROM mycursor INTO @id;
END

CLOSE mycursor;
DEALLOCATE mycursor;

select * from Customers
蓝眸 2024-08-17 11:01:43

Redgate 有相应的工具: http://www.red-gate.com/ products/SQL_Data_Generator/index.htm

没用过,但是redgate工具非常好。

编辑

它生成数据,而不是打乱,但仍然有用。

Redgate has tool for it: http://www.red-gate.com/products/SQL_Data_Generator/index.htm

Didn't use it, but redgate tools are very good.

EDIT

It generates data, not scrambles, but still can be useful.

↘紸啶 2024-08-17 11:01:43

我通过更改字段内的字母来扰乱数据一次。所以,如果你有一个名字“Mike Smith”,并且你把所有的 i 改为 o,m 改为 l,e 改为 a,s 改为 t,t 改为 rr,你最终得到的结果

Moke Smoth
Loke Sloth
Loka Sloth
Loka Tloth
Loka Rrlorrh

足以使名称不可读,而且您也无法返回并确定它是什么(我更改了一些已经更改为字母的字母。)但是,它仍然是可读的。

I scrambled data once by changing letters within the field. So, if you have a name "Mike Smith" and you change all the i's to o's, the m's to l's, the e's to a's, the s's to t's, the t's to rr's, you end up with

Moke Smoth
Loke Sloth
Loka Sloth
Loka Tloth
Loka Rrlorrh

which was enough to make the name unreadable, and also you can't go back and determine what it was (I changed some letters which had already had letters changed to them.) But, it's still kind of readable.

甜柠檬 2024-08-17 11:01:43

不可能将数据保留在表中并以某种方式仅以混乱的形式显示。

您的选择是通过以某种方式打乱数据来替换数据、生成具有相同通用形式的新数据、编写一个函数(CLR 或 T-SQL)将其打乱作为您使用的查询的一部分,或者加密数据,在这种情况下,只有当用户也有适当的解密密钥时才能显示。

如果您决定替换数据,除了前面提到的 Red Gate 工具之外,您还可以考虑使用 Visual Studio Team Database 附带的数据生成器,或者可能是 Integration Services。如果您想从更复杂的转换中受益,后者可能特别有用。

It's not possible to just leave your data in the tables and somehow only display it in a scrambled form.

Your options are to either replace the data by scrambling it in some way, generate new data that's in the same general form, write a function (CLR or T-SQL) that scrambles it as part of the queries you use, or encrypt the data, in which case it can only be displayed if the user also has the appropriate decryption key.

If you decide to replace the data, in addition to the Red Gate tool previously mentioned, you might also consider using the data generator that comes with Visual Studio Team Database, or perhaps Integration Services. The latter can be particularly useful if you would benefit from a more complex transformation.

柠檬心 2024-08-17 11:01:43

dbForge 有一个免费的数据生成工具:http://www.devart.com/dbforge /sql/数据生成器/

dbForge has a free tool for data generation: http://www.devart.com/dbforge/sql/data-generator/

星光不落少年眉 2024-08-17 11:01:43

这里有几个简单的方法,它们具有相当好的性能并且可以应用于表:

use master;

declare @length as  int = 50;   --acts as maximum length for random length expressions
declare @rows   as  int = 10;

SELECT 
    CONVERT( VARCHAR(max), crypt_gen_random( @length     ))                                 as  FixedLengthText
,   CONVERT(NVARCHAR(max), crypt_gen_random( @length * 2 ))                                 as  FixedLengthUnicode
,   (   select crypt_gen_random((@length/8*6)) 
        where value."type" is not null  --refer to outer query, to get different value for each row
        FOR XML PATH(''))                                                                   as  FixedLengthBase64
,   CONVERT( VARCHAR(max), crypt_gen_random( (ABS(CHECKSUM(NewId())) %  @length     )+1 ))  as  RandomLengthText
,   CONVERT(NVARCHAR(max), crypt_gen_random( (ABS(CHECKSUM(NewId())) % (@length * 2))+1 ))  as  RandomLengthUnicode
,   (   select crypt_gen_random( ( (ABS(CHECKSUM(NewId())) % @length )+1 )/8*6 )
        where value."type" is not null  --refer to outer query, to get different value for each row
        FOR XML PATH(''))                                                                   as  RandomLengthBase64
FROM  dbo.spt_values  AS  value
WHERE   value."type" = 'P'  --Limit "number" to integers between 0-2047
    and value.number <= @rows
;

Here are couple simple methods that have quite nice performance and can be applied to a table:

use master;

declare @length as  int = 50;   --acts as maximum length for random length expressions
declare @rows   as  int = 10;

SELECT 
    CONVERT( VARCHAR(max), crypt_gen_random( @length     ))                                 as  FixedLengthText
,   CONVERT(NVARCHAR(max), crypt_gen_random( @length * 2 ))                                 as  FixedLengthUnicode
,   (   select crypt_gen_random((@length/8*6)) 
        where value."type" is not null  --refer to outer query, to get different value for each row
        FOR XML PATH(''))                                                                   as  FixedLengthBase64
,   CONVERT( VARCHAR(max), crypt_gen_random( (ABS(CHECKSUM(NewId())) %  @length     )+1 ))  as  RandomLengthText
,   CONVERT(NVARCHAR(max), crypt_gen_random( (ABS(CHECKSUM(NewId())) % (@length * 2))+1 ))  as  RandomLengthUnicode
,   (   select crypt_gen_random( ( (ABS(CHECKSUM(NewId())) % @length )+1 )/8*6 )
        where value."type" is not null  --refer to outer query, to get different value for each row
        FOR XML PATH(''))                                                                   as  RandomLengthBase64
FROM  dbo.spt_values  AS  value
WHERE   value."type" = 'P'  --Limit "number" to integers between 0-2047
    and value.number <= @rows
;
不打扰别人 2024-08-17 11:01:43

您可以创建需要更新的列的列表,然后简单地迭代该列表并执行一些动态 sql,以某种方式更新该行。我做了一个相当基本的打乱函数,它只会对数据进行 sha1(使用随机盐),以便它对于大多数用途来说应该足够安全。

if exists (select 1 where object_id('tempdb..#columnsToUpdate') is not null)
begin
    drop table #columnsToUpdate
end
create table #columnsToUpdate(tableName varchar(max), columnName varchar(max), max_length int)

if exists (select 1 where object_id('fnGetSanitizedName') is not null)
begin 
    drop function fnGetSanitizedName
end

if exists (select 1 where object_id('random') is not null)
begin 
    drop view random
end

if exists (select 1 where object_id('randUniform') is not null)
begin 
    drop function randUniform
end

GO

create view random(value) as select rand();
go

create function dbo.randUniform() returns real
begin
    declare @v real
    set @v = (select value from random)
    return @v
end

go

CREATE FUNCTION dbo.fnGetSanitizedName 
(
    @functionName nvarchar(max),
    @length int
)
RETURNS varchar(max)
AS
BEGIN
    return left(SUBSTRING(master.dbo.fn_varbintohexstr(HashBytes('SHA1', cast(cast(cast(dbo.randUniform() * 10000 as int) as varchar(8)) as varchar(40)) + @functionName)), 3, 32), @length)

END
GO

begin transaction
set nocount on

insert into #columnsToUpdate
select tables.name, columns.name, 
    case
        when types.name = 'nvarchar' then columns.max_length / 2
        else columns.max_length
    end as max_length
from sys.tables tables
inner join sys.columns columns on tables.object_id=columns.object_id
inner join sys.types types on columns.system_type_id = types.system_type_id
where types.name in ('nvarchar', 'varchar')


declare @tableName varchar(max)
declare @columnName varchar(max)
declare @length int
declare @executingSql varchar(max)

declare tableUpdateCursor cursor 
    for select tableName, columnName, max_length from #columnsToUpdate
open tableUpdateCursor

    fetch next from tableUpdateCursor into @tableName, @columnName, @length
    while @@fetch_status = 0
    begin
        set @executingSql = 'update ' + @tableName + ' set ' + @columnName + ' = dbo.fnGetSanitizedName(' + @columnName + ',' + cast(@length as varchar(max)) + ')'
        print @executingSql
        exec(@executingSql)

        fetch next from tableUpdateCursor into @tableName, @columnName, @length
    end


close tableUpdateCursor
deallocate tableUpdateCursor

set nocount off

rollback -- Can remove the rollback when you are sure about what your are doing.

drop table #columnsToUpdate
drop function dbo.fnGetSanitizedName
drop view random
drop function randUniform

You can create a list of the columns that need updating and then simply iterate over said list and execute some dynamic sql that will update the row in some fashion. I made a fairly basic scramble function that will just sha1 the data (with a random salt) so that it should be secure enough for most purposes.

if exists (select 1 where object_id('tempdb..#columnsToUpdate') is not null)
begin
    drop table #columnsToUpdate
end
create table #columnsToUpdate(tableName varchar(max), columnName varchar(max), max_length int)

if exists (select 1 where object_id('fnGetSanitizedName') is not null)
begin 
    drop function fnGetSanitizedName
end

if exists (select 1 where object_id('random') is not null)
begin 
    drop view random
end

if exists (select 1 where object_id('randUniform') is not null)
begin 
    drop function randUniform
end

GO

create view random(value) as select rand();
go

create function dbo.randUniform() returns real
begin
    declare @v real
    set @v = (select value from random)
    return @v
end

go

CREATE FUNCTION dbo.fnGetSanitizedName 
(
    @functionName nvarchar(max),
    @length int
)
RETURNS varchar(max)
AS
BEGIN
    return left(SUBSTRING(master.dbo.fn_varbintohexstr(HashBytes('SHA1', cast(cast(cast(dbo.randUniform() * 10000 as int) as varchar(8)) as varchar(40)) + @functionName)), 3, 32), @length)

END
GO

begin transaction
set nocount on

insert into #columnsToUpdate
select tables.name, columns.name, 
    case
        when types.name = 'nvarchar' then columns.max_length / 2
        else columns.max_length
    end as max_length
from sys.tables tables
inner join sys.columns columns on tables.object_id=columns.object_id
inner join sys.types types on columns.system_type_id = types.system_type_id
where types.name in ('nvarchar', 'varchar')


declare @tableName varchar(max)
declare @columnName varchar(max)
declare @length int
declare @executingSql varchar(max)

declare tableUpdateCursor cursor 
    for select tableName, columnName, max_length from #columnsToUpdate
open tableUpdateCursor

    fetch next from tableUpdateCursor into @tableName, @columnName, @length
    while @@fetch_status = 0
    begin
        set @executingSql = 'update ' + @tableName + ' set ' + @columnName + ' = dbo.fnGetSanitizedName(' + @columnName + ',' + cast(@length as varchar(max)) + ')'
        print @executingSql
        exec(@executingSql)

        fetch next from tableUpdateCursor into @tableName, @columnName, @length
    end


close tableUpdateCursor
deallocate tableUpdateCursor

set nocount off

rollback -- Can remove the rollback when you are sure about what your are doing.

drop table #columnsToUpdate
drop function dbo.fnGetSanitizedName
drop view random
drop function randUniform
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文