(Not) Renaming / Sanitizing Filenames
Whether to rename is a choice you will have to make, depending on your website, user base, how obscure you would like to be and, obviously, your architecture. Would you prefer to have a file named kate_at_the_beach.jpg or 1304357611.jpg? This is really up to you to decide, but search engines (obviously) like the first one better.
One thing you should always do is sanitize and normalize the filenames. Personally, I would only allow the following chars: 0-9, a-z, A-Z, _, - and the dot (.). If you choose this sanitization alphabet, normalization basically means just converting the filename to either lower or upper case (to avoid losing files if, for instance, you switch from a case-sensitive file system to a case-insensitive one, like Windows).
Here is some sample code I use in phunction (shameless plug, I know :P):
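The sketch below is not the phunction code; it is only a minimal illustration of the whitelist-plus-lowercase idea described above. The function name sanitize_filename and the fallback name "file" are assumptions made for the example:

```php
<?php
// Hypothetical helper (not from phunction): keep only 0-9, a-z, A-Z, _, - and .
// and lowercase the result.
function sanitize_filename(string $name): string
{
    // Drop any directory components the client may have sent.
    $name = basename($name);
    // Replace every run of characters outside the whitelist with "_".
    $name = preg_replace('/[^0-9a-zA-Z_.-]+/', '_', $name);
    // Strip leading dots so the result is never a hidden file or "..".
    $name = ltrim($name, '.');
    // Normalize case to avoid clashes on case-insensitive file systems.
    $name = strtolower($name);
    // Fall back to a placeholder (assumed) if nothing usable is left.
    return $name === '' ? 'file' : $name;
}

echo sanitize_filename('Kate at the Beach.JPG'), "\n"; // prints kate_at_the_beach.jpg
```

Replacing rejected characters with an underscore instead of deleting them keeps the name readable; deleting them would work just as well if you prefer shorter names.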
Handling Duplicate Filenames
As the documentation entry on move_uploaded_file() states: "If the destination file already exists, it will be overwritten."
So, before you call move_uploaded_file(), you should check whether the file already exists. If it does and you don't want to lose your older file, rename the new file, usually by inserting a time / random / unique token before the file extension, doing something like this:
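A minimal sketch of that check-and-rename step, under stated assumptions: the helper name add_token, the "_" separator, the destination path and the form field name "photo" are all hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical helper: put $token between the base name and the extension.
function add_token(string $path, string $token): string
{
    $info = pathinfo($path);
    // pathinfo() reports "." when the path has no directory part.
    $dir = (isset($info['dirname']) && $info['dirname'] !== '.') ? $info['dirname'] . '/' : '';
    // The "extension" key is absent for names without a dot.
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';
    return $dir . $info['filename'] . '_' . $token . $ext;
}

$target = 'uploads/kate_at_the_beach.jpg'; // assumed destination path
if (file_exists($target)) {
    // Only rename when the destination is already taken.
    $target = add_token($target, (string) time());
}
// move_uploaded_file($_FILES['photo']['tmp_name'], $target); // 'photo' is an assumed field name
```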
This will have the effect of inserting the $token before the file extension, like I stated above. As for the choice of the $token value, you have several options:
time() - only unique as long as no two uploads with the same name arrive within the same second, and it handles duplicate files poorly
random - not a very good idea, since it doesn't ensure uniqueness and doesn't handle duplicates
unique - using a hash of the file contents is my favorite approach, since it guarantees content uniqueness and saves you disk space: you'll have at most 2 identical files (one with the original filename and another one with the hash appended). Sample code:
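As a rough sketch of the content-hash option (the helper name hashed_name and the "_" separator are assumptions; md5_file() is used purely as a fingerprint, and hash_file('sha256', ...) would serve equally well):

```php
<?php
// Hypothetical helper: name a colliding upload after a hash of its contents.
function hashed_name(string $tmpFile, string $originalName): string
{
    $info = pathinfo($originalName);
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';
    // Fingerprint of the uploaded bytes; identical files give identical tokens.
    $token = md5_file($tmpFile);
    if ($token === false) {
        throw new RuntimeException('Could not read ' . $tmpFile);
    }
    return $info['filename'] . '_' . $token . $ext;
}
```

Because the same contents always produce the same name, a re-upload of an identical file maps to a name that already exists, so you can skip writing it a second time.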
There is no such convention, but usually the name is randomly generated to make guessing less probable. Allowing the filename through without sanitizing is strongly discouraged; at the very least, take a whitelist approach in which you remove all characters except those in the whitelist. The key is security: uploading is a risky feature and can be dangerous if not properly handled.
Just make up some convention internally yourself. You could, for example, store the files as userId_timestamp in the folder and keep the original filename in some database. Or you could make it userId_originalFilename, or some other combination of things that makes it unique.
In a similar case, I save the info in a table (with the user ID as foreign key), format the auto-increment ID with leading zeroes for the filename (i.e. 000345.jpg) and store the original name in the table.
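The zero-padded naming can be sketched like this; the helper name stored_name and the six-digit width (taken from the 000345.jpg example) are assumptions, and the ID is expected to come from whatever table holds the original name:

```php
<?php
// Hypothetical helper: build the stored filename from a database auto-increment ID.
function stored_name(int $id, string $originalName): string
{
    // Keep only the (lowercased) extension of the user's filename.
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    // Pad the ID with leading zeroes to six digits.
    $name = sprintf('%06d', $id);
    return $ext === '' ? $name : $name . '.' . $ext;
}

echo stored_name(345, 'Kate at the Beach.JPG'), "\n"; // prints 000345.jpg
```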
There are no standard conventions, but there are a couple of best practices:
Organizing your files into (User and/or Date) Aware Folders
Something like:
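One possible layout, as a sketch only: the pattern base/userId/year/month/day and the helper name upload_dir are assumptions for illustration, not necessarily the structure the answer had in mind.

```php
<?php
// Hypothetical layout: <base>/<userId>/<year>/<month>/<day>
function upload_dir(string $base, int $userId, int $timestamp): string
{
    // gmdate() keeps the folder name independent of the server time zone.
    return sprintf('%s/%d/%s', rtrim($base, '/'), $userId, gmdate('Y/m/d', $timestamp));
}

$dir = upload_dir('uploads', 42, time()); // 42 is an example user ID
// mkdir($dir, 0755, true) would create the nested folders on first use.
```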
This will have some benefits:
Hope it helps! ;)
Could you use some combination of the user's name and the upload date?