How can I determine whether CSV file fields are tab-delimited or comma-delimited?

Posted on 2024-09-12 14:00:49

I'm trying to determine whether the fields of a CSV file are tab-delimited or comma-delimited. I need to validate this in PHP.

How can I determine this?

Comments (15)

铃予 2024-09-19 14:00:49

It's late to answer this question, but I hope it helps someone.

Here's a simple function that returns the delimiter of a file.

function getFileDelimiter($file, $checkLines = 2){
        $file = new SplFileObject($file);
        // Double quotes so that "\t" is a real tab character
        $delimiters = array(
          ',',
          "\t",
          ';',
          '|',
          ':'
        );
        $results = array();
        $i = 0;
        // Read at most $checkLines lines
        while($file->valid() && $i < $checkLines){
            $line = $file->fgets();
            foreach ($delimiters as $delimiter){
                $regExp = '/'.preg_quote($delimiter, '/').'/';
                $fields = preg_split($regExp, $line);
                // Count the lines in which this delimiter appears at least once
                if(count($fields) > 1){
                    if(!empty($results[$delimiter])){
                        $results[$delimiter]++;
                    } else {
                        $results[$delimiter] = 1;
                    }
                }
            }
           $i++;
        }
        // None of the candidates appeared in the sampled lines
        if(empty($results)){
            return null;
        }
        $results = array_keys($results, max($results));
        return $results[0];
    }

Use this function as shown below:

$delimiter = getFileDelimiter('abc.csv'); //Check 2 lines to determine the delimiter
$delimiter = getFileDelimiter('abc.csv', 5); //Check 5 lines to determine the delimiter

P.S. I used preg_split() instead of explode() because explode('\t', $value) won't give proper results: in single quotes, '\t' is a literal backslash followed by t, not a tab.

UPDATE: Thanks to @RichardEB for pointing out a bug in the code. I have updated it now.
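
The P.S. above comes down to PHP's quoting rules: a single-quoted '\t' is the two characters backslash and t, while a double-quoted "\t" is a real tab. A minimal sketch of the difference (the sample row is made up for illustration):

```php
<?php
// Hypothetical sample row containing two real tab characters.
$row = "id\tname\temail";

// Single quotes: '\t' is backslash + t, so nothing in the row matches.
$single = explode('\t', $row);

// Double quotes: "\t" is a real tab, so the row splits into three fields.
$double = explode("\t", $row);

echo count($single), "\n"; // 1
echo count($double), "\n"; // 3
```

So explode() is usable for tab-separated rows as long as the delimiter is written in double quotes.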

绝不放开 2024-09-19 14:00:49

Here's what I do.

  1. Parse the first 5 lines of a CSV file
  2. Count the number of delimiters [tabs, commas and semicolons] in each line
  3. Compare the number of delimiters in each line. If you have a properly formatted CSV, then one of the delimiter counts will match in each row.

This will not work 100% of the time, but it is a decent starting point. At minimum, it will reduce the number of possible delimiters (making it easier for your users to select the correct delimiter).

/* Rearrange this array to change the search priority of delimiters */
$delimiters = array('tab'       => "\t",
                'comma'     => ",",
                'semicolon' => ";"
                );

$handle = file( $file );    # Grabs the CSV file, loads its lines into an array

$line = array();            # Stores the count of delimiters in each row

$valid_delimiter = array(); # Stores valid delimiters

# Count the number of delimiters in each of the first 5 rows (fewer if the file is shorter)
$rows = min( 5, count( $handle ) );
for ( $i = 0; $i < $rows; $i++ ){
    foreach ( $delimiters as $key => $value ){
        $line[$key][$i] = substr_count( $handle[$i], $value );
    }
}

# Compare the count of delimiters in each line
foreach ( $line as $delimiter => $count ){

    # The delimiter must occur in the first row and the same number of times in every row
    $match = ( $count[0] > 0 and count( array_unique( $count ) ) === 1 );

    if ( $match ) $valid_delimiter[] = $delimiter;

}//foreach

# Set default delimiter to comma
$delimiter = !empty( $valid_delimiter ) ? $valid_delimiter[0] : "comma";


/*  !!!! This is good enough for my needs since I have the priority set to "tab"
!!!! but you will want to have the user select from the delimiters in $valid_delimiter
!!!! if multiple delimiter counts match
*/

# The delimiter for the CSV
echo $delimiters[$delimiter];
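
One limitation of counting raw characters is that a delimiter inside a quoted field is counted too. A minimal sketch of the same consistency check using str_getcsv(), which respects quotes; findConsistentDelimiters() and its parameters are hypothetical names, not part of the answer above:

```php
<?php
/**
 * Sketch: return the candidate delimiters that split every sampled line
 * into the same number (> 1) of fields.
 */
function findConsistentDelimiters(array $lines, array $candidates = ["\t", ",", ";"]): array
{
    $valid = [];
    foreach ($candidates as $delimiter) {
        $counts = [];
        foreach ($lines as $line) {
            $line = rtrim($line, "\r\n");
            if ($line === '') {
                continue; // ignore blank lines
            }
            // str_getcsv() does not split on delimiters inside quoted fields
            $counts[] = count(str_getcsv($line, $delimiter, '"', '\\'));
        }
        // Valid when all lines agree on a field count greater than one
        if ($counts !== [] && count(array_unique($counts)) === 1 && $counts[0] > 1) {
            $valid[] = $delimiter;
        }
    }
    return $valid;
}

// Made-up sample: semicolon-separated, with commas inside quoted names.
$sample = [
    'name;city',
    '"Doe, John";Berlin',
    '"Roe, Jane";Paris',
];
$found = findConsistentDelimiters($sample);
print_r($found);
```

If more than one delimiter comes back, the choice could still be left to the user, as the answer suggests.
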
吾家有女初长成 2024-09-19 14:00:49

There is no 100% reliable way to determine this. What you can do is:

  • If you have a method to validate the fields you read, try to read a few fields using either separator and validate against your method. If it breaks, use the other one.
  • Count the occurrences of tabs and commas in the file. Usually one is significantly higher than the other.
  • Last but not least: ask the user, and allow them to override your guesses.
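
The first and last suggestions could be combined roughly as follows. This is only a sketch: pickDelimiter(), its parameters, and the validation rule are hypothetical and would need to be adapted to the real data.

```php
<?php
/**
 * Sketch: try each separator, keep the first one whose parsed rows pass a
 * caller-supplied check, and let an explicit user choice win.
 */
function pickDelimiter(array $lines, callable $isValidRow, ?string $userChoice = null, array $candidates = [",", "\t"]): ?string
{
    if ($userChoice !== null) {
        return $userChoice; // the user overrides any guess
    }
    foreach ($candidates as $delimiter) {
        $allValid = true;
        foreach ($lines as $line) {
            $fields = str_getcsv(rtrim($line, "\r\n"), $delimiter, '"', '\\');
            if (!$isValidRow($fields)) {
                $allValid = false;
                break;
            }
        }
        if ($allValid) {
            return $delimiter;
        }
    }
    return null; // no candidate passed validation; ask the user
}

// Example rule (an assumption): exactly two fields, the second one numeric.
$isValidRow = fn(array $fields): bool => count($fields) === 2 && is_numeric($fields[1]);

$lines = ["apple\t3", "pear\t12"];
$guess = pickDelimiter($lines, $isValidRow);
```
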
偏爱自由 2024-09-19 14:00:49

I just count the occurrences of the different delimiters in the CSV file; the one with the most occurrences is probably the correct delimiter:

//The delimiters array to look through
$delimiters = array(
    'semicolon' => ";",
    'tab'       => "\t",
    'comma'     => ",",
);

//Load the csv file into a string
$csv = file_get_contents($file);
$res = array();
foreach ($delimiters as $key => $delim) {
    $res[$key] = substr_count($csv, $delim);
}

//reverse sort the values, so the first element is the most frequent delimiter
arsort($res);

reset($res);
$first_key = key($res);

//this snippet is meant to live inside a function
return $delimiters[$first_key];

彻夜缠绵 2024-09-19 14:00:49

In my situation, users supply CSV files which are then entered into an SQL database. They may save an Excel spreadsheet as a comma-delimited or tab-delimited file. A program converting the spreadsheet to SQL needs to automatically identify whether fields are tab-separated or comma-separated.

Many Excel CSV exports have field headings as the first line. The heading text is unlikely to contain commas except as a delimiter. For my situation, I count the commas and tabs in the first line and use whichever count is greater to decide whether the file is comma-delimited or tab-delimited.
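
A minimal sketch of this header-only check. guessFromHeader() is a hypothetical name, and falling back to a comma on a tie is an assumption rather than something the answer specifies:

```php
<?php
/**
 * Sketch: guess comma vs. tab from the header line only.
 */
function guessFromHeader(string $path): ?string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    $header = fgets($handle); // read only the first line
    fclose($handle);
    if ($header === false) {
        return null; // empty file
    }
    $commas = substr_count($header, ',');
    $tabs   = substr_count($header, "\t");
    return $tabs > $commas ? "\t" : ',';
}

// Usage with a throwaway file (made-up data)
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id\tname\temail\n1\tAnn\tann@example.com\n");
$delimiter = guessFromHeader($path);
unlink($path);
```

Because only one line is read, this stays cheap even for very large uploads.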

悸初 2024-09-19 14:00:49

Thanks for all your input. I made mine using your tricks: preg_split, fgetcsv, loops, etc.

But I implemented something that was surprisingly not here: using fgets instead of reading the whole file, which is much better if the file is large!

Here's the code:

ini_set("auto_detect_line_endings", true); // deprecated as of PHP 8.1
function guessCsvDelimiter($filePath, $limitLines = 5) {
    if (!is_readable($filePath) || !is_file($filePath)) {
        return false;
    }

    $delimiters = array(
        'tab'       => "\t",
        'comma'     => ",",
        'semicolon' => ";"
    );

    $fp = fopen($filePath, 'r', false);
    $lineResults = array(
        'tab'       => array(),
        'comma'     => array(),
        'semicolon' => array()
    );

    $lineIndex = 0;
    // Read line by line instead of loading the whole file
    while ($lineIndex < $limitLines && ($line = fgets($fp)) !== false) {
        foreach ($delimiters as $key=>$delimiter) {
            // Parse the line already read; calling fgetcsv($fp) here would consume further lines
            $lineResults[$key][$lineIndex] = count(str_getcsv($line, $delimiter, '"', '\\')) - 1;
        }
        $lineIndex++;
    }
    fclose($fp);

    if ($lineIndex === 0) {
        return $delimiters['comma']; // empty file: fall back to comma
    }

    // Calculating average
    foreach ($lineResults as $key=>$entry) {
        $lineResults[$key] = array_sum($entry)/count($entry);
    }

    arsort($lineResults);
    reset($lineResults);
    // Compare the two highest averages; on a tie, fall back to comma
    $averages = array_values($lineResults);
    return ($averages[0] != $averages[1]) ? $delimiters[key($lineResults)] : $delimiters['comma'];
}

Saygoodbye 2024-09-19 14:00:49

I used @Jay Bhatt's solution for finding out a CSV file's delimiter, but it didn't work for me, so I applied a few fixes and added comments to make the process easier to understand.

See my version of @Jay Bhatt's function:

function decide_csv_delimiter($file, $checkLines = 10) {

    // use PHP's built-in file class for reading the csv or txt file
    $file = new SplFileObject($file);

    // array of predefined delimiters. Add any more delimiters if you wish
    $delimiters = array(',', '\t', ';', '|', ':');

    // store all the occurrences of each delimiter in an associative array
    $number_of_delimiter_occurences = array();

    $results = array();

    $i = 0; // using 'i' for counting the number of rows actually parsed
    while ($file->valid() && $i < $checkLines) {

        $line = $file->fgets();

        foreach ($delimiters as $idx => $delimiter){

            // inside a regular expression, '\t' is understood as a tab
            $regExp = '/['.$delimiter.']/';
            $fields = preg_split($regExp, $line);

            // construct the array with all the keys as the delimiters
            // and the values as the number of delimiter occurrences,
            // summed over all parsed rows (plain assignment would keep only the last row)
            $number_of_delimiter_occurences[$delimiter] =
                ($number_of_delimiter_occurences[$delimiter] ?? 0) + count($fields) - 1;

        }

       $i++;
    }

    // get the key of the largest value from the array (comparing only the array values)
    // in our case, the array keys are the delimiters
    $results = array_keys($number_of_delimiter_occurences, max($number_of_delimiter_occurences));


    // in case the delimiter happens to be a 'tab' character ('\t'), return it in double quotes,
    // otherwise it will give an error when used as a delimiter,
    // because in single quotes it is not recognised as the special 'tab' character:
    // it is a plain string composed of the '\' and 't' characters, which is not accepted when parsing csv files
    return $results[0] == '\t' ? "\t" : $results[0];
}

I personally use this function to help parse files automatically with PHPExcel, and it works beautifully and fast.

I recommend parsing at least 10 lines for the results to be more accurate. I personally use it with 100 lines, and it runs fast, with no delays or lags. The more lines you parse, the more accurate the result gets.

NOTE: This is just a modified version of @Jay Bhatt's solution to the question. All credit goes to @Jay Bhatt.

洋洋洒洒 2024-09-19 14:00:49

When I output a TSV file, I write the tabs using \t, the same way one would write a line break as \n, so I guess a method could be as follows:

<?php
// 'yourfile.tsv' is a placeholder; load the source however you wish
$mysource = file_get_contents('yourfile.tsv');
 // strpos() returns false when there is no match, and 0 for a match at the very start
 if(strpos($mysource, "\t") !== false){
   //We have a tab separator
 }else{
   // it might be CSV
 }
?>

I guess this may not be the right approach, because you could have tabs and commas in the actual content as well. It's just an idea. Using regular expressions may be better, although I am not too clued up on that.

猫腻 2024-09-19 14:00:49

You can simply use PHP's native fgetcsv() function, like this:

function getCsvDelimeter($file)
{
    if (($handle = fopen($file, "r")) !== FALSE) {
        $delimiters = array(',', "\t", ';', '|', ':'); //Put all that need check

        foreach ($delimiters AS $item) {
            //always test the first line; fgetcsv() advances the file pointer on each call
            rewind($handle);
            $fields = fgetcsv($handle, 0, $item, '"', '\\');

            //fgetcsv() returns a single-element array if the delimiter is not found
            if (is_array($fields) && count($fields) > 1) {
                $delimiter = $item;

                break;
            }
        }
        fclose($handle);
    }

    return (isset($delimiter) ? $delimiter : null);
}

小姐丶请自重 2024-09-19 14:00:49

Aside from the trivial answer that CSV files are always comma-separated (it's in the name), I don't think you can come up with any hard rules. Both TSV and CSV files are sufficiently loosely specified that you can come up with files that would be acceptable as either.

A\tB,C
1,2\t3

(Assuming \t == TAB)

How would you decide whether this is TSV or CSV?
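
To make the ambiguity concrete, here is a small sketch that parses the two lines above both ways with str_getcsv() (the variable names are arbitrary). Each interpretation gives a consistent two-column file, so a field-count check alone cannot decide:

```php
<?php
// The two-line example above, with \t written as a real tab.
$lines = ["A\tB,C", "1,2\t3"];

$counts = [];
foreach ([',' => 'comma', "\t" => 'tab'] as $delimiter => $label) {
    foreach ($lines as $line) {
        $fields = str_getcsv($line, $delimiter, '"', '\\');
        $counts[$label][] = count($fields);
        echo $label, ': ', count($fields), " fields\n";
    }
}
```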

温柔少女心 2024-09-19 14:00:49

You can also use fgetcsv (http://php.net/manual/en/function.fgetcsv.php), passing it a delimiter parameter. Note that the function returns false only on failure or at the end of the file; if the $delimiter parameter wasn't the right one, the whole line comes back as a single field, so check the number of fields returned.

Sample to check if the delimiter is ';':

if (($data = fgetcsv($your_csv_handler, 1000, ';')) !== false && count($data) > 1) { $csv_delimiter = ';'; }
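
As a quick check of how fgetcsv() behaves when the delimiter does not match, here is a small sketch using an in-memory stream in place of a real file (the sample data is made up):

```php
<?php
// In-memory stream standing in for a comma-separated file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "a,b,c\n");

rewind($handle);
$wrong = fgetcsv($handle, 1000, ';', '"', '\\'); // non-matching delimiter

rewind($handle);
$right = fgetcsv($handle, 1000, ',', '"', '\\'); // matching delimiter
fclose($handle);

var_dump($wrong !== false); // true: no failure is reported
echo count($wrong), "\n";   // 1: the whole line is one field
echo count($right), "\n";   // 3
```
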
暮凉 2024-09-19 14:00:49

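How about something simple?

function findDelimiter($filePath, $limitLines = 5){
    $file = new SplFileObject($filePath);
    // Caveat: getCsvControl() returns the delimiter currently configured on the
    // object (a comma unless setCsvControl() was called); it does not inspect the file
    $delims = $file->getCsvControl();
    return $delims[0];
}
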
流绪微梦 2024-09-19 14:00:49

This is my solution.
It works if you know how many columns you expect.
At the end, the separator character is in $actual_separation_character.

// Candidate separators, in order of priority
$separators = array(",", ";", "\t", ":", "|");

// Running total of matching occurrences per separator
$separator_numbers = array_fill_keys($separators, 0);

/* YOU NEED TO CHANGE THIS VARIABLE */
// Expected number of separation characters per row (3 columns ==> 2 separation characters / row)
$expected_separation_character_number = 2;

$file = fopen("upload/filename.csv", "r");
while (($row = fgets($file)) !== false) // read file rows
{
    foreach ($separators as $separator) {
        $row_count = substr_count($row, $separator);

        // Only count rows with the expected number of separators (0 = accept any row)
        if ($row_count == $expected_separation_character_number or $expected_separation_character_number == 0) {
            $separator_numbers[$separator] += $row_count;
        }
    }
}
fclose($file);

/* THE FILE ACTUAL SEPARATOR (delimiter) CHARACTER */
/* $actual_separation_character: the first separator with the highest total */
$actual_separation_character = array_search(max($separator_numbers), $separator_numbers);

/*
if no row has the number of columns you expect, do something ...
*/

if ($expected_separation_character_number > 0) {
    if (max($separator_numbers) == 0) { /* do something ! column count differs from the expected one ! */ }
}

放飞的风筝 2024-09-19 14:00:49

If you have a very large file (for example, several GB), take the first few lines with head, put them in a temporary file, and open the temporary file in vi:

head test.txt > te1
vi te1
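
Since the question asks for PHP, the same "look at the first few lines" idea could be sketched as below. peekLines() is a hypothetical helper; it reads only the requested number of lines and shows each tab as a visible \t:

```php
<?php
/**
 * Sketch: read only the first $limit lines and make tabs visible.
 */
function peekLines(string $path, int $limit = 10): array
{
    $lines = [];
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $lines;
    }
    while (count($lines) < $limit && ($line = fgets($handle)) !== false) {
        // Show each tab as the two visible characters \t
        $lines[] = str_replace("\t", '\t', rtrim($line, "\r\n"));
    }
    fclose($handle);
    return $lines;
}

// Usage with a throwaway file (made-up data)
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "a\tb\n1\t2\n3\t4\n");
$preview = peekLines($path, 2);
unlink($path);
print_r($preview);
```
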
薄荷梦 2024-09-19 14:00:49

The easiest way I answer this is to open the file in a plain text editor, or in TextMate.
