Best word wrap algorithm?

睫毛溺水了 2024-07-11 08:20:11

I had occasion to write a word wrap function recently, and I want to share what I came up with.

I used a TDD approach almost as strict as the one from the Go example. I started with the test that wrapping the string "Hello, world!" at width 80 should return "Hello, world!" unchanged. Clearly, the simplest thing that works is to return the input string untouched. From there I wrote increasingly complex tests and ended up with a recursive solution that (at least for my purposes) handles the task quite efficiently.

Pseudocode for the recursive solution:

Function WordWrap (inputString, width)
    Trim the input string of leading and trailing spaces.

    If the trimmed string's length is <= the width,
        Return the trimmed string.
    Else,
        Find the index of the last space in the trimmed string, searching backward from index width.

        If there is no such space, use the width as the index.

        Split the trimmed string into two pieces at the index.

        Trim trailing spaces from the portion before the index,
        and leading spaces from the portion after the index.

        Concatenate and return:
          the trimmed portion before the index,
          a line break,
          and the result of calling WordWrap on the trimmed portion after
            the index (with the same width as the original call).

This wraps only at spaces. To wrap a string that already contains line breaks, split it at the line breaks, send each piece to this function, and then reassemble the string. Even so, in VB.NET running on a fast machine, this can handle about 20 MB/second.
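
For readers who want something runnable, here is a minimal Python sketch of the pseudocode above. It is not the original VB.NET code; the names `word_wrap` and `wrap_text` are made up for this example, and only the space character is treated as a break opportunity.

```python
def word_wrap(text: str, width: int) -> str:
    """Wrap at spaces only, following the recursive pseudocode."""
    if width < 1:
        raise ValueError("width must be at least 1")
    trimmed = text.strip(" ")
    if len(trimmed) <= width:
        return trimmed
    # Last space at or before index `width` (rfind's end bound is exclusive).
    index = trimmed.rfind(" ", 0, width + 1)
    if index == -1:
        index = width  # no space in range: hard break at the width
    head = trimmed[:index].rstrip(" ")
    tail = trimmed[index:].lstrip(" ")
    return head + "\n" + word_wrap(tail, width)


def wrap_text(text: str, width: int) -> str:
    """Split at existing line breaks, wrap each piece, and reassemble."""
    return "\n".join(word_wrap(line, width) for line in text.split("\n"))
```

Because the function recurses once per output line, a very long paragraph could hit Python's default recursion limit; an iterative loop over the same steps would avoid that.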


聽兲甴掵 2024-07-11 08:20:11

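Donald E. Knuth did a lot of work on the line-breaking algorithm in his TeX typesetting system. It is arguably one of the best line-breaking algorithms, "best" in terms of the visual appearance of the result.

His algorithm avoids the problems of greedy line filling, where you can end up with a very dense line followed by a very loose line.

An efficient algorithm can be implemented using dynamic programming.

A paper on TeX's line breaking.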
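
To make the dynamic programming idea concrete, here is a small Python sketch. It is not TeX's algorithm: it assumes a much simpler cost, the squared number of unused columns on every line except the last, and the function name `wrap_min_raggedness` is invented for this example.

```python
from typing import List


def wrap_min_raggedness(words: List[str], width: int) -> List[str]:
    n = len(words)
    # best[i] = minimal total cost of laying out words[i:]; best[n] = 0.
    best = [float("inf")] * n + [0.0]
    next_break = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        line_len = -1  # cancels the space counted before the first word
        for j in range(i, n):
            line_len += 1 + len(words[j])
            if line_len > width and j > i:
                break  # words[i..j] no longer fit on one line
            # A single overlong word still gets its own line (slack clamped to 0).
            slack = max(width - line_len, 0)
            cost = 0 if j == n - 1 else slack * slack  # last line is free
            if cost + best[j + 1] < best[i]:
                best[i] = cost + best[j + 1]
                next_break[i] = j + 1
    lines, i = [], 0
    while i < n:
        j = next_break[i]
        lines.append(" ".join(words[i:j]))
        i = j
    return lines
```

With the words `aaa bb cc ddddd` and a width of 6, greedy filling gives `aaa bb` / `cc` / `ddddd` (a full line followed by a nearly empty one), while this sketch returns `aaa` / `bb cc` / `ddddd`.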


初熏 2024-07-11 08:20:11

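I wondered about the same thing for my own editor project. My solution was a two-step process:

1. Find the line ends and store them in an array.
2. For very long lines, find suitable break points at roughly 1K intervals and save them in the line array, too. This is to catch the "4 MB of text without a single line break" case.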
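
The two steps could be sketched in Python roughly as follows. This is only an illustration under assumptions of my own: offsets are character indices, a space is the preferred soft break point, and `build_line_index` and `chunk` are hypothetical names.

```python
def build_line_index(text, chunk=1024):
    """Return the start offset of every stored line.

    Lines longer than `chunk` get extra soft break points, preferably
    just after a space, so no stored line exceeds `chunk` characters.
    """
    starts = [0]
    pos = 0
    while True:
        newline = text.find("\n", pos)
        line_end = len(text) if newline == -1 else newline
        # Step 2: chop an overlong line into pieces of at most `chunk`.
        while line_end - pos > chunk:
            space = text.rfind(" ", pos + 1, pos + chunk)
            pos = space + 1 if space != -1 else pos + chunk
            starts.append(pos)
        if newline == -1:
            return starts
        # Step 1: the next line starts right after the line break.
        pos = newline + 1
        starts.append(pos)
```

Stored line `i` is then `text[starts[i]:starts[i + 1]]`, which is the unit that gets wrapped on the fly when it scrolls into view.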

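When you need to display the text, find the lines in question and wrap them on the fly. Keep this information in a cache for quick redraws. When the user scrolls a whole page, flush the cache and repeat.

If you can, load and analyze the whole text in a background thread. That way you can already display the first page of text while the rest of the document is still being examined. The simplest solution here is to cut off the first 16 KB of text and run the algorithm on that substring. This is very fast and lets you render the first page instantly, even while your editor is still loading the text.

You can use a similar approach when the cursor is initially at the end of the text: just read the last 16 KB of text and analyze that. In this case, use two edit buffers and load everything except the last 16 KB into the first one while the user is locked into the second. You will probably also want to remember how many lines the text has when you close the editor, so the scroll bar doesn't look weird.

It gets hairy when the user can start the editor with the cursor somewhere in the middle, but ultimately this is just an extension of the end-of-text problem. You only need to remember the byte position, the current line number, and the total number of lines from the last session. On top of that you need either three edit buffers or an edit buffer from which you can cut 16 KB out of the middle.

Alternatively, lock the scrollbar and other interface elements while the text is loading; that lets the user look at the text while it finishes loading.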


倾其所爱 2024-07-11 08:20:11

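I don't know of any specific algorithms, but the following could be a rough outline of how it should work:

1. For the current text size, font, display size, window size, margins, etc., determine how many characters fit on a line (for a fixed-width font) or how many pixels fit on a line (for a proportional font).
2. Go through the line character by character, keeping count of how many characters or pixels have been used since the beginning of the line.
3. When you go over the maximum characters/pixels for the line, move back to the last space or punctuation mark and move all the text after it to the next line.
4. Repeat until you have gone through all the text in the document.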
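
The outline above could look like this in Python for the fixed-width case. This is a sketch under simplifying assumptions: only spaces count as break points (no punctuation), the limit is a character count rather than pixels, and `greedy_wrap` is a name made up for the example.

```python
def greedy_wrap(text, max_chars):
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    lines = []
    n = len(text)
    line_start = 0
    while line_start < n and text[line_start] == " ":
        line_start += 1          # ignore leading spaces
    last_break = -1              # last space seen on the current line
    i = line_start
    while i < n:
        if text[i] == " ":
            last_break = i
        if i - line_start + 1 > max_chars:
            # Over the limit: go back to the last space, or hard-break
            # a word that is longer than the line.
            cut = last_break if last_break > line_start else i
            lines.append(text[line_start:cut].rstrip(" "))
            line_start = cut
            while line_start < n and text[line_start] == " ":
                line_start += 1
            last_break = -1
            i = line_start       # rescan the text moved to the next line
        else:
            i += 1
    if line_start < n:
        lines.append(text[line_start:].rstrip(" "))
    return lines
```

For a proportional font, the character count would be replaced by a running sum of glyph widths obtained from the rendering toolkit.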

In .NET, word wrapping is built into controls such as TextBox. I am sure similar built-in functionality exists for other languages as well.


晨敛清荷 2024-07-11 08:20:11

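With or without hyphenation?

Without it, it's easy. Just encapsulate your text as one word object per word and give each a getWidth() method. Then start at the first word and add up the row length until it is greater than the available space. When that happens, wrap the last word and start counting again for the next row, beginning with that word, and so on.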
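
A minimal Python sketch of the word-object approach, assuming a fixed width per character and a fixed space width (both hypothetical parameters; a real editor would ask the font for these values). `Word`, `get_width` and `wrap_words` are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    char_width: float = 1.0  # assumed width of one character

    def get_width(self) -> float:
        return len(self.text) * self.char_width


def wrap_words(words, available, space_width=1.0):
    """Group words into rows whose total width stays within `available`."""
    rows, row, row_len = [], [], 0.0
    for word in words:
        extra = word.get_width() + (space_width if row else 0.0)
        if row and row_len + extra > available:
            rows.append(row)           # the word overflows: start a new row
            row, row_len = [], 0.0
            extra = word.get_width()
        row.append(word)
        row_len += extra
    if row:
        rows.append(row)
    return rows
```

A word wider than the available space still ends up on a row of its own, which is where the hyphenation case comes in.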

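With hyphenation, you need hyphenation rules in a common format, such as: hy-phen-a-tion

Then it is the same as above, except that you need to split the last word, the one that caused the overflow.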
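
As a small illustration of that last step, this Python sketch picks the longest prefix of a hyphenation pattern that still fits in the space left on the row. The function name `split_hyphenated` and the `room` parameter are assumptions for this example; where the pattern comes from (a dictionary, a rule engine) is left open.

```python
def split_hyphenated(pattern, room):
    """Split a word given as 'hy-phen-a-tion' so the head fits in `room`.

    Returns (head_with_hyphen, tail), or None if not even the first
    syllable plus a hyphen fits.
    """
    syllables = pattern.split("-")
    best = None
    for k in range(1, len(syllables)):
        head = "".join(syllables[:k]) + "-"
        if len(head) <= room:
            best = (head, "".join(syllables[k:]))
    return best
```

If the result is None, the whole word moves to the next row, exactly as in the case without hyphenation.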

The Gang of Four book (Design Patterns) gives a good example and tutorial on how to structure the code for an excellent text editor.


千里故人稀 2024-07-11 08:20:11

Here is mine that I was working on today for fun in C:

Here are my considerations:

No copying of characters, just printing to standard output. Since I don't like to modify the argv[x] arguments, and because I like a challenge, I wanted to do it without modifying the input string. That ruled out inserting '\n' into it.
I don't want
```
 This line breaks     here
```
to become
```
 This line breaks
      here
```
so changing characters to '\n' is not an option given this objective.
If the linewidth is set at say 80, and the 80th character is in the middle of a word, the entire word must be put on the next line. So as you're scanning, you have to remember the position of the end of the last word that didn't go over 80 characters.
So here is mine. It's not clean; I've been racking my brain for the past hour trying to get it to work, adding something here and there. It works for all edge cases that I know of.
```
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

int isDelim(char c){
   switch(c){
      case '\0':
      case '\t':
      case ' ' :
         return 1;
         break; /* As a matter of style, put the 'break' anyway even if there is a return above it.*/
      default:
         return 0;
   }
}

/* Prints the characters from start to end (inclusive), then a newline. */
void printLine(const char * start, const char * end){
   const char * p = start;
   while ( p <= end )
       putchar(*p++);
   putchar('\n');
}

int main ( int argc , char ** argv ) {

   if( argc <= 2 )
       exit(1);

   char * start = argv[1];
   char * lastChar = argv[1];
   char * current = argv[1];
   int wrapLength = atoi(argv[2]);

   /* A non-positive width would make the outer loop spin forever. */
   if( wrapLength <= 0 )
       exit(1);

   /* chars = number of characters between start and current, so a word
      that ends exactly at column wrapLength still fits on the line. */
   int chars = 0;
   while( *current != '\0' ){
      while( chars <= wrapLength ){
         while ( !isDelim( *current ) ) ++current, ++chars;
         if( chars <= wrapLength){
            if(*current == '\0'){
               puts(start);
               return 0;
            }
            lastChar = current-1;
            current++,chars++;
         }
      }

      if( lastChar == start )
         lastChar = current-1;

      printLine(start,lastChar);
      current = lastChar + 1;
      while(isDelim(*current)){
         if( *current == '\0')
            return 0;
         else
            ++current;
      }
      start = current;
      lastChar = current;
      chars = 0;
   }
   return 0;
}
```
So basically, I have start and lastChar, which I want to set to the start of a line and the last character of that line. Once those are set, I write all the characters from start to lastChar to standard output, then output a '\n', and move on to the next line.
Initially everything points to the start, then I skip words with while(!isDelim(*current)) ++current,++chars;. As I do that, I remember the last character that still fit within the line width, say 80 chars (lastChar).
If, at the end of a word, I have passed my number of chars (80), then I get out of the while(chars <= wrapLength) block. I output all the characters between start and lastChar and a newline.
Then I set current to lastChar+1 and skip delimiters (and if that leads me to the end of the string, we're done, return 0). Finally I set start, lastChar and current to the start of the next line.
The
```
if(*current == '\0'){
    puts(start);
    return 0;
}
```
part is for strings that are too short to be wrapped even once. I added this just before writing this post because I tried a short string and it didn't work.
I feel like this might be doable in a more elegant way. If anyone has anything to suggest I'd love to try it.
As I wrote this I asked myself, "What happens if the string is a single word that is longer than my wrapLength?" Well, it didn't work. So I added the
```
if( lastChar == start )
    lastChar = current-1;
```
before the printLine() statement (if lastChar hasn't moved, then we have a word that is too long for a single line so we just have to put the whole thing on the line anyway).
I took the comments out of the code since I'm writing this but I really feel that there must be a better way of doing this than what I have that wouldn't need comments.
So that's the story of how I wrote this thing. I hope it can be of use to people and I also hope that someone will be unsatisfied with my code and propose a more elegant way of doing it.
It should be noted that it handles the edge cases I tried: words too long for a line, strings that are shorter than one wrapLength, and empty strings.
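
For comparison, the same no-copy idea can be sketched in Python by returning index pairs instead of printing. This is not a translation of the C program above; `line_spans` is a hypothetical name, and spaces and tabs are assumed to be the only delimiters.

```python
def line_spans(text, width):
    """Return (start, end) pairs; text[start:end] is one output line.

    The input is never modified and spacing inside a line is preserved.
    A word longer than `width` gets a line to itself.
    """
    delims = " \t"
    n = len(text)
    spans = []

    def skip_delims(i):
        while i < n and text[i] in delims:
            i += 1
        return i

    start = skip_delims(0)
    while start < n:
        end = start                # end of the last word that fits
        pos = start
        while pos < n:
            word_end = pos
            while word_end < n and text[word_end] not in delims:
                word_end += 1
            if word_end - start > width and end > start:
                break              # overflow: the line ends at the previous word
            end = word_end         # accept the word (even an overlong first word)
            pos = skip_delims(word_end)
        spans.append((start, end))
        start = skip_delims(end)
    return spans
```

For example, `line_spans("This line breaks     here", 16)` gives `[(0, 16), (21, 25)]`, so the run of spaces before "here" is dropped at the break rather than carried to the next line.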


儭儭莪哋寶赑 2024-07-11 08:20:11

I can't claim this is bug-free, but I needed one that word wrapped and obeyed indentation boundaries. I claim nothing about this code other than that it has worked for me so far. This is an extension method and violates the integrity of the StringBuilder, but it could be written with whatever inputs/outputs you desire.

using System;
using System.Linq;
using System.Text;

public static class StringBuilderExtensions
{
    public static void WordWrap(this StringBuilder sb, int tabSize, int width)
    {
        string[] lines = sb.ToString().Replace("\r\n", "\n").Split('\n');
        sb.Clear();
        for (int i = 0; i < lines.Length; ++i)
        {
            var line = lines[i];
            if (line.Length < 1)
                sb.AppendLine();//empty lines
            else
            {
                int indent = line.TakeWhile(c => c == '\t').Count(); //tab indents
                line = line.Replace("\t", new String(' ', tabSize)); //need to expand tabs here
                string lead = new String(' ', indent * tabSize); //create the leading space
                if (lead.Length >= width)
                    lead = ""; //indentation as wide as the window: drop it on continuation lines so the loop terminates
                do
                {
                    //get the string that fits in the window
                    string subline = line.Substring(0, Math.Min(line.Length, width));
                    if (subline.Length < line.Length && subline.Length > 0)
                    {
                        //find the last space, unless the window already ends on one
                        int lastword = subline.LastOrDefault() == ' ' ? -1 : subline.LastIndexOf(' ', subline.Length - 1);
                        //only break at a space beyond the indentation; breaking inside it
                        //would emit a blank line and never shorten the remaining text
                        if (lastword > lead.Length)
                            subline = subline.Substring(0, lastword);
                        sb.AppendLine(subline);

                        //next part
                        line = lead + line.Substring(subline.Length).TrimStart();
                    }
                    else
                    {
                        sb.AppendLine(subline); //everything fits
                        break;
                    }
                }
                while (true);
            }
        }
    }
}
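
For comparison, here is a short Python sketch with the same goal (wrapping while keeping tab indentation on continuation lines). It swaps in the standard library's `textwrap.wrap` with its `initial_indent` and `subsequent_indent` parameters instead of porting the loop above; `wrap_keep_indent` and `tab_size` are names assumed for this example, and runs of spaces inside a line are collapsed by `textwrap`.

```python
import textwrap


def wrap_keep_indent(text, width, tab_size=4):
    out = []
    for line in text.replace("\r\n", "\n").split("\n"):
        if not line:
            out.append("")  # keep empty lines
            continue
        stripped = line.lstrip("\t")
        # One indent level per leading tab, expanded to spaces.
        indent = " " * ((len(line) - len(stripped)) * tab_size)
        body = stripped.replace("\t", " " * tab_size).strip()
        wrapped = textwrap.wrap(body, width=width,
                                initial_indent=indent,
                                subsequent_indent=indent)
        out.extend(wrapped or [""])
    return "\n".join(out)
```

Here `width` is the total line width including the indentation, matching the C# method above.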


暖心男生 2024-07-11 08:20:11

I may as well chime in with a Perl solution that I made, because GNU fold -s was leaving trailing spaces and showing other bad behavior. This solution does not (properly) handle text containing tabs, backspaces, embedded carriage returns or the like, although it does handle CRLF line endings, converting them all to just LF. It makes minimal changes to the text. In particular, it never splits a word (it doesn't change wc -w), and for text with no more than a single space in a row (and no CR) it doesn't change wc -c, because it replaces a space with LF rather than inserting LF.

#!/usr/bin/perl

use strict;
use warnings;

my $WIDTH = 80;

if ($ARGV[0] =~ /^[1-9][0-9]*$/) {
  $WIDTH = $ARGV[0];
  shift @ARGV;
}

while (<>) {

s/\r\n$/\n/;
chomp;

if (length $_ <= $WIDTH) {
  print "$_\n";
  next;
}

@_=split /(\s+)/;

# make @_ start with a separator field and end with a content field
unshift @_, "";
push @_, "" if @_%2;

my ($sep,$cont) = splice(@_, 0, 2);
do {
  if (length $cont > $WIDTH) {
    print "$cont";
    ($sep,$cont) = splice(@_, 0, 2);
  }
  elsif (length($sep) + length($cont) > $WIDTH) {
    printf "%*s%s", $WIDTH - length $cont, "", $cont;
    ($sep,$cont) = splice(@_, 0, 2);
  }
  else {
    my $remain = $WIDTH;
    { do {
      print "$sep$cont";
      $remain -= length $sep;
      $remain -= length $cont;
      ($sep,$cont) = splice(@_, 0, 2) or last;
    }
    while (length($sep) + length($cont) <= $remain);
    }
  }
  print "\n";
  $sep = "";
}
while (defined $cont && length $cont); # a plain truth test would stop early at a word like "0"

}
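
The "replace a space with LF rather than inserting LF" idea can be sketched in Python as follows. It is not a port of the Perl script: it handles one line at a time, assumes single spaces between words, and `fold_spaces` is a name invented for this example.

```python
def fold_spaces(line, width):
    """Turn selected spaces into newlines so each piece fits in `width`.

    The result always has the same length as the input, and words are
    never split; a word longer than `width` stays on a line of its own.
    """
    chars = list(line)
    line_start = 0    # index where the current output line begins
    last_space = -1   # last space that could still serve as a break
    for i, ch in enumerate(chars):
        if ch != " ":
            continue
        if i - line_start > width and last_space != -1:
            chars[last_space] = "\n"   # break at the previous candidate
            line_start = last_space + 1
            last_space = -1
        if i - line_start > width:
            chars[i] = "\n"            # overlong word: break right after it
            line_start = i + 1
        else:
            last_space = i
    if len(chars) - line_start > width and last_space != -1:
        chars[last_space] = "\n"       # the tail does not fit either
    return "".join(chars)
```

Because characters are only replaced, `len(fold_spaces(s, w)) == len(s)` holds for any input, which mirrors the unchanged `wc -c` mentioned above.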


梦途 2024-07-11 08:20:11

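@ICR, thanks for sharing the C# example.

I did not succeed in using it, but I came up with another solution. If there is any interest in it, please feel free to use it:
WordWrap function in C#. The source code is available on GitHub.

I have included unit tests/samples.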


谁对谁错谁最难过 2024-07-11 08:20:11

Here is a word-wrap algorithm I've written in C#.
It should be fairly easy to translate into other languages (except perhaps for IndexOfAny).

static char[] splitChars = new char[] { ' ', '-', '\t' };

private static string WordWrap(string str, int width)
{
    string[] words = Explode(str, splitChars);

    int curLineLength = 0;
    StringBuilder strBuilder = new StringBuilder();
    for(int i = 0; i < words.Length; i += 1)
    {
        string word = words[i];
        // If adding the new word to the current line would be too long,
        // then put it on a new line (and split it up if it's too long).
        if (curLineLength + word.Length > width)
        {
            // Only move down to a new line if we have text on the current line.
            // This avoids the situation where
            // wrapped whitespace causes empty lines in the text.
            if (curLineLength > 0)
            {
                strBuilder.Append(Environment.NewLine);
                curLineLength = 0;
            }

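            // If the current word is too long
            // to fit on a line (even on its own),
            // then split the word up. A width below 2 leaves no room for
            // a character plus the hyphen, so such words are left whole.
            while (width > 1 && word.Length > width)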
            {
                strBuilder.Append(word.Substring(0, width - 1) + "-");
                word = word.Substring(width - 1);

                strBuilder.Append(Environment.NewLine);
            }

            // Remove leading whitespace from the word,
            // so the new line starts flush to the left.
            word = word.TrimStart();
        }
        strBuilder.Append(word);
        curLineLength += word.Length;
    }

    return strBuilder.ToString();
}

private static string[] Explode(string str, char[] splitChars)
{
    List<string> parts = new List<string>();
    int startIndex = 0;
    while (true)
    {
        int index = str.IndexOfAny(splitChars, startIndex);
        
        if (index == -1)
        {
            parts.Add(str.Substring(startIndex));
            return parts.ToArray();
        }

        string word = str.Substring(startIndex, index - startIndex);
        char nextChar = str.Substring(index, 1)[0];
        // Dashes and the like should stick to the word occurring before them.
        // Whitespace doesn't have to.
        if (char.IsWhiteSpace(nextChar))
        {
            parts.Add(word);
            parts.Add(nextChar.ToString());
        }
        else
        {
            parts.Add(word + nextChar);
        }

        startIndex = index + 1;
    }
}

It's fairly primitive - it splits on spaces, tabs and dashes.

It does make sure that dashes stick to the word before it
(so you don't end up with "stack
-overflow"),
though it doesn't favour moving small hyphenated words
to a new line rather than splitting them.

It does split up words if they are too long for a line.

It's also fairly culturally specific,
as I don't know much about the word-wrapping rules of other cultures.
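
Since the note above says the code should translate easily, here is a rough Python sketch of the same token-based idea. It is a simplification rather than a faithful port: it uses a regular expression instead of IndexOfAny, it does not split words that are longer than the line, and the names `explode` and `wrap_tokens` are chosen just to mirror the C#.

```python
import re


def explode(text):
    """Split into tokens: a dash stays attached to the word before it,
    and each space or tab becomes a token of its own."""
    return re.findall(r"[^ \t-]*-|[^ \t-]+|[ \t]", text)


def wrap_tokens(text, width):
    out = []
    cur_len = 0
    for token in explode(text):
        if cur_len + len(token) > width:
            if cur_len > 0:          # only break if the line has text
                out.append("\n")
                cur_len = 0
            token = token.lstrip()   # new lines start flush left
        out.append(token)
        cur_len += len(token)
    return "".join(out)
```

With a width of 8, "stack-overflow is great" becomes "stack-", "overflow" and "is great" on three lines, so the dash stays with "stack".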
