LaTeX - Apply an operation to every character in a string
I am using LaTeX and I have a problem concerning string manipulation.
I want to have an operation applied to every character of a string, specifically
I want to replace every character "x" with "\discretionary{}{}{}x". I want to do
this because I have a long string (DNA) which I want to be able to separate at
any point without hyphenation.
Thus I would like to have a command called "myDNA" that will do this for me instead of
inserting manually \discretionary{}{}{} after every character.
Is this possible? I have looked around the web and there wasn't much helpful
information on this topic (at least not any I could understand) and I hoped
that you could help.
--edit
To clarify:
What I want to see in the finished document is something like this:
the dna sequence is CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCA AATACTAAATGTGTTACCATACCAAGCACTTGCTCTGAAATTTGGGGACTGAGTACACCAAATACGATAG ATCAGTGGGATACAACAGGCCTTTACAGCTTCTCTGAACAAACCAGGTCTCTTGATGGTCGTCTCCAGGT ATCCCATCGAAAAGGATTGCCACATGTTATATATTGCCGATTATGGCGCTGGCCTGATCTTCACAGTCAT CATGAACTCAAGGCAATTGAAAACTGCGAATATGCTTTTAATCTTAAAAAGGATGAAGTATGTGTAAACC CTTACCACTATCAGAGAGTTGAGACACCAGTTTTGCCTCCAGTATTAGTGCCCCGACACACCGAGATCCT AACAGAACTTCCGCCTCTGGATGACTATACTCACTCCATTCCAGAAAACACTAACTTCCCAGCAGGAATT
just plain linebreaks, without any hyphens. The DNA sequence will be one
long string without any spaces or anything but it can break at any point.
This is why my idea was to insert a "\discretionary{}{}{}" after every
character, so that it can break at any point without inserting any hyphens.
The macro \hyphenateWholeString takes a string as an argument and inserts \discretionary{}{}{} after each character. The input string stops at the first dollar sign, so you should not use that character in it. You'd call it like \hyphenateWholeString{CTAAAGAAAACAGGACG}.
Instead of \discretionary{}{}{} you can also try \hspace{0pt}, if you prefer that (and are using LaTeX rather than plain TeX). In order to align the right margin, I think you'd need to do some more fine tuning (but see below). The effect is of course minimised by using a font of fixed width.
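The listing for this macro is missing here. As a minimal sketch of a macro that behaves as described above (a discretionary break after every character, with $ as the end marker), something like the following should work. The helper name \dna@step is hypothetical and this is not necessarily how the original was written; it assumes the input contains no spaces, braces or $.

```latex
\documentclass{article}
\makeatletter
% User-level macro: append $ as an end marker and start the loop.
\newcommand{\hyphenateWholeString}[1]{\dna@step#1$}
% \dna@step (hypothetical helper) reads one character at a time.
\def\dna@step#1{%
  \ifx#1$%
    % End marker reached: do nothing, which ends the recursion.
  \else
    #1\discretionary{}{}{}% print the character, then allow a break
    \expandafter\dna@step % close the conditional, then read the next character
  \fi}
\makeatother
\begin{document}
the dna sequence is
\hyphenateWholeString{CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCAAATACTAAATGTGTTACCATACCAAGCACTTGCTCTGAAATTTGGGGACTGAGTACACCAAATACGATAG}
\end{document}
```

Because \discretionary{}{}{} adds no stretchable space, the right margin may come out ragged with a proportional font.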
Revision:
Steve's suggestion of using \hskip sounds like a very good idea to me, so I made a few corrections. Note that I've renamed the \say macro and made it more useful in that it now actually does the transformation. (However, if you remove the \hskip from \transform, you'll also need to remove the \unskip in the main macro definition.)
Edit:
There is also the seqsplit package, which seems to be made for printing DNA data or long numbers. It also offers a few options for nicer output, so maybe that is what you're looking for…
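For comparison, a minimal sketch of the seqsplit route, assuming the package is installed and using only its basic \seqsplit command (check the package documentation for the output options mentioned above):

```latex
\documentclass{article}
\usepackage{seqsplit}
\begin{document}
the dna sequence is
\seqsplit{CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCAAATACTAAATGTGTTACCATACCAAGCACTTGCTCTGAAATTTGGGGACTGAGTACACCAAATACGATAG}
\end{document}
```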
Debilski's post is definitely a solid way to do it, although the \say is not necessary. Here's a shorter way that makes use of some LaTeX internal shortcuts (\@gobble and \@ifnextchar):
Note the use of \hskip 0pt plus 1pt instead of \discretionary - when I tried your example I ended up with a ragged margin because there's no stretchability. The \hskip adds some stretchable glue in between each character (and the \unskip afterwards cancels the extra one we added). Also note the LaTeX style convention that "end user" macros are all lowercase, while internal macros have an @ in them somewhere so that users don't accidentally call them.
If you want to figure out how this works, \@gobble just eats the token that follows it (in this case the $, since that branch is only run when a $ is the next char). The main point is that \sw@p is only given one argument in the "else" branch, so it swaps that argument with the next char (that isn't a $). We could just as well have written \def\hyphenate@next#1{#1\hskip...\xHyphen@te} and put that with no args in the "else" branch, but (in my opinion) \sw@p is more general (and I'm surprised it's not in standard LaTeX already).
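The listing for the shorter approach with \@ifnextchar, \@gobble and \sw@p is missing here. A sketch consistent with that description might look like the following. The user-level name \hyphenatestring is an assumption, and the \relax after the glue is a defensive addition so that TeX stops scanning the glue specification before the recursive call.

```latex
\documentclass{article}
\makeatletter
% \sw@p swaps its two arguments.
\def\sw@p#1#2{#2#1}
% If the next character is $, gobble it and stop. Otherwise \sw@p picks up
% the next character and puts it in front of the glue and the recursive call.
\def\xHyphen@te{%
  \@ifnextchar${\@gobble}%
    {\sw@p{\hskip 0pt plus 1pt\relax\xHyphen@te}}}
% User-level macro (name assumed): $ marks the end of the string,
% and \unskip removes the glue added after the last character.
\newcommand{\hyphenatestring}[1]{\xHyphen@te#1$\unskip}
\makeatother
\begin{document}
the dna sequence is
\hyphenatestring{CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCAAATACTAAATGTGTTACCATACCAAGCACTTGCTCTGAAATTTGGGGACTGAGTACACCAAATACGATAG}
\end{document}
```

Each \hskip 0pt plus 1pt is both a legal breakpoint and a small amount of stretch, which is what lets TeX justify the lines.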
There is a contrib package on CTAN that deals with typesetting DNA sequences. It does a little more than just line-breaking, for example, it also supports colouring. I'm not sure if it is possible to get the output you are after though, and I have no experience in the DNA-sequence-typesetting area, but is one long string the most readable representation?
1. Use \newcommand{}{}, like this: \newcommand{\myDNA}{blah blah blah}
If that doesn't satisfy your requirements, I suggest:
2. Break the string down into the smallest portions, define a command for each with \newcommand, and then call the new commands in sequence: \myDNAa \myDNAb (LaTeX command names cannot contain digits, hence the letter suffixes).
If that still doesn't work, you might want to look at writing a Perl script to satisfy your string replacement needs.