LaTeX regex quotes

How to convert regular quotes (i.e. ', ") into LaTeX/TeX quotes (i.e. `', ``'')

Posted on 2024-07-10 02:06:57

Given a document written with normal quotes, e.g.

Ben said "buttons, dear sir".
I replied "Did you say 'buttons'?" to him.

How can one turn this sort of thing into LaTeX quotes with the appropriate semantics, i.e.

Ben said ``buttons, dear sir''.
I replied ``Did you say `buttons'?'' to him.

So that LaTeX produces:

Ben said “buttons, dear sir”.
I replied “Did you say ‘buttons’?” to him.

My first thought is to turn to a regex. However, I'm not getting any hits from Google or the regex libraries for "LaTeX quotes regular expression", and of course "TeX quotes regular expression" seems to return too many.

Thank you.

病女 2024-07-17 02:06:57

In general, this problem is harder than it looks.

The simplest cases can be treated with regular expressions, but for more general situations you will almost certainly need to build a recursive parser: regular expressions only work if there is no nesting.

The big problem is going to be identifying single "'"s that are not paired, as in contractions (the "'" in "don't" should not be changed, and should not be paired).

Let's see if we can write a usable EBNF description:

input:       text+
text:        uquote|squote|dquote
squote:      "'" text+ "'"
dquote:      '"' text+ '"'
uquote:      [contraction|.]+
contraction: [A-Za-z]+ "'" [A-Za-z]+

This is limited to contractions that have the "'" in the middle of the word. All the associated actions just echo the input, except that the squote and dquote rules replace the quotes as appropriate.

I used regular expressions followed by human fix-ups for a fairly simple one-off, but that would be labor-intensive for ongoing work.
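
As a rough illustration of the grammar above, here is a minimal Python sketch. It is not the parser the answer had in mind: a stack stands in for the recursion, and the function name `to_latex_quotes` is made up for this example. It assumes that an apostrophe with a letter on both sides is a contraction, and that a quote character matching the innermost open quote closes it.

```python
def to_latex_quotes(text: str) -> str:
    """Convert straight quotes to LaTeX quotes, tracking nesting with a stack."""
    out = []
    stack = []  # quote characters currently open, innermost last
    for i, ch in enumerate(text):
        if ch not in "'\"":
            out.append(ch)
            continue
        prev = text[i - 1] if i > 0 else " "
        nxt = text[i + 1] if i + 1 < len(text) else " "
        if ch == "'" and prev.isalpha() and nxt.isalpha():
            # Assumed contraction (e.g. don't): leave the apostrophe alone.
            out.append(ch)
        elif stack and stack[-1] == ch:
            # Same character as the innermost open quote: close it.
            stack.pop()
            out.append("''" if ch == '"' else "'")
        else:
            # Otherwise open a new (possibly nested) quote.
            stack.append(ch)
            out.append("``" if ch == '"' else "`")
    return "".join(out)


print(to_latex_quotes('I replied "Did you say \'buttons\'?" to him.'))
```

Apostrophes at the start or end of a word (for example a plural possessive) are not covered by the contraction rule and will be treated as quote marks, so the output still needs a human check.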

可爱咩 2024-07-17 02:06:57

Here is the Python regex (pattern, then replacement) that I use for my LaTeX documents:

'([ \w-]+)'", " `\\1'

There is a Python script that applies the regex to a LaTeX file (here). It works most of the time. Happy typesetting! :)
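
The linked script is not reproduced here, so the following is only a guess at how that pattern and replacement might be applied with `re.sub`. The function name `convert_single_quotes` is hypothetical, and the replacement is written without the leading space that appears in the fragment above.

```python
import re

# Pattern from the answer: a single-quoted run of spaces, word characters and hyphens.
SINGLE_QUOTED = re.compile(r"'([ \w-]+)'")


def convert_single_quotes(text: str) -> str:
    """Rewrite 'phrase' as `phrase' (LaTeX single quotes)."""
    return SINGLE_QUOTED.sub(r"`\1'", text)


print(convert_single_quotes("Did you say 'buttons'?"))
```

This only handles single quotes, and a contraction earlier on the same line can pair with a later quote mark: "don't say 'no'" matches from the apostrophe in "don't" up to the quote before "no". That is presumably part of why it only works most of the time.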

做个少女永远怀春 2024-07-17 02:06:57

Here are some Perl regular expression substitutions that might be good enough for what you want to do.

s/"(\w)/``$1/g;
s/'(\w)/`$1/g;
s/([\w\.?!])"/$1''/g;

The code assumes that a single or double quote followed by an alphanumeric character begins a quote. Also, it assumes that a double quote following an alphanumeric character or punctuation mark ends a quote. These assumptions are probably true most of the time but there may be exceptions.
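
One of those exceptions is the contraction: the second substitution as written would turn "don't" into "don`t". Below is a hedged Python port of the three substitutions. The function name `perl_style_quotes` is invented, and the `(?<!\w)` lookbehind in the second rule is an addition that is not in the Perl above; it skips apostrophes that directly follow a word character.

```python
import re


def perl_style_quotes(text: str) -> str:
    """Python port of the three Perl substitutions, plus a contraction guard."""
    # A double quote followed by a word character opens a quotation.
    text = re.sub(r'"(\w)', r"``\1", text)
    # A single quote followed by a word character opens a quotation,
    # unless it directly follows a word character (added guard, e.g. don't).
    text = re.sub(r"(?<!\w)'(\w)", r"`\1", text)
    # A double quote after a word character or . ? ! closes a quotation.
    text = re.sub(r'([\w.?!])"', r"\1''", text)
    return text


print(perl_style_quotes('I replied "Did you say \'buttons\'?" to him.'))
```

Closing single quotes need no rule because LaTeX already uses ' for them.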

流云如水 2024-07-17 02:06:57

Thanks for the input - helpful and appreciated.

I've also come across this, from CPAN's LaTeX::Encode module:

    # A single or double quote before a word character, preceded
    # by start of line, whitespace or punctuation gets converted
    # to "`" or "``" respectively.

    $text =~ s{ ( ^ | [\s\p{IsPunct}] )( ['"] ) (?= \w ) }
              { $2 eq '"' ? "$1``" : "$1`" }mgxe;

    # A double quote preceded by a word or punctuation character
    # and followed by whitespace or end of line gets converted to
    # "''".  (Final single quotes are represented by themselves so
    # we don't need to worry about those.)

    $text =~ s{ (?<= [\w\p{IsPunct}] ) " (?= \s | $ ) }
              { "''" }mgxe;
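
For anyone working in Python rather than Perl, the two rules above might be approximated as follows. This is a sketch, not a faithful port: Python's `re` module has no `\p{IsPunct}`, so ASCII `string.punctuation` is used as a stand-in, and the names `OPENING`, `CLOSING` and `latex_encode_quotes` are made up for the example.

```python
import re
import string

# ASCII-only stand-in for Perl's \p{IsPunct} (an approximation).
PUNCT = re.escape(string.punctuation)

# A quote before a word character, preceded by start of line, whitespace or punctuation.
OPENING = re.compile(rf'''(^|[\s{PUNCT}])(['"])(?=\w)''', re.MULTILINE)
# A double quote after a word or punctuation character, followed by whitespace or end of line.
CLOSING = re.compile(rf'(?<=[\w{PUNCT}])"(?=\s|$)', re.MULTILINE)


def latex_encode_quotes(text: str) -> str:
    """Apply the opening rule, then the closing rule."""
    text = OPENING.sub(
        lambda m: m.group(1) + ("``" if m.group(2) == '"' else "`"), text
    )
    return CLOSING.sub("''", text)


print(latex_encode_quotes('I replied "Did you say \'buttons\'?" to him.'))
```

Because the closing rule requires whitespace or end of line after the quote, a closing quote that is followed directly by punctuation, as in `sir".`, is left unchanged by these two rules.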

落叶缤纷 2024-07-17 02:06:57

Don't use regular expressions for this kind of task!

Maybe you can get some inspiration from SmartyPants?

身边 2024-07-17 02:06:57

I was looking for an answer to this problem and decided to learn a little Lisp today. I put this Lisp function in my ~/.emacs file and then run it with M-x tex-set-quotes:

(defun tex-set-quotes ()  
  (interactive)  
  (latex-mode)  
  (while (search-forward "\"" nil t)  
   (replace-match "" nil t)  
   (tex-insert-quote nil)))
