就地游程长度解码?
给定一个游程长度编码的字符串,例如“A3B1C2D1E1”,就地解码该字符串。 编码字符串的答案是“AAABCCDE”。假设编码数组足够大以容纳解码字符串,即您可以假设数组大小 = MAX[length(encodedstirng),length(decodedstring)]。
这看起来并不简单,因为仅仅将 A3 解码为“AAA”就会导致覆盖原始字符串的“B”。
此外,不能假设解码的字符串总是大于编码的字符串。 例如:编码字符串 - 'A1B1',解码字符串为 'AB'。有什么想法吗?
并且它始终是字母数字对,即不会要求您将 0515
转换为 0000055555
Given a run length encoded string, say "A3B1C2D1E1", decode the string in-place.
The answer for the encoded string is "AAABCCDE". Assume that the encoded array is large enough to accommodate the decoded string, i.e. you may assume that the array size = MAX[length(encodedstirng),length(decodedstring)].
This does not seem trivial, since merely decoding A3 as 'AAA' will lead to over-writing 'B' of the original string.
Also, one cannot assume that the decoded string is always larger than the encoded string.
Eg: Encoded string - 'A1B1', Decoded string is 'AB'. Any thoughts?
And it will always be a letter-digit pair, i.e. you will not be asked to converted 0515
to 0000055555
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
如果我们还不知道,我们应该首先扫描,将数字相加,以计算解码字符串的长度。
它始终是字母数字对,因此您可以从字符串中删除
1
而不会产生任何混淆。变成
这里是一些 C++ 代码,用于从字符串中删除
1
(复杂度为 O(n))。现在,该字符串保证比最终解码的字符串短或长度相同。我们不能对原始字符串做出这样的声明,但我们可以对这个修改后的字符串做出这样的声明。
(现在一个可选的、简单的步骤是将每个
2
替换为前一个字母。A3BCCDE
,但我们不需要这样做)。现在我们可以从最后开始工作了。我们已经计算了解码字符串的长度,因此我们确切地知道最终字符将在哪里。我们可以简单地将字符从短字符串的末尾复制到它们的最终位置。
在从右到左的复制过程中,如果我们遇到一个数字,我们必须复制该数字左侧的字母的多个副本。您可能担心这可能会覆盖太多数据。但我们之前证明了我们的编码字符串或其任何子字符串永远不会比其相应的解码字符串长;这意味着总会有足够的空间。
If we don't already know, we should scan through first, adding up the digits, in order to calculate the length of the decoded string.
It will always be a letter-digit pair, hence you can delete the
1
s from the string without any confusion.becomes
Here is some code, in C++, to remove the
1
s from the string (O(n) complexity).Now, this string is guaranteed to be shorter than, or the same length as, the final decoded string. We can't make that claim about the original string, but we can make it about this modified string.
(An optional, trivial, step now is to replace every
2
with the previous letter.A3BCCDE
, but we don't need to do that).Now we can start working from the end. We have already calculated the length of the decoded string, and hence we know exactly where the final character will be. We can simply copy the characters from the end of our short string to their final location.
During this copy process from right-to-left, if we come across a digit, we must make multiple copies of the letter that is just to the left of the digit. You might be worried that this might risk overwriting too much data. But we proved earlier that our encoded string, or any substring thereof, will never be longer than its corresponding decoded string; this means that there will always be enough space.
以下解决方案是
O(n)
且就地解决。该算法不应该访问它不应该访问的内存,无论是读还是写。我做了一些调试,对于我提供的示例测试来说,它似乎是正确的。高级概述:
The following solution is
O(n)
and in-place. The algorithm should not access memory it shouldn't, both read and write. I did some debugging, and it appears correct to the sample tests I fed it.High level overview:
这是一个很模糊的问题,不过仔细想想也不是特别困难。正如您所说,将
A3
解码为AAA
并将其写入到位将覆盖字符B
和1
,那么为什么不先将它们沿着阵列移得更远呢?例如,一旦您读过
A3
,您就知道需要为一个额外字符腾出空间,如果是A4
,则需要两个,依此类推。为了实现这一点,您需要在数组中找到字符串的末尾(预先执行此操作并存储它的索引)。然后循环,将字符移动到新的位置:
开始:
A|3|B|1|C|2|||||||
有一个名为
end
的变量来存储索引 5,即最后一个非空白条目。您将读取第一对,使用名为
cursor
的变量来存储您当前的位置 - 因此在读取A
和3
后它将被设置为 1(带有 3 的插槽)。移动的伪代码:
var n = array[cursor] - 2; // n = 1,A3 中的 3,然后减去 2 以允许配对。
for(i = 结束; i > 光标; i++)
{
数组[i + n] = 数组[i];
这
会给你留下:
A|3|A|3|B|1|C|2|||||
现在
A
已经在那里了,所以现在你想从cursor
中存储的索引开始写入n + 1
A
:
A|A|A|A|B|1|C|2|||||
然后,您将指向下一对值的开头,准备再次开始。我意识到这个答案有一些漏洞,尽管这是故意的,因为这是一个面试问题!例如,在您指定
A1B1
的边缘情况下,您将需要一个不同的循环来向后而不是向前移动后续字符。This is a very vague question, though it's not particularly difficult if you think about it. As you say, decoding
A3
asAAA
and just writing it in place will overwrite the charsB
and1
, so why not just move those farther along the array first?For instance, once you've read
A3
, you know that you need to make space for one extra character, if it wasA4
you'd need two, and so on. To achieve this you'd find the end of the string in the array (do this upfront and store it's index).Then loop though, moving the characters to their new slots:
To start:
A|3|B|1|C|2|||||||
Have a variable called
end
storing the index 5, i.e. the last, non-blank, entry.You'd read in the first pair, using a variable called
cursor
to store your current position - so after reading in theA
and the3
it would be set to 1 (the slot with the 3).Pseudocode for the move:
var n = array[cursor] - 2; // n = 1, the 3 from A3, and then minus 2 to allow for the pair.
for(i = end; i > cursor; i++)
{
array[i + n] = array[i];
}
This would leave you with:
A|3|A|3|B|1|C|2|||||
Now the
A
is there once already, so now you want to writen + 1
A
's starting at the index stored incursor
:Giving:
A|A|A|A|B|1|C|2|||||
Then you're pointing at the start of the next pair of values, ready to go again. I realise there are some holes in this answer, though that is intentional as it's an interview question! For instance, in the edge cases you specified
A1B1
, you'll need a different loop to move subsequent characters backwards rather than forwards.接下来是另一个 O(n^2) 解决方案。
鉴于答案的复杂性没有限制,这个简单的解决方案似乎工作得很好。
其中:
可用空间大小是数组中剩余的空元素数量。
可扩展元素是这样的元素:
要点是,在从游程码到达扩展字符串的过程中,每一步至少有
一个可以扩展的元素(易于证明)。
Another O(n^2) solution follows.
Given that there is no limit on the complexity of the answer, this simple solution seems to work perfectly.
Where:
Free space size is the number of empty elements left in the array.
An expandable element is an element that:
The point is that in the process of reaching from the run-length code to the expanded string, at each step, there is at least
one element that can be expanded (easy to prove).